Closes#26030
*Note: This is my first contribution to Zed*
This addresses a second streaming bottleneck in Bedrock that remained
after the initial fix in #28281 (released in preview 194).
The issue is in the mechanism used to convert Zed's internal `AsyncBody`
into the `SdkBody` expected by the Bedrock language provider. We are
using a non-streaming converter that buffers responses.
**How the fix works:**
The AWS SDK provides streaming-compatible converters to create `SdkBody`
instances, but these require the input body to implement the `Body`
trait from the `http-body` crate.
This PR enables streaming by implementing the required trait and
switching to the streaming-compatible converter.
**Changes (2 commits):**
* 1st Commit - **Implement http-body Body trait for AsyncBody:**
- Add `http-body = 1.0` dependency (already an indirect dependency)
- Implement the `Body` trait for our existing `AsyncBody` type
- Uses `poll_frame` to read data chunks asynchronously, preserving
streaming behavior
* 2nd Commit - **Use streaming-compatible AWS SDK converter:**
- Create `SdkBody` using `SdkBody::from_body_1_x()` with the new `Body`
trait implementation
**Details/FAQ:**
**Q: Why add another dependency?**
A: We tried to avoid adding a dependency, but the AWS SDK requires the
`Body` trait and `http-body` is where it's defined. The crate is already
an indirect dependency, making this a reasonable solution.
**Q: Why modify the shared `http_client` crate instead of just
`aws_bedrock_client`?**
A: We considered implementing the `Body` trait on a wrapper in
`aws_bedrock_client`, but since `AsyncBody` already uses `http` crate
types, extending support to the companion `http-body` crate seems
reasonable and may benefit other integrations.
**Q: How was this bottleneck discovered?**
A: After @5herlocked's initial streaming fix in #28281, I tested preview
194 and noticed streaming still had issues. I found a way to reproduce
the problem and chatted with @5herlocked about it. He immediately
pinpointed the exact location where the issue was occurring, his
diagnosis made this fix possible.
**Q: How does this relate to the previous fix?**
A: #28281 fixed buffering issues higher in the stack, but unfortunately
there was another bottleneck lower-down in the aws-http-client. This PR
addresses that separate buffering issue.
**Q: Does this use zero-copy or one-copy?**
A: The `Body` implementation includes one copy. Someone more
knowledgeable might be able to achieve a zero-copy approach, but we
opted for a conservative approach. The performance impact should not be
perceptible in typical usage.
**Testing:**
Confirmed that Bedrock streaming now works without buffering delays in a
local build.
Release Notes:
- Improved Bedrock streaming by eliminating response buffering delays
---------
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
This introduces a new field `thinking_allowed` on `LanguageModelRequest`
which lets us control whether thinking should be enabled if the model
supports it.
We permit thinking in the Inline Assistant, Edit File tool and the Git
Commit message generator, this should make generation faster when using
a thinking model, e.g. `claude-sonnet-4-thinking`
Release Notes:
- N/A
Closes#26030
Release Notes:
- Fixed Bedrock bug causing streaming responses to return as one big
chunk
---------
Co-authored-by: Peter Tripp <peter@zed.dev>
* Updates to `zed_llm_client-0.8.5` which adds support for `retry_after`
when anthropic provides it.
* Distinguishes upstream provider errors and rate limits from errors
that originate from zed's servers
* Moves `LanguageModelCompletionError::BadInputJson` to
`LanguageModelCompletionEvent::ToolUseJsonParseError`. While arguably
this is an error case, the logic in thread is cleaner with this move.
There is also precedent for inclusion of errors in the event type -
`CompletionRequestStatus::Failed` is how cloud errors arrive.
* Updates `PROVIDER_ID` / `PROVIDER_NAME` constants to use proper types
instead of `&str`, since they can be constructed in a const fashion.
* Removes use of `CLIENT_SUPPORTS_EXA_WEB_SEARCH_PROVIDER_HEADER_NAME`
as the server no longer reads this header and just defaults to that
behavior.
Release notes for this is covered by #33275
Release Notes:
- N/A
---------
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Richard <richard@zed.dev>
Closes#30714
Bedrock converse api expect to see tool options if at least one tool was
used in conversation in the past messages.
Right now if `LanguageModelToolChoice::None` isn't supported edit agent
[remove][1] tools from request. That point breaks Converse API of
Bedrock. As was proposed in [the issue][2] we won't drop tool choose but
instead will deny any of them if model will respond with a tool choose.
[1]:
fceba6c795/crates/assistant_tools/src/edit_agent.rs (L703)
[2]:
https://github.com/zed-industries/zed/issues/30714#issuecomment-2886422716
Release Notes:
- Fixed bedrock tool calls in edit mode
Previously we were using a mix of `u32` and `usize`, e.g. `max_tokens:
usize, max_output_tokens: Option<u32>` in the same `struct`.
Although [tiktoken](https://github.com/openai/tiktoken) uses `usize`,
token counts should be consistent across targets (e.g. the same model
doesn't suddenly get a smaller context window if you're compiling for
wasm32), and these token counts could end up getting serialized using a
binary protocol, so `usize` is not the right choice for token counts.
I chose to standardize on `u64` over `u32` because we don't store many
of them (so the extra size should be insignificant) and future models
may exceed `u32::MAX` tokens.
Release Notes:
- N/A
Bubbles up rate limit information so that we can retry after a certain
duration if needed higher up in the stack.
Also caps the number of concurrent evals running at once to also help.
Release Notes:
- N/A
Closes#30535
Release Notes:
- AWS Bedrock: Add support for Meta Llama 4 Scout and Maverick models.
- AWS Bedrock: Fixed cross-region inference for all regions.
- AWS Bedrock: Updated all models available through Cross Region
inference.
---------
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
This expands our deserialization of JSON from models to be more tolerant
of different variations that the model may send, including
capitalization, wrapping things in objects vs. being plain strings, etc.
Also when deserialization fails, it reports the entire error in the JSON
so we can see what failed to deserialize. (Previously these errors were
very unhelpful at diagnosing the problem.)
Finally, also removes the `WrappedText` variant since the custom
deserializer just turns that style of JSON into a normal `Text` variant.
Release Notes:
- N/A
https://github.com/zed-industries/zed/issues/30972 brought up another
case where our context is not enough to track the actual source of the
issue: we get a general top-level error without inner error.
The reason for this was `.ok_or_else(|| anyhow!("failed to read HEAD
SHA"))?; ` on the top level.
The PR finally reworks the way we use anyhow to reduce such issues (or
at least make it simpler to bubble them up later in a fix).
On top of that, uses a few more anyhow methods for better readability.
* `.ok_or_else(|| anyhow!("..."))`, `map_err` and other similar error
conversion/option reporting cases are replaced with `context` and
`with_context` calls
* in addition to that, various `anyhow!("failed to do ...")` are
stripped with `.context("Doing ...")` messages instead to remove the
parasitic `failed to` text
* `anyhow::ensure!` is used instead of `if ... { return Err(...); }`
calls
* `anyhow::bail!` is used instead of `return Err(anyhow!(...));`
Release Notes:
- N/A
Some providers sometimes send `{ "type": "text", "text": ... }` instead
of just the text as a string. Now we accept those instead of erroring.
Release Notes:
- N/A
This is very basic support for them. There are a number of other TODOs
before this is really a first-class supported feature, so not adding any
release notes for it; for now, this PR just makes it so that if
read_file tries to read a PNG (which has come up in practice), it at
least correctly sends it to Anthropic instead of messing up.
This also lays the groundwork for future PRs for more first-class
support for images in tool calls across more image file formats and LLM
providers.
Release Notes:
- N/A
---------
Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
This is to enable alternative streaming solutions at the application
layer. I'm not sure we really should have performed parsing of the input
at this layer. Either way I want to experiment with streaming approaches
in a separate crate on a branch, and this will help.
/cc @maxdeviant @bennetbo @rtfeldman
Closes #ISSUE
Release Notes:
- N/A
* Adds a fast / cheaper model to providers and defaults thread
summarization to this model. Initial motivation for this was that
https://github.com/zed-industries/zed/pull/29099 would cause these
requests to fail when used with a thinking model. It doesn't seem
correct to use a thinking model for summarization.
* Skips system prompt, context, and thinking segments.
* If tool use is happening, allows 2 tool uses + one more agent response
before summarizing.
Downside of this is that there was potential for some prefix cache reuse
before, especially for title summarization (thread summarization omitted
tool results and so would not share a prefix for those). This seems fine
as these requests should typically be fairly small. Even for full thread
summarization, skipping all tool use / context should greatly reduce the
token use.
Release Notes:
- N/A
Looks like the required backend component of this was deployed.
https://github.com/zed-industries/monorepo/actions/runs/14541199197
Release Notes:
- N/A
---------
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Nathan Sobo <nathan@zed.dev>
Closes#27223
Merges: #27996, #26734, #27949
Release Notes:
- AWS Bedrock: Added advanced authentication strategies with:
- Short lived credentials with Session Tokens
- AWS Named Profile
- EC2 Identity, Pod Identity, Web Identity
- AWS Bedrock: Added Claude 3.7 Thinking support.
- AWS Bedrock: Adding Cross Region Inference for all combinations of
regions and model availability.
- Agent Beta: Added support for AWS Bedrock.
---------
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
This PR removes the `use_any_tool` method from the `LanguageModel`
trait.
It was not being used anywhere, and doesn't really fit in our new tool
use story.
Release Notes:
- N/A
This is the core change:
https://github.com/zed-industries/zed/pull/26758/files#diff-044302c0d57147af17e68a0009fee3e8dcdfb4f32c27a915e70cfa80e987f765R1052
TODO:
- [x] Use AsyncFn instead of Fn() -> Future in GPUI spawn methods
- [x] Implement it in the whole app
- [x] Implement it in the debugger
- [x] Glance at the RPC crate, and see if those box future methods can
be switched over. Answer: It can't directly, as you can't make an
AsyncFn* into a trait object. There's ways around that, but they're all
more complex than just keeping the code as is.
- [ ] Fix platform specific code
Release Notes:
- N/A
This PR removes a number of `.unwrap`s in the Bedrock provider.
We must not `.unwrap` in situations where it is not provably safe to do
so, which it was not in any of these cases.
Release Notes:
- Fixed some potential panics in the AWS Bedrock model provider.
I've been bothered by using simple hyphens for bullet lists here for a
while; it kinda looked cheap and not well-formatted. So, in this PR, I'm
adding a new, custom UI component in the `language_models` crate, called
`InstructionListItem`, based off the `ListItem` that's somewhat
mimic'ing what a `<li>` would be on the web.
It does have a "rigid" structure as in it's always a label followed by a
button (which is optional), but that seems okay given it has been the
overall shape of the copy we've been using here. Also, never really
loved that we were pasting URLs directly, that kinda felt cheap, too. I
could see an argument where it's just clearer, but it looks too
cluttered, as URLs aren't super pretty, necessarily.
| Before | After |
|--------|--------|
| <img
src="https://github.com/user-attachments/assets/ffd1ac27-b1f4-450d-abf5-079285fc9877"
width="700px" /> | <img
src="https://github.com/user-attachments/assets/28fb9d0d-205d-45d8-9e43-1aaa947adc96"
width="700px" /> |
Release Notes:
- N/A
Going for a different, arguably simpler design for the Assistant 2 empty
state here. Also took the opportunity to adjust other elements like the
toolbar, message editor, and some items in the configuration page.
<img
src="https://github.com/user-attachments/assets/03fd1d48-a675-4eac-b694-bbe4eeaf06e9"
width="700px"/>
Release Notes:
- N/A
Closes#25714
Internal team reported issue where the Bedrock provider defaulted to
"us-east-1" for all requests regardless of what is configured in the
credentials until first zed restart.
Release Notes:
- Fixed an issue where the Bedrock model provider would not always
respect the region.
Closes#16544
Release Notes:
- Added support for AWS Bedrock to the Assistant.
---------
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
Co-authored-by: Anthony <anthony@zed.dev>
Co-authored-by: Anthony Eid <hello@anthonyeid.me>