* Updates to `zed_llm_client-0.8.5` which adds support for `retry_after`
when anthropic provides it.
* Distinguishes upstream provider errors and rate limits from errors
that originate from zed's servers
* Moves `LanguageModelCompletionError::BadInputJson` to
`LanguageModelCompletionEvent::ToolUseJsonParseError`. While arguably
this is an error case, the logic in thread is cleaner with this move.
There is also precedent for inclusion of errors in the event type -
`CompletionRequestStatus::Failed` is how cloud errors arrive.
* Updates `PROVIDER_ID` / `PROVIDER_NAME` constants to use proper types
instead of `&str`, since they can be constructed in a const fashion.
* Removes use of `CLIENT_SUPPORTS_EXA_WEB_SEARCH_PROVIDER_HEADER_NAME`
as the server no longer reads this header and just defaults to that
behavior.
Release notes for this is covered by #33275
Release Notes:
- N/A
---------
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Richard <richard@zed.dev>
Seeing this come up in our server logs when sending requests to
Anthropic: `final assistant content cannot end with trailing
whitespace`.
Release Notes:
- agent: Fixed an issue where Anthropic requests would sometimes fail
because of malformed assistant messages
This cleans up our settings to not include any `version` fields, as we
have an actual settings migrator now.
This PR removes `language_models > anthropic > version`,
`language_models > openai > version` and `agent > version`.
We had migration paths in the code for a long time, so in practice
almost everyone should be using the latest version of these settings.
Release Notes:
- Remove `version` fields in settings for `agent`, `language_models >
anthropic`, `language_models > openai`. Your settings will automatically
be migrated. If you're running into issues with this open an issue
[here](https://github.com/zed-industries/zed/issues)
cc @osyvokon
We were seeing a bunch of errors in our backend when people were using
Claude models with thinking enabled.
In the logs we would see
> an error occurred while interacting with the Anthropic API:
invalid_request_error: messages.x.content.0.type: Expected `thinking` or
`redacted_thinking`, but found `text`. When `thinking` is enabled, a
final `assistant` message must start with a thinking block (preceeding
the lastmost set of `tool_use` and `tool_result` blocks). We recommend
you include thinking blocks from previous turns. To avoid this
requirement, disable `thinking`. Please consult our documentation at
https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking
However, this issue did not occur frequently and was not easily
reproducible. Turns out it was triggered by us not correctly handling
[Redacted Thinking
Blocks](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#thinking-redaction).
I could constantly reproduce this issue by including this magic string:
`ANTHROPIC_MAGIC_STRING_TRIGGER_REDACTED_THINKING_46C9A13E193C177646C7398A98432ECCCE4C1253D5E2D82641AC0E52CC2876CB
` in the request, which forces `claude-3-7-sonnet` to emit redacted
thinking blocks (confusingly the magic string does not seem to be
working for `claude-sonnet-4`). As soon as we hit a tool call Anthropic
would return an error.
Thanks to @osyvokon for pointing me in the right direction 😄!
Release Notes:
- agent: Fixed an issue where Anthropic models would sometimes return an
error when thinking was enabled
This PR is in preparation for doing automatic retries for certain
errors, e.g. Overloaded. It doesn't change behavior yet (aside from some
granularity of error messages shown to the user), but rather mostly
changes some error handling to be exhaustive enum matches instead of
`anyhow` downcasts, and leaves some comments for where the behavior
change will be in a future PR.
Release Notes:
- N/A
Previously we were using a mix of `u32` and `usize`, e.g. `max_tokens:
usize, max_output_tokens: Option<u32>` in the same `struct`.
Although [tiktoken](https://github.com/openai/tiktoken) uses `usize`,
token counts should be consistent across targets (e.g. the same model
doesn't suddenly get a smaller context window if you're compiling for
wasm32), and these token counts could end up getting serialized using a
binary protocol, so `usize` is not the right choice for token counts.
I chose to standardize on `u64` over `u32` because we don't store many
of them (so the extra size should be insignificant) and future models
may exceed `u32::MAX` tokens.
Release Notes:
- N/A
Bubbles up rate limit information so that we can retry after a certain
duration if needed higher up in the stack.
Also caps the number of concurrent evals running at once to also help.
Release Notes:
- N/A
Closes#31438
Release Notes:
- agent: Fixed an edge case were the request would fail when using
Claude and multiple images were attached
---------
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
This expands our deserialization of JSON from models to be more tolerant
of different variations that the model may send, including
capitalization, wrapping things in objects vs. being plain strings, etc.
Also when deserialization fails, it reports the entire error in the JSON
so we can see what failed to deserialize. (Previously these errors were
very unhelpful at diagnosing the problem.)
Finally, also removes the `WrappedText` variant since the custom
deserializer just turns that style of JSON into a normal `Text` variant.
Release Notes:
- N/A
This PR updates the default/recommended models for the Anthropic and Zed
providers to be Claude Sonnet 4.
Release Notes:
- Updated default/recommended Anthropic models to Claude Sonnet 4.
https://github.com/zed-industries/zed/issues/30972 brought up another
case where our context is not enough to track the actual source of the
issue: we get a general top-level error without inner error.
The reason for this was `.ok_or_else(|| anyhow!("failed to read HEAD
SHA"))?; ` on the top level.
The PR finally reworks the way we use anyhow to reduce such issues (or
at least make it simpler to bubble them up later in a fix).
On top of that, uses a few more anyhow methods for better readability.
* `.ok_or_else(|| anyhow!("..."))`, `map_err` and other similar error
conversion/option reporting cases are replaced with `context` and
`with_context` calls
* in addition to that, various `anyhow!("failed to do ...")` are
stripped with `.context("Doing ...")` messages instead to remove the
parasitic `failed to` text
* `anyhow::ensure!` is used instead of `if ... { return Err(...); }`
calls
* `anyhow::bail!` is used instead of `return Err(anyhow!(...));`
Release Notes:
- N/A
Some providers sometimes send `{ "type": "text", "text": ... }` instead
of just the text as a string. Now we accept those instead of erroring.
Release Notes:
- N/A
This is very basic support for them. There are a number of other TODOs
before this is really a first-class supported feature, so not adding any
release notes for it; for now, this PR just makes it so that if
read_file tries to read a PNG (which has come up in practice), it at
least correctly sends it to Anthropic instead of messing up.
This also lays the groundwork for future PRs for more first-class
support for images in tool calls across more image file formats and LLM
providers.
Release Notes:
- N/A
---------
Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
This sets us up to display queue position information to the user, once
our language model backend is updated to support request queuing.
The JSON returned by the LLM backend will need to look like this:
```json
{"queue": {"status": "queued", "position": 1}}
{"queue": {"status": "started"}}
{"event": {"THE_UPSTREAM_MODEL_PROVIDER_EVENT": "..."}}
```
Release Notes:
- N/A
---------
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
This is to enable alternative streaming solutions at the application
layer. I'm not sure we really should have performed parsing of the input
at this layer. Either way I want to experiment with streaming approaches
in a separate crate on a branch, and this will help.
/cc @maxdeviant @bennetbo @rtfeldman
Closes #ISSUE
Release Notes:
- N/A
Now we're more tolerant of invalid JSON coming back from the model
(possibly because it was incomplete and we're streaming), plus if we do
end up with invalid JSON once it has all streamed back, we report what
the malformed JSON actually was:
<img width="444" alt="Screenshot 2025-04-23 at 1 49 14 PM"
src="https://github.com/user-attachments/assets/480f5da7-869b-49f3-9ffd-8f08ccddb33d"
/>
Release Notes:
- N/A
* Adds a fast / cheaper model to providers and defaults thread
summarization to this model. Initial motivation for this was that
https://github.com/zed-industries/zed/pull/29099 would cause these
requests to fail when used with a thinking model. It doesn't seem
correct to use a thinking model for summarization.
* Skips system prompt, context, and thinking segments.
* If tool use is happening, allows 2 tool uses + one more agent response
before summarizing.
Downside of this is that there was potential for some prefix cache reuse
before, especially for title summarization (thread summarization omitted
tool results and so would not share a prefix for those). This seems fine
as these requests should typically be fairly small. Even for full thread
summarization, skipping all tool use / context should greatly reduce the
token use.
Release Notes:
- N/A
Looks like the required backend component of this was deployed.
https://github.com/zed-industries/monorepo/actions/runs/14541199197
Release Notes:
- N/A
---------
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Nathan Sobo <nathan@zed.dev>
The UI was mistakenly using the cumulative token usage for the token
counter. It will now display the last request token count, plus an
estimation of the tokens in the message editor and context entries that
haven't been sent yet.
https://github.com/user-attachments/assets/0438c501-b850-4397-9135-57214ca3c07a
Additionally, when the user edits a message, we'll display the actual
token count up to it and estimate the tokens in the new message.
Note: We don't currently estimate the delta when switching profiles. In
the future, we want to use the count tokens API to measure every part of
the request and display a breakdown.
Release Notes:
- agent: Made the token count more accurate and added back estimation of
used tokens as you type and add context.
---------
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
Release Notes:
- agent: Show recommended models in the agent model selector and display
the provider in the model selector's trigger.
---------
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
Co-authored-by: Danilo Leal <67129314+danilo-leal@users.noreply.github.com>
This PR removes the `use_any_tool` method from the `LanguageModel`
trait.
It was not being used anywhere, and doesn't really fit in our new tool
use story.
Release Notes:
- N/A
Closes#25671
Release Notes:
- Added support for `claude-3-7-sonnet-thinking` in the assistant panel
---------
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Agus Zubiaga <hi@aguz.me>
This is the core change:
https://github.com/zed-industries/zed/pull/26758/files#diff-044302c0d57147af17e68a0009fee3e8dcdfb4f32c27a915e70cfa80e987f765R1052
TODO:
- [x] Use AsyncFn instead of Fn() -> Future in GPUI spawn methods
- [x] Implement it in the whole app
- [x] Implement it in the debugger
- [x] Glance at the RPC crate, and see if those box future methods can
be switched over. Answer: It can't directly, as you can't make an
AsyncFn* into a trait object. There's ways around that, but they're all
more complex than just keeping the code as is.
- [ ] Fix platform specific code
Release Notes:
- N/A
This PR changes the default value when no input is provided with a tool
use from `null` to `{}`.
This fixes an issue I was seeing where tools that didn't accept input
were not being called correctly.
Release Notes:
- N/A
I've been bothered by using simple hyphens for bullet lists here for a
while; it kinda looked cheap and not well-formatted. So, in this PR, I'm
adding a new, custom UI component in the `language_models` crate, called
`InstructionListItem`, based off the `ListItem` that's somewhat
mimic'ing what a `<li>` would be on the web.
It does have a "rigid" structure as in it's always a label followed by a
button (which is optional), but that seems okay given it has been the
overall shape of the copy we've been using here. Also, never really
loved that we were pasting URLs directly, that kinda felt cheap, too. I
could see an argument where it's just clearer, but it looks too
cluttered, as URLs aren't super pretty, necessarily.
| Before | After |
|--------|--------|
| <img
src="https://github.com/user-attachments/assets/ffd1ac27-b1f4-450d-abf5-079285fc9877"
width="700px" /> | <img
src="https://github.com/user-attachments/assets/28fb9d0d-205d-45d8-9e43-1aaa947adc96"
width="700px" /> |
Release Notes:
- N/A
This PR moves the `report_assistant_event` function from the
`language_models` crate to the `language_model` crate.
This allows us to drop some dependencies on `language_models`.
Release Notes:
- N/A
This PR removes the dependencies on the individual model provider crates
from the `language_model` crate.
The various conversion methods for converting a `LanguageModelRequest`
into its provider-specific request type have been inlined into the
various provider modules in the `language_models` crate.
The model providers we provide via Zed's cloud offering get to stay, for
now.
Release Notes:
- N/A
While investigating #24896, I noticed two issues:
1. The default configuration for the `zed.dev` provider was using the
wrong string for Claude 3.5 Sonnet. This meant the provider would always
result as not configured until the user selected it from the model
picker, because we couldn't deserialize that string to a valid
`anthropic::Model` enum variant.
2. When clicking on `Open New Chat`/`Start New Thread` in the provider
configuration, we would select `Claude 3.5 Haiku` by default instead of
Claude 3.5 Sonnet.
Release Notes:
- Fixed some issues that caused AI providers to sometimes be
misconfigured.
This PR adds a new `CredentialsProvider` trait that abstracts over
interacting with the system keychain.
We had previously introduced a version of this scoped just to Zed auth
in https://github.com/zed-industries/zed/pull/11505.
However, after landing https://github.com/zed-industries/zed/pull/25123,
we now have a similar issue with the credentials for language model
providers that are also stored in the keychain (and thus also produce a
spam of popups when running a development build of Zed).
This PR takes the existing approach and makes it more generic, such that
we can use it everywhere that we need to read/store credentials in the
keychain.
There are still two credential provider implementations:
- `KeychainCredentialsProvider` will interact with the system keychain
(using the existing GPUI APIs)
- `DevelopmentCredentialsProvider` will use a local file on the file
system
We only use the `DevelopmentCredentialsProvider` when:
1. We are running a development build of Zed
2. The `ZED_DEVELOPMENT_AUTH` environment variable is set
- I am considering removing the need for this and making it the default,
but that will be explored in a follow-up PR.
Release Notes:
- N/A
This PR updates the `LanguageModelProvider::authenticate` method to
return an `AuthenticateError` instead of an `anyhow::Error`.
This allows us to model the "credentials not found" state explicitly as
`AuthenticateError::CredentialsNotFound`, which enables the caller to
check for this state and act accordingly.
Planning to use this in #25123 to silence errors about missing
credentials when authenticating providers in the background.
Release Notes:
- N/A
Done automatically with
> ast-grep -p '$A.background_executor().spawn($B)' -r
'$A.background_spawn($B)' --update-all --globs "\!crates/gpui"
Followed by:
* `cargo fmt`
* Unexpected need to remove some trailing whitespace.
* Manually adding imports of `gpui::{AppContext as _}` which provides
`background_spawn`
* Added `AppContext as _` to existing use of `AppContext`
Release Notes:
- N/A
This PR does some somewhat light UI adjustment to the Assistant 2
settings view. The Prompt Library section should feature the default
prompts in the future, so that's why it's been separated that way.
<img width="800" alt="Screenshot 2025-01-29 at 2 59 59 PM"
src="https://github.com/user-attachments/assets/7b033bde-51ab-44d5-9e53-3f72b8ff5f51"
/>
Release Notes:
- N/A
Fix a bug where a GPUI macro still used `ModelContext`
Rename `AsyncAppContext` -> `AsyncApp`
Rename update_model, read_model, insert_model, and reserve_model to
update_entity, read_entity, insert_entity, and reserve_entity
Release Notes:
- N/A
There's still a bit more work to do on this, but this PR is compiling
(with warnings) after eliminating the key types. When the tasks below
are complete, this will be the new narrative for GPUI:
- `Entity<T>` - This replaces `View<T>`/`Model<T>`. It represents a unit
of state, and if `T` implements `Render`, then `Entity<T>` implements
`Element`.
- `&mut App` This replaces `AppContext` and represents the app.
- `&mut Context<T>` This replaces `ModelContext` and derefs to `App`. It
is provided by the framework when updating an entity.
- `&mut Window` Broken out of `&mut WindowContext` which no longer
exists. Every method that once took `&mut WindowContext` now takes `&mut
Window, &mut App` and every method that took `&mut ViewContext<T>` now
takes `&mut Window, &mut Context<T>`
Not pictured here are the two other failed attempts. It's been quite a
month!
Tasks:
- [x] Remove `View`, `ViewContext`, `WindowContext` and thread through
`Window`
- [x] [@cole-miller @mikayla-maki] Redraw window when entities change
- [x] [@cole-miller @mikayla-maki] Get examples and Zed running
- [x] [@cole-miller @mikayla-maki] Fix Zed rendering
- [x] [@mikayla-maki] Fix todo! macros and comments
- [x] Fix a bug where the editor would not be redrawn because of view
caching
- [x] remove publicness window.notify() and replace with
`AppContext::notify`
- [x] remove `observe_new_window_models`, replace with
`observe_new_models` with an optional window
- [x] Fix a bug where the project panel would not be redrawn because of
the wrong refresh() call being used
- [x] Fix the tests
- [x] Fix warnings by eliminating `Window` params or using `_`
- [x] Fix conflicts
- [x] Simplify generic code where possible
- [x] Rename types
- [ ] Update docs
### issues post merge
- [x] Issues switching between normal and insert mode
- [x] Assistant re-rendering failure
- [x] Vim test failures
- [x] Mac build issue
Release Notes:
- N/A
---------
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Cole Miller <cole@zed.dev>
Co-authored-by: Mikayla <mikayla@zed.dev>
Co-authored-by: Joseph <joseph@zed.dev>
Co-authored-by: max <max@zed.dev>
Co-authored-by: Michael Sloan <michael@zed.dev>
Co-authored-by: Mikayla Maki <mikaylamaki@Mikaylas-MacBook-Pro.local>
Co-authored-by: Mikayla <mikayla.c.maki@gmail.com>
Co-authored-by: joão <joao@zed.dev>
Release Notes:
- Added the ability to specify additional beta headers for custom
Anthropic models.
---------
Co-authored-by: David Soria Parra <167242713+dsp-ant@users.noreply.github.com>
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>