Commit graph

234 commits

Author SHA1 Message Date
Bennet Bo Fenner
858ab9cc23
Revert "ai: Auto select user model when there's no default" (#36932)
Reverts zed-industries/zed#36722

Release Notes:

- N/A
2025-08-26 13:55:09 +00:00
Ben Brandt
b249593abe
agent2: Always finalize diffs from the edit tool (#36918)
Previously, we wouldn't finalize the diff if an error occurred during
editing or the tool call was canceled.

Release Notes:

- N/A

---------

Co-authored-by: Antonio Scandurra <me@as-cii.com>
2025-08-26 09:46:29 +00:00
Antonio Scandurra
61bc1cc441
acp: Support launching custom agent servers (#36805)
It's enough to add this to your settings:

```json
{
    "agent_servers": {
        "Name Of Your Agent": {
            "command": "/path/to/custom/agent",
            "args": ["arguments", "that", "you", "want"],
        }
    }
}
```

Release Notes:

- N/A
2025-08-23 14:30:54 +00:00
Anthony Eid
8204ef1e51
onboarding: Remove accept AI ToS from within Zed (#36612)
Users now accept ToS from Zed's website when they sign in to Zed the
first time. So it's no longer possible that a signed in account could
not have accepted the ToS.


Release Notes:

- N/A

---------

Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
2025-08-22 11:45:47 -04:00
Anthony Eid
b349a8f34c
ai: Auto select user model when there's no default (#36722)
This PR identifies automatic configuration options that users can select
from the agent panel. If no default provider is set in their settings,
the PR defaults to the first recommended option. Additionally, it
updates the selected provider for a thread when a user changes the
default provider through the settings file, if the thread hasn't had any
queries yet.

Release Notes:

- agent: automatically select a language model provider if there's no
user set provider.

---------

Co-authored-by: Michael Sloan <michael@zed.dev>
2025-08-22 01:12:12 -04:00
tidely
7bdc99abc1
Fix clippy::redundant_clone lint violations (#36558)
This removes around 900 unnecessary clones, ranging from cloning a few
ints all the way to large data structures and images.

A lot of these were fixed using `cargo clippy --fix --workspace
--all-targets`, however it often breaks other lints and needs to be run
again. This was then followed up with some manual fixing.

I understand this is a large diff, but all the changes are pretty
trivial. Rust is doing some heavy lifting here for us. Once I get it up
to speed with main, I'd appreciate this getting merged rather sooner
than later.

Release Notes:

- N/A
2025-08-20 12:20:13 +02:00
Piotr Osiewicz
cf7c64d77f
lints: A bunch of extra style lint fixes (#36568)
- **lints: Fix 'doc_lazy_continuation'**
- **lints: Fix 'doc_overindented_list_items'**
- **inherent_to_string and io_other_error**
- **Some more lint fixes**
- **lints: enable bool_assert_comparison, match_like_matches_macro and
wrong_self_convention**


Release Notes:

- N/A
2025-08-20 12:05:58 +02:00
Piotr Osiewicz
6825715503
Another batch of lint fixes (#36521)
- **Enable a bunch of extra lints**
- **First batch of fixes**
- **More fixes**

Release Notes:

- N/A
2025-08-19 20:33:44 +00:00
Piotr Osiewicz
8f567383e4
Auto-fix clippy::collapsible_if violations (#36428)
Release Notes:

- N/A
2025-08-19 13:27:24 +00:00
Bennet Bo Fenner
0ea0d466d2
agent2: Port retry logic (#36421)
Release Notes:

- N/A
2025-08-19 09:41:55 +00:00
Danilo Leal
b7edc89a87
agent: Improve error and warnings display (#36425)
This PR refactors the callout component and improves how we display
errors and warnings in the agent panel, along with improvements for
specific cases (e.g., you have `zed.dev` as your LLM provider and is
signed out).

Still a work in progress, though, wrapping up some details.

Release Notes:

- N/A
2025-08-18 21:44:07 -03:00
Piotr Osiewicz
9e0e233319
Fix clippy::needless_borrow lint violations (#36444)
Release Notes:

- N/A
2025-08-18 21:54:35 +00:00
Agus Zubiaga
8b89ea1a80
Handle auth for claude (#36442)
We'll now use the anthropic provider to get credentials for `claude` and
embed its configuration view in the panel when they are not present.

Release Notes:

- N/A
2025-08-18 20:40:59 +00:00
Bennet Bo Fenner
6f3cd42411
agent2: Port Zed AI features (#36172)
Release Notes:

- N/A

---------

Co-authored-by: Antonio Scandurra <me@as-cii.com>
2025-08-15 11:17:17 +00:00
Agus Zubiaga
2526dcb5a5
agent2: Port edit_file tool (#35844)
TODO:
- [x] Authorization
- [x] Restore tests

Release Notes:

- N/A

---------

Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2025-08-08 12:43:53 +00:00
Marshall Bowers
50482a6bc2
language_model: Refresh the LLM token upon receiving a UserUpdated message from Cloud (#35839)
This PR makes it so we refresh the LLM token upon receiving a
`UserUpdated` message from Cloud over the WebSocket connection.

Release Notes:

- N/A
2025-08-07 23:00:45 +00:00
Ben Brandt
eb4b73b88e
ACP champagne (#35609)
cherry pick changes from #35510 onto latest main

Release Notes:

- N/A

---------

Co-authored-by: Nathan Sobo <nathan@zed.dev>
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Lukas Wirth <lukas@zed.dev>
2025-08-06 09:01:06 +00:00
Antonio Scandurra
f888f3fc0b
Start separating authentication from connection to collab (#35471)
This pull request should be idempotent, but lays the groundwork for
avoiding to connect to collab in order to interact with AI features
provided by Zed.

Release Notes:

- N/A

---------

Co-authored-by: Marshall Bowers <git@maxdeviant.com>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
2025-08-01 17:37:38 +00:00
Marshall Bowers
410348deb0
Acquire LLM token from Cloud instead of Collab for Edit Predictions (#35431)
This PR updates the Zed Edit Prediction provider to acquire the LLM
token from Cloud instead of Collab to allow using Edit Predictions even
when disconnected from or unable to connect to the Collab server.

Release Notes:

- N/A

---------

Co-authored-by: Richard Feldman <oss@rtfeldman.com>
2025-07-31 22:12:04 +00:00
Marshall Bowers
7be1f2418d
Replace zed_llm_client with cloud_llm_client (#35309)
This PR replaces the usage of the `zed_llm_client` with the
`cloud_llm_client`.

It was ported into this repo in #35307.

Release Notes:

- N/A
2025-07-30 00:09:14 +00:00
Bennet Bo Fenner
230061a6cb
Support multiple OpenAI compatible providers (#34212)
TODO
- [x] OpenAI Compatible API Icon
- [x] Docs
- [x] Link to docs in OpenAI provider section about configuring OpenAI
API compatible providers

Closes #33992

Related to #30010

Release Notes:

- agent: Add support for adding multiple OpenAI API compatible providers

---------

Co-authored-by: MrSubidubi <dev@bahn.sh>
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
2025-07-22 12:20:07 -03:00
Danilo Leal
eaccd542fd
Add fast-follows to the AI onboarding flow (#34737)
Follow-up to https://github.com/zed-industries/zed/pull/33738.

Release Notes:

- N/A

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2025-07-22 02:09:05 -03:00
Danilo Leal
4476860664
Add refinements to the AI onboarding flow (#33738)
This includes making sure that both the agent panel and Zed's edit
prediction have a consistent narrative when it comes to onboarding users
into the AI features, considering the possible different plans and
conditions (such as being signed in/out, account age, etc.)

Release Notes:

- N/A

---------

Co-authored-by: Bennet Bo Fenner <53836821+bennetbo@users.noreply.github.com>
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2025-07-18 18:25:36 +02:00
Richard Feldman
d470411725
Improve upstream error reporting (#34668)
Now we handle more upstream error cases using the same auto-retry logic.

Release Notes:

- N/A
2025-07-17 18:12:48 -04:00
Richard Feldman
b4c2ae5196
Handle upstream_http_error completion responses (#34573)
Addresses upstream errors such as:
<img width="831" height="100" alt="Screenshot 2025-07-16 at 3 37 03 PM"
src="https://github.com/user-attachments/assets/2aeb0257-6761-4148-b687-25fae93c68d8"
/>

These should now automatically retry like other upstream HTTP error
codes.

Release Notes:

- N/A
2025-07-16 16:31:31 -04:00
Bennet Bo Fenner
41fe2a2ab4
agent: Disable thinking when using inline assistant/edit file tool (#34141)
This introduces a new field `thinking_allowed` on `LanguageModelRequest`
which lets us control whether thinking should be enabled if the model
supports it.
We permit thinking in the Inline Assistant, Edit File tool and the Git
Commit message generator, this should make generation faster when using
a thinking model, e.g. `claude-sonnet-4-thinking`

Release Notes:

- N/A
2025-07-09 18:05:39 +00:00
Bennet Bo Fenner
66a1c356bf
agent: Fix max token count mismatch when not using burn mode (#34025)
Closes #31854

Release Notes:

- agent: Fixed an issue where the maximum token count would be displayed
incorrectly when burn mode was not being used.
2025-07-07 23:13:24 +02:00
Bennet Bo Fenner
782fbfad90
agent: Add component preview for Zed AI configuration (#33704)
As we are in the process of improving our Onboarding UX for Zed AI, I
added component previews for the Zed AI Configuration section. This
should make it easier to inspect the different states we can run into.

<img width="1198" alt="image"
src="https://github.com/user-attachments/assets/eb774f27-9091-450d-bfae-c688d533c25e"
/>


Release Notes:

- N/A
2025-07-01 11:12:51 +00:00
Michael Sloan
d497f52e17
agent: Improve error handling and retry for zed-provided models (#33565)
* Updates to `zed_llm_client-0.8.5` which adds support for `retry_after`
when anthropic provides it.

* Distinguishes upstream provider errors and rate limits from errors
that originate from zed's servers

* Moves `LanguageModelCompletionError::BadInputJson` to
`LanguageModelCompletionEvent::ToolUseJsonParseError`. While arguably
this is an error case, the logic in thread is cleaner with this move.
There is also precedent for inclusion of errors in the event type -
`CompletionRequestStatus::Failed` is how cloud errors arrive.

* Updates `PROVIDER_ID` / `PROVIDER_NAME` constants to use proper types
instead of `&str`, since they can be constructed in a const fashion.

* Removes use of `CLIENT_SUPPORTS_EXA_WEB_SEARCH_PROVIDER_HEADER_NAME`
as the server no longer reads this header and just defaults to that
behavior.

Release notes for this is covered by #33275

Release Notes:

- N/A

---------

Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Richard <richard@zed.dev>
2025-06-30 21:01:32 -06:00
Ben Brandt
6c46e1129d
Cleanup remaining references to max mode (#33509)
Release Notes:

- N/A
2025-06-27 08:32:13 +00:00
Bennet Bo Fenner
7be57baef0
agent: Fix issue with Anthropic thinking models (#33317)
cc @osyvokon 

We were seeing a bunch of errors in our backend when people were using
Claude models with thinking enabled.

In the logs we would see
> an error occurred while interacting with the Anthropic API:
invalid_request_error: messages.x.content.0.type: Expected `thinking` or
`redacted_thinking`, but found `text`. When `thinking` is enabled, a
final `assistant` message must start with a thinking block (preceeding
the lastmost set of `tool_use` and `tool_result` blocks). We recommend
you include thinking blocks from previous turns. To avoid this
requirement, disable `thinking`. Please consult our documentation at
https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking

However, this issue did not occur frequently and was not easily
reproducible. Turns out it was triggered by us not correctly handling
[Redacted Thinking
Blocks](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#thinking-redaction).

I could constantly reproduce this issue by including this magic string:
`ANTHROPIC_MAGIC_STRING_TRIGGER_REDACTED_THINKING_46C9A13E193C177646C7398A98432ECCCE4C1253D5E2D82641AC0E52CC2876CB
` in the request, which forces `claude-3-7-sonnet` to emit redacted
thinking blocks (confusingly the magic string does not seem to be
working for `claude-sonnet-4`). As soon as we hit a tool call Anthropic
would return an error.

Thanks to @osyvokon for pointing me in the right direction 😄!


Release Notes:

- agent: Fixed an issue where Anthropic models would sometimes return an
error when thinking was enabled
2025-06-24 16:23:59 +00:00
Richard Feldman
c610ebfb03
Thread Anthropic errors into LanguageModelKnownError (#33261)
This PR is in preparation for doing automatic retries for certain
errors, e.g. Overloaded. It doesn't change behavior yet (aside from some
granularity of error messages shown to the user), but rather mostly
changes some error handling to be exhaustive enum matches instead of
`anyhow` downcasts, and leaves some comments for where the behavior
change will be in a future PR.

Release Notes:

- N/A
2025-06-23 18:48:26 +00:00
Michael Sloan
7e801dccb0
agent: Fix issues with usage display sometimes showing initially fetched usage (#33125)
Having `Thread::last_usage` as an override of the initially fetched
usage could cause the initial usage to be displayed when the current
thread is empty or in text threads. Fix is to just store last usage info
in `UserStore` and not have these overrides

Release Notes:

- Agent: Fixed request usage display to always include the most recently
known usage - there were some cases where it would show the initially
requested usage.
2025-06-20 21:28:48 +00:00
Richard Feldman
5405c2c2d3
Standardize on u64 for token counts (#32869)
Previously we were using a mix of `u32` and `usize`, e.g. `max_tokens:
usize, max_output_tokens: Option<u32>` in the same `struct`.

Although [tiktoken](https://github.com/openai/tiktoken) uses `usize`,
token counts should be consistent across targets (e.g. the same model
doesn't suddenly get a smaller context window if you're compiling for
wasm32), and these token counts could end up getting serialized using a
binary protocol, so `usize` is not the right choice for token counts.

I chose to standardize on `u64` over `u32` because we don't store many
of them (so the extra size should be insignificant) and future models
may exceed `u32::MAX` tokens.

Release Notes:

- N/A
2025-06-17 10:43:07 -04:00
Richard Feldman
cfbc2d0972
Don't spawn Anthropic telemetry event when API key is missing (#32813)
Minor refactor that I'm extracting from a branch because it can stand
alone.

- Now we no longer spawn an executor for `report_anthropic_event` if
it's just going to immediately fail due to API key being missing
- `report_anthropic_event` now takes a `String` API key instead of
`Option<String>` and the error reporting if the key is missing has been
moved to the caller.
- `report_anthropic_event` is longer coupled to `AnthropicError`,
because all it ever did was generate an `AnthropicEvent::Other`, which
in turn was then only used for `log_err` - so, can just be an
`anyhow::Result`.

Release Notes:

- N/A
2025-06-16 14:58:37 -04:00
Ben Brandt
9427833fdf
Distinguish between missing models and registries in error messages (#32678)
Consolidates configuration error handling by moving the error type and
logic from assistant_context_editor to language_model::registry.

The registry now provides a single method to check for configuration
errors, making the error handling more consistent across the agent panel
and context editor.

This also now checks if the issue is that we don't have any providers,
or if we just can't find the model.

Previously, an incorrect model name showed up as having no providers,
which is very confusing.

Release Notes:

- N/A
2025-06-13 10:31:52 +00:00
Ben Brandt
e4bd115a63
More resilient eval (#32257)
Bubbles up rate limit information so that we can retry after a certain
duration if needed higher up in the stack.

Also caps the number of concurrent evals running at once to also help.

Release Notes:

- N/A
2025-06-09 18:07:22 +00:00
Ben Brandt
4304521655
Remove unused load_model method from LanguageModelProvider (#32070)
Removes the load_model trait method and its implementations in Ollama
and LM Studio providers, along with associated preload_model functions
and unused imports.

Release Notes:

- N/A
2025-06-04 14:07:01 +00:00
Marshall Bowers
a23ee61a4b
Pass up intent with completion requests (#31710)
This PR adds a new `intent` field to completion requests to assist in
categorizing them correctly.

Release Notes:

- N/A

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2025-05-29 20:43:12 +00:00
Umesh Yadav
703ee29658
Rename Max Mode to Burn Mode throughout code and docs (#31668)
Follow up to https://github.com/zed-industries/zed/pull/31470.

I started looking at config and changed preferred_completion_mode to
burn to only find its max so made changes to align it better with
rebrand. As this is in preview build now.

This doesn't touch zed_llm_client. Only the Zed changes the code and doc
to match the new UI of burn mode. There are still more things to be
renamed, though.

Release Notes:

- N/A

---------

Signed-off-by: Umesh Yadav <git@umesh.dev>
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
2025-05-29 13:12:42 +00:00
Richard Feldman
00fd045844
Make language model deserialization more resilient (#31311)
This expands our deserialization of JSON from models to be more tolerant
of different variations that the model may send, including
capitalization, wrapping things in objects vs. being plain strings, etc.

Also when deserialization fails, it reports the entire error in the JSON
so we can see what failed to deserialize. (Previously these errors were
very unhelpful at diagnosing the problem.)

Finally, also removes the `WrappedText` variant since the custom
deserializer just turns that style of JSON into a normal `Text` variant.

Release Notes:

- N/A
2025-05-28 12:06:07 -04:00
Antonio Scandurra
4f78165ee8
Show progress as the agent locates which range it needs to edit (#31582)
Release Notes:

- Improved latency when the agent starts streaming edits.

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2025-05-28 12:32:54 +00:00
Abdelhakim Qbaich
e42cf21703
Default to fast model first for commit messages (#31385)
I was surprised to see this being done for thread summaries, but not
commit messages.

I believe it's a better default as most people would want a faster
commit message generation without spending premium requests.

Considering how the default fast model for copilot is set to the base
one, this is ideal for me (and likely many others), as opposed to
tweaking the configuration every time the base model changes.

Release Notes:

- git: Default to fast model first if not configured for generating
commit messages
2025-05-26 10:37:44 +02:00
Marshall Bowers
7fb9569c15
language_model: Remove CloudModel enum (#31322)
This PR removes the `CloudModel` enum, as it is no longer needed after
#31316.

Release Notes:

- N/A
2025-05-24 02:04:51 +00:00
Marshall Bowers
685933b5c8
language_models: Fetch Zed models from the server (#31316)
This PR updates the Zed LLM provider to fetch the available models from
the server instead of hard-coding them in the binary.

Release Notes:

- Updated the Zed provider to fetch the list of available language
models from the server.
2025-05-23 23:00:35 +00:00
Marshall Bowers
5c0b161563
Handle new refusal stop reason from Claude 4 models (#31217)
This PR adds support for handling the new [`refusal` stop
reason](https://docs.anthropic.com/en/docs/test-and-evaluate/strengthen-guardrails/handle-streaming-refusals)
from Claude 4 models.

<img width="409" alt="Screenshot 2025-05-22 at 4 31 56 PM"
src="https://github.com/user-attachments/assets/707b04f5-5a52-4a19-95d9-cbd2be2dd86f"
/>

Release Notes:

- Added handling for `"stop_reason": "refusal"` from Claude 4 models.
2025-05-22 16:56:59 -04:00
Marshall Bowers
fc78408ee4
language_model: Allow Max Mode for Claude 4 models (#31207)
This PR adds the Claude 4 models to the list of models that support Max
Mode.

Release Notes:

- Added Max Mode support for Claude 4 models.
2025-05-22 18:50:30 +00:00
Marshall Bowers
1475ace6f1
anthropic: Add support for Claude 4 (#31203)
This PR adds support for [Claude
4](https://www.anthropic.com/news/claude-4).

Release Notes:

- Added support for Claude Opus 4 and Claude Sonnet 4.

---------

Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
2025-05-22 18:09:35 +00:00
Kirill Bulatov
16366cf9f2
Use anyhow more idiomatically (#31052)
https://github.com/zed-industries/zed/issues/30972 brought up another
case where our context is not enough to track the actual source of the
issue: we get a general top-level error without inner error.

The reason for this was `.ok_or_else(|| anyhow!("failed to read HEAD
SHA"))?; ` on the top level.

The PR finally reworks the way we use anyhow to reduce such issues (or
at least make it simpler to bubble them up later in a fix).
On top of that, uses a few more anyhow methods for better readability.

* `.ok_or_else(|| anyhow!("..."))`, `map_err` and other similar error
conversion/option reporting cases are replaced with `context` and
`with_context` calls
* in addition to that, various `anyhow!("failed to do ...")` are
stripped with `.context("Doing ...")` messages instead to remove the
parasitic `failed to` text
* `anyhow::ensure!` is used instead of `if ... { return Err(...); }`
calls
* `anyhow::bail!` is used instead of `return Err(anyhow!(...));`

Release Notes:

- N/A
2025-05-20 23:06:07 +00:00
Richard Feldman
4bb04cef9d
Accept wrapped text content from LLM providers (#31048)
Some providers sometimes send `{ "type": "text", "text": ... }` instead
of just the text as a string. Now we accept those instead of erroring.

Release Notes:

- N/A
2025-05-20 20:50:02 +00:00