Commit graph

87 commits

Author SHA1 Message Date
Marshall Bowers
5a70f2131c Update Agent panel to work with CloudUserStore (#35436)
This PR updates the Agent panel to work with the `CloudUserStore`
instead of the `UserStore`, reducing its reliance on being connected to
Collab to function.

Release Notes:

- N/A

---------

Co-authored-by: Richard Feldman <oss@rtfeldman.com>
2025-08-08 10:17:56 -04:00
Marshall Bowers
7be1f2418d
Replace zed_llm_client with cloud_llm_client (#35309)
This PR replaces the usage of the `zed_llm_client` with the
`cloud_llm_client`.

It was ported into this repo in #35307.

Release Notes:

- N/A
2025-07-30 00:09:14 +00:00
Michael Sloan
65250fe08d
cloud provider: Use CompletionEvent type from zed_llm_client (#35285)
Release Notes:

- N/A
2025-07-29 17:28:18 +00:00
Danilo Leal
29332c1962
ai onboarding: Add overall fixes to the whole flow (#34996)
Closes https://github.com/zed-industries/zed/issues/34979

Release Notes:

- N/A

---------

Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Ben Kunkle <Ben.kunkle@gmail.com>
2025-07-24 11:26:15 -03:00
Danilo Leal
eaccd542fd
Add fast-follows to the AI onboarding flow (#34737)
Follow-up to https://github.com/zed-industries/zed/pull/33738.

Release Notes:

- N/A

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2025-07-22 02:09:05 -03:00
Danilo Leal
4476860664
Add refinements to the AI onboarding flow (#33738)
This includes making sure that both the agent panel and Zed's edit
prediction have a consistent narrative when it comes to onboarding users
into the AI features, considering the possible different plans and
conditions (such as being signed in/out, account age, etc.)

Release Notes:

- N/A

---------

Co-authored-by: Bennet Bo Fenner <53836821+bennetbo@users.noreply.github.com>
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2025-07-18 18:25:36 +02:00
Richard Feldman
d470411725
Improve upstream error reporting (#34668)
Now we handle more upstream error cases using the same auto-retry logic.

Release Notes:

- N/A
2025-07-17 18:12:48 -04:00
Marshall Bowers
eca36c502e
Route all LLM traffic through cloud.zed.dev (#34404)
This PR makes it so all LLM traffic is routed through `cloud.zed.dev`.

We're already routing `llm.zed.dev` to `cloud.zed.dev` on the server,
but we want to standardize on `cloud.zed.dev` moving forward.

Release Notes:

- N/A
2025-07-14 16:03:19 +00:00
Marshall Bowers
cfc9cfa4ab
language_models: Refresh the list of models when the LLM token is refreshed (#34222)
This PR makes it so we refresh the list of models whenever the LLM token
is refreshed.

This allows us to add or remove models based on the plan in the new
token.

Release Notes:

- Fixed model list not refreshing when subscribing to Zed Pro.

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2025-07-10 17:05:41 +00:00
Bennet Bo Fenner
41fe2a2ab4
agent: Disable thinking when using inline assistant/edit file tool (#34141)
This introduces a new field `thinking_allowed` on `LanguageModelRequest`
which lets us control whether thinking should be enabled if the model
supports it.
We permit thinking in the Inline Assistant, Edit File tool and the Git
Commit message generator, this should make generation faster when using
a thinking model, e.g. `claude-sonnet-4-thinking`

Release Notes:

- N/A
2025-07-09 18:05:39 +00:00
Marshall Bowers
1220049089
Add feature flag to use cloud.zed.dev instead of llm.zed.dev (#34076)
This PR adds a new `zed-cloud` feature flag that can be used to send
traffic to `cloud.zed.dev` instead of `llm.zed.dev`.

This is just so Zed staff can test the new infrastructure. When we're
ready for prime-time we'll reroute traffic on the server.

Release Notes:

- N/A
2025-07-08 18:44:51 +00:00
Bennet Bo Fenner
66a1c356bf
agent: Fix max token count mismatch when not using burn mode (#34025)
Closes #31854

Release Notes:

- agent: Fixed an issue where the maximum token count would be displayed
incorrectly when burn mode was not being used.
2025-07-07 23:13:24 +02:00
Marshall Bowers
52c42125a7
language_models: Fix casing of ZedAiConfiguration (#33712)
This PR fixes the casing of the `ZedAiConfiguration` identifier.

Release Notes:

- N/A
2025-07-01 13:29:43 +00:00
Bennet Bo Fenner
0629804390
agent: Clarify upgrade path when starting trial (#33706)
Release Notes:

- N/A
2025-07-01 11:32:14 +00:00
Bennet Bo Fenner
782fbfad90
agent: Add component preview for Zed AI configuration (#33704)
As we are in the process of improving our Onboarding UX for Zed AI, I
added component previews for the Zed AI Configuration section. This
should make it easier to inspect the different states we can run into.

<img width="1198" alt="image"
src="https://github.com/user-attachments/assets/eb774f27-9091-450d-bfae-c688d533c25e"
/>


Release Notes:

- N/A
2025-07-01 11:12:51 +00:00
Michael Sloan
2ee5bedfa9
agent: Only consider zed provider authenticated if TOS is accepted (#33693)
Also now auto-expands the zed provider section when TOS is not accepted

Release Notes:

- N/A
2025-07-01 04:51:32 +00:00
Michael Sloan
d497f52e17
agent: Improve error handling and retry for zed-provided models (#33565)
* Updates to `zed_llm_client-0.8.5` which adds support for `retry_after`
when anthropic provides it.

* Distinguishes upstream provider errors and rate limits from errors
that originate from zed's servers

* Moves `LanguageModelCompletionError::BadInputJson` to
`LanguageModelCompletionEvent::ToolUseJsonParseError`. While arguably
this is an error case, the logic in thread is cleaner with this move.
There is also precedent for inclusion of errors in the event type -
`CompletionRequestStatus::Failed` is how cloud errors arrive.

* Updates `PROVIDER_ID` / `PROVIDER_NAME` constants to use proper types
instead of `&str`, since they can be constructed in a const fashion.

* Removes use of `CLIENT_SUPPORTS_EXA_WEB_SEARCH_PROVIDER_HEADER_NAME`
as the server no longer reads this header and just defaults to that
behavior.

Release notes for this is covered by #33275

Release Notes:

- N/A

---------

Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Richard <richard@zed.dev>
2025-06-30 21:01:32 -06:00
Ben Brandt
6c46e1129d
Cleanup remaining references to max mode (#33509)
Release Notes:

- N/A
2025-06-27 08:32:13 +00:00
Bennet Bo Fenner
18f1221a44
vercel: Reuse existing OpenAI code (#33362)
Follow up to #33292

Since Vercel's API is OpenAI compatible, we can reuse a bunch of code.

Release Notes:

- N/A
2025-06-25 15:04:43 +02:00
Michael Sloan
7e801dccb0
agent: Fix issues with usage display sometimes showing initially fetched usage (#33125)
Having `Thread::last_usage` as an override of the initially fetched
usage could cause the initial usage to be displayed when the current
thread is empty or in text threads. Fix is to just store last usage info
in `UserStore` and not have these overrides

Release Notes:

- Agent: Fixed request usage display to always include the most recently
known usage - there were some cases where it would show the initially
requested usage.
2025-06-20 21:28:48 +00:00
Richard Feldman
5405c2c2d3
Standardize on u64 for token counts (#32869)
Previously we were using a mix of `u32` and `usize`, e.g. `max_tokens:
usize, max_output_tokens: Option<u32>` in the same `struct`.

Although [tiktoken](https://github.com/openai/tiktoken) uses `usize`,
token counts should be consistent across targets (e.g. the same model
doesn't suddenly get a smaller context window if you're compiling for
wasm32), and these token counts could end up getting serialized using a
binary protocol, so `usize` is not the right choice for token counts.

I chose to standardize on `u64` over `u32` because we don't store many
of them (so the extra size should be insignificant) and future models
may exceed `u32::MAX` tokens.

Release Notes:

- N/A
2025-06-17 10:43:07 -04:00
Ben Brandt
e4bd115a63
More resilient eval (#32257)
Bubbles up rate limit information so that we can retry after a certain
duration if needed higher up in the stack.

Also caps the number of concurrent evals running at once to also help.

Release Notes:

- N/A
2025-06-09 18:07:22 +00:00
90aca
cf931247d0
Add thinking budget for Gemini custom models (#31251)
Closes #31243

As described in my issue, the [thinking
budget](https://ai.google.dev/gemini-api/docs/thinking) gets
automatically chosen by Gemini unless it is specifically set to
something. In order to have fast responses (inline assistant) I prefer
to set it to 0.

Release Notes:

- ai: Added `thinking` mode for custom Google models with configurable
token budget

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2025-06-03 13:40:20 +02:00
Marshall Bowers
a23ee61a4b
Pass up intent with completion requests (#31710)
This PR adds a new `intent` field to completion requests to assist in
categorizing them correctly.

Release Notes:

- N/A

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2025-05-29 20:43:12 +00:00
Marshall Bowers
685933b5c8
language_models: Fetch Zed models from the server (#31316)
This PR updates the Zed LLM provider to fetch the available models from
the server instead of hard-coding them in the binary.

Release Notes:

- Updated the Zed provider to fetch the list of available language
models from the server.
2025-05-23 23:00:35 +00:00
Marshall Bowers
37047a6fde
language_models: Update default/recommended Anthropic models to Claude Sonnet 4 (#31209)
This PR updates the default/recommended models for the Anthropic and Zed
providers to be Claude Sonnet 4.

Release Notes:

- Updated default/recommended Anthropic models to Claude Sonnet 4.
2025-05-22 19:10:08 +00:00
Marshall Bowers
1475ace6f1
anthropic: Add support for Claude 4 (#31203)
This PR adds support for [Claude
4](https://www.anthropic.com/news/claude-4).

Release Notes:

- Added support for Claude Opus 4 and Claude Sonnet 4.

---------

Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
2025-05-22 18:09:35 +00:00
Kirill Bulatov
16366cf9f2
Use anyhow more idiomatically (#31052)
https://github.com/zed-industries/zed/issues/30972 brought up another
case where our context is not enough to track the actual source of the
issue: we get a general top-level error without inner error.

The reason for this was `.ok_or_else(|| anyhow!("failed to read HEAD
SHA"))?; ` on the top level.

The PR finally reworks the way we use anyhow to reduce such issues (or
at least make it simpler to bubble them up later in a fix).
On top of that, uses a few more anyhow methods for better readability.

* `.ok_or_else(|| anyhow!("..."))`, `map_err` and other similar error
conversion/option reporting cases are replaced with `context` and
`with_context` calls
* in addition to that, various `anyhow!("failed to do ...")` are
stripped with `.context("Doing ...")` messages instead to remove the
parasitic `failed to` text
* `anyhow::ensure!` is used instead of `if ... { return Err(...); }`
calls
* `anyhow::bail!` is used instead of `return Err(anyhow!(...));`

Release Notes:

- N/A
2025-05-20 23:06:07 +00:00
Marshall Bowers
7cad943fde
agent: Remove unused max monthly spend reached error (#30615)
This PR removes the code for showing the max monthly spend limit reached
error, as it is no longer used.

Release Notes:

- N/A
2025-05-13 09:43:13 +00:00
Richard Feldman
8fdf309a4a
Have read_file support images (#30435)
This is very basic support for them. There are a number of other TODOs
before this is really a first-class supported feature, so not adding any
release notes for it; for now, this PR just makes it so that if
read_file tries to read a PNG (which has come up in practice), it at
least correctly sends it to Anthropic instead of messing up.

This also lays the groundwork for future PRs for more first-class
support for images in tool calls across more image file formats and LLM
providers.

Release Notes:

- N/A

---------

Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
2025-05-13 10:58:00 +02:00
Kirill Bulatov
471e02d48f
Separate timeout and connection dropped errors out (#30457) 2025-05-10 15:12:58 +03:00
Marshall Bowers
f29c6e5661
Update zed_llm_client to v0.8.1 (#30433)
This PR updates the `zed_llm_client` crate to v0.8.1.

The name of `Plan::Free` changed to `Plan::ZedFree` in this version.

Release Notes:

- N/A
2025-05-09 21:08:03 +00:00
Marshall Bowers
f21780cef3
Remove individual URL overrides for LLM service (#30290)
This PR removes the individual URL overrides for the LLM service.

We initially had `ZED_PREDICT_EDITS_URL` to allow for directing traffic
to the LLM Worker back when there was still the split of the
Collab-based LLM Service and the Cloudflare-based LLM Worker.

But now that all of the LLM functionality has been moved into the
Worker, we can just direct all traffic there.

Release Notes:

- N/A
2025-05-08 17:54:46 +00:00
Marshall Bowers
b343a8aa22
language_models: Improve subscription states in the Agent configuration view (#30252)
This PR improves the subscription states in the Agent configuration view
to the new billing system.

Zed Free (legacy):

<img width="638" alt="Screenshot 2025-05-08 at 8 42 59 AM"
src="https://github.com/user-attachments/assets/7b62d4c1-2a9c-4c6a-aa8f-060730b6d7b3"
/>

Zed Free (new):

<img width="640" alt="Screenshot 2025-05-08 at 8 43 56 AM"
src="https://github.com/user-attachments/assets/8a48448e-813e-4633-955d-623d3e6d603c"
/>

Zed Pro trial:

<img width="641" alt="Screenshot 2025-05-08 at 8 45 52 AM"
src="https://github.com/user-attachments/assets/1ec7ee62-e954-48e7-8447-4584527307c9"
/>

Zed Pro:

<img width="636" alt="Screenshot 2025-05-08 at 8 47 21 AM"
src="https://github.com/user-attachments/assets/f934b2e3-0943-4b78-b8dc-0a31e731d8fb"
/>

Release Notes:

- agent: Improved the subscription-related information in the
configuration view.
2025-05-08 09:10:50 -04:00
Antonio Scandurra
9f6809a28d
Reuse conversation cache when streaming edits (#30245)
Release Notes:

- Improved latency when the agent applies edits.
2025-05-08 14:36:34 +02:00
Marshall Bowers
4469b7339f
language_models: Update copy for Zed Pro subscription (#30152)
This PR updates the copy around the Zed Pro description to be more
accurate.

Release Notes:

- agent: Updated some copy about Zed Pro in the configuration view.
2025-05-07 17:15:02 +00:00
Marshall Bowers
a34fb6f6b1
Send up Zed version with edit prediction and completion requests (#30136)
This PR makes it so we send up an `x-zed-version` header with the
client's version when making a request to llm.zed.dev for edit
predictions and completions.

Release Notes:

- N/A
2025-05-07 15:44:30 +00:00
Michael Sloan
76ad1a29a5
Add support for getting the token count for all parts of Gemini generation requests (#29630)
* `CountTokensRequest` now takes a full `GenerateContentRequest` instead
of just content.

* Fixes use of `models/` prefix in `model` field of
`GenerateContentRequest`, since that's required for use in
`CountTokensRequest`. This didn't cause issues before because it was
always cleared and used in the path.

Release Notes:

- N/A
2025-05-04 21:32:45 +00:00
Michael Sloan
f4e9ea3cd8
In error text of cloud LLM API: completion failed -> request failed (#29888)
This error is used for more requests than completion requests

Release Notes:

- N/A
2025-05-04 21:04:34 +00:00
Max Brunsfeld
c3d9cdecab
Change cloud language model provider JSON protocol to surface errors and usage information (#29830)
Release Notes:

- N/A

---------

Co-authored-by: Nathan Sobo <nathan@zed.dev>
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2025-05-04 17:37:42 +00:00
Marshall Bowers
f0515d1c34
agent: Show a notice when reaching consecutive tool use limits (#29833)
This PR adds a notice when reaching consecutive tool use limits when
using normal mode.

Here's an example with the limit artificially lowered to 2 consecutive
tool uses:


https://github.com/user-attachments/assets/32da8d38-67de-4d6b-8f24-754d2518e5d4

Release Notes:

- agent: Added a notice when reaching consecutive tool use limits when
using a model in normal mode.
2025-05-03 02:09:54 +00:00
Max Brunsfeld
04772bf17d
Add support for queuing status updates in cloud language model provider (#29818)
This sets us up to display queue position information to the user, once
our language model backend is updated to support request queuing.

The JSON returned by the LLM backend will need to look like this:

```json
{"queue": {"status": "queued", "position": 1}}
{"queue": {"status": "started"}}
{"event": {"THE_UPSTREAM_MODEL_PROVIDER_EVENT": "..."}} 
```

Release Notes:

- N/A

---------

Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2025-05-02 20:36:39 +00:00
Marshall Bowers
b2df395918
language_models: Change default fast model for Zed provider (#29600)
This PR changes the default fast model for the Zed provider from Claude
3.5 Haiku to Claude 3.5 Sonnet.

We don't offer Claude 3.5 Haiku to users.

Closes https://github.com/zed-industries/zed/issues/29505.

Release Notes:

- agent: Changed the default fast model for the Zed provider to Claude
3.5 Sonnet.
2025-04-29 14:46:27 +00:00
Marshall Bowers
cd86905ebe
language_models: Pass up mode from the LanguageModelRequest (#29552)
This PR makes it so we pass up the `mode` from the
`LanguageModelRequest` when interacting with the Zed provider instead of
passing a hard-coded value.

Release Notes:

- N/A
2025-04-28 17:38:55 +00:00
Marshall Bowers
187f851613
feature_flags: Add FeatureFlag suffix to feature flag types (#29392)
This PR adds the `FeatureFlag` suffix to the feature flag types that
were missing them.

This makes the names easier to search in the codebase.

Release Notes:

- N/A
2025-04-25 04:07:49 +00:00
Marshall Bowers
6bb6be826d
language_models: Use POST /completions endpoint for Zed provider (#29389)
This PR updates the Zed provider to use the `POST /completions`
endpoint.

There is no functional difference from `POST /completion`, but the
pluralized version reads better.

Release Notes:

- N/A
2025-04-25 02:58:02 +00:00
Richard Feldman
720dfee803
Treat invalid JSON in tool calls as failed tool calls (#29375)
Release Notes:

- N/A

---------

Co-authored-by: Max <max@zed.dev>
Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
2025-04-24 16:54:27 -04:00
Marshall Bowers
fef2681cfa
language_models: Count Google AI tokens through LLM service (#29319)
This PR wires the counting of Google AI tokens back up.

It now goes through the LLM service instead of collab's RPC.

Still only available for Zed staff.

Release Notes:

- N/A
2025-04-24 01:21:53 +00:00
Marshall Bowers
74442b68ea
collab: Remove CountLanguageModelTokens RPC message (#29314)
This PR removes the `CountLanguageModelTokens` RPC message from collab.

We were only using this for Google AI models through the Zed provider
(which is only available to Zed staff).

For now we're returning `0`, but will bring back soon.

Release Notes:

- N/A
2025-04-23 23:10:47 +00:00
Marshall Bowers
92e810bfec
language_models: Pass up mode for completion requests through Zed (#29294)
This PR makes it so we pass up the `mode` for completion requests
through the Zed provider.

Release Notes:

- N/A
2025-04-23 18:02:03 +00:00