Commit graph

194 commits

Author SHA1 Message Date
Richard Feldman
00fd045844
Make language model deserialization more resilient (#31311)
This expands our deserialization of JSON from models to be more tolerant
of different variations that the model may send, including
capitalization, wrapping things in objects vs. being plain strings, etc.

Also when deserialization fails, it reports the entire error in the JSON
so we can see what failed to deserialize. (Previously these errors were
very unhelpful at diagnosing the problem.)

Finally, also removes the `WrappedText` variant since the custom
deserializer just turns that style of JSON into a normal `Text` variant.

Release Notes:

- N/A
2025-05-28 12:06:07 -04:00
Antonio Scandurra
4f78165ee8
Show progress as the agent locates which range it needs to edit (#31582)
Release Notes:

- Improved latency when the agent starts streaming edits.

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2025-05-28 12:32:54 +00:00
Abdelhakim Qbaich
e42cf21703
Default to fast model first for commit messages (#31385)
I was surprised to see this being done for thread summaries, but not
commit messages.

I believe it's a better default as most people would want a faster
commit message generation without spending premium requests.

Considering how the default fast model for copilot is set to the base
one, this is ideal for me (and likely many others), as opposed to
tweaking the configuration every time the base model changes.

Release Notes:

- git: Default to fast model first if not configured for generating
commit messages
2025-05-26 10:37:44 +02:00
Marshall Bowers
7fb9569c15
language_model: Remove CloudModel enum (#31322)
This PR removes the `CloudModel` enum, as it is no longer needed after
#31316.

Release Notes:

- N/A
2025-05-24 02:04:51 +00:00
Marshall Bowers
685933b5c8
language_models: Fetch Zed models from the server (#31316)
This PR updates the Zed LLM provider to fetch the available models from
the server instead of hard-coding them in the binary.

Release Notes:

- Updated the Zed provider to fetch the list of available language
models from the server.
2025-05-23 23:00:35 +00:00
Marshall Bowers
5c0b161563
Handle new refusal stop reason from Claude 4 models (#31217)
This PR adds support for handling the new [`refusal` stop
reason](https://docs.anthropic.com/en/docs/test-and-evaluate/strengthen-guardrails/handle-streaming-refusals)
from Claude 4 models.

<img width="409" alt="Screenshot 2025-05-22 at 4 31 56 PM"
src="https://github.com/user-attachments/assets/707b04f5-5a52-4a19-95d9-cbd2be2dd86f"
/>

Release Notes:

- Added handling for `"stop_reason": "refusal"` from Claude 4 models.
2025-05-22 16:56:59 -04:00
Marshall Bowers
fc78408ee4
language_model: Allow Max Mode for Claude 4 models (#31207)
This PR adds the Claude 4 models to the list of models that support Max
Mode.

Release Notes:

- Added Max Mode support for Claude 4 models.
2025-05-22 18:50:30 +00:00
Marshall Bowers
1475ace6f1
anthropic: Add support for Claude 4 (#31203)
This PR adds support for [Claude
4](https://www.anthropic.com/news/claude-4).

Release Notes:

- Added support for Claude Opus 4 and Claude Sonnet 4.

---------

Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
2025-05-22 18:09:35 +00:00
Kirill Bulatov
16366cf9f2
Use anyhow more idiomatically (#31052)
https://github.com/zed-industries/zed/issues/30972 brought up another
case where our context is not enough to track the actual source of the
issue: we get a general top-level error without inner error.

The reason for this was `.ok_or_else(|| anyhow!("failed to read HEAD
SHA"))?; ` on the top level.

The PR finally reworks the way we use anyhow to reduce such issues (or
at least make it simpler to bubble them up later in a fix).
On top of that, uses a few more anyhow methods for better readability.

* `.ok_or_else(|| anyhow!("..."))`, `map_err` and other similar error
conversion/option reporting cases are replaced with `context` and
`with_context` calls
* in addition to that, various `anyhow!("failed to do ...")` are
stripped with `.context("Doing ...")` messages instead to remove the
parasitic `failed to` text
* `anyhow::ensure!` is used instead of `if ... { return Err(...); }`
calls
* `anyhow::bail!` is used instead of `return Err(anyhow!(...));`

Release Notes:

- N/A
2025-05-20 23:06:07 +00:00
Richard Feldman
4bb04cef9d
Accept wrapped text content from LLM providers (#31048)
Some providers sometimes send `{ "type": "text", "text": ... }` instead
of just the text as a string. Now we accept those instead of erroring.

Release Notes:

- N/A
2025-05-20 20:50:02 +00:00
Oleksiy Syvokon
5112fcebeb
evals: Make LLMs configurable in edit_agent evals (#30813)
Release Notes:

- N/A
2025-05-16 11:10:15 +00:00
Marshall Bowers
7cad943fde
agent: Remove unused max monthly spend reached error (#30615)
This PR removes the code for showing the max monthly spend limit reached
error, as it is no longer used.

Release Notes:

- N/A
2025-05-13 09:43:13 +00:00
Richard Feldman
8fdf309a4a
Have read_file support images (#30435)
This is very basic support for them. There are a number of other TODOs
before this is really a first-class supported feature, so not adding any
release notes for it; for now, this PR just makes it so that if
read_file tries to read a PNG (which has come up in practice), it at
least correctly sends it to Anthropic instead of messing up.

This also lays the groundwork for future PRs for more first-class
support for images in tool calls across more image file formats and LLM
providers.

Release Notes:

- N/A

---------

Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
2025-05-13 10:58:00 +02:00
Umesh Yadav
a6c3d49bb9
language_models: Add vision support for Copilot Chat models (#30155)
Problem Statement:
Support for image analysis (vision) is currently restricted to Anthropic
and Gemini models. This limits users who wish to leverage vision
capabilities available in other models, such as Copilot, for tasks like
attaching image context within the agent message editor.

Proposed Change:
This PR extends vision support to include Copilot models that are
already equipped with vision capabilities. This integration will allow
users within VS Code to attach and analyze images using supported
Copilot models via the agent message editor.

Scope Limitation:

This PR does not implement controls within the message editor to ensure
that image context (e.g., through copy-paste or attachment) is
exclusively enabled or prompted only when a vision-supported model is
active. Long term the message editor should have access to each models
vision capability and stop the users from attaching images by either
greying out the context saying it's not support or not work through both
copy paste and file/directory search.

Closes #30076 

Release Notes:

- Add vision support for Copilot Chat models

---------

Co-authored-by: Bennet Bo Fenner <bennet@zed.dev>
2025-05-12 13:11:38 +00:00
Antonio Scandurra
9f6809a28d
Reuse conversation cache when streaming edits (#30245)
Release Notes:

- Improved latency when the agent applies edits.
2025-05-08 14:36:34 +02:00
Mikayla Maki
0cdd8bdded
Restore tool cards on thread deserialization (#30053)
Release Notes:

- N/A

---------

Co-authored-by: Julia Ryan <juliaryan3.14@gmail.com>
2025-05-06 18:16:34 -07:00
Max Brunsfeld
c3d9cdecab
Change cloud language model provider JSON protocol to surface errors and usage information (#29830)
Release Notes:

- N/A

---------

Co-authored-by: Nathan Sobo <nathan@zed.dev>
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2025-05-04 17:37:42 +00:00
Marshall Bowers
f0515d1c34
agent: Show a notice when reaching consecutive tool use limits (#29833)
This PR adds a notice when reaching consecutive tool use limits when
using normal mode.

Here's an example with the limit artificially lowered to 2 consecutive
tool uses:


https://github.com/user-attachments/assets/32da8d38-67de-4d6b-8f24-754d2518e5d4

Release Notes:

- agent: Added a notice when reaching consecutive tool use limits when
using a model in normal mode.
2025-05-03 02:09:54 +00:00
Max Brunsfeld
04772bf17d
Add support for queuing status updates in cloud language model provider (#29818)
This sets us up to display queue position information to the user, once
our language model backend is updated to support request queuing.

The JSON returned by the LLM backend will need to look like this:

```json
{"queue": {"status": "queued", "position": 1}}
{"queue": {"status": "started"}}
{"event": {"THE_UPSTREAM_MODEL_PROVIDER_EVENT": "..."}} 
```

Release Notes:

- N/A

---------

Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2025-05-02 20:36:39 +00:00
Bennet Bo Fenner
4812c9094b
agent: Support images via @file and the file context picker (#29596)
Release Notes:

- agent: Add support for @mentioning images
- agent: Add support for including images via file context picker

---------

Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com>
2025-04-29 16:26:27 +02:00
Max Brunsfeld
17903a0999
Associate each thread with a model (#29573)
This PR makes it possible to use different LLM models in the agent
panels of two different projects, simultaneously. It also properly
restores a thread's original model when restoring it from the history,
rather than having it use the default model. As before, newly-created
threads will use the current default model.

Release Notes:

- Enabled different project windows to use different models in the agent
panel
- Enhanced the agent panel so that when revisiting old threads, their
original model will be used.

---------

Co-authored-by: Richard Feldman <oss@rtfeldman.com>
2025-04-28 23:43:16 +00:00
Marshall Bowers
ce93961fe0
agent: Add "max mode" toggle (#29549)
This PR adds a "max mode" toggle to the Agent panel, for models that
support it.

Only visible to folks in the `new-billing` feature flag.

Icon is just a placeholder.

Release Notes:

- N/A
2025-04-28 16:50:47 +00:00
Richard Feldman
720dfee803
Treat invalid JSON in tool calls as failed tool calls (#29375)
Release Notes:

- N/A

---------

Co-authored-by: Max <max@zed.dev>
Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
2025-04-24 16:54:27 -04:00
Nathan Sobo
8836c6fb42
Introduce LanguageModelToolUse::raw_input (#29322)
This is to enable alternative streaming solutions at the application
layer. I'm not sure we really should have performed parsing of the input
at this layer. Either way I want to experiment with streaming approaches
in a separate crate on a branch, and this will help.

/cc @maxdeviant @bennetbo @rtfeldman

Closes #ISSUE

Release Notes:

- N/A
2025-04-24 02:30:48 +00:00
Bennet Bo Fenner
822b6f837d
agent: Expose web search tool to beta users (#29273)
This gives all beta users access to the web search tool

Release Notes:

- agent: Added `web_search` tool
2025-04-23 15:30:20 +00:00
Stephan Seidt
10ded0ab75
agent: Add support for google gemini 2.5 flash preview (#29205)
Adds support for the new gemini-2.5-flash-preview-04-17

Release Notes:

- agent: Added support for gemini-2.5-flash-preview
2025-04-22 09:37:12 +00:00
Bennet Bo Fenner
eca6d5a04e
agent: Support pasting images as context (#29177)
https://github.com/user-attachments/assets/d6a27b05-3590-4f40-a820-f6f99f6bd581

Release Notes:

- agent: Added support for pasting images as context

---------

Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
2025-04-22 09:01:01 +00:00
Agus Zubiaga
b14356d1d3
agent: Do not add <using_tool> placeholder (#29194)
Our provider code in `language_models` filters out messages for which
`LanguageModelRequestMessage::contents_empty` returns `false`. This
doesn't seem wrong by itself, but `contents_empty` was returning `false`
for messages whose first segment didn't contain non-whitespace text even
if they contained other non-empty segments. This caused requests to fail
when a message with a tool call didn't contain any preceding text.

Release Notes:

- N/A
2025-04-22 00:41:47 -03:00
Richard Feldman
4f2f9ff762
Streaming tool calls (#29179)
https://github.com/user-attachments/assets/7854a737-ef83-414c-b397-45122e4f32e8



Release Notes:

- Create file and edit file tools now stream their tool descriptions, so
you can see what they're doing sooner.

---------

Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2025-04-21 22:28:32 +00:00
Michael Sloan
fbf7caf93e
Default to fast model for thread summaries and titles + don't include system prompt / context / thinking segments (#29102)
* Adds a fast / cheaper model to providers and defaults thread
summarization to this model. Initial motivation for this was that
https://github.com/zed-industries/zed/pull/29099 would cause these
requests to fail when used with a thinking model. It doesn't seem
correct to use a thinking model for summarization.

* Skips system prompt, context, and thinking segments.

* If tool use is happening, allows 2 tool uses + one more agent response
before summarizing.

Downside of this is that there was potential for some prefix cache reuse
before, especially for title summarization (thread summarization omitted
tool results and so would not share a prefix for those). This seems fine
as these requests should typically be fairly small. Even for full thread
summarization, skipping all tool use / context should greatly reduce the
token use.

Release Notes:

- N/A
2025-04-19 23:26:29 +00:00
Bennet Bo Fenner
bafc086d27
agent: Preserve thinking blocks between requests (#29055)
Looks like the required backend component of this was deployed.

https://github.com/zed-industries/monorepo/actions/runs/14541199197

Release Notes:

- N/A

---------

Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Nathan Sobo <nathan@zed.dev>
2025-04-19 20:12:03 +00:00
Michael Sloan
d88b06a5dc
Simplify language model registry + only emit change events on change (#29086)
* Now only does default fallback logic in the registry

* Only emits change events when there is actually a change

Release Notes:

- N/A
2025-04-19 08:26:42 +00:00
Marshall Bowers
7abe2c9c31
agent: Attach thread ID and prompt ID to telemetry events (#29069)
This PR attaches the thread ID and the new prompt ID to telemetry events
for completions in the Agent panel.

Release Notes:

- N/A

---------

Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
2025-04-18 20:41:02 +00:00
Marshall Bowers
676cc109a3
agent: Report usage from thread summarization requests (#29012)
This PR makes it so the thread summarization also reports the model
request usage, to prevent the case where the count would appear to jump
by 2 the next time a message was sent after summarization.

Release Notes:

- N/A
2025-04-17 23:05:12 +00:00
Marshall Bowers
d93141bded
agent: Extract usage information from response headers (#29002)
This PR updates the Agent to extract the usage information from the
response headers, if they are present.

For now we just log the information, but we'll be using this soon to
populate some UI.

Release Notes:

- N/A
2025-04-17 20:11:07 +00:00
Umesh Yadav
8117940aca
Add support for OpenAI o3 and o4-mini models (#28881)
Release Notes:

- Add support for OpenAI o3 and o4-mini models via OpenAI API and
Copilot Chat providers.

---------

Co-authored-by: Peter Tripp <peter@zed.dev>
2025-04-17 10:58:41 -04:00
Agus Zubiaga
0286b8ab3e
agent: Fix conversation token usage and estimate unsent message (#28878)
The UI was mistakenly using the cumulative token usage for the token
counter. It will now display the last request token count, plus an
estimation of the tokens in the message editor and context entries that
haven't been sent yet.


https://github.com/user-attachments/assets/0438c501-b850-4397-9135-57214ca3c07a

Additionally, when the user edits a message, we'll display the actual
token count up to it and estimate the tokens in the new message.

Note: We don't currently estimate the delta when switching profiles. In
the future, we want to use the count tokens API to measure every part of
the request and display a breakdown.

Release Notes:

- agent: Made the token count more accurate and added back estimation of
used tokens as you type and add context.

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
2025-04-16 16:27:36 -03:00
Marshall Bowers
97b044acf5
proto: Add ZedProTrial to Plan (#28885)
This PR adds the `ZedProTrial` member to the `Plan` enum.

Release Notes:

- N/A
2025-04-16 18:13:00 +00:00
Marshall Bowers
cb79420773
agent: Show an error when the model requests limit has been reached (#28868)
This PR adds an error message when the model requests limit has been
hit.

Release Notes:

- N/A

Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com>
2025-04-16 15:11:35 +00:00
Bennet Bo Fenner
c381a500f8
agent: Show a warning when some tools are incompatible with the selected model (#28755)
WIP

<img width="644" alt="image"
src="https://github.com/user-attachments/assets/b24e1a57-f82e-457c-b788-1b314ade7c84"
/>


<img width="644" alt="image"
src="https://github.com/user-attachments/assets/b158953c-2015-4cc8-b8ed-35c6fcbe162d"
/>


Release Notes:

- agent: Improve compatibility with Gemini Tool Calling APIs. When a
tool is incompatible with the Gemini APIs a warning indicator will be
displayed. Incompatible tools will be automatically excluded from the
conversation

---------

Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
2025-04-15 16:58:11 +00:00
Michael Sloan
6b80eb556c
Add judge to new eval + provide LSP diagnostics (#28713)
Release Notes:

- N/A

---------

Co-authored-by: Antonio Scandurra <antonio@zed.dev>
Co-authored-by: agus <agus@zed.dev>
2025-04-14 20:18:47 +00:00
Umesh Yadav
84aa480344
Add support for OpenAI GPT-4.1 models (#28708)
Release Notes:

- Add support for OpenAI GPT-4.1 via Copilot Chat and OpenAI API

---------

Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2025-04-14 16:15:59 -03:00
Agus Zubiaga
b45230784d
agent: Handle context window exceeded errors from Anthropic (#28688)
![CleanShot 2025-04-14 at 11 15
38@2x](https://github.com/user-attachments/assets/9e803ffb-74fd-486b-bebc-2155a407a9fa)

Release Notes:

- agent: Handle context window exceeded errors from Anthropic
2025-04-14 14:39:33 +00:00
Bennet Bo Fenner
b22faf96e0
agent: Refine language model selector (#28597)
Release Notes:

- agent: Show recommended models in the agent model selector and display
the provider in the model selector's trigger.

---------

Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
Co-authored-by: Danilo Leal <67129314+danilo-leal@users.noreply.github.com>
2025-04-11 23:02:50 +00:00
Bennet Bo Fenner
97abf21a28
agent: Add support for Google Gemini 2.5 preview (#28326)
Adds support for the new `gemini-2.5-pro-preview-03-25`

Release Notes:

- Added support for `gemini-2.5-pro-preview-03-25` in the assistant
2025-04-08 15:00:23 +00:00
Marshall Bowers
03aadb4e5b
telemetry_events: Rename AssistantEvent to AssistantEventData (#28133)
This PR renames the `AssistantEvent` type to `AssistantEventData`, as it
no longer represents the event itself, just the data needed to construct
it.

Pulling out of https://github.com/zed-industries/zed/pull/25179.

Release Notes:

- N/A
2025-04-04 19:28:32 -04:00
Agus Zubiaga
43cb925a59
ai: Separate model settings for each feature (#28088)
Closes: https://github.com/zed-industries/zed/issues/20582

Allows users to select a specific model for each AI-powered feature:
- Agent panel
- Inline assistant
- Thread summarization
- Commit message generation

If unspecified for a given feature, it will use the `default_model`
setting.

Release Notes:

- Added support for configuring a specific model for each AI-powered
feature

---------

Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2025-04-04 11:40:55 -03:00
Danilo Leal
b9724d9cbe
agent: Add token count in the thread view (#28037)
This PR adds the token count to the active thread view. It doesn't
behaves quite like Assistant 1 where it updates as you type, though; it
updates after you submit the message.

<img
src="https://github.com/user-attachments/assets/82d2a180-554a-43ee-b776-3743359b609b"
width="700" />

---

Release Notes:

- agent: Add token count in the thread view

---------

Co-authored-by: Agus Zubiaga <hi@aguz.me>
2025-04-03 15:43:58 -03:00
Julia Ryan
01ec6e0f77
Add workspace-hack (#27277)
This adds a "workspace-hack" crate, see
[mozilla's](https://hg.mozilla.org/mozilla-central/file/3a265fdc9f33e5946f0ca0a04af73acd7e6d1a39/build/workspace-hack/Cargo.toml#l7)
for a concise explanation of why this is useful. For us in practice this
means that if I were to run all the tests (`cargo nextest r
--workspace`) and then `cargo r`, all the deps from the previous cargo
command will be reused. Before this PR it would rebuild many deps due to
resolving different sets of features for them. For me this frequently
caused long rebuilds when things "should" already be cached.

To avoid manually maintaining our workspace-hack crate, we will use
[cargo hakari](https://docs.rs/cargo-hakari) to update the build files
when there's a necessary change. I've added a step to CI that checks
whether the workspace-hack crate is up to date, and instructs you to
re-run `script/update-workspace-hack` when it fails.

Finally, to make sure that people can still depend on crates in our
workspace without pulling in all the workspace deps, we use a `[patch]`
section following [hakari's
instructions](https://docs.rs/cargo-hakari/0.9.36/cargo_hakari/patch_directive/index.html)

One possible followup task would be making guppy use our
`rust-toolchain.toml` instead of having to duplicate that list in its
config, I opened an issue for that upstream: guppy-rs/guppy#481.

TODO:
- [x] Fix the extension test failure
- [x] Ensure the dev dependencies aren't being unified by Hakari into
the main dependencies
- [x] Ensure that the remote-server binary continues to not depend on
LibSSL

Release Notes:

- N/A

---------

Co-authored-by: Mikayla <mikayla@zed.dev>
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
2025-04-02 13:26:34 -07:00
Marshall Bowers
889bc13b7d
language_model: Remove use_any_tool method from LanguageModel (#27930)
This PR removes the `use_any_tool` method from the `LanguageModel`
trait.

It was not being used anywhere, and doesn't really fit in our new tool
use story.

Release Notes:

- N/A
2025-04-02 15:49:21 +00:00