We're now more tolerant of invalid JSON coming back from the model
(possibly because it was incomplete and we're streaming), and if we do
end up with invalid JSON once it has all streamed back, we report what
the malformed JSON actually was:
<img width="444" alt="Screenshot 2025-04-23 at 1 49 14 PM"
src="https://github.com/user-attachments/assets/480f5da7-869b-49f3-9ffd-8f08ccddb33d"
/>
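Conceptually, the handling is something like this minimal sketch (names
hypothetical; the real parsing code differs):

```rust
use serde_json::Value;

// While streaming, a parse failure usually just means the JSON is
// incomplete, so we keep buffering; once the stream has finished, a
// failure is a real error, and we report the malformed input with it.
fn parse_streamed_json(buffer: &str, stream_done: bool) -> Result<Option<Value>, String> {
    match serde_json::from_str::<Value>(buffer) {
        Ok(value) => Ok(Some(value)),
        Err(_) if !stream_done => Ok(None), // likely incomplete; wait for more chunks
        Err(err) => Err(format!("invalid JSON from model: {err}\nreceived: {buffer}")),
    }
}
```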
Release Notes:
- N/A
* Adds a fast/cheap model to providers and defaults thread summarization
to this model. The initial motivation was that
https://github.com/zed-industries/zed/pull/29099 would cause these
requests to fail when used with a thinking model, and it doesn't seem
correct to use a thinking model for summarization.
* Skips system prompt, context, and thinking segments (a sketch of this
filtering follows below).
* If tool use is happening, allows 2 tool uses + one more agent response
before summarizing.
The downside is that there was some potential for prefix-cache reuse
before, especially for title summarization (thread summarization omitted
tool results and so would not share a prefix for those). This seems
fine, as these requests should typically be fairly small. Even for full
thread summarization, skipping all tool use / context should greatly
reduce token use.
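A hedged sketch of the filtering (the segment types here are
illustrative, not the real message model):

```rust
// Illustrative segment type; the actual thread representation is richer.
enum Segment {
    Text(String),
    Thinking(String),
    ToolUse(String),
    Context(String),
}

// Build the summarization input from plain text segments only, skipping
// system prompt, context, and thinking, as described above.
fn summarization_input(messages: &[Vec<Segment>]) -> String {
    messages
        .iter()
        .flatten()
        .filter_map(|segment| match segment {
            Segment::Text(text) => Some(text.as_str()),
            _ => None, // skip thinking, tool use, and context
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```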
Release Notes:
- N/A
Looks like the required backend component of this was deployed.
https://github.com/zed-industries/monorepo/actions/runs/14541199197
Release Notes:
- N/A
---------
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Nathan Sobo <nathan@zed.dev>
This PR attaches the thread ID and the new prompt ID to telemetry events
for completions in the Agent panel.
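The added fields look roughly like this (event shape assumed from the
description, not the actual telemetry schema):

```rust
use serde::Serialize;

// Hypothetical event shape: each completion event now carries the thread
// and prompt it belongs to, so events can be correlated downstream.
#[derive(Serialize)]
struct AgentCompletionEvent {
    thread_id: String,
    prompt_id: String,
    // ...existing fields elided...
}
```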
Release Notes:
- N/A
---------
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
This PR updates the Agent to extract the usage information from the
response headers, if they are present.
For now we just log the information, but we'll be using this soon to
populate some UI.
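Roughly this shape, hedged (the header names here are assumptions, not
the real ones):

```rust
use http::HeaderMap;

// Parse usage counts out of the response headers when present, and just
// log them for now; the header names are hypothetical.
fn log_usage(headers: &HeaderMap) {
    let parse = |name: &str| {
        headers
            .get(name)
            .and_then(|value| value.to_str().ok())
            .and_then(|value| value.parse::<u64>().ok())
    };
    if let (Some(used), Some(limit)) =
        (parse("x-model-requests-used"), parse("x-model-requests-limit"))
    {
        log::info!("model requests used: {used}/{limit}");
    }
}
```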
Release Notes:
- N/A
Release Notes:
- Add support for OpenAI o3 and o4-mini models via OpenAI API and
Copilot Chat providers.
---------
Co-authored-by: Peter Tripp <peter@zed.dev>
See #28793: the name of the field is actually `systemInstruction`, not
`systemInstructions`.
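A minimal sketch of the fix (request struct simplified):

```rust
use serde::Serialize;

#[derive(Serialize)]
struct GenerateContentRequest {
    // The rename must match Gemini's field name exactly:
    // `systemInstruction` (singular), not `systemInstructions`.
    #[serde(rename = "systemInstruction", skip_serializing_if = "Option::is_none")]
    system_instruction: Option<serde_json::Value>,
    // ...other fields elided...
}
```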
Release Notes:
- Fixed an issue where Gemini requests would fail
This PR makes it so we use more types and constants from the
`zed_llm_client` crate to avoid duplicating information.
Also updates the current usage endpoint to use limits derived from the
`Plan`.
Release Notes:
- N/A
The UI was mistakenly using the cumulative token usage for the token
counter. It will now display the last request token count, plus an
estimation of the tokens in the message editor and context entries that
haven't been sent yet.
https://github.com/user-attachments/assets/0438c501-b850-4397-9135-57214ca3c07a
Additionally, when the user edits a message, we'll display the actual
token count up to it and estimate the tokens in the new message.
Note: We don't currently estimate the delta when switching profiles. In
the future, we want to use the count tokens API to measure every part of
the request and display a breakdown.
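A minimal sketch of the counting (the names and the characters-per-token
heuristic are assumptions, not the actual implementation):

```rust
// Display the token usage reported by the last completed request, plus
// an estimate for the editor text and context that haven't been sent.
fn displayed_token_count(last_request_tokens: usize, unsent_text: &str) -> usize {
    // Crude local estimate: ~4 characters per token is a common
    // heuristic; a count-tokens API could replace this for exact numbers.
    let estimated_unsent = unsent_text.len() / 4;
    last_request_tokens + estimated_unsent
}
```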
Release Notes:
- agent: Made the token count more accurate and added back estimation of
used tokens as you type and add context.
---------
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
This PR adds an error message when the model requests limit has been
hit.
Release Notes:
- N/A
Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com>
When we do not have any tools, we want to set the `tools` field to
`None`.
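A hedged sketch of the shape of the fix (types simplified):

```rust
use serde::Serialize;

#[derive(Serialize)]
struct Tool {
    name: String,
}

#[derive(Serialize)]
struct Request {
    // Omitting the field entirely (rather than sending an empty array)
    // is what avoids Gemini's "Invalid argument" response.
    #[serde(skip_serializing_if = "Option::is_none")]
    tools: Option<Vec<Tool>>,
}

fn tools_field(tools: Vec<Tool>) -> Option<Vec<Tool>> {
    if tools.is_empty() { None } else { Some(tools) }
}
```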
Release Notes:
- Fixed an issue where Gemini requests would sometimes return a Bad
Request ("Invalid argument...")
Release Notes:
- Add support for OpenAI GPT-4.1 via Copilot Chat and OpenAI API
---------
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Release Notes:
- agent: Show recommended models in the agent model selector and display
the provider in the model selector's trigger.
---------
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
Co-authored-by: Danilo Leal <67129314+danilo-leal@users.noreply.github.com>
Closes #27223
Merges: #27996, #26734, #27949
Release Notes:
- AWS Bedrock: Added advanced authentication strategies:
  - Short-lived credentials with Session Tokens
  - AWS Named Profile
  - EC2 Identity, Pod Identity, Web Identity
- AWS Bedrock: Added Claude 3.7 Thinking support.
- AWS Bedrock: Added Cross-Region Inference for all combinations of
regions and model availability.
- Agent Beta: Added support for AWS Bedrock.
---------
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
This PR adds tool calling support for GitHub Copilot Chat models.
Currently, only the Claude family of models is supported.
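Conceptually, the gating amounts to something like this (predicate
illustrative):

```rust
// Only advertise tool support for the Claude family for now, since
// that's the only family wired up for tool calling.
fn supports_tools(model_id: &str) -> bool {
    model_id.starts_with("claude")
}
```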
Release Notes:
- agent: Added tool calling support for Claude models in GitHub Copilot
Chat.
---------
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
This PR disables `parallel_tool_calls` for the models that support it,
as the Agent currently expects at most one tool use per turn.
Figuring this out took a bit of trial and error: annoyingly, OpenAI's
API returns an error if you pass `parallel_tool_calls` to a model that
doesn't support it.
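A sketch of the resulting gating (the model list is illustrative):

```rust
// Send `parallel_tool_calls: false` only to models that accept the
// parameter, and omit the field for everything else, since OpenAI
// errors if it's passed to a model that doesn't support it.
fn parallel_tool_calls(model_id: &str) -> Option<bool> {
    let supports_parameter = matches!(model_id, "gpt-4o" | "gpt-4o-mini" | "gpt-4.1");
    if supports_parameter {
        Some(false) // the Agent expects at most one tool use per turn
    } else {
        None // leave the field out of the request body entirely
    }
}
```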
Release Notes:
- N/A
This adds a "workspace-hack" crate; see
[Mozilla's](https://hg.mozilla.org/mozilla-central/file/3a265fdc9f33e5946f0ca0a04af73acd7e6d1a39/build/workspace-hack/Cargo.toml#l7)
for a concise explanation of why this is useful. For us, in practice it
means that if I run all the tests (`cargo nextest r --workspace`) and
then `cargo r`, all the deps from the previous cargo command will be
reused. Before this PR, many deps would be rebuilt because the two
commands resolved different sets of features for them, which frequently
caused long rebuilds when things "should" already have been cached.
To avoid manually maintaining our workspace-hack crate, we will use
[cargo hakari](https://docs.rs/cargo-hakari) to update the build files
when there's a necessary change. I've added a step to CI that checks
whether the workspace-hack crate is up to date, and instructs you to
re-run `script/update-workspace-hack` when it fails.
Finally, to make sure that people can still depend on crates in our
workspace without pulling in all the workspace deps, we use a `[patch]`
section, following [hakari's
instructions](https://docs.rs/cargo-hakari/0.9.36/cargo_hakari/patch_directive/index.html).
One possible follow-up task would be making guppy use our
`rust-toolchain.toml` instead of having to duplicate that list in its
config; I opened an issue for that upstream: guppy-rs/guppy#481.
TODO:
- [x] Fix the extension test failure
- [x] Ensure the dev dependencies aren't being unified by Hakari into
the main dependencies
- [x] Ensure that the remote-server binary continues to not depend on
LibSSL
Release Notes:
- N/A
---------
Co-authored-by: Mikayla <mikayla@zed.dev>
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
This PR removes the `use_any_tool` method from the `LanguageModel`
trait.
It was not being used anywhere, and doesn't really fit in our new tool
use story.
Release Notes:
- N/A
This seems to improve the performance of `gemini-2.5-pro-exp-03-25`
significantly.
We now define a single `Tool` that has multiple `FunctionDeclaration`s,
instead of defining multiple `Tool`s, each with a single
`FunctionDeclaration`.
Oddly enough, the `flash` models seemed to work perfectly fine with the
multiple `Tool { ... }` definitions.
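A sketch of the change (types abbreviated from the Gemini API shape):

```rust
struct FunctionDeclaration {
    name: String,
}

struct Tool {
    function_declarations: Vec<FunctionDeclaration>,
}

// After: one `Tool` wrapping all the declarations.
fn tools(declarations: Vec<FunctionDeclaration>) -> Vec<Tool> {
    // Before, it was one `Tool` per declaration:
    // declarations.into_iter().map(|d| Tool { function_declarations: vec![d] }).collect()
    vec![Tool {
        function_declarations: declarations,
    }]
}
```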
Release Notes:
- N/A
Closes #25671
Release Notes:
- Added support for `claude-3-7-sonnet-thinking` in the assistant panel
---------
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Agus Zubiaga <hi@aguz.me>
This is the core change:
https://github.com/zed-industries/zed/pull/26758/files#diff-044302c0d57147af17e68a0009fee3e8dcdfb4f32c27a915e70cfa80e987f765R1052
TODO:
- [x] Use AsyncFn instead of Fn() -> Future in GPUI spawn methods
- [x] Implement it in the whole app
- [x] Implement it in the debugger
- [x] Glance at the RPC crate, and see if those boxed-future methods can
be switched over. Answer: they can't directly, as you can't make an
`AsyncFn*` into a trait object. There are ways around that, but they're
all more complex than just keeping the code as is.
- [ ] Fix platform specific code
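A hedged sketch of the signature change (GPUI's real spawn methods take
contexts and return tasks; this strips all of that away, and the
`AsyncFnOnce` version needs Rust 1.85+):

```rust
use std::future::Future;

// Before: a plain closure that returns a future.
fn spawn_before<F, Fut>(f: F)
where
    F: FnOnce() -> Fut,
    Fut: Future<Output = ()>,
{
    futures::executor::block_on(f());
}

// After: an async closure via the `AsyncFnOnce` trait. Note that
// `AsyncFn*` bounds can't be made into trait objects, which is why the
// RPC crate's boxed-future methods were left unchanged.
fn spawn_after<F>(f: F)
where
    F: AsyncFnOnce(),
{
    futures::executor::block_on(f());
}
```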
Release Notes:
- N/A
Closes#25883
This PR allows you to use Copilot Chat for the assistant without setting
Copilot as the edit prediction provider.
[copilot.webm](https://github.com/user-attachments/assets/fecfbde1-d72c-4c0c-b080-a07671fb846e)
Todos:
- [x] Remove redundant "copilot" key from settings
- [x] Do not disable copilot LSP when `edit_prediction_provider` is not
set to `copilot`
- [x] Start copilot LSP when:
- [x] `edit_prediction_provider` is set to `copilot`
- [x] Copilot sign in clicked from assistant settings
- [x] Handle the one-frame flicker after starting the LSP but before
signing in, caused by the signed-out status
- [x] Fixed this by adding an intermediate "awaiting sign-in" state to
the sign-out enum (see the sketch after this list)
- [x] Make the cancel button sign out from `copilot` (existing bug)
- [x] Make dismissing the modal sign out when not in the signed-in state
(existing bug)
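A hedged sketch of that intermediate state (variant names hypothetical;
the real status enum is larger):

```rust
// The extra variant keeps the UI from flashing "signed out" during the
// window where the LSP has started but sign-in hasn't completed yet.
enum SignInStatus {
    SignedOut,
    AwaitingSignIn, // sign-in initiated; waiting on the LSP's response
    SignedIn,
}
```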
Release Notes:
- You can now sign into Copilot from assistant settings without making
it your edit prediction provider. This is useful if you want to use
Copilot chat while keeping a different provider, like Zed, for
predictions.
- Removed the `copilot` key from `features` in settings. Use
`edit_prediction_provider` instead.