I don't think this makes much of a difference in current use, but this
more closely matches other providers and cleans up the "Response"
section of eval markdown output
Release Notes:
- N/A
This PR adds a notice when reaching consecutive tool use limits when
using normal mode.
Here's an example with the limit artificially lowered to 2 consecutive
tool uses:
https://github.com/user-attachments/assets/32da8d38-67de-4d6b-8f24-754d2518e5d4
Release Notes:
- agent: Added a notice when reaching consecutive tool use limits when
using a model in normal mode.
This sets us up to display queue position information to the user, once
our language model backend is updated to support request queuing.
The JSON returned by the LLM backend will need to look like this:
```json
{"queue": {"status": "queued", "position": 1}}
{"queue": {"status": "started"}}
{"event": {"THE_UPSTREAM_MODEL_PROVIDER_EVENT": "..."}}
```
Release Notes:
- N/A
---------
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
This PR changes the default fast model for the Zed provider from Claude
3.5 Haiku to Claude 3.5 Sonnet.
We don't offer Claude 3.5 Haiku to users.
Closes https://github.com/zed-industries/zed/issues/29505.
Release Notes:
- agent: Changed the default fast model for the Zed provider to Claude
3.5 Sonnet.
This PR makes it so we pass up the `mode` from the
`LanguageModelRequest` when interacting with the Zed provider instead of
passing a hard-coded value.
Release Notes:
- N/A
This PR removes the `language-models` feature flag.
This feature is already generally available, so we no longer need the
feature flag.
Release Notes:
- N/A
This PR adds the `FeatureFlag` suffix to the feature flag types that
were missing them.
This makes the names easier to search in the codebase.
Release Notes:
- N/A
This PR updates the Zed provider to use the `POST /completions`
endpoint.
There is no functional difference from `POST /completion`, but the
pluralized version reads better.
Release Notes:
- N/A
#29354 introduced a bug where we would append tool uses to the last
assistant message even if it was from a previous request.
Release Notes:
- N/A
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
This is to enable alternative streaming solutions at the application
layer. I'm not sure we really should have performed parsing of the input
at this layer. Either way I want to experiment with streaming approaches
in a separate crate on a branch, and this will help.
/cc @maxdeviant @bennetbo @rtfeldman
Closes #ISSUE
Release Notes:
- N/A
This PR wires the counting of Google AI tokens back up.
It now goes through the LLM service instead of collab's RPC.
Still only available for Zed staff.
Release Notes:
- N/A
This PR removes the `CountLanguageModelTokens` RPC message from collab.
We were only using this for Google AI models through the Zed provider
(which is only available to Zed staff).
For now we're returning `0`, but will bring back soon.
Release Notes:
- N/A
Things this doesn't currently handle:
- [x] ~testing~
- ~we really need an snapshot test that takes a vscode settings file
with all options that we support, and verifies the zed settings file you
get from importing it, both from an empty starting file or one with lots
of conflicts. that way we can open said vscode settings file in vscode
to ensure that those options all still exist in the future.~
- Discussed this, we don't think this will meaningfully protect us from
future failures, and we will just do this as a manual validation step
before merging this PR. Any imports that have meaningfully complex
translation steps should still be tested.
- [x] confirmation (right now it just clobbers your settings file
silently)
- it'd be really cool if we could show a diff multibuffer of your
current settings with the result of the vscode import and let you pick
"hunks" to keep, but that's probably too much effort for this feature,
especially given that we expect most of the people using it to have an
empty/barebones zed config when they run the import.
- [x] ~UI in the "welcome" page~
- we're planning on redoing our welcome/walkthrough experience anyways,
but in the meantime it'd be nice to conditionally show a button there if
we see a user level vscode config
- we'll add it to the UI when we land the new walkthrough experience,
for now it'll be accessible through the action
- [ ] project-specific settings
- handling translation of `.vscode/settings.json` or `.code-workspace`
settings to `.zed/settings.json` will come in a future PR, along with UI
to prompt the user for those actions when opening a project with local
vscode settings for the first time
- [ ] extension settings
- we probably want to do a best-effort pass of popular extensions like
vim and git lens
- it's also possible to look for installed/enabled extensions with `code
--list-extensions`, but we'd have to maintain some sort of mapping of
those to our settings and/or extensions
- [ ] LSP settings
- these are tricky without access to the json schemas for various
language server extensions. we could probably manage to do translations
for a couple popular languages and avoid solving it in the general case.
- [ ] platform specific settings (`[macos].blah`)
- this is blocked on #16392 which I'm hoping to address soon
- [ ] language specific settings (`[rust].foo`)
- totally doable, just haven't gotten to it yet
~We may want to put this behind some kind of flag and/or not land it
until some of the above issues are addressed, given that we expect
people to only run this importer once there's an incentive to get it
right the first time. Maybe we land it alongside a keymap importer so
you don't have to go through separate imports for those?~
We are gonna land this as-is, all these unchecked items at the bottom
will be addressed in followup PRs, so maybe don't run the importer for
now if you have a large and complex VsCode settings file you'd like to
import.
Release Notes:
- Added a VSCode settings importer, available via a
`zed::ImportVsCodeSettings` action
---------
Co-authored-by: Mikayla Maki <mikayla@zed.dev>
Co-authored-by: Kirill Bulatov <kirill@zed.dev>
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
Now we're more tolerant of invalid JSON coming back from the model
(possibly because it was incomplete and we're streaming), plus if we do
end up with invalid JSON once it has all streamed back, we report what
the malformed JSON actually was:
<img width="444" alt="Screenshot 2025-04-23 at 1 49 14 PM"
src="https://github.com/user-attachments/assets/480f5da7-869b-49f3-9ffd-8f08ccddb33d"
/>
Release Notes:
- N/A
* Adds a fast / cheaper model to providers and defaults thread
summarization to this model. Initial motivation for this was that
https://github.com/zed-industries/zed/pull/29099 would cause these
requests to fail when used with a thinking model. It doesn't seem
correct to use a thinking model for summarization.
* Skips system prompt, context, and thinking segments.
* If tool use is happening, allows 2 tool uses + one more agent response
before summarizing.
Downside of this is that there was potential for some prefix cache reuse
before, especially for title summarization (thread summarization omitted
tool results and so would not share a prefix for those). This seems fine
as these requests should typically be fairly small. Even for full thread
summarization, skipping all tool use / context should greatly reduce the
token use.
Release Notes:
- N/A
Looks like the required backend component of this was deployed.
https://github.com/zed-industries/monorepo/actions/runs/14541199197
Release Notes:
- N/A
---------
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Nathan Sobo <nathan@zed.dev>
This PR attaches the thread ID and the new prompt ID to telemetry events
for completions in the Agent panel.
Release Notes:
- N/A
---------
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
This PR updates the Agent to extract the usage information from the
response headers, if they are present.
For now we just log the information, but we'll be using this soon to
populate some UI.
Release Notes:
- N/A
Release Notes:
- Add support for OpenAI o3 and o4-mini models via OpenAI API and
Copilot Chat providers.
---------
Co-authored-by: Peter Tripp <peter@zed.dev>
See #28793, the name of the field is actually `systemInstruction` not
`systemInstructions`.
Release Notes:
- Fixed an issue where Gemini requests would fail
This PR makes it so we use more types and constants from the
`zed_llm_client` crate to avoid duplicating information.
Also updates the current usage endpoint to use limits derived from the
`Plan`.
Release Notes:
- N/A
The UI was mistakenly using the cumulative token usage for the token
counter. It will now display the last request token count, plus an
estimation of the tokens in the message editor and context entries that
haven't been sent yet.
https://github.com/user-attachments/assets/0438c501-b850-4397-9135-57214ca3c07a
Additionally, when the user edits a message, we'll display the actual
token count up to it and estimate the tokens in the new message.
Note: We don't currently estimate the delta when switching profiles. In
the future, we want to use the count tokens API to measure every part of
the request and display a breakdown.
Release Notes:
- agent: Made the token count more accurate and added back estimation of
used tokens as you type and add context.
---------
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
This PR adds an error message when the model requests limit has been
hit.
Release Notes:
- N/A
Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com>
When we do not have any tools, we want to set the `tools` field to
`None`
Release Notes:
- Fixed an issue where Gemini requests would sometimes return a Bad
Request ("Invalid argument...")
Release Notes:
- Add support for OpenAI GPT-4.1 via Copilot Chat and OpenAI API
---------
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Release Notes:
- agent: Show recommended models in the agent model selector and display
the provider in the model selector's trigger.
---------
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
Co-authored-by: Danilo Leal <67129314+danilo-leal@users.noreply.github.com>
Closes#27223
Merges: #27996, #26734, #27949
Release Notes:
- AWS Bedrock: Added advanced authentication strategies with:
- Short lived credentials with Session Tokens
- AWS Named Profile
- EC2 Identity, Pod Identity, Web Identity
- AWS Bedrock: Added Claude 3.7 Thinking support.
- AWS Bedrock: Adding Cross Region Inference for all combinations of
regions and model availability.
- Agent Beta: Added support for AWS Bedrock.
---------
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
This PR adds tool calling support for GitHub Copilot Chat models.
Currently only supports the Claude family of models.
Release Notes:
- agent: Added tool calling support for Claude models in GitHub Copilot
Chat.
---------
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
This PR disables `parallel_tool_calls` for the models that support it,
as the Agent currently expects at most one tool use per turn.
It was a bit of trial and error to figure this out. OpenAI's API
annoyingly will return an error if passing `parallel_tool_calls` to a
model that doesn't support it.
Release Notes:
- N/A
This adds a "workspace-hack" crate, see
[mozilla's](https://hg.mozilla.org/mozilla-central/file/3a265fdc9f33e5946f0ca0a04af73acd7e6d1a39/build/workspace-hack/Cargo.toml#l7)
for a concise explanation of why this is useful. For us in practice this
means that if I were to run all the tests (`cargo nextest r
--workspace`) and then `cargo r`, all the deps from the previous cargo
command will be reused. Before this PR it would rebuild many deps due to
resolving different sets of features for them. For me this frequently
caused long rebuilds when things "should" already be cached.
To avoid manually maintaining our workspace-hack crate, we will use
[cargo hakari](https://docs.rs/cargo-hakari) to update the build files
when there's a necessary change. I've added a step to CI that checks
whether the workspace-hack crate is up to date, and instructs you to
re-run `script/update-workspace-hack` when it fails.
Finally, to make sure that people can still depend on crates in our
workspace without pulling in all the workspace deps, we use a `[patch]`
section following [hakari's
instructions](https://docs.rs/cargo-hakari/0.9.36/cargo_hakari/patch_directive/index.html)
One possible followup task would be making guppy use our
`rust-toolchain.toml` instead of having to duplicate that list in its
config, I opened an issue for that upstream: guppy-rs/guppy#481.
TODO:
- [x] Fix the extension test failure
- [x] Ensure the dev dependencies aren't being unified by Hakari into
the main dependencies
- [x] Ensure that the remote-server binary continues to not depend on
LibSSL
Release Notes:
- N/A
---------
Co-authored-by: Mikayla <mikayla@zed.dev>
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
This PR removes the `use_any_tool` method from the `LanguageModel`
trait.
It was not being used anywhere, and doesn't really fit in our new tool
use story.
Release Notes:
- N/A