This PR makes it so that all kinds of @-mentions start loading their
context as soon as they are confirmed. Previously, we were waiting to
load the context for file, symbol, selection, and rule mentions until
the user's message was sent. By kicking off loading immediately for
these kinds of context, we can support adding selections from unsaved
buffers, and we make the semantics of @-mentions more consistent.
Loading all kinds of context eagerly also makes it possible to simplify
the structure of the MentionSet and the code around it. Now MentionSet
is just a single hash map, all the management of creases happens in a
uniform way in `MessageEditor::confirm_completion`, and the helper
methods for loading different kinds of context are much more focused and
orthogonal.
Release Notes:
- N/A
---------
Co-authored-by: Conrad <conrad@zed.dev>
This pull request should be idempotent, but lays the groundwork for
no longer needing to connect to Collab in order to interact with AI
features provided by Zed.
Release Notes:
- N/A
---------
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
This PR updates the Agent panel to work with the `CloudUserStore`
instead of the `UserStore`, reducing its reliance on being connected to
Collab to function.
Release Notes:
- N/A
---------
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Closes #33792
Follow up to #33237 - Turns out my fix for this was not correct
Release Notes:
- agent: Fixed an issue where tools would not work when two MCP servers
provided a tool with the same name
cc @osyvokon
We were seeing a bunch of errors in our backend when people were using
Claude models with thinking enabled.
In the logs we would see
> an error occurred while interacting with the Anthropic API:
invalid_request_error: messages.x.content.0.type: Expected `thinking` or
`redacted_thinking`, but found `text`. When `thinking` is enabled, a
final `assistant` message must start with a thinking block (preceeding
the lastmost set of `tool_use` and `tool_result` blocks). We recommend
you include thinking blocks from previous turns. To avoid this
requirement, disable `thinking`. Please consult our documentation at
https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking
However, this issue did not occur frequently and was not easily
reproducible. Turns out it was triggered by us not correctly handling
[Redacted Thinking
Blocks](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#thinking-redaction).
I could consistently reproduce this issue by including this magic string:
`ANTHROPIC_MAGIC_STRING_TRIGGER_REDACTED_THINKING_46C9A13E193C177646C7398A98432ECCCE4C1253D5E2D82641AC0E52CC2876CB
` in the request, which forces `claude-3-7-sonnet` to emit redacted
thinking blocks (confusingly the magic string does not seem to be
working for `claude-sonnet-4`). As soon as we hit a tool call Anthropic
would return an error.
Thanks to @osyvokon for pointing me in the right direction 😄!
Release Notes:
- agent: Fixed an issue where Anthropic models would sometimes return an
error when thinking was enabled
This breaks a transitive dependency of `agent` on UI crates. I've also
found and eliminated some dead code in assistant_context_editor.
Release Notes:
- N/A
This PR moves the UI-dependent logic in the `agent` crate into its own
crate, `agent_ui`. The remaining `agent` crate no longer depends on
`editor`, `picker`, `ui`, `workspace`, etc.
This has compile time benefits, but the main motivation is to isolate
our core agentic logic, so that we can make agents more
pluggable/configurable.
Release Notes:
- N/A
Some MCP servers expose tools that take absolute paths as arguments. To
interact with these, the agent needs to know the absolute path to the
project directories, not just their names. This PR changes the system
prompt to include the full path to each worktree, and updates some tool
descriptions to reflect this.
Todo:
* [x] Run evals, make sure the assistant still understands how to specify
paths for tools, now that we include abs paths in the system prompt.
Release Notes:
- Improved the agent's ability to use MCP tools that require absolute
paths to files and directories in the project.
---------
Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
This was causing a lot of work on startup, particularly due to
instantiating edit tool cards. The minor downside is that now these
threads don't open quite as fast.
Includes a few other improvements:
* On text thread rename, now immediately updates the metadata for
display in the UI instead of waiting for reload.
* On text thread rename, first renames the file before writing.
Previously, if the file removal failed, you'd end up with a duplicate.
* Now only stores text thread file names instead of full paths. This is
more concise and allows for the app data dir changing location.
* Renames `ThreadStore::unordered_threads` to
`ThreadStore::reverse_chronological_threads` (and removes the old one
that sorted), since the recent change to use a SQL database queries them
in that order.
* Removes `ContextStore::reverse_chronological_contexts` since it was
only used in one location where it does sorting anyway - no need to sort
twice.
* `SavedContextMetadata::title` is now `SharedString` instead of
`String`.
Release Notes:
- Fixed regression in startup performance by not deserializing and
instantiating recently opened agent threads.
This changes the context server crate so that the input/output for a
request are encoded at the type level, similar to how it is done for LSP
requests.
This also makes it easier to write tests that mock context servers, e.g.
you can write something like this now when using the `test-support`
feature of the `context-server` crate:
```rust
create_fake_transport("mcp-1", cx.background_executor())
    .on_request::<context_server::types::request::PromptsList>(|_params| {
        PromptsListResponse {
            prompts: vec![/* some prompts */],
            ..
        }
    })
```
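As a rough illustration of what "encoded at the type level" means here,
the sketch below ties each request to its params and response through
associated types. It is not the `context-server` crate's actual API: the
`Request` trait, the shapes of `PromptsList`, `PromptsListParams`, and
`PromptsListResponse`, and the `dispatch` helper are all assumptions made
for this example.

```rust
/// Hypothetical trait: every request type names its own input and output.
pub trait Request {
    type Params;
    type Response;
    /// Wire-level method name for this request.
    const METHOD: &'static str;
}

/// Illustrative marker type for a "list prompts" request (assumed shape).
pub struct PromptsList;

#[derive(Debug, Default, PartialEq)]
pub struct PromptsListParams;

#[derive(Debug, Default, PartialEq)]
pub struct PromptsListResponse {
    pub prompts: Vec<String>,
}

impl Request for PromptsList {
    type Params = PromptsListParams;
    type Response = PromptsListResponse;
    const METHOD: &'static str = "prompts/list";
}

/// Hypothetical helper: a handler for `R` must accept `R::Params` and
/// return `R::Response`, so a mismatched payload is a compile error.
pub fn dispatch<R, H>(params: R::Params, handler: H) -> (&'static str, R::Response)
where
    R: Request,
    H: Fn(R::Params) -> R::Response,
{
    (R::METHOD, handler(params))
}
```

With this shape, a test double only has to supply a closure of the right
type for a given request, which is the property the mock above relies on.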
Release Notes:
- N/A
This allows storing the profile per thread, as well as moving the logic
of which tools are enabled or not to the profile itself.
This makes it much easier to switch between profiles, and means there is
less global state being changed on every profile change.
Release Notes:
- agent panel: allow saving the profile per thread
---------
Co-authored-by: Ben Kunkle <ben.kunkle@gmail.com>
These started to be used more recently, so we should also support them.
Release Notes:
- agent: Added support for `AGENT.md` and `AGENTS.md` as rules file
names.
Previously, LMDB was used for storing threads, but it consumed excessive
disk space and was capped at 1GB.
This change migrates thread storage to an SQLite database. Thread JSON
objects are now compressed using zstd.
I considered training a custom zstd dictionary and storing it in a
separate table. However, the additional complexity outweighed the modest
space savings (up to 20%). I ended up using the default dictionary
stored with data.
Threads can be exported relatively easily from outside the application:
```
$ sqlite3 threads.db "SELECT hex(data) FROM threads LIMIT 5;" |
xxd -r -p |
zstd -d |
fx
```
Benchmarks:
- Original heed database: 200MB
- SQLite uncompressed: 51MB
- SQLite compressed (this PR): 4.0MB
- SQLite compressed with a trained dictionary: 3.8MB
Release Notes:
- Migrated thread storage to SQLite with compression
This PR improves the consecutive tool call UX by allowing users to
quickly continue an interrupted thread with one click. What we do here
is insert a hidden "Continue" message that will just nudge the LLM to
keep going. We're also using the opportunity to upsell the previously
called "Max Mode", now rebranded as "Burn Mode", which lets users avoid
being interrupted if they ever hit 25 consecutive tool calls again.
Release Notes:
- agent: Improve consecutive tool call UX by allowing users to quickly
continue an interrupted thread with one click.
---------
Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
https://github.com/zed-industries/zed/issues/30972 brought up another
case where our context is not enough to track the actual source of the
issue: we get a general top-level error without inner error.
The reason for this was `.ok_or_else(|| anyhow!("failed to read HEAD
SHA"))?; ` on the top level.
The PR finally reworks the way we use anyhow to reduce such issues (or
at least make it simpler to bubble them up later in a fix).
On top of that, uses a few more anyhow methods for better readability.
* `.ok_or_else(|| anyhow!("..."))`, `map_err` and other similar error
conversion/option reporting cases are replaced with `context` and
`with_context` calls
* in addition to that, various `anyhow!("failed to do ...")` are
replaced with `.context("Doing ...")` messages to remove the
parasitic `failed to` text
* `anyhow::ensure!` is used instead of `if ... { return Err(...); }`
calls
* `anyhow::bail!` is used instead of `return Err(anyhow!(...));`
Release Notes:
- N/A
1. The `edit_file` tool tended to use `create_or_overwrite` a bit too
often, leading to corruption of long files. This change replaces the
boolean flag with an `EditFileMode` enum, which helps Agent make a more
deliberate choice when overwriting files.
With this change, the pass rate of the new eval increased from 10% to
100%.
2. eval: Added ability to run eval on top of an existing thread. Threads
can now be loaded from JSON files in the `SerializedThread` format,
which makes it easy to use real threads as starting points for
tests/evals.
3. Don't try to restore tool cards when running in headless or eval mode
-- we don't have a window to properly do this.
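To illustrate the first item above, here is a minimal sketch of replacing
a `create_or_overwrite: bool` flag with a mode enum. The variant names and
the `check_mode` validation are assumptions made for this example and may
not match the actual `EditFileMode` definition.

```rust
/// Hypothetical mode enum: each choice is named explicitly instead of
/// being folded into a single boolean flag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EditFileMode {
    /// Make granular edits to an existing file.
    Edit,
    /// Create a file that does not exist yet.
    Create,
    /// Replace the entire contents of an existing file.
    Overwrite,
}

/// Illustrative validation (an assumption, not the tool's real logic):
/// reject modes that do not match the file's current state.
pub fn check_mode(mode: EditFileMode, file_exists: bool) -> Result<(), String> {
    match (mode, file_exists) {
        (EditFileMode::Edit, false) => Err("cannot edit a missing file".to_string()),
        (EditFileMode::Overwrite, false) => Err("cannot overwrite a missing file".to_string()),
        (EditFileMode::Create, true) => {
            Err("file already exists; use Edit or Overwrite".to_string())
        }
        _ => Ok(()),
    }
}
```

Because the model has to pick one of three named variants, overwriting a
long file becomes an explicit decision rather than a side effect of a
flag that defaults one way or the other.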
Release Notes:
- N/A
This is very basic support for them. There are a number of other TODOs
before this is really a first-class supported feature, so not adding any
release notes for it; for now, this PR just makes it so that if
read_file tries to read a PNG (which has come up in practice), it at
least correctly sends it to Anthropic instead of messing up.
This also lays the groundwork for future PRs for more first-class
support for images in tool calls across more image file formats and LLM
providers.
Release Notes:
- N/A
---------
Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
This reverts commit 3615d6d96c.
Ultimately, we want to restore the ability to store a profile
per-thread, but for now reverting this fixes a fairly disruptive bug.
Release Notes:
- Fixed a bug causing the agent to use the wrong profile in some cases.
When deciding if a model supports tools or not, we weren't reading from
the configured model in a given thread.
This also stores the profile on the thread, which matches the behavior
of the Model and Max Mode, which we also already store per thread.
Hopefully this helps alleviate some confusion.
Release Notes:
- agent: Save profile selection per-Agent thread
Supersedes: https://github.com/zed-industries/zed/pull/29936
Thanks for your contribution @imumesh18, but we had a slightly different
take on it :)
Release Notes:
- N/A
Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
Because we instantiated `ContextServerManager` both in `agent` and
`assistant-context-editor`, and these two entities track the running MCP
servers separately, we were effectively running every MCP server twice.
This PR moves the `ContextServerManager` into the project crate (now
called `ContextServerStore`). The store can be accessed via a project
instance. This ensures that we only instantiate one `ContextServerStore`
per project.
Also, this PR adds a bunch of tests to ensure that the
`ContextServerStore` behaves correctly (Previously there were none).
Closes #28714
Closes #29530
Release Notes:
- N/A
Co-authored-by: Bennet <bennet@zed.dev>
Release Notes:
- Added support for context `@mentions` in the inline prompt editor and
when editing past messages in the agent panel.
---------
Co-authored-by: Bennet Bo Fenner <bennet@zed.dev>
Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
This PR makes it possible to use different LLM models in the agent
panels of two different projects, simultaneously. It also properly
restores a thread's original model when restoring it from the history,
rather than having it use the default model. As before, newly-created
threads will use the current default model.
Release Notes:
- Enabled different project windows to use different models in the agent
panel
- Enhanced the agent panel so that when revisiting old threads, their
original model will be used.
---------
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
When using the agent with a project shared by a collaborator, rules file
loading didn't work as it was trying to read from the client's
filesystem.
Release Notes:
- Fixed rules file loading when using the agent with a project shared by
a collaborator.
Previously, all MCP tools would be completed regardless of whether they
were enabled or disabled for the profile. This meant that the "Write"
profile was always using all MCP tools, even if you disabled them in the
settings.
Now, when `enable_all_context_servers` is set to `true`, we will enable
all tools from all MCP servers by default but disable the ones that are
explicitly disabled for the profile.
Also fixes an issue where the tools would not show up as enabled when
using `enable_all_context_servers: true`.
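The resolution rule described above can be sketched roughly as follows.
Only `enable_all_context_servers` comes from the settings mentioned here.
The `Profile` struct, its `tool_overrides` field, and `is_tool_enabled`
are hypothetical names used for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical profile settings for this sketch.
pub struct Profile {
    pub enable_all_context_servers: bool,
    /// Explicit per-tool settings keyed by (server name, tool name).
    pub tool_overrides: HashMap<(String, String), bool>,
}

impl Profile {
    /// An explicit per-tool setting always wins. Otherwise the tool is
    /// enabled only if the profile enables all context servers.
    pub fn is_tool_enabled(&self, server: &str, tool: &str) -> bool {
        let key = (server.to_string(), tool.to_string());
        match self.tool_overrides.get(&key) {
            Some(&enabled) => enabled,
            None => self.enable_all_context_servers,
        }
    }
}
```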
Release Notes:
- agent: Fix an issue where MCP tools could not be enabled/disabled
Simplifies the data structures involved in agent context by removing
caching and limiting the use of ContextId:
* `AssistantContext` enum is now like an ID / handle to context that
does not need to be updated. `ContextId` still exists but is only used
for generating unique `ElementId`.
* `ContextStore` has an `IndexMap<ContextSetEntry>`. Only need to keep a
`HashSet<ThreadId>` consistent with it. `ContextSetEntry` is a newtype
wrapper around `AssistantContext` which implements eq / hash on a subset
of fields.
* Thread `Message` directly stores its context.
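The newtype pattern mentioned for `ContextSetEntry` can be sketched as
below. This is an illustration only: `FileContext` and its fields are
hypothetical stand-ins for `AssistantContext`, and a standard `HashSet`
is used in place of the ordered map so the example needs no external
crates.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a context handle.
#[derive(Debug, Clone)]
pub struct FileContext {
    /// Identity: assumed to stay stable when the file is renamed.
    pub buffer_id: u64,
    /// Display metadata: may change over time.
    pub display_path: String,
}

/// Newtype whose equality and hash use only the identity field.
#[derive(Debug, Clone)]
pub struct ContextSetEntry(pub FileContext);

impl PartialEq for ContextSetEntry {
    fn eq(&self, other: &Self) -> bool {
        self.0.buffer_id == other.0.buffer_id
    }
}

impl Eq for ContextSetEntry {}

impl Hash for ContextSetEntry {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.0.buffer_id.hash(state);
    }
}

/// Counts distinct entries after deduplicating by identity.
pub fn dedup_count(entries: Vec<FileContext>) -> usize {
    let set: HashSet<ContextSetEntry> = entries.into_iter().map(ContextSetEntry).collect();
    set.len()
}
```

Keying identity on something other than the path is what lets
deduplication survive renames in this sketch.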
Fixes the following bugs:
* If a context entry was removed from the strip and added again, it was
re-included in the next message.
* Clicking file context in the thread that has been removed from the
context strip didn't jump to the file.
* Refresh of directory context didn't reflect added / removed files.
* Deleted directories would remain in the message editor context strip.
* Token counting requests didn't include image context.
* File, directory, and symbol context deduplication relied on
`ProjectPath` for identity, and so didn't handle renames.
* Symbol context line numbers didn't update when shifted.
Known bugs (not fixed):
* Deleting a directory causes it to disappear from messages in threads.
Fixing this in a nice way is tricky. One easy fix is to store the
original path and show that on deletion. It's weird that deletion would
cause the name to "revert", though. Another possibility would be to
snapshot context metadata on add (ala `AddedContext`), and keep that
around despite deletion.
Release Notes:
- N/A
We used to insert empty user messages into the `Thread::messages` `Vec`
when tools finished running and then we would attach the results when
creating the request. This approach was very easy to mess up during
state handling, leading to empty user messages displayed in the
conversation and API failures.
Instead, we will no longer insert actual user messages for tool results
into the `Thread`, and will only do this on the fly when creating the
model request. This simplifies a lot of code and should fix the
mentioned errors.
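A minimal sketch of the "on the fly" approach, under assumed types: the
stored thread keeps tool results alongside the message that produced
them, and the user-role message carrying those results exists only in
the request being built. `Message`, `RequestMessage`, and `build_request`
are hypothetical names, not the actual `Thread` API.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Role {
    User,
    Assistant,
}

/// Hypothetical stored message: tool results live next to the message
/// that invoked the tools, not in a separate empty user message.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: Role,
    pub text: String,
    pub tool_results: Vec<String>,
}

/// Hypothetical message as sent to the model.
#[derive(Debug, Clone, PartialEq)]
pub struct RequestMessage {
    pub role: Role,
    pub content: Vec<String>,
}

/// Builds the request without mutating the stored thread. A user message
/// holding tool results is synthesized only here.
pub fn build_request(messages: &[Message]) -> Vec<RequestMessage> {
    let mut request = Vec::new();
    for message in messages {
        request.push(RequestMessage {
            role: message.role,
            content: vec![message.text.clone()],
        });
        if !message.tool_results.is_empty() {
            request.push(RequestMessage {
                role: Role::User,
                content: message.tool_results.clone(),
            });
        }
    }
    request
}
```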
Release Notes:
- agent: Improve reliability of LLM requests when including tool results
---------
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com>
Looks like the required backend component of this was deployed.
https://github.com/zed-industries/monorepo/actions/runs/14541199197
Release Notes:
- N/A
---------
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Nathan Sobo <nathan@zed.dev>
Now that we've established a proper eval in tree, this PR reboots our
agent loop back to a set of minimal tools and simpler prompts. We
should aim to get this branch feeling subjectively competitive with
what's on main and then merge it, and build from there.
Let's invest in our eval and use it to drive better performance of the
agent loop. How you can help: Pick an example, and then make the outcome
faster or better. It's fine to even use your own subjective judgment, as
our evaluation criteria likely need tuning as well at this point. Focus
on making the agent work better in your own subjective experience first.
Let's focus on simple/practical improvements to make this thing work
better, then determine how we can craft our judgment criteria to lock
those improvements in.
Release Notes:
- N/A
---------
Co-authored-by: Max <max@zed.dev>
Co-authored-by: Antonio <antonio@zed.dev>
Co-authored-by: Agus <agus@zed.dev>
Co-authored-by: Richard <richard@zed.dev>
Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Michael Sloan <mgsloan@gmail.com>
Related to #28490.
- Default prompts from the prompt library are now included as "user
rules" in the system prompt.
- Presence of these user rules is shown at the beginning of the thread
in the UI.
- Now uses an `Entity<PromptStore>` instead of an `Arc<PromptStore>`.
Motivation for this is emitting a `PromptsUpdatedEvent`.
- Now disallows concurrent reloading of the system prompt. Before this
change it was possible for reloads to race.
Release Notes:
- agent: Added support for including default prompts from the Prompt
Library as "user rules" in the system prompt.
---------
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
The UI was mistakenly using the cumulative token usage for the token
counter. It will now display the last request token count, plus an
estimation of the tokens in the message editor and context entries that
haven't been sent yet.
https://github.com/user-attachments/assets/0438c501-b850-4397-9135-57214ca3c07a
Additionally, when the user edits a message, we'll display the actual
token count up to it and estimate the tokens in the new message.
Note: We don't currently estimate the delta when switching profiles. In
the future, we want to use the count tokens API to measure every part of
the request and display a breakdown.
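The displayed count described above amounts to simple arithmetic, which
can be sketched as follows. The function names are hypothetical, and the
four-bytes-per-token estimate is an assumption of this example rather
than the estimator the agent actually uses.

```rust
/// Rough token estimate for text that has not been sent yet.
/// Assumption for this sketch: about four bytes per token, rounded up.
pub fn estimate_tokens(text: &str) -> usize {
    (text.len() + 3) / 4
}

/// Last request's actual token count, plus estimates for the message
/// editor contents and any context entries that haven't been sent.
pub fn displayed_token_count(
    last_request_tokens: usize,
    editor_text: &str,
    unsent_context: &[&str],
) -> usize {
    let pending: usize = unsent_context
        .iter()
        .map(|entry| estimate_tokens(entry))
        .sum();
    last_request_tokens + estimate_tokens(editor_text) + pending
}
```

The key point is that the base value is the last request's count, not a
running total across all requests in the thread.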
Release Notes:
- agent: Made the token count more accurate and added back estimation of
used tokens as you type and add context.
---------
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>