Adds popular examples of long-running commands to the system prompt.
Unfortunately, I couldn't add an eval example as the new terminal tool
no longer works in `eval`. We can look into that tomorrow, but I'm
seeing improvements when manually testing this, so I'd like to merge it.
<img
src="https://github.com/user-attachments/assets/ac24e617-e068-466f-875d-c30e1f2465c4"
width=400></img>
Release Notes:
- agent: Discourage long-running commands
The `grep` tool used to include 4 lines of context around the match, but
the lines included would often be unhelpful. This PR improves this
behavior by using the range of the parent syntax node that contains the
full line(s) matched.
The match headers will also now include symbol breadcrumbs so that the
model can already gather code structure before/without reading files.
````md
### impl GitRepository for RealGitRepository › fn compare_checkpoints › L1278-1284
```rust
let result = git
.run(&[
"diff-tree",
"--quiet",
&left.commit_sha.to_string(),
&right.commit_sha.to_string(),
])
```
````
This positively impacts the `add_arg_to_trait_method` eval example with
better diff output, fewer tool failures, and reduced total turns.
Note: We have some plans to use an "elision" approach where we would
combine all matches for a given file, skipping lines between them while
keeping symbol declaration lines. The theory is that this would map
more closely to the expected input for edits. For now, this PR is a
significant improvement.
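As a rough illustration of the range expansion and breadcrumb header described above, here is a minimal sketch. The `Node` type, its line-granular ranges, and `match_header` are hypothetical stand-ins, not the actual implementation (which works on real syntax trees); the line numbers used for the `impl` and `fn` nodes in the usage note are made up.

```rust
/// Hypothetical, simplified syntax node with 1-based inclusive line ranges.
struct Node {
    /// Symbol text for declarations (e.g. `fn foo`); `None` for other nodes.
    symbol: Option<&'static str>,
    start_line: u32,
    end_line: u32,
    children: Vec<Node>,
}

/// Descend to the deepest node whose lines fully contain the match and
/// build a header from the symbols passed on the way down.
fn match_header(root: &Node, match_start: u32, match_end: u32) -> Option<String> {
    let contains = |n: &Node| n.start_line <= match_start && match_end <= n.end_line;
    if !contains(root) {
        return None;
    }
    let mut node = root;
    let mut parts: Vec<String> = Vec::new();
    loop {
        if let Some(symbol) = node.symbol {
            parts.push(symbol.to_string());
        }
        // Keep descending while some child still contains the whole match.
        match node.children.iter().find(|child| contains(*child)) {
            Some(child) => node = child,
            None => break,
        }
    }
    // The reported range is that of the enclosing node, not of the match.
    parts.push(format!("L{}-{}", node.start_line, node.end_line));
    Some(format!("### {}", parts.join(" › ")))
}
```

With an `impl` node containing a `fn` node containing an unnamed statement node spanning lines 1278 to 1284, a match on line 1280 would produce a header like the one in the example above.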
Release Notes:
- Agent: Enrich `grep` tool output with syntax information
This PR addresses the behavior of the agent's terminal tool when the
executed command is interrupted or fails after producing some output.
Currently, if the command doesn't finish successfully, any partial
output captured before the interruption/failure is discarded, and only
an error message (or a generic cancellation message) is returned to the
LLM.
This change modifies the `run_command_limited` function in the terminal
tool to catch errors when awaiting the command's status (which includes
interruptions). In the case of such an error, it now includes any
partial stdout/stderr captured up to that point within the error message
returned to the `ToolUseState`. This ensures the LLM receives the
partial context even when the command doesn't complete cleanly, framed
appropriately as part of an error/interruption message.
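The idea can be sketched roughly as follows. This is not the code from `run_command_limited`; `CommandError`, `finish_command`, and the message wording are hypothetical, and the exit status is modeled as a plain `Result<i32, String>` to keep the example self-contained.

```rust
use std::fmt;

/// Hypothetical error type that keeps whatever output was captured
/// before the command failed or was interrupted.
#[derive(Debug)]
struct CommandError {
    reason: String,
    partial_output: String,
}

impl fmt::Display for CommandError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if self.partial_output.is_empty() {
            write!(f, "Command failed or was interrupted: {}", self.reason)
        } else {
            write!(
                f,
                "Command failed or was interrupted: {}\nPartial output captured:\n{}",
                self.reason, self.partial_output
            )
        }
    }
}

/// Combine the captured output with the result of awaiting the exit status.
/// On error, the partial output travels inside the error instead of being dropped.
fn finish_command(
    captured: String,
    status: Result<i32, String>,
) -> Result<String, CommandError> {
    match status {
        Ok(_exit_code) => Ok(captured),
        Err(reason) => Err(CommandError {
            reason,
            partial_output: captured,
        }),
    }
}
```

The error's `Display` output is what would be handed back to the model, so the partial stdout/stderr is framed as part of the failure rather than presented as a successful result.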
Closes #29101
Release Notes:
- N/A
One motivation is that the outlines returned by `read_file` for large
files list line numbers assuming an inclusive `end_line`. As a result,
when the agent uses these outlines for `read_file` calls, it would
otherwise miss the last line.
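To make the off-by-one concrete, here is a minimal sketch of an inclusive line range. `read_line_range` is a hypothetical helper, and 1-based line numbers are an assumption.

```rust
/// Return lines `start_line..=end_line` of `text` (1-based, both ends inclusive).
/// Hypothetical helper; out-of-range requests are clamped rather than rejected.
fn read_line_range(text: &str, start_line: usize, end_line: usize) -> Vec<&str> {
    let start = start_line.max(1);
    if end_line < start {
        return Vec::new();
    }
    text.lines()
        .skip(start - 1)
        // `+ 1` is what makes `end_line` inclusive; without it the last line is lost.
        .take(end_line - start + 1)
        .collect()
}
```

If an outline reports a symbol spanning lines 3 to 4, asking for `start_line = 3, end_line = 4` returns both lines, where an exclusive `end_line` would return only line 3.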
Release Notes:
- N/A
Implements the `ToolCard` for the `path_search` tool. It also adds
"jump to file" functionality when you expand the results.
Release Notes:
- N/A
---------
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Agus Zubiaga <hi@aguz.me>
This PR removes two fields from JSON schemas (`$schema` and `title`),
which are not expected by any model provider, but were spuriously
included by our JSON schema library, `schemars`.
These added noise to requests and wasted input tokens.
### Old
```json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "FetchToolInput",
"type": "object",
"required": [
"url"
],
"properties": {
"url": {
"description": "The URL to fetch.",
"type": "string"
}
}
}
```
### New
```json
{
"properties": {
"url": {
"description": "The URL to fetch.",
"type": "string"
}
},
"required": [
"url"
],
"type": "object"
}
```
Release Notes:
- N/A
This PR significantly improves the quality of the initial file search
that occurs when the model doesn't yet know the full path to a file it
needs to read/edit.
Previously, the assertions in file_search often failed on main as the
model attempted to guess full file paths. On this branch, it reliably
calls `find_path` (previously `path_search`) before reading files.
After getting the model to find paths first, I noticed it would try
using `grep` instead of `path_search`. This motivated renaming
`path_search` to `find_path` (continuing the analogy to unix commands)
and adding system prompt instructions about proper tool selection.
Note: I know the command is just called `find`, but that seemed too
general.
In my eval runs, the `file_search` example improved from 40% ± 10% to
98% ± 2%. The only assertion I'm seeing occasionally fail is "glob
starts with `**` or project". We can probably add some instructions in
that regard.
Release Notes:
- N/A
Instructs the model to include the fields that we display first in the
input object, so that, e.g., the user can see the path of a file while the
model generates the content.
Release Notes:
- N/A
This PR implements the `ToolCard` for the edit file tool, which allows us
to display an editor with a diff in the thread view with the changes
performed by the model.
- [x] Fix buffer sometimes displaying empty
- [x] Stop buffer from scrolling together with the thread
- [x] Fix multibuffer header sometimes appearing
- [x] Fix buffer height issue
- [x] Implement "full height" expand button
- [x] Add "Jump To File" functionality
- [x] Polish and refine styles
Release Notes:
- agent: Added diff preview cards in the thread view for edits performed
by the agent.
---------
Co-authored-by: João Marcos <marcospb19@hotmail.com>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Conrad Irwin <conrad.irwin@gmail.com>
This PR slightly refines the web search tool UI by introducing a component
(`ToolCallCardHeader`) that aims to standardize the heading element of
tool calls in the thread.
In terms of next steps, I plan to evolve this component further soon
(e.g., building a full-blown "tool call card" component), and even move
it to a place where I can re-use it in the active_thread as well without
making the `assistant_tools` a dependency of it.
Release Notes:
- N/A
This PR renames the `regex_search` tool to `grep` because I think it
conveys more meaning to the model: the idea of searching the filesystem
with a regular expression. It's also one word, and the model seems to be
using it effectively after some additional prompt tuning.
It also takes an include pattern to filter on the specific files we try
to search. I'd like to encourage the model to scope its searches more
aggressively, as in my testing, I'm only seeing it filter on file
extension.
Release Notes:
- N/A
Now that we've established a proper eval in tree, this PR reboots
our agent loop back to a set of minimal tools and simpler prompts. We
should aim to get this branch feeling subjectively competitive with
what's on main and then merge it, and build from there.
Let's invest in our eval and use it to drive better performance of the
agent loop. How you can help: Pick an example, and then make the outcome
faster or better. It's fine to even use your own subjective judgment, as
our evaluation criteria likely need tuning as well at this point. Focus
on making the agent work better in your own subjective experience first.
Let's focus on simple/practical improvements to make this thing work
better, then determine how we can craft our judgment criteria to lock
those improvements in.
Release Notes:
- N/A
---------
Co-authored-by: Max <max@zed.dev>
Co-authored-by: Antonio <antonio@zed.dev>
Co-authored-by: Agus <agus@zed.dev>
Co-authored-by: Richard <richard@zed.dev>
Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Michael Sloan <mgsloan@gmail.com>
Staff only for now. We'll work on making this usable for non-zed.dev
users later.
Release Notes:
- N/A
---------
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
Co-authored-by: Marshall Bowers <git@maxdeviant.com>
This is just a refactor which adds no functionality.
We now return a `ToolResult` from `Tool > run(...)`. For now this just
wraps the output task in a struct. We'll use this to implement custom
rendering of tools, see #28621.
Release Notes:
- N/A
This is a combination of the "read file" and "list directory contents"
tools as part of a push to reduce our quantity of builtin tools by
combining some of them.
The functionality is all there for this tool, although there's room for
improvement on the visuals side: it currently always shows the same icon
and always says "Read" - so you can't tell at a glance when it's reading
a directory vs an individual file. Changing this will require a change
to the `Tool` trait, which can be in a separate PR. (FYI @danilo-leal!)
<img width="606" alt="Screenshot 2025-04-14 at 11 56 27 PM"
src="https://github.com/user-attachments/assets/bded72af-6476-4469-97c6-2f344629b0e4"
/>
Release Notes:
- Added `contents` tool
This ensures that we respect the `LanguageModelToolSchemaFormat` value
when we call `tool.input_schema`. This prevents us from breaking Gemini
compatibility when adding/changing built-in tools. See #28634.
The test suite will now fail with an error message like this, when
providing an incompatible input_schema:
```
thread 'tests::test_tool_schema_compatibility' panicked at crates/assistant_tools/src/assistant_tools.rs:108:17:
Tool schema for `code_actions` is not compatible with `language_model::LanguageModelToolSchemaFormat::JsonSchemaSubset` (Gemini Models).
Are you using `schema::json_schema_for<T>(format)` to generate the schema?
```
Release Notes:
- N/A
Closes #28475
Updates `rename` and `code_action` `input_schema` methods to use
`json_schema_for<T>()` which transforms standard JSONSchema into the
subset required by Gemini.
Also makes `input_schema` implementations consistent.
Tested tools against Gemini 2.5 Pro Preview, Zed Claude 3.7 Sonnet
Thinking, o3-mini
Release Notes:
- Agent Beta: Fixed error 400 `INVALID_ARGUMENT` when using Gemini with
`code_actions` or `rename` tools enabled.
Release Notes:
- Fixed a regression that caused the agent to hang sometimes.
---------
Co-authored-by: Thomas Mickley-Doyle <tmickleydoyle@gmail.com>
Co-authored-by: Nathan Sobo <nathan@zed.dev>
Co-authored-by: Michael Sloan <mgsloan@gmail.com>
Release Notes:
- agent: Replace `bash` tool with `terminal` tool which uses the current
shell
---------
Co-authored-by: Bennet <bennet@zed.dev>
Co-authored-by: Antonio <antonio@zed.dev>
Having a separate rename tool seems to make the agent more likely to use
it compared to having it be part of the code actions tool.
Release Notes:
- Added code action tool and rename tool.
This Pull Request updates the default behavior of the substitute (`s`)
command in vim mode to only replace the next match by default, instead
of all, and replace all matches only when the `g` flag is provided,
making it more similar to NeoVim's behavior.
In order to achieve this, the following changes were introduced:
- Update `BufferSearchBar::replace_next` to be a public method, so it
can be called from `Vim::replace_command`.
- Update the `Replacement::parse` to set the `should_replace_all` field
to `false` by default, and only set it to `true` if the `'g'` flag is
present in the query.
- Add support for when the `Replacement.should_replace_all` is set to
`false` in `Vim::replace_command`, so as to have it only replace the
next occurrence instead of all occurrences in the line.
- Introduce `BufferSearchBar::select_first_match` so as to activate the
first match on the line under the cursor.
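The flag handling can be sketched roughly like this. It is a simplified stand-in, not the actual `Replacement::parse`: only the `should_replace_all` field comes from the description above, while the other fields, the `parse_replacement` name, and the input format (the text after `s`) are assumptions. Escaped delimiters and other flags are ignored.

```rust
/// Simplified stand-in for the `Replacement` type named above.
#[derive(Debug, PartialEq)]
struct Replacement {
    search: String,
    replacement: String,
    should_replace_all: bool,
}

/// Parse the part of a substitute command after `s`, e.g. `/foo/bar/g`.
fn parse_replacement(query: &str) -> Option<Replacement> {
    let mut chars = query.chars();
    // The first character is the delimiter (usually `/`).
    let delimiter = chars.next()?;
    let mut parts = chars.as_str().splitn(3, delimiter);
    let search = parts.next()?.to_string();
    let replacement = parts.next().unwrap_or("").to_string();
    let flags = parts.next().unwrap_or("");
    Some(Replacement {
        search,
        replacement,
        // Replace only the next match unless the `g` flag is present.
        should_replace_all: flags.contains('g'),
    })
}
```

Under these assumptions, `/foo/bar` parses with `should_replace_all` set to `false`, and `/foo/bar/g` sets it to `true`.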
Closes #24450
Release Notes:
- Improved vim's substitute command so as to only replace the first
match by default, and replace all matches if the `'g'` flag is provided
---------
Co-authored-by: Conrad Irwin <conrad.irwin@gmail.com>
This also increases the threshold for when we return an outline during
`read_file`.
Release Notes:
- Fixed an issue that caused the agent to fail reading large files if
the LSP hadn't started yet.
The bash tool will now truncate its output to 8192 bytes (or the last
newline before that).
We also added a global limit for any tool that produces a clearly large
output that wouldn't fit the context window.
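The truncation rule for the bash tool can be sketched as below. This is an illustration rather than the code in this PR: `truncate_output` and `MAX_OUTPUT_BYTES` are hypothetical names, and keeping the trailing newline and backing up to a UTF-8 character boundary are assumptions about the details.

```rust
/// Byte limit taken from the description above.
const MAX_OUTPUT_BYTES: usize = 8192;

/// Truncate `output` to at most `limit` bytes, preferring to cut right
/// after the last newline that falls within the limit.
fn truncate_output(output: &str, limit: usize) -> &str {
    if output.len() <= limit {
        return output;
    }
    // Back up to a UTF-8 character boundary so slicing cannot panic.
    let mut end = limit;
    while !output.is_char_boundary(end) {
        end -= 1;
    }
    let head = &output[..end];
    match head.rfind('\n') {
        // Keep whole lines only, including the newline itself.
        Some(newline) => &head[..=newline],
        // No newline within the limit: fall back to a plain byte cut.
        None => head,
    }
}
```

Calling `truncate_output(text, MAX_OUTPUT_BYTES)` then yields at most 8192 bytes, ending on a line break whenever one exists within the limit.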
Release Notes:
- agent: Truncate bash tool output
---------
Co-authored-by: Michael Sloan <mgsloan@gmail.com>
<img width="622" alt="Screenshot 2025-04-05 at 5 48 14 PM"
src="https://github.com/user-attachments/assets/24b9c7d4-d3e2-4929-bca8-79db5b4e5748"
/>
Release Notes:
- The `read_files` tool now reads only the symbol outline for files above a
certain size, to conserve context window space. Then it suggests that
the agent call `read_files` again with the relevant line ranges it saw
in the outline.
Closes: https://github.com/zed-industries/zed/issues/20582
Allows users to select a specific model for each AI-powered feature:
- Agent panel
- Inline assistant
- Thread summarization
- Commit message generation
If unspecified for a given feature, it will use the `default_model`
setting.
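The fallback behavior amounts to something like the following sketch. Only `default_model` is a name from this PR; the `Feature` enum, `ModelSettings`, `overrides`, and `model_for` are hypothetical, and models are represented as plain strings for brevity.

```rust
use std::collections::HashMap;

/// The AI-powered features listed above (hypothetical enum).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Feature {
    AgentPanel,
    InlineAssistant,
    ThreadSummarization,
    CommitMessageGeneration,
}

/// Hypothetical settings shape: one default plus optional per-feature overrides.
struct ModelSettings {
    default_model: String,
    overrides: HashMap<Feature, String>,
}

impl ModelSettings {
    /// Use the feature-specific model if configured, else `default_model`.
    fn model_for(&self, feature: Feature) -> &str {
        self.overrides
            .get(&feature)
            .map(String::as_str)
            .unwrap_or(self.default_model.as_str())
    }
}
```

A user who only overrides commit message generation would get that model for commit messages and the default everywhere else.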
Release Notes:
- Added support for configuring a specific model for each AI-powered
feature
---------
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
This adds a "workspace-hack" crate, see
[mozilla's](https://hg.mozilla.org/mozilla-central/file/3a265fdc9f33e5946f0ca0a04af73acd7e6d1a39/build/workspace-hack/Cargo.toml#l7)
for a concise explanation of why this is useful. For us in practice this
means that if I were to run all the tests (`cargo nextest r
--workspace`) and then `cargo r`, all the deps from the previous cargo
command will be reused. Before this PR it would rebuild many deps due to
resolving different sets of features for them. For me this frequently
caused long rebuilds when things "should" already be cached.
To avoid manually maintaining our workspace-hack crate, we will use
[cargo hakari](https://docs.rs/cargo-hakari) to update the build files
when there's a necessary change. I've added a step to CI that checks
whether the workspace-hack crate is up to date, and instructs you to
re-run `script/update-workspace-hack` when it fails.
Finally, to make sure that people can still depend on crates in our
workspace without pulling in all the workspace deps, we use a `[patch]`
section following [hakari's
instructions](https://docs.rs/cargo-hakari/0.9.36/cargo_hakari/patch_directive/index.html)
One possible followup task would be making guppy use our
`rust-toolchain.toml` instead of having to duplicate that list in its
config, I opened an issue for that upstream: guppy-rs/guppy#481.
TODO:
- [x] Fix the extension test failure
- [x] Ensure the dev dependencies aren't being unified by Hakari into
the main dependencies
- [x] Ensure that the remote-server binary continues to not depend on
LibSSL
Release Notes:
- N/A
---------
Co-authored-by: Mikayla <mikayla@zed.dev>
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
This PR makes the command permission prompt part of the tool card and
allows users to change the `always_allow_tool_actions` setting directly
via the "Always Allow" button on that card. If that button is
clicked, that setting is turned on, and any command that requires
permission from that point on will auto-run.
Additionally, if a bash command spans multiple lines, we show the line
count at the end of the command string. (Note: this is not perfect yet,
because the line count may not be visible by default, but we didn't think
this was a major blocker for now. We'll work on improving this next.)
### Thread View
<img
src="https://github.com/user-attachments/assets/00f93c39-990f-4b79-84ec-0427b997167f"
width="500"/>
### Settings View
<img
src="https://github.com/user-attachments/assets/52d32435-7c8d-4ab4-a319-6cabc007267b"
width="500"/>
Release Notes:
- N/A
---------
Co-authored-by: Thomas Mickley-Doyle <tmickleydoyle@gmail.com>
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Co-authored-by: Nathan Sobo <nathan@zed.dev>
Co-authored-by: Antonio Scandurra <me@as-cii.com>
It's super easy to undo those changes. In a future PR, we should also
avoid requiring confirmation in the batch tool if all the underlying
tools don't require confirmation.
Release Notes:
- N/A