This PR implements the `ToolCard` for the edit file tool, which allow us
to display an editor with a diff in the thread view with the changes
performed by the model.
- [x] Fix buffer sometimes displaying empty
- [x] Stop buffer from scrolling together with the thread
- [x] Fix multibuffer header sometimes appearing
- [x] Fix buffer height issue
- [x] Implement "full height" expand button
- [x] Add "Jump To File" functionality
- [x] Polish and refine styles
Release Notes:
- agent: Added diff preview cards in the thread view for edits performed
by the agent.
---------
Co-authored-by: João Marcos <marcospb19@hotmail.com>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Conrad Irwin <conrad.irwin@gmail.com>
Now we're more tolerant of invalid JSON coming back from the model
(possibly because it was incomplete and we're streaming), plus if we do
end up with invalid JSON once it has all streamed back, we report what
the malformed JSON actually was:
<img width="444" alt="Screenshot 2025-04-23 at 1 49 14 PM"
src="https://github.com/user-attachments/assets/480f5da7-869b-49f3-9ffd-8f08ccddb33d"
/>
Release Notes:
- N/A
TODO:
- [x] Make it work in the project diff:
- [x] Support non-singleton buffers
- [x] Adjust excerpt boundaries to show full conflicts
- [x] Write tests for conflict-related events and state management
- [x] Prevent hunk buttons from appearing inside conflicts
- [x] Make sure it works over SSH, collab
- [x] Allow separate theming of markers
Bonus:
- [ ] Count of conflicts in toolbar
- [ ] Keyboard-driven navigation and resolution
- [ ] ~~Inlay hints to contextualize "ours"/"theirs"~~
Release Notes:
- Implemented initial support for resolving merge conflicts.
---------
Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
When dragging the pane separator of the bottom dock to full window
height, the contents at the bottom of the dock and workspace window
overflowed the screen, becoming obscured. This happened because setting
a new size in resize_bottom_dock(...) was not taking in consideration
the top bounds of the workspace window, which caused the bottom bounds
of both dock and workspace to overflow. The issue was fixed by
subtracting the workspace.bounds.top() value to the dock's new size.
Closes#12966
Release Notes:
- N/A
This PR adds a `can_use_web_search_tool` field to the LLM token claims.
Currently anyone in the `assistant2` feature flag will have access to
the web search tool.
Co-authored-by: Bennet <bennet@zed.dev>
Release Notes:
- N/A
This PR allows the `theme-importer` utility to handle comma-separated
token scopes.
Normally, a token in a VS Code theme is defined as either a string or a
string array:
```json
{
"scope": "token.debug-token",
"settings": {
"foreground": "#d55fde"
}
},
{
"name": "String interpolation",
"scope": [
"punctuation.definition.template-expression.begin",
"punctuation.definition.template-expression.end",
"punctuation.section.embedded"
],
"settings": {
"foreground": "#d55fde"
}
},
```
However, [some
themes](ac85540d64/src/variants/TokenColors.ts (L1771-L1777))
seem to use comma-separated values in a single scope string which VS
Code seems to accept as well:
```json
{
"name": "Comments",
"scope": "comment, punctuation.definition.comment",
"settings": {
"foreground": "#7f848e"
}
},
```
This PR handles these definitions by splitting scopes by commas before
trying to match them with the scopes that match Zed syntax tokens.
Release Notes:
- N/A
Updating macOS development readme with some gotchas that I ran into
while getting setup.
- Linked to collab readme because that contained the steps to setup the
postgres database so integration tests pass
- Added section under troubleshooting. Recommending `cargo-nextest`
since the CI uses it and it got me past the failures I was seeing.
Release Notes:
- N/A
---------
Co-authored-by: KyleBarton <kjbarton4@gmail.com>
Attempt to lookup exact relative paths before full worktree traversal,
only do the full traversal if all other methods fail.
Closes https://github.com/zed-industries/zed/issues/28407
Release Notes:
- Fixed wrong paths opening when cmd-clicking in the terminal
This adds `last.message.json` file that contains the full request plus
response (serialized as a message from assistant for consistency with
other messages).
Motivation: to capture more info and to make analysis of finished runs
easier.
Release Notes:
- N/A
Release Notes:
- Improved performance of agent checkpoint creation.
- Fixed a bug that sometimes caused accidental deletions when restoring
to a previous agent checkpoint.
- Fixed a bug that caused checkpoints to be visible in the Git history.
Before this change, when syncing a multibuffer (such as
find-all-references) to a remote, we would renumber the excerpts from 1.
This did not matter in the past because the buffers' list of excerpts
could not change. In #27876, I added the ability for excerpts to merge,
which meant that the excerpt list could change. This manifested as
people seeing "invalid excerpt id" panics when syncing.
The initial fix to this (to re-use the excerpt ids from the host) ran
into problems because `insert_excerpts_with_ids_after` assumes that you
call it in excerpt-id order. This change de-optimizes that code to
insert the excerpts 1-by-1 in excerpt-id order, but with the
insert_after set to preserve the correct UI order.
I hope to soon remove this code path and use something more like
set-excerpts-for-path for syncing, but in the meantime we should not
panic.
Release Notes:
- Fix a panic when joining a project with a multibuffer with merged
excerpts
Closes#27994, #29050, #27352, #27616
This PR implements new logic for code completions, which improve cases
where local variables, etc LSP based hints are not shown on top of code
completion menu. The new logic is explained in comment of code.
This new sort is similar to VSCode's completions sort where order of
sort is like:
Fuzzy > Snippet > LSP sort_key > LSP sort_text
whenever two items have same value, it proceeds to use next one as tie
breaker. Changing fuzzy score from float to int based makes it possible
for two items two have same fuzzy int score, making them get sorted by
next criteria.
Release Notes:
- Improved code completions to prioritize LSP hints, such as local
variables, so they appear at the top of the list.
Now all debug sessions are routed through the debug panel and are
started synchronously instead of by a task that returns a session once
the initialization process is finished. A session is `Mode::Booting`
while it's starting the debug adapter process and then transitions to
`Mode::Running` once this is completed.
This PR also added new tests for the dap logger, reverse start debugging
request, and debugging over SSH.
Release Notes:
- N/A
---------
Co-authored-by: Anthony Eid <hello@anthonyeid.me>
Co-authored-by: Anthony <anthony@zed.dev>
Co-authored-by: Cole Miller <m@cole-miller.net>
Co-authored-by: Cole Miller <cole@zed.dev>
Co-authored-by: Zed AI <ai@zed.dev>
Co-authored-by: Remco Smits <djsmits12@gmail.com>
Added default filters to `zlog`, a piece that was present in our
`simple_log` setup, but was missed when switching to `zlog`, resulting
in logspam primarily on linux.
also - more explicit precedence & precedence testing
Release Notes:
- N/A
cc: @sunli829 @huacnlee @probably-neb
I really liked the earlier PR, but had an idea for how to utilize the
element state so that you don't need to construct the cache externally.
I've updated the APIs to introduce an `ImageCacheProvider` trait, and
added an example implementation of it to the image gallery :)
Release Notes:
- N/A
Fixes issue described in [description of
#28683](https://github.com/zed-industries/zed/issues/28683#issue-2992849891)
Makes sure that the `--system-specs` arg is handled before
`Application::new` is called, so that it can be used even when Zed is
panicking during app initialization (e.g. Failing to create a Vulkan
context in blade)
Release Notes:
- Fixed an issue where the `--system-specs` arg wouldn't work if Zed
panicked during app initialization (e.g. When failing to create a Vulkan
context in blade)
This PR updates the usage meter for edit predictions to use the limits
returned from the API instead of basing it off the plan.
This will allow limits to be updated from the server rather than being
embedded in the client.
Release Notes:
- N/A
Closes#28533
Release Notes:
- Linux: Improved parsing of `ZED_DEVICE_ID` environment variable in an
attempt to fix some cases where it erroneously failed to parse. The
`ZED_DEVICE_ID` is now expected to always be a 4 digit hexadecimal
number (as it is in the output of `lcpci`) with an optional `0x` or `0X`
prefix.
This PR updates the usage banners in the Agent panel to use the limits
returned from the API instead of basing it off the plan.
This will allow limits to be updated from the server rather than being
embedded in the client.
Release Notes:
- N/A
Closes#27414
`ImageCache` is independent of the original image loader and can
actively release its cached images to solve the problem of images loaded
from the network or files not being released.
It has two constructors:
- `ImageCache::new`: Manually manage the cache.
- `ImageCache::max_items`: Remove the least recently used items when the
cache reaches the specified number.
When creating an `img` element, you can specify the cache object with
`Img::cache`, and the image cache will be managed by `ImageCache`.
In the example `crates\gpui\examples\image-gallery.rs`, the
`ImageCache::clear` method is actively called when switching a set of
images, and the memory will no longer continuously increase.
Release Notes:
- N/A
---------
Co-authored-by: Ben Kunkle <ben@zed.dev>
Closes#29176
This PR fix an issue where uncommenting a code block in Markdown would
add Markdown comments instead of removing the language-specific
comments.
Why?
`language_scope_at` for comments in a code block in Markdown would
result in the language being detected as Markdown. This happens because
the smallest range, such as `//` or `#` on the Markdown layer, is
preferred over `// whole comment line` for any other language. This
results in language detection as Markdown for that point.
To fix this, we also use a depth factor and try to prefer the layer with
greater depth over one with lesser depth. In this case, the code block's
language depth would be preferred over Markdown. The smallest range is
now used as a tiebreaker.
Added test for this case.
Release Notes:
- Fixed issue where uncommenting a code block in Markdown would add
Markdown comments instead of removing the language comments.
This PR updates the `plan` field in the LLM token to be based on the
subscription.
We weren't using this field anywhere outside of the new billing code, so
it is safe to change its meaning.
Release Notes:
- N/A
As-salamu alaykum,
[I recently started suffering from the same issue as this
user](https://users.rust-lang.org/t/rust-analyzer-checkonsave-command-works-but-shows-invalid-config-warning/128652),
which is caused by something the docs of Zed promote, so I decided to
help fix it.
>[anutrix](https://users.rust-lang.org/u/anutrix)
> When I add "rust-analyzer.checkOnSave.command": "clippy" I get:
>
> invalid config value: /checkOnSave: invalid type: map, expected a
boolean;
> Extension Info: Version 0.3.2433, Server Version 0.3.2433-standalone
(66e3b5819e 2025-04-21)
> and in Language Server logs:
>
> [Error - 3:26:22 AM] Server process exited with code 0.
> Clippy works fine but these warnings stays and extensions shows
yellow/unstable in VSCode:
>
> Additionally, if I replace
>
> "rust-analyzer.checkOnSave.command": "clippy"
> with
>
> "rust-analyzer.checkOnSave": true,
> "rust-analyzer.checkOnSave.command": "clippy"
> [jplatte](https://users.rust-lang.org/u/jplatte)
> From the documentation, it seems like
rust-analyzer.checkOnSave.command does not exist. It should be
rust-analyzer.check.command.
Release Notes:
- agent: Improved the AI-generated changes review UX by clearly exposing
the generating state in the multibuffer tab.
---------
Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
This PR adds support for transferring any existing usage from a trial
subscription to a Zed Pro subscription when the user upgrades.
Release Notes:
- N/A
---------
Co-authored-by: Mikayla <mikayla@zed.dev>
### Problem
We want to start continuously tracking our progress on agent evals over
time. As part of this, we'd like the *score* to have a clear,
interpretable meaning. Right now, it's a number from 0 to 5, but it's
not clear what any particular number works. In addition, scores vary
widely from run to run, because the agent's output is deterministic. We
try to stabilize the score using a panel of judges, but the behavior of
the agent itself varies much more widely than the judges' scores for a
given run.
### Solution
* **explicit meanings of scores** - In this PR, we're prescribing the
diff and thread criteria files so that they *must* be unordered lists of
assertions. For both the thread and the diff, rather than providing an
abstract score, the judge's task is simply to count how many of these
assertions are satisfied. A percentage score can be derived from this
number, divided by the total number of assertions.
* **repetitions** - Rather than running each example once, and judging
it N times, we'll **run** the example N times. Right now, I'm just
judging the output once per run, because I believe that with these more
clear scoring criteria, the main source of non-determinism will be the
*agent's* behavior, not the judge's
### Questions
* **accounting for diagnostic errors** - Previously, the judge was asked
to incorporate diagnostics into their abstract scores. Now that the
"score" is determined directly from the criteria, the diagnostic will
not be captured in the score. How should the diagnostics be accounted
for in the eval? One thought is - let's simply count and report the
number of errors remaining after the agent finishes, as a separate field
of the run (along with diff score and thread score). We could consider
normalizing it using the total lines of added code (like errors per 100
lines of code added) in order to give it some semblance of stability
between examples.
* **repetitions** - How many repetitions should we run on CI? Each
repetition takes significant time, but I think running more than one
repetition will make the scores significantly less volatile.
### Todo
* [x] Fix `--concurrency` implementation so that only N tasks are
spawned
* [x] Support `--repetitions` efficiently (re-using the same worktree)
* [x] Restructure judge prompts to count passing criteria, not compute
abstract score
* [x] Report total number of diagnostics in some way
* [x] Format output nicely
Release Notes:
- N/A *or* Added/Fixed/Improved ...
---------
Co-authored-by: Antonio Scandurra <me@as-cii.com>
This PR refines a bit the web search tool UI by introducing a component
(`ToolCallCardHeader`) that aims to standardize the heading element of
tool calls in the thread.
In terms of next steps, I plan to evolve this component further soon
(e.g., building a full-blown "tool call card" component), and even move
it to a place where I can re-use it in the active_thread as well without
making the `assistant_tools` a dependency of it.
Release Notes:
- N/A
Mainly removing the "You" label, which didn't add a lot of value. Still
figuring out an issue with font size Markdown rendering before merging
this PR.
Release Notes:
- N/A