Commit graph

8 commits

Author SHA1 Message Date
Michael Sloan
0f3ac38332
Agent eval: Copy .rules file into eval worktree for examples based on Zed (#29116)
Also reverts #29108, which cherry-picked the rules file for an eval
example.

Release Notes:

- N/A
2025-04-21 12:02:44 -06:00
Nathan Sobo
107d8ca483
Rename regex search tool to grep and accept an include glob pattern (#29100)
This PR renames the `regex_search` tool to `grep` because I think it
conveys more meaning to the model, the idea of searching the filesystem
with a regular expression. It's also one word and the model seems to be
using it effectively after some additional prompt tuning.

It also takes an include pattern to filter on the specific files we try
to search. I'd like to encourage the model to scope its searches more
aggressively, as in my testing, I'm only seeing it filter on file
extension.

Release Notes:

- N/A
2025-04-20 00:53:30 +00:00
Michael Sloan
a91948aeb4
agent: Reorder some linux keybindings to match mac keybindings (#29107)
Release Notes:

- Made keybindings for agent panel closer to the precedence order used
on Mac. This fixes use of `enter` to add context from the menu triggered
by `@` referencing.
2025-04-20 00:01:43 +00:00
Nathan Sobo
bab28560ef
Systematically optimize agentic editing performance (#28961)
Now that we've established a proper eval in tree, this PR is reboots of
our agent loop back to a set of minimal tools and simpler prompts. We
should aim to get this branch feeling subjectively competitive with
what's on main and then merge it, and build from there.

Let's invest in our eval and use it to drive better performance of the
agent loop. How you can help: Pick an example, and then make the outcome
faster or better. It's fine to even use your own subjective judgment, as
our evaluation criteria likely need tuning as well at this point. Focus
on making the agent work better in your own subjective experience first.
Let's focus on simple/practical improvements to make this thing work
better, then determine how we can craft our judgment criteria to lock
those improvements in.

Release Notes:

- N/A

---------

Co-authored-by: Max <max@zed.dev>
Co-authored-by: Antonio <antonio@zed.dev>
Co-authored-by: Agus <agus@zed.dev>
Co-authored-by: Richard <richard@zed.dev>
Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Michael Sloan <mgsloan@gmail.com>
2025-04-19 02:47:59 +00:00
Thomas Mickley-Doyle
b1e4e6048a
agent: Add more Rust code examples, update TODO check (#28737)
Release Notes:

- N/A
2025-04-15 16:52:08 +00:00
Thomas Mickley-Doyle
d74f0735c2
Add more eval examples + filtering examples by language + fix git concurrent usage (#28719)
Release Notes:

- N/A

---------

Co-authored-by: michael <michael@zed.dev>
Co-authored-by: agus <agus@zed.dev>
2025-04-14 22:05:46 +00:00
Michael Sloan
6b80eb556c
Add judge to new eval + provide LSP diagnostics (#28713)
Release Notes:

- N/A

---------

Co-authored-by: Antonio Scandurra <antonio@zed.dev>
Co-authored-by: agus <agus@zed.dev>
2025-04-14 20:18:47 +00:00
Antonio Scandurra
8ac378b86e
Lay the groundwork for a Rust-based eval (#28488)
Also, we moved the logic for driving the agentic loop into `Thread` so
that we don't have to re-implement it.

Release Notes:

- N/A

---------

Co-authored-by: Nathan Sobo <nathan@zed.dev>
2025-04-10 04:45:27 +00:00