
Agus Zubiaga 8b5835de17 agent: Improve initial file search quality (#29317)	2025-04-23 21:24:41 -03:00

This PR significantly improves the quality of the initial file search that occurs when the model doesn't yet know the full path to a file it needs to read/edit.

Previously, the assertions in `file_search` often failed on main as the model attempted to guess full file paths. On this branch, it reliably calls `find_path` (previously `path_search`) before reading files.

After getting the model to find paths first, I noticed it would try using `grep` instead of `path_search`. This motivated renaming `path_search` to `find_path` (continuing the analogy to unix commands) and adding system prompt instructions about proper tool selection.

Note: I know the command is just called `find`, but that seemed too general.

In my eval runs, the `file_search` example improved from 40% ± 10% to 98% ± 2%. The only assertion I'm seeing occasionally fail is "glob starts with `**` or project". We can probably add some instructions in that regard.

Release Notes:

- N/A
src	agent: Improve initial file search quality (#29317)	2025-04-23 21:24:41 -03:00
.gitignore	Add judge to new eval + provide LSP diagnostics (#28713)	2025-04-14 20:18:47 +00:00
Cargo.toml	eval: New `add_arg_to_trait_method` example (#29297)	2025-04-23 18:46:39 +00:00
LICENSE-GPL	Lay the groundwork for a Rust-based eval (#28488)	2025-04-10 04:45:27 +00:00
README.md	Lay the groundwork for a Rust-based eval (#28488)	2025-04-10 04:45:27 +00:00
runner_settings.json	eval: Fix stalling on tool confirmation (#28786)	2025-04-15 16:53:45 +00:00

README.md

Eval

This eval assumes the working directory is the root of the repository. Run it with:

cargo run -p eval