ZIm/crates/eval
Oleksiy Syvokon 6df4c537b9
agent: Less disruptive changed file notification (#31693)
When the user edits one of the tracked files, we used to notify the
agent by inserting a user message at the end of the thread. This was
causing a few problems:
- The agent would stop doing its work and start reading changed files
- The agent would write something like, "Thank you for letting me know
about these changed files."

This fix contains two parts:
1. Changing the prompt to indicate this is a service message
2. Moving the message higher in the conversation thread

This works, but it slightly hurts caching.

We may consider making these notification messages stick in history,
trading context tokens count for the cache.

This might be related to #30906

Release Notes:

- N/A

---------

Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2025-06-16 18:45:24 +03:00
..
docs eval: Add HTML overview for evaluation runs (#29413) 2025-04-25 17:49:05 +03:00
src agent: Less disruptive changed file notification (#31693) 2025-06-16 18:45:24 +03:00
.gitignore Add judge to new eval + provide LSP diagnostics (#28713) 2025-04-14 20:18:47 +00:00
Cargo.toml Replace async-watch with a custom watch (#32245) 2025-06-06 16:00:09 +00:00
LICENSE-GPL Lay the groundwork for a Rust-based eval (#28488) 2025-04-10 04:45:27 +00:00
README.md eval: Add support for reading from a .env file (#29426) 2025-04-25 15:53:02 +00:00
runner_settings.json Introduce a new StreamingEditFileTool (#29733) 2025-05-01 17:37:43 +02:00

Eval

This eval assumes the working directory is the root of the repository. Run it with:

cargo run -p eval

The eval will optionally read a .env file in crates/eval if you need it to set environment variables, such as API keys.

Explorer Tool

The explorer tool generates a self-contained HTML view from one or more thread JSON file. It provides a visual interface to explore the agent thread, including tool calls and results. See ./docs/explorer.md for more details.

Usage

cargo run -p eval --bin explorer -- --input <path-to-json-files> --output <output-html-path>

Example:

cargo run -p eval --bin explorer -- --input ./runs/2025-04-23_15-53-30/fastmcp_bugifx/*/last.messages.json --output /tmp/explorer.html