
Michael Sloan d497f52e17 agent: Improve error handling and retry for zed-provided models (#33565)	2025-06-30 21:01:32 -06:00

* Updates to `zed_llm_client-0.8.5` which adds support for `retry_after` when anthropic provides it.
* Distinguishes upstream provider errors and rate limits from errors that originate from zed's servers
* Moves `LanguageModelCompletionError::BadInputJson` to `LanguageModelCompletionEvent::ToolUseJsonParseError`. While arguably this is an error case, the logic in thread is cleaner with this move. There is also precedent for inclusion of errors in the event type - `CompletionRequestStatus::Failed` is how cloud errors arrive.
* Updates `PROVIDER_ID` / `PROVIDER_NAME` constants to use proper types instead of `&str`, since they can be constructed in a const fashion.
* Removes use of `CLIENT_SUPPORTS_EXA_WEB_SEARCH_PROVIDER_HEADER_NAME` as the server no longer reads this header and just defaults to that behavior.

Release notes for this is covered by #33275

Release Notes:
- N/A

---------

Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Richard <richard@zed.dev>
docs	eval: Add HTML overview for evaluation runs (#29413)	2025-04-25 17:49:05 +03:00
src	agent: Improve error handling and retry for zed-provided models (#33565)	2025-06-30 21:01:32 -06:00
.gitignore	Add judge to new eval + provide LSP diagnostics (#28713)	2025-04-14 20:18:47 +00:00
Cargo.toml	Extract an agent_ui crate from agent (#33284)	2025-06-23 18:00:28 -07:00
LICENSE-GPL	Lay the groundwork for a Rust-based eval (#28488)	2025-04-10 04:45:27 +00:00
README.md	eval: Add support for reading from a `.env` file (#29426)	2025-04-25 15:53:02 +00:00
runner_settings.json	Introduce a new `StreamingEditFileTool` (#29733)	2025-05-01 17:37:43 +02:00

README.md

Eval

This eval assumes the working directory is the root of the repository. Run it with:

cargo run -p eval

The eval optionally reads a .env file in crates/eval, which you can use to set environment variables such as API keys.
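
As a rough illustration, a .env file in crates/eval might look like the sketch below. The variable name is a placeholder invented for this example; the README does not list which variables the eval reads, so substitute whatever keys your model provider requires.

```shell
# crates/eval/.env (illustrative only)
# EXAMPLE_PROVIDER_API_KEY is a hypothetical name, not one the eval is known to read.
EXAMPLE_PROVIDER_API_KEY=replace-with-your-key
```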

Explorer Tool

The explorer tool generates a self-contained HTML view from one or more thread JSON files. It provides a visual interface for exploring the agent thread, including tool calls and results. See ./docs/explorer.md for more details.

Usage

cargo run -p eval --bin explorer -- --input <path-to-json-files> --output <output-html-path>

Example:

cargo run -p eval --bin explorer -- --input ./runs/2025-04-23_15-53-30/fastmcp_bugifx/*/last.messages.json --output /tmp/explorer.html
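
In the example above, the shell expands the `*` before the explorer runs, so `--input` receives one path per matching directory. If you want to check which files a pattern will match first, a small helper along these lines may be useful. `list_thread_files` is a hypothetical name introduced here, not part of the eval crate, and the run directory is the one from the example.

```shell
# Print every last.messages.json found one level below the given run directory.
# list_thread_files is a hypothetical helper, not something the eval crate provides.
list_thread_files() {
  for f in "$1"/*/last.messages.json; do
    # When nothing matches, the loop sees the unexpanded pattern; skip it.
    [ -e "$f" ] || continue
    printf '%s\n' "$f"
  done
}

# Preview the inputs for the run directory used in the example above.
list_thread_files ./runs/2025-04-23_15-53-30/fastmcp_bugifx
```

If the helper prints nothing, the pattern matched no files, and the explorer command would be handed the literal, unexpanded pattern instead of real paths.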