Antonio Scandurra 019a14bcde Replace `async-watch` with a custom watch (#32245)

The `async-watch` crate doesn't seem to be maintained and we noticed several panics coming from it, such as:

```
[bug] failed to observe change after notificaton.
zed::reliability::init_panic_hook::{{closure}}::hea8cdcb6299fad6b+154543526
std::panicking::rust_panic_with_hook::h33b18b24045abff4+127578547
std::panicking::begin_panic_handler::{{closure}}::hf8313cc2fd0126bc+127577770
std::sys::backtrace::__rust_end_short_backtrace::h57fe07c8aea5c98a+127571385
__rustc[95feac21a9532783]::rust_begin_unwind+127576909
core::panicking::panic_fmt::hd54fb667be51beea+9433328
core::option::expect_failed::h8456634a3dada3e4+9433291
assistant_tools::edit_agent::EditAgent::apply_edit_chunks::{{closure}}::habe2e1a32b267fd4+26921553
gpui::app::async_context::AsyncApp::spawn::{{closure}}::h12f5f25757f572ea+25923441
async_task::raw::RawTask<F,T,S,M>::run::h3cca0d402690ccba+25186815
<gpui::platform::linux::x11::client::X11Client as gpui::platform::linux::platform::LinuxClient>::run::h26264aefbcfbc14b+73961666
gpui::platform::linux::platform::<impl gpui::platform::Platform for P>::run::hb12dcd4abad715b5+73562509
gpui::app::Application::run::h0f936a5f855a3f9f+150676820
zed::main::ha17f9a25fe257d35+154788471
std::sys::backtrace::__rust_begin_short_backtrace::h1edd02429370b2bd+154624579
std::rt::lang_start::{{closure}}::h3d2e300f10059b0a+154264777
std::rt::lang_start_internal::h418648f91f5be3a1+127502049
main+154806636
__libc_start_main+46051972301573
_start+12358494
```

I didn't find an executor-agnostic watch crate that was well maintained (we already tried postage and async-watch), so we decided to implement our own version.

Release Notes:

- Fixed a panic that could sometimes occur when the agent performed edits.

2025-06-06 16:00:09 +00:00
docs	eval: Add HTML overview for evaluation runs (#29413)	2025-04-25 17:49:05 +03:00
src	Replace `async-watch` with a custom watch (#32245)	2025-06-06 16:00:09 +00:00
.gitignore	Add judge to new eval + provide LSP diagnostics (#28713)	2025-04-14 20:18:47 +00:00
Cargo.toml	Replace `async-watch` with a custom watch (#32245)	2025-06-06 16:00:09 +00:00
LICENSE-GPL	Lay the groundwork for a Rust-based eval (#28488)	2025-04-10 04:45:27 +00:00
README.md	eval: Add support for reading from a `.env` file (#29426)	2025-04-25 15:53:02 +00:00
runner_settings.json	Introduce a new `StreamingEditFileTool` (#29733)	2025-05-01 17:37:43 +02:00

README.md

Eval

This eval assumes the working directory is the root of the repository. Run it with:

cargo run -p eval

The eval optionally reads a .env file in crates/eval, which you can use to set environment variables such as API keys.
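This README does not show how the eval loads the .env file, so the following is only a minimal sketch of how a typical KEY=VALUE file of this kind could be parsed in Rust. The function names `parse_env` and `strip_quotes` are hypothetical, as is the variable name `EXAMPLE_API_KEY` used in the note below; the real loader may accept a different syntax.

```rust
use std::collections::HashMap;

/// Parse `KEY=VALUE` lines from the contents of a `.env`-style file.
/// Hypothetical helper: blank lines and lines starting with `#` are skipped,
/// and lines without an `=` are ignored.
fn parse_env(contents: &str) -> HashMap<String, String> {
    let mut vars = HashMap::new();
    for line in contents.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Split on the first `=` only, so values may themselves contain `=`.
        if let Some((key, value)) = line.split_once('=') {
            let key = key.trim();
            let value = strip_quotes(value.trim());
            if !key.is_empty() {
                vars.insert(key.to_string(), value.to_string());
            }
        }
    }
    vars
}

/// Remove one pair of matching single or double quotes around a value, if present.
fn strip_quotes(value: &str) -> &str {
    for quote in ['"', '\''] {
        if value.len() >= 2 && value.starts_with(quote) && value.ends_with(quote) {
            return &value[1..value.len() - 1];
        }
    }
    value
}
```

Under these assumptions, a file containing the line `EXAMPLE_API_KEY="abc=123"` would yield the key `EXAMPLE_API_KEY` with the value `abc=123`. A caller could then pass the file contents from `std::fs::read_to_string` to `parse_env` and export each pair before the run starts.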

Explorer Tool

The explorer tool generates a self-contained HTML view from one or more thread JSON files. It provides a visual interface for exploring the agent thread, including tool calls and results. See ./docs/explorer.md for more details.

Usage

cargo run -p eval --bin explorer -- --input <path-to-json-files> --output <output-html-path>

Example:

cargo run -p eval --bin explorer -- --input ./runs/2025-04-23_15-53-30/fastmcp_bugifx/*/last.messages.json --output /tmp/explorer.html