ZIm/crates/eval
Oleksiy Syvokon 76a78b550b
eval: Write JSON-serialized thread (#29271)
This adds a `last.message.json` file that contains the full request plus
the response (serialized as a message from the assistant for consistency
with other messages).

Motivation: to capture more info and to make analysis of finished runs
easier.

Release Notes:

- N/A
2025-04-23 15:22:19 +03:00
src eval: Write JSON-serialized thread (#29271) 2025-04-23 15:22:19 +03:00
.gitignore Add judge to new eval + provide LSP diagnostics (#28713) 2025-04-14 20:18:47 +00:00
Cargo.toml eval: Fine-grained assertions (#29246) 2025-04-22 23:58:58 -03:00
LICENSE-GPL Lay the groundwork for a Rust-based eval (#28488) 2025-04-10 04:45:27 +00:00
README.md Lay the groundwork for a Rust-based eval (#28488) 2025-04-10 04:45:27 +00:00
runner_settings.json eval: Fix stalling on tool confirmation (#28786) 2025-04-15 16:53:45 +00:00

Eval

This eval assumes the working directory is the root of the repository. Run it with:

cargo run -p eval
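
Because the eval assumes it is started from the repository root, a quick pre-flight check can catch a wrong working directory early. The following is a minimal sketch, not part of the eval crate: the helper name `looks_like_repo_root` is hypothetical, and treating the presence of `crates/eval/Cargo.toml` as the marker for the root is an assumption based on the file listing above.

```rust
use std::path::Path;

/// Hypothetical helper (not part of the eval crate): returns true if `dir`
/// looks like the repository root. Assumption: the root is identified by
/// the presence of `crates/eval/Cargo.toml` beneath it.
fn looks_like_repo_root(dir: &Path) -> bool {
    dir.join("crates")
        .join("eval")
        .join("Cargo.toml")
        .is_file()
}
```

A caller could pass the result of `std::env::current_dir()` to this function and print a hint to change to the repository root before invoking `cargo run -p eval`.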