ZIm/crates/eval/src
2025-06-27 18:38:25 -04:00
examples                 WIP and merge  2025-06-27 18:38:25 -04:00
assertions.rs            eval: Count execution errors as failures (#30712)  2025-05-14 20:44:19 +03:00
eval.rs                  WIP and merge  2025-06-27 18:38:25 -04:00
example.rs               WIP and merge  2025-06-27 18:38:25 -04:00
explorer.html            eval: Add HTML overview for evaluation runs (#29413)  2025-04-25 17:49:05 +03:00
explorer.rs              evals: Allow threads explorer to search for JSON files recursively (#31509)  2025-05-27 14:18:47 +00:00
ids.rs                   Use anyhow more idiomatically (#31052)  2025-05-20 23:06:07 +00:00
instance.rs              WIP and merge  2025-06-27 18:38:25 -04:00
judge_diff_prompt.hbs    eval: Fine-grained assertions (#29246)  2025-04-22 23:58:58 -03:00
judge_thread_prompt.hbs  eval: Fine-grained assertions (#29246)  2025-04-22 23:58:58 -03:00
tool_metrics.rs          eval: Fine-grained assertions (#29246)  2025-04-22 23:58:58 -03:00