eval: Add HTML overview for evaluation runs (#29413)

Evaluation runs now generate a single self-contained .html file that
gives an overview of the evaluation threads in the browser. It's useful for:

- Quickly reviewing results
- Sharing evaluation runs
- Debugging
- Comparing models (TBD)

Features:

- Export thread JSON from the UI
- Keyboard navigation (j/k or Ctrl + ←/→)
- Toggle between compact and full views

Generating the overview:

- `cargo run -p eval` writes the file to the root of the run directory.
- `cargo run -p eval --bin explorer` generates it without running evals.
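
As a rough illustration of what "self-contained" can mean here, the sketch below inlines thread JSON into one HTML string, so the page needs no server and no sidecar files. It is not the code from this commit: `render_overview`, the `THREADS` variable, and the page layout are all hypothetical names chosen for the example.

```rust
/// Hypothetical helper (not taken from the eval crate): builds one HTML page
/// with every thread's JSON inlined into a <script> block.
/// Each entry of `threads_json` is assumed to be an already-serialized JSON object.
fn render_overview(threads_json: &[String]) -> String {
    // Escape "</" so that thread content such as "</script>" cannot close
    // the script tag early. "\/" is a valid escape in both JSON and JS strings.
    let payload = threads_json
        .iter()
        .map(|thread| thread.replace("</", "<\\/"))
        .collect::<Vec<_>>()
        .join(",");

    // Everything the page needs lives in this one string.
    format!(
        "<!DOCTYPE html>\n<html>\n<head><meta charset=\"utf-8\"><title>Eval overview</title></head>\n<body>\n<script>const THREADS = [{payload}];</script>\n</body>\n</html>\n"
    )
}
```

Writing the returned string to a file (for example with `std::fs::write`) would yield a page that can be opened or shared as a single file. The escaping step matters because thread content is arbitrary text and may itself contain HTML.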


Screenshot:

![image](https://github.com/user-attachments/assets/4ead71f6-da08-48ea-8fcb-2148d2e4b4db)


Release Notes:

- N/A
Oleksiy Syvokon 2025-04-25 17:49:05 +03:00 committed by GitHub
parent f106dfca42
commit 3389327df5
7 changed files with 1351 additions and 149 deletions

1
Cargo.lock generated

@@ -4983,6 +4983,7 @@ dependencies = [
"language_models",
"languages",
"node_runtime",
"pathdiff",
"paths",
"project",
"prompt_store",