Commit graph

3 commits

Author SHA1 Message Date
Marshall Bowers
a5405fcbd7
eval: Add support for reading from a .env file (#29426)
This PR adds support for the eval to read environment variables from a
`.env` file located in the `crates/eval` directory.

For instance, you can use it to set your Anthropic API key:

```
ANTHROPIC_API_KEY=<secret>
```

Release Notes:

- N/A
2025-04-25 15:53:02 +00:00
Oleksiy Syvokon
3389327df5
eval: Add HTML overview for evaluation runs (#29413)
This update generates a single self-contained .html file that shows an
overview of evaluation threads in the browser. It's useful for:

- Quickly reviewing results
- Sharing evaluation runs
- Debugging
- Comparing models (TBD)

Features:

- Export thread JSON from the UI
- Keyboard navigation (j/k or Ctrl + ←/→)
- Toggle between compact and full views

Generating the overview:

- `cargo run -p eval` will write this file in the run dir's root.
- Or you can call `cargo run -p eval --bin explorer` to generate it
without running evals.


Screenshot:

![image](https://github.com/user-attachments/assets/4ead71f6-da08-48ea-8fcb-2148d2e4b4db)


Release Notes:

- N/A
2025-04-25 17:49:05 +03:00
Antonio Scandurra
8ac378b86e
Lay the groundwork for a Rust-based eval (#28488)
Also, we moved the logic for driving the agentic loop into `Thread` so
that we don't have to re-implement it.

Release Notes:

- N/A

---------

Co-authored-by: Nathan Sobo <nathan@zed.dev>
2025-04-10 04:45:27 +00:00