Yehowshua/ZIm

Author SHA1 Message Date


Marshall Bowers

a5405fcbd7

eval: Add support for reading from a `.env` file (#29426)

This PR adds support for the eval to read environment variables from a
`.env` file located in the `crates/eval` directory.

For instance, you can use it to set your Anthropic API key:

```
ANTHROPIC_API_KEY=<secret>
```
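As a rough illustration of what reading such a file involves, here is a minimal sketch in Rust using only the standard library. It is not the PR's implementation: the commit message does not say how the file is parsed, and `parse_dotenv` is a hypothetical helper introduced only for this example. It assumes one `KEY=VALUE` pair per line, with `#` comments and blank lines ignored.

```rust
use std::collections::HashMap;

/// Hypothetical helper: parse `.env`-style text into key/value pairs.
/// Blank lines and lines starting with `#` are skipped.
fn parse_dotenv(contents: &str) -> HashMap<String, String> {
    let mut vars = HashMap::new();
    for line in contents.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Split on the first `=` only, so values may themselves contain `=`.
        if let Some((key, value)) = line.split_once('=') {
            // Strip surrounding double quotes, if any.
            let value = value.trim().trim_matches('"');
            vars.insert(key.trim().to_string(), value.to_string());
        }
    }
    vars
}

fn main() {
    // Path taken from the commit message; adjust to your checkout.
    let path = "crates/eval/.env";
    match std::fs::read_to_string(path) {
        Ok(contents) => {
            for (key, _value) in parse_dotenv(&contents) {
                // Print only the variable names, never the secret values.
                println!("loaded {key}");
            }
        }
        Err(err) => eprintln!("could not read {path}: {err}"),
    }
}
```

The sketch only parses the file and reports which keys it found; how the eval actually applies the values to its environment is not described in the commit message.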

Release Notes:

- N/A

2025-04-25 15:53:02 +00:00

Oleksiy Syvokon

3389327df5

eval: Add HTML overview for evaluation runs (#29413)

This update generates a single self-contained .html file that shows an
overview of evaluation threads in the browser. It's useful for:

- Quickly reviewing results
- Sharing evaluation runs
- Debugging
- Comparing models (TBD)

Features:

- Export thread JSON from the UI
- Keyboard navigation (j/k or Ctrl + ←/→)
- Toggle between compact and full views

Generating the overview:

- `cargo run -p eval` writes this file to the root of the run directory.
- Alternatively, run `cargo run -p eval --bin explorer` to generate it
  without running evals.
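
To make "self-contained" concrete, the sketch below shows one way a single HTML file can carry its own data: the thread JSON is inlined in a `<script type="application/json">` element, so the page needs no server or side files. This is an assumption for illustration, not the explorer's actual code; `render_overview` and the `threads` element id are hypothetical names.

```rust
/// Hypothetical helper: build one self-contained HTML page with the
/// thread data inlined. `threads_json` must already be valid JSON.
fn render_overview(threads_json: &str) -> String {
    // Escape `</` so the data cannot close the <script> element early.
    // `<\/` is still a valid JSON string escape for `</`.
    let safe_json = threads_json.replace("</", "<\\/");
    format!(
        "<!DOCTYPE html>\n<html>\n<head><meta charset=\"utf-8\">\
         <title>Eval overview</title></head>\n<body>\n\
         <script id=\"threads\" type=\"application/json\">{safe_json}</script>\n\
         </body>\n</html>\n"
    )
}

fn main() {
    // Made-up sample data; a real run would serialize its threads here.
    let html = render_overview(r#"[{"name": "example-thread"}]"#);
    println!("{html}");
}
```

Inlining the data is what makes the file easy to share: whoever receives it can open it directly in a browser.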


Screenshot:

![image](https://github.com/user-attachments/assets/4ead71f6-da08-48ea-8fcb-2148d2e4b4db)


Release Notes:

- N/A

2025-04-25 17:49:05 +03:00

Antonio Scandurra

8ac378b86e

Lay the groundwork for a Rust-based eval (#28488)

Also, we moved the logic for driving the agentic loop into `Thread` so
that we don't have to re-implement it.

Release Notes:

- N/A

---------

Co-authored-by: Nathan Sobo <nathan@zed.dev>

2025-04-10 04:45:27 +00:00

3 commits