Commit graph

4 commits

Author SHA1 Message Date
Oleksiy Syvokon
cb187b0b4d
evals: Configurable number of max dialog turns (#31680)
Release Notes:

- N/A
2025-05-29 10:35:29 +00:00
Marshall Bowers
8faeb34367
Rename assistant_settings to agent_settings (#31513)
This PR renames the `assistant_settings` crate to `agent_settings`, as
well a number of constructs within it.

Release Notes:

- N/A
2025-05-27 15:16:55 +00:00
Oleksiy Syvokon
61a40e293d
evals: Allow threads explorer to search for JSON files recursively (#31509)
It's just more convenient to call it from CLI this way.

+ minor fixes in evals

Release Notes:

- N/A
2025-05-27 14:18:47 +00:00
Oleksiy Syvokon
255d8f7cf8
agent: Overwrite files more cautiously (#30649)
1. The `edit_file` tool tended to use `create_or_overwrite` a bit too
often, leading to corruption of long files. This change replaces the
boolean flag with an `EditFileMode` enum, which helps Agent make a more
deliberate choice when overwriting files.

With this change, the pass rate of the new eval increased from 10% to
100%.

2. eval: Added ability to run eval on top of an existing thread. Threads
can now be loaded from JSON files in the `SerializedThread` format,
which makes it easy to use real threads as starting points for
tests/evals.

3. Don't try to restore tool cards when running in headless or eval mode
-- we don't have a window to properly do this.

Release Notes:

- N/A
2025-05-14 10:40:44 +03:00