1. The `edit_file` tool tended to use `create_or_overwrite` too often, which corrupted long files. This change replaces the boolean flag with an `EditFileMode` enum, which helps the Agent make a more deliberate choice when overwriting files. With this change, the pass rate of the new eval increased from 10% to 100%. 2. eval: Added the ability to run an eval on top of an existing thread. Threads can now be loaded from JSON files in the `SerializedThread` format, which makes it easy to use real threads as starting points for tests/evals. 3. Don't try to restore tool cards when running in headless or eval mode, since there is no window in which to do this properly. Release Notes: - N/A |
||
---|---|---|
threads | ||
add_arg_to_trait_method.rs | ||
code_block_citations.rs | ||
comment_translation.rs | ||
file_search.rs | ||
find_and_replace_diff_card.toml | ||
hallucinated_tool_calls.toml | ||
mod.rs | ||
no_tools_enabled.toml | ||
overwrite_file.rs | ||
planets.rs | ||
tree_sitter_drop_emscripten_dep.toml |
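
The commit message above describes replacing a boolean `create_or_overwrite` flag with an `EditFileMode` enum. A minimal sketch of why an enum forces a more deliberate choice might look like the following. The variant names (`Edit`, `Create`, `Overwrite`), the `EditError` type, and the `check_mode` helper are all assumptions made for illustration; they are not taken from the source and may not match the actual implementation.

```rust
/// Hypothetical stand-in for the enum named in the commit message.
/// The variant names are assumptions, not confirmed by the source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EditFileMode {
    /// Make a targeted edit to a file that already exists.
    Edit,
    /// Create a file that does not exist yet.
    Create,
    /// Replace the entire contents of a file.
    Overwrite,
}

/// Hypothetical error type for rejected requests.
#[derive(Debug, PartialEq, Eq)]
pub enum EditError {
    FileMissing,
    FileAlreadyExists,
}

/// Decide whether a request may proceed, given whether the target exists.
/// With a single boolean, "create" and "overwrite" are indistinguishable;
/// with three explicit modes, an accidental overwrite of an existing file
/// under `Create` can be rejected.
pub fn check_mode(mode: EditFileMode, file_exists: bool) -> Result<(), EditError> {
    match (mode, file_exists) {
        (EditFileMode::Edit, false) => Err(EditError::FileMissing),
        (EditFileMode::Create, true) => Err(EditError::FileAlreadyExists),
        _ => Ok(()),
    }
}
```

In this sketch, a caller that wants to replace a long file has to pick `Overwrite` explicitly, while `Create` on an existing path is refused rather than silently clobbering it.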