![]() This pull request introduces a new tool for streaming edits. The short-term goal is for this tool to replace the existing `EditFileTool`, but we want to get this out the door as soon as possible so that we can start testing it. `StreamingEditFileTool` is mutually exclusive with `EditFileTool`. It will be enabled by default for anyone who has the `agent-stream-edits` feature flag, as well as people that set `assistant.stream_edits` to `true` in their settings. ### Implementation Streaming is achieved by requesting a completion while the `edit_file` tool gets called. We invoke the model by taking the existing conversation with the agent and appending a prompt specifically tailored for editing. In that prompt, we ask the model to produce a stream of `<old_text>`/`<new_text>` tags. As the model streams text in, we incrementally parse it and start editing as soon as we can. ### Evals Note that, as part of this pull request, I also defined some new evals that I used to drive the behavior of the recursive LLM call. To run them, use this command: ```bash cargo test --package=assistant_tools --features eval -- eval_extract_handle_command_output ``` Or comment out the `#[cfg_attr(not(feature = "eval"), ignore)]` macro. I recommend running them one at a time, because right now we don't really have a way of orchestrating of all these evals. I think we should invest into that effort once the new agent panel goes live. Release Notes: - N/A --------- Co-authored-by: Nathan Sobo <nathan@zed.dev> Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de> Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com> |
||
---|---|---|
.. | ||
docs | ||
src | ||
.gitignore | ||
Cargo.toml | ||
LICENSE-GPL | ||
README.md | ||
runner_settings.json |
Eval
This eval assumes the working directory is the root of the repository. Run it with:
cargo run -p eval
The eval will optionally read a .env
file in crates/eval
if you need it to set environment variables, such as API keys.
Explorer Tool
The explorer tool generates a self-contained HTML view from one or more thread JSON file. It provides a visual interface to explore the agent thread, including tool calls and results. See ./docs/explorer.md for more details.
Usage
cargo run -p eval --bin explorer -- --input <path-to-json-files> --output <output-html-path>
Example:
cargo run -p eval --bin explorer -- --input ./runs/2025-04-23_15-53-30/fastmcp_bugifx/*/last.messages.json --output /tmp/explorer.html