ZIm/crates/eval/Cargo.toml
Antonio Scandurra f891dfb358
Introduce a new StreamingEditFileTool (#29733)
This pull request introduces a new tool for streaming edits. The
short-term goal is for this tool to replace the existing `EditFileTool`,
but we want to get this out the door as soon as possible so that we can
start testing it.

`StreamingEditFileTool` is mutually exclusive with `EditFileTool`. It
will be enabled by default for anyone who has the `agent-stream-edits`
feature flag, as well as for people who set `assistant.stream_edits` to
`true` in their settings.

### Implementation

Streaming is achieved by requesting a completion when the `edit_file`
tool is called. We invoke the model by taking the existing
conversation with the agent and appending a prompt specifically tailored
for editing. In that prompt, we ask the model to produce a stream of
`<old_text>`/`<new_text>` tags. As the model streams text in, we
incrementally parse it and start editing as soon as we can.
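
To make the incremental parsing idea concrete, here is a minimal sketch, not the parser from this pull request. The names `Edit`, `EditStreamParser`, and `push` are hypothetical and chosen only for illustration. The sketch also simplifies the behavior described above: it yields an edit only once a full `<old_text>`/`<new_text>` pair has arrived, whereas the real tool may begin applying an edit earlier.

```rust
/// One completed edit: replace `old_text` with `new_text`.
/// (Hypothetical type, used only for this sketch.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Edit {
    pub old_text: String,
    pub new_text: String,
}

/// Hypothetical incremental parser. It buffers streamed chunks and
/// returns edits as soon as a complete tag pair is available.
#[derive(Default)]
pub struct EditStreamParser {
    buffer: String,
}

impl EditStreamParser {
    /// Feed the next chunk of model output and return every edit that
    /// became complete. Incomplete trailing text stays buffered.
    pub fn push(&mut self, chunk: &str) -> Vec<Edit> {
        self.buffer.push_str(chunk);
        let mut edits = Vec::new();
        while let Some((edit, consumed)) = parse_one(&self.buffer) {
            edits.push(edit);
            // Drop the bytes that belong to the edit we just emitted.
            self.buffer.drain(..consumed);
        }
        edits
    }
}

/// Try to parse one full `<old_text>`/`<new_text>` pair from the start
/// of `input`. Returns the edit and the number of bytes it consumed.
fn parse_one(input: &str) -> Option<(Edit, usize)> {
    let (old_text, after_old) = extract(input, 0, "<old_text>", "</old_text>")?;
    let (new_text, after_new) = extract(input, after_old, "<new_text>", "</new_text>")?;
    Some((Edit { old_text, new_text }, after_new))
}

/// Find `open ... close` at or after byte offset `from`. Returns the
/// enclosed text and the offset just past the closing tag.
fn extract(input: &str, from: usize, open: &str, close: &str) -> Option<(String, usize)> {
    let start = from + input[from..].find(open)? + open.len();
    let end = start + input[start..].find(close)?;
    Some((input[start..end].to_string(), end + close.len()))
}
```

A caller would invoke `push` once per streamed chunk and apply each returned `Edit` to the buffer immediately. Because unfinished text is kept in the parser, a tag split across two chunks is handled without special cases.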

### Evals

Note that, as part of this pull request, I also defined some new evals
that I used to drive the behavior of the recursive LLM call. To run
them, use this command:

```bash
cargo test --package=assistant_tools --features eval -- eval_extract_handle_command_output
```

Or comment out the `#[cfg_attr(not(feature = "eval"), ignore)]` attribute.
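
For readers unfamiliar with this gating pattern, the sketch below shows how it behaves. The function `run_eval` and the test name `eval_example` are hypothetical placeholders, not code from this pull request.

```rust
/// Hypothetical stand-in for an expensive, model-backed eval.
fn run_eval() -> bool {
    true
}

// Without `--features eval`, `cfg_attr` expands to `#[ignore]`, so the
// test is compiled but skipped. With the feature enabled, it runs.
#[test]
#[cfg_attr(not(feature = "eval"), ignore)]
fn eval_example() {
    assert!(run_eval());
}
```

Commenting out the attribute has the same effect as enabling the feature for that one test: the test is no longer ignored by a plain `cargo test`.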

I recommend running them one at a time, because right now we don't
really have a way of orchestrating all these evals. I think we should
invest in that effort once the new agent panel goes live.

Release Notes:

- N/A

---------

Co-authored-by: Nathan Sobo <nathan@zed.dev>
Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com>
2025-05-01 17:37:43 +02:00

67 lines
1.5 KiB
TOML

[package]
name = "eval"
version = "0.1.0"
publish.workspace = true
edition.workspace = true
license = "GPL-3.0-or-later"
default-run = "eval"

[lints]
workspace = true

[[bin]]
name = "eval"
path = "src/eval.rs"

[[bin]]
name = "explorer"
path = "src/explorer.rs"

[dependencies]
agent.workspace = true
anyhow.workspace = true
assistant_tool.workspace = true
assistant_tools.workspace = true
async-trait.workspace = true
async-watch.workspace = true
buffer_diff.workspace = true
chrono.workspace = true
clap.workspace = true
client.workspace = true
collections.workspace = true
context_server.workspace = true
dirs.workspace = true
dotenv.workspace = true
env_logger.workspace = true
extension.workspace = true
fs.workspace = true
futures.workspace = true
gpui.workspace = true
gpui_tokio.workspace = true
handlebars.workspace = true
language.workspace = true
language_extension.workspace = true
language_model.workspace = true
language_models.workspace = true
languages = { workspace = true, features = ["load-grammars"] }
markdown.workspace = true
node_runtime.workspace = true
pathdiff.workspace = true
paths.workspace = true
pretty_assertions.workspace = true
project.workspace = true
prompt_store.workspace = true
regex.workspace = true
release_channel.workspace = true
reqwest_client.workspace = true
serde.workspace = true
serde_json.workspace = true
settings.workspace = true
shellexpand.workspace = true
smol.workspace = true
telemetry.workspace = true
toml.workspace = true
unindent.workspace = true
util.workspace = true
uuid.workspace = true
workspace-hack.workspace = true