Yehowshua/ZIm - Forgejo: Beyond coding. We Forge.

Author	SHA1	Message	Date
Oleksiy Syvokon	5112fcebeb	evals: Make LLMs configurable in edit_agent evals (#30813) Release Notes: - N/A	2025-05-16 11:10:15 +00:00
Piotr Osiewicz	0f17e82154	chore: Bump Rust to 1.87 (#30739) Closes #ISSUE Release Notes: - N/A	2025-05-15 22:28:52 +00:00
Oleksiy Syvokon	255d8f7cf8	agent: Overwrite files more cautiously (#30649) 1. The `edit_file` tool tended to use `create_or_overwrite` a bit too often, leading to corruption of long files. This change replaces the boolean flag with an `EditFileMode` enum, which helps Agent make a more deliberate choice when overwriting files. With this change, the pass rate of the new eval increased from 10% to 100%. 2. eval: Added ability to run eval on top of an existing thread. Threads can now be loaded from JSON files in the `SerializedThread` format, which makes it easy to use real threads as starting points for tests/evals. 3. Don't try to restore tool cards when running in headless or eval mode -- we don't have a window to properly do this. Release Notes: - N/A	2025-05-14 10:40:44 +03:00
Richard Feldman	8fdf309a4a	Have read_file support images (#30435) This is very basic support for them. There are a number of other TODOs before this is really a first-class supported feature, so not adding any release notes for it; for now, this PR just makes it so that if read_file tries to read a PNG (which has come up in practice), it at least correctly sends it to Anthropic instead of messing up. This also lays the groundwork for future PRs for more first-class support for images in tool calls across more image file formats and LLM providers. Release Notes: - N/A --------- Co-authored-by: Agus Zubiaga <hi@aguz.me> Co-authored-by: Agus Zubiaga <agus@zed.dev>	2025-05-13 10:58:00 +02:00
Antonio Scandurra	1b593f616f	Include `EditAgent`'s raw output when inspecting thread (#30337) This allows us to debug the raw edits that were generated when people report feedback, when running evals and when opening the thread as Markdown. Release Notes: - Improved debug output for agent threads.	2025-05-09 06:58:45 +00:00
Antonio Scandurra	9f6809a28d	Reuse conversation cache when streaming edits (#30245) Release Notes: - Improved latency when the agent applies edits.	2025-05-08 14:36:34 +02:00
Antonio Scandurra	89430a019c	Fix agent reading and editing files over SSH (#30144) Release Notes: - Fixed a bug that would prevent the agent from working over SSH. --------- Co-authored-by: Nathan Sobo <nathan@zed.dev> Co-authored-by: Richard Feldman <oss@rtfeldman.com> Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com> Co-authored-by: Cole Miller <m@cole-miller.net>	2025-05-07 17:07:01 +00:00
Mikayla Maki	0cdd8bdded	Restore tool cards on thread deserialization (#30053) Release Notes: - N/A --------- Co-authored-by: Julia Ryan <juliaryan3.14@gmail.com>	2025-05-06 18:16:34 -07:00
Antonio Scandurra	07e6e49583	Add new editing eval scenario and improve it substantially (#29997) This improves the new eval scenario by ~80% (`0.29` vs `0.525`) without decreasing performance in the other evals. Release Notes: - Improved the performance of the `edit_file` tool.	2025-05-06 12:22:42 +00:00
Antonio Scandurra	210c338df4	Restore original file content when rejecting an overwritten file (#29974) Release Notes: - Fixed a bug that would cause rejecting a hunk from the agent to delete the file if the agent had decided to rewrite that file from scratch.	2025-05-06 07:05:55 +00:00
Antonio Scandurra	545ae27079	Add the ability to follow the agent as it makes edits (#29839) Nathan here: I also tacked on a bunch of UI refinement. Release Notes: - Introduced the ability to follow the agent around as it reads and edits files. --------- Co-authored-by: Nathan Sobo <nathan@zed.dev> Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>	2025-05-04 08:28:39 +00:00
Antonio Scandurra	35539847a4	Allow `StreamingEditFileTool` to also create files (#29785) Refs #29733 This pull request introduces a new field to the `StreamingEditFileTool` that lets the model create or overwrite a file in a streaming way. When either the `assistant.stream_edits` setting or the `agent-stream-edits` feature flag is enabled, we disable the `CreateFileTool` so that the agent model can only use `StreamingEditFileTool` for file creation. Release Notes: - N/A --------- Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com> Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com>	2025-05-02 09:57:04 +00:00
Antonio Scandurra	f891dfb358	Introduce a new `StreamingEditFileTool` (#29733) This pull request introduces a new tool for streaming edits. The short-term goal is for this tool to replace the existing `EditFileTool`, but we want to get this out the door as soon as possible so that we can start testing it. `StreamingEditFileTool` is mutually exclusive with `EditFileTool`. It will be enabled by default for anyone who has the `agent-stream-edits` feature flag, as well as people who set `assistant.stream_edits` to `true` in their settings. ### Implementation Streaming is achieved by requesting a completion while the `edit_file` tool gets called. We invoke the model by taking the existing conversation with the agent and appending a prompt specifically tailored for editing. In that prompt, we ask the model to produce a stream of `<old_text>`/`<new_text>` tags. As the model streams text in, we incrementally parse it and start editing as soon as we can. ### Evals Note that, as part of this pull request, I also defined some new evals that I used to drive the behavior of the recursive LLM call. To run them, use this command: ```bash cargo test --package=assistant_tools --features eval -- eval_extract_handle_command_output ``` Or comment out the `#[cfg_attr(not(feature = "eval"), ignore)]` attribute. I recommend running them one at a time, because right now we don't really have a way of orchestrating all these evals. I think we should invest in that effort once the new agent panel goes live. Release Notes: - N/A --------- Co-authored-by: Nathan Sobo <nathan@zed.dev> Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de> Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com>	2025-05-01 17:37:43 +02:00
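
Commit f891dfb358 describes parsing a stream of `<old_text>`/`<new_text>` tags incrementally as the model produces them. The Rust sketch below illustrates that idea only; it is not the code from the pull request. The names `Edit`, `EditParser`, `parse_one`, and `extract` are hypothetical, and the sketch simplifies by emitting an edit only once a complete tag pair has arrived, whereas the commit message says editing starts "as soon as we can".

```rust
/// One completed edit: replace `old_text` with `new_text`.
/// (Hypothetical type, introduced for illustration.)
#[derive(Debug, PartialEq)]
struct Edit {
    old_text: String,
    new_text: String,
}

/// Hypothetical incremental parser. It buffers streamed chunks and yields an
/// `Edit` for every complete `<old_text>…</old_text><new_text>…</new_text>` pair.
#[derive(Default)]
struct EditParser {
    buffer: String,
}

impl EditParser {
    /// Feeds one streamed chunk and returns the edits completed by it.
    /// A chunk may end in the middle of a tag; the remainder stays buffered.
    fn push(&mut self, chunk: &str) -> Vec<Edit> {
        self.buffer.push_str(chunk);
        let mut edits = Vec::new();
        while let Some((edit, consumed)) = parse_one(&self.buffer) {
            edits.push(edit);
            // Drop everything up to the end of the parsed pair.
            self.buffer.drain(..consumed);
        }
        edits
    }
}

/// Parses the first complete old/new pair, returning it together with the
/// number of bytes consumed, or `None` if the pair is still incomplete.
fn parse_one(input: &str) -> Option<(Edit, usize)> {
    let (old_text, after_old) = extract(input, 0, "<old_text>", "</old_text>")?;
    let (new_text, after_new) = extract(input, after_old, "<new_text>", "</new_text>")?;
    Some((Edit { old_text, new_text }, after_new))
}

/// Finds `open … close` at or after byte offset `from`. Returns the inner
/// text and the byte offset just past the closing tag.
fn extract(input: &str, from: usize, open: &str, close: &str) -> Option<(String, usize)> {
    let start = from + input[from..].find(open)? + open.len();
    let end = start + input[start..].find(close)?;
    Some((input[start..end].to_string(), end + close.len()))
}
```

Calling `push` with the chunks `"<old_text>fo"`, `"o</old_text><new_te"`, and `"xt>bar</new_text>"` returns nothing for the first two calls and a single edit replacing `foo` with `bar` on the third, since that is when the pair becomes complete.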

13 commits
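
Commit 255d8f7cf8 replaces a `create_or_overwrite` boolean with an `EditFileMode` enum so that overwriting becomes an explicit choice. The Rust sketch below shows why an enum can make that choice more deliberate than a flag. Only the name `EditFileMode` comes from the commit message; the variant names, the `Plan` type, the `plan_edit` function, and the validation rules are assumptions made for illustration, not the tool's actual behavior.

```rust
/// Variant names are assumed; the commit message only names the enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EditFileMode {
    /// Make targeted changes to an existing file.
    Edit,
    /// Create a file that does not exist yet.
    Create,
    /// Replace the entire contents of an existing file.
    Overwrite,
}

/// Hypothetical description of what the tool would do next.
#[derive(Debug, PartialEq, Eq)]
enum Plan {
    PatchInPlace,
    WriteNewFile,
    ReplaceContents,
}

/// Hypothetical validation step: each mode states its intent, so a mismatch
/// with the file's actual state can be rejected instead of silently
/// overwriting a long file.
fn plan_edit(mode: EditFileMode, file_exists: bool) -> Result<Plan, String> {
    match (mode, file_exists) {
        (EditFileMode::Edit, true) => Ok(Plan::PatchInPlace),
        (EditFileMode::Edit, false) => Err("cannot edit a file that does not exist".to_string()),
        (EditFileMode::Create, false) => Ok(Plan::WriteNewFile),
        (EditFileMode::Create, true) => {
            Err("file already exists; choose Overwrite to replace it".to_string())
        }
        (EditFileMode::Overwrite, true) => Ok(Plan::ReplaceContents),
        (EditFileMode::Overwrite, false) => {
            Err("nothing to overwrite; choose Create instead".to_string())
        }
    }
}
```

With a single boolean, "create" and "overwrite" share one code path, so a model that sets the flag on an existing file replaces it wholesale. With three named modes, the caller has to pick `Overwrite` on purpose, and the exhaustive `match` forces every combination to be handled.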