Yehowshua/ZIm - Forgejo

Author	SHA1	Message	Date
Kirill Bulatov	16366cf9f2	Use `anyhow` more idiomatically (#31052 ) https://github.com/zed-industries/zed/issues/30972 brought up another case where our context is not enough to track the actual source of the issue: we get a general top-level error without an inner error. The reason for this was `.ok_or_else(|| anyhow!("failed to read HEAD SHA"))?;` on the top level. The PR finally reworks the way we use anyhow to reduce such issues (or at least make it simpler to bubble them up later in a fix). On top of that, it uses a few more anyhow methods for better readability. * `.ok_or_else(|| anyhow!("..."))`, `map_err` and other similar error conversion/option reporting cases are replaced with `context` and `with_context` calls * in addition to that, various `anyhow!("failed to do ...")` messages are replaced with `.context("Doing ...")` messages to remove the parasitic `failed to` text * `anyhow::ensure!` is used instead of `if ... { return Err(...); }` calls * `anyhow::bail!` is used instead of `return Err(anyhow!(...));` Release Notes: - N/A	2025-05-20 23:06:07 +00:00
Richard Feldman	4bb04cef9d	Accept wrapped text content from LLM providers (#31048 ) Some providers sometimes send `{ "type": "text", "text": ... }` instead of just the text as a string. Now we accept those instead of erroring. Release Notes: - N/A	2025-05-20 20:50:02 +00:00
Oleksiy Syvokon	255d8f7cf8	agent: Overwrite files more cautiously (#30649 ) 1. The `edit_file` tool tended to use `create_or_overwrite` a bit too often, leading to corruption of long files. This change replaces the boolean flag with an `EditFileMode` enum, which helps Agent make a more deliberate choice when overwriting files. With this change, the pass rate of the new eval increased from 10% to 100%. 2. eval: Added ability to run eval on top of an existing thread. Threads can now be loaded from JSON files in the `SerializedThread` format, which makes it easy to use real threads as starting points for tests/evals. 3. Don't try to restore tool cards when running in headless or eval mode -- we don't have a window to properly do this. Release Notes: - N/A	2025-05-14 10:40:44 +03:00
Richard Feldman	8fdf309a4a	Have read_file support images (#30435 ) This is very basic support for them. There are a number of other TODOs before this is really a first-class supported feature, so not adding any release notes for it; for now, this PR just makes it so that if read_file tries to read a PNG (which has come up in practice), it at least correctly sends it to Anthropic instead of messing up. This also lays the groundwork for future PRs for more first-class support for images in tool calls across more image file formats and LLM providers. Release Notes: - N/A --------- Co-authored-by: Agus Zubiaga <hi@aguz.me> Co-authored-by: Agus Zubiaga <agus@zed.dev>	2025-05-13 10:58:00 +02:00
Antonio Scandurra	1b593f616f	Include `EditAgent`'s raw output when inspecting thread (#30337 ) This allows us to debug the raw edits that were generated when people report feedback, when running evals and when opening the thread as Markdown. Release Notes: - Improved debug output for agent threads.	2025-05-09 06:58:45 +00:00
Antonio Scandurra	9f6809a28d	Reuse conversation cache when streaming edits (#30245 ) Release Notes: - Improved latency when the agent applies edits.	2025-05-08 14:36:34 +02:00
Oleksiy Syvokon	8199664a5a	agent: Handle attempts to use hallucinated tools (#29946 ) This change: 1. Catches attempts to use missing tools. If this happens, we now send Agent a message listing available tools, after which Agent can gracefully recover. Prior behavior: thread would stop in a broken state. Example of a hallucinated call and a message we send back: ![image](https://github.com/user-attachments/assets/92a8f700-b192-4038-8c7e-0a74ca2e0146) 2. Adds evals for hallucinated tool use and imagined edits 3. Adds ability to configure a profile name in evals. Release Notes: - N/A	2025-05-05 19:31:11 +00:00
Max Brunsfeld	c3d9cdecab	Change cloud language model provider JSON protocol to surface errors and usage information (#29830 ) Release Notes: - N/A --------- Co-authored-by: Nathan Sobo <nathan@zed.dev> Co-authored-by: Marshall Bowers <git@maxdeviant.com>	2025-05-04 17:37:42 +00:00
Max Brunsfeld	04772bf17d	Add support for queuing status updates in cloud language model provider (#29818 ) This sets us up to display queue position information to the user, once our language model backend is updated to support request queuing. The JSON returned by the LLM backend will need to look like this: ```json {"queue": {"status": "queued", "position": 1}} {"queue": {"status": "started"}} {"event": {"THE_UPSTREAM_MODEL_PROVIDER_EVENT": "..."}} ``` Release Notes: - N/A --------- Co-authored-by: Marshall Bowers <git@maxdeviant.com>	2025-05-02 20:36:39 +00:00
Richard Feldman	c8685dc90f	Fix eval judging missing final response (#29638 ) Fixed issue where eval thread judges were not considering the last response in the thread. The problem was that they were getting the full list of messages from `last_request`, which (being a request!) did not have the response yet. Release Notes: - N/A	2025-04-29 23:02:46 -04:00
Marshall Bowers	ce93961fe0	agent: Add "max mode" toggle (#29549 ) This PR adds a "max mode" toggle to the Agent panel, for models that support it. Only visible to folks in the `new-billing` feature flag. Icon is just a placeholder. Release Notes: - N/A	2025-04-28 16:50:47 +00:00
Michael Sloan	609c528ceb	Refactor markdown formatting utilities to avoid building intermediate strings (#29511 ) These were nearly always used when using `format!` / `write!` etc, so it makes sense to not have an intermediate `String`. Release Notes: - N/A	2025-04-27 19:04:51 +00:00
Michael Sloan	17ecf94f6f	Restructure agent context (#29233 ) Simplifies the data structures involved in agent context by removing caching and limiting the use of ContextId: * `AssistantContext` enum is now like an ID / handle to context that does not need to be updated. `ContextId` still exists but is only used for generating unique `ElementId`. * `ContextStore` has an `IndexMap<ContextSetEntry>`. Only need to keep a `HashSet<ThreadId>` consistent with it. `ContextSetEntry` is a newtype wrapper around `AssistantContext` which implements eq / hash on a subset of fields. * Thread `Message` directly stores its context. Fixes the following bugs: * If a context entry was removed from the strip and added again, it was re-included in the next message. * Clicking file context in the thread that has been removed from the context strip didn't jump to the file. * Refresh of directory context didn't reflect added / removed files. * Deleted directories would remain in the message editor context strip. * Token counting requests didn't include image context. * File, directory, and symbol context deduplication relied on `ProjectPath` for identity, and so didn't handle renames. * Symbol context line numbers didn't update when shifted. Known bugs (not fixed): * Deleting a directory causes it to disappear from messages in threads. Fixing this in a nice way is tricky. One easy fix is to store the original path and show that on deletion. It's weird that deletion would cause the name to "revert", though. Another possibility would be to snapshot context metadata on add (à la `AddedContext`), and keep that around despite deletion. Release Notes: - N/A	2025-04-24 21:29:33 +00:00
Oleksiy Syvokon	f69aeb6311	Do not log unfinished tool uses that are in the middle of streaming (#29275 ) Release Notes: - N/A	2025-04-23 13:19:01 +00:00
Oleksiy Syvokon	76a78b550b	eval: Write JSON-serialized thread (#29271 ) This adds `last.message.json` file that contains the full request plus response (serialized as a message from assistant for consistency with other messages). Motivation: to capture more info and to make analysis of finished runs easier. Release Notes: - N/A	2025-04-23 15:22:19 +03:00
Agus Zubiaga	ce1a674eba	eval: Fine-grained assertions (#29246 ) - Support programmatic examples (example: `17feb260a0/crates/eval/src/examples/file_search.rs`) - Combine data-driven example declarations into a single `.toml` file (example: `17feb260a0/crates/eval/src/examples/find_and_replace_diff_card.toml`) - Run judge on individual assertions (previously called "criteria") - Report judge and programmatic assertions in one combined table Note: We still need to work on concept naming. Release Notes: - N/A --------- Co-authored-by: Richard Feldman <oss@rtfeldman.com> Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com> Co-authored-by: Thomas Mickley-Doyle <tmickleydoyle@gmail.com>	2025-04-22 23:58:58 -03:00
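
Commit 255d8f7cf8 in the log above replaces a boolean `create_or_overwrite` flag with an `EditFileMode` enum so that overwriting becomes a deliberate choice. The sketch below illustrates that general idea only. The variant names, the `EditError` type, and the `apply_edit` function are assumptions made for this example and are not taken from the commit; "editing" is also simplified to appending text.

```rust
/// Hypothetical stand-in for the `EditFileMode` enum named in the commit.
/// The variant names are assumptions for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EditFileMode {
    /// Modify a file that already exists.
    Edit,
    /// Create a file that must not exist yet.
    Create,
    /// Replace the whole content of an existing file.
    Overwrite,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum EditError {
    AlreadyExists,
    NotFound,
}

/// Returns the new file content, or an error when the requested mode
/// does not match the current state of the file.
/// `existing` is `None` when the file does not exist.
fn apply_edit(
    existing: Option<&str>,
    new_text: &str,
    mode: EditFileMode,
) -> Result<String, EditError> {
    match (mode, existing) {
        // Creating must never clobber a file that is already there.
        (EditFileMode::Create, Some(_)) => Err(EditError::AlreadyExists),
        (EditFileMode::Create, None) => Ok(new_text.to_string()),
        // Overwriting discards the old content, so it has to be asked for by name.
        (EditFileMode::Overwrite, Some(_)) => Ok(new_text.to_string()),
        // Simplification: an "edit" here just appends to the old content.
        (EditFileMode::Edit, Some(old)) => Ok(format!("{}{}", old, new_text)),
        // Editing or overwriting a missing file is an error.
        (EditFileMode::Overwrite, None) | (EditFileMode::Edit, None) => {
            Err(EditError::NotFound)
        }
    }
}
```

With a boolean flag, a caller that passes `true` by habit silently replaces a long file. With the three-way enum, `apply_edit(Some("old"), "x", EditFileMode::Create)` returns `Err(EditError::AlreadyExists)` instead.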

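Commit 8199664a5a in the log above describes catching calls to tools that do not exist and replying with a list of the available tools so the agent can recover. A minimal sketch of that recovery path follows. The `ToolResolution` type, the `resolve_tool` function, and the wording of the message are hypothetical and are not the code from the commit.

```rust
use std::collections::BTreeMap;

/// Outcome of resolving a tool name requested by the model (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum ToolResolution {
    /// The tool exists; carries its description.
    Found(String),
    /// The tool does not exist; carries a corrective message to send back.
    Missing(String),
}

/// Looks up `requested` in the tool registry (name -> description).
/// For an unknown name, builds a message that lists every available tool
/// instead of failing, so the conversation can continue.
fn resolve_tool(tools: &BTreeMap<&str, &str>, requested: &str) -> ToolResolution {
    match tools.get(requested) {
        Some(description) => ToolResolution::Found((*description).to_string()),
        None => {
            // BTreeMap iterates in sorted key order, so the list is deterministic.
            let available: Vec<&str> = tools.keys().copied().collect();
            ToolResolution::Missing(format!(
                "No tool named '{}' exists. Available tools: {}",
                requested,
                available.join(", ")
            ))
        }
    }
}
```

The design choice illustrated here is that a hallucinated tool name produces an ordinary value (`Missing`) rather than an error that ends the thread, which matches the "gracefully recover" behavior the commit message describes.
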
16 commits
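
Commit 609c528ceb in the log above refactors Markdown formatting utilities so that they no longer build an intermediate `String` when used inside `format!` or `write!`. One common way to get that effect in Rust is a small wrapper type that implements `std::fmt::Display`. The sketch below shows the pattern; the `InlineCode` and `Escaped` names and the set of escaped characters are assumptions for illustration, not the utilities from the commit.

```rust
use std::fmt;
use std::fmt::Write as _;

/// Hypothetical wrapper that renders its text as Markdown inline code.
/// Simplification: assumes the text itself contains no backticks.
struct InlineCode<'a>(&'a str);

impl<'a> fmt::Display for InlineCode<'a> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // Writes directly into the caller's formatter; no temporary String.
        write!(f, "`{}`", self.0)
    }
}

/// Hypothetical wrapper that backslash-escapes a few Markdown-significant
/// characters. The character set is illustrative, not exhaustive.
struct Escaped<'a>(&'a str);

impl<'a> fmt::Display for Escaped<'a> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        for ch in self.0.chars() {
            if matches!(ch, '\\' | '`' | '*' | '_' | '[' | ']') {
                f.write_char('\\')?;
            }
            f.write_char(ch)?;
        }
        Ok(())
    }
}
```

A caller can then write `format!("Run {} first", InlineCode("cargo test"))`, and the wrapper's output is streamed into the single buffer that `format!` is already building.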