Yehowshua/ZIm - Forgejo: Beyond coding. We Forge.

Author	SHA1	Message	Date
Michael Sloan	d497f52e17	agent: Improve error handling and retry for zed-provided models (#33565 ) * Updates to `zed_llm_client-0.8.5` which adds support for `retry_after` when anthropic provides it. * Distinguishes upstream provider errors and rate limits from errors that originate from zed's servers * Moves `LanguageModelCompletionError::BadInputJson` to `LanguageModelCompletionEvent::ToolUseJsonParseError`. While arguably this is an error case, the logic in thread is cleaner with this move. There is also precedent for inclusion of errors in the event type - `CompletionRequestStatus::Failed` is how cloud errors arrive. * Updates `PROVIDER_ID` / `PROVIDER_NAME` constants to use proper types instead of `&str`, since they can be constructed in a const fashion. * Removes use of `CLIENT_SUPPORTS_EXA_WEB_SEARCH_PROVIDER_HEADER_NAME` as the server no longer reads this header and just defaults to that behavior. Release notes for this is covered by #33275 Release Notes: - N/A --------- Co-authored-by: Richard Feldman <oss@rtfeldman.com> Co-authored-by: Richard <richard@zed.dev>	2025-06-30 21:01:32 -06:00
Ben Brandt	6c46e1129d	Cleanup remaining references to max mode (#33509 ) Release Notes: - N/A	2025-06-27 08:32:13 +00:00
Bennet Bo Fenner	7be57baef0	agent: Fix issue with Anthropic thinking models (#33317 ) cc @osyvokon We were seeing a bunch of errors in our backend when people were using Claude models with thinking enabled. In the logs we would see > an error occurred while interacting with the Anthropic API: invalid_request_error: messages.x.content.0.type: Expected `thinking` or `redacted_thinking`, but found `text`. When `thinking` is enabled, a final `assistant` message must start with a thinking block (preceeding the lastmost set of `tool_use` and `tool_result` blocks). We recommend you include thinking blocks from previous turns. To avoid this requirement, disable `thinking`. Please consult our documentation at https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking However, this issue did not occur frequently and was not easily reproducible. Turns out it was triggered by us not correctly handling [Redacted Thinking Blocks](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#thinking-redaction). I could constantly reproduce this issue by including this magic string: `ANTHROPIC_MAGIC_STRING_TRIGGER_REDACTED_THINKING_46C9A13E193C177646C7398A98432ECCCE4C1253D5E2D82641AC0E52CC2876CB ` in the request, which forces `claude-3-7-sonnet` to emit redacted thinking blocks (confusingly the magic string does not seem to be working for `claude-sonnet-4`). As soon as we hit a tool call Anthropic would return an error. Thanks to @osyvokon for pointing me in the right direction 😄! Release Notes: - agent: Fixed an issue where Anthropic models would sometimes return an error when thinking was enabled	2025-06-24 16:23:59 +00:00
Richard Feldman	c610ebfb03	Thread Anthropic errors into LanguageModelKnownError (#33261 ) This PR is in preparation for doing automatic retries for certain errors, e.g. Overloaded. It doesn't change behavior yet (aside from some granularity of error messages shown to the user), but rather mostly changes some error handling to be exhaustive enum matches instead of `anyhow` downcasts, and leaves some comments for where the behavior change will be in a future PR. Release Notes: - N/A	2025-06-23 18:48:26 +00:00
Michael Sloan	7e801dccb0	agent: Fix issues with usage display sometimes showing initially fetched usage (#33125 ) Having `Thread::last_usage` as an override of the initially fetched usage could cause the initial usage to be displayed when the current thread is empty or in text threads. Fix is to just store last usage info in `UserStore` and not have these overrides Release Notes: - Agent: Fixed request usage display to always include the most recently known usage - there were some cases where it would show the initially requested usage.	2025-06-20 21:28:48 +00:00
Richard Feldman	5405c2c2d3	Standardize on u64 for token counts (#32869 ) Previously we were using a mix of `u32` and `usize`, e.g. `max_tokens: usize, max_output_tokens: Option<u32>` in the same `struct`. Although [tiktoken](https://github.com/openai/tiktoken) uses `usize`, token counts should be consistent across targets (e.g. the same model doesn't suddenly get a smaller context window if you're compiling for wasm32), and these token counts could end up getting serialized using a binary protocol, so `usize` is not the right choice for token counts. I chose to standardize on `u64` over `u32` because we don't store many of them (so the extra size should be insignificant) and future models may exceed `u32::MAX` tokens. Release Notes: - N/A	2025-06-17 10:43:07 -04:00
Richard Feldman	cfbc2d0972	Don't spawn Anthropic telemetry event when API key is missing (#32813 ) Minor refactor that I'm extracting from a branch because it can stand alone. - Now we no longer spawn an executor for `report_anthropic_event` if it's just going to immediately fail due to API key being missing - `report_anthropic_event` now takes a `String` API key instead of `Option<String>` and the error reporting if the key is missing has been moved to the caller. - `report_anthropic_event` is no longer coupled to `AnthropicError`, because all it ever did was generate an `AnthropicEvent::Other`, which in turn was then only used for `log_err` - so, can just be an `anyhow::Result`. Release Notes: - N/A	2025-06-16 14:58:37 -04:00
Ben Brandt	9427833fdf	Distinguish between missing models and registries in error messages (#32678 ) Consolidates configuration error handling by moving the error type and logic from assistant_context_editor to language_model::registry. The registry now provides a single method to check for configuration errors, making the error handling more consistent across the agent panel and context editor. This also now checks if the issue is that we don't have any providers, or if we just can't find the model. Previously, an incorrect model name showed up as having no providers, which is very confusing. Release Notes: - N/A	2025-06-13 10:31:52 +00:00
Ben Brandt	e4bd115a63	More resilient eval (#32257 ) Bubbles up rate limit information so that we can retry after a certain duration if needed higher up in the stack. Also caps the number of evals running concurrently, which should help as well. Release Notes: - N/A	2025-06-09 18:07:22 +00:00
Ben Brandt	4304521655	Remove unused load_model method from LanguageModelProvider (#32070 ) Removes the load_model trait method and its implementations in Ollama and LM Studio providers, along with associated preload_model functions and unused imports. Release Notes: - N/A	2025-06-04 14:07:01 +00:00
Marshall Bowers	a23ee61a4b	Pass up intent with completion requests (#31710 ) This PR adds a new `intent` field to completion requests to assist in categorizing them correctly. Release Notes: - N/A --------- Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>	2025-05-29 20:43:12 +00:00
Umesh Yadav	703ee29658	Rename Max Mode to Burn Mode throughout code and docs (#31668 ) Follow up to https://github.com/zed-industries/zed/pull/31470. I started looking at the config and changed preferred_completion_mode to burn, only to find it is still max, so I made changes to align it better with the rebrand, as this is in the preview build now. This doesn't touch zed_llm_client. It only changes the Zed code and docs to match the new UI of burn mode. There are still more things to be renamed, though. Release Notes: - N/A --------- Signed-off-by: Umesh Yadav <git@umesh.dev> Co-authored-by: Danilo Leal <daniloleal09@gmail.com>	2025-05-29 13:12:42 +00:00
Richard Feldman	00fd045844	Make language model deserialization more resilient (#31311 ) This expands our deserialization of JSON from models to be more tolerant of different variations that the model may send, including capitalization, wrapping things in objects vs. being plain strings, etc. Also when deserialization fails, it reports the entire error in the JSON so we can see what failed to deserialize. (Previously these errors were very unhelpful at diagnosing the problem.) Finally, also removes the `WrappedText` variant since the custom deserializer just turns that style of JSON into a normal `Text` variant. Release Notes: - N/A	2025-05-28 12:06:07 -04:00
Antonio Scandurra	4f78165ee8	Show progress as the agent locates which range it needs to edit (#31582 ) Release Notes: - Improved latency when the agent starts streaming edits. --------- Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>	2025-05-28 12:32:54 +00:00
Abdelhakim Qbaich	e42cf21703	Default to fast model first for commit messages (#31385 ) I was surprised to see this being done for thread summaries, but not commit messages. I believe it's a better default as most people would want a faster commit message generation without spending premium requests. Considering how the default fast model for copilot is set to the base one, this is ideal for me (and likely many others), as opposed to tweaking the configuration every time the base model changes. Release Notes: - git: Default to fast model first if not configured for generating commit messages	2025-05-26 10:37:44 +02:00
Marshall Bowers	7fb9569c15	language_model: Remove `CloudModel` enum (#31322 ) This PR removes the `CloudModel` enum, as it is no longer needed after #31316. Release Notes: - N/A	2025-05-24 02:04:51 +00:00
Marshall Bowers	685933b5c8	language_models: Fetch Zed models from the server (#31316 ) This PR updates the Zed LLM provider to fetch the available models from the server instead of hard-coding them in the binary. Release Notes: - Updated the Zed provider to fetch the list of available language models from the server.	2025-05-23 23:00:35 +00:00
Marshall Bowers	5c0b161563	Handle new `refusal` stop reason from Claude 4 models (#31217 ) This PR adds support for handling the new [`refusal` stop reason](https://docs.anthropic.com/en/docs/test-and-evaluate/strengthen-guardrails/handle-streaming-refusals) from Claude 4 models. Release Notes: - Added handling for `"stop_reason": "refusal"` from Claude 4 models.	2025-05-22 16:56:59 -04:00
Marshall Bowers	fc78408ee4	language_model: Allow Max Mode for Claude 4 models (#31207 ) This PR adds the Claude 4 models to the list of models that support Max Mode. Release Notes: - Added Max Mode support for Claude 4 models.	2025-05-22 18:50:30 +00:00
Marshall Bowers	1475ace6f1	anthropic: Add support for Claude 4 (#31203 ) This PR adds support for [Claude 4](https://www.anthropic.com/news/claude-4). Release Notes: - Added support for Claude Opus 4 and Claude Sonnet 4. --------- Co-authored-by: Antonio Scandurra <me@as-cii.com> Co-authored-by: Richard Feldman <oss@rtfeldman.com>	2025-05-22 18:09:35 +00:00
Kirill Bulatov	16366cf9f2	Use `anyhow` more idiomatically (#31052 ) https://github.com/zed-industries/zed/issues/30972 brought up another case where our context is not enough to track the actual source of the issue: we get a general top-level error without inner error. The reason for this was `.ok_or_else(|| anyhow!("failed to read HEAD SHA"))?; ` on the top level. The PR finally reworks the way we use anyhow to reduce such issues (or at least make it simpler to bubble them up later in a fix). On top of that, uses a few more anyhow methods for better readability. * `.ok_or_else(|| anyhow!("..."))`, `map_err` and other similar error conversion/option reporting cases are replaced with `context` and `with_context` calls * in addition to that, various `anyhow!("failed to do ...")` are replaced with `.context("Doing ...")` messages to remove the parasitic `failed to` text * `anyhow::ensure!` is used instead of `if ... { return Err(...); }` calls * `anyhow::bail!` is used instead of `return Err(anyhow!(...));` Release Notes: - N/A	2025-05-20 23:06:07 +00:00
Richard Feldman	4bb04cef9d	Accept wrapped text content from LLM providers (#31048 ) Some providers sometimes send `{ "type": "text", "text": ... }` instead of just the text as a string. Now we accept those instead of erroring. Release Notes: - N/A	2025-05-20 20:50:02 +00:00
Oleksiy Syvokon	5112fcebeb	evals: Make LLMs configurable in edit_agent evals (#30813 ) Release Notes: - N/A	2025-05-16 11:10:15 +00:00
Marshall Bowers	7cad943fde	agent: Remove unused max monthly spend reached error (#30615 ) This PR removes the code for showing the max monthly spend limit reached error, as it is no longer used. Release Notes: - N/A	2025-05-13 09:43:13 +00:00
Richard Feldman	8fdf309a4a	Have read_file support images (#30435 ) This is very basic support for them. There are a number of other TODOs before this is really a first-class supported feature, so not adding any release notes for it; for now, this PR just makes it so that if read_file tries to read a PNG (which has come up in practice), it at least correctly sends it to Anthropic instead of messing up. This also lays the groundwork for future PRs for more first-class support for images in tool calls across more image file formats and LLM providers. Release Notes: - N/A --------- Co-authored-by: Agus Zubiaga <hi@aguz.me> Co-authored-by: Agus Zubiaga <agus@zed.dev>	2025-05-13 10:58:00 +02:00
Umesh Yadav	a6c3d49bb9	language_models: Add vision support for Copilot Chat models (#30155 ) Problem Statement: Support for image analysis (vision) is currently restricted to Anthropic and Gemini models. This limits users who wish to leverage vision capabilities available in other models, such as Copilot, for tasks like attaching image context within the agent message editor. Proposed Change: This PR extends vision support to include Copilot models that are already equipped with vision capabilities. This integration will allow users within VS Code to attach and analyze images using supported Copilot models via the agent message editor. Scope Limitation: This PR does not implement controls within the message editor to ensure that image context (e.g., through copy-paste or attachment) is exclusively enabled or prompted only when a vision-supported model is active. Long term, the message editor should have access to each model's vision capability and stop users from attaching images, either by greying out the context with a note that it's not supported or by blocking it in both copy-paste and file/directory search. Closes #30076 Release Notes: - Add vision support for Copilot Chat models --------- Co-authored-by: Bennet Bo Fenner <bennet@zed.dev>	2025-05-12 13:11:38 +00:00
Antonio Scandurra	9f6809a28d	Reuse conversation cache when streaming edits (#30245 ) Release Notes: - Improved latency when the agent applies edits.	2025-05-08 14:36:34 +02:00
Mikayla Maki	0cdd8bdded	Restore tool cards on thread deserialization (#30053 ) Release Notes: - N/A --------- Co-authored-by: Julia Ryan <juliaryan3.14@gmail.com>	2025-05-06 18:16:34 -07:00
Max Brunsfeld	c3d9cdecab	Change cloud language model provider JSON protocol to surface errors and usage information (#29830 ) Release Notes: - N/A --------- Co-authored-by: Nathan Sobo <nathan@zed.dev> Co-authored-by: Marshall Bowers <git@maxdeviant.com>	2025-05-04 17:37:42 +00:00
Marshall Bowers	f0515d1c34	agent: Show a notice when reaching consecutive tool use limits (#29833 ) This PR adds a notice when reaching consecutive tool use limits when using normal mode. Here's an example with the limit artificially lowered to 2 consecutive tool uses: https://github.com/user-attachments/assets/32da8d38-67de-4d6b-8f24-754d2518e5d4 Release Notes: - agent: Added a notice when reaching consecutive tool use limits when using a model in normal mode.	2025-05-03 02:09:54 +00:00
Max Brunsfeld	04772bf17d	Add support for queuing status updates in cloud language model provider (#29818 ) This sets us up to display queue position information to the user, once our language model backend is updated to support request queuing. The JSON returned by the LLM backend will need to look like this: ```json {"queue": {"status": "queued", "position": 1}} {"queue": {"status": "started"}} {"event": {"THE_UPSTREAM_MODEL_PROVIDER_EVENT": "..."}} ``` Release Notes: - N/A --------- Co-authored-by: Marshall Bowers <git@maxdeviant.com>	2025-05-02 20:36:39 +00:00
Bennet Bo Fenner	4812c9094b	agent: Support images via @file and the file context picker (#29596 ) Release Notes: - agent: Add support for @mentioning images - agent: Add support for including images via file context picker --------- Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com>	2025-04-29 16:26:27 +02:00
Max Brunsfeld	17903a0999	Associate each thread with a model (#29573 ) This PR makes it possible to use different LLM models in the agent panels of two different projects, simultaneously. It also properly restores a thread's original model when restoring it from the history, rather than having it use the default model. As before, newly-created threads will use the current default model. Release Notes: - Enabled different project windows to use different models in the agent panel - Enhanced the agent panel so that when revisiting old threads, their original model will be used. --------- Co-authored-by: Richard Feldman <oss@rtfeldman.com>	2025-04-28 23:43:16 +00:00
Marshall Bowers	ce93961fe0	agent: Add "max mode" toggle (#29549 ) This PR adds a "max mode" toggle to the Agent panel, for models that support it. Only visible to folks in the `new-billing` feature flag. Icon is just a placeholder. Release Notes: - N/A	2025-04-28 16:50:47 +00:00
Richard Feldman	720dfee803	Treat invalid JSON in tool calls as failed tool calls (#29375 ) Release Notes: - N/A --------- Co-authored-by: Max <max@zed.dev> Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>	2025-04-24 16:54:27 -04:00
Nathan Sobo	8836c6fb42	Introduce LanguageModelToolUse::raw_input (#29322 ) This is to enable alternative streaming solutions at the application layer. I'm not sure we really should have performed parsing of the input at this layer. Either way I want to experiment with streaming approaches in a separate crate on a branch, and this will help. /cc @maxdeviant @bennetbo @rtfeldman Closes #ISSUE Release Notes: - N/A	2025-04-24 02:30:48 +00:00
Bennet Bo Fenner	822b6f837d	agent: Expose web search tool to beta users (#29273 ) This gives all beta users access to the web search tool Release Notes: - agent: Added `web_search` tool	2025-04-23 15:30:20 +00:00
Stephan Seidt	10ded0ab75	agent: Add support for google gemini 2.5 flash preview (#29205 ) Adds support for the new gemini-2.5-flash-preview-04-17 Release Notes: - agent: Added support for gemini-2.5-flash-preview	2025-04-22 09:37:12 +00:00
Bennet Bo Fenner	eca6d5a04e	agent: Support pasting images as context (#29177 ) https://github.com/user-attachments/assets/d6a27b05-3590-4f40-a820-f6f99f6bd581 Release Notes: - agent: Added support for pasting images as context --------- Co-authored-by: Danilo Leal <daniloleal09@gmail.com>	2025-04-22 09:01:01 +00:00
Agus Zubiaga	b14356d1d3	agent: Do not add `<using_tool>` placeholder (#29194 ) Our provider code in `language_models` filters out messages for which `LanguageModelRequestMessage::contents_empty` returns `true`. This doesn't seem wrong by itself, but `contents_empty` was returning `true` for messages whose first segment didn't contain non-whitespace text even if they contained other non-empty segments. This caused requests to fail when a message with a tool call didn't contain any preceding text. Release Notes: - N/A	2025-04-22 00:41:47 -03:00
Richard Feldman	4f2f9ff762	Streaming tool calls (#29179 ) https://github.com/user-attachments/assets/7854a737-ef83-414c-b397-45122e4f32e8 Release Notes: - Create file and edit file tools now stream their tool descriptions, so you can see what they're doing sooner. --------- Co-authored-by: Marshall Bowers <git@maxdeviant.com>	2025-04-21 22:28:32 +00:00
Michael Sloan	fbf7caf93e	Default to fast model for thread summaries and titles + don't include system prompt / context / thinking segments (#29102 ) * Adds a fast / cheaper model to providers and defaults thread summarization to this model. Initial motivation for this was that https://github.com/zed-industries/zed/pull/29099 would cause these requests to fail when used with a thinking model. It doesn't seem correct to use a thinking model for summarization. * Skips system prompt, context, and thinking segments. * If tool use is happening, allows 2 tool uses + one more agent response before summarizing. Downside of this is that there was potential for some prefix cache reuse before, especially for title summarization (thread summarization omitted tool results and so would not share a prefix for those). This seems fine as these requests should typically be fairly small. Even for full thread summarization, skipping all tool use / context should greatly reduce the token use. Release Notes: - N/A	2025-04-19 23:26:29 +00:00
Bennet Bo Fenner	bafc086d27	agent: Preserve thinking blocks between requests (#29055 ) Looks like the required backend component of this was deployed. https://github.com/zed-industries/monorepo/actions/runs/14541199197 Release Notes: - N/A --------- Co-authored-by: Antonio Scandurra <me@as-cii.com> Co-authored-by: Agus Zubiaga <hi@aguz.me> Co-authored-by: Richard Feldman <oss@rtfeldman.com> Co-authored-by: Nathan Sobo <nathan@zed.dev>	2025-04-19 20:12:03 +00:00
Michael Sloan	d88b06a5dc	Simplify language model registry + only emit change events on change (#29086 ) * Now only does default fallback logic in the registry * Only emits change events when there is actually a change Release Notes: - N/A	2025-04-19 08:26:42 +00:00
Marshall Bowers	7abe2c9c31	agent: Attach thread ID and prompt ID to telemetry events (#29069 ) This PR attaches the thread ID and the new prompt ID to telemetry events for completions in the Agent panel. Release Notes: - N/A --------- Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>	2025-04-18 20:41:02 +00:00
Marshall Bowers	676cc109a3	agent: Report usage from thread summarization requests (#29012 ) This PR makes it so the thread summarization also reports the model request usage, to prevent the case where the count would appear to jump by 2 the next time a message was sent after summarization. Release Notes: - N/A	2025-04-17 23:05:12 +00:00
Marshall Bowers	d93141bded	agent: Extract usage information from response headers (#29002 ) This PR updates the Agent to extract the usage information from the response headers, if they are present. For now we just log the information, but we'll be using this soon to populate some UI. Release Notes: - N/A	2025-04-17 20:11:07 +00:00
Umesh Yadav	8117940aca	Add support for OpenAI o3 and o4-mini models (#28881 ) Release Notes: - Add support for OpenAI o3 and o4-mini models via OpenAI API and Copilot Chat providers. --------- Co-authored-by: Peter Tripp <peter@zed.dev>	2025-04-17 10:58:41 -04:00
Agus Zubiaga	0286b8ab3e	agent: Fix conversation token usage and estimate unsent message (#28878 ) The UI was mistakenly using the cumulative token usage for the token counter. It will now display the last request token count, plus an estimation of the tokens in the message editor and context entries that haven't been sent yet. https://github.com/user-attachments/assets/0438c501-b850-4397-9135-57214ca3c07a Additionally, when the user edits a message, we'll display the actual token count up to it and estimate the tokens in the new message. Note: We don't currently estimate the delta when switching profiles. In the future, we want to use the count tokens API to measure every part of the request and display a breakdown. Release Notes: - agent: Made the token count more accurate and added back estimation of used tokens as you type and add context. --------- Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de> Co-authored-by: Danilo Leal <daniloleal09@gmail.com>	2025-04-16 16:27:36 -03:00
Marshall Bowers	97b044acf5	proto: Add `ZedProTrial` to `Plan` (#28885 ) This PR adds the `ZedProTrial` member to the `Plan` enum. Release Notes: - N/A	2025-04-16 18:13:00 +00:00
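
Commits e4bd115a63 and d497f52e17 above describe bubbling up rate limit information (including a `retry_after` duration when the provider supplies one) so that callers higher in the stack can retry. A minimal sketch of that idea follows, using only the Rust standard library. The `CompletionError` enum, the `with_retries` helper, and its parameters are hypothetical names chosen for illustration; they are not Zed's actual types or API.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical error type for illustration; not Zed's real error enum.
#[derive(Debug, PartialEq)]
enum CompletionError {
    /// The upstream provider rate limited us, optionally saying how long to wait.
    RateLimited { retry_after: Option<Duration> },
    /// Any other failure; this sketch assumes retrying would not help.
    Other(String),
}

/// Calls `request` until it succeeds, fails with a non-retryable error,
/// or `max_attempts` calls have been made (at least one call always happens).
fn with_retries<T>(
    max_attempts: u32,
    default_delay: Duration,
    mut request: impl FnMut() -> Result<T, CompletionError>,
) -> Result<T, CompletionError> {
    let mut attempt = 1;
    loop {
        match request() {
            Ok(value) => return Ok(value),
            Err(CompletionError::RateLimited { retry_after }) if attempt < max_attempts => {
                // Prefer the provider's hint; otherwise fall back to our own delay.
                sleep(retry_after.unwrap_or(default_delay));
                attempt += 1;
            }
            // Out of attempts, or an error we do not retry: surface it to the caller.
            Err(err) => return Err(err),
        }
    }
}

fn main() {
    let mut calls = 0;
    let result = with_retries(3, Duration::from_millis(10), || {
        calls += 1;
        if calls < 3 {
            Err(CompletionError::RateLimited {
                retry_after: Some(Duration::from_millis(1)),
            })
        } else {
            Ok("done")
        }
    });
    // Prints: Ok("done") after 3 calls
    println!("{result:?} after {calls} calls");
}
```

Keeping the rate limit as its own enum variant, rather than an opaque error string, is what lets the caller match on it exhaustively and decide whether to wait and retry; that is in the spirit of the exhaustive enum matching described in c610ebfb03, though the real code may be structured differently.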


206 commits