Previously we were using a mix of `u32` and `usize`, e.g. `max_tokens:
usize, max_output_tokens: Option<u32>` in the same `struct`.
Although [tiktoken](https://github.com/openai/tiktoken) uses `usize`,
token counts should be consistent across targets (e.g. the same model
doesn't suddenly get a smaller context window if you're compiling for
wasm32), and these token counts could end up getting serialized using a
binary protocol, so `usize` is not the right choice for token counts.
I chose to standardize on `u64` over `u32` because we don't store many
of them (so the extra size should be insignificant) and future models
may exceed `u32::MAX` tokens.
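A minimal before/after sketch of the kind of field this standardizes (the struct names here are illustrative, not Zed's actual types):

```rust
// Before: mixed widths, and `usize` varies by compile target (e.g. wasm32).
struct ModelLimitsBefore {
    max_tokens: usize,
    max_output_tokens: Option<u32>, // could overflow for future models
}

// After: a single, target-independent width that also serializes predictably.
struct ModelLimitsAfter {
    max_tokens: u64,
    max_output_tokens: Option<u64>,
}
```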
Release Notes:
- N/A
We push the usage data whenever we receive it from the provider to make
sure the counting is correct after the turn has ended.
- [x] Ollama
- [x] Copilot
- [x] Mistral
- [x] OpenRouter
- [x] LMStudio
I've put all the changes into a single PR, but I'm open to moving these to
separate PRs if that makes review and testing easier.
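A rough sketch of how the usage events get pushed as they arrive (stand-in types for illustration; each provider crate has its own equivalents):

```rust
// Stand-in types for illustration; not any provider's actual code.
#[derive(Debug, Clone, Copy)]
struct TokenUsage {
    input_tokens: u64,
    output_tokens: u64,
}

enum CompletionEvent {
    Text(String),
    UsageUpdate(TokenUsage),
}

// Push a usage event every time the provider reports token counts in the
// stream, so the totals are correct once the turn has ended.
fn handle_chunk(
    text: Option<String>,
    usage: Option<TokenUsage>,
    events: &mut Vec<CompletionEvent>,
) {
    if let Some(text) = text {
        events.push(CompletionEvent::Text(text));
    }
    if let Some(usage) = usage {
        events.push(CompletionEvent::UsageUpdate(usage));
    }
}
```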
Release Notes:
- N/A
Bubbles up rate limit information so that, higher up in the stack, we can
retry after a given duration if needed.
Also caps the number of evals running concurrently, which helps as well.
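A hypothetical sketch of what bubbling this up buys the caller (the error and function names are made up for illustration; the timer uses `smol`):

```rust
use std::time::Duration;

// Hypothetical error type: the rate-limit variant carries the duration the
// provider asked us to wait before retrying.
#[derive(Debug)]
enum CompletionError {
    RateLimited { retry_after: Duration },
    Other(String),
}

// A caller higher up the stack can wait out the window and retry a bounded
// number of times.
async fn complete_with_retry<F, Fut>(
    mut request: F,
    max_attempts: usize,
) -> Result<String, CompletionError>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<String, CompletionError>>,
{
    for _ in 0..max_attempts {
        match request().await {
            Err(CompletionError::RateLimited { retry_after }) => {
                smol::Timer::after(retry_after).await;
            }
            other => return other,
        }
    }
    request().await
}
```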
Release Notes:
- N/A
It works similarly to DeepSeek: the thinking is returned as
`reasoning_content`, and we don't have to send the `reasoning_content` back
in the request.
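For illustration, a streamed chunk in this DeepSeek-style format carries the thinking in a separate `reasoning_content` field on the delta (a simplified example, not captured from LM Studio):

```json
{
  "choices": [
    {
      "delta": {
        "role": "assistant",
        "content": null,
        "reasoning_content": "First, consider what the user is asking..."
      }
    }
  ]
}
```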
This is an experimental feature, which can be enabled from settings like
this:
<img width="1381" alt="Screenshot 2025-06-08 at 4 26 06 PM"
src="https://github.com/user-attachments/assets/d2f60f3c-0f93-45fc-bae2-4ded42981820"
/>
Here is how it looks in use (tested with
`deepseek/deepseek-r1-0528-qwen3-8b`):
<img width="528" alt="Screenshot 2025-06-08 at 5 12 33 PM"
src="https://github.com/user-attachments/assets/f7716f52-5417-4f14-82b8-e853de054f63"
/>
Release Notes:
- Add thinking support to LM Studio provider
The `max_tokens` info for the model is included in `{api_url}/models`.
I don't think this needs a `.clamp` like `get_max_tokens` in
`crates/ollama/src/ollama.rs`, but it might need to be.
## Before:
Every model shows a 2k context window.

## After:

### JSON from `{api_url}/models` with the model not loaded
```json
{
  "id": "qwen2.5-coder-1.5b-instruct-mlx",
  "object": "model",
  "type": "llm",
  "publisher": "lmstudio-community",
  "arch": "qwen2",
  "compatibility_type": "mlx",
  "quantization": "4bit",
  "state": "not-loaded",
  "max_context_length": 32768
},
```
## Notes
The response from `{api_url}/models` seems to return the model's `max_tokens`,
not the currently configured context length, but I think showing the model's
`max_tokens` is better than setting 2k for everything.
`loaded_context_length` exists, but only if the model is loaded at the startup
of Zed, which usually isn't the case.
Maybe `fetch_models` should be rerun when swapping LM Studio models.
### Currently configured context
This isn't shown in `{api_url}/models`.

### JSON from `{api_url}/models` with the model loaded
```json
{
  "id": "qwen2.5-coder-1.5b-instruct-mlx",
  "object": "model",
  "type": "llm",
  "publisher": "lmstudio-community",
  "arch": "qwen2",
  "compatibility_type": "mlx",
  "quantization": "4bit",
  "state": "loaded",
  "max_context_length": 32768,
  "loaded_context_length": 4096
},
```
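One plausible way to turn those fields into a token limit (a sketch under the assumption that the loaded value, when present, is preferred; the actual implementation may differ):

```rust
use serde::Deserialize;

// Illustrative struct covering just the relevant fields of a
// `{api_url}/models` entry.
#[derive(Deserialize)]
struct LmStudioModelEntry {
    id: String,
    max_context_length: u64,
    loaded_context_length: Option<u64>,
}

impl LmStudioModelEntry {
    // Assumption for this sketch: prefer the loaded context length when the
    // model is loaded, otherwise fall back to the model's maximum instead of
    // a 2k default.
    fn max_tokens(&self) -> u64 {
        self.loaded_context_length.unwrap_or(self.max_context_length)
    }
}
```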
Release Notes:
- lmstudio: Fixed the `max_tokens` shown in the assistant panel
---------
Co-authored-by: Peter Tripp <peter@zed.dev>
Removes the load_model trait method and its implementations in Ollama
and LM Studio providers, along with associated preload_model functions
and unused imports.
Release Notes:
- N/A
Closes #30004
**Quick demo:**
https://github.com/user-attachments/assets/0ac93851-81d7-4128-a34b-1f3ae4bcff6d
**Additional notes:**
I've tried to stick to the existing code in the OpenAI provider as much as
possible to keep the diff small.
This PR was done in collaboration with @yagil from LM Studio. We agreed on
the format in which LM Studio will return information about a model's tool
use support in the upcoming version. As of the current stable version nothing
changes for users, but once they update to a newer LM Studio, tool use gets
enabled for them automatically. I think this is much better UX than
defaulting to true right now.
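Purely for illustration (a hypothetical shape, not the actual format we agreed on), the idea is that each model entry advertises whether it supports tool use and the client checks for it:

```rust
use serde::Deserialize;

// Hypothetical shape only, not the actual agreed-upon format: the model
// entry carries a capability list the client can check.
#[derive(Deserialize)]
struct ModelInfo {
    id: String,
    #[serde(default)]
    capabilities: Vec<String>,
}

impl ModelInfo {
    fn supports_tools(&self) -> bool {
        self.capabilities.iter().any(|c| c == "tool_use")
    }
}
```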
Release Notes:
- Added support for tool calls to LM Studio provider
---------
Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
This is very basic support for images. There are a number of other TODOs
before this is really a first-class supported feature, so I'm not adding any
release notes for it; for now, this PR just makes it so that if `read_file`
tries to read a PNG (which has come up in practice), it at least sends it to
Anthropic correctly instead of mishandling it.
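For reference, Anthropic's Messages API accepts images as base64 content blocks shaped roughly like this (base64 payload elided):

```json
{
  "type": "image",
  "source": {
    "type": "base64",
    "media_type": "image/png",
    "data": "<base64-encoded PNG bytes>"
  }
}
```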
This also lays the groundwork for future PRs for more first-class
support for images in tool calls across more image file formats and LLM
providers.
Release Notes:
- N/A
---------
Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
Closes #29781
Tested this with llama3, gemma3 and qwen3.
This is a breaking change: once these changes land, future versions of Zed
will require at least LM Studio >= 0.3.15. For context on why it's a breaking
change, see #29781.
What this doesn't try to solve:
* Tool calling and thinking-text rendering. I'll raise a separate PR for
these, as they aren't required to make this PR work.
https://github.com/user-attachments/assets/945f9c73-6323-4a88-92e2-2219b760a249
Release Notes:
- lmstudio: Fixed Zed support for LMStudio >= v0.3.15 (breaking change -- older versions are no longer supported).
---------
Co-authored-by: Peter Tripp <peter@zed.dev>
* Adds a fast / cheaper model to providers and defaults thread
summarization to this model. Initial motivation for this was that
https://github.com/zed-industries/zed/pull/29099 would cause these
requests to fail when used with a thinking model. It doesn't seem
correct to use a thinking model for summarization.
* Skips system prompt, context, and thinking segments (see the sketch after
this list).
* If tool use is happening, allows 2 tool uses + one more agent response
before summarizing.
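A hypothetical sketch of the filtering described above (the segment kinds and names are illustrative, not Zed's actual message types):

```rust
// Illustrative segment kinds; not Zed's actual types.
#[derive(Clone)]
enum Segment {
    SystemPrompt,
    Context,
    Thinking,
    Text(String),
    ToolUse(String),
}

// Build the summarization request from a filtered copy of the thread,
// dropping the system prompt, attached context, and thinking segments.
fn summarization_segments(thread: &[Segment]) -> Vec<Segment> {
    thread
        .iter()
        .filter(|segment| {
            !matches!(
                segment,
                Segment::SystemPrompt | Segment::Context | Segment::Thinking
            )
        })
        .cloned()
        .collect()
}
```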
The downside is that there was potential for some prefix-cache reuse
before, especially for title summarization (thread summarization omitted
tool results and so would not share a prefix for those). This seems fine
as these requests should typically be fairly small. Even for full thread
summarization, skipping all tool use / context should greatly reduce the
token use.
Release Notes:
- N/A
This PR removes the `use_any_tool` method from the `LanguageModel`
trait.
It was not being used anywhere, and doesn't really fit in our new tool
use story.
Release Notes:
- N/A
This is the core change:
https://github.com/zed-industries/zed/pull/26758/files#diff-044302c0d57147af17e68a0009fee3e8dcdfb4f32c27a915e70cfa80e987f765R1052
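A minimal sketch of the shape of that change (illustrative signatures only, not GPUI's actual API; it needs a toolchain with async closures stabilized):

```rust
use std::future::Future;

// Before: `spawn` took a closure that returns a future.
fn spawn_before<F, Fut>(f: F)
where
    F: FnOnce() -> Fut,
    Fut: Future<Output = ()>,
{
    let _future = f();
    // ...hand the future to the executor...
}

// After: `spawn` takes an async closure directly via the `AsyncFn*` traits.
fn spawn_after<F>(f: F)
where
    F: std::ops::AsyncFnOnce(),
{
    let _future = f();
    // ...hand the future to the executor...
}
```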
TODO:
- [x] Use AsyncFn instead of Fn() -> Future in GPUI spawn methods
- [x] Implement it in the whole app
- [x] Implement it in the debugger
- [x] Glance at the RPC crate, and see if those boxed-future methods can
be switched over. Answer: they can't be directly, as you can't make an
`AsyncFn*` into a trait object. There are ways around that, but they're all
more complex than just keeping the code as is.
- [ ] Fix platform specific code
Release Notes:
- N/A
While investigating #24896, I noticed two issues:
1. The default configuration for the `zed.dev` provider was using the
wrong string for Claude 3.5 Sonnet. This meant the provider would always
show up as not configured until the user selected it from the model
picker, because we couldn't deserialize that string to a valid
`anthropic::Model` enum variant.
2. When clicking on `Open New Chat`/`Start New Thread` in the provider
configuration, we would select `Claude 3.5 Haiku` by default instead of
Claude 3.5 Sonnet.
Release Notes:
- Fixed some issues that caused AI providers to sometimes be
misconfigured.
This PR updates the `LanguageModelProvider::authenticate` method to
return an `AuthenticateError` instead of an `anyhow::Error`.
This allows us to model the "credentials not found" state explicitly as
`AuthenticateError::CredentialsNotFound`, which enables the caller to
check for this state and act accordingly.
Planning to use this in #25123 to silence errors about missing
credentials when authenticating providers in the background.
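An illustrative sketch of the shape this enables (the `Other` variant and the handler are stand-ins; only `CredentialsNotFound` is named in this PR):

```rust
// Illustrative sketch; Zed's actual enum and call sites differ.
#[derive(Debug)]
enum AuthenticateError {
    CredentialsNotFound,
    Other(String),
}

fn handle_auth_result(result: Result<(), AuthenticateError>) {
    match result {
        Ok(()) => {}
        // Missing credentials is an expected state when authenticating in
        // the background, so stay silent instead of surfacing an error.
        Err(AuthenticateError::CredentialsNotFound) => {}
        Err(AuthenticateError::Other(message)) => {
            eprintln!("authentication failed: {message}");
        }
    }
}
```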
Release Notes:
- N/A
This PR makes some light UI adjustments to the Assistant 2 settings view.
The Prompt Library section should feature the default prompts in the
future, which is why it has been separated out that way.
<img width="800" alt="Screenshot 2025-01-29 at 2 59 59 PM"
src="https://github.com/user-attachments/assets/7b033bde-51ab-44d5-9e53-3f72b8ff5f51"
/>
Release Notes:
- N/A
- Fix a bug where a GPUI macro still used `ModelContext`
- Rename `AsyncAppContext` -> `AsyncApp`
- Rename `update_model`, `read_model`, `insert_model`, and `reserve_model` to `update_entity`, `read_entity`, `insert_entity`, and `reserve_entity`
Release Notes:
- N/A
There's still a bit more work to do on this, but this PR is compiling
(with warnings) after eliminating the key types. When the tasks below
are complete, this will be the new narrative for GPUI:
- `Entity<T>` - This replaces `View<T>`/`Model<T>`. It represents a unit
of state, and if `T` implements `Render`, then `Entity<T>` implements
`Element`.
- `&mut App` - This replaces `AppContext` and represents the app.
- `&mut Context<T>` - This replaces `ModelContext` and derefs to `App`. It
is provided by the framework when updating an entity.
- `&mut Window` - Broken out of `&mut WindowContext`, which no longer
exists. Every method that once took `&mut WindowContext` now takes `&mut
Window, &mut App`, and every method that took `&mut ViewContext<T>` now
takes `&mut Window, &mut Context<T>`.
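Roughly, a method on an entity changes shape like this (stand-in types so the sketch is self-contained; GPUI's real definitions are much richer):

```rust
// Stand-in types for illustration only.
struct Window;
struct Context<T>(std::marker::PhantomData<T>);

struct Counter {
    count: usize,
}

impl Counter {
    // Before (roughly): fn increment(&mut self, cx: &mut ViewContext<Self>)
    // After: the window and the entity context arrive as separate parameters.
    fn increment(&mut self, _window: &mut Window, _cx: &mut Context<Self>) {
        self.count += 1;
    }
}
```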
Not pictured here are the two other failed attempts. It's been quite a
month!
Tasks:
- [x] Remove `View`, `ViewContext`, `WindowContext` and thread through
`Window`
- [x] [@cole-miller @mikayla-maki] Redraw window when entities change
- [x] [@cole-miller @mikayla-maki] Get examples and Zed running
- [x] [@cole-miller @mikayla-maki] Fix Zed rendering
- [x] [@mikayla-maki] Fix todo! macros and comments
- [x] Fix a bug where the editor would not be redrawn because of view
caching
- [x] Remove the publicness of `window.notify()` and replace it with
`AppContext::notify`
- [x] Remove `observe_new_window_models` and replace it with
`observe_new_models`, which takes an optional window
- [x] Fix a bug where the project panel would not be redrawn because of
the wrong refresh() call being used
- [x] Fix the tests
- [x] Fix warnings by eliminating `Window` params or using `_`
- [x] Fix conflicts
- [x] Simplify generic code where possible
- [x] Rename types
- [ ] Update docs
### Issues post-merge
- [x] Issues switching between normal and insert mode
- [x] Assistant re-rendering failure
- [x] Vim test failures
- [x] Mac build issue
Release Notes:
- N/A
---------
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Cole Miller <cole@zed.dev>
Co-authored-by: Mikayla <mikayla@zed.dev>
Co-authored-by: Joseph <joseph@zed.dev>
Co-authored-by: max <max@zed.dev>
Co-authored-by: Michael Sloan <michael@zed.dev>
Co-authored-by: Mikayla Maki <mikaylamaki@Mikaylas-MacBook-Pro.local>
Co-authored-by: Mikayla <mikayla.c.maki@gmail.com>
Co-authored-by: joão <joao@zed.dev>