zed/crates/language_models/src/provider
Elijah McMorris 52fa7ababb
lmstudio: Fill max_tokens using the response from /models (#25606)
The info for `max_tokens` for the model is included in
`{api_url}/models`.

I don't think this needs a `.clamp` like `get_max_tokens` in
`crates/ollama/src/ollama.rs`, but it might.
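For illustration, a minimal sketch of what such a clamp could look like; the constant, its value, and the function name are assumptions, not taken from the Zed source:

```rust
/// Illustrative cap, mirroring the idea of the `.clamp` in
/// `get_max_tokens` in `crates/ollama/src/ollama.rs`; the value here
/// is an assumption, not taken from the Zed source.
const MAXIMUM_TOKENS: u64 = 16_384;

/// Clamp the `max_context_length` reported by `{api_url}/models` so an
/// unreasonable value can't flow into the rest of the provider.
fn clamped_max_tokens(max_context_length: u64) -> u64 {
    max_context_length.clamp(1, MAXIMUM_TOKENS)
}
```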

## Before:
Every model shows a 2k context window

![image](https://github.com/user-attachments/assets/676075c8-0ceb-44b1-ae27-72ed6a6d783c)

## After:

![image](https://github.com/user-attachments/assets/8291535b-976e-4601-b617-1a508bf44e12)

### Json from `{api_url}/models` with model not loaded
```json
{
  "id": "qwen2.5-coder-1.5b-instruct-mlx",
  "object": "model",
  "type": "llm",
  "publisher": "lmstudio-community",
  "arch": "qwen2",
  "compatibility_type": "mlx",
  "quantization": "4bit",
  "state": "not-loaded",
  "max_context_length": 32768
}
```
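
A sketch of how this entry could be deserialized with `serde`; the struct and field types are assumptions based on the JSON above, not the actual Zed types:

```rust
use serde::Deserialize;

/// One entry from `{api_url}/models`. Field names follow the JSON
/// samples in this PR; the struct itself is a sketch, not Zed's type.
/// Fields not listed here (e.g. "object", "arch") are ignored by serde.
#[derive(Debug, Deserialize)]
struct ModelEntry {
    id: String,
    /// "loaded" or "not-loaded".
    state: String,
    /// The model's maximum context length; present in both samples,
    /// kept optional here defensively.
    max_context_length: Option<u64>,
    /// Only present when the model is loaded (see the second sample below).
    loaded_context_length: Option<u64>,
}
```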

## Notes
The response from `{api_url}/models` seems to return the model's
`max_tokens`, not the currently configured context length, but showing
the model's `max_tokens` is still better than showing 2k for
everything.

`loaded_context_length` exists, but only if the model is already loaded
when Zed starts, which usually isn't the case.

Maybe `fetch_models` should be rerun when swapping LM Studio models.
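
A sketch of the fallback this implies, reusing the `ModelEntry` sketch above; the function name and fallback chain are illustrative, with 2048 standing in for the old hardcoded default:

```rust
/// Prefer the configured context when the model was loaded at startup,
/// then the model maximum from `/models`, then the old hardcoded
/// default. Names here are illustrative, not Zed's.
fn effective_max_tokens(entry: &ModelEntry) -> u64 {
    entry
        .loaded_context_length        // configured context, if loaded
        .or(entry.max_context_length) // model maximum from `/models`
        .unwrap_or(2048)              // old default as a last resort
}
```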

### Currently configured context
This isn't shown in `{api_url}/models`.

![image](https://github.com/user-attachments/assets/8511cb9d-914b-4065-9eba-c0b086ad253b)

### Json from `{api_url}/models` with model loaded
```json
{
  "id": "qwen2.5-coder-1.5b-instruct-mlx",
  "object": "model",
  "type": "llm",
  "publisher": "lmstudio-community",
  "arch": "qwen2",
  "compatibility_type": "mlx",
  "quantization": "4bit",
  "state": "loaded",
  "max_context_length": 32768,
  "loaded_context_length": 4096
}
```
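
A usage sketch parsing the loaded sample above, assuming `serde_json` is available along with the `ModelEntry` and `effective_max_tokens` sketches from earlier (extra JSON fields are trimmed since serde ignores them anyway):

```rust
// Parse the "loaded" sample; only the fields the sketch declares matter.
let json = r#"
{
    "id": "qwen2.5-coder-1.5b-instruct-mlx",
    "state": "loaded",
    "max_context_length": 32768,
    "loaded_context_length": 4096
}
"#;
let entry: ModelEntry = serde_json::from_str(json).unwrap();
assert_eq!(entry.loaded_context_length, Some(4096));
assert_eq!(effective_max_tokens(&entry), 4096); // configured context wins
```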

Release Notes:

- lmstudio: Fixed the `max_tokens` shown in the assistant panel

---------

Co-authored-by: Peter Tripp <peter@zed.dev>
2025-06-06 20:21:23 +00:00
| File | Latest commit | Date |
| --- | --- | --- |
| anthropic.rs | anthropic: Fix error when attaching multiple images (#32092) | 2025-06-05 16:29:49 +00:00 |
| bedrock.rs | bedrock: Fix cross-region inference (#30659) | 2025-06-03 15:46:35 +00:00 |
| cloud.rs | Add thinking budget for Gemini custom models (#31251) | 2025-06-03 13:40:20 +02:00 |
| copilot_chat.rs | Add UI for configuring the API Url directly (#32248) | 2025-06-06 18:05:40 +02:00 |
| deepseek.rs | Add tool support for DeepSeek (#30223) | 2025-06-03 10:59:36 +02:00 |
| google.rs | google: Add latest versions of Gemini 2.5 Pro and Flash Preview (#32183) | 2025-06-05 19:30:34 +00:00 |
| lmstudio.rs | lmstudio: Fill max_tokens using the response from /models (#25606) | 2025-06-06 20:21:23 +00:00 |
| mistral.rs | language_models: Fix Mistral tool->user message sequence handling (#31736) | 2025-06-06 12:35:22 +03:00 |
| ollama.rs | Remove unused load_model method from LanguageModelProvider (#32070) | 2025-06-04 14:07:01 +00:00 |
| open_ai.rs | Pass up intent with completion requests (#31710) | 2025-05-29 20:43:12 +00:00 |
| open_router.rs | Add support for OpenRouter as a language model provider (#29496) | 2025-06-03 15:59:46 +00:00 |