Commit graph

15 commits

Author SHA1 Message Date
Umesh Yadav
f21ba9e2c6
lmstudio: Propagate actual error message from server (#34538)
Discovered in this issue: #34513

Previously, we were propagating deserialization errors to users when
using LMStudio, instead of the actual error message sent from the
LMStudio server. This change will help users understand why their
request failed while streaming responses.

Release Notes:

- lmsudio: Display specific backend error messaging on failure rather
than generic ones

---------

Signed-off-by: Umesh Yadav <git@umesh.dev>
Co-authored-by: Peter Tripp <peter@zed.dev>
2025-07-25 09:36:43 -04:00
Richard Feldman
5405c2c2d3
Standardize on u64 for token counts (#32869)
Previously we were using a mix of `u32` and `usize`, e.g. `max_tokens:
usize, max_output_tokens: Option<u32>` in the same `struct`.

Although [tiktoken](https://github.com/openai/tiktoken) uses `usize`,
token counts should be consistent across targets (e.g. the same model
doesn't suddenly get a smaller context window if you're compiling for
wasm32), and these token counts could end up getting serialized using a
binary protocol, so `usize` is not the right choice for token counts.

I chose to standardize on `u64` over `u32` because we don't store many
of them (so the extra size should be insignificant) and future models
may exceed `u32::MAX` tokens.

Release Notes:

- N/A
2025-06-17 10:43:07 -04:00
Umesh Yadav
4b88090cca
language_models: Add images support to LMStudio provider (#32741)
Tested with gemma3:4b
LMStudio: beta version 0.3.17

Release Notes:

- Add images support to LMStudio provider
2025-06-17 12:14:44 +02:00
Umesh Yadav
4ac7935589
language_models: Add thinking support to LM Studio provider (#32337)
It works similar to how deepseek works where the thinking is returned as
reasoning_content and we don't have to send the reasoning_content back
in the request.

This is a experiment feature which can be enabled from settings like
this:
<img width="1381" alt="Screenshot 2025-06-08 at 4 26 06 PM"
src="https://github.com/user-attachments/assets/d2f60f3c-0f93-45fc-bae2-4ded42981820"
/>

Here is how it looks to use(tested with
`deepseek/deepseek-r1-0528-qwen3-8b`

<img width="528" alt="Screenshot 2025-06-08 at 5 12 33 PM"
src="https://github.com/user-attachments/assets/f7716f52-5417-4f14-82b8-e853de054f63"
/>


Release Notes:

- Add thinking support to LM Studio provider
2025-06-09 11:55:34 +02:00
Elijah McMorris
52fa7ababb
lmstudio: Fill max_tokens using the response from /models (#25606)
The info for `max_tokens` for the model is included in
`{api_url}/models`
I don't think this needs to be `.clamp` like in
`crates/ollama/src/ollama.rs` `get_max_tokens`, but it might need to be

## Before:
Every model shows 2k

![image](https://github.com/user-attachments/assets/676075c8-0ceb-44b1-ae27-72ed6a6d783c)

## After:

![image](https://github.com/user-attachments/assets/8291535b-976e-4601-b617-1a508bf44e12)

### Json from `{api_url}/models` with model not loaded
```json
  {
      "id": "qwen2.5-coder-1.5b-instruct-mlx",
      "object": "model",
      "type": "llm",
      "publisher": "lmstudio-community",
      "arch": "qwen2",
      "compatibility_type": "mlx",
      "quantization": "4bit",
      "state": "not-loaded",
      "max_context_length": 32768
    },
```

## Notes
The response from `{api_url}/models` seems to return the `max_tokens`
for the model, not the currently configured context length, but I think
showing the `max_tokens` for the model is better than setting 2k for
everything

`loaded_context_length` exists, but only if the model is loaded at the
startup of zed, which usually isn't the case

maybe `fetch_models` should be rerun when swapping lmstudio models

### Currently configured context
this isn't shown in `{api_url}/models`

![image](https://github.com/user-attachments/assets/8511cb9d-914b-4065-9eba-c0b086ad253b)

### Json from `{api_url}/models` with model loaded
```json
  {
     "id": "qwen2.5-coder-1.5b-instruct-mlx",
      "object": "model",
      "type": "llm",
      "publisher": "lmstudio-community",
      "arch": "qwen2",
      "compatibility_type": "mlx",
      "quantization": "4bit",
      "state": "loaded",
      "max_context_length": 32768,
      "loaded_context_length": 4096
    },
```

Release Notes:

- lmstudio: Fixed showing `max_tokens` in the assistant panel

---------

Co-authored-by: Peter Tripp <peter@zed.dev>
2025-06-06 20:21:23 +00:00
Ben Brandt
4304521655
Remove unused load_model method from LanguageModelProvider (#32070)
Removes the load_model trait method and its implementations in Ollama
and LM Studio providers, along with associated preload_model functions
and unused imports.

Release Notes:

- N/A
2025-06-04 14:07:01 +00:00
Fedor Nezhivoi
998542b048
language_models: Add support for tool use to LM Studio provider (#30589)
Closes #30004

**Quick demo:**


https://github.com/user-attachments/assets/0ac93851-81d7-4128-a34b-1f3ae4bcff6d

**Additional notes:**

I've tried to stick to existing code in OpenAI provider as much as
possible without changing much to keep the diff small.

This PR is done in collaboration with @yagil from LM Studio. We agreed
upon the format in which LM Studio will return information about tool
use support for the model in the upcoming version. As of current stable
version nothing is going to change for the users, but once they update
to a newer LM Studio tool use gets automatically enabled for them. I
think this is much better UX then defaulting to true right now.


Release Notes:

- Added support for tool calls to LM Studio provider

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2025-05-26 13:54:17 +02:00
Kirill Bulatov
16366cf9f2
Use anyhow more idiomatically (#31052)
https://github.com/zed-industries/zed/issues/30972 brought up another
case where our context is not enough to track the actual source of the
issue: we get a general top-level error without inner error.

The reason for this was `.ok_or_else(|| anyhow!("failed to read HEAD
SHA"))?; ` on the top level.

The PR finally reworks the way we use anyhow to reduce such issues (or
at least make it simpler to bubble them up later in a fix).
On top of that, uses a few more anyhow methods for better readability.

* `.ok_or_else(|| anyhow!("..."))`, `map_err` and other similar error
conversion/option reporting cases are replaced with `context` and
`with_context` calls
* in addition to that, various `anyhow!("failed to do ...")` are
stripped with `.context("Doing ...")` messages instead to remove the
parasitic `failed to` text
* `anyhow::ensure!` is used instead of `if ... { return Err(...); }`
calls
* `anyhow::bail!` is used instead of `return Err(anyhow!(...));`

Release Notes:

- N/A
2025-05-20 23:06:07 +00:00
Umesh Yadav
a743035286
lmstudio: Fix streaming not working in v0.3.15 (#30013)
Closes #29781

Tested this with llama3, gemma3 and qwen3.

This is a breaking change, which means after adding this code changes in
future version zed we will require atleast lmstudio >= 0.3.15. For
context why it's breaking changes check out the issue: #29781.

What this doesn't try to solve is:

* Tool calling, thinking text rendering. Will raise a seperate PR for
these as those are not required in this PR to make it work.


https://github.com/user-attachments/assets/945f9c73-6323-4a88-92e2-2219b760a249

Release Notes:

- lmstudio: Fixed Zed support for LMStudio >= v0.3.15 (breaking change -- older versions are no longer supported).

---------

Co-authored-by: Peter Tripp <peter@zed.dev>
2025-05-06 12:59:36 -04:00
Julia Ryan
01ec6e0f77
Add workspace-hack (#27277)
This adds a "workspace-hack" crate, see
[mozilla's](https://hg.mozilla.org/mozilla-central/file/3a265fdc9f33e5946f0ca0a04af73acd7e6d1a39/build/workspace-hack/Cargo.toml#l7)
for a concise explanation of why this is useful. For us in practice this
means that if I were to run all the tests (`cargo nextest r
--workspace`) and then `cargo r`, all the deps from the previous cargo
command will be reused. Before this PR it would rebuild many deps due to
resolving different sets of features for them. For me this frequently
caused long rebuilds when things "should" already be cached.

To avoid manually maintaining our workspace-hack crate, we will use
[cargo hakari](https://docs.rs/cargo-hakari) to update the build files
when there's a necessary change. I've added a step to CI that checks
whether the workspace-hack crate is up to date, and instructs you to
re-run `script/update-workspace-hack` when it fails.

Finally, to make sure that people can still depend on crates in our
workspace without pulling in all the workspace deps, we use a `[patch]`
section following [hakari's
instructions](https://docs.rs/cargo-hakari/0.9.36/cargo_hakari/patch_directive/index.html)

One possible followup task would be making guppy use our
`rust-toolchain.toml` instead of having to duplicate that list in its
config, I opened an issue for that upstream: guppy-rs/guppy#481.

TODO:
- [x] Fix the extension test failure
- [x] Ensure the dev dependencies aren't being unified by Hakari into
the main dependencies
- [x] Ensure that the remote-server binary continues to not depend on
LibSSL

Release Notes:

- N/A

---------

Co-authored-by: Mikayla <mikayla@zed.dev>
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
2025-04-02 13:26:34 -07:00
Piotr Osiewicz
dc64ec9cc8
chore: Bump Rust edition to 2024 (#27800)
Follow-up to https://github.com/zed-industries/zed/pull/27791

Release Notes:

- N/A
2025-03-31 20:55:27 +02:00
Peter Tripp
3af37ddf6d
lmstudio: Support missing quantization in model metadata (#24054)
- Closes https://github.com/zed-industries/zed/issues/23764

Certain models do not include `quantization` parameter from lm studio rest API.
2025-01-31 22:28:11 +00:00
Nathan Sobo
6fca1d2b0b
Eliminate GPUI View, ViewContext, and WindowContext types (#22632)
There's still a bit more work to do on this, but this PR is compiling
(with warnings) after eliminating the key types. When the tasks below
are complete, this will be the new narrative for GPUI:

- `Entity<T>` - This replaces `View<T>`/`Model<T>`. It represents a unit
of state, and if `T` implements `Render`, then `Entity<T>` implements
`Element`.
- `&mut App` This replaces `AppContext` and represents the app.
- `&mut Context<T>` This replaces `ModelContext` and derefs to `App`. It
is provided by the framework when updating an entity.
- `&mut Window` Broken out of `&mut WindowContext` which no longer
exists. Every method that once took `&mut WindowContext` now takes `&mut
Window, &mut App` and every method that took `&mut ViewContext<T>` now
takes `&mut Window, &mut Context<T>`

Not pictured here are the two other failed attempts. It's been quite a
month!

Tasks:

- [x] Remove `View`, `ViewContext`, `WindowContext` and thread through
`Window`
- [x] [@cole-miller @mikayla-maki] Redraw window when entities change
- [x] [@cole-miller @mikayla-maki] Get examples and Zed running
- [x] [@cole-miller @mikayla-maki] Fix Zed rendering
- [x] [@mikayla-maki] Fix todo! macros and comments
- [x] Fix a bug where the editor would not be redrawn because of view
caching
- [x] remove publicness window.notify() and replace with
`AppContext::notify`
- [x] remove `observe_new_window_models`, replace with
`observe_new_models` with an optional window
- [x] Fix a bug where the project panel would not be redrawn because of
the wrong refresh() call being used
- [x] Fix the tests
- [x] Fix warnings by eliminating `Window` params or using `_`
- [x] Fix conflicts
- [x] Simplify generic code where possible
- [x] Rename types
- [ ] Update docs

### issues post merge

- [x] Issues switching between normal and insert mode
- [x] Assistant re-rendering failure
- [x] Vim test failures
- [x] Mac build issue



Release Notes:

- N/A

---------

Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Cole Miller <cole@zed.dev>
Co-authored-by: Mikayla <mikayla@zed.dev>
Co-authored-by: Joseph <joseph@zed.dev>
Co-authored-by: max <max@zed.dev>
Co-authored-by: Michael Sloan <michael@zed.dev>
Co-authored-by: Mikayla Maki <mikaylamaki@Mikaylas-MacBook-Pro.local>
Co-authored-by: Mikayla <mikayla.c.maki@gmail.com>
Co-authored-by: joão <joao@zed.dev>
2025-01-26 03:02:45 +00:00
Piotr Osiewicz
c9534e8025
chore: Use workspace fields for edition and publish (#23291)
This prepares us for an upcoming bump to Rust 2024 edition.

Release Notes:

- N/A
2025-01-17 17:39:22 +01:00
Yagil Burowski
c038696aa8
Add LM Studio support to the Assistant (#23097)
#### Release Notes:

- Added support for [LM Studio](https://lmstudio.ai/) to the Assistant.

#### Quick demo:


https://github.com/user-attachments/assets/af58fc13-1abc-4898-9747-3511016da86a

#### Future enhancements:
- wire up tool calling (new in [LM Studio
0.3.6](https://lmstudio.ai/blog/lmstudio-v0.3.6))

---------

Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
2025-01-14 20:41:58 +00:00