Commit graph

70 commits

Author SHA1 Message Date
Oliver Azevedo Barnes
dcb4c3163b
Use OLLAMA_API_KEY_VAR to store env var string 2025-08-23 22:53:28 -04:00
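A minimal sketch of the idea (the constant name comes from the commit; its value and the helper are assumptions):

```rust
// Name of the environment variable holding the Ollama API key.
// The value here is an assumption; only the constant name is from the commit.
pub const OLLAMA_API_KEY_VAR: &str = "OLLAMA_API_KEY";

// Hypothetical helper: read the key from the environment if set.
fn api_key_from_env() -> Option<String> {
    std::env::var(OLLAMA_API_KEY_VAR).ok()
}
```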
Oliver Azevedo Barnes
62ce840fc1
Rename OllamaService to State
To follow the convention used in language_models
2025-08-23 22:44:57 -04:00
Oliver Azevedo Barnes
fa1a6c297d
Remove unneeded comments 2025-08-17 15:54:31 -04:00
Oliver Azevedo Barnes
70f0297c48
Log error instead of warning when ollama server is unavailable 2025-08-17 15:27:12 -04:00
Oliver Azevedo Barnes
a9f248f259
Send api key for api/show and api/tags 2025-08-17 15:26:40 -04:00
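A hedged sketch of the change: the bearer header now accompanies the metadata endpoints (api/show, api/tags) too, not just completions. The helper is hypothetical:

```rust
// Hypothetical helper: build the Authorization header to attach to
// api/show and api/tags requests when an API key is configured.
fn authorization_header(api_key: Option<&str>) -> Option<(&'static str, String)> {
    api_key.map(|key| ("Authorization", format!("Bearer {key}")))
}
```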
Oliver Azevedo Barnes
8bd784f5ea
Merge branch 'main' into ollama-inline-completions
# Conflicts:
#	crates/edit_prediction_button/src/edit_prediction_button.rs
#	crates/editor/src/editor.rs
2025-08-08 20:44:27 +01:00
Umesh Yadav
b8e8fbd8e6
ollama: Add support for gpt-oss (#35648)
There is a known bug when calling tools; see the discussion:
https://discord.com/channels/1128867683291627614/1402385744038858853
I have raised the issue with the Ollama team and they are currently
fixing it.

Release Notes:

- ollama: Add support for gpt-oss
2025-08-06 10:44:15 -04:00
Oliver Azevedo Barnes
947781bc48
Merge models in local settings with ones listed by ollama
This allows for the scenario where the user doesn't have access to Ollama's model listing and needs to tell Zed about the models explicitly, by hand
2025-07-31 13:42:53 +01:00
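A minimal sketch of such a merge, assuming models are deduplicated by name with local settings taking precedence (types and fields are hypothetical):

```rust
use std::collections::BTreeMap;

#[derive(Clone)]
struct Model {
    name: String,
    max_tokens: u64,
}

// Hypothetical merge: start from what the Ollama API listed, then let
// models declared by hand in local settings override or extend the list.
fn merge_models(listed: Vec<Model>, from_settings: Vec<Model>) -> Vec<Model> {
    let mut by_name: BTreeMap<String, Model> =
        listed.into_iter().map(|m| (m.name.clone(), m)).collect();
    for model in from_settings {
        by_name.insert(model.name.clone(), model); // settings win
    }
    by_name.into_values().collect()
}
```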
Oliver Azevedo Barnes
d583a35a2d
Add new user_agent method required by updated HttpClient trait 2025-07-30 17:26:39 +01:00
Oliver Azevedo Barnes
1782243ae1
Merge branch 'main' into ollama-inline-completions
# Conflicts:
#	docs/src/ai/configuration.md
2025-07-29 12:57:11 +01:00
Oliver Azevedo Barnes
0bdb42e65d
Auto detect models WIP 2025-07-25 10:21:32 +01:00
versecafe
c08851a85e
ollama: Add Magistral to Ollama (#35000)
See also: #34983

Release Notes:

- Added magistral support to ollama
2025-07-24 00:17:54 -04:00
Oliver Azevedo Barnes
2350d4b9cd
Log a warning when ollama isn't available 2025-07-18 12:24:28 +01:00
Oliver Azevedo Barnes
909b2eca03
Send codellama:7b-code stop token in request
So Ollama filters it out
2025-07-17 18:02:35 +01:00
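A hedged sketch of the request shape, assuming the stop sequence travels in Ollama's `options.stop` field (`<EOT>` is codellama's end marker; the exact plumbing in Zed may differ):

```rust
use serde_json::json;

fn main() {
    // Hypothetical request body: ask Ollama itself to stop at and strip
    // the model's stop token rather than filtering it client-side.
    let request = json!({
        "model": "codellama:7b-code",
        "prompt": "fn add(a: i32, b: i32) -> i32 {",
        "options": { "stop": ["<EOT>"] },
    });
    println!("{request}");
}
```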
Oliver Azevedo Barnes
a50dc886da
Test completion invalidation 2025-07-15 16:15:49 +01:00
Oliver Azevedo Barnes
fcd718261d
Test partial acceptance of completion suggestion 2025-07-15 15:35:23 +01:00
Oliver Azevedo Barnes
5f5cdae62c
Updated dev dependencies 2025-07-13 10:44:22 +01:00
Oliver Azevedo Barnes
7953dc0543
Test partial typing 2025-07-13 10:25:32 +01:00
Oliver Azevedo Barnes
84413ab143
Test completion flow 2025-07-13 10:24:12 +01:00
Oliver Azevedo Barnes
dc7396380f
Ollama::fake 2025-07-13 10:22:45 +01:00
Oliver Azevedo Barnes
73426d7016
Let's start over with tests 2025-07-12 13:21:12 +01:00
Oliver Azevedo Barnes
fa5e7c4631
Default to Qwen Coder 2025-07-10 18:06:34 +01:00
Oliver Azevedo Barnes
cb9d2d40b8
Enable partial acceptance 2025-07-09 21:53:49 +01:00
Oliver Azevedo Barnes
2942f4aace
Eager / subtle now working 2025-07-09 19:51:16 +01:00
Oliver Azevedo Barnes
9188e3f5de
Support using an API key 2025-07-08 22:09:50 +01:00
Oliver Azevedo Barnes
cce9949d92
Remove stop tokens and completion cleanup 2025-07-07 20:49:08 -03:00
Oliver Azevedo Barnes
b50555b87a
Use Ollama's suffix field and remove FIM token handling 2025-07-03 14:37:24 -03:00
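A minimal sketch of fill-in-the-middle via the `suffix` field of `/api/generate`, which lets Ollama assemble the model-specific FIM prompt itself instead of Zed interleaving FIM tokens by hand:

```rust
use serde_json::json;

fn main() {
    // Hypothetical request: the text before the cursor goes in `prompt`,
    // the text after it in `suffix`; no per-model FIM tokens needed.
    let request = json!({
        "model": "codellama:7b-code",
        "prompt": "fn fib(n: u64) -> u64 {",
        "suffix": "}",
        "options": { "temperature": 0 },
    });
    println!("{request}");
}
```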
Oliver Azevedo Barnes
af66570bfe
Improved FIM token handling per model 2025-07-03 11:53:06 -03:00
Oliver Azevedo Barnes
902a07606b
Ollama model switcher working 2025-07-02 13:15:11 -03:00
Oliver Azevedo Barnes
7f8dc940f7
Remove unused dependency on multibuffer 2025-06-29 21:35:23 -03:00
Oliver Azevedo Barnes
72d0b2402a
Support using ollama as an inline_completion_provider 2025-06-29 13:29:45 -03:00
Richard Feldman
5405c2c2d3
Standardize on u64 for token counts (#32869)
Previously we were using a mix of `u32` and `usize`, e.g. `max_tokens:
usize, max_output_tokens: Option<u32>` in the same `struct`.

Although [tiktoken](https://github.com/openai/tiktoken) uses `usize`,
token counts should be consistent across targets (e.g. the same model
doesn't suddenly get a smaller context window if you're compiling for
wasm32), and these token counts could end up getting serialized using a
binary protocol, so `usize` is not the right choice for token counts.

I chose to standardize on `u64` over `u32` because we don't store many
of them (so the extra size should be insignificant) and future models
may exceed `u32::MAX` tokens.
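For illustration, the shape this standardizes on (a sketch; the field names echo the ones quoted above):

```rust
// One integer width for every token count, regardless of target
// (`usize` would be 32-bit on wasm32).
struct ModelLimits {
    max_tokens: u64,
    max_output_tokens: Option<u64>,
}
```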

Release Notes:

- N/A
2025-06-17 10:43:07 -04:00
Umesh Yadav
ed4b29f80c
language_models: Improve token counting for providers (#32853)
We push the usage data whenever we receive it from the provider to make
sure the counting is correct after the turn has ended.

- [x] Ollama 
- [x] Copilot 
- [x] Mistral 
- [x] OpenRouter 
- [x] LMStudio

I've put all the changes into a single PR; I'm open to moving these to
separate PRs if that makes review and testing easier.
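A hedged sketch of the accumulation, assuming each streamed chunk may carry updated counts (the type is hypothetical):

```rust
#[derive(Default, Debug)]
struct TokenUsage {
    input_tokens: u64,
    output_tokens: u64,
}

impl TokenUsage {
    // Overwrite with the latest counts whenever the provider reports
    // usage, so the totals are correct once the turn has ended.
    fn push(&mut self, input_tokens: u64, output_tokens: u64) {
        self.input_tokens = input_tokens;
        self.output_tokens = output_tokens;
    }
}
```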

Release Notes:

- N/A
2025-06-17 10:46:29 +00:00
Ben Brandt
4304521655
Remove unused load_model method from LanguageModelProvider (#32070)
Removes the load_model trait method and its implementations in Ollama
and LM Studio providers, along with associated preload_model functions
and unused imports.

Release Notes:

- N/A
2025-06-04 14:07:01 +00:00
Umesh Yadav
59686f1f44
language_models: Add images support for Ollama vision models (#31883)
Ollama supports vision models that can process input images. This PR adds
support for that. I have tested it with gemma3:4b and attached a screenshot
of it working.

<img width="435" alt="image"
src="https://github.com/user-attachments/assets/5f17d742-0a37-4e6c-b4d8-05b750a0a158"
/>


Release Notes:

- Add image support for [Ollama vision models](https://ollama.com/search?c=vision)
2025-06-03 11:12:59 +02:00
Umesh Yadav
65e3e84cbc
language_models: Add thinking support for ollama (#31665)
This PR updates how we handle Ollama responses, leveraging the new
[v0.9.0](https://github.com/ollama/ollama/releases/tag/v0.9.0) release.
Previously, thinking text was embedded within the model's main content,
leading to it appearing directly in the agent's response. Now, thinking
content is provided as a separate parameter, allowing us to display it
correctly within the agent panel, similar to other providers. I have
tested this with qwen3:8b and it works nicely. ~~We can release this once
the Ollama release is stable.~~ It's released now as stable.
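A sketch of the response shape this relies on: since v0.9.0, the chat message carries reasoning text in its own field instead of inline in `content` (the struct below is illustrative, not Zed's actual type):

```rust
use serde::Deserialize;

// Illustrative subset of an Ollama chat response message.
#[derive(Deserialize)]
struct ChatMessage {
    role: String,
    content: String,
    // Present when the model emits reasoning separately (Ollama >= 0.9.0),
    // so the agent panel can render it apart from the answer.
    #[serde(default)]
    thinking: Option<String>,
}
```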

<img width="433" alt="image"
src="https://github.com/user-attachments/assets/2983ef06-6679-4033-82c2-231ea9cd6434"
/>


Release Notes:

- Add thinking support for ollama

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2025-06-02 15:12:41 +00:00
tidely
6d687a2c2c
ollama: Change default context size to 4096 (#31682)
Ollama increased their default context size from 2048 to 4096 tokens in
v0.6.7, which was released over a month ago.

https://github.com/ollama/ollama/releases/tag/v0.6.7
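The change itself is likely little more than the default constant (the name is hypothetical):

```rust
// Matches Ollama's upstream default as of v0.6.7 (previously 2048).
pub const DEFAULT_CONTEXT_SIZE: u64 = 4096;
```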

Release Notes:

- ollama: Update default model context to 4096 (matching upstream)
2025-05-30 16:12:39 -04:00
Umesh Yadav
cc428330a9
mistral: Add DevstralSmallLatest model to Mistral and Ollama (#31099)
Mistral just released a SOTA coding model:
https://mistral.ai/news/devstral

This PR adds support for it in both Ollama and Mistral.

Release Notes:

- Add DevstralSmallLatest model to Mistral and Ollama
2025-05-22 14:22:35 -04:00
Kirill Bulatov
16366cf9f2
Use anyhow more idiomatically (#31052)
https://github.com/zed-industries/zed/issues/30972 brought up another
case where our context is not enough to track the actual source of the
issue: we get a general top-level error without the inner error.

The reason for this was `.ok_or_else(|| anyhow!("failed to read HEAD SHA"))?;` at the top level.

The PR finally reworks the way we use anyhow to reduce such issues (or
at least make it simpler to bubble them up later in a fix).
On top of that, uses a few more anyhow methods for better readability.

* `.ok_or_else(|| anyhow!("..."))`, `map_err` and other similar error
conversion/option reporting cases are replaced with `context` and
`with_context` calls
* in addition to that, various `anyhow!("failed to do ...")` calls are
replaced with `.context("Doing ...")` messages to remove the parasitic
`failed to` text
* `anyhow::ensure!` is used instead of `if ... { return Err(...); }`
calls
* `anyhow::bail!` is used instead of `return Err(anyhow!(...));`
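A minimal before/after sketch of these idioms (illustrative, not code from the PR):

```rust
use anyhow::{bail, ensure, Context as _, Result};

fn head_sha(repo: &std::path::Path) -> Result<String> {
    // Before: .ok_or_else(|| anyhow!("failed to read HEAD SHA"))?
    // After: `context` keeps the underlying io::Error attached.
    let sha = std::fs::read_to_string(repo.join(".git/HEAD"))
        .context("reading HEAD SHA")?;
    // Instead of `if sha.trim().is_empty() { return Err(...); }`:
    ensure!(!sha.trim().is_empty(), "HEAD SHA is empty");
    if sha.starts_with("ref: ") {
        // Instead of `return Err(anyhow!(...));`:
        bail!("HEAD is a symbolic ref, not a SHA");
    }
    Ok(sha.trim().to_string())
}
```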

Release Notes:

- N/A
2025-05-20 23:06:07 +00:00
Richard Feldman
fcb9706022
Improve Ollama tool use (#30120)
<img width="458" alt="Screenshot 2025-05-07 at 9 37 39 AM"
src="https://github.com/user-attachments/assets/80f8a9b8-6a13-4e84-b91d-140e11475638"
/>

<img width="603" alt="Screenshot 2025-05-07 at 9 37 33 AM"
src="https://github.com/user-attachments/assets/7fe67a68-3885-4a0e-a282-aad37e92068b"
/>


Release Notes:

- Ollama models no longer require the supports_tools field in settings
(defaults to false)

---------

Co-authored-by: Antonio Scandurra <me@as-cii.com>
2025-05-07 15:37:06 +00:00
tidely
769ec59162
ollama: Add tool call support (#29563)
The goal of this PR is to support tool calls using Ollama. A lot of the
serialization work was done in
https://github.com/zed-industries/zed/pull/15803; however, the abstraction
over language models always disables tools.

## Changelog:

- Use `serde_json::Value` inside `OllamaFunctionCall`. This fixes
deserialization of ollama tool calls.
- Added deserialization tests using json from official ollama api docs.
- Fetch model capabilities during model enumeration from ollama provider
- Added `supports_tools` setting to manually configure if a model
supports tools
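A sketch of the first changelog item (the struct shape is inferred from the description; the real definition may differ):

```rust
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct OllamaFunctionCall {
    name: String,
    // `serde_json::Value` rather than a string, which is what fixes
    // deserialization of Ollama tool calls.
    arguments: serde_json::Value,
}
```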

## TODO:

- [x] Fix tool call serialization/deserialization
- [x] Fetch model capabilities from ollama api
- [x] Add tests for parsing model capabilities 
- [ ] Documentation for `supports_tools` field for ollama language model
config
- [ ] Convert between generic language model types
- [x] Pass tools to ollama

Release Notes:

- N/A

---------

Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Nathan Sobo <nathan@zed.dev>
2025-05-05 17:52:23 +00:00
Peter Tripp
4dc8ce8cf7
ollama: Add Qwen3 and Gemma3 (default to 16K context) (#29580)
If you have the VRAM you can increase the context by adding this to your
settings.json:

```json
  "language_models": {
    "ollama": {
      "available_models": [
        { "max_tokens": 65536, "name": "qwen3", "display_name": "Qwen3-64k" }
      ]
    }
  },
```

Release Notes:

- ollama: Add support for Qwen3. Defaults to 16K token context. See:
[Assistant Configuration
Docs](https://zed.dev/docs/assistant/configuration#ollama-context) to
increase.
2025-04-28 21:44:28 -04:00
tidely
8a717abe0d
ollama: Fix build with default features (#29502)
The `ollama` crate has a `use schemars::JsonSchema` statement even when
building with default features, which don't include the `schemars`
crate.
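The usual shape of such a fix is to gate the derive behind the feature rather than importing unconditionally (a sketch of the pattern, not the actual diff; the type is hypothetical):

```rust
use serde::Deserialize;

// No bare `use schemars::JsonSchema;` at module scope; the derive is
// only referenced when the `schemars` feature is enabled.
#[cfg_attr(feature = "schemars", derive(schemars::JsonSchema))]
#[derive(Deserialize)]
pub struct ChatOptions {
    pub num_ctx: Option<u64>,
}
```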

Release Notes:

- N/A
2025-04-28 09:58:10 -04:00
shenjack
b0609272c0
ollama: Add DeepSeek v3 max token length (#29156)
Add deepseek-v3 max token length for ollama

Release Notes:

- N/A
2025-04-24 13:20:22 -04:00
Julia Ryan
01ec6e0f77
Add workspace-hack (#27277)
This adds a "workspace-hack" crate, see
[mozilla's](https://hg.mozilla.org/mozilla-central/file/3a265fdc9f33e5946f0ca0a04af73acd7e6d1a39/build/workspace-hack/Cargo.toml#l7)
for a concise explanation of why this is useful. For us in practice this
means that if I were to run all the tests (`cargo nextest r
--workspace`) and then `cargo r`, all the deps from the previous cargo
command will be reused. Before this PR it would rebuild many deps due to
resolving different sets of features for them. For me this frequently
caused long rebuilds when things "should" already be cached.

To avoid manually maintaining our workspace-hack crate, we will use
[cargo hakari](https://docs.rs/cargo-hakari) to update the build files
when there's a necessary change. I've added a step to CI that checks
whether the workspace-hack crate is up to date, and instructs you to
re-run `script/update-workspace-hack` when it fails.

Finally, to make sure that people can still depend on crates in our
workspace without pulling in all the workspace deps, we use a `[patch]`
section following [hakari's
instructions](https://docs.rs/cargo-hakari/0.9.36/cargo_hakari/patch_directive/index.html)

One possible follow-up task would be making guppy use our
`rust-toolchain.toml` instead of having to duplicate that list in its
config; I opened an issue for that upstream: guppy-rs/guppy#481.

TODO:
- [x] Fix the extension test failure
- [x] Ensure the dev dependencies aren't being unified by Hakari into
the main dependencies
- [x] Ensure that the remote-server binary continues to not depend on
LibSSL

Release Notes:

- N/A

---------

Co-authored-by: Mikayla <mikayla@zed.dev>
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
2025-04-02 13:26:34 -07:00
Piotr Osiewicz
dc64ec9cc8
chore: Bump Rust edition to 2024 (#27800)
Follow-up to https://github.com/zed-industries/zed/pull/27791

Release Notes:

- N/A
2025-03-31 20:55:27 +02:00
Nathan Sobo
6fca1d2b0b
Eliminate GPUI View, ViewContext, and WindowContext types (#22632)
There's still a bit more work to do on this, but this PR is compiling
(with warnings) after eliminating the key types. When the tasks below
are complete, this will be the new narrative for GPUI:

- `Entity<T>` - This replaces `View<T>`/`Model<T>`. It represents a unit
of state, and if `T` implements `Render`, then `Entity<T>` implements
`Element`.
- `&mut App` This replaces `AppContext` and represents the app.
- `&mut Context<T>` This replaces `ModelContext` and derefs to `App`. It
is provided by the framework when updating an entity.
- `&mut Window` Broken out of `&mut WindowContext` which no longer
exists. Every method that once took `&mut WindowContext` now takes `&mut
Window, &mut App` and every method that took `&mut ViewContext<T>` now
takes `&mut Window, &mut Context<T>`

Not pictured here are the two other failed attempts. It's been quite a
month!
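A sketch of what the new signatures look like in practice (illustrative, based on the mapping above):

```rust
use gpui::{App, Context, Entity, Window};

struct Counter {
    count: usize,
}

impl Counter {
    // Was `fn increment(&mut self, cx: &mut ViewContext<Self>)`; the
    // window is now passed alongside a plain `Context<Self>`.
    fn increment(&mut self, _window: &mut Window, cx: &mut Context<Self>) {
        self.count += 1;
        cx.notify();
    }
}

// `Entity<Counter>` covers both the old `View` and `Model` roles.
fn build_counter(cx: &mut App) -> Entity<Counter> {
    cx.new(|_cx| Counter { count: 0 })
}
```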

Tasks:

- [x] Remove `View`, `ViewContext`, `WindowContext` and thread through
`Window`
- [x] [@cole-miller @mikayla-maki] Redraw window when entities change
- [x] [@cole-miller @mikayla-maki] Get examples and Zed running
- [x] [@cole-miller @mikayla-maki] Fix Zed rendering
- [x] [@mikayla-maki] Fix todo! macros and comments
- [x] Fix a bug where the editor would not be redrawn because of view
caching
- [x] remove the publicness of `window.notify()` and replace it with
`AppContext::notify`
- [x] remove `observe_new_window_models`, replace with
`observe_new_models` with an optional window
- [x] Fix a bug where the project panel would not be redrawn because of
the wrong refresh() call being used
- [x] Fix the tests
- [x] Fix warnings by eliminating `Window` params or using `_`
- [x] Fix conflicts
- [x] Simplify generic code where possible
- [x] Rename types
- [ ] Update docs

### issues post merge

- [x] Issues switching between normal and insert mode
- [x] Assistant re-rendering failure
- [x] Vim test failures
- [x] Mac build issue



Release Notes:

- N/A

---------

Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Cole Miller <cole@zed.dev>
Co-authored-by: Mikayla <mikayla@zed.dev>
Co-authored-by: Joseph <joseph@zed.dev>
Co-authored-by: max <max@zed.dev>
Co-authored-by: Michael Sloan <michael@zed.dev>
Co-authored-by: Mikayla Maki <mikaylamaki@Mikaylas-MacBook-Pro.local>
Co-authored-by: Mikayla <mikayla.c.maki@gmail.com>
Co-authored-by: joão <joao@zed.dev>
2025-01-26 03:02:45 +00:00
Peter Tripp
f38d0ff069
ollama: Set default max_tokens for llama3.3 (#23558) 2025-01-23 17:38:43 +00:00
Peter Tripp
14cd178ab0
ollama: Add deepseek-r1 context size to defaults (#23420) 2025-01-21 18:31:15 +00:00
Piotr Osiewicz
c9534e8025
chore: Use workspace fields for edition and publish (#23291)
This prepares us for an upcoming bump to Rust 2024 edition.

Release Notes:

- N/A
2025-01-17 17:39:22 +01:00