Commit graph

64 commits

Author SHA1 Message Date
Michael Sloan
fbf7caf93e
Default to fast model for thread summaries and titles + don't include system prompt / context / thinking segments (#29102)
* Adds a fast / cheaper model to providers and defaults thread
summarization to this model. Initial motivation for this was that
https://github.com/zed-industries/zed/pull/29099 would cause these
requests to fail when used with a thinking model. It doesn't seem
correct to use a thinking model for summarization.

* Skips system prompt, context, and thinking segments.

* If tool use is happening, allows 2 tool uses + one more agent response
before summarizing.

Downside of this is that there was potential for some prefix cache reuse
before, especially for title summarization (thread summarization omitted
tool results and so would not share a prefix for those). This seems fine
as these requests should typically be fairly small. Even for full thread
summarization, skipping all tool use / context should greatly reduce the
token use.

Release Notes:

- N/A
2025-04-19 23:26:29 +00:00
Bennet Bo Fenner
bafc086d27
agent: Preserve thinking blocks between requests (#29055)
Looks like the required backend component of this was deployed.

https://github.com/zed-industries/monorepo/actions/runs/14541199197

Release Notes:

- N/A

---------

Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Agus Zubiaga <hi@aguz.me>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Nathan Sobo <nathan@zed.dev>
2025-04-19 20:12:03 +00:00
Thomas Mickley-Doyle
d74f0735c2
Add more eval examples + filtering examples by language + fix git concurrent usage (#28719)
Release Notes:

- N/A

---------

Co-authored-by: michael <michael@zed.dev>
Co-authored-by: agus <agus@zed.dev>
2025-04-14 22:05:46 +00:00
Agus Zubiaga
b45230784d
agent: Handle context window exceeded errors from Anthropic (#28688)
![CleanShot 2025-04-14 at 11 15
38@2x](https://github.com/user-attachments/assets/9e803ffb-74fd-486b-bebc-2155a407a9fa)

Release Notes:

- agent: Handle context window exceeded errors from Anthropic
2025-04-14 14:39:33 +00:00
Danilo Leal
73305ce45e
Change zed.dev's default model to Claude 3.7 Sonnet (#28541)
From Claude 3.5 Sonnet to **Claude 3.7 Sonnet**.

Release Notes:

- Change the default model of Zed's hosted LLM service to Claude 3.7
Sonnet.
2025-04-10 18:34:04 -03:00
Marshall Bowers
1a899fda60
collab: Capture upstream input/output rate limits from Anthropic (#28106)
This PR makes it so we capture the upstream rate limit information from
Anthropic for input and output tokens.

Release Notes:

- N/A
2025-04-04 17:09:00 +00:00
Richard Feldman
ef8fe52877
Try adding beta token-efficient tool use for 3.7 Sonnet (#28100)
Release Notes:

- Enabled [token-efficient tool use
(beta)](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/token-efficient-tool-use)
for Claude 3.7 Sonnet models
2025-04-04 11:05:41 -05:00
Marshall Bowers
e5b347b03a
Remove unused extract_tool_args_from_events functions (#28038)
This PR removes the unused `extract_tool_args_from_events` functions
that were defined in some of the LLM provider crates.

Release Notes:

- N/A
2025-04-03 18:38:35 +00:00
Julia Ryan
01ec6e0f77
Add workspace-hack (#27277)
This adds a "workspace-hack" crate, see
[mozilla's](https://hg.mozilla.org/mozilla-central/file/3a265fdc9f33e5946f0ca0a04af73acd7e6d1a39/build/workspace-hack/Cargo.toml#l7)
for a concise explanation of why this is useful. For us in practice this
means that if I were to run all the tests (`cargo nextest r
--workspace`) and then `cargo r`, all the deps from the previous cargo
command will be reused. Before this PR it would rebuild many deps due to
resolving different sets of features for them. For me this frequently
caused long rebuilds when things "should" already be cached.

To avoid manually maintaining our workspace-hack crate, we will use
[cargo hakari](https://docs.rs/cargo-hakari) to update the build files
when there's a necessary change. I've added a step to CI that checks
whether the workspace-hack crate is up to date, and instructs you to
re-run `script/update-workspace-hack` when it fails.

Finally, to make sure that people can still depend on crates in our
workspace without pulling in all the workspace deps, we use a `[patch]`
section following [hakari's
instructions](https://docs.rs/cargo-hakari/0.9.36/cargo_hakari/patch_directive/index.html)

One possible followup task would be making guppy use our
`rust-toolchain.toml` instead of having to duplicate that list in its
config, I opened an issue for that upstream: guppy-rs/guppy#481.

TODO:
- [x] Fix the extension test failure
- [x] Ensure the dev dependencies aren't being unified by Hakari into
the main dependencies
- [x] Ensure that the remote-server binary continues to not depend on
LibSSL

Release Notes:

- N/A

---------

Co-authored-by: Mikayla <mikayla@zed.dev>
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
2025-04-02 13:26:34 -07:00
Piotr Osiewicz
dc64ec9cc8
chore: Bump Rust edition to 2024 (#27800)
Follow-up to https://github.com/zed-industries/zed/pull/27791

Release Notes:

- N/A
2025-03-31 20:55:27 +02:00
Richard Feldman
85740ddaa4
Make serialization backwards-compatible for collab server (#27626)
Sets up the collab server to accept the format of system message that
we'll introduce later for [prompt
caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching).

Release Notes:

- N/A
2025-03-27 18:20:10 -04:00
Bennet Bo Fenner
a709d4c7c6
assistant: Add support for claude-3-7-sonnet-thinking (#27085)
Closes #25671

Release Notes:

- Added support for `claude-3-7-sonnet-thinking` in the assistant panel

---------

Co-authored-by: Danilo Leal <daniloleal09@gmail.com>
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Agus Zubiaga <hi@aguz.me>
2025-03-21 12:29:07 +00:00
Michael Sloan
8e0e291bd5
Track cumulative token usage in assistant2 when using anthropic API (#26738)
Release Notes:

- N/A
2025-03-13 22:56:16 +00:00
Peter Tripp
10a4760f90
Add Anthropic Claude 3.7 support (#25497) 2025-02-24 16:10:26 -05:00
João Marcos
5bd7eaa173
Solve 50+ cargo doc warnings (#24071)
Release Notes:

- N/A
2025-02-01 06:19:29 +00:00
Marshall Bowers
19383036d5
anthropic: Fix license (#23867)
This PR fixes the license for the `anthropic` crate.

It was mistakenly licensed as AGPL, despite being used outside of
collab. It should be licensed as GPL.

Release Notes:

- N/A
2025-01-29 23:03:20 +00:00
Marshall Bowers
070890d361
anthropic: Don't bail out on unknown model ID (#23782)
This PR fixes an issue introduced in
https://github.com/zed-industries/zed/pull/20551/ that would prevent
models with unknown IDs from working in the LLM service.

We only need to look up a model from its ID for the beta headers, and if
we can't find that particular model we should fall back to the default
beta headers instead of bailing out completely,

Release Notes:

- N/A
2025-01-28 10:56:05 -05:00
Nathan Sobo
6fca1d2b0b
Eliminate GPUI View, ViewContext, and WindowContext types (#22632)
There's still a bit more work to do on this, but this PR is compiling
(with warnings) after eliminating the key types. When the tasks below
are complete, this will be the new narrative for GPUI:

- `Entity<T>` - This replaces `View<T>`/`Model<T>`. It represents a unit
of state, and if `T` implements `Render`, then `Entity<T>` implements
`Element`.
- `&mut App` This replaces `AppContext` and represents the app.
- `&mut Context<T>` This replaces `ModelContext` and derefs to `App`. It
is provided by the framework when updating an entity.
- `&mut Window` Broken out of `&mut WindowContext` which no longer
exists. Every method that once took `&mut WindowContext` now takes `&mut
Window, &mut App` and every method that took `&mut ViewContext<T>` now
takes `&mut Window, &mut Context<T>`

Not pictured here are the two other failed attempts. It's been quite a
month!

Tasks:

- [x] Remove `View`, `ViewContext`, `WindowContext` and thread through
`Window`
- [x] [@cole-miller @mikayla-maki] Redraw window when entities change
- [x] [@cole-miller @mikayla-maki] Get examples and Zed running
- [x] [@cole-miller @mikayla-maki] Fix Zed rendering
- [x] [@mikayla-maki] Fix todo! macros and comments
- [x] Fix a bug where the editor would not be redrawn because of view
caching
- [x] remove publicness window.notify() and replace with
`AppContext::notify`
- [x] remove `observe_new_window_models`, replace with
`observe_new_models` with an optional window
- [x] Fix a bug where the project panel would not be redrawn because of
the wrong refresh() call being used
- [x] Fix the tests
- [x] Fix warnings by eliminating `Window` params or using `_`
- [x] Fix conflicts
- [x] Simplify generic code where possible
- [x] Rename types
- [ ] Update docs

### issues post merge

- [x] Issues switching between normal and insert mode
- [x] Assistant re-rendering failure
- [x] Vim test failures
- [x] Mac build issue



Release Notes:

- N/A

---------

Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Cole Miller <cole@zed.dev>
Co-authored-by: Mikayla <mikayla@zed.dev>
Co-authored-by: Joseph <joseph@zed.dev>
Co-authored-by: max <max@zed.dev>
Co-authored-by: Michael Sloan <michael@zed.dev>
Co-authored-by: Mikayla Maki <mikaylamaki@Mikaylas-MacBook-Pro.local>
Co-authored-by: Mikayla <mikayla.c.maki@gmail.com>
Co-authored-by: joão <joao@zed.dev>
2025-01-26 03:02:45 +00:00
Peter Tripp
5f59536208
Fix older Anthropic models not supporting -latest tags (#23372)
- Closes: https://github.com/zed-industries/zed/issues/22322
2025-01-20 13:19:15 -05:00
Piotr Osiewicz
c9534e8025
chore: Use workspace fields for edition and publish (#23291)
This prepares us for an upcoming bump to Rust 2024 edition.

Release Notes:

- N/A
2025-01-17 17:39:22 +01:00
Roy Williams
b1a6e2427f
anthropic: Allow specifying additional beta headers for custom models (#20551)
Release Notes:

- Added the ability to specify additional beta headers for custom
Anthropic models.

---------

Co-authored-by: David Soria Parra <167242713+dsp-ant@users.noreply.github.com>
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
2025-01-03 23:46:32 +00:00
saahityaedams
e4eef725de
Add support for Claude 3.5 Haiku model (#22323)
Partly Closes #22185

Release Notes:

- Added support for the Claude 3.5 Haiku model.

Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
2025-01-03 18:49:29 +00:00
Thorsten Ball
aee01f2c50
assistant: Remove low_speed_timeout (#20681)
This removes the `low_speed_timeout` setting from all providers as a
response to issue #19509.

Reason being that the original `low_speed_timeout` was only as part of
#9913 because users wanted to _get rid of timeouts_. They wanted to bump
the default timeout from 5sec to a lot more.

Then, in the meantime, the meaning of `low_speed_timeout` changed in
#19055 and was changed to a normal `timeout`, which is a different thing
and breaks slower LLMs that don't reply with a complete response in the
configured timeout.

So we figured: let's remove the whole thing and replace it with a
default _connect_ timeout to make sure that we can connect to a server
in 10s, but then give the server as long as it wants to complete its
response.

Closes #19509

Release Notes:

- Removed the `low_speed_timeout` setting from LLM provider settings,
since it was only used to _increase_ the timeout to give LLMs more time,
but since we don't have any other use for it, we simply remove the
setting to give LLMs as long as they need.

---------

Co-authored-by: Antonio <antonio@zed.dev>
Co-authored-by: Peter Tripp <peter@zed.dev>
2024-11-15 07:37:31 +01:00
David Soria Parra
a15f408f0c
anthropic: Remove stable headers (#20595)
The tool and context length headers are now stable and no longer needed.

Release Notes:

- N/A
2024-11-13 15:04:37 -05:00
Peter Tripp
291af664e1
Switch to Anthropic -latest tags (#19615)
- Closes: https://github.com/zed-industries/zed/issues/19609

Switches us to using `-latest` tags with Anthropic models instead of
pinning to a specific date version.
See: [Anthropic Model
Docs](https://docs.anthropic.com/en/docs/about-claude/models)

This is a no-op for:
- Claude 3 Opus (`claude-3-opus-20240229`)
- Claude 3 Sonnet (`claude-3-sonnet-20240229`)
- Claude 3 Haiku (`claude-3-haiku-20240307`)

For Claude 3.5 Sonnet this will update us from
`claude-3-5-sonnet-20240620` to `claude-3-5-sonnet-20241022`. We will
also pickup any subsequent model updates automatically when Anthropic
updates the `latest` tag.

This matches the behavior for OpenAI where use `gpt-4o` as the
model_name and not `gpt-4o-2024-08-06`.
2024-10-23 15:13:52 -04:00
Mikayla Maki
22ac178f9d
Restore HTTP client transition, but use reqwest everywhere (#19055)
Release Notes:

- N/A
2024-10-11 14:58:58 -07:00
Marshall Bowers
d55f025906
collab: Track cache writes/reads in LLM usage (#18834)
This PR extends the LLM usage tracking to support tracking usage for
cache writes and reads for Anthropic models.

Release Notes:

- N/A

---------

Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Antonio <antonio@zed.dev>
2024-10-07 17:32:49 -04:00
Conrad Irwin
e28496d4e2
Stop leaking isahc assumption (#18408)
Users of our http_client crate knew they were interacting with isahc as
they set its extensions on the request. This change adds our own
equivalents for their APIs in preparation for changing the default http
client.

Release Notes:

- N/A
2024-09-26 14:01:05 -06:00
Roy Williams
5905fbb9ac
Allow Anthropic custom models to override temperature (#18160)
Release Notes:

- Allow Anthropic custom models to override "temperature"

This also centralized the defaulting of "temperature" to be inside of
each model's `into_x` call instead of being sprinkled around the code.
2024-09-20 14:59:12 -06:00
Piotr Osiewicz
e6c1c51b37
chore: Fix several style lints (#17488)
It's not comprehensive enough to start linting on `style` group, but
hey, it's a start.

Release Notes:

- N/A
2024-09-06 11:58:39 +02:00
Marshall Bowers
30b2133336
language_model: Add tool results to message content (#17363)
This PR updates the message content for an LLM request to allow it
contain tool results.

Release Notes:

- N/A
2024-09-04 13:29:01 -04:00
Marshall Bowers
f38956943b
assistant: Propagate LLM stop reason upwards (#17358)
This PR makes it so we propagate the `stop_reason` from Anthropic up to
the Assistant so that we can take action based on it.

The `extract_content_from_events` function was moved from `anthropic` to
the `anthropic` module in `language_model` since it is more useful if it
is able to name the `LanguageModelCompletionEvent` type, as otherwise
we'd need an additional layer of plumbing.

Release Notes:

- N/A
2024-09-04 12:31:10 -04:00
Marshall Bowers
452272e5df
assistant: Stream tool uses as structured data (#17322)
This PR adjusts the approach we use to encoding tool uses in the
completion response to use a structured format rather than simply
injecting it into the response stream as text.

In #17170 we would encode the tool uses as XML and insert them as text.
This would require then re-parsing the tool uses out of the buffer in
order to use them.

The approach taken in this PR is to make `stream_completion` return a
stream of `LanguageModelCompletionEvent`s. Each of these events can be
either text, or a tool use.

A new `stream_completion_text` method has been added to `LanguageModel`
for scenarios where we only care about textual content (currently,
everywhere that isn't the Assistant context editor).

Release Notes:

- N/A
2024-09-03 15:04:51 -04:00
Marshall Bowers
68ea661711
assistant: Add foundation for receiving tool uses from Anthropic models (#17170)
This PR updates the Assistant with support for receiving tool uses from
Anthropic models and capturing them as text in the context editor.

This is just laying the foundation for tool use. We don't yet fulfill
the tool uses yet, or define any tools for the model to use.

Here's an example of what it looks like using the example `get_weather`
tool from the Anthropic docs:

<img width="644" alt="Screenshot 2024-08-30 at 1 51 13 PM"
src="https://github.com/user-attachments/assets/3614f953-0689-423c-8955-b146729ea638">

Release Notes:

- N/A
2024-08-30 14:05:55 -04:00
Marshall Bowers
ea25d438d1
anthropic: Remove cache_control field from ResponseContent (#17165)
This PR removes the `cache_control` field from the variants in
`ResponseContent`.

This field is used on requests to control the caching behavior, but is
not needed on content in the response.

Release Notes:

- N/A
2024-08-30 12:22:47 -04:00
Marshall Bowers
8901d926eb
anthropic: Use separate Content type in requests and responses (#17163)
This PR splits the `Content` type for Anthropic into two new types:
`RequestContent` and `ResponseContent`.

As I was going through the Anthropic API docs it seems that there are
different types of content that can be sent in requests vs what can be
returned in responses.

Using a separate type for each case tells the story a bit better and
makes it easier to understand, IMO.

Release Notes:

- N/A
2024-08-30 11:46:03 -04:00
Peter Tripp
4d6bb52d1f
Anthropic/OpenAI: Add country codes for territories (#17089)
- Cloudflare provides ISO-3166-1 country code for protectorates. Expand our allowlist to include the territories of countries on the allowlist (US, UK, France, Australia, New Zealand). 
- Also include the country_code in the error message when we block. 

Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
2024-08-29 11:32:29 -04:00
Max Brunsfeld
1b1070e0f7
Add tracing needed for LLM rate limit dashboards (#16388)
Release Notes:

- N/A

---------

Co-authored-by: Marshall <marshall@zed.dev>
2024-08-16 17:52:31 -04:00
Nathan Sobo
907d76208d
Allow display name of custom Anthropic models to be customized (#16376)
Also added some docs for our settings.

Release Notes:

- N/A
2024-08-16 14:02:37 -06:00
Roy Williams
b4f5f5024e
Support 8192 output tokens for Claude Sonnet 3.5 (#16358)
Release Notes:

- Added support for 8192 output tokens from Claude Sonnet 3.5
(https://x.com/alexalbert__/status/1812921642143900036)
2024-08-16 11:47:39 -04:00
Roy Williams
46fb917e02
Implement Anthropic prompt caching (#16274)
Release Notes:

- Adds support for Prompt Caching in Anthropic. For models that support
it this can dramatically lower cost while improving performance.
2024-08-15 22:21:06 -05:00
Max Brunsfeld
4c390b82fb
Make LanguageModel::use_any_tool return a stream of chunks (#16262)
This PR is a refactor to pave the way for allowing the user to view and
edit workflow step resolutions. I've made tool calls work more like
normal streaming completions for all providers. The `use_any_tool`
method returns a stream of strings (which contain chunks of JSON). I've
also done some minor cleanup of language model providers in general,
removing the duplication around handling streaming responses.

Release Notes:

- N/A
2024-08-14 18:02:46 -07:00
Marshall Bowers
ebdb755fef
Surface upstream rate limits from Anthropic (#16118)
This PR makes it so hitting upstream rate limits from Anthropic result
in an HTTP 429 response instead of an HTTP 500.

To do this we need to surface structured errors out of the `anthropic`
crate.

Release Notes:

- N/A
2024-08-12 11:59:24 -04:00
Max Brunsfeld
33e120d964
Capture telemetry data on per-user monthly LLM spending (#16050)
Release Notes:

- N/A

---------

Co-authored-by: Marshall <marshall@zed.dev>
2024-08-09 16:38:37 -07:00
Bennet Bo Fenner
514b79e461
collab: Always use newest anthropic model version (#15978)
When Anthropic releases a new version of their models, Zed AI users
should always get access to the new version even when using an old
version of zed.

Co-Authored-By: Thorsten <thorsten@zed.dev>

Release Notes:

- N/A

Co-authored-by: Thorsten <thorsten@zed.dev>
2024-08-08 15:24:08 +02:00
Marshall Bowers
cf5f4dddf5
Authorize access to language model providers based on country (#15859)
This PR updates the LLM service to authorize access to language model
providers based on the requester's country.

We detect the country using Cloudflare's
[`CF-IPCountry`](https://developers.cloudflare.com/fundamentals/reference/http-request-headers/#cf-ipcountry)
header.

The country code is then checked against the list of supported countries
for the given LLM provider. Countries that are not supported will
receive an `HTTP 451: Unavailable For Legal Reasons` response.

Release Notes:

- N/A
2024-08-06 11:49:04 -04:00
Kirill Bulatov
9384f665bb
Properly extract errors from the Anthropic API (#15534)
Before, we missed "successful" responses with the API errors, now they
are properly shown in the assistant panel.


![image](https://github.com/user-attachments/assets/0c0936af-86c2-4def-9a58-25d5e0912b97)


Release Notes:

- N/A
2024-07-31 16:31:11 +03:00
Antonio Scandurra
99bc90a372
Allow customization of the model used for tool calling (#15479)
We also eliminate the `completion` crate and moved its logic into
`LanguageModelRegistry`.

Release Notes:

- N/A

---------

Co-authored-by: Nathan <nathan@zed.dev>
2024-07-30 16:18:53 +02:00
Antonio Scandurra
6e1f7c6e1d
Use tool calling instead of XML parsing to generate edit operations (#15385)
Release Notes:

- N/A

---------

Co-authored-by: Nathan <nathan@zed.dev>
2024-07-29 16:42:08 +02:00
Antonio Scandurra
d6bdaa8a91
Simplify LLM protocol (#15366)
In this pull request, we change the zed.dev protocol so that we pass the
raw JSON for the specified provider directly to our server. This avoids
the need to define a protobuf message that's a superset of all these
formats.

@bennetbo: We also changed the settings for available_models under
zed.dev to be a flat format, because the nesting seemed too confusing.
Can you help us upgrade the local provider configuration to be
consistent with this? We do whatever we need to do when parsing the
settings to make this simple for users, even if it's a bit more complex
on our end. We want to use versioning to avoid breaking existing users,
but need to keep making progress.

```json
"zed.dev": {
  "available_models": [
    {
      "provider": "anthropic",
        "name": "some-newly-released-model-we-havent-added",
        "max_tokens": 200000
      }
  ]
}
```

Release Notes:

- N/A

---------

Co-authored-by: Nathan <nathan@zed.dev>
2024-07-28 11:07:10 +02:00