This PR fixes the writing of LLM rate limit events to Clickhouse.
We had a table in the table name: `llm_rate_limits` instead of
`llm_rate_limit_events`.
I also extracted a helper function to write to Clickhouse so we can use
it anywhere we need to.
Release Notes:
- N/A
This PR reworks how we do checks for model names in the LLM service.
We now normalize the model names using the models defined in the
database.
Release Notes:
- N/A
This PR updates the LLM service to include the GitHub login on its
spans.
We need to pass this information through on the LLM token, so it will
temporarily be `None` until this change is deployed and new tokens have
been issued.
Release Notes:
- N/A
- db deadlock in GetLlmToken for non-staff users
- typo in allowed model name for non-staff users
Release Notes:
- N/A
---------
Co-authored-by: Marshall <marshall@zed.dev>
Co-authored-by: Joseph <joseph@zed.dev>
This PR adds the ability to revoke access tokens for the LLM service.
There is a new `revoked_access_tokens` table that contains the
identifiers (`jti`) of revoked access tokens.
To revoke an access token, insert a record into this table:
```sql
insert into revoked_access_tokens (jti) values ('1e887b9e-37f5-49e8-8feb-3274e5a86b67');
```
We now attach the `jti` as `authn.jti` to the tracing spans so that we
can associate an access token with a given request to the LLM service.
Release Notes:
- N/A
This PR adds feature-flagged access to the LLM service.
We've repurposed the `language-models` feature flag to be used for
providing access to Claude 3.5 Sonnet through the Zed provider.
The remaining RPC endpoints that were previously behind the
`language-models` feature flag are now behind a staff check.
We also put some Zed Pro related messaging behind a feature flag.
Release Notes:
- N/A
---------
Co-authored-by: Max <max@zed.dev>
This PR restricts usage of the LLM service to accounts older than 30
days.
We now store the GitHub user's `created_at` timestamp to check the
GitHub account age. If this is not set—which it won't be for existing
users—then we use the `created_at` timestamp in the Zed database.
Release Notes:
- N/A
---------
Co-authored-by: Max <max@zed.dev>
Now, when an anthropic request is invalid or anthropic's API is down,
we'll expose that to the user instead of just returning a generic 500.
Release Notes:
- N/A
Co-authored-by: Marshall <marshall@zed.dev>
This PR makes it so Zed staff can use a separate Anthropic API key for
the LLM service.
We also added an `is_staff` column to the `usages` table so that we can
exclude staff usage from the "active users" metrics that influence the
rate limits.
Release Notes:
- N/A
---------
Co-authored-by: Max <max@zed.dev>
This PR adds a check to the LLM API token issuance to ensure that we
only issue tokens to users that have accepted the terms of service.
Release Notes:
- N/A
This PR makes it so hitting upstream rate limits from Anthropic result
in an HTTP 429 response instead of an HTTP 500.
To do this we need to surface structured errors out of the `anthropic`
crate.
Release Notes:
- N/A
This adds the requirement for users to accept the terms of service the
first time they send a message with the Cloud provider.
Once this is out and in a nightly, we need to add the check to the
server side too, to authenticate access to the models.
Demo:
https://github.com/user-attachments/assets/0edebf74-8120-4fa2-b801-bb76f04e8a17
Release Notes:
- N/A
This PR makes it so staff members will be exempt from rate limiting by
the LLM service.
This is just a temporary measure until we can tweak the rate-limiting
heuristics.
Staff members are still subject to upstream LLM provider rate limits.
Release Notes:
- N/A
When Anthropic releases a new version of their models, Zed AI users
should always get access to the new version even when using an old
version of zed.
Co-Authored-By: Thorsten <thorsten@zed.dev>
Release Notes:
- N/A
Co-authored-by: Thorsten <thorsten@zed.dev>
This PR adjusts how we display the "mode" collab is running in on the
root endpoint.
It's minor, but it does make things a bit cleaner.
Release Notes:
- N/A
This prevents users from accessing other models, such as OpenAI's GPT-4
or Google's Gemini-Pro.
Staff members can still access all models.
Co-authored-by: Thorsten <thorsten@zed.dev>
Release Notes:
- N/A
---------
Co-authored-by: Thorsten <thorsten@zed.dev>
This will help us as we hit issues with the /workflow and step
resolution. We can override the baked-in prompts and make tweaks, then
import our refinements back into the source tree when we're ready.
Release Notes:
- N/A
This PR removes the unused `ignore_checksum_mismatch` parameter to
`run_database_migrations`.
We were always passing `false`, which meant the behavior didn't need to
be parameterized.
Release Notes:
- N/A
This PR puts the initial infrastructure for the LLM service's database
in place.
The LLM service will be using a separate Postgres database, with its own
set of migrations.
Currently we only connect to the database in development, as we don't
yet have the database setup for the staging/production environments.
Release Notes:
- N/A
This PR updates the LLM service to authorize access to language model
providers based on the requester's country.
We detect the country using Cloudflare's
[`CF-IPCountry`](https://developers.cloudflare.com/fundamentals/reference/http-request-headers/#cf-ipcountry)
header.
The country code is then checked against the list of supported countries
for the given LLM provider. Countries that are not supported will
receive an `HTTP 451: Unavailable For Legal Reasons` response.
Release Notes:
- N/A
This PR introduces a separate backend service for making LLM calls.
It exposes an HTTP interface that can be called by Zed clients. To call
these endpoints, the client must provide a `Bearer` token. These tokens
are issued/refreshed by the collab service over RPC.
We're adding this in a backwards-compatible way. Right now the access
tokens can only be minted for Zed staff, and calling this separate LLM
service is behind the `llm-service` feature flag (which is not
automatically enabled for Zed staff).
Release Notes:
- N/A
---------
Co-authored-by: Marshall <marshall@zed.dev>
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
This PR makes it so any Stripe events we receive that occurred over an
hour ago are marked as processed.
We don't want to process an old event long after it occurred and
potentially overwrite more recent updates.
This also makes running collab locally a bit nicer, as we won't be
getting errors for a bunch of older events that will never get processed
successfully.
The period after time after which we consider an event "stale" can be
modified, as needed.
Release Notes:
- N/A
This is just a refactor that we're landing ahead of any functional
changes to make sure we haven't broken anything.
Release Notes:
- N/A
Co-authored-by: Marshall <marshall@zed.dev>
Co-authored-by: Jason <jason@zed.dev>
This PR changes how we report the `geoip_country_code` in the tracing
spans.
I wasn't seeing it come through in the logs, and I think it was because
we didn't declare the field on the initial span.
Release Notes:
- N/A