This PR fixes an issue where active user counts were being computed
across _all_ measures instead of the per-minute measures.
We now compute them using the tokens-per-minute measure, as we're
concerned with usage in recent minutes.
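As a rough sketch of the corrected query shape (the table, column, and measure names here are assumptions, not the service's actual schema):
```rust
use sqlx::PgPool;

// Hypothetical query: count distinct users with recent activity on the
// tokens-per-minute measure only, rather than across every measure.
async fn active_user_count(pool: &PgPool) -> sqlx::Result<i64> {
    let (count,): (i64,) = sqlx::query_as(
        "SELECT COUNT(DISTINCT user_id) \
         FROM usage_measures \
         WHERE measure = 'tokens_per_minute' \
           AND recorded_at > NOW() - INTERVAL '5 minutes'",
    )
    .fetch_one(pool)
    .await?;
    Ok(count)
}
```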
Release Notes:
- N/A
This PR fixes an issue where the active user count was computed across
all models rather than per model.
We now track active user counts on a per-model basis.
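A minimal in-memory sketch of per-model tracking; the real counts come from the stored usage data, so the types and names below are purely illustrative:
```rust
use std::collections::{HashMap, HashSet};

// Illustrative bookkeeping: one set of active user IDs per model name.
#[derive(Default)]
struct ActiveUsers {
    by_model: HashMap<String, HashSet<u64>>,
}

impl ActiveUsers {
    fn record(&mut self, model: &str, user_id: u64) {
        self.by_model
            .entry(model.to_string())
            .or_default()
            .insert(user_id);
    }

    fn count_for(&self, model: &str) -> usize {
        self.by_model.get(model).map_or(0, |users| users.len())
    }
}
```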
Release Notes:
- N/A
This PR fixes the writing of LLM rate limit events to Clickhouse.
We had a typo in the table name: `llm_rate_limits` instead of
`llm_rate_limit_events`.
I also extracted a helper function for writing to Clickhouse so we can
reuse it wherever we need it.
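A minimal sketch of what such a helper can look like with the `clickhouse` crate; the actual helper in the service may differ in name and signature:
```rust
// Hypothetical shared helper: serialize a batch of rows and write them to the
// given Clickhouse table in a single insert.
async fn write_to_clickhouse<T>(
    client: &clickhouse::Client,
    table: &str,
    rows: &[T],
) -> anyhow::Result<()>
where
    T: clickhouse::Row + serde::Serialize,
{
    let mut insert = client.insert(table)?;
    for row in rows {
        insert.write(row).await?;
    }
    insert.end().await?;
    Ok(())
}
```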
Release Notes:
- N/A
This PR reworks how we check model names in the LLM service.
We now normalize model names against the models defined in the
database.
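The idea, sketched without the database plumbing (the names and the prefix-matching rule below are assumptions):
```rust
// Map an incoming model name onto the longest matching model name defined in
// the database, falling back to the name as given if nothing matches.
fn normalize_model_name(known_models: &[String], requested: &str) -> String {
    known_models
        .iter()
        .filter(|known| requested.starts_with(known.as_str()))
        .max_by_key(|known| known.len())
        .cloned()
        .unwrap_or_else(|| requested.to_string())
}
```
With a sketch like this, a dated name such as `claude-3-5-sonnet-20240620` would normalize to `claude-3-5-sonnet` if that shorter name is what the database defines.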
Release Notes:
- N/A
This PR updates the LLM service to include the GitHub login on its
spans.
We need to pass this information through on the LLM token, so it will
temporarily be `None` until this change is deployed and new tokens have
been issued.
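Roughly, the span-side change looks like the following sketch; the field name and claims shape are assumptions, and the field must already be declared on the span (e.g. as `tracing::field::Empty`) for `record` to take effect:
```rust
// Illustrative: record the GitHub login carried by the LLM token on the current
// request span. Older tokens won't carry it, so the field is simply left unset.
fn record_github_login(github_login: Option<&str>) {
    if let Some(login) = github_login {
        tracing::Span::current().record("github_login", login);
    }
}
```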
Release Notes:
- N/A
This PR fixes two issues:
- A database deadlock in `GetLlmToken` for non-staff users
- A typo in the allowed model name for non-staff users
Release Notes:
- N/A
---------
Co-authored-by: Marshall <marshall@zed.dev>
Co-authored-by: Joseph <joseph@zed.dev>
This PR adds the ability to revoke access tokens for the LLM service.
There is a new `revoked_access_tokens` table that contains the
identifiers (`jti`) of revoked access tokens.
To revoke an access token, insert a record into this table:
```sql
insert into revoked_access_tokens (jti) values ('1e887b9e-37f5-49e8-8feb-3274e5a86b67');
```
We now attach the `jti` as `authn.jti` to the tracing spans so that we
can associate an access token with a given request to the LLM service.
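A minimal sketch of the corresponding check during token validation, assuming a SQL lookup by `jti` (the actual query and error handling in the service may differ):
```rust
use sqlx::PgPool;

// Hypothetical revocation check: a token is rejected if its `jti` appears in
// the revoked_access_tokens table.
async fn is_token_revoked(pool: &PgPool, jti: &str) -> sqlx::Result<bool> {
    let row: Option<(i32,)> =
        sqlx::query_as("SELECT 1 FROM revoked_access_tokens WHERE jti = $1")
            .bind(jti)
            .fetch_optional(pool)
            .await?;
    Ok(row.is_some())
}
```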
Release Notes:
- N/A
This PR makes it so Zed staff can use a separate Anthropic API key for
the LLM service.
We also added an `is_staff` column to the `usages` table so that we can
exclude staff usage from the "active users" metrics that influence the
rate limits.
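A sketch of the key selection, with assumed config field names (the real configuration may be shaped differently):
```rust
// Illustrative config: a required Anthropic key plus an optional staff-only key.
struct LlmConfig {
    anthropic_api_key: String,
    anthropic_staff_api_key: Option<String>,
}

// Prefer the staff key for staff requesters, falling back to the regular key
// if no staff key is configured.
fn anthropic_api_key(config: &LlmConfig, is_staff: bool) -> &str {
    if is_staff {
        config
            .anthropic_staff_api_key
            .as_deref()
            .unwrap_or(config.anthropic_api_key.as_str())
    } else {
        config.anthropic_api_key.as_str()
    }
}
```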
Release Notes:
- N/A
---------
Co-authored-by: Max <max@zed.dev>
This PR prevents non-staff users from accessing other models, such as
OpenAI's GPT-4 or Google's Gemini-Pro.
Staff members can still access all models.
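A sketch of the gate only; the allowed-model list below is an assumption for illustration, not the service's actual list:
```rust
// Hypothetical authorization check: non-staff users are limited to an
// explicit allowlist of models, while staff can use anything.
fn authorize_model_access(is_staff: bool, model: &str) -> Result<(), String> {
    const MODELS_AVAILABLE_TO_ALL: &[&str] = &["claude-3-5-sonnet"];
    if is_staff || MODELS_AVAILABLE_TO_ALL.contains(&model) {
        Ok(())
    } else {
        Err(format!("access to model {model:?} is not authorized"))
    }
}
```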
Release Notes:
- N/A
---------
Co-authored-by: Thorsten <thorsten@zed.dev>
This PR removes the unused `ignore_checksum_mismatch` parameter from
`run_database_migrations`.
We were always passing `false`, which meant the behavior didn't need to
be parameterized.
Release Notes:
- N/A
This PR puts the initial infrastructure for the LLM service's database
in place.
The LLM service will be using a separate Postgres database, with its own
set of migrations.
Currently we only connect to the database in development, as we don't
yet have the database set up for the staging/production environments.
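A sketch of the development-only connection, assuming the LLM database URL is simply absent from the config in environments that don't have the database yet:
```rust
use sqlx::PgPool;

// Hypothetical: connect only when an LLM database URL is configured; otherwise
// the service runs without the LLM database (staging/production today).
async fn maybe_connect_llm_database(
    database_url: Option<&str>,
) -> anyhow::Result<Option<PgPool>> {
    match database_url {
        Some(url) => Ok(Some(PgPool::connect(url).await?)),
        None => Ok(None),
    }
}
```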
Release Notes:
- N/A
This PR updates the LLM service to authorize access to language model
providers based on the requester's country.
We detect the country using Cloudflare's
[`CF-IPCountry`](https://developers.cloudflare.com/fundamentals/reference/http-request-headers/#cf-ipcountry)
header.
The country code is then checked against the list of supported countries
for the given LLM provider. Requests from unsupported countries receive
an `HTTP 451: Unavailable For Legal Reasons` response.
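A minimal sketch of the check, written as an axum-style guard; the `CF-IPCountry` header is real, but the supported-country lookup and fallback value are assumptions:
```rust
use axum::http::{HeaderMap, StatusCode};

// Hypothetical guard: read Cloudflare's CF-IPCountry header and reject requests
// from countries the provider doesn't support with HTTP 451.
fn check_country(headers: &HeaderMap, supported_countries: &[&str]) -> Result<(), StatusCode> {
    let country = headers
        .get("CF-IPCountry")
        .and_then(|value| value.to_str().ok())
        .unwrap_or("XX");
    if supported_countries.contains(&country) {
        Ok(())
    } else {
        Err(StatusCode::UNAVAILABLE_FOR_LEGAL_REASONS)
    }
}
```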
Release Notes:
- N/A
This PR introduces a separate backend service for making LLM calls.
It exposes an HTTP interface that can be called by Zed clients. To call
these endpoints, the client must provide a `Bearer` token. These tokens
are issued/refreshed by the collab service over RPC.
We're adding this in a backwards-compatible way. Right now the access
tokens can only be minted for Zed staff, and calling this separate LLM
service is behind the `llm-service` feature flag (which is not
automatically enabled for Zed staff).
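For illustration, a client-side call shape with the `Bearer` token; the URL, path, and payload here are placeholders, not the actual API:
```rust
// Purely illustrative: authenticate against the LLM service with a Bearer token
// issued/refreshed by collab, and return the response body as text.
async fn call_llm_service(token: &str, body: &serde_json::Value) -> anyhow::Result<String> {
    let response = reqwest::Client::new()
        .post("https://llm.example.com/completion")
        .bearer_auth(token)
        .json(body)
        .send()
        .await?
        .error_for_status()?;
    Ok(response.text().await?)
}
```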
Release Notes:
- N/A
---------
Co-authored-by: Marshall <marshall@zed.dev>
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>