Commit graph

23 commits

Author SHA1 Message Date
Marshall Bowers
93a7682659
collab: Count active users based on the tokens per minute measure (#16911)
This PR fixes an issue where active user counts were being computed
across _all_ measures instead of the per-minute measures.

We now compute them using the tokens per minute measure, as we're
concerned with usage in recent minutes.

Release Notes:

- N/A
2024-08-26 15:04:55 -04:00
Marshall Bowers
0229d3ccac
collab: Track active user counts independently for each model (#16624)
This PR fixes an issue where the active user count spanned individual
models.

We now track the active user counts on a per-model basis.

Release Notes:

- N/A
2024-08-21 17:19:47 -04:00
Max Brunsfeld
b5bd8a5c5d
Add logic for closed beta LLM models (#16482)
Release Notes:

- N/A

---------

Co-authored-by: Marshall <marshall@zed.dev>
2024-08-19 11:09:52 -07:00
Max Brunsfeld
1b1070e0f7
Add tracing needed for LLM rate limit dashboards (#16388)
Release Notes:

- N/A

---------

Co-authored-by: Marshall <marshall@zed.dev>
2024-08-16 17:52:31 -04:00
Marshall Bowers
a9441879c3
collab: Fix writing LLM rate limit events to Clickhouse (#16367)
This PR fixes the writing of LLM rate limit events to Clickhouse.

We had a table in the table name: `llm_rate_limits` instead of
`llm_rate_limit_events`.

I also extracted a helper function to write to Clickhouse so we can use
it anywhere we need to.

Release Notes:

- N/A
2024-08-16 14:03:34 -04:00
Marshall Bowers
7a5acc0b0c
collab: Rework model name checks (#16365)
This PR reworks how we do checks for model names in the LLM service.

We now normalize the model names using the models defined in the
database.

Release Notes:

- N/A
2024-08-16 13:54:28 -04:00
Marshall Bowers
9233418cb8
collab: Attach GitHub login to LLM spans (#16316)
This PR updates the LLM service to include the GitHub login on its
spans.

We need to pass this information through on the LLM token, so it will
temporarily be `None` until this change is deployed and new tokens have
been issued.

Release Notes:

- N/A
2024-08-15 17:06:20 -04:00
Max Brunsfeld
6b7664ef4a
Fix bugs preventing non-staff users from using LLM service (#16307)
- db deadlock in GetLlmToken for non-staff users
- typo in allowed model name for non-staff users

Release Notes:

- N/A

---------

Co-authored-by: Marshall <marshall@zed.dev>
Co-authored-by: Joseph <joseph@zed.dev>
2024-08-15 11:21:19 -07:00
Marshall Bowers
b4c22cc861
collab: Add ability to revoke LLM service access tokens (#16143)
This PR adds the ability to revoke access tokens for the LLM service.

There is a new `revoked_access_tokens` table that contains the
identifiers (`jti`) of revoked access tokens.

To revoke an access token, insert a record into this table:

```sql
insert into revoked_access_tokens (jti) values ('1e887b9e-37f5-49e8-8feb-3274e5a86b67');
```

We now attach the `jti` as `authn.jti` to the tracing spans so that we
can associate an access token with a given request to the LLM service.

Release Notes:

- N/A
2024-08-12 21:47:05 -04:00
Max Brunsfeld
dbcd06642c
Track lifetime spending for each user and model (#16137)
Release Notes:

- N/A

Co-authored-by: Marshall <marshall@zed.dev>
2024-08-12 20:15:26 -04:00
Max Brunsfeld
a3c79218c4
Report telemetry events for rate limit errors (#16130)
clickhouse telemetry schema:

```
CREATE TABLE default.llm_rate_limit_events
(
    `time` DateTime64(3),
    `user_id` Int32,
    `is_staff` Bool,
    `plan` LowCardinality(String),
    `model` String,
    `provider` LowCardinality(String),
    `usage_measure` LowCardinality(String),
    `requests_this_minute` UInt64,
    `tokens_this_minute` UInt64,
    `tokens_this_day` UInt64,
    `max_requests_per_minute` UInt64,
    `max_tokens_per_minute` UInt64,
    `max_tokens_per_day` UInt64,
    `users_in_recent_minutes` UInt64,
    `users_in_recent_days` UInt64
)
ORDER BY tuple()
```

Release Notes:

- N/A

Co-authored-by: Marshall <marshall@zed.dev>
2024-08-12 16:31:11 -04:00
Marshall Bowers
f3ec8d425f
collab: Use a separate Anthropic API key for Zed staff (#16128)
This PR makes it so Zed staff can use a separate Anthropic API key for
the LLM service.

We also added an `is_staff` column to the `usages` table so that we can
exclude staff usage from the "active users" metrics that influence the
rate limits.

Release Notes:

- N/A

---------

Co-authored-by: Max <max@zed.dev>
2024-08-12 15:20:34 -04:00
Max Brunsfeld
33e120d964
Capture telemetry data on per-user monthly LLM spending (#16050)
Release Notes:

- N/A

---------

Co-authored-by: Marshall <marshall@zed.dev>
2024-08-09 16:38:37 -07:00
Max Brunsfeld
8688b2ad19
Add telemetry for LLM usage (#16049)
Release Notes:

- N/A

Co-authored-by: Marshall <marshall@zed.dev>
2024-08-09 18:15:57 -04:00
Max Brunsfeld
423c7b999a
Larger rate limit integers (#16047)
Tokens per day may exceed the range of Postgres's 32-bit `integer` data
type.

Release Notes:

- N/A

Co-authored-by: Marshall <marshall@zed.dev>
2024-08-09 14:07:49 -07:00
Max Brunsfeld
240b7c641c
Fix llm queries (#16006)
Release Notes:

- N/A

---------

Co-authored-by: Marshall <marshall@zed.dev>
2024-08-08 17:21:38 -07:00
Max Brunsfeld
06625bfe94
Apply rate limits in LLM service (#15997)
Release Notes:

- N/A

---------

Co-authored-by: Marshall <marshall@zed.dev>
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
2024-08-08 15:46:33 -07:00
Bennet Bo Fenner
3a52d6cc52
assistant: Limit model access for Zed AI users to Claude-3.5-sonnet (#15904)
This prevents users from accessing other models, such as OpenAI's GPT-4
or Google's Gemini-Pro.
Staff members can still access all models.

Co-authored-by: Thorsten <thorsten@zed.dev>

Release Notes:

- N/A

---------

Co-authored-by: Thorsten <thorsten@zed.dev>
2024-08-07 16:26:56 +02:00
Marshall Bowers
a54e16b7ea
collab: Add usages table to LLM database (#15884)
This PR adds a `usages` table to the LLM database.

We'll use this to track usage for rate-limiting purposes.

Release Notes:

- N/A
2024-08-06 18:40:10 -04:00
Marshall Bowers
b19f85f9b5
collab: Remove unused parameter to run_database_migrations (#15883)
This PR removes the unused `ignore_checksum_mismatch` parameter to
`run_database_migrations`.

We were always passing `false`, which meant the behavior didn't need to
be parameterized.

Release Notes:

- N/A
2024-08-06 17:31:52 -04:00
Marshall Bowers
7f6d0919c9
collab: Setup database for LLM service (#15882)
This PR puts the initial infrastructure for the LLM service's database
in place.

The LLM service will be using a separate Postgres database, with its own
set of migrations.

Currently we only connect to the database in development, as we don't
yet have the database setup for the staging/production environments.

Release Notes:

- N/A
2024-08-06 17:18:08 -04:00
Marshall Bowers
cf5f4dddf5
Authorize access to language model providers based on country (#15859)
This PR updates the LLM service to authorize access to language model
providers based on the requester's country.

We detect the country using Cloudflare's
[`CF-IPCountry`](https://developers.cloudflare.com/fundamentals/reference/http-request-headers/#cf-ipcountry)
header.

The country code is then checked against the list of supported countries
for the given LLM provider. Countries that are not supported will
receive an `HTTP 451: Unavailable For Legal Reasons` response.

Release Notes:

- N/A
2024-08-06 11:49:04 -04:00
Max Brunsfeld
8e9c2b1125
Introduce a separate backend service for LLM calls (#15831)
This PR introduces a separate backend service for making LLM calls.

It exposes an HTTP interface that can be called by Zed clients. To call
these endpoints, the client must provide a `Bearer` token. These tokens
are issued/refreshed by the collab service over RPC.

We're adding this in a backwards-compatible way. Right now the access
tokens can only be minted for Zed staff, and calling this separate LLM
service is behind the `llm-service` feature flag (which is not
automatically enabled for Zed staff).

Release Notes:

- N/A

---------

Co-authored-by: Marshall <marshall@zed.dev>
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
2024-08-05 20:26:21 -04:00