Yehowshua/ZIm - Forgejo: Beyond coding. We Forge.

Author	SHA1	Message	Date
Marshall Bowers	93a7682659	collab: Count active users based on the tokens per minute measure (#16911 ) This PR fixes an issue where active user counts were being computed across _all_ measures instead of the per-minute measures. We now compute them using the tokens per minute measure, as we're concerned with usage in recent minutes. Release Notes: - N/A	2024-08-26 15:04:55 -04:00
Marshall Bowers	0229d3ccac	collab: Track active user counts independently for each model (#16624 ) This PR fixes an issue where the active user count spanned individual models. We now track the active user counts on a per-model basis. Release Notes: - N/A	2024-08-21 17:19:47 -04:00
Max Brunsfeld	b5bd8a5c5d	Add logic for closed beta LLM models (#16482 ) Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev>	2024-08-19 11:09:52 -07:00
Max Brunsfeld	1b1070e0f7	Add tracing needed for LLM rate limit dashboards (#16388 ) Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev>	2024-08-16 17:52:31 -04:00
Marshall Bowers	a9441879c3	collab: Fix writing LLM rate limit events to Clickhouse (#16367 ) This PR fixes the writing of LLM rate limit events to Clickhouse. We had a typo in the table name: `llm_rate_limits` instead of `llm_rate_limit_events`. I also extracted a helper function to write to Clickhouse so we can use it anywhere we need to. Release Notes: - N/A	2024-08-16 14:03:34 -04:00
Marshall Bowers	7a5acc0b0c	collab: Rework model name checks (#16365 ) This PR reworks how we do checks for model names in the LLM service. We now normalize the model names using the models defined in the database. Release Notes: - N/A	2024-08-16 13:54:28 -04:00
Marshall Bowers	9233418cb8	collab: Attach GitHub login to LLM spans (#16316 ) This PR updates the LLM service to include the GitHub login on its spans. We need to pass this information through on the LLM token, so it will temporarily be `None` until this change is deployed and new tokens have been issued. Release Notes: - N/A	2024-08-15 17:06:20 -04:00
Max Brunsfeld	6b7664ef4a	Fix bugs preventing non-staff users from using LLM service (#16307 ) - db deadlock in GetLlmToken for non-staff users - typo in allowed model name for non-staff users Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev> Co-authored-by: Joseph <joseph@zed.dev>	2024-08-15 11:21:19 -07:00
Marshall Bowers	b4c22cc861	collab: Add ability to revoke LLM service access tokens (#16143 ) This PR adds the ability to revoke access tokens for the LLM service. There is a new `revoked_access_tokens` table that contains the identifiers (`jti`) of revoked access tokens. To revoke an access token, insert a record into this table: ```sql insert into revoked_access_tokens (jti) values ('1e887b9e-37f5-49e8-8feb-3274e5a86b67'); ``` We now attach the `jti` as `authn.jti` to the tracing spans so that we can associate an access token with a given request to the LLM service. Release Notes: - N/A	2024-08-12 21:47:05 -04:00
Max Brunsfeld	dbcd06642c	Track lifetime spending for each user and model (#16137 ) Release Notes: - N/A Co-authored-by: Marshall <marshall@zed.dev>	2024-08-12 20:15:26 -04:00
Max Brunsfeld	a3c79218c4	Report telemetry events for rate limit errors (#16130 ) clickhouse telemetry schema: ``` CREATE TABLE default.llm_rate_limit_events ( `time` DateTime64(3), `user_id` Int32, `is_staff` Bool, `plan` LowCardinality(String), `model` String, `provider` LowCardinality(String), `usage_measure` LowCardinality(String), `requests_this_minute` UInt64, `tokens_this_minute` UInt64, `tokens_this_day` UInt64, `max_requests_per_minute` UInt64, `max_tokens_per_minute` UInt64, `max_tokens_per_day` UInt64, `users_in_recent_minutes` UInt64, `users_in_recent_days` UInt64 ) ORDER BY tuple() ``` Release Notes: - N/A Co-authored-by: Marshall <marshall@zed.dev>	2024-08-12 16:31:11 -04:00
Marshall Bowers	f3ec8d425f	collab: Use a separate Anthropic API key for Zed staff (#16128 ) This PR makes it so Zed staff can use a separate Anthropic API key for the LLM service. We also added an `is_staff` column to the `usages` table so that we can exclude staff usage from the "active users" metrics that influence the rate limits. Release Notes: - N/A --------- Co-authored-by: Max <max@zed.dev>	2024-08-12 15:20:34 -04:00
Max Brunsfeld	33e120d964	Capture telemetry data on per-user monthly LLM spending (#16050 ) Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev>	2024-08-09 16:38:37 -07:00
Max Brunsfeld	8688b2ad19	Add telemetry for LLM usage (#16049 ) Release Notes: - N/A Co-authored-by: Marshall <marshall@zed.dev>	2024-08-09 18:15:57 -04:00
Max Brunsfeld	423c7b999a	Larger rate limit integers (#16047 ) Tokens per day may exceed the range of Postgres's 32-bit `integer` data type. Release Notes: - N/A Co-authored-by: Marshall <marshall@zed.dev>	2024-08-09 14:07:49 -07:00
Max Brunsfeld	240b7c641c	Fix llm queries (#16006 ) Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev>	2024-08-08 17:21:38 -07:00
Max Brunsfeld	06625bfe94	Apply rate limits in LLM service (#15997 ) Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev> Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>	2024-08-08 15:46:33 -07:00
Bennet Bo Fenner	3a52d6cc52	assistant: Limit model access for Zed AI users to Claude-3.5-sonnet (#15904 ) This prevents users from accessing other models, such as OpenAI's GPT-4 or Google's Gemini-Pro. Staff members can still access all models. Co-authored-by: Thorsten <thorsten@zed.dev> Release Notes: - N/A --------- Co-authored-by: Thorsten <thorsten@zed.dev>	2024-08-07 16:26:56 +02:00
Marshall Bowers	a54e16b7ea	collab: Add `usages` table to LLM database (#15884 ) This PR adds a `usages` table to the LLM database. We'll use this to track usage for rate-limiting purposes. Release Notes: - N/A	2024-08-06 18:40:10 -04:00
Marshall Bowers	b19f85f9b5	collab: Remove unused parameter to `run_database_migrations` (#15883 ) This PR removes the unused `ignore_checksum_mismatch` parameter to `run_database_migrations`. We were always passing `false`, which meant the behavior didn't need to be parameterized. Release Notes: - N/A	2024-08-06 17:31:52 -04:00
Marshall Bowers	7f6d0919c9	collab: Setup database for LLM service (#15882 ) This PR puts the initial infrastructure for the LLM service's database in place. The LLM service will be using a separate Postgres database, with its own set of migrations. Currently we only connect to the database in development, as we don't yet have the database set up for the staging/production environments. Release Notes: - N/A	2024-08-06 17:18:08 -04:00
Marshall Bowers	cf5f4dddf5	Authorize access to language model providers based on country (#15859 ) This PR updates the LLM service to authorize access to language model providers based on the requester's country. We detect the country using Cloudflare's [`CF-IPCountry`](https://developers.cloudflare.com/fundamentals/reference/http-request-headers/#cf-ipcountry) header. The country code is then checked against the list of supported countries for the given LLM provider. Countries that are not supported will receive an `HTTP 451: Unavailable For Legal Reasons` response. Release Notes: - N/A	2024-08-06 11:49:04 -04:00
Max Brunsfeld	8e9c2b1125	Introduce a separate backend service for LLM calls (#15831 ) This PR introduces a separate backend service for making LLM calls. It exposes an HTTP interface that can be called by Zed clients. To call these endpoints, the client must provide a `Bearer` token. These tokens are issued/refreshed by the collab service over RPC. We're adding this in a backwards-compatible way. Right now the access tokens can only be minted for Zed staff, and calling this separate LLM service is behind the `llm-service` feature flag (which is not automatically enabled for Zed staff). Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev> Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>	2024-08-05 20:26:21 -04:00
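
The country-based authorization described in commit cf5f4dddf5 could look roughly like the sketch below. It is an illustration only, not the collab implementation: the `SUPPORTED_COUNTRIES` table, its contents, and the `authorize_country` function are hypothetical, and treating a missing `CF-IPCountry` header as unsupported is an assumption the log does not confirm.

```python
from http import HTTPStatus

# Hypothetical allow-lists. The real per-provider country lists are not
# given in the commit log.
SUPPORTED_COUNTRIES = {
    "anthropic": {"US", "DE", "JP"},
    "openai": {"US", "DE"},
}


def authorize_country(provider: str, headers: dict[str, str]) -> HTTPStatus:
    """Return 200 if the requester's country is allowed for `provider`, else 451."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    country = normalized.get("cf-ipcountry", "").strip().upper()

    # Assumption: an unknown provider or a missing header counts as unsupported.
    allowed = SUPPORTED_COUNTRIES.get(provider, set())
    if country in allowed:
        return HTTPStatus.OK
    return HTTPStatus.UNAVAILABLE_FOR_LEGAL_REASONS
```

A request carrying `CF-IPCountry: JP` would pass for the hypothetical `anthropic` list here and be rejected with status 451 for `openai`.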

23 commits
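
The token revocation flow from commit b4c22cc861 can be pictured with a small in-memory sketch. The `RevocationList` class and `validate_claims` function are hypothetical stand-ins: the real service stores revoked `jti` values in the `revoked_access_tokens` table, and how it performs the lookup is not shown in the log. Rejecting a token that has no `jti` is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class RevocationList:
    """In-memory stand-in for the `revoked_access_tokens` table."""

    revoked_jtis: set[str] = field(default_factory=set)

    def revoke(self, jti: str) -> None:
        # Mirrors: insert into revoked_access_tokens (jti) values (...)
        self.revoked_jtis.add(jti)

    def is_revoked(self, jti: str) -> bool:
        return jti in self.revoked_jtis


def validate_claims(claims: dict, revocations: RevocationList) -> bool:
    """Accept a token only if it carries a `jti` that has not been revoked."""
    jti = claims.get("jti")
    if jti is None:
        # Assumption: a token without an identifier cannot be checked, so reject it.
        return False
    return not revocations.is_revoked(jti)
```

Revoking `1e887b9e-37f5-49e8-8feb-3274e5a86b67` through `RevocationList.revoke` makes `validate_claims` return `False` for any token carrying that `jti`, while tokens with other identifiers are still accepted.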