Yehowshua/ZIm

Author	SHA1	Message	Date
Richard Feldman	91ffa02e2c	/auto (#16696 ) Add `/auto` behind a feature flag that's disabled for now, even for staff. We've decided on a different design for context inference, but there are parts of /auto that will be useful for that, so we want them in the code base even if they're unused for now. Release Notes: - N/A --------- Co-authored-by: Antonio Scandurra <me@as-cii.com> Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>	2024-09-13 13:17:49 -04:00
Piotr Osiewicz	e6c1c51b37	chore: Fix several style lints (#17488 ) It's not comprehensive enough to start linting on `style` group, but hey, it's a start. Release Notes: - N/A	2024-09-06 11:58:39 +02:00
Marshall Bowers	30056254f3	collab: Add `GET /models` endpoint to LLM service (#17307 ) This PR adds a `GET /models` endpoint to the LLM service. This endpoint returns the models that the authenticated user has access to. This is the first step towards populating the models for the hosted service from the server. Release Notes: - N/A	2024-09-03 11:41:32 -04:00
Peter Tripp	4d6bb52d1f	Anthropic/OpenAI: Add country codes for territories (#17089 ) - Cloudflare provides ISO 3166-1 country codes for protectorates. Expand our allowlist to include the territories of countries already on the allowlist (US, UK, France, Australia, New Zealand). - Also include the `country_code` in the error message when we block. Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>	2024-08-29 11:32:29 -04:00
Marshall Bowers	93a7682659	collab: Count active users based on the tokens per minute measure (#16911 ) This PR fixes an issue where active user counts were being computed across _all_ measures instead of the per-minute measures. We now compute them using the tokens per minute measure, as we're concerned with usage in recent minutes. Release Notes: - N/A	2024-08-26 15:04:55 -04:00
Marshall Bowers	0229d3ccac	collab: Track active user counts independently for each model (#16624 ) This PR fixes an issue where the active user count spanned individual models. We now track the active user counts on a per-model basis. Release Notes: - N/A	2024-08-21 17:19:47 -04:00
Max Brunsfeld	b5bd8a5c5d	Add logic for closed beta LLM models (#16482 ) Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev>	2024-08-19 11:09:52 -07:00
Max Brunsfeld	1b1070e0f7	Add tracing needed for LLM rate limit dashboards (#16388 ) Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev>	2024-08-16 17:52:31 -04:00
Marshall Bowers	a9441879c3	collab: Fix writing LLM rate limit events to Clickhouse (#16367 ) This PR fixes the writing of LLM rate limit events to Clickhouse. We had a typo in the table name: `llm_rate_limits` instead of `llm_rate_limit_events`. I also extracted a helper function to write to Clickhouse so we can use it anywhere we need to. Release Notes: - N/A	2024-08-16 14:03:34 -04:00
Marshall Bowers	7a5acc0b0c	collab: Rework model name checks (#16365 ) This PR reworks how we do checks for model names in the LLM service. We now normalize the model names using the models defined in the database. Release Notes: - N/A	2024-08-16 13:54:28 -04:00
Marshall Bowers	9233418cb8	collab: Attach GitHub login to LLM spans (#16316 ) This PR updates the LLM service to include the GitHub login on its spans. We need to pass this information through on the LLM token, so it will temporarily be `None` until this change is deployed and new tokens have been issued. Release Notes: - N/A	2024-08-15 17:06:20 -04:00
Max Brunsfeld	6b7664ef4a	Fix bugs preventing non-staff users from using LLM service (#16307 ) - db deadlock in GetLlmToken for non-staff users - typo in allowed model name for non-staff users Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev> Co-authored-by: Joseph <joseph@zed.dev>	2024-08-15 11:21:19 -07:00
Marshall Bowers	b4c22cc861	collab: Add ability to revoke LLM service access tokens (#16143 ) This PR adds the ability to revoke access tokens for the LLM service. There is a new `revoked_access_tokens` table that contains the identifiers (`jti`) of revoked access tokens. To revoke an access token, insert a record into this table: `insert into revoked_access_tokens (jti) values ('1e887b9e-37f5-49e8-8feb-3274e5a86b67');` We now attach the `jti` as `authn.jti` to the tracing spans so that we can associate an access token with a given request to the LLM service. Release Notes: - N/A	2024-08-12 21:47:05 -04:00
Max Brunsfeld	dbcd06642c	Track lifetime spending for each user and model (#16137 ) Release Notes: - N/A Co-authored-by: Marshall <marshall@zed.dev>	2024-08-12 20:15:26 -04:00
Max Brunsfeld	a3c79218c4	Report telemetry events for rate limit errors (#16130 ) clickhouse telemetry schema: ``` CREATE TABLE default.llm_rate_limit_events ( `time` DateTime64(3), `user_id` Int32, `is_staff` Bool, `plan` LowCardinality(String), `model` String, `provider` LowCardinality(String), `usage_measure` LowCardinality(String), `requests_this_minute` UInt64, `tokens_this_minute` UInt64, `tokens_this_day` UInt64, `max_requests_per_minute` UInt64, `max_tokens_per_minute` UInt64, `max_tokens_per_day` UInt64, `users_in_recent_minutes` UInt64, `users_in_recent_days` UInt64 ) ORDER BY tuple() ``` Release Notes: - N/A Co-authored-by: Marshall <marshall@zed.dev>	2024-08-12 16:31:11 -04:00
Marshall Bowers	f3ec8d425f	collab: Use a separate Anthropic API key for Zed staff (#16128 ) This PR makes it so Zed staff can use a separate Anthropic API key for the LLM service. We also added an `is_staff` column to the `usages` table so that we can exclude staff usage from the "active users" metrics that influence the rate limits. Release Notes: - N/A --------- Co-authored-by: Max <max@zed.dev>	2024-08-12 15:20:34 -04:00
Max Brunsfeld	33e120d964	Capture telemetry data on per-user monthly LLM spending (#16050 ) Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev>	2024-08-09 16:38:37 -07:00
Max Brunsfeld	8688b2ad19	Add telemetry for LLM usage (#16049 ) Release Notes: - N/A Co-authored-by: Marshall <marshall@zed.dev>	2024-08-09 18:15:57 -04:00
Max Brunsfeld	423c7b999a	Larger rate limit integers (#16047 ) Tokens per day may exceed the range of Postgres's 32-bit `integer` data type. Release Notes: - N/A Co-authored-by: Marshall <marshall@zed.dev>	2024-08-09 14:07:49 -07:00
Max Brunsfeld	240b7c641c	Fix llm queries (#16006 ) Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev>	2024-08-08 17:21:38 -07:00
Max Brunsfeld	06625bfe94	Apply rate limits in LLM service (#15997 ) Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev> Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>	2024-08-08 15:46:33 -07:00
Bennet Bo Fenner	3a52d6cc52	assistant: Limit model access for Zed AI users to Claude-3.5-sonnet (#15904 ) This prevents users from accessing other models, such as OpenAI's GPT-4 or Google's Gemini-Pro. Staff members can still access all models. Co-authored-by: Thorsten <thorsten@zed.dev> Release Notes: - N/A --------- Co-authored-by: Thorsten <thorsten@zed.dev>	2024-08-07 16:26:56 +02:00
Marshall Bowers	a54e16b7ea	collab: Add `usages` table to LLM database (#15884 ) This PR adds a `usages` table to the LLM database. We'll use this to track usage for rate-limiting purposes. Release Notes: - N/A	2024-08-06 18:40:10 -04:00
Marshall Bowers	b19f85f9b5	collab: Remove unused parameter to `run_database_migrations` (#15883 ) This PR removes the unused `ignore_checksum_mismatch` parameter to `run_database_migrations`. We were always passing `false`, which meant the behavior didn't need to be parameterized. Release Notes: - N/A	2024-08-06 17:31:52 -04:00
Marshall Bowers	7f6d0919c9	collab: Setup database for LLM service (#15882 ) This PR puts the initial infrastructure for the LLM service's database in place. The LLM service will be using a separate Postgres database, with its own set of migrations. Currently we only connect to the database in development, as we don't yet have the database setup for the staging/production environments. Release Notes: - N/A	2024-08-06 17:18:08 -04:00
Marshall Bowers	cf5f4dddf5	Authorize access to language model providers based on country (#15859 ) This PR updates the LLM service to authorize access to language model providers based on the requester's country. We detect the country using Cloudflare's [`CF-IPCountry`](https://developers.cloudflare.com/fundamentals/reference/http-request-headers/#cf-ipcountry) header. The country code is then checked against the list of supported countries for the given LLM provider. Countries that are not supported will receive an `HTTP 451: Unavailable For Legal Reasons` response. Release Notes: - N/A	2024-08-06 11:49:04 -04:00
Max Brunsfeld	8e9c2b1125	Introduce a separate backend service for LLM calls (#15831 ) This PR introduces a separate backend service for making LLM calls. It exposes an HTTP interface that can be called by Zed clients. To call these endpoints, the client must provide a `Bearer` token. These tokens are issued/refreshed by the collab service over RPC. We're adding this in a backwards-compatible way. Right now the access tokens can only be minted for Zed staff, and calling this separate LLM service is behind the `llm-service` feature flag (which is not automatically enabled for Zed staff). Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev> Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>	2024-08-05 20:26:21 -04:00
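
Commit b4c22cc861 describes revoking access tokens by recording their `jti` in a `revoked_access_tokens` table. The check this implies can be sketched roughly as below. This is only an illustration: the `Claims`, `RevokedTokens`, `AuthError`, and `authorize` names are hypothetical, and an in-memory set stands in for the Postgres table. None of it is taken from the actual service code.

```rust
use std::collections::HashSet;

/// Hypothetical claims carried by an access token.
/// Only the `jti` identifier is mentioned in the commit message.
struct Claims {
    jti: String,
}

/// In-memory stand-in (an assumption) for the `revoked_access_tokens` table.
struct RevokedTokens {
    jtis: HashSet<String>,
}

impl RevokedTokens {
    fn new() -> Self {
        Self { jtis: HashSet::new() }
    }

    /// Mirrors inserting a row into `revoked_access_tokens (jti)`.
    fn revoke(&mut self, jti: &str) {
        self.jtis.insert(jti.to_string());
    }

    fn is_revoked(&self, jti: &str) -> bool {
        self.jtis.contains(jti)
    }
}

/// Hypothetical error type for a rejected token.
#[derive(Debug, PartialEq)]
enum AuthError {
    Revoked,
}

/// Rejects a request whose token identifier has been revoked.
fn authorize(claims: &Claims, revoked: &RevokedTokens) -> Result<(), AuthError> {
    if revoked.is_revoked(&claims.jti) {
        Err(AuthError::Revoked)
    } else {
        Ok(())
    }
}
```

In this sketch a token passes `authorize` until its `jti` is handed to `revoke`, after which every request carrying it is rejected. A real service would presumably query the database, possibly with caching, instead of a local set.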

27 commits
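
Commit cf5f4dddf5 describes checking the `CF-IPCountry` header against a per-provider list of supported countries and answering unsupported countries with HTTP 451. A minimal sketch of that flow follows. The `Provider` enum, the `supported_countries` and `authorize_country` functions, and the country lists are all placeholders invented for illustration, not the service's real allowlists. Rejecting a request with no header is also an assumption, since the commit message does not say how that case is handled.

```rust
/// HTTP 451: Unavailable For Legal Reasons.
const UNAVAILABLE_FOR_LEGAL_REASONS: u16 = 451;

/// Hypothetical set of LLM providers.
#[derive(Clone, Copy)]
enum Provider {
    Anthropic,
    OpenAi,
}

/// Placeholder allowlists (NOT the real lists used by the service).
fn supported_countries(provider: Provider) -> &'static [&'static str] {
    match provider {
        Provider::Anthropic => &["US", "GB", "FR"],
        Provider::OpenAi => &["US", "GB", "FR", "AU", "NZ"],
    }
}

/// Returns Ok(()) when the country is allowed for the provider,
/// otherwise the HTTP status code to respond with.
/// Assumption: a missing header is treated as unsupported.
fn authorize_country(provider: Provider, cf_ipcountry: Option<&str>) -> Result<(), u16> {
    let code = match cf_ipcountry {
        Some(code) => code.trim(),
        None => return Err(UNAVAILABLE_FOR_LEGAL_REASONS),
    };

    let allowed = supported_countries(provider)
        .iter()
        .any(|country| country.eq_ignore_ascii_case(code));

    if allowed {
        Ok(())
    } else {
        Err(UNAVAILABLE_FOR_LEGAL_REASONS)
    }
}
```

Keeping one allowlist per provider matches the commit's wording ("the list of supported countries for the given LLM provider"). The later territories change (4d6bb52d1f) would then amount to extending those lists with more country codes.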