Yehowshua/ZIm - Forgejo

Author	SHA1	Message	Date
Marshall Bowers	340662e2f7	collab: Add lifetime spending limit for LLM usage (#16780 ) This PR adds a lifetime spending limit on LLM usage. Exceeding this limit will prevent further use of the Zed LLM provider. Currently the cap is $1,000. Release Notes: - N/A	2024-08-23 16:41:16 -04:00
Marshall Bowers	1d986b0c77	collab: Report active user counts separately, as well (#16629 ) This PR adds additional reporting of the active user counts as separate logs. We were already reporting these on individual rate limit events/logs, but it seems like something that would be good to report on independent of user activity. Release Notes: - N/A	2024-08-21 18:15:15 -04:00
Marshall Bowers	0229d3ccac	collab: Track active user counts independently for each model (#16624 ) This PR fixes an issue where the active user count spanned individual models. We now track the active user counts on a per-model basis. Release Notes: - N/A	2024-08-21 17:19:47 -04:00
Marshall Bowers	96bcceed40	collab: Add traces for user LLM rate limits (#16610 ) This PR adds traces for when users hit LLM rate limits. We were already emitting telemetry events for these to ClickHouse, but it will be handy to have them available in Axiom as well. Release Notes: - N/A	2024-08-21 15:13:55 -04:00
Marshall Bowers	de41c151c8	collab: Add `is_staff` to upstream rate limit spans (#16463 ) This PR adds the `is_staff` field to the `upstream rate limit` spans. Since we use different API keys for staff vs non-staff, it will be useful to break down the rate limits accordingly. Release Notes: - N/A	2024-08-19 10:15:25 -04:00
Marshall Bowers	3d997e5fd6	collab: Add `is_staff` to spans (#16389 ) This PR adds the `is_staff` field to our LLM spans so that we can distinguish between staff and non-staff traffic. Release Notes: - N/A	2024-08-16 18:42:44 -04:00
Max Brunsfeld	1b1070e0f7	Add tracing needed for LLM rate limit dashboards (#16388 ) Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev>	2024-08-16 17:52:31 -04:00
Marshall Bowers	7a5acc0b0c	collab: Rework model name checks (#16365 ) This PR reworks how we do checks for model names in the LLM service. We now normalize the model names using the models defined in the database. Release Notes: - N/A	2024-08-16 13:54:28 -04:00
Marshall Bowers	9233418cb8	collab: Attach GitHub login to LLM spans (#16316 ) This PR updates the LLM service to include the GitHub login on its spans. We need to pass this information through on the LLM token, so it will temporarily be `None` until this change is deployed and new tokens have been issued. Release Notes: - N/A	2024-08-15 17:06:20 -04:00
Marshall Bowers	5e05821d18	collab: Attach `user_id` to LLM spans (#16311 ) This PR updates the LLM service to attach the user ID to the spans. Release Notes: - N/A	2024-08-15 15:49:12 -04:00
Marshall Bowers	b4c22cc861	collab: Add ability to revoke LLM service access tokens (#16143 ) This PR adds the ability to revoke access tokens for the LLM service. There is a new `revoked_access_tokens` table that contains the identifiers (`jti`) of revoked access tokens. To revoke an access token, insert a record into this table: ```sql insert into revoked_access_tokens (jti) values ('1e887b9e-37f5-49e8-8feb-3274e5a86b67'); ``` We now attach the `jti` as `authn.jti` to the tracing spans so that we can associate an access token with a given request to the LLM service. Release Notes: - N/A	2024-08-12 21:47:05 -04:00
Max Brunsfeld	dbcd06642c	Track lifetime spending for each user and model (#16137 ) Release Notes: - N/A Co-authored-by: Marshall <marshall@zed.dev>	2024-08-12 20:15:26 -04:00
Max Brunsfeld	a3c79218c4	Report telemetry events for rate limit errors (#16130 ) ClickHouse telemetry schema: ``` CREATE TABLE default.llm_rate_limit_events ( `time` DateTime64(3), `user_id` Int32, `is_staff` Bool, `plan` LowCardinality(String), `model` String, `provider` LowCardinality(String), `usage_measure` LowCardinality(String), `requests_this_minute` UInt64, `tokens_this_minute` UInt64, `tokens_this_day` UInt64, `max_requests_per_minute` UInt64, `max_tokens_per_minute` UInt64, `max_tokens_per_day` UInt64, `users_in_recent_minutes` UInt64, `users_in_recent_days` UInt64 ) ORDER BY tuple() ``` Release Notes: - N/A Co-authored-by: Marshall <marshall@zed.dev>	2024-08-12 16:31:11 -04:00
Max Brunsfeld	1674e12ccb	Expose anthropic API errors to the client (#16129 ) Now, when an Anthropic request is invalid or Anthropic's API is down, we'll expose that to the user instead of just returning a generic 500. Release Notes: - N/A Co-authored-by: Marshall <marshall@zed.dev>	2024-08-12 13:11:48 -07:00
Marshall Bowers	f3ec8d425f	collab: Use a separate Anthropic API key for Zed staff (#16128 ) This PR makes it so Zed staff can use a separate Anthropic API key for the LLM service. We also added an `is_staff` column to the `usages` table so that we can exclude staff usage from the "active users" metrics that influence the rate limits. Release Notes: - N/A --------- Co-authored-by: Max <max@zed.dev>	2024-08-12 15:20:34 -04:00
Marshall Bowers	ebdb755fef	Surface upstream rate limits from Anthropic (#16118 ) This PR makes it so hitting upstream rate limits from Anthropic results in an HTTP 429 response instead of an HTTP 500. To do this we need to surface structured errors out of the `anthropic` crate. Release Notes: - N/A	2024-08-12 11:59:24 -04:00
Marshall Bowers	3140d6ce8c	collab: Temporarily bypass LLM rate limiting for staff (#16089 ) This PR makes it so staff members will be exempt from rate limiting by the LLM service. This is just a temporary measure until we can tweak the rate-limiting heuristics. Staff members are still subject to upstream LLM provider rate limits. Release Notes: - N/A	2024-08-11 14:41:49 -04:00
Max Brunsfeld	33e120d964	Capture telemetry data on per-user monthly LLM spending (#16050 ) Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev>	2024-08-09 16:38:37 -07:00
Max Brunsfeld	8688b2ad19	Add telemetry for LLM usage (#16049 ) Release Notes: - N/A Co-authored-by: Marshall <marshall@zed.dev>	2024-08-09 18:15:57 -04:00
Max Brunsfeld	fbebb73d7b	Use LLM service for tool call requests (#16046 ) Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev>	2024-08-09 16:22:58 -04:00
Max Brunsfeld	b1c69c2178	Fix usage recording in llm service (#16044 ) Release Notes: - N/A Co-authored-by: Marshall <marshall@zed.dev>	2024-08-09 11:48:18 -07:00
Max Brunsfeld	225726ba4a	Remove code paths that skip LLM db in prod (#16008 ) Release Notes: - N/A	2024-08-09 10:41:50 -04:00
Max Brunsfeld	06625bfe94	Apply rate limits in LLM service (#15997 ) Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev> Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>	2024-08-08 15:46:33 -07:00
Bennet Bo Fenner	514b79e461	collab: Always use newest anthropic model version (#15978 ) When Anthropic releases a new version of their models, Zed AI users should always get access to the new version, even when using an old version of Zed. Release Notes: - N/A Co-authored-by: Thorsten <thorsten@zed.dev>	2024-08-08 15:24:08 +02:00
Marshall Bowers	7f6d0919c9	collab: Setup database for LLM service (#15882 ) This PR puts the initial infrastructure for the LLM service's database in place. The LLM service will be using a separate Postgres database, with its own set of migrations. Currently we only connect to the database in development, as we don't yet have the database set up for the staging/production environments. Release Notes: - N/A	2024-08-06 17:18:08 -04:00
Marshall Bowers	cf5f4dddf5	Authorize access to language model providers based on country (#15859 ) This PR updates the LLM service to authorize access to language model providers based on the requester's country. We detect the country using Cloudflare's [`CF-IPCountry`](https://developers.cloudflare.com/fundamentals/reference/http-request-headers/#cf-ipcountry) header. The country code is then checked against the list of supported countries for the given LLM provider. Countries that are not supported will receive an `HTTP 451: Unavailable For Legal Reasons` response. Release Notes: - N/A	2024-08-06 11:49:04 -04:00
Marshall Bowers	ca9511393b	collab: Add support for more providers to the LLM service (#15832 ) This PR adds support for additional providers to the LLM service: - OpenAI - Google - Custom Zed models (through Hugging Face) Release Notes: - N/A	2024-08-05 21:16:18 -04:00
Max Brunsfeld	8e9c2b1125	Introduce a separate backend service for LLM calls (#15831 ) This PR introduces a separate backend service for making LLM calls. It exposes an HTTP interface that can be called by Zed clients. To call these endpoints, the client must provide a `Bearer` token. These tokens are issued/refreshed by the collab service over RPC. We're adding this in a backwards-compatible way. Right now the access tokens can only be minted for Zed staff, and calling this separate LLM service is behind the `llm-service` feature flag (which is not automatically enabled for Zed staff). Release Notes: - N/A --------- Co-authored-by: Marshall <marshall@zed.dev> Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>	2024-08-05 20:26:21 -04:00
Max Brunsfeld	27779e33fb	Refactor: Restructure collab main function to prepare for new subcommand: `serve llm` (#15824 ) This is just a refactor that we're landing ahead of any functional changes to make sure we haven't broken anything. Release Notes: - N/A Co-authored-by: Marshall <marshall@zed.dev> Co-authored-by: Jason <jason@zed.dev>	2024-08-05 12:07:38 -07:00
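
Several of the commits above describe gates applied to a request before it reaches an LLM provider: revoked access tokens identified by `jti` (b4c22cc861), country-based authorization using the `CF-IPCountry` header with an HTTP 451 response (cf5f4dddf5), and a $1,000 lifetime spending limit (340662e2f7). The following is a minimal Python sketch of how such checks might compose, not the service's actual Rust implementation. The names `Request`, `authorize`, and `SUPPORTED_COUNTRIES`, the example country list, the order of the checks, the handling of a missing header, and the 401 and 403 status codes are all assumptions made for illustration; only the 451 status and the $1,000 cap come from the commit messages.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Hypothetical per-provider allow-list; the real list is not shown in the log.
SUPPORTED_COUNTRIES = {"anthropic": {"US", "DE", "JP"}}

# $1,000 lifetime cap (commit 340662e2f7), stored in cents to avoid floats.
LIFETIME_SPENDING_CAP_CENTS = 1_000 * 100


@dataclass
class Request:
    provider: str
    country_code: Optional[str]  # value of the CF-IPCountry header, if any
    jti: str                     # identifier of the access token
    lifetime_spending_cents: int


def authorize(request: Request, revoked_jtis: Set[str]) -> int:
    """Return the HTTP status this sketch would respond with."""
    # Revoked tokens are rejected first (401 is an assumed status code).
    if request.jti in revoked_jtis:
        return 401

    # Unsupported countries receive HTTP 451, as stated in commit cf5f4dddf5.
    # Treating a missing header as unsupported is an assumption of this sketch.
    allowed = SUPPORTED_COUNTRIES.get(request.provider, set())
    if request.country_code is None or request.country_code.upper() not in allowed:
        return 451

    # Exceeding the lifetime cap blocks further use (403 is an assumed status).
    if request.lifetime_spending_cents > LIFETIME_SPENDING_CAP_CENTS:
        return 403

    return 200
```

In this sketch the set of revoked `jti` values stands in for a lookup against the `revoked_access_tokens` table mentioned in commit b4c22cc861.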


79 commits