This PR updates the `user rate limit` and `user usage` log lines to
include additional information that will be useful for graphing in Axiom.
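For illustration, a minimal sketch of the shape of such a log line,
assuming `tracing`-style structured logging (the field names here are
illustrative, not the actual ones added):

```rust
// Hypothetical sketch: structured key-value fields make it easy to
// build per-user and per-model graphs in Axiom.
fn log_user_usage(user_id: u64, model: &str, input_tokens: usize, output_tokens: usize) {
    tracing::info!(
        user_id,
        model,
        input_tokens,
        output_tokens,
        "user usage"
    );
}
```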
Release Notes:
- N/A
This PR updates the usage measures used for rate limiting when using
Claude 3.7 Sonnet.
Instead of using the combined `tokens_per_minute` measure, we now rate-limit
individually on `input_tokens_per_minute` (which excludes cache
reads) and `output_tokens_per_minute`.
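As a rough sketch of the new check (names are illustrative, not the
actual code):

```rust
// Each measure is now checked independently; exceeding either one
// rate-limits the request.
struct Limits {
    input_tokens_per_minute: usize, // excludes cache reads
    output_tokens_per_minute: usize,
}

struct UsageThisMinute {
    input_tokens: usize, // cache reads are not counted here
    output_tokens: usize,
}

fn over_limit(usage: &UsageThisMinute, limits: &Limits) -> bool {
    usage.input_tokens > limits.input_tokens_per_minute
        || usage.output_tokens > limits.output_tokens_per_minute
}
```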
Release Notes:
- N/A
This PR adds tracking for input and output tokens per minute separately
from the current aggregate tokens per minute.
We are not yet rate-limiting based on these measures.
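Conceptually, the set of tracked measures now looks something like this
(hypothetical names):

```rust
// The aggregate measure is kept as-is; the two new measures are
// recorded alongside it but not yet enforced.
enum UsageMeasure {
    TokensPerMinute,       // existing aggregate, still used for rate limiting
    InputTokensPerMinute,  // new: tracked only
    OutputTokensPerMinute, // new: tracked only
}
```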
Release Notes:
- N/A
This PR makes the account age-related fields required in
`LlmTokenClaims`.
We've also removed the account age check from the LLM token issuance
endpoint; it is now enforced solely in the `POST /completion` endpoint.
This change will be safe to deploy at ~8:01PM EDT.
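A hypothetical sketch of the shape of this change (field names and
types are illustrative):

```rust
#[derive(serde::Serialize, serde::Deserialize)]
struct LlmTokenClaims {
    user_id: u64,
    // Previously optional; now required, so `POST /completion` can
    // always enforce the account age check from the token alone.
    account_created_at: i64, // unix timestamp
    // ...other claims elided...
}
```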
Release Notes:
- N/A
This PR defers the account age check to the `POST /completion` endpoint
instead of doing it when an LLM token is generated.
This will allow us to lift the account age restriction for using Edit
Prediction.
Note: We're still temporarily performing the account age check when
issuing the LLM token until this change is deployed and the LLM tokens
have had a chance to cycle.
Release Notes:
- N/A
This is a follow-up to https://github.com/zed-industries/zed/pull/25573.
When determining whether the user was over their maximum monthly spend,
we were still using the spend for a particular model instead of looking
at usage across all models.
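A minimal sketch of the fix (hypothetical types):

```rust
struct ModelUsage {
    spend_in_cents: u64,
}

fn is_over_monthly_spend(usages: &[ModelUsage], max_spend_in_cents: u64) -> bool {
    // Before: a single model's spend was compared against the limit.
    // After: the total across all models is what counts.
    let total: u64 = usages.iter().map(|usage| usage.spend_in_cents).sum();
    total >= max_spend_in_cents
}
```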
Release Notes:
- N/A
This PR adjusts the usage checks for the LLM free tier.
Previously we limited usage on a per-model basis, meaning the user
would get $10/mo free for each model they had access to.
Now usage for all models counts towards the free tier limit.
Release Notes:
- N/A
This PR removes the `POST /predict_edits` endpoint from the LLM service,
as it has been superseded by the corresponding endpoint running in
Cloudflare Workers.
All traffic is already being routed to the Cloudflare Workers via the
Workers route, so nothing is hitting this endpoint running in the LLM
service anymore.
You can see the drop-off in requests to this endpoint on this graph when
the Workers route was added:
<img width="472" alt="Screenshot 2025-01-30 at 9 18 04 PM"
src="https://github.com/user-attachments/assets/fa60f7c8-2737-4329-88a3-17093bdb5a29"
/>
We also don't use the `fireworks` crate anymore in this repo, so it has
been removed.
Release Notes:
- N/A
Realized that the logic in #23814 was more than needed, and harder to
maintain. Something like that could make sense if using the tokenizer
and wanting to precisely hit a token limit. However, in the case of edit
predictions it's more of a latency-and-expense vs. capability tradeoff,
so such precision is unnecessary.
Happily, this change didn't require much extra work; copy-modifying
parts of that change was sufficient.
Release Notes:
- N/A
This PR attaches two new properties to the `Language Model Used` event
(a sketch of the payload follows the list):
- `has_llm_subscription` - This will tell us if a user is a paid
subscriber.
- `max_monthly_spend_in_cents` - This will indicate what their maximum
monthly spend is set to.
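Hypothetically, the payload looks something like this (values are made
up):

```rust
fn example_event() -> serde_json::Value {
    serde_json::json!({
        "event": "Language Model Used",
        "has_llm_subscription": true,       // is the user a paid subscriber?
        "max_monthly_spend_in_cents": 1000  // the user's configured cap ($10)
    })
}
```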
Release Notes:
- N/A
This removes the `low_speed_timeout` setting from all providers as a
response to issue #19509.
The reason is that `low_speed_timeout` was originally only added as part
of #9913 because users wanted to _get rid of timeouts_: they wanted to
bump the default timeout from 5sec to a lot more.
Then, in #19055, `low_speed_timeout` was repurposed as a normal
`timeout`, which is a different thing and breaks slower LLMs that don't
return a complete response within the configured time.
So we figured: let's remove the whole thing and replace it with a
default _connect_ timeout to make sure that we can connect to a server
in 10s, but then give the server as long as it wants to complete its
response.
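As a sketch of the new behavior, assuming a `ureq` 2.x-style client
(the actual wiring in the codebase may differ):

```rust
use std::time::Duration;

fn build_agent() -> ureq::Agent {
    ureq::builder()
        // Fail fast if we can't reach the server...
        .timeout_connect(Duration::from_secs(10))
        // ...but set no read/overall timeout, so slow LLMs can take
        // as long as they need to complete their response.
        .build()
}
```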
Closes #19509
Release Notes:
- Removed the `low_speed_timeout` setting from LLM provider settings. It
was only ever used to _increase_ the timeout to give LLMs more time;
connections now time out after 10 seconds, but LLMs get as long as they
need to complete their response.
---------
Co-authored-by: Antonio <antonio@zed.dev>
Co-authored-by: Peter Tripp <peter@zed.dev>
This PR updates the usage limit check to exempt Zed staff members from
usage limits.
We previously had some affordances for the rate limits, but hadn't yet
updated them for usage-based billing.
Release Notes:
- N/A
This PR removes the conditional checks around the billing-related
enforcement for LLM completions.
These were just in place to prevent executing any billing code before we
had rolled it out. Now that it is rolled out, we don't need this
conditional execution anymore.
Release Notes:
- N/A
This PR removes the lifetime spending limit that was added in #16780.
We had previously added this as a way to prevent runaway usage, but now
that we have a cap on free usage per month with paid access after that,
we don't need this check anymore.
Release Notes:
- N/A
This PR adjusts the billing logic to not write any records to
`billing_events` (a sketch of this guard follows the list) if:
- The user is staff, as we don't want to bill staff members
- Billing is disabled (we currently enable billing based on the presence
of the Stripe API key)
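A minimal sketch of the guard (names are illustrative):

```rust
fn should_record_billing_event(is_staff: bool, stripe_api_key: Option<&str>) -> bool {
    // Billing is considered enabled only when a Stripe API key is present.
    let billing_enabled = stripe_api_key.is_some();
    !is_staff && billing_enabled
}
```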
Release Notes:
- N/A
This PR adds usage-based billing for LLM interactions in the Assistant.
Release Notes:
- N/A
---------
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Antonio <antonio@zed.dev>
Co-authored-by: Richard <richard@zed.dev>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
This PR makes the `has_llm_subscription` and
`max_monthly_spend_in_cents` fields in the `LlmTokenClaims` required.
This change will be safe to deploy in ~45 minutes.
Release Notes:
- N/A
This PR adds a new `billing_preferences` table.
Right now there is a single preference: the maximum monthly spend for
LLM usage.
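A hypothetical sketch of the row shape (the real table is created by a
database migration; names here are illustrative):

```rust
struct BillingPreferences {
    id: i32,
    user_id: i32,
    // The single preference for now:
    max_monthly_llm_usage_spending_in_cents: i32,
}
```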
Release Notes:
- N/A
---------
Co-authored-by: Richard <richard@zed.dev>
This PR renames the `MONTHLY_SPENDING_LIMIT` constant to
`FREE_TIER_MONTHLY_SPENDING_LIMIT` to clarify its purpose.
This will help distinguish it from the user's specified limit on their
paid monthly spending.
Release Notes:
- N/A
This PR adds a new `Cents` type that can be used to represent a monetary
value in cents.
This cuts down on the primitive obsession in the billing code when
dealing with money.
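A sketch of the idea behind the type (details are illustrative):

```rust
// A newtype keeps monetary values from being confused with other integers.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Cents(pub u32);

impl Cents {
    pub const ZERO: Cents = Cents(0);

    pub fn from_dollars(dollars: u32) -> Self {
        Cents(dollars * 100)
    }
}
```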
Release Notes:
- N/A
This PR reworks our existing billing code in preparation for charging
based on LLM usage.
We aren't yet exercising the new billing-related code outside of
development.
There are some noteworthy changes for our existing LLM usage tracking:
- A new `monthly_usages` table has been added for tracking usage
per-user, per-model, per-month
- The per-month usage measures have been removed, in favor of the
`monthly_usages` table
- All of the per-month metrics in the ClickHouse rows have been changed
from a rolling 30-day window to a calendar month (see the sketch after
this list)
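A hypothetical sketch of the new per-user, per-model, per-month
bucketing:

```rust
struct MonthlyUsageKey {
    user_id: i32,
    model_id: i32,
    year: i32, // calendar year...
    month: u8, // ...and month (1-12) replace the rolling 30-day window
}
```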
Release Notes:
- N/A
---------
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Richard <richard@zed.dev>
Co-authored-by: Max <max@zed.dev>
This PR extends the LLM usage tracking to support tracking usage for
cache writes and reads for Anthropic models.
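A sketch of the extended usage record; Anthropic reports cache activity
as separate token counts, and the field names below mirror that, but the
actual struct may differ:

```rust
#[derive(Default)]
struct TokenUsage {
    input_tokens: usize,
    output_tokens: usize,
    cache_creation_input_tokens: usize, // new: tokens written to the cache
    cache_read_input_tokens: usize,     // new: tokens served from the cache
}
```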
Release Notes:
- N/A
---------
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Antonio <antonio@zed.dev>
Replace `isahc` with `ureq` everywhere `gpui` is used.
This should allow us to make HTTP requests without libssl, and avoid a
long tail of panics caused by `isahc`.
Release Notes:
- (potentially breaking change) updated our http client
---------
Co-authored-by: Mikayla <mikayla@zed.dev>
Add `/auto` behind a feature flag that's disabled for now, even for
staff.
We've decided on a different design for context inference, but there are
parts of /auto that will be useful for that, so we want them in the code
base even if they're unused for now.
Release Notes:
- N/A
---------
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Marshall Bowers <elliott.codes@gmail.com>
This PR adds a `GET /models` endpoint to the LLM service.
This endpoint returns the models that the authenticated user has access
to.
This is the first step towards populating the models for the hosted
service from the server.
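A hypothetical sketch of the response shape (the actual schema may
differ):

```rust
#[derive(serde::Serialize)]
struct ListModelsResponse {
    models: Vec<ModelInfo>,
}

#[derive(serde::Serialize)]
struct ModelInfo {
    provider: String, // e.g. "anthropic"
    name: String,     // e.g. "claude-3-5-sonnet"
}
```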
Release Notes:
- N/A