ZIm/crates/google_ai
Michael Sloan fbf7caf93e
Default to fast model for thread summaries and titles + don't include system prompt / context / thinking segments (#29102)
* Adds a fast, cheaper model to each provider and makes it the default
for thread summarization. The initial motivation was that
https://github.com/zed-industries/zed/pull/29099 would cause these
requests to fail when used with a thinking model. Using a thinking model
for summarization doesn't seem correct in any case.

* Skips system prompt, context, and thinking segments.

* If tool use is in progress, allows two tool uses plus one more agent
response before summarizing.
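
As a rough illustration of the filtering described in the second bullet, here is a minimal sketch. The `Role`, `Segment`, and `Message` types and the `summary_input` function are hypothetical stand-ins invented for this example; they are not the types or functions used in the Zed codebase. The sketch assumes the summary request keeps only plain text from non-system messages.

```rust
// Hypothetical types for illustration only; not Zed's actual data model.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
enum Segment {
    Text(String),
    Thinking(String),
    Context(String),
    ToolUse(String),
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    segments: Vec<Segment>,
}

/// Builds the input for a summary request (assumed behavior):
/// drops system messages entirely and keeps only `Text` segments,
/// so thinking, context, and tool-use segments are skipped.
/// Messages left with no text are omitted.
fn summary_input(messages: &[Message]) -> Vec<(Role, String)> {
    messages
        .iter()
        .filter(|m| m.role != Role::System)
        .filter_map(|m| {
            let text: Vec<&str> = m
                .segments
                .iter()
                .filter_map(|s| match s {
                    Segment::Text(t) => Some(t.as_str()),
                    _ => None,
                })
                .collect();
            if text.is_empty() {
                None
            } else {
                Some((m.role, text.join("\n")))
            }
        })
        .collect()
}
```

Under these assumptions, an assistant message that contains only thinking and tool-use segments contributes nothing to the summary request, which is where most of the token savings would come from.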

The downside is that this gives up some potential prefix cache reuse,
especially for title summarization (thread summarization already omitted
tool results, so it would not have shared a prefix for those). This seems
fine, as these requests should typically be fairly small. Even for full
thread summarization, skipping all tool use and context should greatly
reduce token use.

Release Notes:

- N/A
2025-04-19 23:26:29 +00:00
src Default to fast model for thread summaries and titles + don't include system prompt / context / thinking segments (#29102) 2025-04-19 23:26:29 +00:00
Cargo.toml Add workspace-hack (#27277) 2025-04-02 13:26:34 -07:00
LICENSE-GPL Fix licensing errors 2024-03-20 15:52:02 +01:00