assistant: Limit amount of concurrent completion requests (#13856)

This PR refactors the completion providers so that only a limited
number of completion requests are processed at a time.
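
As a rough illustration of the idea (not the code from this PR), the sketch below caps in-flight requests with a counting limiter built on `Mutex` and `Condvar`. `RequestLimiter`, `Permit`, and `peak_in_flight` are hypothetical names, and blocking threads are an assumption made to keep the example std-only; the real providers may well use an async primitive instead.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical counting limiter: at most `max` permits are handed out at once.
pub struct RequestLimiter {
    available: Mutex<usize>,
    released: Condvar,
}

/// Guard returned by `acquire`; dropping it gives the permit back.
pub struct Permit<'a> {
    limiter: &'a RequestLimiter,
}

impl RequestLimiter {
    pub fn new(max: usize) -> Self {
        Self {
            available: Mutex::new(max),
            released: Condvar::new(),
        }
    }

    /// Blocks the calling thread until a permit is free, then takes it.
    pub fn acquire(&self) -> Permit<'_> {
        let mut available = self.available.lock().unwrap();
        while *available == 0 {
            available = self.released.wait(available).unwrap();
        }
        *available -= 1;
        Permit { limiter: self }
    }
}

impl Drop for Permit<'_> {
    fn drop(&mut self) {
        *self.limiter.available.lock().unwrap() += 1;
        // Wake one waiting request, if any.
        self.limiter.released.notify_one();
    }
}

/// Pushes `requests` simulated completions through a limiter of size `max`
/// and returns the highest number that were in flight at the same time.
pub fn peak_in_flight(max: usize, requests: usize) -> usize {
    let limiter = Arc::new(RequestLimiter::new(max));
    let in_flight = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..requests)
        .map(|_| {
            let limiter = Arc::clone(&limiter);
            let in_flight = Arc::clone(&in_flight);
            let peak = Arc::clone(&peak);
            thread::spawn(move || {
                let _permit = limiter.acquire();
                let now = in_flight.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                // Stand-in for the actual completion request.
                thread::sleep(Duration::from_millis(5));
                in_flight.fetch_sub(1, Ordering::SeqCst);
                // `_permit` is dropped here, releasing the slot.
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}
```

Calling `peak_in_flight(2, 8)` starts eight simulated requests but never lets more than two run together. Tying the permit to a guard's lifetime means a request that returns early still frees its slot.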

Also started refactoring language model providers to use traits, so it's
easier to allow specifying multiple providers in the future.
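
To show why a trait helps here, the following is a minimal sketch of trait-based providers held in one registry. Every name in it (`CompletionProvider`, `EchoProvider`, `UppercaseProvider`, `ProviderRegistry`, `demo_registry`) is hypothetical and chosen for illustration; the traits introduced by this PR are not shown in this excerpt and may look quite different.

```rust
use std::collections::HashMap;

/// Hypothetical provider interface, assumed for this sketch only.
pub trait CompletionProvider {
    fn name(&self) -> &str;
    fn complete(&self, prompt: &str) -> String;
}

/// Toy provider that returns the prompt unchanged.
pub struct EchoProvider;

/// Toy provider that upper-cases the prompt.
pub struct UppercaseProvider;

impl CompletionProvider for EchoProvider {
    fn name(&self) -> &str {
        "echo"
    }

    fn complete(&self, prompt: &str) -> String {
        prompt.to_string()
    }
}

impl CompletionProvider for UppercaseProvider {
    fn name(&self) -> &str {
        "uppercase"
    }

    fn complete(&self, prompt: &str) -> String {
        prompt.to_uppercase()
    }
}

/// Holds any number of providers behind the shared trait.
#[derive(Default)]
pub struct ProviderRegistry {
    providers: HashMap<String, Box<dyn CompletionProvider>>,
}

impl ProviderRegistry {
    /// Registers a provider under the name it reports.
    pub fn register(&mut self, provider: Box<dyn CompletionProvider>) {
        self.providers.insert(provider.name().to_string(), provider);
    }

    /// Dispatches to the named provider, or returns `None` if it is unknown.
    pub fn complete(&self, provider: &str, prompt: &str) -> Option<String> {
        self.providers.get(provider).map(|p| p.complete(prompt))
    }
}

/// Builds a registry containing both toy providers.
pub fn demo_registry() -> ProviderRegistry {
    let mut registry = ProviderRegistry::default();
    registry.register(Box::new(EchoProvider));
    registry.register(Box::new(UppercaseProvider));
    registry
}
```

With this shape, adding another provider means adding one more `impl CompletionProvider` and one `register` call; callers of `ProviderRegistry::complete` do not change.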

Release Notes:

- N/A
Bennet Bo Fenner 2024-07-05 14:52:45 +02:00 committed by GitHub
parent f2711b2fca
commit c4dbe32f20
11 changed files with 693 additions and 532 deletions

@@ -163,7 +163,7 @@ impl LanguageModelRequestMessage {
     }
 }
 
-#[derive(Debug, Default, Serialize)]
+#[derive(Debug, Default, Serialize, Deserialize)]
 pub struct LanguageModelRequest {
     pub model: LanguageModel,
     pub messages: Vec<LanguageModelRequestMessage>,