assistant: Limit amount of concurrent completion requests (#13856)

This PR refactors the completion providers to only process a maximum amount of completion requests at a time. Also started refactoring language model providers to use traits, so it's easier to allow specifying multiple providers in the future. Release Notes: - N/A
2024-07-05 14:52:45 +02:00 · 2024-07-05 14:52:45 +02:00 · c4dbe32f20
commit c4dbe32f20
parent f2711b2fca
11 changed files with 693 additions and 532 deletions
--- a/crates/assistant/src/assistant.rs
+++ b/crates/assistant/src/assistant.rs
@ -163,7 +163,7 @@ impl LanguageModelRequestMessage {
    }
 }

-#[derive(Debug, Default, Serialize)]
+#[derive(Debug, Default, Serialize, Deserialize)]
 pub struct LanguageModelRequest {
    pub model: LanguageModel,
    pub messages: Vec<LanguageModelRequestMessage>,