assistant: Limit amount of concurrent completion requests (#13856)
This PR refactors the completion providers to only process a maximum amount of completion requests at a time. Also started refactoring language model providers to use traits, so it's easier to allow specifying multiple providers in the future. Release Notes: - N/A
This commit is contained in:
parent
f2711b2fca
commit
c4dbe32f20
11 changed files with 693 additions and 532 deletions
|
@ -163,7 +163,7 @@ impl LanguageModelRequestMessage {
|
|||
}
|
||||
}
|
||||
|
||||
#[derive(Debug, Default, Serialize)]
|
||||
#[derive(Debug, Default, Serialize, Deserialize)]
|
||||
pub struct LanguageModelRequest {
|
||||
pub model: LanguageModel,
|
||||
pub messages: Vec<LanguageModelRequestMessage>,
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue