Don't auto-retry in certain circumstances (#35037)
Someone encountered this in production, which should not happen:

<img width="1266" height="623" alt="Screenshot 2025-07-24 at 10 38 40 AM" src="https://github.com/user-attachments/assets/40f3f977-5110-4808-a456-7e708d953b3b" />

This moves certain errors (`NoApiKey`, `ApiEndpointNotFound`, and `PromptTooLarge`) into the "never retry" category and reduces the number of retries for some others (`SerializeRequest` and `BuildRequestBody` go from 2 attempts to 1). It also adds diagnostic logging for the retry policy.

It's not a complete fix for the above: the underlying issue is that the server is sending an HTTP 403 response, and although we were already treating 403s as "do not retry", the code was deciding to retry with 2 attempts anyway. Further debugging is needed to figure out why it wasn't going down the 403 branch by the time the request got here.

Release Notes:

- N/A
parent f6f7762f32
commit ceab8c17f4
1 changed file with 12 additions and 7 deletions
@@ -2037,6 +2037,12 @@ impl Thread {
 if let Some(retry_strategy) =
     Thread::get_retry_strategy(completion_error)
 {
+    log::info!(
+        "Retrying with {:?} for language model completion error {:?}",
+        retry_strategy,
+        completion_error
+    );
+
     retry_scheduled = thread
         .handle_retryable_error_with_delay(
             &completion_error,
@@ -2246,15 +2252,14 @@ impl Thread {
     ..
 }
 | AuthenticationError { .. }
-| PermissionError { .. } => None,
-// These errors might be transient, so retry them
-SerializeRequest { .. }
-| BuildRequestBody { .. }
-| PromptTooLarge { .. }
+| PermissionError { .. }
+| NoApiKey { .. }
 | ApiEndpointNotFound { .. }
-| NoApiKey { .. } => Some(RetryStrategy::Fixed {
+| PromptTooLarge { .. } => None,
+// These errors might be transient, so retry them
+SerializeRequest { .. } | BuildRequestBody { .. } => Some(RetryStrategy::Fixed {
     delay: BASE_RETRY_DELAY,
-    max_attempts: 2,
+    max_attempts: 1,
 }),
 // Retry all other 4xx and 5xx errors once.
 HttpResponseError { status_code, .. }
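For readers who want to see the resulting classification in isolation, here is a minimal, self-contained sketch of how the match arms behave after this change. It is not the actual code from `Thread`: the `CompletionError` enum below is a simplified stand-in (the real variants carry fields and there are more of them), the value of `BASE_RETRY_DELAY` and the `u8` type of `max_attempts` are assumptions, and the `HttpResponseError` branch is left out because its body is not visible in the diff.

```rust
use std::time::Duration;

// Assumed value for illustration only; the real constant is not shown in the diff.
#[allow(dead_code)]
const BASE_RETRY_DELAY: Duration = Duration::from_secs(5);

/// Simplified stand-in for the completion error type. The real variants
/// carry fields (matched with `{ .. }` in the diff) and the enum is larger.
#[allow(dead_code)]
#[derive(Debug)]
enum CompletionError {
    AuthenticationError,
    PermissionError,
    NoApiKey,
    ApiEndpointNotFound,
    PromptTooLarge,
    SerializeRequest,
    BuildRequestBody,
}

/// Simplified stand-in for the retry strategy; only the `Fixed` variant
/// appears in the diff. The `u8` type for `max_attempts` is an assumption.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum RetryStrategy {
    Fixed { delay: Duration, max_attempts: u8 },
}

/// Mirrors the post-change classification: `None` means "never retry".
#[allow(dead_code)]
fn get_retry_strategy(error: &CompletionError) -> Option<RetryStrategy> {
    use CompletionError::*;
    match error {
        // Retrying cannot fix these, so give up immediately.
        AuthenticationError
        | PermissionError
        | NoApiKey
        | ApiEndpointNotFound
        | PromptTooLarge => None,
        // These errors might be transient, so retry them once.
        SerializeRequest | BuildRequestBody => Some(RetryStrategy::Fixed {
            delay: BASE_RETRY_DELAY,
            max_attempts: 1,
        }),
    }
}
```

In this sketch, calling `get_retry_strategy(&CompletionError::PromptTooLarge)` yields `None`, while `CompletionError::SerializeRequest` yields a single fixed-delay attempt. A caller would log the strategy before scheduling the retry, as the first hunk does with `log::info!`.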