More resilient eval (#32257)

Bubbles up rate-limit information so that callers higher up in the stack
can retry after the indicated duration if needed.
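
To illustrate the retry idea, here is a minimal sketch in Rust. It is not the code from this commit: `EvalError`, its `RateLimited` variant, `retry_on_rate_limit`, and the `fallback` parameter are hypothetical names chosen for the example, and the only assumption is that the lower layer can report how long to wait.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical error type for this sketch; not the type used in the commit.
#[derive(Debug)]
enum EvalError {
    /// The provider asked us to slow down, optionally saying for how long.
    RateLimited { retry_after: Option<Duration> },
    /// Any other failure; never retried here.
    Other(String),
}

/// Calls `op` up to `max_attempts` times. When `op` reports a rate limit,
/// sleeps for the reported `retry_after` (or `fallback` if none was given)
/// before trying again. Any other result is returned immediately.
fn retry_on_rate_limit<T>(
    max_attempts: usize,
    fallback: Duration,
    mut op: impl FnMut() -> Result<T, EvalError>,
) -> Result<T, EvalError> {
    let mut attempt = 1;
    loop {
        match op() {
            Err(EvalError::RateLimited { retry_after }) if attempt < max_attempts => {
                sleep(retry_after.unwrap_or(fallback));
                attempt += 1;
            }
            // Success, a non-rate-limit error, or the final failed attempt.
            other => return other,
        }
    }
}
```

Because the wait duration travels with the error, the layer that owns the retry loop does not need to know which provider produced it.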

Also caps the number of evals that run concurrently, which should help as well.
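
One common way to cap concurrency is a counting semaphore. The sketch below uses only the Rust standard library and is an illustration, not the mechanism this commit uses: `Semaphore`, `with_permit`, and `run_capped` are hypothetical names, and the sleep stands in for an eval.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Minimal counting semaphore built from std primitives (illustrative only).
struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Semaphore {
    fn new(permits: usize) -> Self {
        Self { permits: Mutex::new(permits), available: Condvar::new() }
    }

    /// Blocks until a permit is free, runs `f`, then returns the permit.
    /// Note: a panic inside `f` would leak the permit in this simple version.
    fn with_permit<T>(&self, f: impl FnOnce() -> T) -> T {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.available.wait(permits).unwrap();
        }
        *permits -= 1;
        drop(permits); // release the lock while the work runs

        let result = f();

        *self.permits.lock().unwrap() += 1;
        self.available.notify_one();
        result
    }
}

/// Spawns `jobs` threads but lets at most `limit` of them do work at once.
/// Returns the highest number of jobs observed running simultaneously.
/// `limit` must be at least 1, otherwise no job can ever start.
fn run_capped(jobs: usize, limit: usize) -> usize {
    let semaphore = Arc::new(Semaphore::new(limit));
    let stats = Arc::new(Mutex::new((0usize, 0usize))); // (running, peak)

    let handles: Vec<_> = (0..jobs)
        .map(|_| {
            let semaphore = Arc::clone(&semaphore);
            let stats = Arc::clone(&stats);
            thread::spawn(move || {
                semaphore.with_permit(|| {
                    {
                        let mut s = stats.lock().unwrap();
                        s.0 += 1;
                        s.1 = s.1.max(s.0);
                    }
                    thread::sleep(Duration::from_millis(5)); // stand-in for an eval
                    stats.lock().unwrap().0 -= 1;
                })
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    let peak = stats.lock().unwrap().1;
    peak
}
```

With a cap inside the process, the test runner no longer has to serialize every eval to keep the request rate down.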

Release Notes:

- N/A
Ben Brandt 2025-06-09 20:07:22 +02:00 committed by GitHub
parent fa54fa80d0
commit e4bd115a63
22 changed files with 147 additions and 56 deletions

@@ -62,7 +62,7 @@ jobs:
       - name: Run unit evals
         shell: bash -euxo pipefail {0}
-        run: cargo nextest run --workspace --no-fail-fast --features eval --no-capture -E 'test(::eval_)' --test-threads 1
+        run: cargo nextest run --workspace --no-fail-fast --features eval --no-capture -E 'test(::eval_)'
         env:
           ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}