Add mismatched tag threshold parameter to eval function (#32190)
Replace the hardcoded 0.10 threshold with a configurable parameter. Set 0.05 as the default for most tests, and use 0.2 for the from_pixels_constructor eval, which produces more mismatched tags.

Release Notes:

- N/A
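The threshold change described in the commit message can be sketched roughly as follows. This is an illustrative Python sketch, not the project's code: `EvalReport`, `check_mismatched_tags`, and the field names are hypothetical, and the choice to fail only when the ratio strictly exceeds the threshold is an assumption.

```python
from dataclasses import dataclass


@dataclass
class EvalReport:
    """Hypothetical summary of a batch of eval runs."""
    total_runs: int
    mismatched_tag_runs: int


def check_mismatched_tags(report: EvalReport, mismatched_tag_threshold: float) -> bool:
    """Return True when the share of runs with mismatched tags is within the threshold."""
    if report.total_runs <= 0:
        raise ValueError("total_runs must be positive")
    ratio = report.mismatched_tag_runs / report.total_runs
    # Assumption: exceeding the threshold fails; matching it exactly passes.
    return ratio <= mismatched_tag_threshold


# The same results fail a strict threshold but pass a looser one.
report = EvalReport(total_runs=20, mismatched_tag_runs=3)
strict_ok = check_mismatched_tags(report, 0.05)
loose_ok = check_mismatched_tags(report, 0.2)
```

Passing the threshold as an argument lets each eval pick a tolerance that matches how noisy its output is, rather than sharing one hardcoded value.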
parent 8bd8435887
commit ddf70b3bb8
2 changed files with 19 additions and 3 deletions
.github/workflows/unit_evals.yml (2 changes, vendored)
@@ -66,7 +66,7 @@ jobs:
         env:
           ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
 
-      - name: Send the pull request link into the Slack channel
+      - name: Send failure message to Slack channel if needed
         if: ${{ failure() }}
         uses: slackapi/slack-github-action@b0fa283ad8fea605de13dc3f449259339835fc52
         with: