Re-add code block formatting instructions (#29574)

Re-enabled instructions about code block formatting.

In practice, the model doesn't seem to use this code block format very
often, but re-enabling the instructions has no negative effect on evals.
In a future PR, I'll experiment with adding more evals that check whether
the model actually uses the format.

2 runs before: (`--repetitions=8`)
```
=================================================================
                            AGGREGATE
=================================================================


4 examples failed to run!
Average programmatic score: 37%
Average diff score: 66%
Average thread score: 93%


-----------------------------------------------------------------
                     CUMULATIVE TOOL METRICS
-----------------------------------------------------------------

┌──────────────────────────────┬──────────┬──────────┬──────────┐
│             Tool             │   Uses   │ Failures │   Rate   │
├──────────────────────────────┼──────────┼──────────┼──────────┤
│edit_file                     │   398    │    53    │   13%    │
│terminal                      │    11    │    1     │    9%    │
│create_file                   │    40    │    2     │    5%    │
│read_file                     │   245    │    8     │    3%    │
│find_path                     │    48    │    0     │    0%    │
│list_directory                │    13    │    0     │    0%    │
│grep                          │   133    │    0     │    0%    │
│thinking                      │    18    │    0     │    0%    │
│diagnostics                   │   130    │    0     │    0%    │
└──────────────────────────────┴──────────┴──────────┴──────────┘
```

```
=================================================================
                            AGGREGATE
=================================================================


1 examples failed to run!
Average programmatic score: 41%
Average diff score: 68%
Average thread score: 96%


-----------------------------------------------------------------
                     CUMULATIVE TOOL METRICS
-----------------------------------------------------------------

┌──────────────────────────────┬──────────┬──────────┬──────────┐
│             Tool             │   Uses   │ Failures │   Rate   │
├──────────────────────────────┼──────────┼──────────┼──────────┤
│fetch                         │    1     │    1     │   100%   │
│edit_file                     │   553    │    63    │   11%    │
│read_file                     │   349    │    3     │    1%    │
│diagnostics                   │   158    │    0     │    0%    │
│find_path                     │    70    │    0     │    0%    │
│list_directory                │    10    │    0     │    0%    │
│thinking                      │    45    │    0     │    0%    │
│grep                          │   213    │    0     │    0%    │
│create_file                   │    24    │    0     │    0%    │
│terminal                      │    17    │    0     │    0%    │
└──────────────────────────────┴──────────┴──────────┴──────────┘
```

1 run after this change:

```
=================================================================
                            AGGREGATE
=================================================================

Average programmatic score: 42%
Average diff score: 74%
Average thread score: 100%


-----------------------------------------------------------------
                     CUMULATIVE TOOL METRICS
-----------------------------------------------------------------

┌──────────────────────────────┬──────────┬──────────┬──────────┐
│             Tool             │   Uses   │ Failures │   Rate   │
├──────────────────────────────┼──────────┼──────────┼──────────┤
│edit_file                     │   534    │    92    │   17%    │
│read_file                     │   325    │    6     │    2%    │
│list_directory                │    6     │    0     │    0%    │
│thinking                      │    12    │    0     │    0%    │
│create_file                   │    16    │    0     │    0%    │
│diagnostics                   │    49    │    0     │    0%    │
│grep                          │   234    │    0     │    0%    │
│find_path                     │    65    │    0     │    0%    │
│terminal                      │    38    │    0     │    0%    │
└──────────────────────────────┴──────────┴──────────┴──────────┘
```
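The Rate column in the tables above appears to be failures divided by uses, shown as a whole-number percentage. A minimal sketch of that arithmetic, assuming ordinary rounding (the eval harness's actual rounding is not shown in this commit, and `failure_rate_percent` is a hypothetical helper, not part of the eval tooling):

```python
def failure_rate_percent(uses: int, failures: int) -> int:
    """Return failures/uses as a whole-number percentage.

    Assumption: a tool with zero uses is reported as 0%.
    """
    if uses == 0:
        return 0
    return round(100 * failures / uses)


# edit_file rows from the three runs above: (uses, failures)
edit_file_runs = [(398, 53), (553, 63), (534, 92)]
edit_file_rates = [failure_rate_percent(u, f) for u, f in edit_file_runs]
print(edit_file_rates)
```

Under this assumption, the helper reproduces the `edit_file` rates listed in the three tables (13%, 11%, 17%).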


Release Notes:

- N/A
Richard Feldman 2025-04-29 10:37:31 -04:00 committed by GitHub
parent 4812c9094b
commit 2b431d3e9d


@@ -36,6 +36,20 @@ If appropriate, use tool calls to explore the current project, which contains th
- The user might specify a partial file path. If you don't know the full path, use `find_path` (not `grep`) before you read the file.
{{/if}}
## Code Block Formatting
Whenever you mention a code block, you MUST use ONLY the following format when the code in the block comes from a file
in the project:
```path/to/Something.blah#L123-456
(code goes here)
```
The `#L123-456` means the line number range 123 through 456, and the path/to/Something.blah
is a path in the project. (If this code block does not come from a file in the project, then you may instead use
the normal markdown style of three backticks followed by language name. However, you MUST use this format if
the code in the block comes from a file in the project.)
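As an illustration of the fence header format described above (not part of the prompt itself), here is a minimal sketch of how a consumer might split such a header into a path and a line range. The function name, the `FenceInfo` type, and the regular expression are hypothetical; this commit does not show how the headers are actually parsed.

```python
import re
from typing import NamedTuple, Optional

FENCE = "`" * 3  # three backticks, built this way to avoid a literal fence here


class FenceInfo(NamedTuple):
    path: str
    start_line: int
    end_line: int


# Assumed shape: fence, project path, "#L", start line, "-", end line.
_HEADER_RE = re.compile(
    "^" + re.escape(FENCE) + r"(?P<path>[^\s#]+)#L(?P<start>\d+)-(?P<end>\d+)\s*$"
)


def parse_fence_header(line: str) -> Optional[FenceInfo]:
    """Return path and line range for a project-file fence header, else None."""
    match = _HEADER_RE.match(line)
    if match is None:
        # Plain fences such as a bare language name fall through here.
        return None
    return FenceInfo(match["path"], int(match["start"]), int(match["end"]))


info = parse_fence_header(FENCE + "path/to/Something.blah#L123-456")
plain = parse_fence_header(FENCE + "python")
```

In this sketch, a header without the `#L` range is treated as an ordinary language-tagged fence and yields `None`, which mirrors the prompt's distinction between project files and other code.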
## Fixing Diagnostics
1. Make 1-2 attempts at fixing diagnostics, then defer to the user.