edit_file: Let agent specify locations of edit chunks (#32628)
These changes help the agent edit files when `<old_text>` matches more than one location.

First, the agent can specify an optional `<old_text line=XX>` parameter. When this is provided and multiple matches exist, we use this hint to identify the best match.

Second, when there is ambiguity in matches, we now return the agent a more helpful message listing the line numbers of all possible matches.

Together, these changes should reduce the number of misplaced edits and agent confusion.

I have ensured the LLM Worker works with these prompt changes.

Release Notes:

- Agent: Improved locating edits
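The two behaviours described above can be illustrated with a minimal sketch. This is not the code from this commit: the names `Resolution`, `match_lines`, `resolve`, and `ambiguity_message` are hypothetical, the exact-substring matching is a simplification, and the way the tolerance is applied (accept the closest match only if it lies within `tolerance` lines of the hint) is an assumption about how a value like `LINE_HINT_TOLERANCE` might be used.

```rust
/// Outcome of resolving `old_text` against a file (hypothetical type).
#[derive(Debug, PartialEq)]
enum Resolution {
    /// One location was selected (1-based line number).
    Match(usize),
    /// `old_text` does not occur in the file.
    NotFound,
    /// Several locations remain plausible; their lines go back to the agent.
    Ambiguous(Vec<usize>),
}

/// Returns the 1-based line number of every exact occurrence of `old_text`.
fn match_lines(content: &str, old_text: &str) -> Vec<usize> {
    content
        .match_indices(old_text)
        .map(|(offset, _)| content[..offset].matches('\n').count() + 1)
        .collect()
}

/// Picks a location for `old_text`, using the optional line hint only
/// when there is more than one match.
fn resolve(
    content: &str,
    old_text: &str,
    line_hint: Option<usize>,
    tolerance: usize, // assumed meaning: max distance between hint and match
) -> Resolution {
    let lines = match_lines(content, old_text);
    if lines.is_empty() {
        return Resolution::NotFound;
    }
    if lines.len() == 1 {
        return Resolution::Match(lines[0]);
    }
    if let Some(hint) = line_hint {
        // Closest match to the hint; ties resolve to the earlier line.
        let best = lines.iter().copied().min_by_key(|line| line.abs_diff(hint));
        if let Some(best) = best {
            if best.abs_diff(hint) <= tolerance {
                return Resolution::Match(best);
            }
        }
    }
    Resolution::Ambiguous(lines)
}

/// Builds the message returned to the agent (wording is illustrative only).
fn ambiguity_message(lines: &[usize]) -> String {
    let listed: Vec<String> = lines.iter().map(|line| line.to_string()).collect();
    format!(
        "<old_text> matches multiple locations, at lines: {}",
        listed.join(", ")
    )
}
```

Under these assumptions, a larger `tolerance` accepts hints that are further off, which resolves more edits but also raises the chance of picking the wrong location.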
parent e8d495806f
commit 5d293ae8ac
6 changed files with 286 additions and 63 deletions
@@ -438,14 +438,21 @@ fn eval_disable_cursor_blinking() {
 #[test]
 #[cfg_attr(not(feature = "eval"), ignore)]
 fn eval_from_pixels_constructor() {
-    // Results for 2025-05-22
+    // Results for 2025-06-13
     //
+    // The outcome of this evaluation depends heavily on the LINE_HINT_TOLERANCE
+    // value. Higher values improve the pass rate but may sometimes cause
+    // edits to be misapplied. In the context of this eval, this means
+    // the agent might add from_pixels tests in incorrect locations
+    // (e.g., at the beginning of the file), yet the evaluation may still
+    // rate it highly.
+    //
     //  Model                          | Pass rate
     // ============================================
     //
-    //  claude-3.7-sonnet              |
-    //  gemini-2.5-pro-preview-03-25   | 0.94
-    //  gemini-2.5-flash-preview-04-17 |
+    //  claude-4.0-sonnet              | 0.99
+    //  claude-3.7-sonnet              | 0.88
+    //  gemini-2.5-pro-preview-03-25   | 0.96
     //  gpt-4.1                        |
     let input_file_path = "root/canvas.rs";
     let input_file_content = include_str!("evals/fixtures/from_pixels_constructor/before.rs");