agent: Handle attempts to use hallucinated tools (#29946)

This change:

1. Catches attempts to call tools that do not exist. When this happens,
we now send the Agent a message listing the available tools, so it can
recover gracefully. Previously, the thread would stop in a broken state.

Example of a hallucinated call and the message we send back:

![image](https://github.com/user-attachments/assets/92a8f700-b192-4038-8c7e-0a74ca2e0146)

2. Adds evals for hallucinated tool use and imagined edits.
3. Adds the ability to configure a profile name in evals.
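
The recovery path from item 1 can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `Tool`, `ToolDispatch`, and `dispatch_tool` are hypothetical names, and the wording of the message is an assumption. The idea is that a lookup miss produces a message listing the available tools instead of an error that halts the thread.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a tool: takes raw input, returns output.
/// The real tool type is not shown in this excerpt.
type Tool = fn(&str) -> String;

/// Outcome of dispatching a tool call requested by the model.
#[derive(Debug, PartialEq)]
enum ToolDispatch {
    /// The tool exists and produced output.
    Output(String),
    /// The tool does not exist; the message is meant to be sent back
    /// to the model so it can pick a real tool and continue.
    MissingTool(String),
}

/// Looks up `name` in `tools` and runs it, or builds a recovery message
/// that lists every available tool name (sorted, since BTreeMap is ordered).
fn dispatch_tool(tools: &BTreeMap<&str, Tool>, name: &str, input: &str) -> ToolDispatch {
    match tools.get(name) {
        Some(tool) => ToolDispatch::Output(tool(input)),
        None => {
            let available = tools.keys().copied().collect::<Vec<_>>().join(", ");
            ToolDispatch::MissingTool(format!(
                "No tool named `{name}` exists. Available tools: {available}"
            ))
        }
    }
}
```

Returning the miss as a regular value rather than an error keeps the conversation going: the caller can append the message to the thread and let the model retry.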



Release Notes:

- N/A
Oleksiy Syvokon 2025-05-05 22:31:11 +03:00 committed by GitHub
parent 7dfbe0b908
commit 8199664a5a
14 changed files with 111 additions and 0 deletions

@@ -307,9 +307,14 @@ impl ExampleInstance {
std::fs::write(&last_diff_file_path, "")?;
let thread_store = thread_store.await?;
let profile_id = meta.profile_id.clone();
thread_store.update(cx, |thread_store, cx| thread_store.load_profile_by_id(profile_id, cx)).expect("Failed to load profile");
let thread =
thread_store.update(cx, |thread_store, cx| thread_store.create_thread(cx))?;
thread.update(cx, |thread, _cx| {
let mut request_count = 0;
let previous_diff = Rc::new(RefCell::new("".to_string()));