Yehowshua/ZIm - Forgejo: Beyond coding. We Forge.

Author	SHA1	Message	Date
Max Brunsfeld	a994666888	Include full abs paths of worktrees in system prompt (#32725 ) Some MCP servers expose tools that take absolute paths as arguments. To interact with these, the agent needs to know the absolute path to the project directories, not just their names. This PR changes the system prompt to include the full path to each worktree, and updates some tool descriptions to reflect this. Todo: * [x] Run evals, make sure the assistant still understands how to specify paths for tools, now that we include abs paths in the system prompt. Release Notes: - Improved the agent's ability to use MCP tools that require absolute paths to files and directories in the project. --------- Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>	2025-06-15 15:45:26 +02:00
Oleksiy Syvokon	3884de937b	assistant: Partial fix for HTML entities in tools params (#32148 ) This problem seems to be specific to Opus 4. Eval shows improvement from 89% to 97%. Closes: https://github.com/zed-industries/zed/issues/32060 Release Notes: - N/A Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>	2025-06-05 10:36:55 +00:00
Richard Feldman	23fbab15ee	Manual no tool calls (#29745 ) Now instead of the model hallucinating tool calls, we get requests for more context: <img width="620" alt="Screenshot 2025-05-01 at 12 45 49 PM" src="https://github.com/user-attachments/assets/847d5c14-82f6-4234-b85a-8cd2bc7ab11d" /> It still knows how to answer general questions: <img width="624" alt="Screenshot 2025-05-01 at 12 47 44 PM" src="https://github.com/user-attachments/assets/43ab0fc3-4cc8-452f-b26b-474b5d31919f" /> Release Notes: - Fixed the model still trying to do tool calls when no tools selected (e.g. in `Manual` profile). --------- Co-authored-by: Ben <ben@zed.dev> Co-authored-by: Michael <michael@zed.dev>	2025-05-01 16:11:13 -04:00
Jonathan LEI	50ec26c163	Fix user rules ignored by agent (#29754 ) Closes #29753 The template contains an error: `has_default_user_rules` is always undefined and should be `has_user_rules` instead. Release Notes: - Fixed default user rules ignored during prompt building.	2025-05-01 18:22:48 +00:00
Richard Feldman	d7004030b3	Code block evals (#29619 ) Add a targeted eval for code block formatting, and revise the system prompt accordingly. ### Eval before, n=8 <img width="728" alt="eval before" src="https://github.com/user-attachments/assets/552b6146-3d26-4eaa-86f9-9fc36c0cadf2" /> ### Eval after prompt change, n=8 (excluding the new evals, so just testing the prompt change) <img width="717" alt="eval after" src="https://github.com/user-attachments/assets/c78c7a54-4c65-470c-b135-8691584cd73e" /> Release Notes: - N/A	2025-04-29 18:52:09 -04:00
Agus Zubiaga	2508e491d5	agent: Discourage long-running commands (#29627 ) Adds popular examples of long-running commands to system prompt. Unfortunately, I couldn't add an eval example as the new terminal tool no longer works in `eval`. We can look into that tomorrow, but I'm seeing improvements when manually testing this, so I'd like to merge it. <img src="https://github.com/user-attachments/assets/ac24e617-e068-466f-875d-c30e1f2465c4" width=400></img> Release Notes: - agent: Discourage long-running commands	2025-04-29 19:21:16 -03:00
Richard Feldman	2b431d3e9d	Re-add code block formatting instructions (#29574 ) Re-enabled instructions about code block formatting. In practice, the model doesn't seem to use these very often, but there's no negative effect on evals. In a future PR, I'll experiment with adding more evals around the model actually using the code blocks.

2 runs before: (`--repetitions=8`)

```
=================================================================
AGGREGATE
=================================================================
4 examples failed to run!
Average programmatic score: 37%
Average diff score: 66%
Average thread score: 93%
-----------------------------------------------------------------
CUMULATIVE TOOL METRICS
-----------------------------------------------------------------
┌──────────────────────────────┬──────────┬──────────┬──────────┐
│ Tool                         │ Uses     │ Failures │ Rate     │
├──────────────────────────────┼──────────┼──────────┼──────────┤
│edit_file                     │      398 │       53 │      13% │
│terminal                      │       11 │        1 │       9% │
│create_file                   │       40 │        2 │       5% │
│read_file                     │      245 │        8 │       3% │
│find_path                     │       48 │        0 │       0% │
│list_directory                │       13 │        0 │       0% │
│grep                          │      133 │        0 │       0% │
│thinking                      │       18 │        0 │       0% │
│diagnostics                   │      130 │        0 │       0% │
```

```
=================================================================
AGGREGATE
=================================================================
1 examples failed to run!
Average programmatic score: 41%
Average diff score: 68%
Average thread score: 96%
-----------------------------------------------------------------
CUMULATIVE TOOL METRICS
-----------------------------------------------------------------
┌──────────────────────────────┬──────────┬──────────┬──────────┐
│ Tool                         │ Uses     │ Failures │ Rate     │
├──────────────────────────────┼──────────┼──────────┼──────────┤
│fetch                         │        1 │        1 │     100% │
│edit_file                     │      553 │       63 │      11% │
│read_file                     │      349 │        3 │       1% │
│diagnostics                   │      158 │        0 │       0% │
│find_path                     │       70 │        0 │       0% │
│list_directory                │       10 │        0 │       0% │
│thinking                      │       45 │        0 │       0% │
│grep                          │      213 │        0 │       0% │
│create_file                   │       24 │        0 │       0% │
│terminal                      │       17 │        0 │       0% │
└──────────────────────────────┴──────────┴──────────┴──────────┘
```

1 run after this change:

```
=================================================================
AGGREGATE
=================================================================
Average programmatic score: 42%
Average diff score: 74%
Average thread score: 100%
-----------------------------------------------------------------
CUMULATIVE TOOL METRICS
-----------------------------------------------------------------
┌──────────────────────────────┬──────────┬──────────┬──────────┐
│ Tool                         │ Uses     │ Failures │ Rate     │
├──────────────────────────────┼──────────┼──────────┼──────────┤
│edit_file                     │      534 │       92 │      17% │
│read_file                     │      325 │        6 │       2% │
│list_directory                │        6 │        0 │       0% │
│thinking                      │       12 │        0 │       0% │
│create_file                   │       16 │        0 │       0% │
│diagnostics                   │       49 │        0 │       0% │
│grep                          │      234 │        0 │       0% │
│find_path                     │       65 │        0 │       0% │
│terminal                      │       38 │        0 │       0% │
└──────────────────────────────┴──────────┴──────────┴──────────┘
```

Release Notes: - N/A	2025-04-29 10:37:31 -04:00
Oleksiy Syvokon	99df1190a9	agent: Include grep-related instructions in the prompt only if the tool is available (#29536 ) This change updates the system prompt to conditionally include `grep`-related instructions based on whether the `grep` tool is enabled. Implementation details: 1. Add a `has_tool` handlebars helper. 2. Pass the `model` to all locations where the prompt is built. 3. Use `{{#if has_tool "grep"}}` in the system prompt to gate `grep`-specific instructions. Testing: - Unit tests for the `hasTool` helper. - Unit tests to verify that `grep`-related instructions are included / omitted from the prompt as appropriate. - Manual agent evaluation: - Setup: Asked the Agent "List all impls of MyTrait in the project" using a custom "No tools" profile (all tools disabled). - Before the change: The Agent attempted to call `grep`, encountered an error, then realized the tool was unavailable. - After the change: The Agent immediately asked to enable a search tool. Note: in principle, `grep`/`read_file` tool descriptions alone might be enough, but to confirm this we need more evaluation. If it turns out to be true, we'll be able to remove grep-specific instructions from the system prompt and undo this change. Release Notes: - N/A	2025-04-28 19:47:40 +00:00
Agus Zubiaga	8b5835de17	agent: Improve initial file search quality (#29317 ) This PR significantly improves the quality of the initial file search that occurs when the model doesn't yet know the full path to a file it needs to read/edit. Previously, the assertions in file_search often failed on main as the model attempted to guess full file paths. On this branch, it reliably calls `find_path` (previously `path_search`) before reading files. After getting the model to find paths first, I noticed it would try using `grep` instead of `path_search`. This motivated renaming `path_search` to `find_path` (continuing the analogy to unix commands) and adding system prompt instructions about proper tool selection. Note: I know the command is just called `find`, but that seemed too general. In my eval runs, the `file_search` example improved from 40% ± 10% to 98% ± 2%. The only assertion I'm seeing occasionally fail is "glob starts with `**` or project". We can probably add some instructions in that regard. Release Notes: - N/A	2025-04-23 21:24:41 -03:00
Michael Sloan	7aa0fa1543	Add ability to attach rules as context (#29109 ) Release Notes: - agent: Added support for adding rules as context.	2025-04-21 20:16:51 +00:00
Nathan Sobo	107d8ca483	Rename regex search tool to grep and accept an include glob pattern (#29100 ) This PR renames the `regex_search` tool to `grep` because I think it conveys more meaning to the model, the idea of searching the filesystem with a regular expression. It's also one word and the model seems to be using it effectively after some additional prompt tuning. It also takes an include pattern to filter on the specific files we try to search. I'd like to encourage the model to scope its searches more aggressively, as in my testing, I'm only seeing it filter on file extension. Release Notes: - N/A	2025-04-20 00:53:30 +00:00
Nathan Sobo	bab28560ef	Systematically optimize agentic editing performance (#28961 ) Now that we've established a proper eval in tree, this PR reboots our agent loop back to a set of minimal tools and simpler prompts. We should aim to get this branch feeling subjectively competitive with what's on main and then merge it, and build from there. Let's invest in our eval and use it to drive better performance of the agent loop. How you can help: Pick an example, and then make the outcome faster or better. It's fine to even use your own subjective judgment, as our evaluation criteria likely need tuning as well at this point. Focus on making the agent work better in your own subjective experience first. Let's focus on simple/practical improvements to make this thing work better, then determine how we can craft our judgment criteria to lock those improvements in. Release Notes: - N/A --------- Co-authored-by: Max <max@zed.dev> Co-authored-by: Antonio <antonio@zed.dev> Co-authored-by: Agus <agus@zed.dev> Co-authored-by: Richard <richard@zed.dev> Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com> Co-authored-by: Antonio Scandurra <me@as-cii.com> Co-authored-by: Michael Sloan <mgsloan@gmail.com>	2025-04-19 02:47:59 +00:00
Michael Sloan	502a0f6535	agent: Use default prompts from prompt library in system prompt (#28915 ) Related to #28490. - Default prompts from the prompt library are now included as "user rules" in the system prompt. - Presence of these user rules is shown at the beginning of the thread in the UI. - Now uses an `Entity<PromptStore>` instead of an `Arc<PromptStore>`. Motivation for this is emitting a `PromptsUpdatedEvent`. - Now disallows concurrent reloading of the system prompt. Before this change it was possible for reloads to race. Release Notes: - agent: Added support for including default prompts from the Prompt Library as "user rules" in the system prompt. --------- Co-authored-by: Danilo Leal <daniloleal09@gmail.com>	2025-04-18 09:32:35 -06:00
Agus Zubiaga	90bcde116f	agent: Use current shell (#28470 ) Release Notes: - agent: Replace `bash` tool with `terminal` tool which uses the current shell --------- Co-authored-by: Bennet <bennet@zed.dev> Co-authored-by: Antonio <antonio@zed.dev>	2025-04-09 23:38:36 -06:00
Michael Sloan	301fc7cd7b	Pull out plain rules file loading code into a new `agent_rules` crate (#28383 ) Also includes renames for the rules file that is templated into the system prompt. Release Notes: - N/A	2025-04-09 01:31:56 +00:00
Joseph T. Lyons	763cc6dba3	Tell the model not to act on TODO type comments (#28358 ) Release Notes: - Adjusted system prompt to direct it to never act on TODO-type comments it encounters, unless the user directly asked it to do so or they relate to the current task at hand.	2025-04-08 21:00:02 +00:00
Joseph T. Lyons	ca8f6e8a3f	Tell the model not to remove tests (#28349 ) Release Notes: - Adjusted system prompt to direct it to never remove tests as a way to have the test suite pass, unless the user directly asks for test removal.	2025-04-08 19:26:43 +00:00
Richard Feldman	1c85901440	Tell the model not to create .bak files (#28244 ) Release Notes: - Adjusted system prompt to avoid having the agent create backup files unnecessarily.	2025-04-08 18:45:35 +00:00
Richard Feldman	bfe08e449f	Tell the system prompt not to write incomplete code (#28245 ) Sometimes agents do this. I've had some success responding by telling it not to do this, so trying out having it in the system prompt. Release Notes: - Adjusted the system prompt to avoid incomplete code generation.	2025-04-07 20:59:52 -04:00
Richard Feldman	aeea3645ff	Fix typo in system prompt (#28246 ) Release Notes: - N/A	2025-04-07 17:29:56 +00:00
Richard Feldman	fa90b3a986	Link to cited code blocks (#28217 ) <img width="612" alt="Screenshot 2025-04-06 at 9 59 41 PM" src="https://github.com/user-attachments/assets/3a996b4a-ef5c-4ca6-bd16-3b180b364a3a" /> Release Notes: - Agent panel now shows links to relevant source code files above code blocks.	2025-04-07 12:01:34 -03:00
Richard Feldman	ac9e2f30bb	Try to improve behavior when agent is stuck (#28169 ) Currently, it's pretty common that when the agent gets stuck, it deletes whatever it's stuck on and replaces it with a TODO comment, then cheerfully reports that it has "simplified" the implementation. This is worse than leaving the broken code, because at least a human could take over and try to get it across the finish line. This system prompt adjustment attempts to make the agent do something more useful when in this situation: report that it's stuck, explain why it's stuck, and ask the user what to do. Release Notes: - N/A	2025-04-05 23:52:28 -04:00
Agus Zubiaga	9c4e61eae1	agent: Add more guidelines to system prompt (#27927 ) Release Notes: - N/A	2025-04-02 12:41:51 -03:00
Agus Zubiaga	75689c1c88	assistant2: System prompt response guidance (#27782 ) Adds some guidance for the assistant on how to respond to tool results and other interactions Release Notes: - N/A Co-authored-by: Richard Feldman <richard@zed.dev>	2025-03-31 14:13:02 +00:00
Agus Zubiaga	ca6be249dc	assistant2: Change system prompt to discourage doom loops (#27781 ) Ask assistant to limit diagnostic fix attempts to 3 max Release Notes: - N/A Co-authored-by: Richard Feldman <richard@zed.dev>	2025-03-31 14:03:47 +00:00
Agus Zubiaga	130abc8998	assistant2: Encourage diagnostics check (#27510 ) Release Notes: - N/A	2025-03-26 13:42:09 -03:00
Richard Feldman	4a5f89aded	Make system prompt be more explicit about root paths (#27383 ) ## Before <img width="627" alt="Screenshot 2025-03-24 at 12 55 15 PM" src="https://github.com/user-attachments/assets/349d7025-e65e-4107-86ae-45eb321003b3" /> ## After <img width="627" alt="Screenshot 2025-03-24 at 12 52 04 PM" src="https://github.com/user-attachments/assets/0e8c061a-11c5-4d60-a694-55575b6c8f5e" /> Release Notes: - N/A	2025-03-24 14:00:16 -04:00
Michael Sloan	1180b6fbc7	Initial support for AI assistant rules files (#27168 ) Release Notes: - N/A --------- Co-authored-by: Danilo <danilo@zed.dev> Co-authored-by: Nathan <nathan@zed.dev> Co-authored-by: Thomas <thomas@zed.dev>	2025-03-20 08:30:04 +00:00
Antonio Scandurra	70c973f6c3	Fix issues in `EditFilesTool`, `ListDirectoryTool` and `BashTool` (#26647 ) Release Notes: - N/A	2025-03-13 09:41:27 +00:00
Antonio Scandurra	41eb586ec8	Remove `list_worktrees` and use relative paths instead (#26546 ) Release Notes: - N/A	2025-03-12 15:06:04 +00:00

30 commits
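
The conditional gating described in #29536 (include `grep`-related instructions only when the `grep` tool is enabled) can be sketched roughly as follows. This is a plain-Python illustration of the idea, not the actual implementation, which per the commit message uses a `has_tool` Handlebars helper inside the system prompt template. The prompt strings, the `build_system_prompt` function, and the shape of the tool set are all hypothetical.

```python
# Hypothetical prompt fragments; the real system prompt text is not shown in this log.
BASE_INSTRUCTIONS = "You are a coding agent working in the user's project."
GREP_INSTRUCTIONS = "Use the `grep` tool to search file contents instead of guessing paths."


def has_tool(enabled_tools, name):
    """Return True when the named tool is in the enabled set (assumed to be a set of names)."""
    return name in enabled_tools


def build_system_prompt(enabled_tools):
    """Assemble the prompt, adding tool-specific sections only for enabled tools."""
    sections = [BASE_INSTRUCTIONS]
    if has_tool(enabled_tools, "grep"):
        sections.append(GREP_INSTRUCTIONS)
    return "\n\n".join(sections)


if __name__ == "__main__":
    # With grep enabled, the grep guidance is part of the prompt.
    print(build_system_prompt({"grep", "read_file"}))
    # With a "no tools" profile, the grep guidance is omitted entirely.
    print(build_system_prompt(set()))
```

Gating the instructions this way means a profile with all tools disabled never sees a mention of `grep`, which matches the behavior change reported in the commit: the agent asks for a search tool instead of attempting a call that fails.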