ZIm/crates/eval/examples
Nathan Sobo bab28560ef
Systematically optimize agentic editing performance (#28961)
Now that we've established a proper eval in tree, this PR reboots
our agent loop back to a set of minimal tools and simpler prompts. We
should aim to get this branch feeling subjectively competitive with
what's on main and then merge it, and build from there.

Let's invest in our eval and use it to drive better performance of the
agent loop. How you can help: Pick an example, and then make the outcome
faster or better. It's fine to even use your own subjective judgment, as
our evaluation criteria likely need tuning as well at this point. Focus
on making the agent work better in your own subjective experience first.
Let's focus on simple/practical improvements to make this thing work
better, then determine how we can craft our judgment criteria to lock
those improvements in.

Release Notes:

- N/A

---------

Co-authored-by: Max <max@zed.dev>
Co-authored-by: Antonio <antonio@zed.dev>
Co-authored-by: Agus <agus@zed.dev>
Co-authored-by: Richard <richard@zed.dev>
Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
Co-authored-by: Antonio Scandurra <me@as-cii.com>
Co-authored-by: Michael Sloan <mgsloan@gmail.com>
2025-04-19 02:47:59 +00:00
add_arp_protocol_support Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
auth_session_management Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
buffer_string_input_support Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
checkpoint_stability Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
dd_iaptic_mcp_server_integration Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
debian_image_builder Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
docs_restructure Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
email_verification_refactor Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
exif_rotation_support Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
expand_laravel_php_support Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
find_and_replace_diff_card Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
finnish_translation Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
language_model_file_support Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
lhs_join_update_callbacks Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
libdevice_symbol_reexport Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
license_management Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
metal_i64_support Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
metrics_data_size_updates Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
nan_diff_handling Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
never_type_workaround Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
optimizer_schema_refactor Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
rate_limit_endpoints Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
replace_hold_with_drain_on_exit Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
request_to_axios_migration Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
restore_version_api_support Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
runtime_script_refactor Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
standardized_docker_dependency_checks Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
table_metrics_sorting Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
tax_id_validation Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
test_infrastructure Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
time_detail_merge_update Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
tool_response_handling Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
toolbar_endpoints Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
virtio_block_request_refactor Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
war_and_uri_corrections Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00
window_title_support Systematically optimize agentic editing performance (#28961) 2025-04-19 02:47:59 +00:00