Commit graph

6 commits

Author SHA1 Message Date
Marshall Bowers
ce93961fe0
agent: Add "max mode" toggle (#29549)
This PR adds a "max mode" toggle to the Agent panel, for models that
support it.

Only visible to folks in the `new-billing` feature flag.

Icon is just a placeholder.

Release Notes:

- N/A
2025-04-28 16:50:47 +00:00
Michael Sloan
609c528ceb
Refactor markdown formatting utilities to avoid building intermediate strings (#29511)
These were nearly always used when using `format!` / `write!` etc, so it
makes sense to not have an intermediate `String`.

Release Notes:

- N/A
2025-04-27 19:04:51 +00:00
Michael Sloan
17ecf94f6f
Restructure agent context (#29233)
Simplifies the data structures involved in agent context by removing
caching and limiting the use of ContextId:

* `AssistantContext` enum is now like an ID / handle to context that
does not need to be updated. `ContextId` still exists but is only used
for generating unique `ElementId`.
* `ContextStore` has a `IndexMap<ContextSetEntry>`. Only need to keep a
`HashSet<ThreadId>` consistent with it. `ContextSetEntry` is a newtype
wrapper around `AssistantContext` which implements eq / hash on a subset
of fields.
* Thread `Message` directly stores its context.

Fixes the following bugs:

* If a context entry is removed from the strip and added again, it was
reincluded in the next message.
* Clicking file context in the thread that has been removed from the
context strip didn't jump to the file.
* Refresh of directory context didn't reflect added / removed files.
* Deleted directories would remain in the message editor context strip.
* Token counting requests didn't include image context.
* File, directory, and symbol context deduplication relied on
`ProjectPath` for identity, and so didn't handle renames.
* Symbol context line numbers didn't update when shifted

Known bugs (not fixed):

* Deleting a directory causes it to disappear from messages in threads.
Fixing this in a nice way is tricky. One easy fix is to store the
original path and show that on deletion. It's weird that deletion would
cause the name to "revert", though. Another possibility would be to
snapshot context metadata on add (ala `AddedContext`), and keep that
around despite deletion.

Release Notes:

- N/A
2025-04-24 21:29:33 +00:00
Oleksiy Syvokon
f69aeb6311
Do not log unfinished tools use that are in the middle of streaming (#29275)
Release Notes:

- N/A
2025-04-23 13:19:01 +00:00
Oleksiy Syvokon
76a78b550b
eval: Write JSON-serialized thread (#29271)
This adds `last.message.json` file that contains the full request plus
response (serialized as a message from assistant for consistency with
other messages).

Motivation: to capture more info and to make analysis of finished runs
easier.

Release Notes:

- N/A
2025-04-23 15:22:19 +03:00
Agus Zubiaga
ce1a674eba
eval: Fine-grained assertions (#29246)
- Support programmatic examples
([example](17feb260a0/crates/eval/src/examples/file_search.rs))
- Combine data-driven example declarations into a single `.toml` file
([example](17feb260a0/crates/eval/src/examples/find_and_replace_diff_card.toml))
- Run judge on individual assertions (previously called "criteria")
- Report judge and programmatic assertions in one combined table

Note: We still need to work on concept naming 

<img width=400
src="https://github.com/user-attachments/assets/fc719c93-467f-412b-8d47-68821bd8a5f5">

Release Notes:

- N/A

---------

Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
Co-authored-by: Thomas Mickley-Doyle <tmickleydoyle@gmail.com>
2025-04-22 23:58:58 -03:00