Commit graph

7 commits

Author SHA1 Message Date
Cole Miller
5b61b8c8ed
agent: Fix crash with pathological fetch output (#34253)
Closes #34029

The crash is due to a stack overflow in our `html_to_markdown`
conversion; I've added a maximum depth of 200 for the recursion in that
crate to guard against this kind of thing.

Separately, we were treating all content-types other than `text/plain`
and `application/json` as HTML; I've changed this to only treat
`text/html` and `application/xhtml+xml` as HTML, and fall back to
plaintext. (In the original crash, the content-type was
`application/octet-stream`.)

Release Notes:

- agent: Fixed a potential crash when fetching large non-HTML files.
2025-07-11 21:01:09 -04:00
Michael Sloan
b93cee8d27
Use static LazyLocks for all constant regexes (#22225)
Release Notes:

- N/A
2024-12-19 02:20:35 +00:00
Piotr Osiewicz
e6c1c51b37
chore: Fix several style lints (#17488)
It's not comprehensive enough to start linting on `style` group, but
hey, it's a start.

Release Notes:

- N/A
2024-09-06 11:58:39 +02:00
Marshall Bowers
6fa347dff7
Move rustdoc-related code to rustdoc crate (#12945)
This PR moves the rustdoc-related code out of `html_to_markdown` and
into the `rustdoc` crate.

Release Notes:

- N/A
2024-06-12 15:53:05 -04:00
Marshall Bowers
8ccd2a0c99
Add tag handler for collecting crate items from rustdoc output (#12903)
This PR adds a tag handler for collecting crate items from rustdoc's
HTML output.

This will serve as the foundation for getting more insight into a
crate's contents.

Release Notes:

- N/A
2024-06-11 15:56:37 -04:00
Marshall Bowers
834089feb1
Handle Wikipedia code blocks in /fetch command (#12780)
This PR extends the `/fetch` command with support for Wikipedia code
blocks.

Release Notes:

- N/A
2024-06-07 12:54:33 -04:00
Marshall Bowers
2d9479667f
Make HTML to Markdown conversion more pluggable (#12653)
This PR overhauls the HTML to Markdown conversion functionality in order
to make it more pluggable. This will ultimately allow for supporting a
variety of different HTML input structures (both natively and via
extensions).

As part of this, the `rustdoc_to_markdown` crate has been renamed to
`html_to_markdown`.

The `MarkdownWriter` now accepts a list of trait objects that can be
used to drive the conversion of the HTML into Markdown. Right now we
have some generic handler implementations for going from plain HTML
elements to their Markdown equivalents, as well as some rustdoc-specific
ones.

Release Notes:

- N/A
2024-06-04 16:14:26 -04:00