Closes#34029
The crash is due to a stack overflow in our `html_to_markdown`
conversion; I've added a maximum depth of 200 for the recursion in that
crate to guard against this kind of thing.
Separately, we were treating all content-types other than `text/plain`
and `application/json` as HTML; I've changed this to only treat
`text/html` and `application/xhtml+xml` as HTML, and fall back to
plaintext. (In the original crash, the content-type was
`application/octet-stream`.)
Release Notes:
- agent: Fixed a potential crash when fetching large non-HTML files.
This PR adds a tag handler for collecting crate items from rustdoc's
HTML output.
This will serve as the foundation for getting more insight into a
crate's contents.
Release Notes:
- N/A
This PR overhauls the HTML to Markdown conversion functionality in order
to make it more pluggable. This will ultimately allow for supporting a
variety of different HTML input structures (both natively and via
extensions).
As part of this, the `rustdoc_to_markdown` crate has been renamed to
`html_to_markdown`.
The `MarkdownWriter` now accepts a list of trait objects that can be
used to drive the conversion of the HTML into Markdown. Right now we
have some generic handler implementations for going from plain HTML
elements to their Markdown equivalents, as well as some rustdoc-specific
ones.
Release Notes:
- N/A