This PR overhauls the HTML to Markdown conversion functionality in order
to make it more pluggable. This will ultimately allow for supporting a
variety of different HTML input structures (both natively and via
extensions).
As part of this, the `rustdoc_to_markdown` crate has been renamed to
`html_to_markdown`.
The `MarkdownWriter` now accepts a list of trait objects that can be
used to drive the conversion of the HTML into Markdown. Right now we
have some generic handler implementations for going from plain HTML
elements to their Markdown equivalents, as well as some rustdoc-specific
ones.
Release Notes:
- N/A
This PR adds a new `/fetch` slash command to the Assistant for fetching
the content of an arbitrary URL as Markdown.
Currently it's just using the same HTML to Markdown conversion that
`/rustdoc` uses, but I'll be working to refine the output to be more
widely useful.
Release Notes:
- N/A
This PR improves `rustdoc_to_markdown`'s paragraph handling to produce
better output.
Specifically, there should now be fewer instances where a space is
missing between words as the result of line breaks in the source HTML.
Release Notes:
- N/A
This PR fixes an issue in `rustdoc_to_markdown` with code blocks being
trimmed incorrectly.
We were erroneously popping from the current element stack even if we
didn't push an element onto the stack.
Added test coverage for this case as well, so we don't regress.
Release Notes:
- N/A
This PR adds a `/rustdoc` slash command for retrieving and inserting
rustdoc docs into the Assistant.
Right now the command accepts the crate name as an argument and will
return the top-level docs from `docs.rs`.
Release Notes:
- N/A
This PR adds a new crate for converting rustdoc output to Markdown.
We're leveraging Servo's `html5ever` to parse the Markdown content, and
then walking the DOM nodes to convert it to a Markdown string.
The Markdown output will be continued to be refined, but it's in a place
where it should be reasonable.
Release Notes:
- N/A