ZIm/crates/http_client/src/async_body.rs
tiagoq 56b99f49fd
bedrock: Fix remaining streaming delays (#33931)
Closes #26030

*Note: This is my first contribution to Zed*

This addresses a second streaming bottleneck in Bedrock that remained
after the initial fix in #28281 (released in preview 194).

The issue is in the mechanism used to convert Zed's internal `AsyncBody`
into the `SdkBody` expected by the Bedrock language provider. We are
using a non-streaming converter that buffers responses.

**How the fix works:**
The AWS SDK provides streaming-compatible converters to create `SdkBody`
instances, but these require the input body to implement the `Body`
trait from the `http-body` crate.

This PR enables streaming by implementing the required trait and
switching to the streaming-compatible converter.

**Changes (2 commits):**

 * 1st Commit - **Implement http-body Body trait for AsyncBody:**
   - Add `http-body = 1.0` dependency (already an indirect dependency)
   - Implement the `Body` trait for our existing `AsyncBody` type
- Uses `poll_frame` to read data chunks asynchronously, preserving
streaming behavior

 * 2nd Commit - **Use streaming-compatible AWS SDK converter:**
- Create `SdkBody` using `SdkBody::from_body_1_x()` with the new `Body`
trait implementation

**Details/FAQ:**

**Q: Why add another dependency?**
A: We tried to avoid adding a dependency, but the AWS SDK requires the
`Body` trait and `http-body` is where it's defined. The crate is already
an indirect dependency, making this a reasonable solution.

**Q: Why modify the shared `http_client` crate instead of just
`aws_bedrock_client`?**
A: We considered implementing the `Body` trait on a wrapper in
`aws_bedrock_client`, but since `AsyncBody` already uses `http` crate
types, extending support to the companion `http-body` crate seems
reasonable and may benefit other integrations.

**Q: How was this bottleneck discovered?**
A: After @5herlocked's initial streaming fix in #28281, I tested preview
194 and noticed streaming still had issues. I found a way to reproduce
the problem and chatted with @5herlocked about it. He immediately
pinpointed the exact location where the issue was occurring, his
diagnosis made this fix possible.

**Q: How does this relate to the previous fix?**
A: #28281 fixed buffering issues higher in the stack, but unfortunately
there was another bottleneck lower-down in the aws-http-client. This PR
addresses that separate buffering issue.

**Q: Does this use zero-copy or one-copy?**
A: The `Body` implementation includes one copy. Someone more
knowledgeable might be able to achieve a zero-copy approach, but we
opted for a conservative approach. The performance impact should not be
perceptible in typical usage.

**Testing:**
Confirmed that Bedrock streaming now works without buffering delays in a
local build.


Release Notes:

- Improved Bedrock streaming by eliminating response buffering delays

---------

Co-authored-by: Marshall Bowers <git@maxdeviant.com>
2025-07-22 11:55:24 -04:00

138 lines
3.5 KiB
Rust

use std::{
io::{Cursor, Read},
pin::Pin,
task::Poll,
};
use bytes::Bytes;
use futures::AsyncRead;
use http_body::{Body, Frame};
/// Based on the implementation of AsyncBody in
/// <https://github.com/sagebind/isahc/blob/5c533f1ef4d6bdf1fd291b5103c22110f41d0bf0/src/body/mod.rs>.
pub struct AsyncBody(pub Inner);
pub enum Inner {
/// An empty body.
Empty,
/// A body stored in memory.
Bytes(std::io::Cursor<Bytes>),
/// An asynchronous reader.
AsyncReader(Pin<Box<dyn futures::AsyncRead + Send + Sync>>),
}
impl AsyncBody {
/// Create a new empty body.
///
/// An empty body represents the *absence* of a body, which is semantically
/// different than the presence of a body of zero length.
pub fn empty() -> Self {
Self(Inner::Empty)
}
/// Create a streaming body that reads from the given reader.
pub fn from_reader<R>(read: R) -> Self
where
R: AsyncRead + Send + Sync + 'static,
{
Self(Inner::AsyncReader(Box::pin(read)))
}
pub fn from_bytes(bytes: Bytes) -> Self {
Self(Inner::Bytes(Cursor::new(bytes.clone())))
}
}
impl Default for AsyncBody {
fn default() -> Self {
Self(Inner::Empty)
}
}
impl From<()> for AsyncBody {
fn from(_: ()) -> Self {
Self(Inner::Empty)
}
}
impl From<Bytes> for AsyncBody {
fn from(bytes: Bytes) -> Self {
Self::from_bytes(bytes)
}
}
impl From<Vec<u8>> for AsyncBody {
fn from(body: Vec<u8>) -> Self {
Self::from_bytes(body.into())
}
}
impl From<String> for AsyncBody {
fn from(body: String) -> Self {
Self::from_bytes(body.into())
}
}
impl From<&'static [u8]> for AsyncBody {
#[inline]
fn from(s: &'static [u8]) -> Self {
Self::from_bytes(Bytes::from_static(s))
}
}
impl From<&'static str> for AsyncBody {
#[inline]
fn from(s: &'static str) -> Self {
Self::from_bytes(Bytes::from_static(s.as_bytes()))
}
}
impl<T: Into<Self>> From<Option<T>> for AsyncBody {
fn from(body: Option<T>) -> Self {
match body {
Some(body) => body.into(),
None => Self::empty(),
}
}
}
impl futures::AsyncRead for AsyncBody {
fn poll_read(
self: Pin<&mut Self>,
cx: &mut std::task::Context<'_>,
buf: &mut [u8],
) -> std::task::Poll<std::io::Result<usize>> {
// SAFETY: Standard Enum pin projection
let inner = unsafe { &mut self.get_unchecked_mut().0 };
match inner {
Inner::Empty => Poll::Ready(Ok(0)),
// Blocking call is over an in-memory buffer
Inner::Bytes(cursor) => Poll::Ready(cursor.read(buf)),
Inner::AsyncReader(async_reader) => {
AsyncRead::poll_read(async_reader.as_mut(), cx, buf)
}
}
}
}
impl Body for AsyncBody {
type Data = Bytes;
type Error = std::io::Error;
fn poll_frame(
mut self: Pin<&mut Self>,
cx: &mut std::task::Context<'_>,
) -> Poll<Option<Result<Frame<Self::Data>, Self::Error>>> {
let mut buffer = vec![0; 8192];
match AsyncRead::poll_read(self.as_mut(), cx, &mut buffer) {
Poll::Ready(Ok(0)) => Poll::Ready(None),
Poll::Ready(Ok(n)) => {
let data = Bytes::copy_from_slice(&buffer[..n]);
Poll::Ready(Some(Ok(Frame::data(data))))
}
Poll::Ready(Err(e)) => Poll::Ready(Some(Err(e))),
Poll::Pending => Poll::Pending,
}
}
}