Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Additional BufReader API for incremental buffer filling #422

Closed
the8472 opened this issue Aug 6, 2024 · 1 comment
Closed

Additional BufReader API for incremental buffer filling #422

the8472 opened this issue Aug 6, 2024 · 1 comment
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api

Comments

@the8472
Copy link
Member

the8472 commented Aug 6, 2024

Proposal

Problem statement

Decoding packetized variable-length data on a full-duplex protocol can require fine control over the read operations.
Naively attempting to fill a buffer with repeated reads can deadlock if the remote end is waiting on a reply.
But a single read can result in short reads.
On the other hand one might want to try more than one message to amortize syscall overheads which can result in an incomplete message at the end of a buffer.

BufReader::peek is being added and this covers some cases.

But it does not handle end-of-message/-parser based decoding well where one only wants to make read syscalls until the end of the current packet is reached without knowing the length in advance. E.g. HTTP headers or JSON-lines. This was previously requested in #409

The current peek implementation backshifts eagerly - which can be undesirable when there is a large partial message in the buffer - and can issue large reads, which can make the next backshift more expensive.

Motivating examples or use cases

Solution sketch

impl<R: Read> BufReader<Read> {
  /// Will read at most `n` additional bytes. Ok(0) does not necessarily indicate EOF.
  /// Issues at most 1 read call to the underlying reader.
  fn fill_more_up_to(&mut self, upper_limit: usize) -> Result<usize>
  /// move any existing data to the beginning of the internal buffer
  fn backshift(&mut self)
  /// indicates how much room there is for further reads, which is not the same as `capacity`
  fn tail_capacity(&self) -> usize
}

Alternatives

rust-lang/rust#57000 (comment) argues that BufReader is not meant to be a general "read to buffer" abstraction. So instead of attempting to grow BufReader's API surface another option would be to create a detached buffer API which can then be filled from an (unbuffered) reader and backshifted and refilled as needed. A buffer API can be more flexible since it could also be used for writing and other IO APIs.

It'd be a {buf: Box<[MaybeUninit<u8>]>, live: start..end}, similar to java's ByteBuffer. Internally we already have std::io::buffered::bufreader::Buffer. Interaction with BorrowedBuf would be TBD.

Links and related work

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

  • We think this problem seems worth solving, and the standard library might be the right place to solve it.
  • We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

  • We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
  • We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.
@the8472 the8472 added T-libs-api api-change-proposal A proposal to add or alter unstable APIs in the standard libraries labels Aug 6, 2024
@Amanieu
Copy link
Member

Amanieu commented Sep 3, 2024

We discussed this extensively in the libs-api meeting. There was disagreement as to whether it makes sense to expose this much low-level access to the buffers in the standard library and whether this would make more sense as an external crate. There are many possible implementations for a buffer and we want to keep the flexibility of changing our implementation in the future for std's BufReader and BufWriter.

@Amanieu Amanieu closed this as completed Sep 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api
Projects
None yet
Development

No branches or pull requests

2 participants