Type-safe read-only views #12

ptitjes · 2022-07-28T10:45:54Z

In most of my use-cases, I need a read-only view of buffers (both Buffer and CompositeBuffer) with:

absolute read methods (i.e. random-access read)
relative read methods (i.e. sequential read)

It seems from e5l/view-and-refcount and whyoleg/resources that we are heading toward reference-counted underlying storage, so the following design proposal builds on that idea.

As a first step, I would propose that we extract a ReadBuffer interface from Buffer containing:

var readIndex: Int relative read index,
a new val readLimit: Int exclusive read limit index (always equal to writeIndex in Buffer's default implementation),
fun read*() relative read methods,
fun get*At(...) absolute read methods,
fun copyTo*(...) relative and absolute bulk read methods,
a new fun duplicate(): ReadBuffer method that returns a duplicate read buffer (with shared underlying storage but independent readIndex),
new fun slice(...): ReadBuffer methods that return sliced duplicates.

Additionally, we would add the following to the Buffer interface:

a new val writeLimit: Int for "symmetry" (always equal to capacity)
an override fun duplicate(): Buffer method that returns a duplicate buffer (with shared underlying storage but independent readIndex and writeIndex).
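The proposed split can be sketched as follows. This is only an illustration under assumptions: ArrayBuffer, the single-byte read/write methods, and the ByteArray backing are hypothetical stand-ins, and slice and copyTo are left out for brevity. None of this is the actual ktor-io API.

```kotlin
// Hypothetical sketch of the proposal; names are illustrative, not a released API.
interface ReadBuffer {
    var readIndex: Int               // relative read position
    val readLimit: Int               // exclusive upper bound for reads
    fun readByte(): Byte             // relative read, advances readIndex
    fun getByteAt(index: Int): Byte  // absolute read, leaves readIndex untouched
    fun duplicate(): ReadBuffer      // shared storage, independent readIndex
}

interface Buffer : ReadBuffer {
    var writeIndex: Int
    val capacity: Int
    val writeLimit: Int get() = capacity  // "symmetry" with readLimit
    fun writeByte(value: Byte)
    override fun duplicate(): Buffer      // covariant return type
}

// Assumed ByteArray-backed implementation, only to make the sketch concrete.
class ArrayBuffer(
    private val storage: ByteArray,
    override var readIndex: Int = 0,
    override var writeIndex: Int = 0,
) : Buffer {
    override val capacity: Int get() = storage.size
    override val readLimit: Int get() = writeIndex

    override fun readByte(): Byte {
        check(readIndex < readLimit) { "No readable bytes left" }
        return storage[readIndex++]
    }

    override fun getByteAt(index: Int): Byte {
        require(index in 0 until readLimit) { "Index $index is outside the readable range" }
        return storage[index]
    }

    override fun writeByte(value: Byte) {
        check(writeIndex < writeLimit) { "Buffer is full" }
        storage[writeIndex++] = value
    }

    // The duplicate shares the same ByteArray but carries its own indices.
    override fun duplicate(): Buffer = ArrayBuffer(storage, readIndex, writeIndex)
}
```

A duplicate taken after two writes starts with the same readIndex as the original, and reading from one does not advance the other.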


whyoleg · 2022-07-28T11:19:45Z

While I think ReadBuffer looks reasonable, can you provide a use case for the duplicate/slice methods? The main problem with anything like duplicate is that we can spawn several objects that share the same underlying memory segment. After that, we can, for example, send some of the duplicates to a channel, to another function, or anywhere else, while another duplicate that can be mutated still exists somewhere else, and such situations are really hard to debug. That is why I think something like takeHead, or anything else that splits a buffer into non-intersecting views, is much better.
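The hazard described above can be reproduced with a toy model. The View class below is hypothetical and exists only to show two duplicates observing the same storage; it is not part of any proposed API.

```kotlin
// Hypothetical minimal view type, used only to illustrate the aliasing hazard.
class View(private val storage: ByteArray, var readIndex: Int = 0) {
    fun readByte(): Byte = storage[readIndex++]
    fun setByteAt(index: Int, value: Byte) { storage[index] = value }
    fun duplicate(): View = View(storage, readIndex)  // shares storage, copies the index
}

// Returns the first byte seen through the "sent" view before and after
// a mutation performed through the original view.
fun aliasingDemo(): Pair<Byte, Byte> {
    val original = View(byteArrayOf(1, 2, 3))
    val sent = original.duplicate()           // imagine this one was handed to another component
    val before = sent.duplicate().readByte()  // peek without advancing `sent`
    original.setByteAt(0, 42)                 // mutation through the other duplicate
    val after = sent.readByte()               // `sent` observes the change
    return before to after
}
```

With non-intersecting views, as in takeHead, the second component could never see the write, because no other object would still cover index 0.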

ptitjes · 2022-07-28T11:52:55Z

@whyoleg Indeed, all my use-cases are always: write something to the buffer, and then make read-only views (possibly used concurrently) of the already-written part of the buffer. So I would prefer a contract similar to takeHead (not convinced by the name though) that mutates the current buffer so that it does not intersect the returned new head buffer.

We could then do:

val source: Source = // ...
val buffer: Buffer = // ...

val read = source.readFully(buffer) // Or something similar
val head: ReadBuffer = buffer.takeHead(read)
// Pass head to some external api

Also, would it be possible to have an absolute variant of takeHead that takes an additional index: Int parameter (with a check such that this.startIndex + index < this.writeIndex)? Maybe by changing the current takeHead signature to:

fun takeHead(startIndex: Int = readIndex, endIndex: Int = writeIndex): Buffer

whyoleg · 2022-07-28T12:07:30Z

I'm not sure it will be possible to have such an absolute variant: it would split the buffer into 3 parts (from 0 to start, from start to end, from end to capacity) and return the center part, so I'm not sure it would be convenient for anyone.

ptitjes · 2022-07-28T12:14:53Z

Well, then I could always do the following but that is not very elegant IMO:

val head = buffer.takeHead(endIndex)
head.takeHead(startIndex).close()
// Use head and buffer...
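The three-way split and the two-step workaround above can be modelled on index ranges alone. In this sketch Region and takeRange are hypothetical names, indices are assumed to be absolute positions in the underlying storage, and no memory is actually managed.

```kotlin
// Hypothetical model: a buffer region described only by its index range [start, end).
data class Region(val start: Int, val end: Int) {
    init { require(start <= end) { "start must not exceed end" } }
}

// Splits a region at an absolute index into a head and a remainder that do not overlap.
fun takeHead(region: Region, index: Int): Pair<Region, Region> {
    require(index in region.start..region.end) { "index $index is outside $region" }
    return Region(region.start, index) to Region(index, region.end)
}

// The two-step workaround: take the head up to endIndex, then split off its prefix.
// The result is the three parts mentioned earlier: prefix, center, and tail.
fun takeRange(region: Region, startIndex: Int, endIndex: Int): Triple<Region, Region, Region> {
    val (head, tail) = takeHead(region, endIndex)
    val (prefix, middle) = takeHead(head, startIndex)
    return Triple(prefix, middle, tail)
}
```

For a 16-byte region, takeRange(Region(0, 16), 5, 9) yields the ranges 0..5, 5..9 and 9..16; in the workaround the prefix is the part that gets closed immediately.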

Also I still would need to be able to duplicate read-only views, in order to have multiple read views (possibly accessed concurrently) of the same slice of the underlying memory segment.

whyoleg · 2022-07-28T12:32:24Z

Can you provide a better sample (gist, link to repo) of what you are trying to achieve?
I think I have an idea of what kind of API/operations are needed for your use case.

ptitjes · 2022-07-28T12:50:01Z

@whyoleg I will cook a more complete example today. Thanks.

ptitjes · 2022-07-28T16:39:11Z

So let me try to give a more complete example. This is in the context of https:/ptitjes/kzmq. Currently, I use ByteArrays everywhere but, in the future final API, I want to enable buffer reuse. In my implementation of the PUB ZeroMQ socket, I have to send the same data to multiple sockets. There are channels in the middle and also wrappers around the buffers, but if I remove all the noise it boils down to:

interface SocketHandler {
  val input: Source
  val output: Destination
}

val sockets: List<SocketHandler> = // ... not important
val messages: Channel<ReadBuffer> = // ... not important

// At some point, I launch:
launch {
  while (isActive) {
    val message = messages.receive()
    sockets.forEach { socket ->
       val m = message.duplicate()
       launch {
         socket.output.write(m)
         m.close()
       }
    }
    message.close()
  }
}

// When user wants to send some data:
val data: ReadBuffer = // ... buffer given by the user
messages.send(data)

@whyoleg Is my use-case clearer?
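The duplicate/close pairing in the loop above presumes that the storage is released only when the last view is closed. A minimal sketch of that idea, without coroutines, might look like the following. SharedStorage, ReadView and fanOut are hypothetical names, the counter is not thread-safe (a real implementation would need an atomic counter), and java.io.Closeable makes this sketch JVM-only.

```kotlin
import java.io.Closeable

// Hypothetical reference-counted storage shared by duplicated read views.
class SharedStorage(val bytes: ByteArray) {
    private var refCount = 1  // the first view owns the initial reference
    var released = false
        private set

    fun retain() {
        check(!released) { "Storage already released" }
        refCount++
    }

    fun release() {
        check(!released) { "Storage already released" }
        // A real implementation would return the memory to a pool here.
        if (--refCount == 0) released = true
    }
}

class ReadView(private val storage: SharedStorage, private var readIndex: Int = 0) : Closeable {
    private var closed = false

    fun remaining(): Int = storage.bytes.size - readIndex

    fun readByte(): Byte {
        check(!closed) { "View is closed" }
        return storage.bytes[readIndex++]
    }

    // Each duplicate holds its own reference and its own read index.
    fun duplicate(): ReadView {
        check(!closed) { "View is closed" }
        storage.retain()
        return ReadView(storage, readIndex)
    }

    override fun close() {
        if (!closed) {
            closed = true
            storage.release()
        }
    }
}

// Sequential stand-in for the fan-out loop: one duplicate per subscriber,
// each fully read and then closed; the original is closed right after duplicating.
fun fanOut(message: ReadView, subscribers: Int): List<List<Byte>> {
    val copies = List(subscribers) { message.duplicate() }
    message.close()
    return copies.map { view -> view.use { v -> List(v.remaining()) { v.readByte() } } }
}
```

Once every duplicate has been closed, the storage reports itself as released, which is the point at which a pooled buffer could be reused.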

whyoleg · 2022-07-28T17:09:16Z

Yeah, thx! I see how this can be done. RSocket has similar requirements, so I know what you want.

Buuuut 😅, I'm really interested in the 'takeHead'/'slice' use cases; this part of the design is much harder from my point of view.

ptitjes · 2022-07-28T17:23:35Z

@whyoleg Yeah I forgot to say that the data ReadBuffer, in my example, can itself be obtained from another socket. I receive frames from the socket that have the following shape:

one byte containing flags
4 bytes containing size
data of the above size

Either, I would:

read 5 bytes to my buffer, and read in my buffer the flags and size
compact the buffer
read the data to my buffer
take the head (from index 0) to size
make this buffer a read view for my users

Or, I could:

read 5 bytes to my buffer, and read in my buffer the flags and size
read the data to my buffer
take the head (from index 5) to size
make this buffer a read view for my users
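As a rough illustration of the second option, the header can be decoded in place and the payload located by index, with no compaction. This sketch works on a plain ByteArray; FrameHeader, readHeader and payloadRange are hypothetical names, and big-endian byte order for the size field is an assumption, since the thread does not state it.

```kotlin
// 1 flags byte + 4 size bytes, as in the frame layout described above.
const val HEADER_SIZE = 5

// Hypothetical holder for the decoded header fields.
data class FrameHeader(val flags: Byte, val size: Int)

// Decodes the header from the first 5 bytes.
// Assumption: the size is a big-endian 32-bit integer.
fun readHeader(bytes: ByteArray): FrameHeader {
    require(bytes.size >= HEADER_SIZE) { "Need at least $HEADER_SIZE bytes for the header" }
    val size = ((bytes[1].toInt() and 0xFF) shl 24) or
        ((bytes[2].toInt() and 0xFF) shl 16) or
        ((bytes[3].toInt() and 0xFF) shl 8) or
        (bytes[4].toInt() and 0xFF)
    return FrameHeader(flags = bytes[0], size = size)
}

// Without compaction, the payload sits right after the header: [5, 5 + size).
fun payloadRange(header: FrameHeader): IntRange =
    HEADER_SIZE until HEADER_SIZE + header.size
```

With the first option (compacting after the header is consumed), the same payload would instead occupy the range from 0 until size, which is what makes a plain head split sufficient there.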

ptitjes · 2022-07-28T17:28:40Z

I feel like the read-only view/duplicate read-only view design is intricately linked to the takeHead/slice design:

val head = buffer.takeHead(size)
val readOnlyHead = head.someOperationToMakeAReadView()

Should someOperationToMakeAReadView steal the content of head, as does takeHead?

whyoleg · 2022-07-28T20:07:28Z

Thx!
So, in my mind it will look something like this:

val buffer: Buffer = //retrieved from socket somehow
buffer
  .stealReadOnly() //will return ReadOnlyBuffer instance, which will be view over buffer underlying storage; after this call, `buffer` will be empty, and have no underlying storage
  .use { frame: ReadOnlyBuffer ->
    val flags = frame.readByte()
    val size = frame.readInt()
    val data: ReadOnlyBuffer = frame.readBuffer(size)
    //do anything with it
  }

Here Buffer and ReadOnlyBuffer are unrelated interfaces (neither inherits from the other) with a common ancestor, Readable, which defines simple buffer operations.
Buffer in addition has write operations (via the Writable interface) plus takeHead(index), steal() and stealReadOnly(), which modify the underlying storage in such a way that no overlapping is possible.
ReadOnlyBuffer in addition has readBuffer, getBufferAt and copy, which return zero-copy views.
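One possible reading of this hierarchy is sketched below. It is not the prototype from the branch: the ByteArray backing, the remaining property, and the use of classes instead of interfaces for Buffer and ReadOnlyBuffer are assumptions made to keep the example short, and only stealReadOnly and readBuffer are shown.

```kotlin
// Hypothetical common ancestor with simple read operations.
interface Readable {
    val remaining: Int
    fun readByte(): Byte
}

interface Writable {
    fun writeByte(value: Byte)
}

// Read-only view; unrelated to Buffer except through Readable.
class ReadOnlyBuffer(
    private val storage: ByteArray,
    private var position: Int,
    private val limit: Int,
) : Readable {
    override val remaining: Int get() = limit - position

    override fun readByte(): Byte {
        check(remaining >= 1) { "Not enough data" }
        return storage[position++]
    }

    // Zero-copy: the returned view shares the storage, and this view skips past it.
    fun readBuffer(size: Int): ReadOnlyBuffer {
        require(size >= 0) { "Negative size" }
        check(remaining >= size) { "Requested $size bytes, only $remaining available" }
        val view = ReadOnlyBuffer(storage, position, position + size)
        position += size
        return view
    }
}

class Buffer(capacity: Int) : Readable, Writable {
    private var storage = ByteArray(capacity)
    private var readIndex = 0
    private var writeIndex = 0

    override val remaining: Int get() = writeIndex - readIndex

    override fun readByte(): Byte {
        check(remaining >= 1) { "Not enough data" }
        return storage[readIndex++]
    }

    override fun writeByte(value: Byte) {
        check(writeIndex < storage.size) { "Buffer is full" }
        storage[writeIndex++] = value
    }

    // Hands the readable part to a read-only view and leaves this buffer empty,
    // so the writer can no longer touch the bytes the view covers.
    fun stealReadOnly(): ReadOnlyBuffer {
        val view = ReadOnlyBuffer(storage, readIndex, writeIndex)
        storage = ByteArray(0)
        readIndex = 0
        writeIndex = 0
        return view
    }
}
```

Because the buffer drops its storage when stealing, a later write through it fails instead of silently changing what the read-only view sees.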

I will summarize my idea in a PR and will try to create it today or tomorrow.

ptitjes · 2022-07-28T20:57:33Z

@whyoleg it would elegantly fit my use case. 👍 Thanks

ptitjes · 2022-07-28T21:05:18Z

Why not rename takeHead to stealHead? That would make the API even more discoverable.

whyoleg · 2022-07-31T16:36:02Z

An API prototype (no implementation yet) is here: https:/ktorio/ktor-io/tree/whyoleg/read-only-buffer
It still has a lot of TODOs about how best to do one thing or another.
I will wait for @e5l to be available to discuss this approach in more detail; maybe he has more ideas.
P.S. There is also a prototype API there for accessing the underlying components of a Buffer for interop with existing solutions, e.g. to be able to access a ByteBuffer or ByteArray from a buffer without copying when interacting with platform APIs like javax.crypto, sockets and so on. I still have no idea how to make such an API really safe.

ptitjes · 2022-08-08T10:21:53Z

Hi @whyoleg, sorry for the late answer. I very much like your design proposal.

However, reading your example, I am wondering whether a BytesSource.read() for a TCP socket would return a full TCP frame or not?

whyoleg · 2022-08-08T10:50:53Z

Overall it's out of scope for this issue, because it can depend on the implementation of the source, but I would think that most of the time it will not be a single TCP frame, but whatever fits in some buffer inside the implementation. For now, I have no preferences or ideas on this.

ptitjes · 2022-08-08T11:00:32Z

OK. But then this changes how we use buffers. For instance, in your example, I cannot be sure that the returned read-only buffer contains all the expected data, right?

whyoleg · 2022-08-08T11:19:13Z

As far as I know, when you are doing TCP work, you can't rely on all the data being in one TCP frame. The amount of data in it depends a lot on the buffer sizes of the client and server. That's why we need to send the data length when working with TCP. In my example there is no check for availability of data in the buffer; real-world code should check whether enough data is left in the buffer and then either steal the head and wait for more data, or just wait for more data if the buffer has enough capacity, or fail if the socket is closed, and so on.
I haven't yet experimented with how we can work with sockets via this new buffer/source API, but I still think it's about how the source is implemented: maybe there will be a possibility to peek at the buffer, read the length and, if there is not enough data, await more. But you should also be aware of TCP flow control, and maybe even the buffer that lives underneath the source implementation can be limited to some size so that it does not grow up to OOM if the user is slow to read.

And of course, if there is not enough data in the buffer and you want to read a buffer longer than what is available, we will fail; I don't think that silently returning part of a buffer is a good idea here.
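The "check, then wait for more data" logic can be sketched independently of any source API. FrameAssembler below is a hypothetical helper that accumulates chunks as they arrive and only hands out a payload once the whole frame is present. It copies bytes for simplicity (a real implementation would aim for zero-copy views), it assumes the 1-byte flags plus 4-byte big-endian size header from earlier in the thread, and MAX_FRAME_SIZE is an arbitrary cap chosen here to bound memory use.

```kotlin
const val HEADER_SIZE = 5                  // 1 flags byte + 4 size bytes
const val MAX_FRAME_SIZE = 1024 * 1024     // hypothetical cap to avoid unbounded growth

// Hypothetical helper: collects incoming chunks and emits complete payloads.
class FrameAssembler {
    private var pending = ByteArray(0)

    // Appends whatever the source delivered, however it was split.
    fun feed(chunk: ByteArray) {
        pending += chunk
    }

    // Returns the next complete payload, or null if more data must be awaited.
    // Fails loudly on an invalid size rather than returning partial data.
    fun poll(): ByteArray? {
        if (pending.size < HEADER_SIZE) return null
        // Assumption: big-endian size field at offsets 1..4.
        val size = ((pending[1].toInt() and 0xFF) shl 24) or
            ((pending[2].toInt() and 0xFF) shl 16) or
            ((pending[3].toInt() and 0xFF) shl 8) or
            (pending[4].toInt() and 0xFF)
        check(size in 0..MAX_FRAME_SIZE) { "Invalid frame size $size" }
        val end = HEADER_SIZE + size
        if (pending.size < end) return null
        val payload = pending.copyOfRange(HEADER_SIZE, end)
        pending = pending.copyOfRange(end, pending.size)  // keep bytes of the next frame
        return payload
    }
}
```

A caller would alternate feed and poll: a null result means "await more data from the source", while an exception means the stream cannot be interpreted and the connection should probably be dropped.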
