
I was watching this talk about implementing async IO in Rust, and Carl mentions two potential models: readiness and completion.

Readiness Model:

  • you tell the kernel you want to read from a socket
  • do other things for a while…
  • the kernel tells you when the socket is ready
  • you read (fill a buffer)
  • do whatever you need
  • free the buffer (happens automatically with Rust)

Completion Model:

  • you allocate a buffer for the kernel to fill
  • do other things for a while…
  • the kernel tells you when the buffer has been filled
  • do whatever you need to with the data
  • free the buffer

In Carl's example of the readiness model, you could iterate over the ready sockets, filling and freeing a single global buffer, which makes it seem like it would use much less memory.

Now my assumptions:

Under the hood (in kernel space) when a socket is said to be "ready", the data already exists. It has come into the socket over the network (or from wherever) and the OS is holding onto the data.

It's not as if that memory allocation magically doesn't happen in the readiness model. It's just that the OS is abstracting it from you. In the Completion model, the OS is asking you to allocate memory before data actually flows in and it's obvious what's happening.

Here's my amended version of the Readiness Model:

  • you tell the kernel you want to read from a socket
  • do other things for a while…
  • AMENDMENT: data comes into the OS (some place in kernel memory)
  • the kernel tells you the socket is ready
  • you read (filling another buffer, separate from the above kernel buffer (or do you get a pointer to it?))
  • do whatever you need
  • free the buffer (happens automatically with Rust)

/My assumptions

I happen to like keeping the user-space program small, but I just wanted some clarification on what is, in reality, happening here. I don't see that one model would inherently use less memory or support a higher level of concurrent IO. I'd love to hear thoughts and a deeper explanation of this.

kjs3
  • I also arrived here from that YouTube talk. For anyone learning about async IO or how to implement event loops, the Rust team has this playlist of "Async Interviews" [here](https://www.youtube.com/playlist?list=PLCQVvhKUrTN-SNmiJKTEhOGWznDcTtbY_) interviewing very knowledgeable folks from the community – Clifford Fajardo Jan 19 '20 at 20:08

2 Answers


In the Readiness Model memory consumption is proportional to the amount of data unconsumed by the application.

In the Completion Model the memory consumption is proportional to the number of outstanding socket calls.

If there are many sockets that are mostly idle then the Readiness Model consumes less memory.
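
As a rough illustration (the numbers are made up): with 10,000 mostly idle connections and a 64 KiB buffer posted per outstanding read, the completion model pins 10,000 × 64 KiB ≈ 625 MiB of buffers that sit empty, while the readiness model only pays for the kernel's own buffering on the few sockets that actually have unconsumed data.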

There is an easy fix for the Completion Model: initiate a read of 1 byte. This consumes only a tiny buffer. When the read completes, issue another (maybe synchronous) read that gets the rest of the data.

In some languages the Completion Model is extremely simple to implement. I consider it to be a good default choice.

usr

> In the Completion model, the OS is asking you to allocate memory before data actually flows in and it's obvious what's happening.

But what happens if more data comes in than you allocated space for? The kernel still has to allocate its own buffer so as not to drop the data. (For example, this is why the 1-byte read trick mentioned in usr's answer works.)

The tradeoff is that while the Completion Model consumes more memory, it can also (sometimes) do fewer copy operations, because keeping the buffer around means the hardware can DMA directly out of or into it. I also suspect (but am less sure) that the Completion Model tends to do the actual copy operation (when it exists) on another thread, at least for Windows' IOCP, while the Readiness Model does it as part of the non-blocking read() or write() call.

rpjohnst