See parts 1, 2 and 3 for context.
The awaitable/awaiter pattern is used to suspend program execution (but not the thread) until the async operation completes, and to return its result to the caller as though it were a synchronous call.
An async method runs synchronously until an await is encountered. At this point, it checks whether the async operation is already complete (which might happen if there is an input validation error, or if the result was returned from a cache, for example). If yes, it simply gets the result of the operation and the program proceeds synchronously. If no, it registers a continuation on the awaiter to be invoked on completion and returns control to the caller. The continuation is generated behind the scenes by the compiler as a state machine (which we will examine in the next post). When the async operation completes, the continuation runs (likely on a different thread) and updates the result of the operation in the task instance, which is available to the caller through the awaiter.
What is an awaiter? Conceptually, it is an object that suspends program execution until an operation has completed and helps obtain the result of the operation. The thread itself is not blocked or suspended, and can be returned to the thread pool, making it available for other work. When the async operation completes, the program resumes via the continuation callback, possibly on another threadpool thread. It provides a non-blocking way to await an operation.
An awaitable is a type that can be awaited. It has a GetAwaiter() method that returns an awaiter. In the case of async/await, Task and Task<T> are awaitable types.
Awaitable/awaiter is a bridge between the caller of the async method and the continuation that runs upon completion. It enables a linear flow of control and intuitive exception handling. JavaScript in its pre-await days suffered from callback hell, with the flow of control jumping all over the place and no easy way to return a value locally from an async operation. Awaitable/awaiter helps avoid all that and lets you write async code that looks and behaves like synchronous code, even though, under the hood, the flow of control might be jumping between threads.
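As a rough illustration of that difference (a sketch, with error handling simplified), compare a callback-style continuation with the await version of the same call:

using System;
using System.Net.Http;
using System.Threading.Tasks;

class LinearFlowDemo
{
    private static readonly HttpClient _httpClient = new HttpClient();

    // Callback style: the "rest of the method" lives inside a lambda,
    // and failures must be inspected on the antecedent task.
    static Task FetchWithCallback(string url) =>
        _httpClient.GetAsync(url).ContinueWith(t =>
        {
            if (t.IsFaulted)
            {
                Console.WriteLine(t.Exception);
                return;
            }
            Console.WriteLine(t.Result.StatusCode);
        });

    // Await style: linear flow and an ordinary try/catch.
    static async Task FetchWithAwait(string url)
    {
        try
        {
            HttpResponseMessage response = await _httpClient.GetAsync(url);
            Console.WriteLine(response.StatusCode);
        }
        catch (HttpRequestException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}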
In the following code:
HttpResponseMessage response = await _httpClient.GetAsync(url);
_httpClient.GetAsync(url) returns a Task<HttpResponseMessage> instance. The compiler replaces await with a GetAwaiter() call on the task instance, which returns an “awaiter” of type TaskAwaiter<HttpResponseMessage>.
TaskAwaiter<HttpResponseMessage> awaiter = _httpClient.GetAsync(url).GetAwaiter();
The awaiter, as its name suggests, provides a non-blocking way to wait for the async operation to complete. It exposes a GetResult() method to return the result (called in the continuation).
HttpResponseMessage response = awaiter.GetResult();
Awaitable/awaiter is not restricted to task instances; as Stephen Toub explains, it is a pattern that the language supports to await any instance that exposes a GetAwaiter method. In the case of Task, GetAwaiter returns a TaskAwaiter instance. But you could write your own class Bar that participates in the pattern simply by implementing a GetAwaiter method returning a BarAwaiter instance, which implements the INotifyCompletion interface (which has one method, OnCompleted) and exposes the following members, just as TaskAwaiter does:
bool IsCompleted { get; }
void OnCompleted(Action continuation);
TResult GetResult(); // TResult can also be void
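To make that concrete, here is a minimal sketch of what such a Bar / BarAwaiter pair could look like (the thread pool work item standing in for a 100 ms “async operation” is purely illustrative; a production awaiter would also flow the execution context and handle errors):

using System;
using System.Runtime.CompilerServices;
using System.Threading;

// A toy awaitable: because Bar exposes GetAwaiter(), "await new Bar()" compiles,
// even though Bar is not a Task.
public class Bar
{
    public BarAwaiter GetAwaiter() => new BarAwaiter();
}

public class BarAwaiter : INotifyCompletion
{
    private static readonly Action s_completedSentinel = () => { };
    private Action _continuation;
    private volatile bool _completed;
    private string _result;

    public BarAwaiter()
    {
        // Simulate an async operation completing on a thread pool thread.
        ThreadPool.QueueUserWorkItem(_ =>
        {
            Thread.Sleep(100);   // pretend this is I/O
            _result = "done";
            _completed = true;
            // Publish completion; if a continuation was already registered, invoke it.
            Interlocked.Exchange(ref _continuation, s_completedSentinel)?.Invoke();
        });
    }

    public bool IsCompleted => _completed;

    // Called by the compiler-generated code when IsCompleted was false at the await point.
    public void OnCompleted(Action continuation)
    {
        // Register the continuation, or run it inline if the operation completed
        // between the IsCompleted check and this call.
        if (Interlocked.CompareExchange(ref _continuation, continuation, null) == s_completedSentinel)
            continuation();
    }

    public string GetResult() => _result;
}

Inside an async method, string s = await new Bar(); now compiles and resumes with “done” roughly 100 ms later.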
Another use of this pattern is found in the Task class itself. Task.Yield(), which is used to yield control back to the current context when awaited, returns a YieldAwaitable. Task.Yield() “creates an awaitable task that asynchronously yields back to the current context when awaited.” This is ostensibly to allow control to be ceded back at yield points during long-running operations, so as to avoid monopolizing resources.
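For instance (a minimal, hypothetical sketch; the loop bounds and the chunk of work are made up), a CPU-heavy loop might periodically cede control like this:

using System.Threading.Tasks;

class YieldDemo
{
    public static async Task ProcessItemsAsync()
    {
        for (int i = 0; i < 100_000; i++)
        {
            // ... a chunk of CPU-bound work would go here ...

            if (i % 1_000 == 0)
            {
                // Returns a YieldAwaitable; awaiting it posts the rest of the loop
                // back to the current context/scheduler, letting other work run.
                await Task.Yield();
            }
        }
    }
}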
Vasil Kosturski has an excellent article with an example worth trying to get a feel for the awaitable/awaiter pattern. In a nutshell, when an await is encountered, the awaiter’s IsCompleted is checked. If true, the code proceeds synchronously to execute GetResult(). If false, the operation is still in progress, and OnCompleted is called to register the continuation. When the async I/O completes, the associated task is set to complete (using Task.TrySetResult) and the registered continuations are scheduled to run. The continuation has access to the antecedent task, and can call GetResult() to obtain its result.
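Conceptually, for the _httpClient.GetAsync(url) call from earlier, the compiler-generated code does something like the following (a simplified sketch; the real output is a state machine whose MoveNext method plays the role of the continuation, and it uses UnsafeOnCompleted rather than OnCompleted):

TaskAwaiter<HttpResponseMessage> awaiter = _httpClient.GetAsync(url).GetAwaiter();

if (awaiter.IsCompleted)
{
    // Already complete (e.g. a validation failure or a cached result): continue synchronously.
    HttpResponseMessage response = awaiter.GetResult();
    // ... rest of the method runs here ...
}
else
{
    // Still in progress: register the rest of the method as the continuation
    // and return control to the caller.
    awaiter.OnCompleted(() =>
    {
        HttpResponseMessage response = awaiter.GetResult();
        // ... rest of the method runs here, likely on a different thread ...
    });
}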
Here, a question arises: how does the system know when the async operation completes? We have to dive into the deep end of the pool to understand this.
The async call to a framework method such as _httpClient.GetAsync invokes SocketsHttpHandler behind the scenes, which makes a Windows Sockets API (Winsock) call through P/Invoke (the managed-to-unmanaged code interface). Winsock initiates an I/O operation in asynchronous mode, registers a callback function, and returns immediately.
To do this, it creates an I/O request packet (IRP) for the I/O manager. The IRP has information about the registered callback to invoke on completion of the operation. The I/O manager is part of the Windows kernel, responsible for managing I/O operations. It routes the IRP to the TCP/IP driver, which runs in kernel mode. The TCP/IP driver interfaces with NICs through device drivers, sending low-level frames.
When the NIC receives data back, the NIC driver triggers an interrupt to notify the CPU, which runs an Interrupt Service Routine (ISR) that resets the device and queues a Deferred Procedure Call (DPC) to handle the interrupt. DPCs are used to defer lower-priority processing out of interrupt handlers and device drivers. When the CPU gets around to running the DPC, it invokes the TCP/IP driver to process the received packet.
When the TCP/IP driver is done, it signals completion to the I/O manager, which queues a completion packet containing the received data to an I/O Completion Port (IOCP). IOCPs are a Windows mechanism for efficiently handling multiple asynchronous I/O operations. They are conceptually like a queue serviced by the thread pool (the .NET thread pool has dedicated IOCP processing threads). An IOCP thread from the pool awakens to process the completion packet, extracts the result of the I/O operation, and invokes the registered callback (the completion packet has access to the IRP, which can be used to identify the callback), which sets the associated task to completed. This in turn triggers the task framework to run the continuation.
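That last hop, a completion callback marking a task as complete so its continuations run, is the same idea that TaskCompletionSource exposes to user code. A minimal sketch (the Timer below is just a stand-in for the real IOCP completion callback; the names are made up):

using System;
using System.Threading;
using System.Threading.Tasks;

static class CallbackBridge
{
    // Bridges a callback-based "operation" to a Task, the way the I/O completion
    // callback ultimately completes the task returned by GetAsync.
    public static Task<string> GetDataAsync()
    {
        var tcs = new TaskCompletionSource<string>();

        // Pretend this timer callback is the "I/O completed" notification.
        var timer = new Timer(_ => tcs.TrySetResult("data"),
                              null, dueTime: 100, period: Timeout.Infinite);

        // Completing the task schedules any registered continuations;
        // dispose the timer once that has happened.
        tcs.Task.ContinueWith(_ => timer.Dispose());

        return tcs.Task;
    }
}

// Usage, inside an async method:
//   string data = await CallbackBridge.GetDataAsync();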
All this Herculean effort just so that a thread is not blocked waiting to receive data. This free time for one thread (which can be significant, since I/O operations can be 100-1000 times slower than CPU operations), multiplied across all the threads doing I/O on the system, can increase server throughput by up to 4x!
Another question: what part of this contraption runs for synchronous I/O operations? We would still have IRPs (although synchronous rather than asynchronous IRPs), ISRs and DPCs, but no IOCPs are involved (those are specifically for asynchronous I/O), and the TCP/IP driver running in the DPC would return the result directly to the calling thread, which, poor fellow, has been patiently waiting for the result.