I will quote JEP 425:
While [the modern async style] removes the limitation on throughput imposed by the scarcity of OS threads, it comes at a high price: It requires what is known as an asynchronous programming style, employing a separate set of I/O methods that do not wait for I/O operations to complete but rather, later on, signal their completion to a callback. Without a dedicated thread, developers must break down their request-handling logic into small stages, typically written as lambda expressions, and then compose them into a sequential pipeline with an API (see CompletableFuture, for example, or so-called “reactive” frameworks). They thus forsake the language’s basic sequential composition operators, such as loops and try/catch blocks.
The fundamental distinction between synchronous and asynchronous programming is that synchronous code returns its result directly:
def getUser(id: UserId): User
On the other hand, asynchronous code returns its result indirectly, by invoking a callback (don’t call me, I’ll call you):
def getUser(id: UserId, callback: User => Unit): Unit
In asynchronous programming, invoking the callback could be termed an asynchronous return, to draw attention to its semantic equivalence with a synchronous return.
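To make the contrast concrete, here is a minimal sketch in which UserId, User, and the lookup bodies are hypothetical stand-ins for real I/O:
type UserId = Int
case class User(id: UserId, name: String)
// Synchronous: the caller waits, and the result is returned directly.
def getUserSync(id: UserId): User =
  User(id, "alice") // imagine a blocking database call here
// Asynchronous: the call returns immediately; the result arrives later,
// delivered to the callback (the “asynchronous return”).
def getUserAsync(id: UserId, callback: User => Unit): Unit =
  callback(User(id, "alice")) // imagine a non-blocking I/O completion here
val user = getUserSync(42)        // direct return
getUserAsync(42, u => println(u)) // indirect return via callback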
Now, await/async or even for comprehensions slightly muddy the waters, because when you use such higher-level machinery that is built on callbacks, you do not see the callbacks, either at all or very clearly. That is not an accident; it is by design: we wish to minimize the visible introduction of callbacks, so we can avoid callback hell.
But the fact that await/async or even a for comprehension can hide callbacks does not change the fact that these systems are asynchronous precisely because they are implemented using callbacks.
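For example, a Scala for comprehension over Future reads like direct, sequential code, yet it desugars into flatMap and map calls, which register callbacks on each Future. A minimal sketch, with hypothetical getUser and getOrders:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
def getUser(id: Int): Future[String] = Future(s"user-$id")
def getOrders(user: String): Future[List[String]] = Future(List(s"orders-of-$user"))
// Looks like direct, sequential code...
val orders: Future[List[String]] =
  for {
    user <- getUser(42)
    os   <- getOrders(user)
  } yield os
// ...but desugars to flatMap/map, which register callbacks under the hood:
val desugared: Future[List[String]] =
  getUser(42).flatMap(user => getOrders(user).map(os => os))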
So, in summary, sync programming is a style of programming in which we invoke (synchronous) functions that synchronously return their values to us, and async programming is a style of programming in which we invoke (asynchronous) functions that asynchronously return their values to us through the mechanism of callbacks, whether that is visible or hidden.
The reason the industry adopted asynchronous programming, despite the fact that it is a more difficult style to program in (even with added layers), is scalability. Quoting JEP 425 again:
Some developers wishing to utilize hardware to its fullest have given up the thread-per-request style in favor of a thread-sharing style. Instead of handling a request on one thread from start to finish, request-handling code returns its thread to a pool when it waits for an I/O operation to complete so that the thread can service other requests.
We program with callbacks because we must, not because it is our preferred programming model. Our preferred programming model is synchronous programming.
Now, over time, async machinery has evolved to replicate sync machinery:
- Just like synchronous code can “block”, so too can asynchronous code “block”. In an asynchronous context, “blocking” means that there is a potentially unbounded delay between the registration of a callback and its invocation, caused by the result not yet being available from some external system (e.g. a response to a request, a chunk of data, etc.).
- Just like synchronous code “suspends” when it blocks, so too does asynchronous code “suspend” when it blocks.
- Most async systems have replicated exceptions, including exception handling, and in some cases even finally (see the sketch after this list).
- Just like asynchronous code may never resume (which corresponds to the callback never being invoked), synchronous code may never resume (e.g. while(true){}).
- Etc.
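To illustrate the exception-handling analogue, here is a minimal sketch using scala.concurrent.Future, where recover plays the role of catch and andThen plays the role of finally (risky is a hypothetical failing computation):
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
def risky(): Int = throw new RuntimeException("boom")
// Synchronous exception handling:
val syncResult: Int =
  try risky()
  catch { case _: RuntimeException => -1 }
  finally println("cleanup")
// The asynchronous analogue, implemented with callbacks under the hood:
val asyncResult: Future[Int] =
  Future(risky())
    .recover { case _: RuntimeException => -1 } // analogue of catch
    .andThen { case _ => println("cleanup") }   // analogue of finally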
At this point, for every useful or necessary feature in synchronous programming, there exists an analogue in asynchronous programming, whose implementation may be quite different, because it is based on callbacks, but whose semantics are identical, at least up to the limitations of callback-based programming (e.g. async stack traces are notoriously difficult because they require runtime support).
Now, Loom has arisen precisely because we would like to have our cake and eat it too:
- We wish to achieve the scalability of systems built on callbacks.
- We wish to program synchronously, in a direct style, without callbacks.
Loom gives us this magical combination through virtual threads, which have the same computation model as ordinary threads, but without the high cost. Effectively, this lets us create large numbers of threads, which “block” all the time, but this “blocking” is implemented inside the JVM, and therefore, it does not have to block operating system threads, which enables a small number of operating system threads to execute the work of large numbers of virtual threads.
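A minimal sketch, assuming JDK 21+ (where the virtual thread API is final): we spawn many virtual threads that all “block”, and the JVM multiplexes them onto a few OS threads.
// 10,000 virtual threads, each of which “blocks” for a second;
// the JVM parks them internally instead of blocking OS threads.
val threads = (1 to 10000).map { _ =>
  Thread.ofVirtual().start { () =>
    Thread.sleep(1000) // suspends the virtual thread, freeing its carrier OS thread
  }
}
threads.foreach(_.join())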
Async reimplemented JVM threading in user-land code to achieve high scalability; Loom just makes JVM threading itself highly scalable.
Every async system implements something like a “fiber” (green thread implemented in user-land), even if it doesn’t expose a first-class value that represents the running computation.
However, let me be clear: when I say “Loom makes everything async”, I am being imprecise. While Loom threads have an async (callback-like) implementation inside the Java runtime, to anyone building on the JVM, Loom code is synchronous code, written in a purely direct style.
To be precise, I would say that Loom makes threads and blocking cheap by giving us green threads. I use the shorthand because the idea that “async is scalable, sync is not” is etched into everyone’s brains after decades of wrestling with JVM physical threads, and I want to emphasize that Loom gives us all of that scalability by baking into the JVM everything we were doing manually with callbacks.
I understand that this is a common perception, but it is imprecise: technically speaking, asynchronous programming is about callbacks, not concurrency, and it is possible to launch tasks and continue doing things immediately, without waiting, in both asynchronous style and synchronous style.
For example, on a pre-Loom JVM, I can write code like:
// Run `a` on a freshly spawned OS thread and return immediately.
def fork[A](a: => A): Unit = new Thread() { override def run(): Unit = a }.start()
fork(uploadFileToS3(file))
fork(...)
fork(...)
This is purely synchronous code, and it will block operating system threads (limiting scalability), but it is also code that launches tasks and continues doing things immediately, without waiting.
Concurrency is the technically correct word to use when describing the interleaving of multiple independent strands of sequential computation, and concurrency is possible both in synchronous code, as well as asynchronous code.
Future gives us an async data type, all of whose operations are implemented using callbacks, and it also immediately submits user-defined code to a thread pool, which initiates concurrent execution of that code.
That’s a design choice of Future, not a necessary one in the landscape of async systems. Indeed, functional effect systems like ZIO, Cats Effect (CE), or Monix make a different choice: they separate async operations from concurrent operations.
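For instance, here is a minimal sketch against ZIO 2’s API (getUserAsync is a hypothetical callback-based API): ZIO.async merely wraps the callback registration, and concurrency is a separate, explicit step via fork.
import zio._
// A hypothetical callback-based API:
def getUserAsync(id: Int)(cb: String => Unit): Unit = cb(s"user-$id")
// Wrapping the callback yields an async value; nothing runs concurrently yet.
val user: Task[String] =
  ZIO.async { callback =>
    getUserAsync(42)(u => callback(ZIO.succeed(u)))
  }
// Concurrency is a separate, explicit choice:
val program: Task[String] =
  for {
    fiber  <- user.fork // start on its own fiber
    result <- fiber.join
  } yield result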