What is “asynchronous” programming?
“Asynchronous” programming, by definition, contrasts with classic imperative or functional programming, which is inherently “synchronous.” To fully grasp the difference between these two approaches, let’s remember that a program is simply a set of instructions, some of which are considered “blocking.” An instruction is deemed “blocking” when it explicitly requires waiting for a resource to become available (a file, a network connection, a mutex, etc.). A program specifically designed to minimize the number of blocking instructions can be regarded as an example of asynchronous programming.
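As an illustration, here is a minimal sketch of an explicitly blocking instruction using the standard java.net API (the host and request are placeholders): the calling thread cannot proceed until data has actually arrived on the socket.

```java
import java.io.InputStream;
import java.net.Socket;

public class BlockingReadExample {
    public static void main(String[] args) throws Exception {
        // Opening the connection and reading from it are both blocking:
        // the current thread waits until the peer sends data (or the stream closes).
        try (Socket socket = new Socket("example.com", 80);
             InputStream in = socket.getInputStream()) {
            socket.getOutputStream()
                  .write("GET / HTTP/1.0\r\nHost: example.com\r\n\r\n".getBytes());
            int firstByte = in.read(); // blocking instruction: nothing below runs until a byte arrives
            System.out.println("First byte of the response: " + firstByte);
        }
    }
}
```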
Traditionally, to avoid relying on blocking instructions, two types of approaches are used:
- Callbacks
The blocking instruction is replaced by a non-blocking one to which a function is attached; that function is executed once the resource has been acquired, while the following instructions continue to run.
- Futures
The blocking instruction is replaced by a non-blocking one that provides a way to know, at any point in the program, whether the resource has finally become available. Both approaches are sketched below.
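A minimal sketch of the two approaches with the JDK's HttpClient (the URL is a placeholder): the callback variant attaches a function with thenAccept, while the future variant keeps a handle it can query or join later.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

public class CallbacksAndFutures {
    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.com")).build();

        // Callback style: attach a function that runs once the resource is available,
        // while the instructions below keep executing immediately.
        CompletableFuture<Void> callbackStyle = client
                .sendAsync(request, HttpResponse.BodyHandlers.ofString())
                .thenAccept(response -> System.out.println("Callback got status " + response.statusCode()));

        // Future style: the non-blocking call returns a handle that can be queried at any time.
        CompletableFuture<HttpResponse<String>> futureStyle =
                client.sendAsync(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("Response already available? " + futureStyle.isDone());

        // Waiting at the very end only so that the demo prints everything before exiting.
        System.out.println("Future got status " + futureStyle.join().statusCode());
        callbackStyle.join();
    }
}
```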
Why use asynchronous programming?
Asynchronous programming allows multiple asynchronous tasks to be executed concurrently from the same thread. It is one of the techniques used to enhance user experience by making applications more responsive, particularly web applications. Instead of waiting for a response to a request made to a web service, the program can now send another request, process a response from a previous one, or provide user feedback on the program’s progress.
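As a sketch of this idea (again with the JDK HttpClient and placeholder URLs), two requests are issued from the same thread without waiting for either, and their results are combined once both complete:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

public class ConcurrentRequests {
    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();

        // Both requests are sent immediately; neither blocks the calling thread.
        CompletableFuture<String> first = client
                .sendAsync(HttpRequest.newBuilder(URI.create("https://example.com/a")).build(),
                           HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body);
        CompletableFuture<String> second = client
                .sendAsync(HttpRequest.newBuilder(URI.create("https://example.com/b")).build(),
                           HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body);

        // Meanwhile the thread is free to give user feedback or do other work.
        System.out.println("Requests in flight...");

        // Combine both responses once they are available.
        first.thenCombine(second, (a, b) -> a.length() + b.length())
             .thenAccept(total -> System.out.println("Total characters received: " + total))
             .join();
    }
}
```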
Asynchronous programming also puts the focus on resource management. Keeping a thread tied up by a blocking instruction carries an implicit cost: a thread is a finite resource, directly or indirectly bounded by the available memory, and using one incurs scheduling costs, whether at the virtual machine or system kernel level.
In an ideal world, using more threads than there are cores available on the machine running the application(s) is unnecessary and even inefficient; it is, plainly, a waste of resources. Asynchronous programming and multi-threaded programming address different needs, but they are not incompatible.
Why not use asynchronous programming?
Asynchronous programming introduces an additional layer of complexity to the way program code is written, and even architected, which can make it difficult at times to trace the execution of an otherwise simple program. Unfortunately, a program usually has to commit fully to asynchronous programming for it to be genuinely beneficial. Even languages that provide syntactic sugar for asynchronous programming, such as Python or C# with the async/await keywords, often require a complete code rewrite.
Additionally, certain resource synchronization constraints complicate writing such programs. A learning period is typically required for anyone diving into asynchronous programming for the first time. Although it may seem approachable at first glance, it carries pitfalls that can compromise an existing codebase.
About Reactor (Spring WebFlux)
Reactor is an implementation, in Java (the language in which Babbar’s crawler is written), of the so-called “reactive” or “event-driven” programming paradigm. Reactive programming is a form of asynchronous programming that emphasizes data streams and event propagation (state changes, user input, etc.). It also reduces the complexity introduced by asynchronous programming, such as “callback hell,” by leveraging techniques closer to functional programming.
Reactive programming was introduced by Microsoft in the .NET ecosystem with the Rx extensions (and their Observable / Observer interfaces), extended to the Java universe with RxJava (see https://reactivex.io/), and later directly integrated into Java 9 with the implementation of Reactive Streams (see https://www.reactive-streams.org/). Today, implementations exist for many languages, particularly those that already support asynchronous programming, such as RxJS, RxScala, RxClojure, RxGo, etc. It suits languages adopting either imperative or functional approaches.
Reactive programming is based on the Observer design pattern, more commonly known as publisher/subscriber (observable/observer), a “push” approach, as opposed to the more traditional Iterator pattern (including Java streams from java.util.stream.Stream), a “pull” approach. This distinction provides greater control over data (or event) flows and how they are processed throughout the chain. For example, errors encountered along the processing chain can be explicitly propagated and handled in a controlled manner.
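A sketch of the two styles, assuming Reactor is on the classpath: the pull side iterates a java.util.stream.Stream on its own initiative, while the push side subscribes to a Flux and receives each element, and possibly an error signal, as it is emitted.

```java
import java.util.stream.Stream;
import reactor.core.publisher.Flux;

public class PushVersusPull {
    public static void main(String[] args) {
        // Pull: the consumer drives the loop and asks for the next element itself.
        Stream.of(1, 2, 3).forEach(n -> System.out.println("pulled " + n));

        // Push: the publisher emits elements (and terminal signals) toward the subscriber,
        // which receives them through its callbacks, including an explicit error channel.
        Flux.just(1, 2, 3, 0)
            .map(n -> 10 / n) // the last element triggers an ArithmeticException
            .subscribe(
                n -> System.out.println("pushed " + n),
                error -> System.out.println("propagated error: " + error));
    }
}
```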
The biggest challenge for reactive libraries is offering the power of advanced asynchronous features while making it simple and efficient to reuse and/or combine them. Reactor, for instance, provides the concept of “operators” to combine and reuse transformations applied to data/event streams implementing the Publisher interface, such as the Flux and Mono classes.
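A sketch of this reuse, assuming Reactor: a chain of operators is written once as a plain function and plugged into any stream with Flux.transform, while Mono works the same way for streams of at most one element.

```java
import java.util.function.Function;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class ReusableOperators {
    // A reusable chain of operators: keep non-blank lines and normalize them.
    static final Function<Flux<String>, Flux<String>> CLEAN_LINES =
            flux -> flux.filter(line -> !line.isBlank())
                        .map(String::trim)
                        .map(String::toLowerCase);

    public static void main(String[] args) {
        Flux<String> lines = Flux.just("  Hello ", "", "World  ");

        // transform() plugs the reusable chain into this particular stream.
        lines.transform(CLEAN_LINES)
             .subscribe(System.out::println);

        // Mono exposes the same operator vocabulary for zero-or-one element streams.
        Mono.just("  Single value ")
            .map(String::trim)
            .subscribe(System.out::println);
    }
}
```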
Most operations performed in Reactor on a Flux are not asynchronous by nature. Only a few operators truly take advantage of the library’s asynchronous nature (e.g., Flux.flatMap, typically used together with Flux.defer / Mono.fromCallable and Flux.subscribeOn / Flux.publishOn). Parallelization of operations, distinct from the concurrent asynchronous tasks mentioned earlier, is available via the Flux.parallel and ParallelFlux.runOn operators.
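A sketch of these operators, assuming Reactor: a potentially blocking call is wrapped in Mono.fromCallable, moved off the caller with subscribeOn, fanned out concurrently with flatMap, and, separately, a stream is split into parallel rails with parallel/runOn.

```java
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

public class AsyncOperators {
    // A potentially blocking call, wrapped lazily so it only runs on subscription.
    static Mono<String> fetch(int id) {
        return Mono.fromCallable(() -> {
                       Thread.sleep(100); // stands in for blocking I/O
                       return "result-" + id;
                   })
                   .subscribeOn(Schedulers.boundedElastic()); // run the blocking part elsewhere
    }

    public static void main(String[] args) throws InterruptedException {
        // flatMap subscribes to each inner Mono concurrently, which is where
        // the asynchronous behaviour actually comes from.
        Flux.range(1, 5)
            .flatMap(AsyncOperators::fetch)
            .publishOn(Schedulers.parallel()) // downstream operators run on another scheduler
            .subscribe(System.out::println);

        // parallel()/runOn() splits the stream into rails executed in parallel,
        // which is distinct from the concurrency obtained with flatMap above.
        Flux.range(1, 8)
            .parallel()
            .runOn(Schedulers.parallel())
            .map(n -> n * n)
            .subscribe(n -> System.out.println(Thread.currentThread().getName() + " -> " + n));

        Thread.sleep(1000); // crude wait so the demo finishes before the JVM exits
    }
}
```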
Thanks to the publisher/subscriber design (observable/observer), reactive programming also allows for efficient solutions to common asynchronous programming problems, such as back-pressure management. It is crucial in such applications to finely tune the supply (observable) and demand (observer) to maximize resource utilization.
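A sketch of explicit demand control with Reactor’s BaseSubscriber: instead of the default unbounded demand, the subscriber requests a small batch up front and then one new element per element processed.

```java
import org.reactivestreams.Subscription;
import reactor.core.publisher.BaseSubscriber;
import reactor.core.publisher.Flux;

public class BackPressureExample {
    public static void main(String[] args) {
        Flux.range(1, 10)
            .doOnRequest(n -> System.out.println("demand signalled upstream: " + n))
            .subscribe(new BaseSubscriber<Integer>() {
                @Override
                protected void hookOnSubscribe(Subscription subscription) {
                    request(2); // ask only for what the consumer can handle right now
                }

                @Override
                protected void hookOnNext(Integer value) {
                    System.out.println("consumed " + value);
                    request(1); // one new element requested per element processed
                }
            });
    }
}
```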
Most reactive programming libraries already implement advanced and dedicated concepts such as continuous data streams, event composition, caching, broadcasting, and stream synchronization. Reactor focuses on the fundamentals of reactive programming, while Spring WebFlux (and potentially Spring MVC) relies on it to provide a framework dedicated to highly scalable web services with low resource consumption.
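As an illustration (a hypothetical endpoint, not Babbar’s actual code), a Spring WebFlux controller simply returns a reactive type and lets the framework handle the non-blocking plumbing:

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
public class StatusController {
    // WebFlux subscribes to the returned Flux and streams the elements to the client
    // without tying up a thread per pending request.
    @GetMapping("/statuses")
    public Flux<String> statuses() {
        return Flux.just("crawling", "indexing", "ranking");
    }
}
```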
About Virtual Threads
Virtual threads help avoid the resource waste caused by potentially blocking instructions and maximize the performance/resource ratio without rewriting application code: when a virtual thread blocks, it is simply unmounted from its carrier (OS) thread, which makes the blocking instruction indirectly non-blocking. Despite some remaining limitations (which are partially being addressed), they render asynchronous programming obsolete.
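A sketch of the idea with the JDK 21 API: the code stays plainly synchronous and blocking, but each task runs on a cheap virtual thread, so a blocked task no longer monopolizes an OS thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.IntStream;

public class VirtualThreadsExample {
    public static void main(String[] args) {
        // One virtual thread per task; blocking inside a task parks the virtual thread
        // and frees its carrier (OS) thread for other work.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            IntStream.range(0, 10_000).forEach(i -> executor.submit(() -> {
                Thread.sleep(1000); // ordinary blocking call, written synchronously
                return i;
            }));
        } // close() waits for the submitted tasks to finish
    }
}
```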
While the principles introduced by reactive programming on top of traditional asynchronous programming remain generally relevant, more conventional approaches can often address the same needs. Choosing reactive programming can therefore prove more cumbersome, given the unavoidable learning phase it requires.