StreetStrider April 8, 2013 at 22:31

Asynchronous APIs and Deferred Objects in Detail

Most modern programming languages allow blocks of code to be executed asynchronously. Along with the flexibility of the asynchronous approach, those who venture to use it also get code that is harder to understand and maintain. However, any complication programmers face usually finds a practical solution in the form of a new approach or a higher level of abstraction. In the case of asynchronous programming, that tool is an object representing a deferred result, or simply a deferred.

This article covers the basic approaches to returning asynchronous results: callback functions, deferred objects, and their capabilities. Examples are given in JavaScript, and a sample deferred object is examined in detail. The article should be useful both to programmers who are just getting to grips with asynchronous programming and to those who are familiar with it but have not yet mastered the deferred object.

Synchronous and asynchronous calls

Any function can be described in either synchronous or asynchronous form. Suppose we have a function calc that performs some calculation.

With the usual, “synchronous” approach, the result of the calculation is passed through the return value; that is, the result is available immediately after the function has executed and can be used in another calculation.

var result = calc();
another_calc(result * 2);

The code is executed strictly sequentially: a result obtained on one line can be used on the next. This resembles the proof of a theorem, where each statement follows logically from the previous ones.

In the case of an asynchronous call, we cannot get the result in place. By calling calc, we only indicate that the calculation needs to be performed and that we want its result. The next line then starts executing without waiting for the previous one to complete. Nevertheless, we still need to get the result somehow, and this is where the callback comes to the rescue: a function that the system will call when the result of the calculation arrives. The result is passed to this function as an argument.

calc(function (result) {
   another_calc(result * 2);
});
no_cares_about_result();

As the example shows, the function now has the signature calc(callback), and callback takes the result as its first parameter.

Since calc is executed asynchronously, the function no_cares_about_result cannot access its result and, generally speaking, may be executed before the callback. (Speaking specifically about JavaScript: if the called function is truly asynchronous, rather than, for example, taking data from a cache and invoking the callback at once, the callback is guaranteed to run only after the calling code has finished, so the remaining code is always executed before the callback. This is discussed below.)
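
The order of execution can be observed with a small sketch. Here calc is a hypothetical asynchronous function emulated with setTimeout; the order array, the 10 ms delay and the value 21 are assumptions made purely for the illustration.

```javascript
// Hypothetical asynchronous calc: the result "arrives" later, via a timer.
var order = [];

function calc(callback) {
   setTimeout(function () {
      callback(21);
   }, 10);
}

function another_calc(value) {
   order.push('another_calc(' + value + ')');
}

function no_cares_about_result() {
   order.push('no_cares_about_result');
}

calc(function (result) {
   another_calc(result * 2);
});
no_cares_about_result();

// Right here, order holds only 'no_cares_about_result';
// 'another_calc(42)' is appended later, when the timer fires.
```

Although the callback is written before no_cares_about_result in the source, it runs after it.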

You must admit that this code is already somewhat harder to understand, while carrying the same meaning as its “straightforward” synchronous analogue. So what is the benefit of the asynchronous approach? First of all, the rational use of system resources. For example, if calc is a time-consuming calculation, or uses some external resource whose use involves a certain delay, then with the synchronous approach all subsequent code is forced to wait for the result and is not executed until calc has finished. With the asynchronous approach, it is possible to state explicitly which part of the code depends on the result and which is indifferent to it. In the example, no_cares_about_result obviously does not use the result and therefore does not need to wait for it. The code inside the callback is executed only after the result has been received.

Generally speaking, most APIs are asynchronous by nature but can mimic synchronous ones: access to remote resources, database queries, and even the file API are asynchronous. If an API “pretends” to be synchronous, the success of the “pretense” depends on how long the result is delayed: the smaller the delay, the better. The file API, which works with the local machine, shows small delays and is often implemented as synchronous. Work with remote resources and database access are increasingly implemented asynchronously.

Multi-level calls

The difficulties of the asynchronous approach become more noticeable when we need not just to make an asynchronous call but, having received its result, to do something with it and use it in another asynchronous call. Obviously, the synchronous approach of several sequentially executed lines of code does not work here:

var result = calc_one();
result = calc_two(result * 2);
result = calc_three(result + 42);
// using result

The code will take the following form:

calc_one(function (result) {
   calc_two(result * 2, function (result) {
      calc_three(result + 42, function (result) {
         // using result
      });
   });
});

First, this code has become "multi-level", although in terms of the actions it performs it is similar to the synchronous version. Second, the signatures of calc_two and calc_three mix input parameters with the callback, which is essentially the place where the result is returned, i.e. an output parameter. Third, each function may fail, in which case no result will be obtained.

You can simplify this code by defining the callback functions separately and passing them by name; however, this does not solve all of the problems. A new level of abstraction is required here: we can abstract the asynchronous result itself.
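
As an illustration of the named-callback variant, here is a minimal sketch. The bodies of calc_one, calc_two and calc_three are hypothetical stand-ins emulated with setTimeout, and the names on_one, on_two, on_three and final_result are assumptions introduced for the example.

```javascript
// Hypothetical asynchronous steps, emulated with setTimeout.
function calc_one(callback) {
   setTimeout(function () { callback(1); }, 0);
}
function calc_two(x, callback) {
   setTimeout(function () { callback(x + 1); }, 0);
}
function calc_three(x, callback) {
   setTimeout(function () { callback(x * 10); }, 0);
}

var final_result;

// Each step is a named function instead of a nested anonymous one.
function on_one(result) {
   calc_two(result * 2, on_two);
}
function on_two(result) {
   calc_three(result + 42, on_three);
}
function on_three(result) {
   final_result = result; // using result
}

calc_one(on_one);
```

The nesting is gone, but the flow of control is now spread over several functions, input parameters are still mixed with callbacks, and failures are still not handled.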

Asynchronous result

What is such a result? In essence, it is an object holding the information that a result will arrive some day or has already arrived. Subscribing to the result is still done through a callback, but the callback is now encapsulated in this object, so asynchronous functions are no longer obliged to accept callbacks as input parameters.

Three things are required of the result object: the ability to subscribe to the result, the ability to signal that the result has arrived (this is used by the asynchronous function itself, not by the API client), and storage of the result.

Another important distinguishing feature of such an object is its set of states. The object can be in one of two states: 1) there is no result, and 2) there is a result. Moreover, a transition is possible only from the first state to the second. Once the result has been obtained, the object can no longer return to the state without a result or move to a state with a different result.

Consider the following simple interface for this object:

function Deferred () // constructor
function on (callback)
function resolve (result)

The on method accepts a callback. The callback is called as soon as the result is available, and the result is passed to it as a parameter. This is completely analogous to a regular callback passed as a parameter. At the time the callback is registered, the object may be in either state, with or without a result. If there is no result yet, the callback is called when it arrives. If the result is already there, the callback is called immediately. In both cases, the callback is called once and receives the result.

The resolve method moves (resolves) the object into the state with a result and specifies that result. The method is idempotent, i.e. repeated calls to resolve do not modify the object. On the transition to the state with a result, all registered callbacks are called, and any callback registered after the call to resolve is called immediately. In both cases (registration before and after the call to resolve), callbacks receive the result, because the object stores it.

An object with this behavior is called a deferred (it is also known as a promise or a future). Compared with plain callbacks, it has several advantages:
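
The behavior of on and resolve can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation of any particular library: the private field names (_resolved, _result, _callbacks) and the seen array are invented for the example.

```javascript
// Minimal sketch of a two-state deferred object.
function Deferred() {
   this._resolved = false;    // false: no result yet, true: result present
   this._result = undefined;  // the stored result
   this._callbacks = [];      // subscribers waiting for the result
}

Deferred.prototype.on = function (callback) {
   if (this._resolved) {
      callback(this._result);          // result already here: call at once
   } else {
      this._callbacks.push(callback);  // otherwise wait for resolve
   }
};

Deferred.prototype.resolve = function (result) {
   if (this._resolved) {
      return;                          // idempotent: later calls change nothing
   }
   this._resolved = true;
   this._result = result;
   var callbacks = this._callbacks;
   this._callbacks = [];
   for (var i = 0; i < callbacks.length; i++) {
      callbacks[i](result);
   }
};

// Usage: one subscriber before resolve, one after.
var seen = [];
var d = new Deferred();
d.on(function (r) { seen.push('early:' + r); });
d.resolve(42);
d.resolve(100);                        // ignored
d.on(function (r) { seen.push('late:' + r); });
// seen is now ['early:42', 'late:42']
```

Both subscribers receive 42, regardless of whether they registered before or after the result arrived.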

1. Abstraction of the asynchronous function from the result: asynchronous functions no longer need to provide callback parameters. Subscribing to the result is left to the client code. For example, we do not have to subscribe at all if we do not need the result (similar to passing a noop function as a callback). The interface of an asynchronous function becomes cleaner: it has only meaningful input parameters, so functions with a variable number of parameters, an options parameter, etc. can be used with more confidence.
2. Abstraction from the state of the result: the client code does not need to check the current state of the result; it simply subscribes a handler and does not care whether the result has arrived yet.
3. Multiple subscriptions: more than one handler can be subscribed, and all of them are called when the result arrives. In a callback scheme, you would have to create, for example, a function that calls a group of functions.
4. A number of additional conveniences, including, for example, an "algebra" of deferred objects, which lets you define relationships between them, run them in a chain, or run code after a group of such objects has completed successfully.

Consider the following example. Suppose there is an asynchronous function getData(id, onSuccess) that takes two parameters: the id of some item we want to receive and a callback for getting the result. Typical code using it looks like this:

getData(id, function (item) {
   // do some actions with item
});

Let us rewrite this using Deferred. The function now has the signature getData(id) and is used as follows:

getData(id).on(function (item) {
   // do some actions with item
});

In this case, the code has not become much more complicated; rather, the approach has simply changed. The result is now passed through the function's return value as a deferred. However, as will become apparent later, in more complex cases the use of deferred objects makes the code more readable.

Error handling

The question of error handling naturally arises when using such objects. Synchronous code widely uses the exception mechanism, which, in case of an error, transfers control to higher-level blocks of code where all errors can be caught and handled. This does not significantly complicate the "local" code and frees the programmer from writing checks at every step.
In asynchronous code (and in any callback scheme), using exceptions is difficult, because the exception arrives asynchronously, like the result, and therefore cannot be caught simply by wrapping the call to the asynchronous function in try. If we think about it, an error is really just another result of the function (a negative one, but a result all the same), with the error object (the exception) acting as the return value.

Such a result, like a successful one, is delivered through a callback (sometimes called an errback, a blend of error and callback).
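
A short sketch shows why try does not help here. The function calc with the signature calc(callback, errback) is hypothetical, its failure is emulated with setTimeout, and the log array is an assumption made for the example.

```javascript
var log = [];

// Hypothetical asynchronous function that always fails.
// The error is delivered later, through the errback.
function calc(callback, errback) {
   setTimeout(function () {
      errback(new Error('calculation failed'));
   }, 0);
}

try {
   calc(
      function (result) { log.push('callback received: ' + result); },
      function (err) { log.push('errback received: ' + err.message); }
   );
   log.push('try block finished');
} catch (e) {
   // Never reached: by the time the error exists,
   // the try block has long since been left.
   log.push('caught: ' + e.message);
}
```

The try block completes before the error even exists, so the error can only reach the client code as an argument of the errback.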

Let's extend our training Deferred object so that it can provide separate subscriptions for success and failure; namely, we will rework the on and resolve methods.

function on (state, callback)

The first parameter can be a value of an enumeration with two values, e.g. E_SUCCESS and E_ERROR. For readability, the examples use simple string values: 'success' and 'error'. We also extend this method by requiring it to return the Deferred object itself. This allows subscriptions to be chained (a technique very characteristic of JavaScript).

The resolve method changes accordingly:

function resolve (state, result)

The first parameter is the state the Deferred object should move to (error or success), and the second is the result. The state rule still applies to the modified object: after the transition to a state with a result, the object cannot change to another state. This means that if the object has moved, for example, to the success state, then none of the handlers registered for an error will ever be called, and vice versa.

So, suppose our getData function may end with some error (no data, incorrect input, failure, etc.).
The code takes the following form:
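
One possible way to implement the reworked on(state, callback) and resolve(state, result) is sketched below. It is an illustration under stated assumptions rather than a reference implementation: the private fields (_state, _result, _callbacks), the calls array and the 'E_NOT_FOUND' value are invented for the example.

```javascript
// Sketch of a deferred object with named result states.
function Deferred() {
   this._state = null;        // null until resolved, then e.g. 'success' or 'error'
   this._result = undefined;  // the stored result
   this._callbacks = [];      // entries of the form { state: ..., callback: ... }
}

Deferred.prototype.on = function (state, callback) {
   if (this._state === null) {
      this._callbacks.push({ state: state, callback: callback });
   } else if (this._state === state) {
      callback(this._result);  // already resolved into this very state
   }
   return this;                // returning the object allows chaining
};

Deferred.prototype.resolve = function (state, result) {
   if (this._state !== null) {
      return;                  // the state can be set only once
   }
   this._state = state;
   this._result = result;
   var callbacks = this._callbacks;
   this._callbacks = [];
   for (var i = 0; i < callbacks.length; i++) {
      if (callbacks[i].state === state) {
         callbacks[i].callback(result);
      }
   }
};

// Usage: a chained subscription, then a transition to 'error'.
var calls = [];
var d = new Deferred();
d.on('success', function (item) { calls.push('success:' + item); })
 .on('error', function (err_code) { calls.push('error:' + err_code); });
d.resolve('error', 'E_NOT_FOUND');
d.resolve('success', 'item');  // ignored: the object is already in 'error'
// calls is now ['error:E_NOT_FOUND']
```

Handlers registered for a state other than the one actually reached are simply dropped, which matches the state rule.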

getData(id)
.on('success', function (item) {
   // do some actions with item
})
.on('error', function (err_code) {
   // deal with error
});

Consider a more realistic example: the fs.readFile method from the Node.js standard library. This method reads a file. At the beginning of the article it was mentioned that almost any function can be written in either synchronous or asynchronous style. In the Node.js standard library, the file API is defined in both styles: each function has a synchronous counterpart.

As an example, we take the asynchronous version of readFile and adapt it to use Deferred.

var fs = require('fs');

function readFileDeferred (filename, options)
{
   var result = new Deferred;
   fs.readFile(filename, options, function (err, data)
   {
      if (err)
      {
         result.resolve('error', err);
      }
      else
      {
         result.resolve('success', data);
      }
   });
   return result;
}

Such a function is somewhat more convenient to use, because it allows you to register functions for success and error separately.

The functionality described so far is enough for the vast majority of cases, but deferred objects have more potential, which is discussed below.

Advanced Deferred Object Features

1. An unlimited number of outcome variants. The example used a Deferred object with two possible outcomes: success and error. Nothing prevents the use of any other (custom) variants. Fortunately, we used a string value as the state, which lets us define any set of outcomes without changing an enumerated type.
2. The ability to subscribe to all outcome variants. This can be used for all kinds of generic handlers (it makes the most sense in combination with item 1).
3. Creating a promise sub-object. The interface of the Deferred object shows that client code has access to the resolve method although, in fact, it only needs the ability to subscribe. The essence of this improvement is the introduction of a promise method, which returns a "subset" of the Deferred object from which only subscription is available, not setting the result.
4. Transferring state from one deferred to another, optionally transforming the result. This can be very useful for multi-level calls.
5. Creating a deferred that depends on the results of a set of other deferred objects. The essence of this improvement is subscribing to the result of a group of asynchronous operations.
Suppose we need to read two files and do something interesting with both. We use our readFileDeferred function for this:

var r1 = readFileDeferred('./session.data'),
    r2 = readFileDeferred('./data/user.data');
var r3 = Deferred.all(r1, r2);
r3.on('success', function (session, user) {
   session = JSON.parse(session);
   user = JSON.parse(user);
   console.log('All data received', session, user);
}).on('error', function (err_code) {
   console.error('Error occurred', err_code);
});

Deferred.all creates a new Deferred object that moves to the success state if all of the arguments passed to it move to that state. It then receives the results of all of the deferred objects as arguments. If at least one argument moves to the error state, the object returned by Deferred.all also moves to that state, and its result is the result of the argument that moved to the error state.
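
A grouping function of this kind might be sketched as follows. The compact Deferred defined here is an assumption made so that the example is self-contained (its resolve accepts any number of result values after the state); it is not the API of a specific library, and the variables received and failure exist only for the demonstration.

```javascript
// Compact stateful Deferred, assumed for this sketch only.
function Deferred() {
   this._state = null;
   this._args = [];
   this._callbacks = [];
}
Deferred.prototype.on = function (state, callback) {
   if (this._state === null) {
      this._callbacks.push({ state: state, callback: callback });
   } else if (this._state === state) {
      callback.apply(null, this._args);
   }
   return this;
};
Deferred.prototype.resolve = function (state /*, results... */) {
   if (this._state !== null) { return; }
   this._state = state;
   this._args = Array.prototype.slice.call(arguments, 1);
   var callbacks = this._callbacks;
   this._callbacks = [];
   for (var i = 0; i < callbacks.length; i++) {
      if (callbacks[i].state === state) {
         callbacks[i].callback.apply(null, this._args);
      }
   }
};

// Succeeds when every argument succeeds; fails on the first error.
Deferred.all = function () {
   var items = Array.prototype.slice.call(arguments);
   var combined = new Deferred();
   var results = new Array(items.length);
   var pending = items.length;

   if (pending === 0) {
      combined.resolve('success');
      return combined;
   }
   items.forEach(function (item, index) {
      item.on('success', function (value) {
         results[index] = value;   // keep the order of the arguments
         pending -= 1;
         if (pending === 0) {
            combined.resolve.apply(combined, ['success'].concat(results));
         }
      }).on('error', function (err) {
         combined.resolve('error', err);  // later resolves are ignored
      });
   });
   return combined;
};

// Usage: results arrive out of order but are delivered in argument order.
var r1 = new Deferred(), r2 = new Deferred(), received;
Deferred.all(r1, r2).on('success', function (a, b) { received = [a, b]; });
r2.resolve('success', 'user');
r1.resolve('success', 'session');
// received is now ['session', 'user']

// Usage: a single error fails the whole group.
var e1 = new Deferred(), e2 = new Deferred(), failure;
Deferred.all(e1, e2).on('error', function (err) { failure = err; });
e1.resolve('error', 'ENOENT');
e2.resolve('success', 'ignored');
// failure is now 'ENOENT'
```

Results are stored by index, so the handler receives them in the order in which the deferred objects were passed, not in the order in which they completed.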

Deferred specifics in JavaScript

It is worth noting that JavaScript has no multithreading. If a callback was scheduled via setInterval/setTimeout or by an event, it cannot interrupt the currently executing code or run in parallel with it. This means that even if the result of an asynchronous function arrives instantly, it will still be received only after the current code has finished executing.

In JavaScript, functions can be called with any number of arguments and with any context. This lets you pass as many parameters as necessary to callbacks. For example, if an asynchronous function returns a pair of values (X, Y), they can be passed as an object with two fields, as a list of two values (an improvised analogue of a tuple), or as the first two arguments of the callback.
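
This can be checked with a sketch in which a timer is scheduled with zero delay while the current code keeps the thread busy. The events array and the 20 ms busy loop are assumptions made for the illustration.

```javascript
var events = [];

// Schedule a callback with zero delay.
setTimeout(function () {
   events.push('timer callback');
}, 0);

// Keep the only thread busy for about 20 ms, well past the timer's due time.
var start = Date.now();
while (Date.now() - start < 20) {
   // spin
}

events.push('current code finished');
// Here events holds only 'current code finished':
// the timer callback could not interrupt the loop and runs afterwards.
```

Even though the timer is overdue by the time the loop ends, its callback runs only after the current code has returned control.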

A callback call in this case can take the following form:

callback.call(this, X, Y);
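
A small sketch of this calling convention follows. The function getPoint, the context object and the values 3 and 4 are hypothetical and serve only to show how the arguments and the context reach the callback.

```javascript
// Hypothetical function that "returns" a pair of values through a callback.
function getPoint(callback) {
   var X = 3, Y = 4;
   var context = { source: 'getPoint' };
   // The pair is passed as two separate arguments;
   // context becomes `this` inside the callback.
   callback.call(context, X, Y);
}

var seen;
getPoint(function (x, y) {
   seen = { x: x, y: y, source: this.source };
});
// seen is now { x: 3, y: 4, source: 'getPoint' }
```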

JavaScript uses references, and memory is freed by the garbage collector. The deferred object is needed both inside the asynchronous function (to signal the arrival of the result) and outside it (to get the result). In languages with stricter memory models, you should take care to manage the lifetime of such an object correctly.

Existing deferred implementations

1. jQuery has the $.Deferred object (documentation). It supports subscribing to success and error, as well as progress notifications: intermediate events generated before the result arrives. You can transfer state to another Deferred (the then method), create a Deferred that depends on the results of a list of Deferred objects ($.when), and create a promise for it.
All of the library's ajax methods return a promise for such an object.
2. The q library implements deferred objects; it lets you build chains of asynchronous functions and create a deferred that depends on the results of a list of deferred objects.
3. The async.js library lets you use filter/map/reduce on asynchronous calls and create chains and groups of asynchronous calls.
4. The when.js library also lets you use deferred objects.
5. The Dojo Toolkit contains a Deferred object (documentation).
6. In a sibling language, Python, the event-driven Twisted framework has a Deferred object (documentation). This implementation is very old and may claim to be the ancestor of the idea of deferred results.
It supports subscribing to success, to error, and to both outcomes. An object can be paused.
7. Out of my own interest in Deferred, I wrote my own version of this object (documentation, source code, tests). It supports a number of the features described in this article.

That's all, thank you for your attention.
