Sterhel October 22, 2012 at 16:45

Rivertrail: concurrency in JavaScript

Transfer

Using concurrency is now a common programming practice. However, all languages can be divided into two types: those in which parallelism is used with might and main (for example, C), and those that have not yet fully tasted the joys of multithreading. The latter, in particular, include JavaScript. To fill this annoying gap and to fill up the box of progressive experience, we offer you a translation of the message from the blog of Nick Matsakis , the Mozilla Foundation programmer, in which he shares his first personal impressions of using Rivertrail - a parallelization tool in JavaScript created by Intel.

I recently started using RivertrailIs a product offered by Intel as a tool for parallelizing data in JS. The project so far leaves only positive impressions. The initial version will be tied to Intel specifications , but I hope that later it will be possible to combine it with other projects created within the framework of PJs.

Rivertrail for Dummies

Here is a small intro for those who have not heard of this product before: Rivertrail is a specification that allows you to process an array in parallel. The basis of the specification is the addition of the ParallelArray class . These arrays have a number of key differences from JS arrays.

they are unchanging
do not have gaps
they can be multidimensional, but multidimensional “correctly” (for example, in a two-dimensional matrix the number of columns will be equal to the number of rows).

Parallel arrays support a fairly large number of high-level operations, such as () map and () reduce , and not only. You can find the complete list in the specification for Rivertrail . These methods take functions as arguments, and work in much the same way as the corresponding functions in regular JS arrays. In addition to a couple of important differences:

firstly, the functions that are taken as an argument should always be “pure”, about this a little lower;
secondly, where possible, the JS engine will try to perform these functions in parallel.

Pure functions

These are ordinary JS functions that do not change the shared state. This does not mean that functions cannot change anything at all - local variables and objects allocated by the function itself can change.

For example, a function that computes the Mandelbrot set is pure:

  function mandelbrot(x, y) {
    var Cr = (x - 256) / scale + 0.407476;
    var Ci = (y - 256) / scale + 0.234204;
    var I = 0, R = 0, I2 = 0, R2 = 0;
    var n = 0;
    while ((R2+I2 < 2.0) && (n < 512)) {
        I = (R+R)*I+Ci;
        R = R2-I2+Cr;
        R2 = R*R;
        I2 = I*I;
        n++;
    }
    return n;
}

mandelbrot () only modifies local variables. But sums () is more interesting in this regard - the function calculates the partial sums of the input array X and stores them in the output array sums:

function sums(x) {
    var sums = [], sum = 0;
    for (var i = 0; i < x.length; i++) {
        sum += x[i];
        sums[i] = sum;
    }
    return sums;
}

What you should pay attention to - this function assigns values to the sums array, changing the heap object, and not just local variables. But, since this object is highlighted in the function itself, it is still a pure function. In fact, a specific example will not be executed in parallel due to a number of limitations in the current version of Rivertrail, but hopefully this will change soon.

And here is an example of a function that will not be considered pure, since it changes X , and X is not allocated locally.

x = [1, 2, 3];
function impure() {
    x[0] += 1;
}

Parallel execution

The main magic of ParallelArray is that, as far as possible, it will perform functions in parallel. In this case, the parallelism or sequence of execution will depend on the implementation of JS itself. The fact that functions intended for parallel execution are clean means that conceptually they can always be executed safely. But this does not mean that the JS engine itself will be able to execute them.

JS engines do a lot of covert optimization, and not all of these optimizations are thread-safe.

So far, the list of operations performed in parallel is rather conservative, but over time it will expand and, ideally, will expand so that any pure functions can be performed in parallel.

You have to do many things to make sure your code is fast. To ensure parallel code execution, you have to do the same things. That's why.

Example

a.b = c

If the JIT compiler analyzes type a and determines, for example, that property b is always stored with a certain offset, it will be able to optimize the code in one assembler instruction. But if the compiler fails to parse the code, in the worst case, the interpreter will be called, working on various hash tables, prototyping trees, and so on. Now you need to understand if ab = c is thread safe. Everything is simple here - the save instruction is safe under the assumption that only one thread has access to the memory into which it is saved. Which is the guarantee of the "purity" of the function. It will be more difficult to solve this when the interpreter will touch hundreds, if not thousands, of lines of code.

Of course, knowing which code will be effectively compiled is not a victory. In the following articles I will talk about some things that ensure parallel execution for today, and give a couple of forecasts in this area for the future.

Parallel Execution Models

Rivertrail Mozill’s usage model is slightly different than Intel’s prototype . This is a plugin compiling JS in OpenCL. The native implementation allows you to execute the code in 4 different ways, but at the moment only the first 2 are available.

Sequentially . Standby mode. Equivalent to writing for loops or using high-level Array methods. This mode works in Nightly builds and, possibly, in Aurora. Try typing var x = new ParallelArray ([1, 2, 3] in the console
Multicore mode . In this mode, and work by default. Multi-core mode distributes one thread per core in the system. Each worker thread will work with a parallel copy of the function. We expect a more functional version of this mode over the next couple of months.
Vectorized mode. It looks like a multi-core one, but there are differences - each worker thread uses SSE instructions to process more than one array element at a time. This is still in the plans after Multicore.
The GPU . This is just a variant of the vectorization mode, but in it the vectorization of the code will not work on the CPU, but on the GPU. There are many technical differences. On the one hand, vectorization will be handled by the GPU in hardware, and the compiler will not have to use special instructions. On the other hand, on some platforms you will have to work hard on copying memory between the CPU and GPU.

Of the described modes, the most general can be considered Serial - it can be applied to any pure function. Multi-core is also quite versatile and can be used when working with pure functions that limit themselves to the currently supported operations.

Vectorization and GPU modes will be more limited. It will make sense to use vectorization only for functions in which the code can be converted into SSE instructions without packing and unpacking, while the GPU will impose certain restrictions on the movement of data, and so on.

A few words about performance

There will be some data since

I have not finished profiling yet
no good detailed tests yet
optimization of calculations also failed

At least, here are the results of computing the Mandelbrot set on my quad-core laptop with Hypertheading.

seq - runtime in sequential mode
par - runtime of the same number of threads in parallel mode
ratio - ratio of time of sequential mode to parallel time (seq / par). The bigger, the better.

Threads	Seq (ms)	Par (ms)	Ratio (Seq / Par)
1	2976	2515	1.18
2	2952	1782	1.65
4	2964	1417	2.09
8	2880	1149	2.50
16	2891	1109	2.60

Obviously, these values can be improved. It would be great if productivity would increase linearly. And I do not think that this is difficult to achieve, I am optimistic.

By the way, the data on the serial mode is taken here, using JS execution based on arrays, and not on the parallel mode ParallelArray, and the code worked for a while to make sure that JIT was used. Although instrumental testing of the use of JIT was not done (which is why I say that “proper profiling was not done”).

About PJs

Some of you may have heard of previous ideas for running JavaScript in parallel, they were called “PJs” (Parallel Java Script). While this is all in the plans, but maybe it will be possible to use some Rivertrail mechanisms in the PJs API. And so far no reasons can be seen that could become an obstacle in this matter. The main thing now is to expand the set of functions supported by multi-core mode.

Summarizing

Combining data (concurrency) comes to JS (at least in Firefox). Implemented APIs can put JS at the forefront of programming languages using concurrency. This is all very simple to use and guarantees serialized execution. PJs also guarantees deterministic execution, but Rivertrail does not because of functions like () reduce . Few languages can boast of this.

Thanks to vikky13 for helping me translate and edit.

Tags: