Deferred array processing

Once, while developing a complex JavaScript application, I ran into a seemingly insurmountable problem: processing a large array in Internet Explorer (hereinafter IE). Because I had initially tested in Chrome, I had not noticed any problems with arrays. Moreover, even nested arrays caused no concern, since Chrome coped easily with heavy workloads. IE, however, harshly exposed every flaw in my code, especially in the handling of large arrays.

In a nutshell


The application had to respond in real time to events coming from the server. For this we used the Socket.IO library, which made it easy to implement two-way communication with the server. The application was also tied to a third-party service, which I could access only through an external REST API.

The weak link was a large loop that processed an array of data received from that third-party service while also modifying the DOM on each iteration. When this happened, IE froze for a long time and, as a result, dropped the connection. From the outside this was unacceptable: a locked-up application and a constantly spinning loading icon could not satisfy the customer. Many will rightly point out the high cost of modifying the DOM, and indeed that part of the problem was solved by moving all DOM changes outside the loop. However, the main problem remained: the connection was still dropped during array processing.

A full refactoring was a frightening prospect, since thousands of lines of code had already been written and the deadline was approaching. It was under these conditions that the idea of “deferred”, partial processing of an array was born.

Algorithm


The array is logically divided into several ranges, each of which is processed after a delay. The figure illustrates the idea:



Thus, given a single global array, the function processes it in parts, with a predetermined delay between them. Because JavaScript runs on a single main thread and handles events asynchronously, this technique frees that thread between parts when processing very large arrays.
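
To make the split concrete, here is a minimal sketch that only lists the index ranges a chunked pass would visit. The helper name splitIntoRanges is hypothetical and not part of the original code, and it assumes stepRange is a positive integer.

```javascript
// Hypothetical helper (not part of the original code): lists the
// [start, stop) index ranges that a chunked pass over an array visits.
// Assumes stepRange is a positive integer.
function splitIntoRanges(length, stepRange) {
    var ranges = [];
    for (var start = 0; start < length; start += stepRange) {
        // The last range is clipped to the end of the array.
        ranges.push([start, Math.min(start + stepRange, length)]);
    }
    return ranges;
}

console.log(splitIntoRanges(10, 4)); // [ [ 0, 4 ], [ 4, 8 ], [ 8, 10 ] ]
```

Each pair is then handed to the processing function in its own timer callback, so other events can be handled between ranges.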

The main body of the handler:
//Lazy array processing
/**
* Input parameters
* params = {
*   outerFunction: function(array, start, stop),
*   mainArray: [],
*   startIndex: Int,
*   stepRange: Int,
*   timeOut: Int,
*   callback: function()
* }
*/
var lazyProcessing = function(params) {
    var innerFunction = (function(_) {
        return function() {
           var arrayLength = _.mainArray.length,
               stopIndex   = _.startIndex + _.stepRange;
            if(arrayLength <= stopIndex) {
                // Last (possibly shorter) range: stop at the end of the array
                _.outerFunction(
                       _.mainArray,
                       _.startIndex,
                       arrayLength);
                if(_.callback != undefined) {
                    _.callback();
                }
                return;
            } else {
                _.outerFunction(
                       _.mainArray,
                       _.startIndex,
                       stopIndex);
                _.startIndex += _.stepRange;
                lazyProcessing(_);
            }
        }
    })(params);
    setTimeout(innerFunction, params.timeOut);
};
//Outer function works with mainArray in given range
var func = function(mainArray, start, stop) {
    //TODO: put everything you want to do with elements of mainArray here
    for (var i = start; i < stop; ++i) {
        // do something
    }
};

In the body of the lazyProcessing function, a local function innerFunction is created that closes over the received params object. This way every call to lazyProcessing, including the recursive ones, keeps a reference to its own parameters.

The unnamed function stored in innerFunction performs the following fairly simple steps:
  1. Checks whether the end of the global array has been reached
  2. Calls the outer function outerFunction with the appropriate stop value
  3. If the end of the array has been reached, calls the callback function
  4. Otherwise, calls lazyProcessing recursively.

The array itself is passed to the outer function outerFunction, along with the start and end indices of the loop. (Since the array is global, passing it is not strictly necessary, but omitting it would hurt clarity.) The processing results can be stored either in the array itself or in other global variables, depending on the task at hand.

Upon reaching the end of the array, a callback function may be called, but this is optional.
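
As a rough illustration of keeping results outside the outer function and of the optional callback, the sketch below squares the elements of a small array in ranges. runInChunks is a hypothetical, simplified stand-in for lazyProcessing that accepts the same parameter names; squares, total and squareRange are illustrative names as well.

```javascript
// Hypothetical simplified runner (an assumption, not the article's code).
// It accepts the same params object as lazyProcessing.
function runInChunks(params) {
    setTimeout(function () {
        var stop = Math.min(params.startIndex + params.stepRange,
                            params.mainArray.length);
        params.outerFunction(params.mainArray, params.startIndex, stop);
        params.startIndex = stop;
        if (stop < params.mainArray.length) {
            runInChunks(params);              // schedule the next range
        } else if (typeof params.callback === "function") {
            params.callback();                // optional completion hook
        }
    }, params.timeOut);
}

// Results live in variables outside the outer function.
var squares = [];   // per-element results
var total = 0;      // running aggregate

function squareRange(mainArray, start, stop) {
    for (var i = start; i < stop; ++i) {
        squares[i] = mainArray[i] * mainArray[i];
        total += squares[i];
    }
}

runInChunks({
    outerFunction: squareRange,
    mainArray: [1, 2, 3, 4, 5],
    startIndex: 0,
    stepRange: 2,
    timeOut: 10,
    callback: function () {
        console.log(squares, total); // [ 1, 4, 9, 16, 25 ] 55
    }
});
```

The results are only complete once the callback fires; code that runs right after the runInChunks call still sees an empty squares array.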

Pros and cons


Naturally, this solution has its pitfalls and disadvantages:
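  1. If stepRange is set very small, a sufficiently large mainArray needs a huge number of timer callbacks. Each step is scheduled through setTimeout, so the call stack itself does not grow, but the total processing time does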
  2. The thread still blocks while the outer function outerFunction is running, i.e. performance depends directly on the algorithm that processes the array elements
  3. The “noodles” of nested and returned functions do not look very friendly

On the other hand, processing the array in parts at regular intervals does not block program execution for long, which lets other callback functions run in between.
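
To get a feel for the trade-off, the cost of a given stepRange and timeOut can be estimated up front. The sketch below assumes that the delay between parts dominates and ignores the time spent inside outerFunction; estimateSchedule is a hypothetical helper, not part of the original code. Browsers may also enforce a minimum delay for nested timers, so real figures can be higher.

```javascript
// Hypothetical helper: lower bound on the number of timer callbacks and on
// the total delay for one pass. Assumes stepRange > 0 and ignores the time
// spent inside the processing function itself.
function estimateSchedule(length, stepRange, timeOut) {
    var steps = Math.ceil(length / stepRange);
    return { steps: steps, minTotalDelayMs: steps * timeOut };
}

console.log(estimateSchedule(100000, 1000, 100));
// { steps: 100, minTotalDelayMs: 10000 }
```

With these numbers a pass over 100000 elements takes at least ten seconds, so stepRange and timeOut are worth tuning against how long the interface may stay unresponsive during a single part.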

Full working example:
//Test array
var test = [];
for(var i = 0; i < 100000; ++i) {
    test[i] = i;
}
//Lazy array processing
/*
params = {
    outerFunction: function(array, start, stop),
    mainArray: [],
    startIndex: Int,
    stepRange: Int,
    timeOut: Int,
    callback: function()
}
*/
var lazyProcessing = function(params) {
    var _params = params;
    var innerFunction = (function(_) {
        return function() {
            var arrayLength = _.mainArray.length,
                stopIndex   = _.startIndex + _.stepRange;
            if(arrayLength <= stopIndex) {
                // Last (possibly shorter) range: stop at the end of the array
                _.outerFunction(
                    _.mainArray,
                    _.startIndex,
                    arrayLength);
                if(_.callback != undefined) {
                    _.callback();
                }
                return;
            } else {
                _.outerFunction(
                    _.mainArray,
                    _.startIndex,
                    stopIndex);
                _.startIndex += _.stepRange;
                lazyProcessing(_);
            }
        }
    })(_params);
    setTimeout(innerFunction, _params.timeOut);
};
//Test function works with array
var func = function(mainArray, start, stop) {
    //TODO: this should be everything
    //you want to do with elements of mainArray
    var _t = 0;
    for (var i = start; i < stop; ++i) {
        mainArray[i] = mainArray[i]+2;
        _t += mainArray[i];
    }
};
lazyProcessing({
    outerFunction: func,
    mainArray: test,
    startIndex: 0,
    stepRange: 1000,
    timeOut: 100,
    callback: function() {
        alert("Done");
    }
});



P.S. The user zimorodok contributed a wonderful example, the same in spirit and in essence, and I cannot help but add it.
It walks through an array, calling a callback for each element, with a configurable timeout:
/*
example of use
var arr = ['masha','nadya', 'elena'];
iterate_async(arr, function (el, index, arr) {
    console.log(el + ' is #' + (index + 1));
}, 99);
*/
function iterate_async (arr, callback, timeout) {
	var item_to_proceed;
	item_to_proceed = 0;
	(function proceed_next () {
		if (item_to_proceed < arr.length) {
			setTimeout(function () {
				callback.call(arr, arr[item_to_proceed], item_to_proceed, arr);
				item_to_proceed += 1;
				proceed_next();
			}, timeout || 50);
		}
	}());
}
