Speeding up JavaScript using Set data type

Original author: Bret Cameron
  • Transfer
Written material, translation of which we publish today, says he is confident that many developers use JavaScript-mostly data types such as Number, String, Object, Arrayand Boolean. In most cases, this is sufficient. But if you need to make the code as fast and scalable as possible, the use of these data types is not always justified. In this article we will talk about how, using a data type that provides the ability to work with collections of unique values, make code faster. This is especially true for the code of large-scale projects. Of types and a lot in common, but the use of data types



SetArraySetSetable to give the programmer such features that are clearly manifested during the execution of programs that the type does not have Array.

What is the difference between Array and Set data types?


The main feature of a data type Array(we will call objects of this type “arrays”) is that arrays are indexed collections of values. This means that data in arrays is stored using indexes.

const arr = [A, B, C, D];
console.log(arr.indexOf(A)); // Результат: 0
console.log(arr.indexOf(C)); // Результат: 2

Unlike arrays, objects of the type Set(we will call them “collections”) are collections containing data in the key / value format. Instead of using indexes, collections store items using keys. Elements stored in the collection can be sorted in the order they were added to the collection, while the collection cannot store the same elements. In other words, all elements of the collection must be unique.

What are the main strengths of the collections?


If you compare collections and arrays, then you can find some advantages over collections over arrays, especially noticeable in situations in which program performance is important:

  • Search for items. The methods of the arrays indexOf()and includes(), used to search for elements and check whether an element has an element in the array, work slowly.
  • Removing items. An item can be deleted in the collection based on its value. In an array, the equivalent of such an action would be to use a method splice()based on the index of the element. As with item search, removing items using indexes is a slow operation.
  • Insert an item. It is much faster to add an element to the collection than using methods like push()and unshift()into an array.
  • Work with value NaN. The method indexOf()cannot be used to search for values NaNin the array, while using the collection method has()you can find out if it is in it NaN.
  • Removing duplicate items. Type objects Setonly store unique values. If you need to avoid saving duplicate elements in some data structure, this is their significant advantage over arrays. When working with arrays to remove duplicate elements would have to write additional code.

A complete list of built-in methods of type objects Setcan be found here .

On the time complexity of algorithms


The methods that arrays use to search for elements have linear time complexity - O (N). In other words, the element search time is proportional to the size of the array.

Unlike arrays, the methods used by collections to find, delete, and add elements have O (1) time complexity. This means that the size of the collection has virtually no effect on the working time of such methods.

Here you can read more about the time complexity of the algorithms.

How much faster are collections than arrays?


Although the performance indicators of JavaScript code are strongly influenced by a variety of factors, in particular, they depend on the system on which the code is run, on the code runtime used, on the size of the processed data, I hope the results of my tests will give you the opportunity to compare arrays and collections from a practical point of view, and understand how collections are faster than arrays. Now we will consider three simple tests and analyze their results.

▍Test preparation


Before doing any tests, let's create an array with a million elements and the same collection. For the sake of simplicity, we will use a cycle, the first counter value of which will be 0, and the last - 999999:

let arr = [], set = new Set(), n = 1000000;
for (let i = 0; i < n; i++) {
  arr.push(i);
  set.add(i);
}

▍Test No. 1: checking for the presence of an element in an array and in a collection


First, we will check for the presence of an element 123123in the array and in the collection, knowing in advance that there is such an element in these data structures.

let result;
console.time('Array');
result = arr.indexOf(123123) !== -1;
console.timeEnd('Array');
console.time('Set');
result = set.has(123123);
console.timeEnd('Set');

Here are the results of this test:

Array: 0.173ms
Set: 0.023ms

The collection is 7.54 times faster than the array.

▍Test No. 2: insertion of an element


Now let's try adding elements to arrays and collections.

console.time('Array'); 
arr.push(n);
console.timeEnd('Array');
console.time('Set'); 
set.add(n);
console.timeEnd('Set');

Here's what happened:

Array: 0.018ms
Set: 0.003ms

The collection is 6.73 times faster than the array.

▍Test 3: delete an item


Now let's remove the item from each data structure (for example, the one that was added in the previous test). Arrays do not have a built-in method for deleting elements, so we will create a helper function to keep our code in good condition:

const deleteFromArr = (arr, item) => {
  let index = arr.indexOf(item);
  return index !== -1 && arr.splice(index, 1);
};

And here is the test code:

console.time('Array'); 
deleteFromArr(arr, n);
console.timeEnd('Array');
console.time('Set'); 
set.delete(n);
console.timeEnd('Set');

The result is the following:

Array: 1.122ms
Set: 0.015ms

In this case, the collection was 74.13 times faster than the array!

In general, it can be noted that code performance can be significantly increased by using collections instead of arrays. Consider a few practical examples.

Example # 1: removing duplicate values ​​from an array


If you need to quickly remove duplicate values ​​from an array, you can convert it to a collection. This is perhaps the easiest way to get rid of duplicate values:

const duplicateCollection = ['A', 'B', 'B', 'C', 'D', 'B', 'C'];
// Если нужно превратить массив в коллекцию
let uniqueCollection = new Set(duplicateCollection);
console.log(uniqueCollection) // Результат: Set(4) {"A", "B", "C", "D"}
// Если нужно сохранить значения в виде массива
let uniqueCollection = [...new Set(duplicateCollection)];
console.log(uniqueCollection) // Результат: ["A", "B", "C", "D"]

Example 2: Interview task at Google


In one of my material, I examined four options for answering the question asked by an interviewer from Google. The interview was conducted using C ++, but if JavaScript were used instead of this language, the data structure would have to be used in solving the problem Set.

If you want to understand the answer to this question more deeply - read the above article. Here I just show a ready-made solution.

▍ Task


Given an unsorted array of integers and a value sum. Write a function that returns trueif the result of adding any two elements of this array is a value sum. If there are no such elements in the array, the function should return false.

It turns out, for example, that if we are given an array [3, 5, 1, 4]and a value sumequal to 9, then the function should return true, since 4+5=9.

▍Solution


You can approach this problem using the following idea: you need to iterate over the array, creating as it goes through the data structure Set, which will add values ​​that complement the values ​​found to the value sum.

Let's analyze this idea using the example of the above array. When we meet 3, we can add a number to the collection 6, since we know that we need to find two numbers that are equal in total 9. Then, every time we come across a new value from the array, we can check the collection and see if it is there. When we meet the number 5, we will add to the collection 4. And when, finally, we get to the number 4, we find it in the collection and can return it true.

Here's what a solution to this problem might look like:

const findSum = (arr, val) => {
  let searchValues = new Set();
  searchValues.add(val - arr[0]);
  for (let i = 1, length = arr.length; i < length; i++) {
    let searchVal = val - arr[i];
    if (searchValues.has(arr[i])) {
      return true;
    } else {
      searchValues.add(searchVal);
    }
  };
  return false;
};

And here is a more concise solution:

const findSum = (arr, sum) =>
  arr.some((set => n => set.has(n) || !set.add(sum - n))(new Set));

Since the time complexity of the method Set.prototype.has()is O (1), using a data structure Setto store numbers that complement the numbers found in the array to a given value allows you to find a solution in linear time (O (N)).

If the solution would depend on the method Array.prototype.indexOf()or on the method Array.prototype.includes(), the time complexity of each of which is O (N), then the total time complexity of the algorithm would be O (N 2 ). As a result, he would become much slower.

Summary


If you have not encountered a data type Setin JavaScript before , then we hope that now, having an idea of ​​it, you will be able to use it with benefit in your projects.

Dear readers! How would you apply a data structure Setin your code?


Also popular now: