Jabher November 4, 2013 at 17:02

Harmony collections NOW

An article about such wonderful things as Map, WeakMap and Set was already slipping on the hub, but in reality the real capabilities of these APIs were not disclosed (if I nevertheless made good use of the search).
These APIs are not really implemented anywhere except firefox (can be included in chrome canary), but even there, until recently, the use of HTMLElement-like objects as keys was not supported. Polymer, for example, removed only three weeks ago

	if (navigator.userAgent.indexOf('Firefox/') > -1)

Why are they so good? In essence, Map / WeakMap can be perceived as ordinary hash objects, only complex objects (Object, Function, Array) can be used as keys, since the binding does not take place according to the contents, but to the address in memory.
Thus, it becomes possible to attach on the frontend to

dom element
XHR request
File element

This gives us the opportunity to work without id-elements of elements, do data-binding many times faster, create a crazy alternative implementation of promises and so on.
We will talk about WeakMap. Not even that, we will talk about existing polyfills for WeakMap.

As for the rest of the elements:
Map is a hash whose keys can be either primitives or objects.
Set is a collection of unique elements. It is quite simple to build on top of Map. Its most important advantage is the processing of arrays, due to which it is possible to reduce the complexity of the uniq problem from O (n ^ 2) to O (n).

About what features of working with the DBMS appear on the backend with node.js - I probably will not say anything, because I’m not yet well acquainted with the node to authoritatively recommend anything.

The syntax is a little more complicated than that of a classic object, but it is also quite understandable and readable:

var map = new WeakMap;
map.set(someObject, 'someKey');
alert(map.get(someObject));
map.delete(someObject);

In several projects, including one of the previous versions of cornerJS , such a solution was justified in terms of brevity and readability of the code, however, due to the fact that one of the polyfills was used, I considered this solution to be noticeably memory-eating.

The ecmascript.org implementation offers a similar implementation as an option (translated from pseudo-js to a somewhat reduced executable, full implementation on github ):

window.WeakMap = function () {
        var keyList = [];
        var valueList = [];
        return Object.freeze({
            'get': function (key, defaultValue) {
                var index = keyList.indexOf(key);
                var value = valueList[index];
                return index === -1 ? defaultValue : value;
            },
            'set': function (key, value) {
                var index = keyList.indexOf(key);
                if (index === -1) {
                    keyList.push(key);
                    valueList.push(value);
                } else {
                    valueList[index] = value;
                }
            }
            //...(пропущены has, delete и clear)
        });
    };

This implementation has a lot of problems.
Firstly, the memory flows noticeably: even when an element is deleted, it remains in keyList, and if such a WeakMap works on a real project, a huge amount of memory may be leaked to the storage of essentially already deleted objects. GarbageCollector does not work on an element; for it, it is considered to still exist.
At the engine level, this can be solved, but still, GarbageCollector in some cases will not work correctly.
Secondly, when there are a lot of elements in keyList, the selection of the latter begins to really take a lot of time in chrome on macbook air 2013 the search for the 1e9th element took more than a second. The complexity of the problem is O (n), which sometimes significantly affects performance.

There is an alternative implementation that solves the problem with the sampling rate, reducing the complexity of the problem to O (1):

window.WeakMap = (function(){
	var counter = Date.now() % 1e9;
	return function(){
		var name = '__st' + (Math.random() * 1e9 >>> 0) + (counter++ + '__');
		return {
	      set: function(key, value) {
	        var entry = key[this.name];
	        if (entry && entry[0] === key)
	          entry[1] = value;
	        else
	          defineProperty(key, this.name, {value: [key, value], writable: true});
	      },
	      get: function(key) {
	        var entry;
	        return (entry = key[this.name]) && entry[0] === key ?
	            entry[1] : undefined;
      	}
    }
}
})();

However, this does not solve the memory problem, and even makes it worse. A cyclic link appears in the memory of each element: key [name] [0] === key. If you believe the open documentation, the garbage collector is not able to catch such scenarios, so that over time the memory gets clogged up - though it will take a lot of time.
That is why it is worth using Polymer / X-tags with caution, a lot of which rely on WeakMap (a slightly reduced implementation is given above). So, for example, the polyfill for MutationObserver is based on it, which is used not only by them, but also in a large number of third-party projects, many of which may not be aware of memory problems in it.
A couple more minuses are added in comparison with an honest implementation. One of them will be insignificant for the majority: we lose the ability to become attached to frozen objects. The second for some will be very serious. This implementation becomes IE9 +. The first option is so simple that it can work in IE6 + if you connect an Array.prototype.indexOf polyfill (for example, from Mozilla ). Array.prototype.indexOf also appeared in IE9, but at least it can be implemented. Unlike Object.defineProperty (strictly speaking, in IE8 there is support for defineProperty for DOM elements, but you can not rely on this).

As a result, we have either slow operation on large volumes of records, or an extra property in the key element (which means, possibly, even access to it from the outside), problems with some of the iterators, and in any case, memory filled with additional attributes.

There is also an implementation of WeakMaps in jQuery. When you request jQuery.data (element, key), you are working with one of the WeakMap implementations. It works a little easier.
Perhaps you have ever seen something like this in your code:

document.body.jQuery19104357186993584037
20

Now you know that this is the id of the element in jQuery's own WeakMap.

When an element is deleted, it is deleted, but we still:

not protected from outside data changes;
not protected from memory leaks for values - keys are deleted, but the values associated with them are stored in memory.

Here we come to the main part.

WeakMap was named that way because they wanted to implement a weak reference - a weak connection that is not taken into account in the GC, respectively, both the key and value are deleted when there are no other entries for this key.
In the correct implementation, both the key and the value should be deleted.
But I did not even imagine that this was possible until I came across in one of the repositories on the github in the headline a phrase that could not help but surprise:

Shim for WeakMap with non-leaky O (1) lookup time

It relies on Object.defineProperty - which means IE9 + and the inability to refer to frozen objects, however, the element had no new attributes, which could not but surprise.
The repository itself is here: https://github.com/Benvie/WeakMap/
I could not resist checking how much this could be true at all.

A simple and quick test with memory snapshots was made for this.

the first was shot on a blank page ( blue arrow )
the second starred in the implementation of polyfill ( yellow arrow )
the third was removed after creating the elements and entering them into the memory ( red arrow )
the fourth was removed after the remaining links to the elements were deleted ( green arrow )

fully test code

//снимается "синий" снэпшот 1
//запускается полифилл
//снимается "желтый" снэпшот 2
var someStorage = [];
var map = new WeakMap;
var length = 1e5; //для ie брался размер массива 1e4, так как снэпшот для 1e5 не закончил сниматься даже через 10 минут работы профайлера.
for (var i = 0; i < length; i++) {
    someStorage[i] = {randomValue: Math.random() * 1e9 >>> 0}; //сохраняем значение в памяти для того, чтобы у нас была ссылка на него - так garbage collector не будет удялять элемент из памяти
    map.set(someStorage[i], {otherRandomValue: Math.random() * 1e9 >>> 0})
}
//снимается "красный" снэпшот 3
for (var i = 0; i < length; i++) {
    someStorage[i] = void 0;
}
//снимается "желтый" снэпшот 4

The results are as follows:

Chrome 30 (Mac 10.9)

Firefox 24 (Mac 10.9)

IE 11 (Win 8.1)

I admit honestly - I was shocked when I saw that all the memory that the application had taken was given up - both memory for keys and for values.

After that, I decided just in case to make sure that the search time is really O (1). It turned out to be true.

Chrome 30 (Mac 10.9)

Firefox 24 (Mac 10.9)

In the end, it's all true. WeakMap can really be used in a large number of projects, it gives huge potential for data-binding, working with ospreys, creating a large number of plug-ins, and this can be done today, in any case we have only every fifth IE8 + project and the rest of IE9 +. At the same time, WeakMap not only simplifies the work with certain types of data, but also allows you to improve performance and optimize, and in some cases significantly, work with memory.
The main thing is to choose the right polyfill.

By the way, you can make a double polyfill, something like:

if (Object.defineProperty) {
	window.weakMap = //O(1) решение без текущей памяти и с задаваемыми атрибутами
} else {
	window.weakMap = //O(n) правильное решение, но с текущей памятью
}

Of course, the most interesting thing about this solution is how it works. The solution is so confusing that in order to understand it, I had to contact the author ( Brandon Benvie ) and ask him a few questions in order to understand the most confusing details.

I don’t like to get into someone else’s code thoroughly, but this one was extremely interesting to implement, and it turned out that it was worth studying, nevertheless Brandon (the author) wrote the ES6 compiler on ES3, created app.js (a platform for desktop applications on a node) and implemented many more really complex solutions in JS.

Disclaimer: in most examples, the code is slightly reduced for better readability: some checks are removed, some things are inlined. The code that I am quoting here is not recommended for use in practice, since this is either a kind of shamanism so that the garbage collector sees that the memory can be freed, or practices that allow the code to work faster, but can easily confuse and frighten the ordinary average front-end developer.

It may confuse at first that all methods return

function toString() { [native code] }

In this case, bind is not done anywhere, which can turn an ordinary function into such a function.
However, everything turned out to be much simpler: the prep method, which is set like this:

var prep =	{ __proto__: [] } instanceof Array
  ? function(f){ f.__proto__ = stringifier }
  : function(f){ define(f, stringifier) };

makes an __proto__ object that prototypes a particular element.
stringifier is another function:

var stringifier = function toString(){
  return src[0] + nameOf(this) + src[1];
};

It is set to itself as a toString method (as I understand it - this is done for brevity). As a result, we actually get something like:

a.__proto__ = {
    toString: function(){
        return "function "+this.name+"() { [native code] }"
    }
}

By the way, this particular code will not work particularly well due to the fact that the name attribute of the function is not available in all JS implementations; it is considered bad practice to use it. The function should not rely on the environment (including its name) and should work out the same under any conditions.
Just because Function.prototype.name is not in all implementations, in Brandon code the function name is defined like this:

function nameOf(func){
return 'name' in func ? func.name : toSource.call(func).match(/^\n?function\s?(\w*)?_?\(/)[1];
}

It was enough to remove define (stringifier, stringifier); so that the code would stop confusing us.

So, we need to understand where the keys are stored. To do this, we set the value, and it falls into the set function:

function set(key, value){
    unwrap(this).set(key, value);
}
function get(key){
      return unwrap(this).get(key);
}

And here the fun begins:

var unwrap = function(weakMap){
      return data.unlock(weakMap).value;
}

data is an instance of the inner class Data, which in itself is very similar to WeakMap, but it has only two methods, get and set, while all weakMap themselves are stored in one large Data object, which still stores for each weakMap one data object.
We have a meta-WeakMap (if you call a spade a spade), which stores all WeakMap s, each of which already contains key-value pairs for objects.

And finally, the most interesting thing in the Data object.

The first trick: we hide the value from the general set of keys so that it does not fall into any of the iterators. To do this, we reassign getOwnPropertyNames

Object.defineProperty(Object, 'getOwnPropertyNames', {
	value: function getOwnPropertyNames(obj){
		var props = getProps(obj);
		if (Object.prototype.hasOwnProperty.call(obj, globalID)) //используем не собственный hasOwnProperty, потому что он тоже мог быть где-то переназначен.
			props.splice(props.indexOf(globalID), 1);
		return props;
	}
});

As a result, even the chromium debugger does not suspect the existence of the property: The

second trick:

function storage(obj){
	if (hasOwn.call(obj, globalID))
		return obj[globalID];
//опущена проверка на Object.isExtensible
	var store = create(null);
	defProp(obj, globalID, { value: store });
		return store;
};

At the output, we get a unique object with the key value.

function Data(){
  var puid = createUID(), //просто создает уникальный 16-значный дескриптор
      secret = {}; //трюк - в качестве секрета для каждой WeakMap используется пустой объект, а если более корректно с точки зрения работы с памятью ссылка на него. Мы можем как угодно изменять его, но пока он есть в памяти, мы на него не опираемся. А удалить его невозможно – он является private свойством экземпляра класса Data
  this.unlock = function(obj){
    var store = storage(obj);
    if (hasOwn.call(store, puid))
      return store[puid](secret);
    var data = Object.create(null, { value: { writable: true, value: undefined } }); 
    Object.defineProperty(store, puid, {
    	value: (function(secret,data){
			return function(key){ if(key===secret) return data }
    	})(secret, data)
    });
    return data;
  }
}

The first very interesting line is:

Object.create(null, { value: { writable: true, value: undefined } });

We create the object, inheriting from null, which gives us a dummy object without any extensions, and then we pass the second argument to propertiesObject, which allows us to set the values we need through defineProperty. We explicitly create a value, but pass it undefined, solely so that it already exists and subsequently passes a check ("value" in data).

The second insanely interesting line:

Object.defineProperty(store, puid, {
	value: (function(secret,data){
		return function(key){
			if(key===secret)
				return data
		}
	})(secret, data)
});

In essence, we get a function that takes a unique, specific dummy object, unique to each WeakMap, and returns the stored value if it matches the secret key.

In fact, in the original code it is even more confusing:

defProp(store, puid, {
    value: new Function('s', 'l', 'return function(k){if(k===s)return l}')(secret, data)
});

Brandon commented on it like this:

Simply performance. By defining a literal function, you also entrain all the local variables in scope. By using `Function` the only thing in scope is the global object. The goal was to prevent inadvertently leaking memory. Your version would probably work fine as well, but there's the potential to leak references that wouldn't otherwise be retained, and much of the goal of using a WeakMap is to prevent this kind of leakage.

In the free and extended translation into Russian:

This is done for performance. By creating a function through function () {}, you bind to the local osprey in it - despite the fact that it does not use it in any way. If you use a debugger, you can see that from the inside of this function the whole array in which it was created is available.
new Function creates a function that works in the global scope, which removes the binding to the local one. The objective of the implementation was the absence of leaks in the memory, and all that was done there was done primarily for the absence of leaks.

As a result, in reality, it looks like this in memory:

var map = new WeakMap;
var a = {};
map.set(a, ['test'])
console.log(a['2ql3g5ae6pcwstt9']['o6tnzx1xskf39pb9'])
>function (k){if(k===s)return l}

where a ['2ql3g5ae6pcwstt9'] is a keystore representing a dummy object with a name from a randomly generated globalID common to all elements that fall into any weakMap, and a ['2ql3g5ae6pcwstt9'] ['o6tnzx1xskf39pb9 an object that stores a closure with

{
	value: 'содержимое'
}

as a second argument. In the case of a change in value, we replace value, while the object remains as before inside the closure.

All keys are hidden in all possible senses, up to removing them from Object.getOwnPropertyNames, due to this we get almost non-existing keys that cannot be suspected of being inside the code - they are not listed in iterators and are not given in any of queries.
As a result, we get an IE9 + implementation of WeakMaps with absolutely secure (access without modification inside the code is impossible) with keys to open a container with content that can be easily changed and which is always deleted with the key, which gives us a really non-leaking WeakMap implementation with complexity of O (1) and unwritable and undeletable due to Object.defineProperty with attributes configurable: false, enumerable: false, writable: false (set by default) with access keys.

Of course, I don’t like the fact that an attribute is created for the key, but I don’t know which solution could be more ideal than that. It remains only to admit that Brandon did a great job of working with memory and made a decision that should be used in production.

I will be glad to hear in the comments ideas about why you can still use the Harmony collections and WeakMap in particular, and the attitude to similar APIs in general.

UPD Brandon explained why (0, eval) ('this') was done at the end. (0, eval) ("this") is roughly equivalent to var e = eval; e ("this"). The fact is that eval always works in the local array, however if you assign eval to another variable and run it, it will be executed in the global environment.

Tags:

Harmony collections NOW

Also popular now: