D3.js. Graph visualization

D3.js is a JavaScript library for manipulating documents based on data. D3 helps bring data to life using HTML, SVG, and CSS. It lets you bind arbitrary data to the DOM and then apply data-driven transformations to the document.

A basic knowledge of D3 is useful for understanding this article. In it we will look at the implementation of force-directed graph drawing algorithms, which in D3 (version 3) is called Force Layout. This is a class of graph visualization algorithms that compute the position of each node by simulating an attractive force between each pair of connected nodes and a repulsive force between all nodes.


In the picture above, you can see how the well-known New York Times visualized the connections between contenders for the upcoming Oscars. The final layout is static, but the node positions were computed with Force Layout. An internal editor was built for the graph, which made it possible to save the node coordinates for use in the static version.

NB! Just yesterday, a new version (version 4) of D3.js was released, so the article I had started may already be considered obsolete. Nevertheless, I hope it will be useful for understanding the capabilities of the new version. You can read about the changes the new version makes to the graph visualization API here.

A bit about Layouts

The D3.js API contains several hundred functions, which for convenience are divided into logical blocks; one of them is Layouts. A layout encapsulates a strategy for placing data-related elements visually relative to each other. Layouts take a set of input data, apply an algorithm or heuristic to it, and output the result in a form suited to a graphical representation of the data.

Layouts are not much different from d3.svg path generators in that they help transform data for visual presentation. However, layouts typically work with the data set as a whole rather than with individual elements. In addition, the results of a layout are not limited to a single SVG. Some layouts are dynamic over time: Force Layout, for example, where after calling the .start() method on an instance of d3.layout.force() you can listen for 'tick' events to update the layout.

More than a dozen layouts are built into D3. Their instances are often (though not necessarily) functions that can be configured and then applied to a data set. In other cases, separate methods or event handlers are used to supply the data and present the result. To use a particular layout, consult its documentation.

Force layout

Force Layout is a flexible force-directed graph layout that uses the Verlet method of numerical integration, which makes it easy to impose constraints on how graph elements move relative to each other. You can read more about physical modeling here. The implementation uses the quadtree module to accelerate the interaction of graph nodes with each other by means of the Barnes-Hut approximation. In addition to the repulsive charge force between nodes, a pseudo-gravitational force, gravity, keeps the nodes within the visible area and prevents disconnected subgraphs from being pushed out of view, while the links of the graph have a fixed length, linkDistance, and act as geometric constraints. Additional custom forces and constraints can be applied in the tick event by updating the x and y attributes of the nodes. For a comprehensive overview of the features with examples, see the video talk by Mike Bostock, one of the key D3 developers, and the slides from that talk.

Some fun examples: divergent forces, multiple foci, graph constructor, force-directed tree, force-directed symbols, force-directed images and labels, force-directed states, sticky force layout.


Like other classes in D3, layouts follow the method chaining pattern: setter methods return the layout itself, so several setters can be invoked in a single call chain. Unlike some other layout implementations, Force Layout keeps a reference to the nodes and links of the graph internally; each Force Layout instance can therefore be used with only one data set.
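The getter/setter convention behind this chaining can be sketched in plain JavaScript. This is a simplified illustration rather than D3's source: makeLayout and its two properties are hypothetical names chosen for the example.

```javascript
// Hypothetical miniature "layout" illustrating D3-style chained accessors.
function makeLayout() {
  var size = [1, 1];   // mirrors the documented default size
  var gravity = 0.1;   // mirrors the documented default gravity
  var layout = {};

  // With no argument the method acts as a getter; with one argument it
  // stores the value and returns the layout so calls can be chained.
  layout.size = function(value) {
    if (!arguments.length) return size;
    size = value;
    return layout;
  };

  layout.gravity = function(value) {
    if (!arguments.length) return gravity;
    gravity = +value;
    return layout;
  };

  return layout;
}

// Two setters in one chain, followed later by getter calls.
var demoLayout = makeLayout().size([960, 500]).gravity(0.05);
```

Because every setter returns the same object, the order of the chained calls does not matter here, and calling a method without arguments reads the value back.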

d3.layout.force()

Creates a new force-directed layout with the following default settings: size 1 × 1, link strength 1, friction 0.9, link distance 20, charge strength -30, gravity strength 0.1, and theta 0.8 (these parameters are described below). By default, the nodes and links of the graph are empty arrays, and when the layout is started, the internal cooling parameter alpha is set to 0.1. The general pattern for building a force-directed layout is to set all the configuration properties and then call the .start() method:

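var force = d3.layout.force()
    .nodes(nodes).links(links)  // the graph data, defined elsewhere
    .size([w, h])               // available width and height
    .start();                   // begin the simulation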

Note that, unlike other D3 layouts, the force-directed layout is not tied to a particular visual representation. Nodes are usually displayed as SVG circle elements and links as SVG line elements, but you can also display nodes as symbols or images.

force.size([width, height])

If size is passed, sets the available layout size (width and height). Otherwise, returns the current size, which is [1, 1] by default. In the force-directed layout, size affects two things: the gravitational center and the initial random position of added nodes (their x and y coordinates). For a size of [x, y], the center of gravity is simply [x / 2, y / 2]. When nodes are added to Force Layout without x and y attributes already set, those attributes are initialized from a uniform random distribution over the ranges [0, x] and [0, y], respectively.

force.linkDistance([distance])

If distance is passed, sets the target distance between connected nodes (the link length). Otherwise, returns the current link length, which is 20 by default. If distance is a constant, all links have the same length. If distance is a function, it is evaluated for each link (in order) with two arguments, the link and its index, and with the current Force Layout as its this context. The value returned by the function is used as the length of that link. The function is evaluated when the layout is started (the .start() method).
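The constant-or-function behavior can be sketched as follows. This is a minimal illustration under stated assumptions: resolvePerLink is a hypothetical helper (not a D3 function), and the value field on each link is an example of user-defined data.

```javascript
// Sketch: turn a constant-or-function setting into one value per link.
// `resolvePerLink` is a hypothetical helper, not part of D3.
function resolvePerLink(setting, links, layout) {
  return links.map(function(link, i) {
    return typeof setting === "function"
      ? +setting.call(layout, link, i)   // `this` inside the function is the layout
      : +setting;                        // constant: same value for every link
  });
}

var sampleLinks = [
  { source: 0, target: 1, value: 1 },
  { source: 1, target: 2, value: 4 }
];

// Constant: every link gets the same length.
var constantDistances = resolvePerLink(20, sampleLinks, null);

// Function: length derived from the user-defined `value` field of each link.
var computedDistances = resolvePerLink(function(link, i) {
  return 10 * link.value;
}, sampleLinks, null);
```

The same pattern applies to the other settings in this article that accept either a constant or a function.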

Links are implemented not as spring forces, as is common in other force-directed layouts, but as weak geometric constraints. On each tick of the layout, the distance between each pair of connected nodes is computed and compared with the target distance; the nodes are then moved toward or away from each other so that they converge on the desired distance. This approach, combined with Verlet integration, is much more stable than approaches based on spring forces, and it also allows other constraints, such as a hierarchical layout, to be implemented flexibly in the tick event handler.
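One relaxation step of such a constraint might look like the sketch below. It is a simplified model, not D3's source: relaxLink is a hypothetical name, and splitting the correction equally between the two endpoints is an assumption made for brevity.

```javascript
// Simplified model of one link relaxation step (not D3's actual source).
// Both endpoints share the correction equally, which is an assumption.
function relaxLink(source, target, distance, strength, alpha) {
  var dx = target.x - source.x;
  var dy = target.y - source.y;
  var length = Math.sqrt(dx * dx + dy * dy);
  if (length === 0) return;   // coincident nodes: no direction to move in

  // Fraction of the current separation to remove (negative: add separation).
  var k = alpha * strength * (length - distance) / length;
  target.x -= dx * k * 0.5;
  target.y -= dy * k * 0.5;
  source.x += dx * k * 0.5;
  source.y += dy * k * 0.5;
}

var nodeA = { x: 0, y: 0 };
var nodeB = { x: 100, y: 0 };
relaxLink(nodeA, nodeB, 20, 1, 0.1);   // too far apart: the nodes move closer
```

In this model, for a fixed alpha, each step shrinks the gap between the current and target length by a factor of (1 - alpha × strength), so the link converges on the target without overshooting as long as that product stays within [0, 1].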

force.linkStrength([strength])

If strength is passed, sets the stiffness of the links to a value in the range [0,1]. Otherwise, returns the current stiffness, which is 1 by default. If strength is a constant, all links have the same stiffness. If strength is a function, it is evaluated for each link (in order) with two arguments, the link and its index, and with the current Force Layout as its this context. The value returned by the function is used as the stiffness of that link. The function is evaluated when the layout is started (the .start() method).

force.friction([friction])

If friction is passed, sets the friction coefficient. Otherwise, returns the current coefficient, which is 0.9 by default. The name of this parameter is somewhat misleading: it does not correspond to the standard physical coefficient of friction. Rather, it behaves like velocity decay: on each tick of the simulation, the velocity of every node is scaled by the friction value. Thus a value of 1 corresponds to a frictionless environment, and a value of 0 freezes all nodes in place. Values outside the range [0,1] are not recommended and may have destabilizing effects.
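The effect of friction is easiest to see in a bare position-Verlet step, where velocity is not stored but implied by the current and previous positions. The sketch below is a simplified model with a hypothetical verletStep helper, not D3's source.

```javascript
// Simplified position-Verlet step: velocity is implied by the difference
// between the current (x, y) and previous (px, py) positions.
function verletStep(node, friction) {
  var vx = node.x - node.px;   // implied velocity along x
  var vy = node.y - node.py;   // implied velocity along y
  node.px = node.x;            // the current position becomes the previous one
  node.py = node.y;
  node.x += vx * friction;     // friction scales the carried-over velocity
  node.y += vy * friction;
}

var moving = { x: 10, y: 0, px: 0, py: 0 };   // implied velocity: 10 per tick
verletStep(moving, 0.9);   // x is now 19 (moved by 10 * 0.9)
verletStep(moving, 0.9);   // x is now about 27.1 (moved by 9 * 0.9)
```

With friction 1 the node would keep moving by 10 on every step, and with friction 0 it would stop immediately, which matches the two extremes described above.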

force.charge([charge])

If charge is passed, sets the charge strength of the nodes. Otherwise, returns the current charge strength, which is -30 by default. If charge is a constant, all nodes have the same charge strength. If charge is a function, it is evaluated for each node (in order) with two arguments, the node and its index, and with the current Force Layout as its this context. The value returned by the function is used as the charge strength of that node. The function is evaluated when the layout is started (the .start() method).

A negative charge makes nodes repel each other, and a positive charge makes them attract. Use negative values for graph layout; positive values can be used to simulate the N-body problem. All nodes are assumed to be infinitesimal points with equal charge and mass. Charge forces are implemented efficiently with the Barnes-Hut algorithm, which computes a quadtree on each tick. Setting the charge strength to 0 disables the quadtree computation, which can noticeably improve performance if you do not need charge forces.

force.chargeDistance([distance])

If distance is passed, sets the maximum distance over which charge forces act. Otherwise, returns the current maximum distance, which is infinity by default. Specifying a finite distance improves the performance of Force Layout and produces a more localized layout; this is especially useful in combination with custom gravity.
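The sign convention of the charge and the effect of a maximum distance can be illustrated with a naive pairwise computation that uses no quadtree. This is a simplified model with a hypothetical chargeDisplacement helper; the exact force law used by D3 is not stated in this article, so the scaling below is an assumption.

```javascript
// Naive pairwise charge sketch (no quadtree): returns the displacement that
// `other` induces on `node`. Simplified model, not D3's actual source.
function chargeDisplacement(node, other, charge, alpha, maxDistance) {
  var dx = other.x - node.x;
  var dy = other.y - node.y;
  var d2 = dx * dx + dy * dy;
  var limit = maxDistance === undefined ? Infinity : maxDistance;
  if (d2 === 0 || d2 >= limit * limit) return { x: 0, y: 0 };   // no effect

  var k = alpha * charge / d2;       // assumed scaling: weaker when farther apart
  return { x: dx * k, y: dy * k };   // negative charge: points away from `other`
}

var origin = { x: 0, y: 0 };
var neighbor = { x: 10, y: 0 };
var repel = chargeDisplacement(origin, neighbor, -30, 0.1);       // pushed away
var attract = chargeDisplacement(origin, neighbor, 30, 0.1);      // pulled closer
var ignored = chargeDisplacement(origin, neighbor, -30, 0.1, 5);  // beyond the cutoff
```

A finite cutoff lets distant pairs be skipped entirely, which is where the performance gain described above comes from.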

force.theta([theta])

If theta is passed, sets the Barnes-Hut approximation criterion. Otherwise, returns the current value, which is 0.8 by default. Unlike links, which affect only the two connected nodes, the charge force is global: every node affects every other node, even those in disconnected subgraphs.

To avoid quadratic time complexity, Force Layout uses the Barnes-Hut algorithm, which takes O(n log n) time per tick. On each tick, a quadtree is built to store the current node positions; then, for each node, the sum of the charge forces from all other nodes is computed. For groups of nodes that are far away, the charge force is approximated by treating the distant group as one large node. Theta determines the accuracy of the computation: if the ratio of the area of a quadrant in the quadtree to the distance between the node and the quadrant's center of mass is less than theta, all nodes in that quadrant are treated as a single large node rather than computed individually.
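The acceptance test and the "one large node" aggregation can be sketched as below. Two assumptions are made and should be read as such: the quadrant's size is measured by its width, as in the classic formulation of Barnes-Hut, and all points carry the same charge. The helper names canApproximate and aggregate are hypothetical.

```javascript
// Sketch of the Barnes-Hut acceptance test. Measuring the quadrant by its
// width is an assumption borrowed from the classic algorithm.
function canApproximate(quadWidth, distanceToCenterOfMass, theta) {
  return quadWidth / distanceToCenterOfMass < theta;
}

// Collapse a group of equally charged points into one aggregate body.
function aggregate(points, chargePerPoint) {
  var sumX = 0, sumY = 0;
  points.forEach(function(p) { sumX += p.x; sumY += p.y; });
  return {
    x: sumX / points.length,   // center of mass for equal masses
    y: sumY / points.length,
    charge: chargePerPoint * points.length
  };
}

var cluster = [
  { x: 100, y: 100 }, { x: 102, y: 100 },
  { x: 100, y: 102 }, { x: 102, y: 102 }
];
var body = aggregate(cluster, -30);          // one body standing in for four points
var farAway = canApproximate(2, 140, 0.8);   // small, distant quadrant
var tooClose = canApproximate(2, 2, 0.8);    // quadrant as wide as it is far
```

A smaller theta accepts fewer quadrants for approximation, which is more accurate but slower; theta of 0 would disable the approximation altogether in this sketch.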

force.gravity([gravity])

If gravity is passed, sets the gravitational strength. Otherwise, returns the current gravitational strength, which is 0.1 by default. The name of this parameter is somewhat misleading: it does not correspond to physical gravity (which can be simulated by assigning a positive value to the charge parameter). Instead, gravity is implemented as a weak geometric constraint, similar to a virtual spring connecting each node to the center of the layout. This approach has useful properties: near the center of the layout the gravitational pull is almost zero, which avoids any local distortion of the layout; as nodes move farther from the center, the pull grows in linear proportion to the distance. The gravitational pull will therefore always overcome the repulsive charge forces beyond some threshold distance, preventing disconnected nodes from escaping the layout.
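The linear growth of the pull with distance can be seen in a small sketch. It is a simplified model rather than D3's source; applyGravity is a hypothetical helper, and scaling the step by alpha × gravity is an assumption.

```javascript
// Simplified gravity step: each node moves a fraction (alpha * gravity) of
// the way toward the layout center. Not D3's actual source.
function applyGravity(nodes, size, gravity, alpha) {
  var cx = size[0] / 2;   // center of the layout
  var cy = size[1] / 2;
  var k = alpha * gravity;
  nodes.forEach(function(node) {
    node.x += (cx - node.x) * k;
    node.y += (cy - node.y) * k;
  });
}

var gravityNodes = [
  { x: 480, y: 250 },   // at the center: no pull
  { x: 580, y: 250 },   // 100 units from the center
  { x: 680, y: 250 }    // 200 units from the center: twice the pull
];
applyGravity(gravityNodes, [960, 500], 0.1, 0.1);
```

Because the displacement is proportional to the distance from the center, doubling the distance doubles the pull, while a node sitting exactly at the center is not moved at all.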

Gravity can be turned off by setting the force of gravitational attraction to zero. When gravity is turned off, it is recommended to implement some other geometric restriction to prevent nodes from going beyond the layout boundaries.

force.nodes([nodes])

If nodes is passed, sets the nodes of the layout to the specified array. Otherwise, returns the current array of nodes, which is empty by default. Each node has the following attributes:

  • index - the zero-based index of the node in the nodes array.
  • x - the x coordinate of the current node position.
  • y - the y coordinate of the current node position.
  • px - the x coordinate of the previous node position.
  • py - the y coordinate of the previous node position.
  • fixed - a Boolean value indicating whether the position of the node is fixed.
  • weight - the number of links associated with the node.

These attributes do not have to be set before the nodes are passed to Force Layout; any that are missing are initialized with default values when the .start() method is called. Keep in mind, however, that if you store other data on your nodes, your own attributes must not conflict with the properties listed above.
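A quick way to guard against such conflicts is to check your data objects against the reserved names before handing them to the layout. The helper below is a hypothetical convenience written for this article, not part of D3.

```javascript
// Attribute names that Force Layout uses on each node (from the list above).
var RESERVED_NODE_KEYS = ["index", "x", "y", "px", "py", "fixed", "weight"];

// Hypothetical helper: report which fields of a datum use a reserved name.
function conflictingKeys(datum) {
  return Object.keys(datum).filter(function(key) {
    return RESERVED_NODE_KEYS.indexOf(key) !== -1;
  });
}

var safeDatum = { name: "Alice", group: 2 };
var riskyDatum = { name: "Bob", weight: 82 };   // `weight` here means kilograms

var safeConflicts = conflictingKeys(safeDatum);     // no reserved names used
var riskyConflicts = conflictingKeys(riskyDatum);   // reports "weight"
```

Setting x, y, or fixed on purpose, for example to supply initial positions, is of course legitimate; the check is only meant to catch accidental collisions such as the weight field above.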

force.links([links])

If links is passed, sets the links of the layout to the specified array. Otherwise, returns the current array of links, which is empty by default. Each link has the following attributes:

  • source - the initial node (element of the nodes array)
  • target - the final node (element of the nodes array)

Note: the source and target attributes may initially be given as indices into the nodes array; they are replaced with references to the node objects when the .start() method is called. Link objects may carry additional user-defined fields; accessor functions can use this data to compute the strength (linkStrength) and length (linkDistance) of each link.

force.start()

Starts the simulation; this method must be called when the Force Layout is first created, after the nodes and links have been assigned. It must also be called again whenever the nodes or links change. Force Layout uses a “cooling” parameter, alpha, that controls the temperature of the layout: as the physical simulation converges on a stable layout, the temperature drops and the nodes slow down. Eventually alpha falls below a threshold and the simulation stops completely, freeing up resources. Force Layout can be reheated with the .resume() method or by restarting it; this also happens automatically when the drag behavior is used.
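The cooling schedule can be sketched as a simple loop. The starting value 0.1 and the threshold 0.005 are the figures quoted in this article; the decay factor of 0.99 per tick is an assumption that this article does not state, and ticksUntilCooled is a hypothetical helper.

```javascript
// Sketch of the cooling schedule. The 0.99 decay factor is an assumption;
// 0.1 and 0.005 are the start value and threshold quoted in this article.
function ticksUntilCooled(startAlpha, decay, threshold) {
  var alpha = startAlpha;
  var ticks = 0;
  do {
    alpha *= decay;   // the layout cools a little on every tick
    ticks += 1;
  } while (alpha >= threshold);
  return ticks;
}

var coolingTicks = ticksUntilCooled(0.1, 0.99, 0.005);
```

Under these assumptions the simulation runs for roughly three hundred ticks before it stops, which gives a feel for how many iterations a layout gets to settle after each start or reheat.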

On start, Force Layout initializes various attributes on its associated nodes. The index of each node is computed by iterating over the array, starting from 0. If the initial x and y coordinates of a node are not specified, they are derived from neighboring nodes: if a linked node already has x and y values, those coordinates are applied to the new node. This makes the layout more stable when new nodes are added, compared with the default of initializing coordinates randomly within the layout size. The previous-position coordinates px and py, if not specified, take the value of the initial coordinates, which gives new nodes an initial velocity of zero. Finally, fixed defaults to false.
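The initialization rules above can be sketched as follows. This is a simplified model, not D3's source: initializeNodes is a hypothetical helper, and the links are assumed to reference nodes by numeric index.

```javascript
// Simplified sketch of node initialization on start (not D3's actual source).
// Assumes links reference nodes by numeric index.
function initializeNodes(nodes, links, size) {
  // For every node, collect the nodes it is linked to.
  var neighbors = nodes.map(function() { return []; });
  links.forEach(function(link) {
    neighbors[link.source].push(nodes[link.target]);
    neighbors[link.target].push(nodes[link.source]);
  });

  // Borrow a coordinate from a positioned neighbor, else pick one at random.
  function position(i, dimension, extent) {
    for (var j = 0; j < neighbors[i].length; ++j) {
      var value = neighbors[i][j][dimension];
      if (typeof value === "number" && !isNaN(value)) return value;
    }
    return Math.random() * extent;
  }

  nodes.forEach(function(node, i) {
    node.index = i;
    if (isNaN(node.x)) node.x = position(i, "x", size[0]);
    if (isNaN(node.y)) node.y = position(i, "y", size[1]);
    if (isNaN(node.px)) node.px = node.x;   // previous == current: zero velocity
    if (isNaN(node.py)) node.py = node.y;
    if (node.fixed === undefined) node.fixed = false;
  });
  return nodes;
}

// Node 1 is linked to the positioned node 0; node 2 has no neighbors.
var initNodes = [{ x: 50, y: 60 }, {}, {}];
initializeNodes(initNodes, [{ source: 0, target: 1 }], [100, 100]);
```

In this example the second node starts on top of its positioned neighbor, while the third, unconnected node lands at a random point inside the 100 × 100 layout.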

Force Layout also initializes the source and target attributes of the links: these can be given not only as direct references to nodes but also as numeric node indices (which is convenient when reading data from a JSON file or another static description). The source and target attributes are replaced with the corresponding entries of nodes only if they are numbers; existing links are therefore left untouched when Force Layout is restarted. The linkDistance and linkStrength of each link are also computed on start.
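Resolving indices to node references, together with counting each node's links for the weight attribute, might look like the sketch below. It is a simplified model with a hypothetical resolveLinks helper, not D3's source.

```javascript
// Simplified sketch: replace numeric source/target with node references
// and count the links of each node. Not D3's actual source.
function resolveLinks(nodes, links) {
  nodes.forEach(function(node) { node.weight = 0; });
  links.forEach(function(link) {
    // Only numbers are replaced, so already resolved links stay as they are.
    if (typeof link.source === "number") link.source = nodes[link.source];
    if (typeof link.target === "number") link.target = nodes[link.target];
    link.source.weight += 1;
    link.target.weight += 1;
  });
  return links;
}

var graphNodes = [{ name: "a" }, { name: "b" }, { name: "c" }];
var graphLinks = [{ source: 0, target: 1 }, { source: 1, target: 2 }];
resolveLinks(graphNodes, graphLinks);
resolveLinks(graphNodes, graphLinks);   // a second run leaves the references intact
```

After the first call every link points directly at node objects, which is why code that reads d.source.x or d.target.y works without any further lookup.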

force.alpha([value])

Gets or sets the "cooling" parameter alpha of the Force Layout simulation. If value is passed, sets alpha and returns the Force Layout. If the value is greater than zero, this method also restarts the Force Layout if it is not already running, dispatching a 'start' event and starting the tick timer. If the value is not positive and the Force Layout is running, this method stops it on the next 'tick' and dispatches an 'end' event. If no value is passed, this method returns the current value of alpha.

force.resume()

Equivalent to the call:

force.alpha(.1);

Sets the “cooling” parameter alpha to 0.1 and then restarts the timer. Generally, you do not need to call this method directly; it is called automatically by the .start() method. It is also called automatically by the drag behavior (.drag()) while dragging.

force.stop()

Equivalent to the call:

force.alpha(0);

Ends the simulation by setting the “cooling” parameter alpha to 0. This method can be used to stop the simulation explicitly. If you do not stop Force Layout explicitly, it stops automatically once alpha drops below a threshold.

force.tick()

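Runs the Force Layout simulation one step. This method can be used in conjunction with the .start() and .stop() methods to compute a static layout. For instance, after an initial call to force.start(), the following runs n steps synchronously and then stops the simulation:

for (var i = 0; i < n; ++i) force.tick();
force.stop();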

The number of iterations depends on the size of the graph and its complexity. The selection of starting positions is also important. For example, here the nodes are located diagonally:

var n = nodes.length;
nodes.forEach(function(d, i) {
  d.x = d.y = width / n * i;  // spread the nodes evenly along a diagonal
});

If you do not initialize the node positions manually, Force Layout initializes them randomly, which makes the result somewhat unpredictable.

force.on(type, listener)

Registers the specified listener to receive events of the given type from Force Layout. Currently, only the 'start', 'tick', and 'end' events are supported.

The event objects passed to listener functions are custom objects created with d3.dispatch(). Each event object has two properties: type (the string 'start', 'tick', or 'end') and alpha, the current value of the “cooling” parameter alpha. The event.alpha property can be used to monitor the progress of the Force Layout simulation or to make your own adjustments to it.

The 'start' event is dispatched both at the initial start of the simulation process, and every time the simulation is restarted.

The 'tick' event is dispatched at each simulation step. Listen for 'tick' events to update the displayed positions of nodes and links. For example, if you initially display nodes and links as follows:

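var link = vis.selectAll("line")
    .data(links).enter().append("line");    // one SVG line per link
var node = vis.selectAll("circle")
    .data(nodes).enter().append("circle")   // one SVG circle per node
    .attr("r", 5);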

You can set their positions for each step of the modeling process:

force.on("tick", function() {
  link.attr("x1", function(d) { return d.source.x; })
      .attr("y1", function(d) { return d.source.y; })
      .attr("x2", function(d) { return d.target.x; })
      .attr("y2", function(d) { return d.target.y; });
  node.attr("cx", function(d) { return d.x; })
      .attr("cy", function(d) { return d.y; });
});

In this case, the node and link selections were stored at initialization, so the elements do not have to be re-selected on every simulation step. If you wish, you can display nodes and links differently; for example, you can use symbols instead of circles.

The 'end' event is dispatched when the internal “cooling” parameter alpha falls below the threshold value (0.005) and is reset to zero.

force.drag()

Binds a behavior to nodes that enables interactive dragging, with either the mouse or touch. Use it together with the call operator on the nodes; for example, call node.call(force.drag) during initialization. With the drag behavior, hovering over a node sets its fixed attribute to true, which stops it from moving. Fixing a node on mouseover, rather than on mousedown, makes it easier to catch the desired node. When a mousedown event occurs, and on each subsequent mousemove until mouseup, the center of the node is set to the current mouse position. In addition, each 'mousemove' event calls the .resume() method of the Force Layout, reheating the simulation. If you want dragged nodes to stay fixed after dragging, set the fixed attribute to true on the 'dragstart' event, as is done in this example.

Implementation note: the 'mousemove' and 'mouseup' event handlers are registered on the current window, so once the user starts dragging a node, the drag continues even if the mouse cursor leaves the layout. Each event handler uses the force namespace to avoid conflicts with other handlers that the user may bind to the nodes or the window. If a node is moved by dragging, the subsequent 'click' event, which fires when the mouse button is released ('mouseup'), is marked as default-prevented. If you register a 'click' event handler, you can ignore the clicks that result from dragging as follows:

selection.on("click", function(d) {
  if (d3.event.defaultPrevented) return; // ignore drag
  // ...handle a genuine click here
});

Finally, check out these two examples: collapsible force layout and divergent forces.
