Social Network Graph Visualization: An Analysis of Blogosphere Events Before December 2011

This article is a logical continuation of “Building a graph of a social network using Drupal and Feeds”.

As part of a research group, I collected information from the blogosphere. The task was to assess the tension and activity of political discussions during the State Duma election campaign. Looking ahead, I will say that the study allowed us to put forward hypotheses that were later confirmed. In particular, the results described below make it possible to understand who will go out onto the square and bring people along and, most importantly, whom people will follow.

In recent years, the impact of blogosphere events on political and social processes has grown rapidly around the world, including in the political life of our country. Social networks are a platform for active discussion of all political events in the country and for shaping public opinion, above all among young people, in whose hands the fate of the country will rest in 10-15 years. The need to develop methods and algorithms for studying communication in social media and its influence on current political events is therefore becoming more and more obvious.

A study of social media communications was conducted in mid-November 2011. The study analyzed the October-November discussions from LiveJournal regarding the upcoming December 4 State Duma elections.
The LiveJournal (LJ) blog platform was chosen for testing the methodology for monitoring the social media segment under study. The choice reflects this network's focus on open public discussion: LiveJournal has become one of the main platforms for citizen journalism today.

During the study, more than 1,200 user comments were collected, and the number of edges in the directed graph exceeded 950. The information was collected from July to November 2011.

For the analysis, we used the open-source program Gephi, into which the graph from the previous article was imported.

Vertex and Edge Properties

Figure 1 - The graph after importing
Betweenness is the number of shortest paths between all other pairs of vertices that pass through a given vertex. The study showed that very few nodes have high betweenness: only 6, or about 0.5%. This means that the political segment of the Runet has no complex branched network with many large clusters and communities. As a rule, the users who relay information can do so because they communicate simultaneously in 2-4 different circles of political opinion. Moreover, these relays do not have much influence on the opinion of the communities they belong to, so it is difficult to use them in information campaigns during the pre-election period.
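To make the definition concrete, here is a minimal sketch of how betweenness can be computed for a small directed graph using Brandes' algorithm. It is not how the figures in this article were produced (those come from Gephi); the edge direction (commenter to author) and the usernames are assumptions made only for illustration.

```python
from collections import deque

def betweenness(edges):
    """Unnormalized betweenness for an unweighted directed graph.

    `edges` is an iterable of (source, target) pairs; here we assume
    an edge points from the commenter to the commented author.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest vertices, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# Hypothetical usernames: user_b sits between two commenters and user_c.
example = [("user_a", "user_b"), ("user_b", "user_c"), ("user_d", "user_b")]
scores = betweenness(example)
```

In this toy graph only user_b lies on shortest paths between other users, which mirrors the situation described above: a handful of vertices carry all the betweenness, and most have none.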
The figure shows the graph with the highest-betweenness users drawn as the largest vertices and colored in warm shades (green, orange, and red).

Figure 2 - Graph with high-betweenness vertices highlighted
The distribution of betweenness in the graph is extremely uneven: most vertices have none at all.

Figure 3 - Graph with high-betweenness vertices highlighted
The table, sorted in descending order, lists the usernames with the highest betweenness. Among well-known people, the leader is V. Milov (v_milov), one of the leaders of the opposition.

Figure 4 - Users with high betweenness
Eigenvector centrality is a recursive measure of a vertex's importance, derived from the sum of the importance of the vertices connected to it. The study showed that A. Navalny, G. Yavlinsky, and S. Mironov have high centrality and that, among political communities, only ru_politics does.

Figure 5 - Users with high eigenvector centrality
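The recursive definition of eigenvector centrality can be illustrated with power iteration. The sketch below is only an approximation of the idea, not Gephi's implementation. It assumes the graph is connected and treats edges as undirected; the vertex names are hypothetical.

```python
def eigenvector_centrality(edges, iterations=200, tol=1e-10):
    """Power iteration on an undirected, connected graph.

    Scores are scaled so that the most central vertex has score 1.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    x = {v: 1.0 for v in neighbors}
    for _ in range(iterations):
        # Multiply by (A + I): keeping the vertex's own score has the same
        # leading eigenvector as A but avoids oscillation on bipartite graphs.
        new = {v: x[v] + sum(x[u] for u in neighbors[v]) for v in neighbors}
        norm = max(new.values())
        new = {v: value / norm for v, value in new.items()}
        if max(abs(new[v] - x[v]) for v in new) < tol:
            return new
        x = new
    return x

# Hypothetical star: one author commented on by three users.
star = [("hub", "leaf_1"), ("hub", "leaf_2"), ("hub", "leaf_3")]
centrality = eigenvector_centrality(star)
```

Each vertex's score is repeatedly replaced by the sum of its neighbors' scores, so a vertex becomes important when the vertices linked to it are important.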

Cluster Properties

The clustering coefficient (transitivity) characterizes the increased likelihood of a link between vertices A and C when the links A-B and B-C exist ("a friend of my friend is my friend"). A high clustering coefficient may indicate that a vertex is commented on by people who know the user personally.

Figure 6 - The number of "triangles" in the graph
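As an illustration of the "triangles" counted above, the following sketch computes, for each vertex, the number of triangles it belongs to and its local clustering coefficient. It treats edges as undirected, which is an assumption; the vertex names are hypothetical.

```python
from itertools import combinations

def clustering(edges):
    """Return (triangles, coefficient) dictionaries for an undirected graph."""
    neighbors = {}
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors.setdefault(u, set()).add(v)
            neighbors.setdefault(v, set()).add(u)
    triangles, coefficient = {}, {}
    for v, nbrs in neighbors.items():
        k = len(nbrs)
        # A triangle exists for every pair of neighbors that are also linked.
        t = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
        triangles[v] = t
        coefficient[v] = 2 * t / (k * (k - 1)) if k > 1 else 0.0
    return triangles, coefficient

# Hypothetical graph: a triangle a-b-c plus a pendant vertex d attached to c.
sample = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
triangle_count, local_clustering = clustering(sample)
```

The coefficient is the share of a vertex's neighbor pairs that are themselves connected, so it equals 1 when all of a user's contacts also talk to each other.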

Network Properties

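Diameter is the longest of the shortest paths between any two vertices (among pairs between which a path exists). If $L_{ij}$ denotes the length of the shortest path from vertex $i$ to vertex $j$, then
$d = \max_{i,j} L_{ij}$
Formula 1 - Definition of the diameter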
The diameter of the resulting graph is 2, which indicates the absence of long chains of communication between users.
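The diameter can be found by running a breadth-first search from every vertex and keeping the longest shortest path encountered. The sketch below follows edge direction and ignores unreachable pairs, as in the definition above; the vertex names are hypothetical.

```python
from collections import deque

def diameter(edges):
    """Longest finite shortest-path length in a directed, unweighted graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())
    longest = 0
    for s in adj:
        # BFS gives the shortest distance from s to every reachable vertex.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        longest = max(longest, max(dist.values()))
    return longest

# Hypothetical graph: two commenters reach user_c only through user_b.
short_chains = [("user_a", "user_b"), ("user_b", "user_c"), ("user_d", "user_b")]
d = diameter(short_chains)
```

A small diameter such as 2 means that no comment chain extends further than "a commenter of a commenter".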
Degree distribution is a plot of vertex degree against the number of vertices in the graph that have that degree. In this study, degrees were calculated from the comments users addressed to one another. To identify authoritative users, the in-degree metric is used. If a vertex has a high in-degree, the user is commented on very often, which in turn means strong interest from the community. As a rule, such users are opinion leaders and carriers of new ideas that provoke active discussion in society. The study showed that the in-degree distribution follows a power law and falls off sharply as the number of commenters increases. The leaders are users who received 60, 30, 18, and 15 comments for the given keywords.
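The in-degree ranking and the degree distribution described above can be sketched in a few lines. The example assumes each edge is a (commenter, author) pair and that every listed edge counts once; the names are hypothetical and the numbers are not taken from the study.

```python
from collections import Counter

def in_degree_distribution(edges):
    """Return (in_degree per vertex, number of vertices per in-degree value)."""
    in_degree = Counter(target for _, target in edges)
    # Make sure vertices that only comment appear with in-degree 0.
    for source, _ in edges:
        in_degree[source] += 0
    distribution = Counter(in_degree.values())
    return in_degree, distribution

# Hypothetical comments: (commenter, commented author).
comments = [
    ("reader_1", "author_a"),
    ("reader_2", "author_a"),
    ("reader_3", "author_a"),
    ("reader_1", "author_b"),
    ("author_a", "author_b"),
]
in_degree, distribution = in_degree_distribution(comments)
leaders = in_degree.most_common(2)  # users ranked by comments received
```

Even in this toy data the shape matches the pattern described above: a few authors collect most of the comments, while most vertices have an in-degree of zero.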

Figure 7 - Users with a high incoming degree

Figure 8 - Distribution of incoming degrees
One of the most striking leaders is A. Navalny.

Figure 9 - Distribution of incoming degrees
An analysis of the outgoing degree in a graph shows that, as a rule, people commenting on opinion leaders are themselves leaders in terms of the number of comments.

Figure 10 - Distribution of incoming degrees
The average degree for the entire graph is 0.743, but the median is more interesting: it lies in the range 2-4. The overall distribution of degrees, both incoming and outgoing, is shown in the figure.

Figure 11 - Distribution of incoming degrees
Weighted degree characterizes the distribution of degrees normalized to the range from 1 to 100. The clear leaders are A. Navalny, G. Yavlinsky, and the ru_politics community. Also on the list are the economist Khazin and the Solidarity movement. Interestingly, the list does not contain politicians and public figures such as G. Zyuganov, V. Zhirinovsky, and M. Prokhorov. This can partly be explained by the fact that their supporters hold their main discussions on other platforms, in particular on official websites. Prokhorov's absence can also be explained by the fact that he currently writes not about politics but, as before, about business.
Another interesting result is that the list contains no regional political communities, such as politics_south (401 readers), on politics in the South of Russia, or gorodgeroev_ru (281 readers), on political life in Volgograd. These regional communities have readers but do not attract active commenters. The communities ru_cprf (Communist Party of the Russian Federation), ru_sps (Union of Right Forces), and spravedliva_ru (A Just Russia) contain only texts and reposts; there is practically no political activity or discussion in them.
The main conclusion: as a rule, active discussions take place in the journals of political leaders rather than in communities, which are therefore somewhat artificial in nature.

Figure 12 - Leaders by Weighted Degree

Modularity makes it possible to identify communities, or user groups, in the graph structure. In the resulting graph, 4-6 small groups can be distinguished for the selected keywords.
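Gephi searches for a partition that maximizes the modularity score. The sketch below does not perform that search; it only computes the score Q for a partition that is already given, to show what is being maximized. It assumes an undirected graph with each edge listed once, and the vertex and group names are hypothetical.

```python
def modularity(edges, community):
    """Modularity Q of a given partition of an undirected graph.

    `community` maps each vertex to a group label. For every group,
    Q adds (share of edges inside the group) minus (expected share
    if edges were placed at random with the same vertex degrees).
    """
    m = len(edges)
    internal = {}    # edges with both ends in the group
    degree_sum = {}  # total degree of the group's vertices
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Hypothetical graph: two triangles joined by a single edge.
two_groups = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("x", "y"), ("y", "z"), ("x", "z"),
    ("c", "x"),
]
split = {"a": 1, "b": 1, "c": 1, "x": 2, "y": 2, "z": 2}
q_split = modularity(two_groups, split)
q_single = modularity(two_groups, {v: 0 for v in split})
```

Splitting the two triangles into separate groups scores higher than putting every vertex into one group, which is the sense in which modularity "finds" communities.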

Figure 13 - Groups in the graph

Figure 14 - Community A. Navalny
The sizes of the largest groups vary from 10 to 35 users.

Figure 15 - Distribution of groups

Figure 16 - Modularity class
In addition to analyzing the structure, the study makes it possible to read the texts of commenters' posts directly. The table shows the edges of the graph; each edge corresponds to the title and text of a comment. This makes it possible to analyze the topics of the comments more precisely and to evaluate the overall tone of the messages.

Figure 17 - Vertices of the graph with comment texts

Summary: Now, a year later, when we know how events developed, it is clear that such a study can predict with reasonable accuracy the real-world activity of protest leaders from their activity in the blogosphere.
Of course, we collected little data, and the representativeness of the sample is debatable (records were collected only for certain queries created with the Yandex search constructor). More networks need to be explored, not just LJ. That is a task for the future.

Even so, our study is unique in that it analyzes the graph and network structure. As far as I know, studies usually build engagement charts and quantitative characteristics (number of posts, number per user, and so on), audience size, etc. But no one builds the graph structure and calculates metrics on it, as we did. Doing so makes it possible to track the dynamics of events in the future.
