Re-decentralizing the web. This time for good

Original author: Ruben Verborgh
In recent years, the web has become highly centralized. To restore freedom and control over the digital aspects of our lives, we need to understand how we got into this state and how to get back on the right path. This article tells the story of web decentralization and the role of Tim Berners-Lee in the ongoing struggle for a free and open Internet. The problems and solutions are not purely technical in nature, but fit into a larger socio-economic puzzle that we all have to solve together. Let's re-decentralize the web, for good this time, and use its full potential as envisioned by its creator.

Power to the people

An inventor may intend a purpose and a fate for his creation, but in the end it is people who decide how to use it. John Pemberton set out to treat morphine addiction when he began brewing the drink now known as Coca-Cola, and the Play-Doh modelling compound was originally created as a wallpaper cleaner. Alfred Nobel instituted annual awards to avoid being remembered as the inventor of dynamite.

It is noteworthy that Tim Berners-Lee never intended to control his own invention: his former employer CERN released the World Wide Web software for public use, and the web itself was designed to be decentralized so that no one had the power or the right to silence other people. This unprecedented openness led to large-scale free innovation and limitless creativity, and gave a voice to more than half of the world's population. It revolutionized communications, education, and business. However, a consequence of this freedom is that anyone can create things that are contrary to the spirit of the web, such as illegal material and, ironically, platforms whose main purpose is centralization.

Centralization is not a problem in itself: there are good reasons to bring people or things together in one place. The problem arises when we are deprived of choice and misled into thinking that there is only one door to a space that we actually own collectively. Not long ago it seemed inconceivable that the fundamentally open web would become the basis for closed services, where we pay with our personal data for freedoms that are already ours. Yet today the majority of daily users are locked within the boundaries of a few influential social media platforms. These giants collect information from all over the world and accumulate this wealth in their closed space, where they are both ruler and judge.

Since the change happened so suddenly, it is worth remembering that not so long ago the web landscape looked quite different. In 2008, Iranian blogger Hosein Derakhshan was sentenced to 20 years in prison for what he had published on his blog. He and many others could express critical opinions because they had the web as an open platform - they did not have to ask anyone for permission to publish. Importantly, the web's hyperlink mechanism allows blogs to link to each other, again without requiring any permission. This creates a decentralized network of value between peers, in which readers retain active and conscious control over their actions. When Derakhshan was released in 2014, he returned to a completely different web: instead of actively engaged readers, he saw passive viewers who seemed to be watching TV. Web technologies had certainly advanced, but the foundations of the web had degraded: in just six years, people had come to use it in a completely different way.

Of course, social media are not our enemy: thanks to them, the barrier for anyone to publish short texts and photos has dropped. Nevertheless, they operate under a “winner takes all” strategy: each player strives for dominance rather than for interoperability, which is the norm on the rest of the web. Unlike with blogs, we usually cannot interact with posts on one network from within another: either the people or the data have to move. This well-known problem of walled gardens in social media has worsened significantly since 2008. Some of the “gardens” have grown to enormous size, but the walls have remained.

The main problem is that access to the dominant networks necessarily means giving up control over personal data: we can enter, but we pay with our digital property. This personal data is then used to influence us imperceptibly through personalized advertising of brands, products, and even political programs. In addition, once inside, people tend to form small communities - an effect that social network algorithms reinforce, since they aim to maximize engagement at the expense of diversity. As a result, filter bubbles isolate us in separate echo chambers, even though the web and social networks were always meant to bring people together.

Not surprisingly, this situation is reflected in the three global challenges that Tim Berners-Lee formulated in 2017:

  • regain control of our personal data;
  • prevent the spread of misinformation;
  • ensure transparency of political advertising.

Obviously, it is undesirable to solve these problems centrally through some kind of commission or committee. That would again create a single point of failure, which - even with the best of intentions - is always vulnerable to abuse. Ultimately, the main problem lies not in specific social networks, but in the hyper-centralization of data and people, that is, of power. We need control, but the power should belong to all people, in the form of the right to own our personal data and the content we create.

It becomes clear that the main obstacles are not technological, so Tim Berners-Lee calls for a "gathering of scientists, representatives of business, technology, government agencies, civil society and the arts world to combat Internet threats". At the same time, scientists and engineers are assigned a technological mission: to prove that decentralized personal data networks can scale globally and be as convenient for people as centralized platforms.

Therefore, let us begin with the technical issues of decentralization, emphasizing the role of Tim Berners-Lee in the ongoing struggle to maintain an open and decentralized network. After a historical excursion, we will focus on what changes decentralization requires, and consider what a healthy ecosystem looks like. As a concrete implementation, we study the Solid project. In conclusion, we discuss unresolved issues and prospects for the future.

A brief history of web (de)centralization

Social networks were not always the cause of centralization - and, most likely, at some point in the future, the problem will become different. The target is constantly moving: every time we begin to see a threat, it is replaced by an even larger one. Understanding these threats allows you to better understand the various aspects of decentralization.

Decentralization as an unspoken assumption

At the time the WWW was invented, decentralized systems already existed, including the Internet itself. E-mail was even more decentralized than the traditional postal service it imitated, since different mail servers exchanged messages directly. Long-forgotten protocols, such as the Network News Transfer Protocol (NNTP), decentralized the exchange of news and articles. In short, decentralization was not a crazy idea, but rather the spirit of that time.

Therefore, from the very beginning of the design of the new hypertext system in 1989, Tim Berners-Lee took for granted that the system would be decentralized, unlike the documentation systems of that time. The main strength of the web was universality - independence from hardware and software. Decentralization was such an obvious property that it was not even mentioned. This is reflected in the original article announcing the WWW, which emphasizes universal support across all operating systems but does not mention the term “decentralization” at all.

The only centralized component in the web's architecture is the Domain Name System (DNS). In those days, there were relatively few domains and their owners did not change, so the problem was not so acute. Today, millions of domain names frequently change hands, breaking existing links, sometimes with malicious intent. By manipulating the DNS, governments can block or alter access to existing sites. Tim Berners-Lee says that, in hindsight, it would have been better to implement a distributed DNS from the start. Apart from this, the web had all the components it needed to flourish as a decentralized system.

Battle for desktops

The browser war of the 90s was the first wave of centralization, in which companies tried to gain a monopoly and become the only provider of software for accessing the web. The web's design principle of universality required readability on any platform, so nothing prevented several browsers from coexisting - except that their makers wanted to dominate the market rather than coexist to mutual benefit. Netscape and Microsoft Internet Explorer tried to entice users by introducing new features, and at one point the share of Internet Explorer on desktops exceeded 90%.

Competition through innovation is a fine thing in itself, but the new features made browsers incompatible with each other and therefore began to directly threaten the universality of the web. Badges such as “Best viewed in Internet Explorer” appeared on sites because developers could not guarantee consistent behavior across all platforms. Anyone who would not or could not install a specific browser risked losing access to such sites altogether. As a result, the IE monopoly influenced people's choice of both browser and operating system. Power on the web was concentrated in the hands of one company, which slowed down innovation.

The World Wide Web Consortium (W3C) was founded by Tim Berners-Lee to ensure interoperability between browsers. To this end, it issues recommendations that define how web technologies should work. Although the W3C is administratively centralized, its standards reflect feedback from a distributed network of participants gathered through a consensus-based process. In the early 2000s, the problem was that Internet Explorer deviated from the W3C recommendations at critical points, forcing developers to follow either the actual standards or their incorrect implementation in the most popular browser.

Fortunately, pressure from Firefox and Safari during the second browser war ultimately forced Microsoft to change course and focus on standards. Since 2010, no browser has held more than 2/3 of the global market, so compatibility is now in the interests of both browser developers and web developers. A balkanization of the web through centralized browser development has largely been avoided.

Battle of the search engines

Microsoft's short-lived victory mattered little, because the battle for centralization shifted to other areas. While each browser sought to become the default application, search engines competed to become the main entry point to the web. Soon it no longer mattered which browser you used; what mattered was who gave you directions. After all, developing a free browser brings no direct income, while companies gladly pay for first place in the search results.

Several competitors appeared among the search engines early on, such as AltaVista and Lycos, but within just a couple of years Google became the leader. The centralization of search meant that one company gained too much influence over what content was available to people, since it decided which results to return for a given query. Even assuming the best intentions and ignoring paid advertising, a single algorithm that makes decisions for a large number of people shapes the information landscape. After all, there is no single objective way to determine the "best" web pages on any topic. Attempts were also made to manipulate this algorithm from the outside, first through deceptive keywords, and then through advanced SEO methods that improve the ranking of sites in various (sometimes questionable) ways.

With the advent of search engines, user data was monetized for the first time. A person's search queries make it possible to build a detailed profile of their interests in both personal and professional life. Search engines may know more about certain aspects of a person's life than their close friends do. This profile is used to select personalized advertisements and search results, encouraging you to visit sites and buy items that you otherwise might not have. Although personalization is useful for many, the problem is the lack of choice and control. We gravitate towards the large search engines, which have accumulated the most data and therefore return more relevant results. However, these search engines offer no alternatives - most of them accept only our personal data as payment. In addition, we do not know exactly how our data affects search results, let alone control it. The growth of personalization led to the emergence of the first filter bubbles, inside which we are more likely to be shown results similar to those we clicked on earlier.

The battle for our personal data and identity

While Google’s hegemony continues, social networks have found an even more powerful way to collect and monetize our data. The social networking revolution of the 2000s prompted people to go online, leading them to various platforms for sharing blog posts, bookmarks, photos, videos, and more. After a few years, social media companies created centralized platforms that took over many of the functions previously distributed among several providers. In exchange for their services, these platforms store our personal data and claim the right to use it. Each operates within its own "walled garden".

Like search engines, social networks give the user a linear list of content, ranked by factors and algorithms that we can barely influence. Unlike search, the feed here is generated without any query on our part - like a TV without a remote control.

The feed is carefully personalized on the basis of the data we knowingly leave on the social network, combined with traces of our browsing history collected without our explicit consent through trackers on third-party sites. In his 2018 lecture, Tim Berners-Lee noted that political advertising had long been banned on British television because of fears that such a direct means of influence would unduly affect the masses. By the same logic, we should be much more wary of the highly personalized political advertising that modern social networks allow. Even if a person refrains from explicitly expressing very personal preferences, these preferences can be reliably inferred from seemingly insignificant fragments of other data. Data mining reveals a person’s sexual orientation, ethnicity, and religious and political views. This information is subsequently used for targeted influence.

As in previous battles for centralization, people feel pressured to join a large network. Refusal to join means to fall out of the circle of virtual communication of friends and relatives. Often for grandparents, the easiest way to see the latest photos of grandchildren is to create a Facebook or Instagram account.

That is how the digital memory of the modern generation has become largely concentrated in one place, often outside the control of the users themselves. The centralization of online activity has taken such extreme forms that some Facebook users are no longer aware that they are using the Internet at all. Unfortunately, this paradox has become a reality in many countries where, at the initiative of [a “charitable” organization that Facebook founded - translator's note], a strictly limited version of the Internet is provided, which is a glaring violation of network neutrality.

Meanwhile, another battle unfolded in the background, this time over who becomes our identity provider. More and more sites are replacing their own authentication systems with a service from large providers such as Google or Facebook. For people who already have an account, logging in with the Facebook button is convenient. For everyone else, it creates additional pressure to join the network. In both cases, these buttons are another way to track online activity. This centralization deprives us of anonymity, that is, the freedom to hide data that we consider personal.

Separation of data and services

One theme recurs in all the battles for centralization described above: lack of choice. Lack of choice of browser and operating system, of entry point to the web, of the place where our personal data is stored. Decentralization is, first of all, about creating favorable conditions for choice by abandoning a single storage location that is artificially tied to the service. Data storage and services should be separated from each other, giving the user a choice in both. Just as we are free to choose any combination of devices, operating systems and browsers to access the web, we must be able to interact with sites and other people without being bound to one social platform or another.

According to Tim Berners-Lee, control over our personal data is regained by separating data storage from other services. This means that people can store their data wherever they want while still using any service. To store our texts, photos and videos, we can choose any provider - or simply keep them on our own computer. Any third-party service can use this data with our permission, regardless of where it is stored. The data store may, but is not required to, also perform the essential service of user authentication.

This logic gives rise to the concept of a personal data module (personal data pod), in which we store all the information we create. As shown in the figure below, this can be taken literally: even seemingly trivial data, such as the likes we give, is stored in our own module. Although this degree of decentralization may seem extreme, remember that even supposedly trivial likes reveal deeply personal information, so it makes sense to keep them under our control. In addition, if a person does not depend on anyone else’s permission to publish data in their own module, they can leave likes and comments wherever they want, without fear of censorship or punishment.

In a decentralized network, each piece of data is stored in a location chosen by the author.

This full ownership of data provides very fine-grained access control: users can selectively grant friends or applications permission to read or write specific pieces of data. For example, they decide whether to publish their photo and full name, who can see their likes and comments, and which applications may edit their photos and posts. Permissions can be changed or revoked at any time. A person may have several data modules for different purposes: for example, a module for personal and family photos, a work module with its own rules for storing professional data, and a university module with course materials and grades. After creating a module, the person decides what data to store in it.
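The permission model described above can be illustrated with a minimal sketch. It is not Solid's actual access control mechanism: the Pod class, its method names and the "read"/"write" modes are hypothetical and chosen only for this illustration, and everything is kept in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical in-memory stand-in for a personal data module."""
    owner: str
    resources: dict = field(default_factory=dict)  # path -> content
    grants: dict = field(default_factory=dict)     # path -> {agent: set of modes}

    def put(self, path, content):
        """Store a piece of data in the module (owner side)."""
        self.resources[path] = content

    def grant(self, path, agent, *modes):
        """Give an agent (a friend or an application) the listed modes on one resource."""
        self.grants.setdefault(path, {}).setdefault(agent, set()).update(modes)

    def revoke(self, path, agent):
        """Withdraw every permission the agent had on this resource."""
        self.grants.get(path, {}).pop(agent, None)

    def read(self, path, agent):
        """Return the resource if the agent is the owner or holds the 'read' mode."""
        allowed = self.grants.get(path, {}).get(agent, set())
        if agent != self.owner and "read" not in allowed:
            raise PermissionError(f"{agent} may not read {path}")
        return self.resources[path]


pod = Pod(owner="alice")
pod.put("/photos/holiday.jpg", "<image bytes>")
pod.grant("/photos/holiday.jpg", "bob", "read")
print(pod.read("/photos/holiday.jpg", "bob"))  # bob can read while the grant exists
pod.revoke("/photos/holiday.jpg", "bob")       # after this, bob's read raises PermissionError
```

The point of the sketch is that permissions are attached to individual resources inside the owner's module, so granting and revoking never involves the application that happens to display the data.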

By choosing where to store our own data, we prevent unauthorized access and exploitation. We are no longer obliged to pay with our data for the services of Internet companies. Moreover, we can protect the most sensitive parts of our data by keeping them with us and limiting access to only those people and services that really need it, and only for a certain time.

Independent innovation after sharing data and services

When people store their data themselves, it will become impossible to cash in on it. These economic changes can be accelerated by legislation such as the GDPR and by raising public awareness of the dangers of centralization, given recent scandals involving leaks of private information such as those at Equifax and Facebook. Therefore, new business models are needed.

Decentralization requires avoiding isolated applications. As shown in the figure below, current web applications combine data and service. Because of this coupling, our LinkedIn contacts cannot comment on our Facebook photos, and an invitation to a Facebook event cannot be shown in a Doodle calendar. Distributed applications, by contrast, act as views on top of our own and other people's data modules. Given the appropriate permission, a social feed application can take from the module the photos that a photo gallery application uploaded there. Events from a personal calendar with the status “visible to all” are added to the same feed. Friends get access to individual pieces of our data through whichever application they want to use.

Now centralized web applications act as storages that do not communicate with each other. Distributed applications work as general views over the personal data modules.
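As a rough sketch of an application acting as a view over several data modules, the example below builds a feed from items that live in different modules. The data layout, the field names and the build_feed function are assumptions made for this illustration and do not correspond to any real Solid API.

```python
from datetime import date

# Hypothetical modules: each owner keeps their own items, written by different applications.
pods = {
    "alice": [
        {"type": "photo", "title": "Sunset", "date": date(2019, 3, 1), "visible_to": {"bob"}},
        {"type": "event", "title": "Meetup", "date": date(2019, 3, 5), "visible_to": "everyone"},
    ],
    "carol": [
        {"type": "photo", "title": "Private", "date": date(2019, 3, 3), "visible_to": set()},
    ],
}


def can_see(item, viewer, owner):
    """An item is visible to its owner, to everyone, or to explicitly listed viewers."""
    audience = item["visible_to"]
    return viewer == owner or audience == "everyone" or viewer in audience


def build_feed(pods, viewer):
    """Collect every item the viewer may see, newest first, without copying it into the app."""
    items = [
        {**item, "owner": owner}
        for owner, stored in pods.items()
        for item in stored
        if can_see(item, viewer, owner)
    ]
    return sorted(items, key=lambda item: item["date"], reverse=True)


for entry in build_feed(pods, "bob"):
    print(entry["date"], entry["owner"], entry["type"], entry["title"])
```

The feed application stores nothing itself: a different feed application run by the same viewer would see exactly the same items, because visibility is decided by the modules, not by the application.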

Since the choice of service provider no longer depends on where the data is stored, separate markets for data and for services arise. The figure below shows that centralized applications currently compete for ownership of data. As a result, people cannot easily switch to a more convenient application, and data transfer is technically challenging, if possible at all. In addition, new and potentially more convenient applications struggle to enter the market because they do not yet have enough data. With decentralized applications, people choose a service provider and a storage location separately, and companies compete independently in both markets. At both levels, competition is based solely on the quality of the service and the ratio of features to cost.

This independence means that we can freely switch between providers of data and services without requiring our friends to make the same choice. This tears down the walls between the “gardens”, because the services interact freely with each other. Providers of data and services can develop independently, resulting in a faster and more creative innovation cycle. Anyone can enter either market and attract customers if their service is better than the others, without needing control over user data.

Centralized applications compete in the same market for owning our data. In a distributed network, data and service providers compete in different markets.

Solid project

To implement this concept, Tim Berners-Lee launched the Solid project. It includes specifications for interoperability, implementations of servers, clients and applications, as well as a developer community. Next, we discuss some of the unique features of Solid.

Linking and integration of personal data

The goal of Solid is to empower people through personal data management, analogous to corporate data management systems. A Solid server or data module is the web equivalent of a hard disk on which we store arbitrary documents, and Solid applications are similar to programs on a personal computer, except that they open documents from Solid servers on the web. Unlike real hard drives, Solid servers are usually open to the world, so they need fine-grained access control that determines who can view or edit which documents. Tim Berners-Lee himself has set an example by using Solid in his personal and professional life for several years.

For this to work at web scale, the data in different modules must be linked, just like hypertext documents. Solid uses Linked Data for this: each piece of data can link to any other. For example, a comment in one user's module is attached to a photo in another user's module, while both users remain the owners of their own data. When a Solid application runs, the data is fetched from several sources and combined into a single whole.
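A minimal sketch of this idea follows, with data reduced to plain (subject, predicate, object) triples held in Python sets. The URLs are hypothetical, the use of schema.org terms is an assumption for this illustration, and a real application would fetch and parse the data with an RDF library rather than hard-code it.

```python
# Triples are (subject, predicate, object). Each set stands for one person's module.
ALICE_POD = {
    ("https://alice.example/photos/1", "rdf:type", "schema:Photograph"),
    ("https://alice.example/photos/1", "schema:name", "Sunset"),
}
BOB_POD = {
    ("https://bob.example/comments/7", "rdf:type", "schema:Comment"),
    # The link crosses module boundaries: Bob's comment points at Alice's photo.
    ("https://bob.example/comments/7", "schema:about", "https://alice.example/photos/1"),
    ("https://bob.example/comments/7", "schema:text", "Beautiful!"),
}


def comments_on(graph, photo):
    """Return the text of every comment in the merged graph that is about the photo."""
    comments = {s for s, p, o in graph if p == "schema:about" and o == photo}
    return sorted(o for s, p, o in graph if s in comments and p == "schema:text")


# At run time the application merges what it was allowed to read from both modules.
graph = ALICE_POD | BOB_POD
print(comments_on(graph, "https://alice.example/photos/1"))
```

Because every item is identified by its own web address, merging is a plain set union: neither module has to copy or take ownership of the other's data.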

Modules also provide decentralized authentication. A person chooses a so-called WebID - a unique web address used for identification. This address points to a public profile, and the user logs in to any module with their own WebID, without needing a separate account on each site or a centralized platform.

Read-write web

One of the most important aspects of Solid is that it provides a read/write platform, which was the original purpose of the WWW. Although writing was always technically possible, in the sense that anyone could launch their own website, it took the Web 2.0 revolution and social networks to simplify the process considerably. The success of these platforms is partly due to their interactivity: everyone is now able to create and publish content at any time, especially through mobile devices.

Publishing content with Solid should be just as easy. The difference is that we publish to our own data modules, and not to the application. At the same time, freedom of expression is guaranteed without the risk of censorship. For maximum compatibility, linked data should be stored using Semantic Web technologies, which connect pieces of data to their meaning. This way, applications understand (fragments of) each other's data without having to agree on a format.

We also need a notification mechanism for when objects in modules are created or changed - especially in the case of comments. This is provided by Linked Data Notifications: small automatic messages, similar to email, that different data modules send to each other. By combining these technologies, Solid implements the concept of Read-Write Linked Data, guaranteeing everyone participation in the Web of Data.
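The sketch below shows what such a notification might look like when one module tells another that a new comment exists. It assumes the common practice of posting a JSON-LD document to the receiver's inbox; the inbox URL, the helper names and the choice of the ActivityStreams "Announce" type are assumptions made for this illustration, and the request is only constructed, never sent.

```python
import json
from urllib.request import Request


def build_notification(actor, obj, target):
    """Describe 'actor announces obj, which concerns target' as a JSON-LD document."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Announce",
        "actor": actor,
        "object": obj,
        "target": target,
    }


def build_request(inbox_url, notification):
    """Prepare (but do not send) the HTTP POST that would deliver the notification."""
    body = json.dumps(notification).encode("utf-8")
    return Request(
        inbox_url,
        data=body,
        headers={"Content-Type": "application/ld+json"},
        method="POST",
    )


note = build_notification(
    actor="https://bob.example/profile#me",    # hypothetical WebID of the commenter
    obj="https://bob.example/comments/7",      # the comment stays in Bob's module
    target="https://alice.example/photos/1",   # the photo stays in Alice's module
)
request = build_request("https://alice.example/inbox/", note)  # hypothetical inbox URL
print(request.get_method(), request.full_url)
print(json.dumps(note, indent=2))
```

The notification carries only links: the receiving module learns that the comment exists and where to find it, while the comment itself remains under its author's control.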

Revolutionary potential

By transforming the ownership of data and the role of applications in a distributed ecosystem, Solid will disrupt many centralized processes on the web. It becomes possible to remove the intermediaries that control these processes, which stimulates innovation in many areas.

The first obvious target is social relations between people. Solid offers a simple and private way to share media files with friends, colleagues and relatives. Other examples are collaborating on documents with strict access control and organizing meetings and events - again with full ownership of data, a free choice of application and storage, synchronization between applications, and so on.

In addition, Solid is technologically capable of revolutionizing entire industries, such as scientific publishing. In the current process, the author uploads a manuscript to a centralized platform, where it is evaluated by a closed group of reviewers. After approval, the manuscript is published as an article and becomes available to the public, possibly for a fee. This is quite a long process. The wider scientific community can read the article only at the very end, and only if it is accepted. The process is also opaque, since valuable details such as reviews and revisions are hidden from the public. As a rule, feedback is possible only through a similarly slow process. By contrast, a distributed application for publishing articles, such as dokieli, allows researchers to publish manuscripts independently in their own Solid module. Colleagues' comments are stored in their own modules, guaranteeing freedom of expression to anyone who wants to participate. All results remain open to comment even after they are published on the web.

Decentralized Network for All

Re-decentralizing the web along the lines of Solid will help to overcome the three challenges that Tim Berners-Lee formulated. We can regain control of our personal data by storing it in our own modules. Misinformation is curbed because a free choice of applications lets us control our own news feed, and any information can be traced back to its source. Political advertising becomes more transparent, since everyone decides to whom and how to open up pieces of their data. Moreover, the separation of the data and service markets allows us to consider other options that involve no advertising at all. Solid does not fully solve every problem, but data ownership and freedom of choice are what matter most.

However, freedom always comes at a price: the victory of personal rights and freedom of speech also facilitates illegal activity, because distributed networks make it difficult to control information. This is a difficult question, since some countries outlaw speech that is completely legal elsewhere. An intriguing example is the growing popularity of the decentralized social network Mastodon in Japan: when Twitter began to delete images that are questionable by American standards, Japanese users started publishing them on platforms with less censorship.

We will have to accept this compromise between freedom and control. In the absence of universally accepted norms, centralized filtering of prohibited content will never be an adequate solution.

This brings us to another aspect of decentralization, namely the tension between freedom and universality. The paradox of freedom says that a person can become free only by obeying certain rules. Simply put, we are free to take a bike and go anywhere, as long as we keep to the right side of the road (or the left, in some countries). Without observing this rule, we will not get anywhere without an accident. Since universality has always been the main goal of the web, distributed communities need to agree on some basic framework for decentralization. As with browser interoperability, the W3C plays an important role in creating standards for the interaction between data modules and applications. Fortunately, there is no need to agree on every detail. The Linked Data format allows for multi-level agreements, in which a few rules apply to many participants and additional rules are agreed upon by smaller groups as needed.

It is important to note that Solid was not created to fight specific companies, such as Google, Facebook or Twitter. The project challenges centralization as a whole, since many of the problems with these companies are caused by centralization and by a business model based on owning data. We have reached a point where companies hold such a volume of data that they themselves can no longer predict the long-term consequences of such centralization. Therefore, it is unreasonable to keep relying on “informed consent” as a pretext, since no one can understand what giving up control over small or large pieces of their data will ultimately lead to. Thus, storing your data in a safe place, with freedom of choice and a detailed permissions model, is the only safe option.

None of us dreams of a web without major players. Quite the contrary: Tim Berners-Lee insists that the web should always scale from very small to very large participants. The problem is that, at present, very large participants are trying to crush the rest, which threatens the freedoms that we have enjoyed for many years. As mentioned above, decentralization is first and foremost freedom of choice: people should be able to freely join large or small communities. And although we still face several technical problems, including guaranteeing the same convenience and speed as centralized platforms, Solid is the first viable technical solution. Now we need to consolidate this progress in socio-economic reality in order to decentralize the web completely. Only when we manage to regain control and freedom of choice over our most valuable digital assets can we truly say: This is the Network for all.
