Facebook lifts the curtain on its technology

    Very large distributed systems need rules and tooling of their own; simply scaling up technologies built for small systems does not work. One area where this shows is web caching. How do Facebook engineers handle it for Zuckerberg's brainchild? Let's take a closer look at the caching principles used by the social networking giant.



    As is well known, caching means copying the information users request most often from permanent storage into an intermediate buffer that can be read much faster. The concept of a cache is found in many areas of our lives, and its role is extremely important. In the networked IT industry specifically, caching takes load off the servers that store heavily requested data. For an ordinary visitor to a social network, the result is that any web page, even the most popular one at that moment, loads almost instantly. For popular sites this technology has long since moved from “desirable” to “necessary”. Facebook is of course not alone here: caching is just as crucial for similar projects such as Twitter, Instagram, Reddit and others.
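
    The idea in the paragraph above can be sketched in a few lines of Python. This is only an illustration of the general "check the cache first, fall back to storage on a miss" pattern, not Facebook's code; the `SlowStore` and `CacheAside` classes and the 60-second lifetime are assumptions made for the example.

```python
import time


class SlowStore:
    """Hypothetical stand-in for permanent storage; counts how often it is read."""

    def __init__(self, data):
        self.data = dict(data)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.data.get(key)


class CacheAside:
    """Serve reads from an in-memory buffer and go to the store only on a miss."""

    def __init__(self, store, ttl_seconds=60.0, clock=time.monotonic):
        self.store = store
        self.ttl = ttl_seconds
        self.clock = clock
        self._cache = {}  # key -> (value, time at which the entry expires)

    def get(self, key):
        now = self.clock()
        entry = self._cache.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: permanent storage is not touched
        value = self.store.get(key)  # cache miss or expired entry
        self._cache[key] = (value, now + self.ttl)
        return value
```

    Repeated requests for the same popular key reach the store only once per lifetime of the entry, which is exactly the load reduction described above.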

    Facebook's network infrastructure engineers built a dedicated tool for managing caching and named it mcrouter. Earlier this month an important event for its future took place: at a conference in San Francisco, Facebook representatives released the source code of their brainchild. Mcrouter is a memcached protocol router that manages all traffic between thousands of cache servers and dozens of clusters in the company's data centers. This is what lets memcached work at the scale of a project such as Facebook.

    Memcached is a system for caching data on servers that are part of a distributed infrastructure. It was first used at LiveJournal back in 2003, and today it is an integral part of many Internet companies and projects, such as Wikipedia, Twitter and Google.



    Instagram started using mcrouter while it was still running on Amazon Web Services (AWS) infrastructure, before it moved to Facebook's data centers. Engineers at Reddit, which incidentally is also hosted on AWS, have already finished testing mcrouter for their site and plan to move to it completely in the very near future.

    Open source formalization


    Facebook creates and uses many open source software products to run its data centers; that is a matter of principle for the company. At the same conference in San Francisco, company representatives spoke at length about the importance of making the software they use as open as possible. Their initiative to keep working in this direction was supported by, among others, IT giants such as Google, Twitter, Box and GitHub. Openness carries some risks, but the benefits it brings to the development of software are quite obvious. It remains to be seen what concrete actions will follow the announcement, though they should not be long in coming.

    The companies at the forum that backed the openness initiative announced the creation of TODO (short for "Talk Openly, Develop Openly"), an organization that will act as a focal point for this work. No concrete plans have been announced so far, but in general the organization is expected to help develop new products and promote existing ones. Its members also commit to adapting and unifying their own software development practices.



    Where Likes Live


    Mcrouter has become indispensable to Facebook's further development, especially for implementing some of the site's more interesting features. According to Rajesh Nishtala, one of the company's programmers, one of these features is the “social graph”, which tracks the connections between people, their tastes, and the actions they take while on Facebook.

    The social graph contains the names of people and their relationships, as well as objects: photos, posts, likes, comments and geolocation data. “And this is just one of the tasks for which caching is applied,” Nishtala said.



    Every time you load a page on the social network, the cache is accessed. The cache, in turn, handles more than 4 billion such operations per second, thanks in large part to mcrouter, which copes with the load of the entire infrastructure.

    From load balancing to overload protection


    Mcrouter sits between the client and the cache server; in effect, the client works through it. It accepts client requests and passes back the cache servers' responses. Beyond that, Nishtala described the system's three main tasks: pooling cache connections, distributing the workload across the cache servers, and automatically protecting any individual server on the network from overload.

    Pooling cache connections helps keep the site fast. If every client connected directly to a cache server, the server could easily be overloaded. Mcrouter acts as a proxy and load balancer, giving clients access to the server only as long as the load on it has not become critical.
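
    A minimal sketch of connection pooling, assuming a proxy that opens a fixed number of connections and lets all clients share them. The `FakeCacheServer`, `FakeConnection` and `PoolingProxy` classes are hypothetical stand-ins written for this example; they are not mcrouter's interfaces.

```python
import queue


class FakeConnection:
    """Hypothetical connection object; a real one would wrap a socket."""

    def __init__(self, server):
        self.server = server

    def get(self, key):
        return self.server.data.get(key)


class FakeCacheServer:
    """Hypothetical cache server that counts the connections opened to it."""

    def __init__(self):
        self.data = {}
        self.open_connections = 0

    def connect(self):
        self.open_connections += 1
        return FakeConnection(self)


class PoolingProxy:
    """Open a fixed number of connections once and share them among clients."""

    def __init__(self, server, pool_size=2):
        self._idle = queue.Queue()
        for _ in range(pool_size):
            self._idle.put(server.connect())

    def get(self, key):
        conn = self._idle.get()  # blocks until a pooled connection is free
        try:
            return conn.get(key)
        finally:
            self._idle.put(conn)  # hand the connection back for reuse
```

    However many requests pass through the proxy, the server only ever sees `pool_size` connections, which is the point of putting a proxy in front of it.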

    When many workloads compete for cache memory and put load on it, the system divides them into groups and spreads those groups across the available servers.
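
    One common way to spread grouped workloads over servers is to pick a group of servers from the key's prefix and then hash the key to choose a server inside that group. The sketch below illustrates that general technique only; the prefixes, host names and the use of MD5 are assumptions for the example, and the article does not say which scheme mcrouter itself uses.

```python
import hashlib

# Hypothetical groups: each key prefix is served by its own set of cache hosts.
POOLS = {
    "user:": ["cache-a1", "cache-a2", "cache-a3"],
    "photo:": ["cache-b1", "cache-b2"],
}
DEFAULT_POOL = ["cache-c1"]


def pick_server(key):
    """Choose a group by key prefix, then a server within it by hashing the key."""
    pool = DEFAULT_POOL
    for prefix, servers in POOLS.items():
        if key.startswith(prefix):
            pool = servers
            break
    # A stable hash, so the same key is always routed to the same server.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]
```

    Plain modulo hashing remaps most keys whenever a group changes size; consistent hashing is the usual way to limit that, at the cost of a slightly more involved lookup.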

    If a cache server fails, mcrouter automatically switches to another, backup server. Once that happens, the system starts checking whether the failed server has come back into service.

    The system can also build whole tiers of pooled server groups. If one pool of cache servers is unavailable, mcrouter automatically shifts the load to another available pool.
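
    The failover behaviour described in the last two paragraphs can be sketched as follows: try the groups in priority order, mark a group as down when it fails, and probe it again after a pause. The `FailoverRouter` class, the `ServerDown` exception and the five-second retry interval are hypothetical choices for this illustration, not mcrouter's implementation.

```python
import time


class ServerDown(Exception):
    """Raised by a backend (hypothetical) when it cannot serve a request."""


class FailoverRouter:
    """Send each request to the first healthy backend in priority order."""

    def __init__(self, backends, retry_after=5.0, clock=time.monotonic):
        self.backends = backends  # callables taking a key, highest priority first
        self.retry_after = retry_after
        self.clock = clock
        self._down_until = {}  # backend index -> time at which to probe it again

    def get(self, key):
        now = self.clock()
        for i, backend in enumerate(self.backends):
            if self._down_until.get(i, 0.0) > now:
                continue  # still marked as down, skip without trying
            try:
                value = backend(key)
            except ServerDown:
                self._down_until[i] = now + self.retry_after
                continue  # fall through to the next backend
            self._down_until.pop(i, None)  # backend is healthy again
            return value
        raise ServerDown("no backend available")
```

    A backend here can stand for a single server or for a whole group of them, so the same loop covers both the single-server and the group-level case.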



    Reddit also relies on mcrouter


    Reddit has already tested mcrouter on one of its AWS clusters. Reddit system administrator Ricky Ramirez commented: “The number of servers the site uses changed continuously over many days, ranging from 170 to 300. We also had about 70 backend cache nodes with a total of 1 TB of memory. Based on the results, we can say that the system has paid off overall. The weak point that mcrouter should help with is our inability to switch to the new machine types that Amazon keeps introducing. This is a pretty big problem that takes up a lot of an operations engineer's time.”

    After a successful test on one pool of servers handling 4,200 operations per second, the team plans to use mcrouter under much heavier loads. The next pool may be tested under a load of more than 200,000 operations per second.

    Next, the engineers plan to use mcrouter to bring new cloud virtual machines into service and replace the current capacity with them without any downtime for the site. Retiring capacity that is already in use will make the infrastructure somewhat harder to manage, but Ramirez is confident that the effort will ultimately bring an impressive performance gain.

