Ask Badoo backend developers. Part 1. Platform



    We really like the AMA (ask me anything) format on Reddit, where someone (in our case, a development team) comes to the AMA subreddit and announces that they are ready to answer questions. Among the most memorable Ask Me Anything sessions are those of the SpaceX engineers, the Google engineers, and even the current US President, Barack Obama, who answered questions on Reddit four years ago. Our Android team recently hosted an AMA of its own and answered developers' questions online.

    But there is no Reddit in Russia. There is, however, Habr. So we decided to come here with the “ask us a question” format. And not empty-handed, as the AMA rules require. To make it easier for you to get into the topic, we picked one of our teams, “Platform”, and asked the guys to tell us what they do, what they program in, and what they have achieved over the team's existence, and to sum up the results of the outgoing year, 2016. Let's go!

    Table of contents


    1. What “Platform” does
    2. Services: Pinba, SoftMocks and others
    3. System programming. How we started using Go and what it led to
    4. Photos
    5. Script cloud
    6. LSD: Live Streaming Daemon
    7. Cassandra Time Series: what it is and how it works
    8. Badoo AMA: ask the developers of the Platform

    Proof that it is really us.


    What “Platform” does



    Anton Povarov , einstein_man , head of the Platform
    Mikhail Kurmaev , demi_urg , head of the A-Team

    The Platform team is an infrastructure team that helps other departments. Our main goal is to make sure that everything works and everyone is happy: programmers should be able to write code calmly, without having to think about all sorts of complicated things underneath. We are the backend for the backend.

    “Platform” consists of two teams: a team of C programmers (they write in C and, lately, in Go) and the A-Team (they write in PHP and sometimes in Go too). The C programmers write services and build PHP extensions. The A-Team handles the PHP and database infrastructure, as well as the development and support of tools for the other teams.

    If we talk about specific projects in terms of what the user sees (and uses), then we are responsible for:

    • photos: the entire storage and delivery infrastructure is on us;
    • the script cloud (a “distributed cron” that lets us run offline handlers on a cloud of machines);
    • all the “wrappers” for services (we give other Badoo teams convenient access to services, because we have sharding and home-grown replication mechanisms).

    We own the “wrappers” because we want to hide all these internals from the backend developers on other teams, both to simplify their work and to avoid unforeseen situations where something we maintain suddenly breaks underneath their main tasks.


    Services: Pinba, SoftMocks and others


    Some of our internal services have grown into full-fledged products over time and have even become de facto standards in the PHP ecosystem. The most famous of them is PHP-FPM, written by Andrey Nigmatulin, which has since become part of the standard PHP distribution for the web (a talk on this topic can be viewed here). Another is Pinba (http://pinba.org/), a service for receiving realtime statistics from running applications without collection overhead (it receives data over UDP), which makes it easy to understand what is happening with your application's performance at any given moment.

    Pinba is convenient because it lets us collect data all the time, so the data is always at hand when you need to understand the cause of a problem. This significantly reduces the time it takes to find and fix an issue. No less important, Pinba helps us spot a problem early, before it has affected users.

    We also invented and built SoftMocks, our own framework that simplifies unit testing by allowing classes and functions to be replaced in tests. We had to create it during the transition to PHP 7, in which the internal architecture of the interpreter was heavily reworked and our old Runkit simply stopped working. At the time we had about 50,000 unit tests, most of which use mocks in one way or another to isolate the external environment, and we decided to try a different approach, potentially more powerful than Runkit/Uopz.

    One of the main advantages of SoftMocks is its independence from the interpreter's internals and the absence of any third-party PHP extensions. This is achieved by the approach we chose: rewriting the program's source code on the fly rather than dynamically substituting things inside the interpreter. The project is now open-sourced, and anyone can use it.

    You may know that we have a very strong team of PHP developers at Badoo. So it is not surprising that in 2016 we were among the first companies to move a project of Badoo's scale to PHP 7. You can read about how we got there, what we ran into, and what we gained in this post.


    System Programming. How we started using Go and what it led to



    Marko Kevac , mkevac , C/C++ programmer

    In C/C++ we develop high-performance in-memory daemons that process hundreds of thousands of requests per second and store hundreds of gigabytes of data in memory. Among them you can find things like search daemons that use bitmap indexes and search them with the help of JIT compilation, or a smart proxy that handles the connections and requests of all our mobile clients. When necessary, we extend the PHP language to fit our needs: some patches are sent upstream, some are too specific to us, and some things can be done as loadable modules. We also write and maintain NGINX modules that handle things such as URL and data encryption and fast on-the-fly photo processing.

    We are hardcore system programmers, but at the same time we understand perfectly well all the disadvantages of C/C++ programming: slow development, the potential for errors, and the complexity of programming with threads.

    Since the appearance of Go, a newfangled, young, and promising language from Google, we have been interested in it. And almost immediately after the first stable release in 2012, we began considering using it in production.

    Go promised to be close in spirit and performance to our beloved C, while letting us build prototypes and even final products noticeably faster and with fewer errors. And the fact that Go was synonymous with concurrency, with its channels and goroutines, especially fired our imagination.

    At that moment we had a new, cool, and very urgent task: finding intersections between people in the real world. After hearing the requirements, we almost exclaimed in chorus: “This is a task for Go!” We needed to process a large stream of user coordinates, intersect them correctly along several dimensions, including time, and produce some kind of result. Lots of interaction between components, lots of parallel computation. In short, exactly the kind of task Go was made for.

    The prototype was built by three people in a week. It worked. It worked well. And we realized that Go would take root with us. In 2015, Anton Povarov spoke in detail about Go at Badoo.

    But it cannot be said that our romance was perfect. Go at that time was a very young language with plenty of problems, and we immediately started writing products that processed tens of thousands of requests per second and consumed almost 100 gigabytes of memory.

    We had to optimize our services to avoid unnecessary memory allocations, both explicit ones and those the Go compiler would otherwise decide to make for us. And here the beauty and convenience of Go showed themselves again. From the very beginning, Go had excellent tools for profiling performance and memory consumption, and for seeing when the compiler decides to allocate something on the heap rather than on the stack. These tools made optimization an interesting and informative adventure rather than a torment.

    In our first project we needed to use an existing geo-computation library written in C, so we plunged straight into the thick of the problems and nuances of making these two languages interact.

    Since Go was an initiative “from below”, we had to make sure our colleagues and managers would not reject the idea out of hand. We understood that, from the operations side, a Go project had to be indistinguishable from a C project: the same JSON configs, the same interaction protocols (protobuf as the primary one and JSON as the secondary), the same standard statistics going into RRD. We also had to make sure that release engineering for a Go project was no different from a C project: the same Git + TeamCity flow, the same builds in TeamCity, the same deployment process. And we did it.

    Administrators, operations, and release engineers do not have to think about what a project is written in. We realized that we no longer needed to be shy about using new tools, since they had proved themselves perfectly in practice (on non-critical tasks, as it should be at the start).

    We did not create anything from scratch: we built Go into an infrastructure that had existed for many years. This limited our use of some things that are standard in Go. But it was this fact, coupled with the fact that we immediately began writing a serious high-load project, that let us dive into the language headfirst. We got our hands pretty dirty, I can tell you, but this closeness helped us “grow together” with this beautiful language.

    It has been interesting to watch Go grow with each release, like a child turning into an adult. We saw the GC pauses on our daemons melt away with each new version, without any code changes on our part!

    Now, after four years of working with the language, we have about ten very different Go services across three teams, and plans for several more. Go has firmly entered our arsenal. We know how to “cook” it and when to use it. After all these years, it is nice to hear programmers regularly say things like “let's quickly sketch a prototype in Go” or “there's so much parallelism and interaction here, this is a job for Go.”


    Photos



    Artyom Denisov , bo0rsh201 , senior PHP programmer

    Photos are one of the key components of Badoo from the product point of view, and we simply must pay a lot of attention to the infrastructure for storing and serving them. At the moment we store about 3 PB of photos, users upload about 3.5 million new images every day, and the read load is about 80,000 requests per second at each site.

    Conceptually, it is organized as follows. We have three points of presence in three data centers (in Miami, Prague, and Hong Kong), which provide locality for most of our target markets.

    The first layer of the infrastructure consists of caching servers with fast SSDs, which handle 98% of incoming traffic. They run our own mini-CDN: a caching proxy optimized for the nature of our load, which also implements a lot of utility and product logic (ACLs, resizing, overlaying filters and watermarks on the fly, circuit breaking, etc.).

    The next layer is a cluster of server pairs responsible for long-term storage. Some of them have local disks on which the photos are stored directly, and some are connected over optical fiber to the third layer, a Storage Area Network.

    Each pair of machines serves the same range of users and operates in master-master mode, fully replicating and backing each other up via an asynchronous queue. Having such pairs gives us fault tolerance not only at the level of hard disks but also at the level of physical hosts (kernel panic, reboot, blackout, etc.), and makes it easy to carry out scheduled maintenance and survive failures, which at large scale are not uncommon, without degradation of the service.

    Artyom Denisov spoke about our work with photos in more detail this year at HighLoad++ .


    Script cloud


    It is no secret that in any project, in addition to the actions performed in the context of a user's request, there are a large number of background tasks executed on a delayed or scheduled basis. Usually some kind of background worker scheduler is used for them (in the simplest case, cron).

    As the number of such tasks grows, along with the resources they consume, they gradually stop fitting on one, and then even on several dozen, physical machines, and managing these cron jobs and manually balancing the load across each node of the cluster becomes difficult. So the need arose to create our own cloud: a flexible infrastructure for transparently running developers' tasks.

    It works like this:

    1) The developer describes a job as a PHP class implementing one of several interfaces (cron script, queue processor, database iterator, etc.).

    2) The developer adds it to the cloud via the web interface and sets the launch frequency, timeouts, and resource limits.

    3) The system then launches this job on the distributed infrastructure allocated to the cloud, monitors its execution, and balances the load across the cluster. All that remains for the developer is to watch the status of their job and check the logs via the web UI (how many instances are running, with what settings, and how each run ended).

    At the moment the cloud comprises about 2,000 hosts in two DCs, with ~48,000 CPU cores and 84 TB of memory. 1,800 user jobs generate about 4,000 launches per second.

    We talked about the cloud here and here .


    LSD: Live Streaming Daemon


    Everyone who works with large amounts of data sooner or later faces the task of streaming them: as a rule, streaming data from a large number of different sources into one place so that it can be processed centrally. The kind of data often does not matter: we stream application logs, statistics, user events, and much more. Conceptually, we use two different approaches to this problem:

    1) Our own implementation of the queue server for delivering events related to the product / application logic.

    2) A simpler mechanism for streaming various logs, statistical metrics, and simply large volumes of data from many nodes that need to be centrally aggregated and processed in large batches in one place.

    For the second task we used Facebook's Scribe for a long time, but as the amount of data pumped through it grew, it became less and less predictable, and the project itself has long been abandoned.

    As a result, at some point it became more profitable for us to write our own solution (the task does not look very difficult), which would be easier to maintain.

    We called our event-streaming daemon LSD: Live Streaming Daemon.

    Key features of LSD:

    • transport in the form of lines in plain files (for a client, there is nothing more reliable than writing data to a file on the local FS);
    • clients are never blocked while writing, even if all destination servers are unavailable (the buffer accumulates on the local FS);
    • transparent control and limits on network and disk resource consumption;
    • a Scribe-compatible record format and file aggregation scheme on the receiver.


    This year we published the source code for LSD, and now you can use it in your projects.


    Cassandra Time Series: what it is and how it works



    Evgeny Guguchkin , che , senior PHP programmer

    Badoo is a complex system consisting of many interconnected components, and assessing the state of this system is not an easy task. To do it, we collect more than 250 million metrics at a rate of about 200,000 values per second, and this data takes up about 10 TB.

    Historically, we used the well-known RRDtool utility for storing and visualizing time series, “wrapping” it in our own framework for convenience.

    What we liked about RRDtool was its read speed. However, it has serious disadvantages:

    • high disk load caused by a large amount of random-access I/O (we solved this with SSDs and RRDcached);
    • no ability to write retroactively: if we have recorded a value for 2016-01-01 00:00:00, we can no longer record a value for 2015-12-31 23:59:59;
    • rather wasteful use of disk space for sparse data;
    • data access is local only: it is impossible to build a horizontally scaled distributed system out of the box.

    The last point was the decisive one for us, because without it we could not display metrics from different servers on the same chart.

    As a result, we carried out a detailed analysis of existing time-series database solutions, made sure that none of them suited us, and wrote our own solution on top of Cassandra.

    At the moment, half of our real data is duplicated in a new storage. In numbers, it looks like this:

    • nine servers;
    • 10 TB of data;
    • 100,000 values per second;
    • 140 million metrics.

    At the same time, we solved almost all the problems we faced:

    • the failure of one node in the cluster blocks neither reading nor writing;
    • “raw” data can be appended to, and even rewritten, as long as it is not older than a week (the window in which data can be changed is configurable and can be adjusted at runtime);
    • simple cluster scaling;
    • no need for expensive SSD drives;
    • moreover, we can do without redundant HDD RAID arrays, since when a disk is replaced, a node can recover its data from neighboring replicas.

    We are very proud of the work we did analyzing existing solutions and building a new one. We stepped on countless rakes while working with Cassandra and will be happy to answer your questions and share our experience.


    Badoo AMA: ask the developers of the Platform a question


    And now, in fact, the reason we are publishing this post. Today from 12:00 to 19:00 (Moscow time), the Platform team will be answering your questions. We have been through a lot over the team's existence: we expanded, changed, studied, ran into problems, and adopted new programming languages. And we are ready to share our experience with you (including our fails, screw-ups, and pain).

    For example, ask about:

    • how our internal projects are organized (how our sharding works, how we collect and draw statistics, how we communicate with services);
    • what rakes we stepped on and why we made certain decisions (what we did when we stopped fitting on one server, or in one DC);
    • how we organize the work of a strong team of PHP and C programmers;
    • the transition to PHP 7 (what problems you may run into);
    • the specifics of working on a highly loaded project;
    • recommendations for PHP and Go programmers;
    • anything described above in this post.

    But do not limit yourselves to these!

    UPD: Thanks everyone for the questions! We are wrapping up our AMA session, but we will keep answering, just not as promptly. So keep asking.
