A meetup between our developers and students of the Moscow Institute of Physics and Technology, or “How to build Badoo from scratch”

    This Wednesday our developers, themselves graduates of the Moscow Institute of Physics and Technology, will hold a meetup with MIPT students and talk about how large projects are built and how to put together Badoo on your own.
    No marketing, no PR, and no other bullshit. Only development, only hardcore!
    The students will talk to developers from the A-Team, the department that specializes in the company's infrastructure projects. At Badoo, the A-Team builds scalable and fault-tolerant application platforms, develops cluster management applications and utilities for test automation and code deployment, and collects and analyzes tons of data to improve the quality and performance of multi-server production systems.
    The work sits at the junction of end-user applications and system software.
    If you study at another university but would like to attend, say so in the comments to this post or in a private message before 15:00 on October 23. We expect a letter with the name of your university, your full name, year of study, and specialty.

    Where: Dolgoprudny, Moscow Institute of Physics and Technology, main building, room 117
    When: Wednesday, October 23, at 19:00
    Bonus: the chance to ask tricky questions of fisher, antonstepanenko, youROCK, and Demi (who has no Habr account).

    We managed to steal the drafts the developers are using to prepare their talk, and we are sharing them with you.

    What we will talk about at the meeting:

    Badoo: DIY


    1. Start of the project
    • 1 server, LAMP;
    • MySQL, because it is simple, fast, and cheap to maintain; we do not need Oracle's feature set since we do not want logic in the database;
    • PHP, because development is fast, performance is good, developers are easier to find, and there were no real alternatives when the project started;
    • nginx + php-fpm, because slow clients are the problem;
    • Launched, it works.
    In short: LAMP (Linux / Apache / MySQL / PHP), with Apache replaced by nginx + php-fpm.
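    The nginx + php-fpm pairing can be sketched with a minimal config (the paths and ports below are placeholders, not Badoo's actual setup): nginx accepts and buffers slow clients, so the php-fpm workers only ever see complete requests.

```nginx
server {
    listen 80;
    root /var/www/app;              # hypothetical document root

    location ~ \.php$ {
        fastcgi_pass 127.0.0.1:9000;   # php-fpm pool
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include fastcgi_params;
    }
}
```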

    2. Caching
    • Big traffic, serious load, the database cannot cope;
    • We add memcache; there are other options (Redis, Cassandra, etc.), but memcache is simple, reliable, and fast, and persistence is provided by the database anyway;
    • We shard keys across a pool of servers; all keys of one user live on one server;
    • We prolong key lifetimes;
    • We invalidate on transaction commit;
    • Custom daemons for special cases, with a Google Protocol Buffers (gpb) interface.
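    The "all keys of one user on one server" rule can be sketched like this (a minimal sketch, assuming a simple hash-modulo mapping; the pool names and the key format are made up, and the real Badoo scheme is not described in the post):

```python
import hashlib

MEMCACHE_POOL = ["mc1:11211", "mc2:11211", "mc3:11211"]  # hypothetical pool

def server_for_user(user_id: int) -> str:
    """Map a user id to one memcache server, deterministically."""
    h = int(hashlib.md5(str(user_id).encode()).hexdigest(), 16)
    return MEMCACHE_POOL[h % len(MEMCACHE_POOL)]

def cache_key(user_id: int, name: str) -> str:
    """Prefix every key with the user id, so all of a user's keys
    hash to the same server via server_for_user()."""
    return f"u{user_id}:{name}"

# The mapping is stable: the same user always lands on the same server.
assert server_for_user(42) == server_for_user(42)
```

    Keeping one user's keys together means a multi-key fetch for one user hits a single server, and invalidating that user touches only one node.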

    3. Scaling the web
    • The frontend stopped coping;
    • We increase the number of frontends and put a balancer in front of them (the simplest option is nginx as a reverse proxy, but we have our own expensive piece of hardware with smart balancing and failover);
    • We store sessions in memcache.
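    The "simplest option" mentioned above, nginx as a reverse proxy balancing several frontends, looks roughly like this (hostnames are made up):

```nginx
upstream frontends {
    server web1.internal:8080;
    server web2.internal:8080;
    server web3.internal:8080 backup;   # used only when the others fail
}

server {
    listen 80;
    location / {
        proxy_pass http://frontends;
    }
}
```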

    4. Monitoring and logs
    • Pinba: real-time monitoring; packets are sent over UDP, the data is aggregated, graphs are drawn;
    • Logs are collected via Scribe into the database; log search is done with Sphinx, plus filters;
    • There will always be errors; you just need to control their number and severity.
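    The point of sending monitoring data over UDP is that it is fire-and-forget: a dead or slow collector never blocks request handling. A minimal sketch of the idea (the address is a placeholder, and real Pinba packets are protobuf-encoded, not plain text):

```python
import socket
import time

def send_metric(name: str, value: float, addr=("127.0.0.1", 30002)):
    """Fire-and-forget: one UDP datagram per measurement, no connection
    and no waiting for a reply. (Real Pinba uses protobuf payloads;
    the plain-text payload here only illustrates the principle.)"""
    payload = f"{name}:{value:.6f}".encode()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(payload, addr)

start = time.monotonic()
# ... handle a request ...
send_metric("request_time", time.monotonic() - start)
```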

    5. Scaling the database
    • We increased the number of frontends, and now the database stopped coping;
    • We tried master-slave replication; it does not help write-intensive applications;
    • So we shard the database;
    • Sharding by division remainder (modulo) and similar schemes are not suitable;
    • UDB, spots;
    • Fetching data on such an architecture; search (we bolt on Sphinx, but with our own magic);
    • Queues: sending events in the same transaction as the changes, the two-phase commit problem, deferred event processing, asynchrony.
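    A quick sketch of why plain modulo sharding hurts, and of the general lookup-table alternative (the post does not describe UDB internals, so the "spots" code below is a hypothetical illustration of the technique, not Badoo's implementation):

```python
# Modulo sharding: growing the pool remaps almost every user,
# forcing a massive data migration.
def shard_mod(user_id: int, n_shards: int) -> int:
    return user_id % n_shards

moved = sum(1 for uid in range(10_000)
            if shard_mod(uid, 8) != shard_mod(uid, 9))
# About 89% of users change shard when going from 8 to 9 servers.
print(f"{moved / 10_000:.0%} of users move")

# Lookup-table scheme: each user is pinned to a "spot" once; moving a
# user is an explicit per-user directory update, not a global remap.
user_to_spot = {}                       # in reality a small directory DB
spot_to_server = {0: "db1", 1: "db2"}   # hypothetical spot placement

def assign(user_id: int) -> str:
    spot = user_to_spot.setdefault(user_id, user_id % len(spot_to_server))
    return spot_to_server[spot]
```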

    6. Scripting framework
    • We made lots of queues, so now there are lots of scripts processing them, and we need a pool of machines;
    • We made a pool and bound each script tightly to a machine; bad, because when the machine dies, the script stops running;
    • Existing solutions (Slurm, etc.) do not fit: either the balancing is poor, or the requirements on tasks are too specific;
    • So we build a cloud;
    • On each machine, a special agent launches tasks and sends heartbeats;
    • The central node manages the execution queue, pushes tasks onto live machines, and monitors the load.
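    The agent/central-node scheme above can be sketched as follows (the class, the timeout value, and round-robin dispatch are all hypothetical simplifications; the real cloud also tracks load):

```python
import time

HEARTBEAT_TIMEOUT = 15  # seconds; hypothetical value

class CentralNode:
    """Agents heartbeat periodically; the central node pushes queued
    tasks only onto machines seen alive recently, so a dead machine
    simply stops receiving work."""
    def __init__(self):
        self.last_seen = {}   # machine -> timestamp of last heartbeat
        self.queue = []       # pending tasks

    def heartbeat(self, machine: str):
        self.last_seen[machine] = time.monotonic()

    def alive(self):
        now = time.monotonic()
        return [m for m, t in self.last_seen.items()
                if now - t < HEARTBEAT_TIMEOUT]

    def dispatch(self):
        """Assign queued tasks to live machines, round-robin."""
        machines = self.alive()
        assignments = []
        while self.queue and machines:
            task = self.queue.pop(0)
            assignments.append((task, machines[len(assignments) % len(machines)]))
        return assignments
```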

    7. Deployment
    • We have more than 2000 machines, and we need to somehow deploy the code to all of them;
    • The simplest solution is git pull: slow and non-atomic;
    • The next step is rsync, which is faster, but still non-atomic; besides, it loads the network heavily, and 2000 forks are hard;
    • Our option is uftp + AIO, that is, multicast: it works quickly and does not load the network;
    • Atomic switch of a symlink; libpssh on AIO;
    • uimages: file versioning for statics;
    • Automatic builds, unit and other tests, deployment twice a day.
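    The atomic symlink switch is easy to sketch (a hypothetical helper, not Badoo's actual tool): the live release is a symlink, and a new release goes live by renaming a freshly created link over the old one.

```python
import os

def atomic_switch(symlink_path: str, new_target: str):
    """Repoint `symlink_path` at `new_target` atomically: create the new
    link under a temporary name, then rename it over the old one.
    rename(2) is atomic on POSIX, so readers always see either the old
    or the new target, never a missing link."""
    tmp = symlink_path + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(new_target, tmp)
    os.rename(tmp, symlink_path)
```

    With this scheme the new code can be unpacked next to the old one at leisure; the actual cutover is a single rename.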

    8. And we also have:
    • Our own CDN;
    • A code formatter;
    • Replication in PHP;
    • Antispam;
    • The fast Blitz template engine.

    Come!
