Percona Live 2016 CA Notes

    I recently attended the wonderful Percona Live 2016 conference in Santa Clara. I could write plenty of praise for the organizers: the reliable Wi-Fi, the food, the strict adherence to the schedule, the well-prepared halls. But this article is for a technical site, not a travel one, so I'll just tell you about the most interesting talks I attended.

    Surprisingly for such a narrowly focused conference, the talks were not limited to MySQL, as one might expect, but covered data processing tools as a whole. There was room for Hadoop and its ecosystem, for columnar databases, and for clouds (where would we be without them).

    Deploying GTID Replication at Dropbox


    Starting with version 5.6, MySQL has a wonderful feature called GTID replication. The manual contains a wall of text about how this replication works, but practically nothing on why it is needed.

    Let's imagine that someone needs cascading replication. This may be necessary if part of the replica cascade is located in another data center. To save traffic between data centers, only one replica there pulls a copy of the data, and the local slaves are updated from its logs. In general, this is a perfectly workable scheme. But it has one small problem.

    You can't simply switch a slave to a different master. For example, if one of the nodes fails, every node in its branch has to be reinitialized, i.e. you deploy a copy of the new master and start replication from it. That is expensive and slow.
    To avoid this, a new replication log format was proposed that lets a slave continue replicating from other slaves. That is, the replication log written on a slave fully duplicates the log from the master.
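    To make the idea concrete, here is a minimal Python sketch, not MySQL code. In GTID mode each transaction is identified by the UUID of the server that originally committed it plus a sequence number, so any two servers can compare what they have executed without relying on binary log file names and offsets. The server name `dc1-master` (a stand-in for a real server UUID) and both helper functions are hypothetical; `gtid_subtract` only mirrors the idea behind MySQL's `GTID_SUBTRACT()` function.

```python
from typing import Dict, Set

# Hypothetical in-memory model: source server id -> executed transaction numbers.
GtidSet = Dict[str, Set[int]]


def parse_gtid_set(text: str) -> GtidSet:
    """Parse a GTID set written like 'server-a:1-5:8,server-b:1-3'."""
    result: GtidSet = {}
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        server_id, *intervals = part.split(":")
        numbers = result.setdefault(server_id, set())
        for interval in intervals:
            first, _, last = interval.partition("-")
            numbers.update(range(int(first), int(last or first) + 1))
    return result


def gtid_subtract(a: GtidSet, b: GtidSet) -> GtidSet:
    """Return the transactions that are in `a` but not in `b`."""
    diff = {sid: nums - b.get(sid, set()) for sid, nums in a.items()}
    return {sid: nums for sid, nums in diff.items() if nums}


# A replica that lost its intermediate master asks another server:
# "which transactions do you have that I have not executed yet?"
new_master = parse_gtid_set("dc1-master:1-120")
orphaned_replica = parse_gtid_set("dc1-master:1-100")
to_replay = gtid_subtract(new_master, orphaned_replica)
print(sorted(to_replay["dc1-master"])[:3])  # first transactions to replay
```

    In MySQL itself the replica is repointed with `CHANGE MASTER TO ... MASTER_AUTO_POSITION = 1`, and the servers work out the missing transactions on their own.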
    You can learn more about how this mechanism works, and how to enable it on a large project (remember, this is Dropbox speaking), directly from the talk.
    Rolling out Global Transaction IDs at Dropbox

    Maslow's Pyramid for Databases


    In a very satirical yet practical form, Charity Majors talked about what actually makes up the pyramid of needs and survival when you choose a database for a project. Each point is backed by an excellent illustration.

    Maslow's Hierarchy of Needs for Databases

    Partitioning in MySQL


    MySQL has supported partitioned tables for many years now. I hope there is no need to explain why they are useful. However, because of some implementation quirks, developers often avoid this mechanism. As the saying goes, if you are afraid of the axe, you won't cut down the forest. Rick James offers to work out how this tool can be used as intended and under what restrictions partitioning works well.
    Here are examples of tasks where partitioning pays off:
    • Sliding time window;
    • 2D index;
    • Import/export.
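    The sliding time window case can be sketched as follows. This is a minimal illustration that assumes a table partitioned by `RANGE (TO_DAYS(...))` with one partition per day plus a catch-all partition named `future`; the table name `events`, the `pYYYYMMDD` naming scheme and the helper functions are all hypothetical. The script only generates the DDL strings; it does not connect to a server.

```python
from datetime import date, timedelta
from typing import List


def partition_name(day: date) -> str:
    """Hypothetical naming scheme: p20160401 holds the rows for 2016-04-01."""
    return f"p{day:%Y%m%d}"


def rotate_statements(table: str, oldest: date, newest: date) -> List[str]:
    """Build the DDL that slides the window forward by one day.

    Dropping a partition removes a whole day of rows without a large DELETE,
    and splitting the catch-all `future` partition adds the next day.
    """
    next_day = newest + timedelta(days=1)
    upper_bound = next_day + timedelta(days=1)  # VALUES LESS THAN is exclusive
    return [
        f"ALTER TABLE {table} DROP PARTITION {partition_name(oldest)}",
        f"ALTER TABLE {table} REORGANIZE PARTITION future INTO ("
        f"PARTITION {partition_name(next_day)} "
        f"VALUES LESS THAN (TO_DAYS('{upper_bound.isoformat()}')), "
        f"PARTITION future VALUES LESS THAN MAXVALUE)",
    ]


for statement in rotate_statements("events", date(2016, 4, 1), date(2016, 4, 30)):
    print(statement + ";")
```

    The point of the pattern is that old data leaves the table as a cheap metadata operation instead of a long-running row-by-row delete.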

    PARTITIONing - How-To vs. Don't-bother

    Migrating user shards at Facebook


    Daren Seagrave from Facebook revealed some details of how their user shard system is organized and talked about how shards are moved between servers and data centers, and about the path the team took to reach even and efficient use of the server pool. The most unusual solution, in my opinion, was that they first decide where to move data, and only then what exactly to move. Although technically the shards are MySQL servers, the talk applies almost entirely to any database.
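    One way to read the "where first, what second" idea is the toy balancer below. It is an illustration only, not Facebook's algorithm: the host names, shard names, sizes and the `plan_move` helper are all hypothetical, and "load" is reduced to the total shard size on a host.

```python
from typing import Dict, Optional, Tuple


def plan_move(hosts: Dict[str, Dict[str, int]]) -> Optional[Tuple[str, str, str]]:
    """Pick one shard move as (shard, source_host, target_host), or None.

    `hosts` maps a host name to its shards and their sizes.
    """
    load = {host: sum(shards.values()) for host, shards in hosts.items()}

    # Step 1: decide WHERE data should go (the least loaded host).
    target = min(load, key=load.get)
    source = max(load, key=load.get)
    gap = load[source] - load[target]

    # Step 2: only now decide WHAT to move. A shard of size s changes the gap
    # to |gap - 2s|, so only shards smaller than the gap improve the balance.
    candidates = [(abs(gap - 2 * size), name)
                  for name, size in hosts[source].items() if size < gap]
    if not candidates:
        return None
    _, shard = min(candidates)
    return shard, source, target


pool = {
    "db1": {"s1": 50, "s2": 30, "s3": 20},
    "db2": {"s4": 10},
}
print(plan_move(pool))
```

    Choosing the destination first means the size of the gap is known before any shard is considered, which makes it easy to rule out moves that would only shift the imbalance to another host.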

    Everyday We're Shuffling - Online Shard Migration at Facebook

    Database backups at Facebook


    Shlomo Priymak and Dan Reif from Facebook talked about how they organized a system for storing backups of user shards. Because all shards are small, backing up an individual shard is fast. As you know, a backup cannot be considered a backup until you have verified that you can restore from it. That is why Facebook built a system that continuously checks that the backups taken can actually be used to deploy a database.
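    The "restore it, or it is not a backup" rule can be sketched in a few lines. The example uses Python's built-in `sqlite3` module as a stand-in for MySQL and compares per-table row counts after restoring the dump into a scratch database. It is not a description of Facebook's system, and all function names are hypothetical.

```python
import sqlite3
from typing import Dict


def table_counts(conn: sqlite3.Connection) -> Dict[str, int]:
    """Return {table name: row count} for every table in the database."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names}


def take_backup(conn: sqlite3.Connection) -> str:
    """Produce a logical (SQL text) dump of the whole database."""
    return "\n".join(conn.iterdump())


def verify_backup(dump: str, expected: Dict[str, int]) -> bool:
    """Restore the dump into a scratch database and compare row counts."""
    scratch = sqlite3.connect(":memory:")
    try:
        scratch.executescript(dump)
        return table_counts(scratch) == expected
    except sqlite3.Error:
        return False  # a dump that cannot be replayed is not a backup
    finally:
        scratch.close()


live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
live.executemany("INSERT INTO users (name) VALUES (?)",
                 [("alice",), ("bob",), ("carol",)])
live.commit()

dump = take_backup(live)
print(verify_backup(dump, table_counts(live)))
```

    Row counts are a deliberately cheap check; a stricter verifier could compare checksums of the restored tables instead.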

    The second technical feature of the system was an interesting approach to incremental backups. Honestly, I had never even heard of anyone taking incremental database backups. The idea behind the implementation turned out to be fantastically simple.

    The talk also covered how they implemented it and organized storage in Hadoop, and how they later moved away from computing this “diff” in Hadoop. Overall, I found this talk the most useful and interesting 8).
    Massively Distributed Backup at Facebook Scale

    Linux performance


    Netflix's Brendan Gregg gave an excellent whirlwind tour of the subsystems that make up Linux and of the utilities that report on each of them, so you can tell where things are stalling.


    He also presented his list of commands for collecting the essential information about a server's state in 60 seconds. I believe every “devops” will take away something new and useful from this talk.
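    For reference, the checklist that Brendan Gregg has published as his 60-second analysis looks roughly like the list below; the exact set shown in the talk may differ. The Python wrapper, the sample counts (`1 5`), the batch flags for `top` and the `collect` helper are assumptions, added so that every command terminates on its own. The published version pipes `dmesg` through `tail`; here the full output is kept.

```python
import shutil
import subprocess
from typing import Dict, List

# Commands from the published 60-second checklist, with assumed sample counts.
CHECKLIST: List[List[str]] = [
    ["uptime"],                          # load averages
    ["dmesg"],                           # kernel errors (OOM kills, drops)
    ["vmstat", "1", "5"],                # run queue, memory, swap, CPU
    ["mpstat", "-P", "ALL", "1", "5"],   # per-CPU balance
    ["pidstat", "1", "5"],               # per-process CPU usage
    ["iostat", "-xz", "1", "5"],         # disk I/O and latency
    ["free", "-m"],                      # memory and page cache
    ["sar", "-n", "DEV", "1", "5"],      # network interface throughput
    ["sar", "-n", "TCP,ETCP", "1", "5"], # TCP connections and retransmits
    ["top", "-b", "-n", "1"],            # overall snapshot
]


def collect(commands: List[List[str]], timeout: float = 15.0) -> Dict[str, str]:
    """Run each command and return {command line: output or a status note}."""
    report: Dict[str, str] = {}
    for argv in commands:
        key = " ".join(argv)
        if shutil.which(argv[0]) is None:
            report[key] = "(not installed)"
            continue
        try:
            done = subprocess.run(argv, capture_output=True, text=True,
                                  errors="replace", timeout=timeout)
            report[key] = done.stdout or done.stderr
        except (subprocess.TimeoutExpired, OSError) as exc:
            report[key] = f"(failed: {exc})"
    return report


# Usage on a Linux host (several tools come from the sysstat package):
#     for command, output in collect(CHECKLIST).items():
#         print(f"$ {command}\n{output}")
```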
    Linux Systems Performance

    BI retrospective at Badoo


    And of course, I can't keep silent about Badoo's talk at Percona Live 2016. In it, we talked about how our business intelligence system evolved: what difficulties we ran into, how we solved them, which technologies we used and at what data volumes, and how we chose a database for analytics.

    At the end of the talk, we explained that the most pressing problem for us is data complexity (hundreds of tables) and described how we are going to tackle it.
    BI at Badoo - historical retrospective

    Fixing MySQL Bug #2: now MySQL makes toast!


    The incredible happened! After 14 years, bug number 2 was finally fixed. Right before our eyes, MySQL made toast!


    Summary


    By and large, this whole post is a big thank-you to the organizers for the conference program. In addition, I want to mention the various little things that made it easier to focus on the conference itself: almost no queues for food, excellent Wi-Fi, water coolers, plenty of space and the ability to charge gadgets in the lecture halls, a fun stamp-collecting quest in the exhibition area and, of course, the Percona Game Night.

    Alexey Eremikhin (@alexxz), Head of BI Development Team
