Moving Mail.Ru Mail to a 64-bit architecture: how it went

    For several months now, the Mail.Ru Mail front ends have been 64-bit. Better late than never, we decided, and today I will tell you why we did it, what we had to go through, and how we did it.

    It works as it is

    For a long time, our Mail ran on 32 bits, on Apache 1 and Perl 5.8 under CentOS 5. The idea of moving the front end to more modern software and a 64-bit architecture had been on our minds for a long time: a year and a half ago, two people, one admin and one developer, spent a sleepless week bringing up a test server that ran our bright future. In those days, however, we had more urgent tasks, and we happily forgot about the server. We came back to the idea from time to time, but it always went the same way: “What if we do it like this?”, “Oh, something broke!”, and everything was rolled back and shelved again.

    The time has come

    Finally, the time for change came. At some point we realized that things could not go on like this: full support for CentOS 5 ended in 2014, Perl 5.10 showed a 30% speedup on some tasks, and a 32-bit architecture in the 21st century falls somewhat short of what one would want.
    In addition, after we moved Mail to HTTPS, the load average on the front-end servers went up, so a faster Perl became more relevant than ever.

    Transition difficulties

    First of all, we had to rewrite the parts of the code that interact with Apache directly. Since our monitoring system is tied to the Apache ErrorLog, we had to teach the new server to log errors the way we need. The result was a home-grown error logging module for Apache 2, which is publicly available.
    In addition, we had to deal with dependencies: all the packages we use had historically been built for the c5x32 architecture and added to the repository in that form. Under the new conditions everything, including the nginx modules, had to be rebuilt for c6x64.
    The captcha generation module was also completely rewritten.
    However, banners gave us the most trouble. Our entire layout is built on slots whose contents, banners included, come from a template engine written in C and deeply integrated into Apache. The banner module picks up the Apache headers and does its targeting based on them. To get all of this off the ground under Apache 2, I not only had to work hard myself but also had to push the guys from the relevant department.

    Binary protocols

    In Mail.Ru Mail, a lot of interaction happens over binary protocols. First of all, that is how we talk to our Tarantool data store. Beyond the database, even our own services, for example a Perl server and a C server, exchange data in binary form. This is good, fast and convenient, until the architecture changes.
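    The article does not show the wire formats involved, so the following is only a sketch of one general precaution. In Perl, pack templates such as V and N are always exactly 32 bits wide, while L! follows the native C long and therefore changes size between 32-bit and 64-bit builds. The message layout and the names encode_message, decode_message and native_long_width are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical message: request type and user id, each a 32-bit
# little-endian unsigned integer ("V"), regardless of the host architecture.
sub encode_message {
    my ($type, $user_id) = @_;
    return pack('V V', $type, $user_id);
}

# Reverse of encode_message: returns (type, user_id).
sub decode_message {
    my ($bytes) = @_;
    my ($type, $user_id) = unpack('V V', $bytes);
    return ($type, $user_id);
}

# By contrast, the width of a native long depends on how perl was built,
# so "L!" is a poor choice for data that crosses architectures.
sub native_long_width {
    return length pack('L!', 0);
}
```

    With fixed-width templates, a message built on a 32-bit front end has the same length and byte order as one built on a 64-bit front end, so the C side can parse both identically.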
    Every time data has to be encrypted or added modulo some value, there is a non-zero chance that the result will differ between x32 and x64. To make things worse, the differences show up only on specific data, so finding, catching and recording these cases is not a trivial task.
    For example, the first line of the code below behaves differently on the two architectures: the constant 41262125215 does not fit into 32 bits.
    my $crypted_userid = $user->{'ID'} ^ 41262125215;
    getpage('project_url_api?user_id=' . $crypted_userid);
    This difference in behavior causes problems in completely unexpected places, even in other projects. For example, because of our shared authorization system, the new results of running this code on the user ID meant that one user changing their password logged a completely different user out of another service.
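    A likely explanation, stated here as an assumption rather than something the article spells out: the constant 41262125215 is larger than 2^32, and a perl built with 32-bit integers clamps such a value to 0xFFFFFFFF before the XOR, while a 64-bit build uses it in full. The sketch below emulates both behaviors on a 64-bit perl and shows a variant whose key fits into 32 bits. All function names and the replacement key KEY32 are illustrative.

```perl
use strict;
use warnings;

use constant KEY   => 41262125215;  # constant from the example; needs more than 32 bits
use constant MAX32 => 0xFFFFFFFF;   # largest 32-bit unsigned integer
use constant KEY32 => 2607419551;   # low 32 bits of KEY (illustrative replacement key)

# What a 64-bit perl computes: the full constant takes part in the XOR.
sub xor_native64 {
    my ($id) = @_;
    return $id ^ KEY;
}

# Emulation of the assumed 32-bit behavior: the oversized constant is
# clamped to MAX32 before the XOR.
sub xor_emulated32 {
    my ($id) = @_;
    my $key = KEY > MAX32 ? MAX32 : KEY;
    my $id32 = $id & MAX32;
    return $id32 ^ $key;
}

# Gives the same result on both builds: the key fits into 32 bits
# and the result is masked to 32 bits.
sub xor_portable {
    my ($id) = @_;
    my $mixed = $id ^ KEY32;
    return $mixed & MAX32;
}
```

    For an ID of 0, the 64-bit path yields 41262125215 while the emulated 32-bit path yields 4294967295, which is enough for two front ends to disagree about the same user.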
    The same encryption problem turned up in Mail itself. After sending a message, the user is redirected to a URL that contains, among other things, the encrypted recipients (nobody wants their email address to appear in the address bar in clear text). It is not hard to guess what happened once we started generating the URL on a 64-bit architecture: instead of the list of recipients, a random set of characters appeared.
    Of course, these problems have been fixed by now, but tracking them down took a significant amount of time.

    Only two.

    The transition itself took two months. Before that, for about six months, our Mail ran on both the old front ends and the new ones. We carefully monitored both systems: our dashboard had two separate graphs showing where there were more errors and which system was faster. There is a funny story about them: once, in the middle of the night, everyone was woken up because the lines on the performance graphs of the new front ends ran higher, which seemed to mean they were much slower. It then turned out that we had optimized so well that the scale for the new front ends had automatically changed by an order of magnitude, and in fact they were 10 times faster. Still, we managed to get a good scare.
    In addition, with two systems running side by side, you cannot simply take the Apache request and process it. You have to do the following:
    sub GetApacheRequest {
        # mod_perl 2 and mod_perl 1 expose the current request differently
        return $ENV{MOD_PERL} =~ m{mod_perl/2} ? Apache2::RequestUtil->request() : Apache->request();
    }
    The package builds for Mail also became littered with conditions like
    %if 0%{?centos} == 6
    %if 0%{?centos} == 5
    And there are a lot of such places.
    Besides monitoring and making changes in two branches at once, we of course had to spend twice as long testing each new iteration, but our testers managed.

    The reward for work

    So, what did we get as a result of this painstaking and sometimes nerve-racking work?
    • Full CentOS 6 support: new patches and an up-to-date system
    • Fast regular expressions in Perl 5.10. Scripts that do a lot of parsing run even faster
    • Apache 2 picks up new configs and scripts without a restart. Deploying code and configs no longer results in 500 errors (in theory Apache 1 could do this too, but getting it to do so properly without taking the front end out of rotation is a task from the realm of fantasy)
    • A full refactoring. Moving to new software is a great excuse to get rid of unnecessary dependencies, unnecessary entities and unused modules
    • Puppet. Since we were going all out anyway, we switched to Puppet at the same time. Rolling out new features and deploying hotfixes is now much easier.

    We expected the move from 32 to 64 bits to hurt the memory consumption of Apache, which already tries to eat everything it is given. The amount of memory per process has of course grown; there is no getting around that. However, everything got faster, so fewer processes handle the same load, and overall memory use has not increased. A win all around.
    Perl 5.10, by the way, gives us one more advantage: moving from it to 5.16 is easy compared with the painful move from 5.8.8. So expect a newer Perl in our Mail.
    If you have any questions, I suggest discussing them in the comments.

    Ilya Zaretsky,
    head of the mail development backend group
