How to recover a deleted config, or Never give up!


    Sysadmins are divided into those who do not make backups and those who already do =)

    More than one article has been written about recovering files from ext3 / ufs, so I won't repeat them; instead I'll write about some lesser-known ways of restoring configs on a production server.


    How did this happen?


    An evening call from an old friend who now works at a web studio. On the other end of the line: complete panic and confusion.
    - !"№;%: AAAA! Everything is down, nothing works. I'm done for, save me!

    After fifteen minutes of calming him down and finding out what happened, the following became clear:
    • Their studio not only builds sites but also hosts them.
    • The nginx config is generated by a script that pulls location and rewrite rules from a service MySQL database.
    • The database lives on good servers with RAID-1 and master-slave replication.
    • Backups are not made, because "the chance that both disks on both servers will die is zero" (c) the studio's system administrator.


    About backups


    What is true is true. Indeed, four disks cannot die at the same time (possible, but statistically unlikely © "Charlie" Eppes, Numb3rs), but for some reason people do not consider that rm -rf /* run on RAID-1 wipes the data on both disks, and they forget that DROP TABLE is replicated from one server to the other. It also rarely occurs to anyone that one day the office may burn down in a fire, drown in a flood, collapse in an earthquake, or leave together with the economic crimes police. In general, few people make off-site backups... And that is a pity: at least once a month you can dump everything into a password-protected .rar on a USB flash drive and take it home, even by hand, without much hassle.

    Neither ZFS snapshots, nor RAID, nor replication are a replacement for backups. All of them reduce the chance of losing data, and it is very good that they exist, but there should always be off-site backups!
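
    As a rough illustration of the "dump everything and take it away" routine, here is a minimal sketch. The function name make_backup and the paths in the usage note are hypothetical, and it uses a plain tar.gz instead of the password-protected .rar mentioned above, so encryption would still have to be added on top.

```shell
#!/bin/sh
# Hypothetical helper (not from the original setup): pack a directory into
# a dated archive and check that the archive can be read back.
# Usage: make_backup SRC_DIR DEST_DIR
make_backup() {
    src=$1
    dest=$2
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$dest/backup-$stamp.tar.gz"

    mkdir -p "$dest" || return 1
    # -C switches to the parent directory so the archive holds relative paths
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    # Listing the archive is a cheap sanity check that it is readable
    tar -tzf "$archive" >/dev/null || return 1
    # Print the archive path so the caller can copy it off-site
    echo "$archive"
}
```

    Something like make_backup /usr/local/etc/nginx /mnt/usb (both paths are assumptions) would then print the name of the archive to carry away.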

    Getting to the point


    By Murphy's Law, whatever can go wrong will go wrong. So on that ill-fated evening, an error in an UPDATE SQL query filled the service table from which the nginx config is generated with '' (empty strings), and an error in the generator script then overwrote nginx.conf with an empty file. Fortunately, nginx is a smart thing and checks a config for correctness before reloading it, so it refused to apply the new one.
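
    The studio's generator script is not shown, so the following is only a sketch of how such a script could avoid clobbering a working file: write the new config to a temporary file first, refuse to install it if it is empty, and keep the previous version. The function name install_config and the .prev suffix are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: install a freshly generated config only if it is
# non-empty, keeping a copy of the previous version next to the target.
# Usage: install_config NEW_FILE TARGET
install_config() {
    new=$1
    target=$2
    # -s is true only if the file exists and has a size greater than zero
    if [ ! -s "$new" ]; then
        echo "refusing to install empty config: $new" >&2
        return 1
    fi
    # Preserve the currently working config before replacing it
    if [ -f "$target" ]; then
        cp -p "$target" "$target.prev" || return 1
    fi
    mv "$new" "$target"
}
```

    After a successful install one would still run nginx -t and only then nginx -s reload, so that nginx itself gets the final say on correctness.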

    How to recover the overwritten config?


    My old friend gave me access to the frontend with nginx.
    Everything is ordinary here: a machine on FreeBSD, gmirror on two disks and nginx, nothing more.
    First I stopped gmirror so that my changes would not overwrite the files on the second disk. Then I started thinking about how to recover the killed file from the disk, but then I looked at the server uptime, remembered my friend saying that the config changes quite rarely, and decided to try another method.

    I looked at how much swap we have:

    # swapinfo
    Device 1K-blocks Used Avail Capacity
    /dev/ad4s1b 2063152 94612 1968540 5%

    The fact that it is currently only 5% used does not mean that it holds only 5% worth of information; most likely there is much more, since freed swap pages are not normally wiped =) Let's save its current state:

    # cat /dev/ad4s1b > /usr/SWAP

    Now, knowing some line from the config, we can try to grep for it. Since most people tune both FreeBSD and nginx "the Sysoev way", the config most likely contains the line "reset_timedout_connection on", so let's test my luck and try to grep for it:

    # grep -a -A10 reset_timedout_connection /usr/SWAP
    �Lj�Lj��Lj��Lj�Lj�Lj$�Lj0�Lj8�Lj<�LjX�Lj\�Ljd�Ljp�Lj��Lj��Lj��Lj��Lj��Lj�Lj��Lj��Lj��Lj���Lj8�LjP�Ljp�Lj��Lj��Lj��Lj�Lj��LjX�Lj
    ��Lj�Lj�m [Ȉh�LjxȈҰLj@������� �.�`�`���0u�0u2�d�d�Lj�Ȉ<4�@TȈ��Ȉ
    --
    reset_timedout_connection on;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    send_lowat 12000;

    keepalive_timeout 65;

    gzip on;
    gzip_min_length 2048;
    gzip_types text/css text/js text/xml;
    ^C

    And voila, a piece of the config. All that remains is to play with the -A and -B values, grep out the whole config and pick the newest intact copy (there may be several of them in the swap):

    # grep -a -A400 -B12 "reset_timedout_connection on;" /usr/SWAP

    The whole config is in our hands. It looks intact and up to date. Now, having parsed it, we can restore the MySQL table. This method is not a panacea or a silver bullet: that it worked in my case is the exception rather than the rule, but maybe it will help some of you recover important data someday.
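
    When the dump holds several copies of the config, comparing them is easier if each copy sits in its own file. Below is a minimal sketch of that idea; it is not what was done on the server. The name carve_matches and the candidate.N file names are hypothetical, and it assumes a grep whose -b combined with -o reports the byte offset of the match itself (GNU grep documents this; check the man page on your system).

```shell
#!/bin/sh
# Hypothetical helper: for every occurrence of PATTERN in DUMP, save a window
# of BEFORE bytes ahead of the match and AFTER bytes from its start
# into OUTDIR/candidate.N, then print how many windows were written.
# Usage: carve_matches DUMP PATTERN BEFORE AFTER OUTDIR
carve_matches() {
    dump=$1 pattern=$2 before=$3 after=$4 outdir=$5
    mkdir -p "$outdir" || return 1
    n=0
    # -a: treat binary as text, -b: print byte offset, -o: only the match,
    # -F: fixed string; LC_ALL=C avoids multibyte decoding of random bytes
    for off in $(LC_ALL=C grep -a -b -o -F -e "$pattern" "$dump" | cut -d: -f1); do
        start=$((off - before))
        if [ "$start" -lt 0 ]; then
            start=0
        fi
        len=$((off - start + after))
        n=$((n + 1))
        # bs=1 is slow but keeps skip/count in plain bytes
        dd if="$dump" of="$outdir/candidate.$n" bs=1 skip="$start" count="$len" 2>/dev/null
    done
    echo "$n"
}
```

    For example, carve_matches /usr/SWAP "reset_timedout_connection on;" 2048 16384 /usr/candidates would leave one file per hit (the window sizes and the output directory are assumptions to be tuned by hand, just like -A and -B).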





    If there is no swap, and the file cannot be restored from the disk


    There is also a second, less preferable option for recovering information if the nginx process is still running on the server.

    First we find the running nginx master process:

    # ps -auxww | grep nginx
    root 1197 0,0 0,1 13216 2488 ?? Is ср18 0:00,02 nginx: master process /usr/local/sbin/nginx
    www 29484 0,0 2,3 57248 47576 ?? I 7:58 0:00,06 nginx: worker process (nginx)

    then we take a core dump of it:

    # gcore 1197

    and then pick through the dump however we like, either like this:

    # strings core.1197 | grep -B10 -A10 reset_timedout_connection

    or like this:

    # grep -a -B10 -A10 reset_timedout_connection core.1197

    And we are horrified at how difficult it is to assemble the config piece by piece.
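
    One way to make the pieces easier to survey is to throw away everything that does not look like a config line. The sketch below is only a rough filter under that assumption: the name config_like_lines is hypothetical, the pattern keeps lines shaped like "word arguments;" or "word arguments {" plus lone closing braces, and it will miss directives without arguments and pass through unrelated strings of the same shape.

```shell
#!/bin/sh
# Hypothetical helper: print the unique lines of a binary file that look
# like nginx directives, in the order they were first seen.
# Usage: config_like_lines FILE
config_like_lines() {
    # Turn every non-printable byte into a newline, roughly what strings(1) does
    LC_ALL=C tr -c '[:print:]\t\n' '\n' < "$1" |
        # Keep "name args;" / "name args {" lines and lone closing braces
        LC_ALL=C grep -E '^[[:space:]]*([a-z_]+[[:space:]].*[;{]|[}])[[:space:]]*$' |
        # Drop exact duplicates while preserving first-seen order
        awk '!seen[$0]++'
}
```

    Running config_like_lines core.1197 gives a much shorter list to assemble from, although the original order of blocks still has to be reconstructed by hand.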


    Conclusion


    People, do not be your own worst enemy: make frequent, well-protected, automatic backups. And remember that even the deepest ass has at least two exits %)

    Instead of an afterword


    The MySQL database was eventually restored. The administrator, without knowing it, had enabled --log-bin from the very beginning of the database's life (by the way, by the time I started restoring the database, the binlogs already occupied 89% of /var, and in a couple of months mysql would have stopped working). Since nobody had ever deleted them, it was possible to do a Point-in-Time Recovery.

    PS. It would be nice if nginx could, on request, output its current running config, or a diff between it and what lies in the file on disk =)
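
    The exact commands used for the recovery are not given here, so the following is just a sketch of what replaying binary logs up to a moment before the bad UPDATE can look like with mysqlbinlog and its --stop-datetime option. The function name replay_binlogs, the RUN switch, the timestamp and the file names in the usage note are all hypothetical, and it assumes the mysql client picks up credentials from its usual configuration.

```shell
#!/bin/sh
# Hypothetical helper: replay binary logs up to STOP_DATETIME.
# By default it only prints the pipeline; set RUN=1 to actually execute it.
# Usage: replay_binlogs "YYYY-MM-DD HH:MM:SS" BINLOG...
replay_binlogs() {
    stop=$1
    shift
    if [ "${RUN:-0}" = "1" ]; then
        # Events at or after the stop time are not replayed
        mysqlbinlog --stop-datetime="$stop" "$@" | mysql
    else
        echo "mysqlbinlog --stop-datetime=\"$stop\" $* | mysql"
    fi
}
```

    For instance, replay_binlogs "2010-01-01 18:00:00" mysql-bin.000001 mysql-bin.000002 prints the pipeline for review. Replaying from the very first log into an empty server works here only because the logs go back to the beginning of the database's life.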
