Backing up a large number of heterogeneous web projects

    It would seem the topic has been done to death: so much has been said and written about backups that there is no wheel left to reinvent, just take one and use it. Yet every time a system administrator of a web project faces the task of setting up backups, the same questions hang in the air. How should backup data be collected? Where should backups be stored? How do you provide the required retention depth? How do you unify the backup process across a whole zoo of different software?



    We first solved this problem for ourselves back in 2011: we sat down and wrote our own backup scripts. For many years we used nothing else, and they reliably collected and synchronized backups of our clients' web projects. Backups were stored either in our own or in some other external storage, with the ability to tune the process for a specific project.


    To be fair, these scripts served their time well. But the more we grew, the more varied the projects became, with software and external storage that our scripts did not support. For example, there was no support for Redis or for MySQL and PostgreSQL “hot” backups, which only appeared later. The backup process was not monitored; there were only email alerts.


    Another problem was maintenance. Over the years our once compact scripts grew into a huge, unwieldy monster. And whenever we gathered our strength and released a new version, rolling out the update to customers who ran the previous version with some kind of customization was a real ordeal.


    So at the beginning of this year we made a firm decision: to replace our old backup scripts with something more modern. We started by writing down everything we wanted from a new solution. It came out roughly as follows:


    • Back up the most frequently used software:
      • Files (discrete and incremental backups)
      • MySQL (cold / hot backups)
      • PostgreSQL (cold / hot backups)
      • MongoDB
      • Redis
    • Store backups in popular repositories:
      • Local
      • FTP
      • SSH
      • SMB
      • NFS
      • WebDAV
      • S3
    • Receive alerts about any problems during the backup process
    • Have a single configuration file that allows backups to be managed centrally
    • Add support for new software by connecting external modules
    • Specify extra options for collecting dumps
    • Be able to restore backups using standard tools
    • Be easy to set up initially

    Analyzing existing solutions


    We looked at open-source solutions that already exist:


    • Bacula and its forks, for example Bareos
    • Amanda
    • Borg
    • Duplicati
    • Duplicity
    • Rsnapshot
    • Rdiff-backup

    But each of them had its drawbacks. For example, Bacula is overloaded with features we do not need, its initial configuration is quite time-consuming due to the large amount of manual work (for example, writing or hunting down database backup scripts), restoring copies requires special utilities, and so on.


    In the end, we came to two important conclusions:


    1. None of the existing solutions fully suited us;
    2. We apparently had enough experience (and enough madness) to write our own solution.

    So we did.


    The birth of nxs-backup


    We chose Python as the implementation language: it is easy to write and maintain, flexible and convenient. Configuration files are described in YAML format.


    To make it easier to maintain the system and add backups for new software, we chose a modular architecture in which the backup process for each specific piece of software (for example, MySQL) is described in a separate module.


    Support for files, databases and remote repositories


    Currently, the following types of databases, file backups, and remote repositories are supported:


    Databases:


    • MySQL (hot / cold backups)
    • PostgreSQL (hot / cold backups)
    • Redis
    • MongoDB

    Files:


    • Discrete copying
    • Incremental backups

    Remote repositories:


    • Local
    • S3
    • SMB
    • NFS
    • FTP
    • SSH
    • WebDAV

    Discrete backup


    Discrete and incremental backups each suit different tasks, so we implemented both types. You can choose which method to use at the level of individual files and directories.


    For discrete copies (of both files and databases), you can set the retention depth in the days / weeks / months format.


    Incremental backup


    Incremental copies of files are made as follows:


    A full backup is taken at the beginning of the year. Then, at the beginning of each month, an incremental monthly copy is taken relative to the yearly one. Within each month, incremental ten-day (decade) copies are taken relative to the monthly one, and within each decade, incremental daily copies are taken relative to the decade one.
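

    To make the hierarchy more tangible, here is how such a multi-level scheme can be reproduced with plain GNU tar and its --listed-incremental (-g) snapshot files. This is only an illustration of the general mechanism, not nxs-backup's exact invocation; the paths and archive names below are made up:


    example commands
    # yearly full copy; year.snar records the state of the file tree
    tar -czf year.tar.gz -g year.snar /var/www/site/upload
    # monthly copy relative to the yearly one (work on a copy of the snapshot file)
    cp year.snar month.snar
    tar -czf month_07.tar.gz -g month.snar /var/www/site/upload
    # decade copy relative to the monthly one
    cp month.snar decade.snar
    tar -czf decade_21.tar.gz -g decade.snar /var/www/site/upload
    # daily copy relative to the decade one
    cp decade.snar day.snar
    tar -czf day_24.tar.gz -g day.snar /var/www/site/upload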


    It is worth mentioning that for now there are problems with directories containing a very large number of subdirectories (tens of thousands): collecting a copy slows down significantly and can take more than a day. We are actively working on eliminating this defect.


    Recovering from incremental backups


    Restoring from discrete backups is trivial: take the copy for the desired date and unpack it with the usual console tar. Incremental backups are a bit more complicated. To restore the state as of, for example, July 24, 2018, you need to do the following:


    1. Unpack the yearly backup; in our case it is counted from January 1, 2018 (in practice it can be any date, depending on when incremental backups were first enabled)
    2. Apply the monthly backup for July on top of it
    3. Apply the decade backup for July 21
    4. Apply the daily backup for July 24

    For steps 2-4 you need to add the -G switch to the tar command, indicating that this is an incremental archive. This is not the fastest process, of course, but considering that restoring from backups is needed relatively rarely while the space savings matter, the scheme turns out to be quite effective.
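

    Using the July 24, 2018 example above, the whole sequence could look as follows (the archive names here are hypothetical; the real ones depend on your configuration and on whether gzip is enabled):


    example commands
    tar -xzf year.tar.gz            # 1. unpack the yearly (full) copy
    tar -xzf month_07.tar.gz -G     # 2. apply the monthly copy for July
    tar -xzf decade_21.tar.gz -G    # 3. apply the decade copy for July 21
    tar -xzf day_24.tar.gz -G       # 4. apply the daily copy for July 24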


    Exclusions


    You often need to exclude individual files or directories from backups, for example cache directories. This is done by specifying the appropriate exclusion rules:


    sample configuration file
        - target:
          - /var/www/*/data/
          excludes:
          - exclude1/exclude_file
          - exclude2
          - /var/www/exclude_3

    Backup rotation


    In our old scripts, rotation was implemented so that the old copy was deleted only after the new one had been collected successfully. This caused problems on projects where the space allocated for backups was enough for exactly one copy: a fresh copy simply could not fit due to lack of space.


    In the new implementation we changed this approach: first delete the old copy and only then collect the new one. The backup process itself is now monitored, so we find out about any problems as they occur.


    For discrete backups, an “old” copy is any archive that falls outside the configured days / weeks / months retention scheme. For incremental backups, copies are kept for a year by default and old ones are deleted at the beginning of each month, where “old” means the archives for the same month of the previous year. For example, before collecting the monthly backup on August 1, 2018, the system checks whether there are any backups for August 2017 and, if so, deletes them. This makes optimal use of disk space.


    Logging


    In any process, and in backups especially, it is important to keep a finger on the pulse and be able to find out when something goes wrong. The system keeps a log of its work and records the result of each step: the start and stop of the tool itself, the start and end of each specific job, the result of collecting a copy into the temporary directory, the result of copying or moving the copy from the temporary directory to its permanent location, the result of backup rotation, and so on.


    Events are divided into two levels:


    • Info: informational level - everything is normal, the stage completed successfully, and a corresponding informational record is written to the log
    • Error: error level - something went wrong, the stage ended abnormally, and a corresponding error record is written to the log

    E-mail notifications


    At the end of the backup collection, the system can send out email notifications.


    Two recipient lists are supported:


    • Administrators - those who maintain the server (the admin_mail key in the main configuration file below). They receive only error notifications and are not interested in reports about successful operations.
    • Business users - in our case, the customers (the client_mail key), who sometimes want notifications just to make sure their backups are fine. Or, conversely, that they are not. They can choose whether to receive the full log or only the errors.

    Configuration File Structure


    The structure of the configuration files is as follows:


    structure example
    /etc/nxs-backup
    ├── conf.d
    │   ├── desc_files_local.conf
    │   ├── external_clickhouse_local.conf
    │   ├── inc_files_smb.conf
    │   ├── mongodb_nfs.conf
    │   ├── mysql_s3.conf
    │   ├── mysql_xtradb_scp.conf
    │   ├── postgresql_ftp.conf
    │   ├── postgresql_hot_webdav.conf
    │   └── redis_local_ftp.conf
    └── nxs-backup.conf

    Here /etc/nxs-backup/nxs-backup.conf is the main configuration file, which specifies the global settings:


    configuration file
    main:
      server_name: SERVER_NAME
      admin_mail: project-tech@nixys.ru
      client_mail:
      - ''
      mail_from: backup@domain.ru
      level_message: error
      block_io_read: ''
      block_io_write: ''
      blkio_weight: ''
      general_path_to_all_tmp_dir: /var/nxs-backup
      cpu_shares: ''
      log_file_name: /var/log/nxs-backup/nxs-backup.log
    jobs: !include [conf.d/*.conf]

    The jobs array contains the list of tasks (jobs), each describing exactly what to back up, where to store it, and how many copies to keep. As a rule, jobs are kept in separate files (one file per job) and pulled into the main configuration file via include.


    We also took care to streamline the preparation of these files as much as possible and wrote a simple generator. Instead of hunting for a config template for some service, say MySQL, the administrator can simply run the command:


    nxs-backup generate --storage local scp --type mysql --path /etc/nxs-backup/conf.d/mysql_local_scp.conf

    The output is the file /etc/nxs-backup/conf.d/mysql_local_scp.conf:


    File contents
      - job: PROJECT-mysql
        type: mysql
        tmp_dir: /var/nxs-backup/databases/mysql/dump_tmp
        sources:
        - connect:
            db_host: ''
            db_port: ''
            socket: ''
            db_user: ''
            db_password: ''
            auth_file: ''
          target:
          - all
          excludes:
          - information_schema
          - performance_schema
          - mysql
          - sys
          gzip: no
          is_slave: no
          extra_keys: '--opt --add-drop-database --routines --comments --create-options --quote-names --order-by-primary --hex-blob'
        storages:
        - storage: local
          enable: yes
          backup_dir: /var/nxs-backup/databases/mysql/dump
          store:
            days: ''
            weeks: ''
            month: ''
        - storage: scp
          enable: yes
          backup_dir: /var/nxs-backup/databases/mysql/dump
          user: ''
          host: ''
          port: ''
          password: ''
          path_to_key: ''
          store:
            days: ''
            weeks: ''
            month: ''

    All that remains is to substitute the few required values.


    Let us consider an example. Suppose the /var/www directory on the server hosts two online-store platforms built on 1C-Bitrix (bitrix-1.ru and bitrix-2.ru), each working with its own database in a separate MySQL instance (port 3306 for bitrix_1_db and port 3307 for bitrix_2_db).


    The file structure of a typical Bitrix project is approximately as follows:


    ├── ...
    ├── bitrix
    │   ├── ..
    │   ├── admin
    │   ├── backup
    │   ├── cache
    │   ├── ..
    │   ├── managed_cache
    │   ├── ..
    │   ├── stack_cache
    │   └── ..
    ├── upload
    └── ...

    As a rule, the upload directory is heavy and only grows over time, so we back it up incrementally. All other directories are backed up discretely, except for cache directories and the backups collected by Bitrix's native tools. Let the storage scheme for both sites be the same, with file copies stored both locally and in remote FTP storage, and the databases stored only in remote SMB storage.


    The final configuration files for this setup will look like this:


    bitrix-desc-files.conf (configuration file with job description for discrete backup)
      - job: Bitrix-desc-files
        type: desc_files
        tmp_dir: /var/nxs-backup/files/desc/dump_tmp
        sources:
        - target:
          - /var/www/*/
          excludes:
          - bitrix/backup
          - bitrix/cache
          - bitrix/managed_cache
          - bitrix/stack_cache
          - upload
          gzip: yes
        storages:
        - storage: local
          enable: yes
          backup_dir: /var/nxs-backup/files/desc/dump
          store:
            days: 6
            weeks: 4
            month: 6
        - storage: ftp
          enable: yes
          backup_dir: /nxs-backup/files/desc/dump
          host: ftp_host
          user: ftp_usr
          password: ftp_usr_pass
          store:
            days: 6
            weeks: 4
            month: 6

    bitrix-inc-files.conf (configuration file with job description for incremental backups)
      - job: Bitrix-inc-files
        type: inc_files
        sources:
        - target:
          - /var/www/*/upload/
          gzip: yes
        storages:
        - storage: ftp
          enable: yes
          backup_dir: /nxs-backup/files/inc
          host: ftp_host
          user: ftp_usr
          password: ftp_usr_pass
        - storage: local
          enable: yes
          backup_dir: /var/nxs-backup/files/inc

    bitrix-mysql.conf (configuration file with job description for MySQL backups)
      - job: Bitrix-mysql
        type: mysql
        tmp_dir: /var/nxs-backup/databases/mysql/dump_tmp
        sources:
        - connect:
            db_host: localhost
            db_port: 3306
            db_user: bitrix_usr_1
            db_password: password_1
          target:
          - bitrix_1_db
          excludes:
          - information_schema
          - performance_schema
          - mysql
          - sys
          gzip: no
          is_slave: no
          extra_keys: '--opt --add-drop-database --routines --comments --create-options --quote-names --order-by-primary --hex-blob'
        - connect:
            db_host: localhost
            db_port: 3307
            db_user: bitrix_usr_2
            db_password: password_2
          target:
          - bitrix_2_db
          excludes:
          - information_schema
          - performance_schema
          - mysql
          - sys
          gzip: yes
          is_slave: no
          extra_keys: '--opt --add-drop-database --routines --comments --create-options --quote-names --order-by-primary --hex-blob'
        storages:
        - storage: smb
          enable: yes
          backup_dir: /nxs-backup/databases/mysql/dump
          host: smb_host
          port: smb_port
          share: smb_share_name
          user: smb_usr
          password: smb_usr_pass
          store:
            days: 6
            weeks: 4
            month: 6

    Parameters for running backup collection


    In the previous example we prepared job configuration files covering everything at once: files (discrete and incremental), two databases, and their storage both locally and in external (FTP, SMB) storage.


    All that remains is to run it. The backup is started with the command:


    nxs-backup start $JOB_NAME -c $PATH_TO_MAIN_CONFIG

    There are several reserved job names:


    • files - runs all jobs of the desc_files and inc_files types (that is, backs up files only)
    • databases - runs all jobs of the mysql, mysql_xtradb, postgresql, postgresql_hot, mongodb, redis types (that is, backs up databases only)
    • external - runs all jobs of the external type (that is, runs only additional user scripts, more on this below)
    • all - equivalent to running the command in turn with the jobs files, databases and external (the default value)

    Since we want the file and database backups to reflect the same point in time (or as close to it as possible), it is recommended to run nxs-backup with the job all, which ensures sequential execution of the jobs described above (Bitrix-desc-files, Bitrix-inc-files, Bitrix-mysql).


    An important point: backups are collected not in parallel but sequentially, one after another, with a minimal time difference between them. In addition, at each launch the software checks whether another instance is already running and, if one is found, exits with a corresponding note in the log. This approach significantly reduces the load on the system. The downside is that the individual items are backed up not at exactly the same moment but with some time offset; so far, our practice shows this is not critical.
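

    A typical way to schedule this is a nightly cron entry. The time below is only an example, the config path matches the main configuration file described above, and we assume the nxs-backup binary is available in cron's PATH:


    example cron entry
    # /etc/cron.d/nxs-backup: run all jobs every night at 03:00
    0 3 * * * root nxs-backup start all -c /etc/nxs-backup/nxs-backup.conf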


    External modules


    As mentioned above, thanks to the modular architecture the system's capabilities can be extended with additional user modules that interact with it through a special interface. The goal is to be able to add backup support for new software in the future without rewriting nxs-backup.


    sample configuration file
      - job: TEST-external
        type: external
        dump_cmd: ''
        storages:
        ….

    Particular attention should be paid to the dump_cmd key, whose value is the full command for running the external script. When this command completes, it is expected that:


    • A complete archive of the software's data has been assembled.
    • Metadata has been written to stdout in JSON format, for example:
      {
      "full_path": "ABS_PATH_TO_ARCHIVE",
      "basename": "BASENAME_ARCHIVE",
      "extension": "EXTENSION_OF_ARCHIVE",
      "gzip": true/false
      }

      • The basename, extension and gzip keys are used solely to form the final name of the backup.
    • The return code is 0 on success and any non-zero value if there were problems.

    For example, suppose we have a script /etc/nxs-backup-ext/etcd.py for creating an etcd snapshot:


    script code
    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-

    import json
    import os
    import subprocess
    import sys
    import tarfile


    def archive(snapshot_path):
        # Pack the snapshot into a tar archive and remove the original file
        abs_tmp_path = '%s.tar' % (snapshot_path)
        with tarfile.open(abs_tmp_path, 'w:') as tar:
            tar.add(snapshot_path)
            os.unlink(snapshot_path)
        return abs_tmp_path


    def exec_cmd(cmdline):
        # Run a shell command and return its stdout, stderr and exit code
        data_dict = {}
        current_process = subprocess.Popen([cmdline], stdout=subprocess.PIPE,
                                           stderr=subprocess.PIPE, shell=True,
                                           executable='/bin/bash')
        data = current_process.communicate()
        data_dict['stdout'] = data[0][0:-1].decode('utf-8')
        data_dict['stderr'] = data[1][0:-1].decode('utf-8')
        data_dict['code'] = current_process.returncode
        return data_dict


    def main():
        snapshot_path = "/var/backups/snapshot.db"
        dump_cmd = "ETCDCTL_API=3 etcdctl --cacert=/etc/ssl/etcd/ssl/ca.pem --cert=/etc/ssl/etcd/ssl/member-node1.pem" + \
                   " --key=/etc/ssl/etcd/ssl/member-node1-key.pem --endpoints 'https://127.0.0.1:2379' snapshot save %s" % snapshot_path
        command = exec_cmd(dump_cmd)
        result_code = command['code']
        if result_code:
            sys.stderr.write(command['stderr'])
        else:
            try:
                new_path = archive(snapshot_path)
            except tarfile.TarError:
                sys.exit(1)
            else:
                # Report the archive metadata to nxs-backup via stdout
                result_dict = {
                        "full_path": new_path,
                        "basename": "etcd",
                        "extension": "tar",
                        "gzip": False
                    }
                print(json.dumps(result_dict))
        sys.exit(result_code)


    if __name__ == '__main__':
        main()

    The config for running this script is as follows:


    configuration file
      - job: etcd-external
        type: external
        dump_cmd: '/etc/nxs-backup-ext/etcd.py'
        storages:
        - storage: local
          enable: yes
          backup_dir: /var/nxs-backup/external/dump
          store:
            days: 6
            weeks: 4
            month: 6

    When running the etcd-external job, the program will:


    • Run the /etc/nxs-backup-ext/etcd.py script with no parameters
    • After the script completes, check its exit code and whether the required data is present in stdout
    • If all checks pass, proceed with the same mechanism used by the built-in modules, treating the value of the full_path key as the tmp_path; otherwise, finish the task with a corresponding note in the log (you can verify this contract manually, as shown below)
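

    For example, a manual check of the contract might look like this; the script path and the expected output follow the etcd example above:


    example check
    /etc/nxs-backup-ext/etcd.py
    # expected stdout:
    # {"full_path": "/var/backups/snapshot.db.tar", "basename": "etcd", "extension": "tar", "gzip": false}
    echo $?
    # expected: 0 on success, non-zero otherwise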

    Support and updates


    The development and maintenance of the new backup system follows all the CI/CD canons. No more updating and editing scripts directly on production servers. All changes go through our central Git repository in GitLab, where the pipeline builds new versions of the deb/rpm packages and uploads them to our deb/rpm repositories; from there they are delivered to the clients' destination servers via the package manager.


    How to download nxs-backup?


    We made nxs-backup an open-source project. Anyone can download it and use it to organize the backup process in their projects, modify it to fit their needs, and write external modules.


    The source code of nxs-backup can be downloaded from the GitHub repository via this link. There is also a guide for installation and configuration.


    We have also prepared a Docker image and uploaded it to Docker Hub.


    If any questions come up while setting it up or using it, please contact us. We will help you figure things out and refine the documentation.


    Conclusion


    In the near future we plan to implement the following functionality:


    • Integration with monitoring
    • Backup encryption
    • A web interface for managing backup settings
    • Restoring backups using nxs-backup itself
    • And much more
