Stateful backups in Kubernetes

    What is this about? The topic of the meetup was “Backups in Kubernetes”.

    Most likely, on hearing this title, many will say: “Why back up Kubernetes? It doesn't need backups, it's stateless.”

    Introduction

    Let's start with a little background: why this topic needed to be raised and why it matters.

    In 2016 we got acquainted with Kubernetes and began actively using it in our projects. These are mainly projects with a microservice architecture, which in turn means running a large amount of diverse software.

    On the very first project where we used Kubernetes, the question arose of how to back up the stateful services that, for one reason or another, sometimes end up in k8s.

    We began to study existing practices for solving this problem and asked our colleagues and peers: “How is this process organized on your side?”

    After these conversations we realized that everyone does it differently, with different tools and a large number of workarounds. We also did not find a single consistent approach even within one project.

    Why is this so important? Since our company maintains projects built on k8s, we needed to develop a structured methodology for solving this problem.

    Imagine you are working with one specific project in Kubernetes. It contains some stateful services and you need to back up their data. In principle, you can get by with a couple of workarounds and forget about it. But what if you already have two projects on k8s, and the second one uses completely different services? What if there are five projects? Ten? More than twenty?

    Of course, piling up further workarounds at that scale is difficult and inconvenient. We need a unified approach that works across many projects in Kubernetes and lets the engineering team change how those projects are backed up easily and in just a few minutes.

    In this article we describe the tool and the practice we use to solve this problem within our company.

    What do we use for this?

    What is nxs-backup?

    For backups we use our own open source tool, nxs-backup. We will not go into the details of what it can do. More information about it can be found at the following link.

    Now let's turn to the actual implementation of backups in k8s: what exactly we did and how.

    What do we back up?

    Let's take the backup of our own Redmine as an example. We will back up its MySQL database and the user project files.

    How do we do it?

    1 CronJob == 1 Service

    On ordinary servers and bare-metal clusters, almost all backup tools are run through plain cron. In k8s we use CronJobs for this purpose: we create one CronJob per service to be backed up. All these CronJobs live in the same namespace as the service itself.

    Let's start with the MySQL database. To back up MySQL, as with almost any other service, we need four elements:

    • ConfigMap (nxs-backup.conf)
    • ConfigMap (mysql.conf for nxs-backup)
    • Secret (stores the credentials for the service, in this case MySQL). Usually this element is already defined for the service itself and can be reused.
    • CronJob (a separate one for each service)

    Let's go in order.


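    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: nxs-backup-conf
    data:
      nxs-backup.conf: |-
        # The parent keys `main:` and `client_mail:` were lost in the source
        # listing and are restored here as assumptions; the other mail
        # settings are missing from the listing as well.
        main:
          server_name: Nixys k8s cluster
          client_mail:
          - ''
          level_message: error
          block_io_read: ''
          block_io_write: ''
          blkio_weight: ''
          general_path_to_all_tmp_dir: /var/nxs-backup
          cpu_shares: ''
          log_file: /dev/stdout
        jobs: !include [conf.d/*.conf]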

    Here we set the basic parameters that our tool needs in order to run: the server name, the e-mail addresses for notifications, limits on resource consumption and other settings.

    Configuration files can be written in j2 format, which allows environment variables to be used in them.


    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: mysql-conf
    data:
      service.conf.j2: |-
        # `target:` and `store:` are named in the text; `sources:` and
        # `storages:` were lost in the listing and are restored as assumptions.
        - job: mysql
          type: mysql
          tmp_dir: /var/nxs-backup/databases/mysql/dump_tmp
          sources:
          - connect:
              db_host: {{ db_host }}
              db_port: {{ db_port }}
              socket: ''
              db_user: {{ db_user }}
              db_password: {{ db_password }}
            target:
            - redmine_db
            gzip: yes
            is_slave: no
            extra_keys: '--opt --add-drop-database --routines --comments --create-options --quote-names --order-by-primary --hex-blob'
          storages:
          - storage: local
            enable: yes
            backup_dir: /var/nxs-backup/databases/mysql/dump
            store:
              days: 6
              weeks: 4
              month: 6

    This file describes the backup logic for the corresponding service, in our case MySQL.

    Here you can specify:

    • The name of the job (field: job)
    • The job type (field: type)
    • The temporary directory used while collecting backups (field: tmp_dir)
    • MySQL connection parameters (field: connect)
    • The database to be backed up (field: target)
    • Whether the slave must be stopped before the dump is taken (field: is_slave)
    • Additional keys for mysqldump (field: extra_keys)
    • The storage type, i.e. where the copy will be kept (field: storage)
    • The directory where the copies are stored (field: backup_dir)
    • The retention scheme (field: store)

    In our example the storage type is local, that is, backup copies are collected and stored locally in a directory of the pod being launched.

    By the way, by analogy with this configuration file you can write the same kind of file for Redis, PostgreSQL or any other service, provided our tool supports it. The list of supported services can be found at the link given earlier.

    Secret mysql

    apiVersion: v1
    kind: Secret
    metadata:
      name: app-config
    data:
      # Values under `data` are base64-encoded; they are left empty here.
      db_name: ""
      db_host: ""
      db_user: ""
      db_password: ""
      secret_token: ""
      smtp_address: ""
      smtp_domain: ""
      smtp_ssl: ""
      smtp_enable_starttls_auto: ""
      smtp_port: ""
      smtp_auth_type: ""
      smtp_login: ""
      smtp_password: ""

    The Secret holds the credentials for connecting to MySQL and to the mail server. They can be kept in a separate Secret, or an existing one can be reused if there is one. There is nothing remarkable here. Our Secret also holds the secret_token that our Redmine needs in order to run.

    MySQL CronJob

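    apiVersion: batch/v1beta1  # use batch/v1 on Kubernetes 1.21 and later
    kind: CronJob
    metadata:
      name: mysql
    spec:
      schedule: "00 00 * * *"
      jobTemplate:
        spec:
          template:
            spec:
              # Pin the job to the node reserved for backups.
              affinity:
                nodeAffinity:
                  requiredDuringSchedulingIgnoredDuringExecution:
                    nodeSelectorTerms:
                    - matchExpressions:
                      - key:  # the node label key is missing in the source
                        operator: In
                        values:
                        - nxs-node5
              containers:
              - name: mysql-backup
                image: nixyslab/nxs-backup:latest
                env:
                - name: DB_HOST
                  valueFrom:
                    secretKeyRef:
                      name: app-config
                      key: db_host
                - name: DB_PORT
                  value: '3306'
                - name: DB_USER
                  valueFrom:
                    secretKeyRef:
                      name: app-config
                      key: db_user
                - name: DB_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: app-config
                      key: db_password
                - name: SMTP_MAILHUB_ADDR
                  valueFrom:
                    secretKeyRef:
                      name: app-config
                      key: smtp_address
                - name: SMTP_MAILHUB_PORT
                  valueFrom:
                    secretKeyRef:
                      name: app-config
                      key: smtp_port
                - name: SMTP_USE_TLS
                  value: 'YES'
                - name: SMTP_AUTH_USER
                  valueFrom:
                    secretKeyRef:
                      name: app-config
                      key: smtp_login
                - name: SMTP_AUTH_PASS
                  valueFrom:
                    secretKeyRef:
                      name: app-config
                      key: smtp_password
                - name: SMTP_FROM_LINE_OVERRIDE
                  value: 'NO'
                volumeMounts:
                - name: mysql-conf
                  mountPath: /usr/share/nxs-backup/service.conf.j2
                  subPath: service.conf.j2
                - name: nxs-backup-conf
                  mountPath: /etc/nxs-backup/nxs-backup.conf
                  subPath: nxs-backup.conf
                - name: backup-dir
                  mountPath: /var/nxs-backup
                imagePullPolicy: Always
              volumes:
              - name: mysql-conf
                configMap:
                  name: mysql-conf
                  items:
                  - key: service.conf.j2
                    path: service.conf.j2
              - name: nxs-backup-conf
                configMap:
                  name: nxs-backup-conf
                  items:
                  - key: nxs-backup.conf
                    path: nxs-backup.conf
              # Backups are written to a directory on the host node.
              - name: backup-dir
                hostPath:
                  path: /var/backups/k8s
                  type: Directory
              restartPolicy: OnFailure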

    This is perhaps the most interesting element. First, to create a correct CronJob you have to decide where the collected backups will be stored.

    We have a dedicated server for this with the necessary amount of resources. In the example, a separate cluster node, nxs-node5, is reserved for collecting backups. The CronJob is restricted to the nodes we need with the nodeAffinity directive.
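    The nesting of the nodeAffinity directive is easy to get wrong, so here is a minimal Python sketch that builds the block as a plain dictionary. The helper name node_affinity is hypothetical, and the label key kubernetes.io/hostname is an assumption: the manifest above does not show which key is used.

```python
def node_affinity(label_key, node_values):
    """Return a pod-spec `affinity` block that requires one of the given nodes."""
    return {
        "nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [{
                    "matchExpressions": [{
                        "key": label_key,
                        "operator": "In",
                        "values": list(node_values),
                    }],
                }],
            },
        },
    }


# kubernetes.io/hostname is an assumed label key; the source does not show it.
affinity = node_affinity("kubernetes.io/hostname", ["nxs-node5"])
```

    The resulting dictionary goes under the affinity key of the pod spec, next to containers and volumes.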

    When the CronJob starts, the host directory used for storing backup copies is mounted into it via hostPath.

    Next, the ConfigMaps containing the nxs-backup configuration are attached to the CronJob, namely the files nxs-backup.conf and mysql.conf that we discussed above.

    Then all the necessary environment variables are set. They are either defined directly in the manifest or pulled from the Secret.

    So the variables are passed into the container and substituted into the ConfigMap templates in the right places. For MySQL these are db_host, db_user and db_password. The port is passed simply as a value in the CronJob manifest, since it is not sensitive information.
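    The substitution step can be sketched in a few lines of Python. This is only an illustration using the standard library, not the code the nxs-backup image actually runs; the rule that the placeholder db_host is filled from the variable DB_HOST is an assumption based on the names used in the manifests above, and render_template is a hypothetical helper.

```python
import os
import re

# Matches Jinja2-style placeholders such as "{{ db_host }}".
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(text, env=None):
    """Replace each {{ name }} with the value of the NAME environment variable."""
    env = os.environ if env is None else env

    def substitute(match):
        var = match.group(1).upper()  # db_host -> DB_HOST (assumed naming rule)
        if var not in env:
            raise KeyError(f"environment variable {var} is not set")
        return env[var]

    return PLACEHOLDER.sub(substitute, text)


template = "db_host: {{ db_host }}\ndb_port: {{ db_port }}\n"
rendered = render_template(
    template, {"DB_HOST": "mysql.redmine.svc", "DB_PORT": "3306"}
)
print(rendered)
```

    Failing loudly on a missing variable is a deliberate choice here: a backup job that silently renders an empty password is harder to debug than one that stops at startup.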

    With MySQL everything should now be clear. Let's see what is needed to back up the Redmine application files.


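    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: desc-files-conf
    data:
      service.conf.j2: |-
        # `sources:`, `storages:` and `store:` were lost in the listing
        # and are restored as assumptions.
        - job: desc-files
          type: desc_files
          tmp_dir: /var/nxs-backup/files/desc/dump_tmp
          sources:
          - target:
            - /var/www/files
            gzip: yes
          storages:
          - storage: local
            enable: yes
            backup_dir: /var/nxs-backup/files/desc/dump
            store:
              days: 6
              weeks: 4
              month: 6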

    This configuration file describes the backup logic for files. There is nothing unusual here either: the same parameters are set as for MySQL, except for credentials, since none are needed. Credentials do appear, however, when data transfer protocols such as ssh, ftp, webdav or s3 are involved. We will consider this option a little later.

    CronJob desc_files

    apiVersion: batch/v1beta1  # use batch/v1 on Kubernetes 1.21 and later
    kind: CronJob
    metadata:
      name: desc-files
    spec:
      schedule: "00 00 * * *"
      jobTemplate:
        spec:
          template:
            spec:
              # Pin the job to the node reserved for backups.
              affinity:
                nodeAffinity:
                  requiredDuringSchedulingIgnoredDuringExecution:
                    nodeSelectorTerms:
                    - matchExpressions:
                      - key:  # the node label key is missing in the source
                        operator: In
                        values:
                        - nxs-node5
              containers:
              - name: desc-files-backup
                image: nixyslab/nxs-backup:latest
                env:
                - name: SMTP_MAILHUB_ADDR
                  valueFrom:
                    secretKeyRef:
                      name: app-config
                      key: smtp_address
                - name: SMTP_MAILHUB_PORT
                  valueFrom:
                    secretKeyRef:
                      name: app-config
                      key: smtp_port
                - name: SMTP_USE_TLS
                  value: 'YES'
                - name: SMTP_AUTH_USER
                  valueFrom:
                    secretKeyRef:
                      name: app-config
                      key: smtp_login
                - name: SMTP_AUTH_PASS
                  valueFrom:
                    secretKeyRef:
                      name: app-config
                      key: smtp_password
                - name: SMTP_FROM_LINE_OVERRIDE
                  value: 'NO'
                volumeMounts:
                - name: desc-files-conf
                  mountPath: /usr/share/nxs-backup/service.conf.j2
                  subPath: service.conf.j2
                - name: nxs-backup-conf
                  mountPath: /etc/nxs-backup/nxs-backup.conf
                  subPath: nxs-backup.conf
                - name: target-dir
                  mountPath: /var/www/files
                - name: backup-dir
                  mountPath: /var/nxs-backup
                imagePullPolicy: Always
              volumes:
              - name: desc-files-conf
                configMap:
                  name: desc-files-conf
                  items:
                  - key: service.conf.j2
                    path: service.conf.j2
              - name: nxs-backup-conf
                configMap:
                  name: nxs-backup-conf
                  items:
                  - key: nxs-backup.conf
                    path: nxs-backup.conf
              - name: backup-dir
                hostPath:
                  path: /var/backups/k8s
                  type: Directory
              # The application files to back up come from the Redmine PVC.
              - name: target-dir
                persistentVolumeClaim:
                  claimName: redmine-app-files
              restartPolicy: OnFailure

    Again, nothing new compared with MySQL. The only difference is that one additional PV (target-dir) is mounted, the one we actually back up: /var/www/files. Otherwise the copies are still stored locally on the node that the CronJob is assigned to.


    For each service we want to back up, we create a separate CronJob with all the accompanying elements: ConfigMaps and Secrets. By analogy with the examples above, we can back up any similar service in the cluster.
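    Because the two CronJobs differ only in a few names, the pattern lends itself to templating. The sketch below is a hypothetical Python helper (backup_cronjob is not part of nxs-backup) that builds the common skeleton as a dictionary. The namespace redmine is an assumed name, and the env and affinity sections are left out to keep the example short.

```python
def backup_cronjob(service, namespace, schedule="00 00 * * *",
                   image="nixyslab/nxs-backup:latest"):
    """Build a CronJob manifest (as a dict) that backs up one service."""
    conf = f"{service}-conf"  # per-service ConfigMap, e.g. mysql-conf
    pod_spec = {
        "containers": [{
            "name": f"{service}-backup",
            "image": image,
            "volumeMounts": [
                {"name": conf,
                 "mountPath": "/usr/share/nxs-backup/service.conf.j2",
                 "subPath": "service.conf.j2"},
                {"name": "nxs-backup-conf",
                 "mountPath": "/etc/nxs-backup/nxs-backup.conf",
                 "subPath": "nxs-backup.conf"},
                {"name": "backup-dir", "mountPath": "/var/nxs-backup"},
            ],
        }],
        "volumes": [
            {"name": conf, "configMap": {"name": conf}},
            {"name": "nxs-backup-conf",
             "configMap": {"name": "nxs-backup-conf"}},
            {"name": "backup-dir",
             "hostPath": {"path": "/var/backups/k8s", "type": "Directory"}},
        ],
        "restartPolicy": "OnFailure",
    }
    return {
        # Same API version as in the article; newer clusters need batch/v1.
        "apiVersion": "batch/v1beta1",
        "kind": "CronJob",
        "metadata": {"name": service, "namespace": namespace},
        "spec": {
            "schedule": schedule,
            "jobTemplate": {"spec": {"template": {"spec": pod_spec}}},
        },
    }


# One CronJob per service, all in the namespace of the application itself.
manifests = [backup_cronjob(name, "redmine") for name in ("mysql", "desc-files")]
for manifest in manifests:
    print(manifest["kind"], manifest["metadata"]["name"])
```

    Service-specific parts, such as the target-dir volume for the files job, would be added on top of this skeleton.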

    These two examples should give a good idea of how exactly we back up stateful services in Kubernetes. There is little point in analyzing the same kind of example for other services in detail, because they all look alike and differ only in minor ways.

    This is what we wanted to achieve: a unified approach to building the backup process that can be applied to a large number of different k8s-based projects.

    Where is it stored?

    In all the examples above, copies are stored in a local directory of the node on which the container runs. But nothing prevents you from attaching a Persistent Volume as external storage and collecting copies there. You can also synchronize them only to a remote storage over the desired protocol, without keeping them locally. There are many variations: collect locally first and then synchronize, or collect and store only in a remote storage, and so on. The configuration is quite flexible.

    mysql.conf + s3

    Below is an example of a MySQL backup configuration file in which copies are stored locally on the node where the CronJob runs and are also synchronized to s3.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: mysql-conf
    data:
      service.conf.j2: |-
        # `target:` and `store:` are named in the text; `sources:` and
        # `storages:` were lost in the listing and are restored as assumptions.
        - job: mysql
          type: mysql
          tmp_dir: /var/nxs-backup/databases/mysql/dump_tmp
          sources:
          - connect:
              db_host: {{ db_host }}
              db_port: {{ db_port }}
              socket: ''
              db_user: {{ db_user }}
              db_password: {{ db_password }}
            target:
            - redmine_db
            gzip: yes
            is_slave: no
            extra_keys: '--opt --add-drop-database --routines --comments --create-options --quote-names --order-by-primary --hex-blob'
          storages:
          - storage: local
            enable: yes
            backup_dir: /var/nxs-backup/databases/mysql/dump
            store:
              days: 6
              weeks: 4
              month: 6
          - storage: s3
            enable: yes
            backup_dir: /nxs-backup/databases/mysql/dump
            bucket_name: {{ bucket_name }}
            access_key_id: {{ access_key_id }}
            secret_access_key: {{ secret_access_key }}
            s3fs_opts: {{ s3fs_opts }}
            store:
              days: 2
              weeks: 1
              month: 6

    That is, if storing copies locally is not enough, you can synchronize them to any remote storage over the appropriate protocol. Any number of storages can be configured.

    But in this case, you still need to make some additional changes, namely:

    • Attach a ConfigMap, in j2 format, with the settings required for authorization in AWS S3
    • Create a Secret to store the access credentials
    • Set the required environment variables, taking their values from that Secret
    • Adjust the entrypoint so that it substitutes the corresponding variables in the ConfigMap
    • Rebuild the Docker image, adding the utilities for working with AWS S3
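    The Secret and the environment variables from this list can be sketched as follows. This is a minimal Python illustration, not part of nxs-backup: the key names are taken from the s3 storage section above, while the helper names, the Secret name nxs-backup-s3 and all values are hypothetical.

```python
import base64

# Placeholder names used in the s3 storage section of the job config.
S3_KEYS = ("bucket_name", "access_key_id", "secret_access_key", "s3fs_opts")


def s3_secret(name, values):
    """Build a Secret manifest; Kubernetes expects base64 text under `data`."""
    missing = [key for key in S3_KEYS if key not in values]
    if missing:
        raise ValueError(f"missing S3 settings: {', '.join(missing)}")
    data = {
        key: base64.b64encode(values[key].encode()).decode()
        for key in S3_KEYS
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": data,
    }


def s3_env(secret_name):
    """Container env entries exposing each Secret key as an upper-case variable."""
    return [
        {"name": key.upper(),
         "valueFrom": {"secretKeyRef": {"name": secret_name, "key": key}}}
        for key in S3_KEYS
    ]


# All values below are made-up examples.
secret = s3_secret("nxs-backup-s3", {
    "bucket_name": "backups",
    "access_key_id": "EXAMPLEKEY",
    "secret_access_key": "EXAMPLESECRET",
    "s3fs_opts": "",
})
env = s3_env("nxs-backup-s3")
```

    The env list is appended to the env section of the backup container, in the same way as the DB_* variables in the MySQL CronJob.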

    So far this process is far from perfect, but we are working on it. In the near future we will add to nxs-backup the ability to define parameters in the configuration file through environment variables. This will greatly simplify work with the entrypoint file and reduce the time needed to add backup support for new services.


    That is probably all.

    First of all, the approach described here lets you back up the stateful services of k8s projects in a structured way, using a template. That is, it is a ready-made solution and, most importantly, a practice that you can apply in your own projects without spending time and effort on finding and refining existing open source solutions.

