Stateful backups in Kubernetes
Quite recently, on October 1-2, DevOpsConf Russia 2018 took place at Infospace in Moscow. For those not in the know, DevOpsConf is a professional conference on the integration of development, testing, and operations processes.
Our company also took part in this conference: we were a partner, represented the company at our own stand, and also held a small meetup. By the way, that was our first participation in this kind of activity. Our first conference, our first meetup, our first experience.
What did we talk about? The meetup was devoted to "Backups in Kubernetes".
Most likely, on hearing this title, many will say: "Why back up anything in Kubernetes? It doesn't need backups, it's stateless."
Introduction
Let's start with a little background: why this topic needed attention in the first place.
In 2016 we got acquainted with Kubernetes and began actively applying it to our projects. These are mostly projects with a microservice architecture, which in turn entails the use of a wide variety of software.
With the very first project where we used Kubernetes, the question arose of how to back up the stateful services that, for one reason or another, end up in k8s.
We began to study and search for existing practices for solving this problem, and to ask our colleagues how they carry out and organize this process.
Having talked to them, we realized that everyone does it with different methods, different tools, and a large number of crutches, and that no single approach was followed even within a single project.
Why is this so important? Since our company maintains projects built on k8s, we simply had to develop a structured methodology for solving this problem.
Imagine you are working with one specific project in Kubernetes. It contains some stateful services whose data you need to back up. In principle, you can get by with a couple of crutches and forget about it. But what if you already have two projects on k8s, and the second one uses completely different services? What if there are already five projects? Ten? More than twenty?
Of course, piling up crutches is difficult and inconvenient. What is needed is a unified approach that can be applied across many Kubernetes projects, and that lets the engineering team make the necessary changes to a project's backups easily, in just a few minutes.
In this article we will describe exactly which tool and which practices we use inside our company to solve this problem.
What is nxs-backup?
For backups, we use our own open-source tool, nxs-backup. We will not go into the details of its capabilities here; more information about it can be found at the following link.
Let us now turn to the actual implementation of backups in k8s: how and what exactly we did.
What do we back up?
Let's look at the example of backing up our own Redmine. In it, we will back up the MySQL database and the users' project files.
How do we do it?
1 CronJob == 1 Service
On ordinary servers and bare-metal clusters, almost all backup tools are run via regular cron. In k8s we use a CronJob for this purpose, that is, we create one CronJob per service to be backed up. All these CronJobs live in the same namespace as the service itself.
Let's start with the MySQL database. To back up MySQL, as with almost any other service, we need 4 elements:
- ConfigMap (nxs-backup.conf)
- ConfigMap (mysql.conf for nxs-backup)
- Secret (the service credentials are stored here, in this case for MySQL). Usually this element is already defined for the service itself and can be reused.
- CronJob (for each service its own)
Let's go in order.
nxs-backup.conf
apiVersion: v1
kind: ConfigMap
metadata:
  name: nxs-backup-conf
data:
  nxs-backup.conf: |-
    main:
      server_name: Nixys k8s cluster
      admin_mail: admins@nixys.ru
      client_mail:
      - ''
      mail_from: backup@nixys.ru
      level_message: error
      block_io_read: ''
      block_io_write: ''
      blkio_weight: ''
      general_path_to_all_tmp_dir: /var/nxs-backup
      cpu_shares: ''
      log_file: /dev/stdout
      jobs: !include [conf.d/*.conf]
Here we set the basic parameters passed to our tool that it needs to operate: the server name, e-mail addresses for notifications, resource-consumption limits, and so on.
The configuration files can be written in j2 (Jinja2 template) format, which allows the use of environment variables.
mysql.conf
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-conf
data:
  service.conf.j2: |-
    - job: mysql
      type: mysql
      tmp_dir: /var/nxs-backup/databases/mysql/dump_tmp
      sources:
      - connect:
          db_host: {{ db_host }}
          db_port: {{ db_port }}
          socket: ''
          db_user: {{ db_user }}
          db_password: {{ db_password }}
        target:
        - redmine_db
        gzip: yes
        is_slave: no
        extra_keys: '--opt --add-drop-database --routines --comments --create-options --quote-names --order-by-primary --hex-blob'
      storages:
      - storage: local
        enable: yes
        backup_dir: /var/nxs-backup/databases/mysql/dump
        store:
          days: 6
          weeks: 4
          month: 6
This file describes the backup logic for the corresponding service, in our case MySQL.
Here you can specify:
- The job name (field: job)
- The job type (field: type)
- The temporary directory needed while collecting backups (field: tmp_dir)
- The MySQL connection parameters (field: connect)
- The databases to be backed up (field: target)
- Whether the slave needs to be stopped before dumping (field: is_slave)
- Extra keys for mysqldump (field: extra_keys)
- The storage type, i.e. where the copies will be kept (field: storage)
- The directory where the copies will be stored (field: backup_dir)
- The retention scheme (field: store)
In our example, the storage type is set to local, that is, we collect and store the backup copies locally, in a directory of the pod being launched.
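As an aside, the store section can be read as simple retention rules: days: 6 means roughly "keep daily copies for six days", and similarly for weeks and month. A rough shell sketch of the daily rule under an assumed directory layout (nxs-backup handles rotation itself; this is only an illustration of the idea):

```shell
# Illustration only: emulate the "days: 6" rule with find,
# i.e. drop daily dumps older than six days. The directory
# layout here is assumed, not taken from nxs-backup itself.
BACKUP_DIR=/tmp/nxs-backup-demo/databases/mysql/dump/daily
mkdir -p "$BACKUP_DIR"

# A freshly collected dump (zero-length stand-in for a real one).
touch "$BACKUP_DIR/redmine_db.sql.gz"

# Remove anything older than 6 days; the fresh dump survives.
find "$BACKUP_DIR" -type f -name '*.sql.gz' -mtime +6 -delete
ls "$BACKUP_DIR"
```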
By the way, by analogy with this configuration file, you can define similar files for Redis, PostgreSQL, or any other service, as long as our tool supports it. The list of what it supports can be found at the link given earlier.
MySQL Secret
apiVersion: v1
kind: Secret
metadata:
  name: app-config
data:
  db_name: ""
  db_host: ""
  db_user: ""
  db_password: ""
  secret_token: ""
  smtp_address: ""
  smtp_domain: ""
  smtp_ssl: ""
  smtp_enable_starttls_auto: ""
  smtp_port: ""
  smtp_auth_type: ""
  smtp_login: ""
  smtp_password: ""
The Secret stores the credentials for connecting to MySQL itself and to the mail server. They can be kept in a separate Secret, or an existing one can be reused, if there is one, of course. Nothing interesting here. Our Secret also keeps the secret_token needed for our Redmine to run.
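One practical detail worth remembering: the values under data: in a Kubernetes Secret must be base64-encoded. A quick sketch of preparing such values (the credentials here are made up for illustration):

```shell
# Secret `data:` values are base64-encoded; -n avoids encoding
# a trailing newline into the value. Credentials are hypothetical.
DB_USER_B64=$(echo -n 'redmine' | base64)
DB_PASSWORD_B64=$(echo -n 's3cr3t' | base64)

echo "db_user: $DB_USER_B64"
echo "db_password: $DB_PASSWORD_B64"
```

Alternatively, `kubectl create secret generic app-config --from-literal=db_user=redmine ...` performs the encoding for you.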
MySQL CronJob
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: mysql
spec:
  schedule: "00 00 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: kubernetes.io/hostname
                    operator: In
                    values:
                    - nxs-node5
          containers:
          - name: mysql-backup
            image: nixyslab/nxs-backup:latest
            env:
            - name: DB_HOST
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: db_host
            - name: DB_PORT
              value: '3306'
            - name: DB_USER
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: db_user
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: db_password
            - name: SMTP_MAILHUB_ADDR
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_address
            - name: SMTP_MAILHUB_PORT
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_port
            - name: SMTP_USE_TLS
              value: 'YES'
            - name: SMTP_AUTH_USER
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_login
            - name: SMTP_AUTH_PASS
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_password
            - name: SMTP_FROM_LINE_OVERRIDE
              value: 'NO'
            volumeMounts:
            - name: mysql-conf
              mountPath: /usr/share/nxs-backup/service.conf.j2
              subPath: service.conf.j2
            - name: nxs-backup-conf
              mountPath: /etc/nxs-backup/nxs-backup.conf
              subPath: nxs-backup.conf
            - name: backup-dir
              mountPath: /var/nxs-backup
            imagePullPolicy: Always
          volumes:
          - name: mysql-conf
            configMap:
              name: mysql-conf
              items:
              - key: service.conf.j2
                path: service.conf.j2
          - name: nxs-backup-conf
            configMap:
              name: nxs-backup-conf
              items:
              - key: nxs-backup.conf
                path: nxs-backup.conf
          - name: backup-dir
            hostPath:
              path: /var/backups/k8s
              type: Directory
          restartPolicy: OnFailure
Perhaps this is the most interesting element. First of all, to create a correct CronJob, you need to decide where the collected backups will be stored.
We have a dedicated server with the necessary amount of resources for this. In the example, a separate cluster node, nxs-node5, is reserved for collecting backups. The CronJob is restricted to the required nodes via the nodeAffinity directive.
When the CronJob runs, the corresponding directory from the host system, used for storing the backup copies, is mounted into it via hostPath.
Next, the ConfigMaps containing the nxs-backup configuration, namely the files nxs-backup.conf and mysql.conf we discussed above, are mounted into the CronJob.
Then all the necessary environment variables are set; they are either defined directly in the manifest or pulled from the Secret.
The variables are passed into the container, and docker-entrypoint.sh substitutes them into the mounted ConfigMaps in the right places. For MySQL these are db_host, db_user, and db_password. The port is passed simply as a literal value in the CronJob manifest, since it carries no sensitive information.
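The substitution step itself can be sketched with plain sed. The placeholders match the {{ db_host }} style markers from mysql.conf, but the file paths and the exact logic of the real docker-entrypoint.sh are assumptions made for the sake of the example:

```shell
# Hypothetical reimplementation of the entrypoint substitution step.
export DB_HOST=mysql.default.svc
export DB_USER=redmine
export DB_PASSWORD=s3cr3t

# A fragment of the mounted template (same markers as mysql.conf).
cat > /tmp/service.conf.j2 <<'EOF'
db_host: {{ db_host }}
db_user: {{ db_user }}
db_password: {{ db_password }}
EOF

# Substitute each marker with the corresponding environment variable.
sed -e "s/{{ db_host }}/${DB_HOST}/" \
    -e "s/{{ db_user }}/${DB_USER}/" \
    -e "s/{{ db_password }}/${DB_PASSWORD}/" \
    /tmp/service.conf.j2 > /tmp/service.conf

cat /tmp/service.conf
```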
Well, with MySQL everything seems clear. Now let's see what is needed to back up the Redmine application files.
desc_files.conf
apiVersion: v1
kind: ConfigMap
metadata:
  name: desc-files-conf
data:
  service.conf.j2: |-
    - job: desc-files
      type: desc_files
      tmp_dir: /var/nxs-backup/files/desc/dump_tmp
      sources:
      - target:
        - /var/www/files
        gzip: yes
      storages:
      - storage: local
        enable: yes
        backup_dir: /var/nxs-backup/files/desc/dump
        store:
          days: 6
          weeks: 4
          month: 6
This configuration file describes the backup logic for files. There is nothing unusual here either: the same parameters are set as for MySQL, except for the authorization data, which simply does not exist in this case. It can appear, though, if remote transfer protocols come into play: ssh, ftp, webdav, s3, and others. We will consider such an option a little later.
CronJob desc_files
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: desc-files
spec:
  schedule: "00 00 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: kubernetes.io/hostname
                    operator: In
                    values:
                    - nxs-node5
          containers:
          - name: desc-files-backup
            image: nixyslab/nxs-backup:latest
            env:
            - name: SMTP_MAILHUB_ADDR
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_address
            - name: SMTP_MAILHUB_PORT
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_port
            - name: SMTP_USE_TLS
              value: 'YES'
            - name: SMTP_AUTH_USER
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_login
            - name: SMTP_AUTH_PASS
              valueFrom:
                secretKeyRef:
                  name: app-config
                  key: smtp_password
            - name: SMTP_FROM_LINE_OVERRIDE
              value: 'NO'
            volumeMounts:
            - name: desc-files-conf
              mountPath: /usr/share/nxs-backup/service.conf.j2
              subPath: service.conf.j2
            - name: nxs-backup-conf
              mountPath: /etc/nxs-backup/nxs-backup.conf
              subPath: nxs-backup.conf
            - name: target-dir
              mountPath: /var/www/files
            - name: backup-dir
              mountPath: /var/nxs-backup
            imagePullPolicy: Always
          volumes:
          - name: desc-files-conf
            configMap:
              name: desc-files-conf
              items:
              - key: service.conf.j2
                path: service.conf.j2
          - name: nxs-backup-conf
            configMap:
              name: nxs-backup-conf
              items:
              - key: nxs-backup.conf
                path: nxs-backup.conf
          - name: backup-dir
            hostPath:
              path: /var/backups/k8s
              type: Directory
          - name: target-dir
            persistentVolumeClaim:
              claimName: redmine-app-files
          restartPolicy: OnFailure
Again, nothing new compared to MySQL. But here one additional volume (target-dir) is mounted, holding exactly what we are going to back up: /var/www/files, provided by the redmine-app-files PersistentVolumeClaim. As for the rest, we still store the copies locally on the node to which the CronJob is pinned.
Summing up
For each service we want to back up, we create a separate CronJob together with all the necessary companion elements: ConfigMaps and Secrets. By analogy with the examples above, we can back up any similar service in the cluster.
I think these two examples give a good idea of exactly how we back up stateful services in Kubernetes. There is little point in dissecting the same examples for other services, because they all look much alike and differ only in details.
This is exactly what we wanted to achieve: a unified approach to building the backup process, one that can be applied to a large number of different k8s-based projects.
Where is it stored?
In all the examples above, we store the copies in a local directory of the node on which the container runs. But nothing prevents you from attaching a Persistent Volume as external working storage and collecting the copies there. Or you can only synchronize them to a remote storage over the desired protocol, without keeping them locally at all. There are many variations: collect locally and then synchronize, or collect and store only in a remote storage, and so on. The configuration is quite flexible.
mysql.conf + s3
Below is an example MySQL backup configuration file where the copies are stored locally on the node running the CronJob and are also synchronized to S3.
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-conf
data:
  service.conf.j2: |-
    - job: mysql
      type: mysql
      tmp_dir: /var/nxs-backup/databases/mysql/dump_tmp
      sources:
      - connect:
          db_host: {{ db_host }}
          db_port: {{ db_port }}
          socket: ''
          db_user: {{ db_user }}
          db_password: {{ db_password }}
        target:
        - redmine_db
        gzip: yes
        is_slave: no
        extra_keys: '--opt --add-drop-database --routines --comments --create-options --quote-names --order-by-primary --hex-blob'
      storages:
      - storage: local
        enable: yes
        backup_dir: /var/nxs-backup/databases/mysql/dump
        store:
          days: 6
          weeks: 4
          month: 6
      - storage: s3
        enable: yes
        backup_dir: /nxs-backup/databases/mysql/dump
        bucket_name: {{ bucket_name }}
        access_key_id: {{ access_key_id }}
        secret_access_key: {{ secret_access_key }}
        s3fs_opts: {{ s3fs_opts }}
        store:
          days: 2
          weeks: 1
          month: 6
That is, if storing the copies locally is not enough, you can synchronize them to any remote storage over the appropriate protocol. The number of storages can be arbitrary.
But in this case a few additional changes are still required, namely:
- Mount an appropriate ConfigMap, in j2 format, with the content required for AWS S3 authorization
- Create a corresponding Secret to store the access keys
- Set the necessary environment variables taken from that Secret
- Adjust docker-entrypoint.sh so that it substitutes the corresponding variables into the ConfigMap
- Rebuild the Docker image, adding the utilities for working with AWS S3
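For reference, the s3fs part of the last item looks roughly like this: s3fs-fuse reads its credentials from a password file in ACCESS_KEY:SECRET_KEY format, which must be readable only by its owner. A sketch with made-up keys; the actual mount command is left commented out, since it needs real credentials and network access:

```shell
# Prepare a credentials file for s3fs (the keys are fake placeholders).
PASSWD_FILE=/tmp/.passwd-s3fs
echo 'AKIAEXAMPLEKEY:exampleSecretKey' > "$PASSWD_FILE"

# s3fs refuses passwd files that are group- or world-readable.
chmod 600 "$PASSWD_FILE"

# Actual mount (requires real credentials and network access):
# s3fs my-backup-bucket /mnt/s3 -o passwd_file="$PASSWD_FILE"
```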
So far this process is far from perfect, but we are working on it. In the near future we will add to nxs-backup the ability to define configuration parameters via environment variables, which will greatly simplify working with the entrypoint file and minimize the time needed to add backup support for new services.
Conclusion
That is probably all.
The approach just discussed allows you, first of all, to back up the stateful services of a k8s project in a structured, templated way. It is a ready-made solution, and more importantly a ready-made practice, that can be applied in your own projects without wasting time and effort on finding and adapting existing open-source solutions.