AndreiYemelianov December 1, 2015 at 11:14

Systemd and containers: getting to know systemd-nspawn

Containerization today is one of the most relevant topics. The number of publications on such popular tools as LXC or Docker is in the thousands, if not tens of thousands.
In this article we would like to discuss another solution, one that has received little coverage in Russian so far: systemd-nspawn, a tool for creating isolated environments that ships as one of the components of systemd. The establishment of systemd as a standard in the Linux world is already an accomplished fact. There is therefore every reason to believe that the scope of systemd-nspawn will expand significantly in the near future, and it is worth getting to know this tool now.

Systemd-nspawn: general information

The name systemd-nspawn is short for "namespaces spawn". The name itself makes it clear that systemd-nspawn only handles the isolation of processes and cannot limit resources (this can, however, be done by means of systemd itself, as discussed later).

Using systemd-nspawn, you can create a fully isolated environment in which the /proc and /sys pseudo-file systems are mounted automatically, and which gets its own loopback interface and a separate namespace for process identifiers (PIDs). Inside this environment you can run an OS based on the Linux kernel.

Unlike Docker, systemd-nspawn has no dedicated image repository. You can use any third-party tools to create and download images. The tar, raw, qcow2 and dkr formats are supported (dkr images are Docker images; the systemd-nspawn documentation does not say so explicitly anywhere, and its authors carefully avoid the word Docker). Work with images relies on the BTRFS file system.

Running Debian in a container

We begin our introduction to systemd-nspawn with a simple but illustrative practical example. On a server running Fedora, we will create an isolated environment in which Debian will be launched. All of the sample commands below are for Fedora 22 with systemd 219; in other Linux distributions and in other versions of systemd, the commands may differ.

Let's start by installing the necessary dependencies:

sudo dnf install debootstrap bridge-utils

Then create a file system for the future container:

sudo debootstrap --arch=amd64 jessie /var/lib/machines/container1/
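
Before launching the container, it can be handy to confirm that debootstrap really produced something that looks like an OS tree. The check below is a minimal sketch under our own assumptions: the function name looks_like_os_tree is hypothetical, and the two paths it tests (usr/ and etc/os-release) are only a heuristic, not necessarily the exact test that systemd-nspawn performs.

```shell
#!/bin/sh
# Hypothetical helper: heuristic check that a directory contains an OS tree.
# Returns 0 only if both usr/ and etc/os-release exist under the given root.
looks_like_os_tree() {
    root=$1
    [ -d "$root/usr" ] && [ -f "$root/etc/os-release" ]
}

# Example use (the path is the one passed to debootstrap above):
# looks_like_os_tree /var/lib/machines/container1 || echo "no OS tree found" >&2
```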

Upon completion of all preparatory work, you can proceed to launch the container:

sudo systemd-nspawn -D /var/lib/machines/container1/ --machine test_container

The guest operating system prompt appears on the console:

root@test_container

Set the root password for it:

passwd
Enter new UNIX password: 
Retype new UNIX password: 
passwd: password updated successfully

Exit the container by pressing Ctrl+] three times in quick succession, and then execute the following command:

sudo systemd-nspawn -D /var/lib/machines/container1/ --machine test_container -b

It contains the -b flag (or --boot), which tells systemd-nspawn to start init inside the container, so that the guest OS boots with all of its daemons. According to our experience, this flag should only be used when the OS in the container runs systemd; otherwise a successful boot is not guaranteed.
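
Since several options in this article depend on the guest using systemd, a quick check of the container tree can save some guessing. The sketch below is only a heuristic and rests on an assumption of ours: that a systemd-based guest has /sbin/init as a symbolic link pointing at the systemd binary (as is the case with the systemd-sysv package in Debian). The function name init_is_systemd is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: guess whether the OS tree at $1 boots with systemd.
# It reads the symlink target of sbin/init without following it, so an
# absolute target such as /lib/systemd/systemd is handled correctly.
init_is_systemd() {
    root=$1
    target=$(readlink "$root/sbin/init" 2>/dev/null) || return 1
    case $target in
        *systemd*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Example use:
# init_is_systemd /var/lib/machines/container1 && echo "safe to try -b"
```

If /sbin/init is a regular file (for instance a sysvinit binary), readlink fails and the function reports "not systemd".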

Upon completion of all these operations, the system will prompt you to enter a username and password.
So, a full-fledged OS in an isolated environment is running. Now we need to configure a network for it. Let's exit the container and create a bridge through which it will connect to the interface on the main host:

sudo brctl addbr cont-bridge

Assign an IP address for this bridge:

sudo ip addr add [IP address] dev cont-bridge
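
The same bridge can also be created with iproute2 alone, and it usually has to be brought up before traffic passes through it. The following is a minimal sketch, not a tested recipe for every distribution: the function name bridge_setup_cmds is hypothetical, and the address in the usage comment comes from the documentation range and is only a placeholder. The function prints the commands instead of running them, so they can be reviewed first.

```shell
#!/bin/sh
# Print (not run) the iproute2 commands that create a bridge, assign it an
# address and bring it up. The function name is hypothetical.
bridge_setup_cmds() {
    bridge=$1   # bridge name, e.g. cont-bridge
    addr=$2     # address in CIDR form (placeholder value in the example below)
    printf 'ip link add name %s type bridge\n' "$bridge"
    printf 'ip addr add %s dev %s\n' "$addr" "$bridge"
    printf 'ip link set dev %s up\n' "$bridge"
}

# Review the output, then run it with root privileges, for example:
# bridge_setup_cmds cont-bridge 192.0.2.1/24 | sudo sh
```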

After that, execute the command:

sudo systemd-nspawn -D /var/lib/machines/container1/ --machine test_container --network-bridge=cont-bridge -b

To configure the network, you can also use the --network-ipvlan option, which connects the container to the specified interface on the main host using ipvlan:

sudo systemd-nspawn -D /var/lib/machines/container1/ --machine test_container -b --network-ipvlan=[network interface]

Run the container as a service

Using systemd, you can configure containers to start automatically when the system boots. To do this, add the following unit file to the /etc/systemd/system directory (for example, as test_container.service):

[Unit]
Description=Test Container
[Service]
LimitNOFILE=100000
ExecStart=/usr/bin/systemd-nspawn --machine=test_container --directory=/var/lib/machines/container1/ -b --network-ipvlan=[network interface]
Restart=always
[Install]
Also=dbus.service

Let us comment on this fragment. In the [Unit] section, the Description line simply gives the container a name. In the [Service] section, we first set the limit on the number of open files in the container (LimitNOFILE), then specify the command that starts the container with the necessary options (ExecStart). Restart=always means that the container must be restarted if it crashes. The [Install] section names an additional unit that must be enabled on the host together with this one (in our case, the D-Bus interprocess communication system).

Save the changes in the configuration file and execute the command:

sudo systemctl start test_container
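
One caveat: the [Install] section of the unit above names no WantedBy= target, and as far as we can tell systemctl enable then has nothing to hook the unit itself into at boot. The sketch below generates a variant of the unit with WantedBy=multi-user.target added. Treat it as an illustration under our assumptions: the function name write_container_unit is hypothetical, the network option is left out for brevity, and multi-user.target is our choice of target rather than something the original unit prescribes.

```shell
#!/bin/sh
# Hypothetical helper: write a service unit for a container.
#   $1 = directory to write into (normally /etc/systemd/system)
#   $2 = machine name, also used as the unit file name
#   $3 = path to the container's root file system
write_container_unit() {
    unit_dir=$1
    name=$2
    rootfs=$3
    cat > "$unit_dir/$name.service" <<EOF
[Unit]
Description=Container $name

[Service]
LimitNOFILE=100000
ExecStart=/usr/bin/systemd-nspawn --machine=$name --directory=$rootfs -b
Restart=always

[Install]
Also=dbus.service
WantedBy=multi-user.target
EOF
}

# Example use (needs root privileges to write to /etc/systemd/system):
# write_container_unit /etc/systemd/system test_container /var/lib/machines/container1
# systemctl daemon-reload
# systemctl enable test_container.service
```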

You can also start the container as a service in another, simpler way. systemd ships a template unit for automatically launching containers placed in the /var/lib/machines directory. You can enable startup based on this template using the following commands:

sudo systemctl enable machines.target
sudo mv ~/test_container /var/lib/machines/test_container
sudo systemctl enable systemd-nspawn@test_container.service

Container management: machinectl utility

Containers can be controlled using the machinectl utility. Let us briefly look at its main commands.

List all containers available in the system:

sudo machinectl list

View container status information:

sudo machinectl status test_container

Enter the container:

sudo machinectl login test_container

Reboot the container:

sudo machinectl reboot test_container

Stop the container:

sudo machinectl poweroff test_container

The last command works only if the OS installed in the container is compatible with systemd. For operating systems that use sysvinit, use the terminate command instead.
We have covered only the most basic features of the machinectl utility; detailed instructions for its use can be found in its documentation.
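
When several containers are running, the commands above are often combined in a loop. The sketch below extracts the machine names from the output of machinectl list --no-legend; it assumes that the name is the first whitespace-separated column of each line, and the function name machine_names is hypothetical.

```shell
#!/bin/sh
# Read the output of `machinectl list --no-legend` on stdin and print only
# the machine names (first column), skipping blank lines.
machine_names() {
    awk 'NF { print $1 }'
}

# Example: show the status of every running machine.
# machinectl list --no-legend | machine_names | while read -r name; do
#     machinectl status "$name"
# done
```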

Download images

We have already said that systemd-nspawn can run images in several different formats. However, there is one important condition: working with images is possible only on top of the BTRFS file system, which must be mounted at the /var/lib/machines directory:

sudo dnf install btrfs-progs
sudo mkfs.btrfs /dev/sdb
sudo mount /dev/sdb /var/lib/machines
mount | grep btrfs
/dev/sdb on /var/lib/machines type btrfs (rw,relatime,seclabel,space_cache)

If there is no free disk, a BTRFS file system can also be created inside a regular file.
In newer versions of systemd, the ability to download images is supported “out of the box”, and there is no need to mount BTRFS.
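
For the file-backed variant mentioned above, the usual approach is a sparse file mounted through a loop device. The following is a minimal sketch under our assumptions: the function names, the image path and the size in the usage comment are all hypothetical, and the commands need root privileges when actually executed. With DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: create a sparse file, format it as BTRFS and mount it
# at /var/lib/machines through a loop device.
#   $1 = path of the image file, $2 = size understood by truncate (e.g. 2G)
setup_btrfs_file() {
    image=$1
    size=$2
    run truncate -s "$size" "$image"
    run mkfs.btrfs "$image"
    run mount -o loop "$image" /var/lib/machines
}

# Example (prints the three commands without touching the system):
# DRY_RUN=1 setup_btrfs_file /var/lib/machines.img 2G
```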

Let's try downloading a Docker image:

sudo machinectl pull-dkr --verify=no library/redis --dkr-index-url=https://index.docker.io

Starting a container based on the downloaded image is simple:

sudo systemd-nspawn --machine redis

View container logs

Information about all events occurring inside the containers is recorded in the logs. Logging settings can be set directly when creating a container using the --link-journal option, for example:

sudo systemd-nspawn -D /var/lib/machines/container1/ --machine test_container -b --link-journal=host

The above command means that the container's logs will be stored on the main host in the /var/log/journal/machine-id directory. If you set --link-journal=guest, all the logs will be stored inside the container in its /var/log/journal/machine-id directory, and a symbolic link will be created on the main host at the same path. The --link-journal option works only if the OS started in the container runs systemd; otherwise, correct logging is not guaranteed.
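
The machine-id part of that path is the identifier stored in the container's /etc/machine-id file. As a minimal sketch (the function name container_journal_dir is hypothetical, and a freshly created tree may not have a machine-id until the guest has booted once), the directory can be derived like this:

```shell
#!/bin/sh
# Hypothetical helper: print the journal directory that corresponds to the
# container whose root file system is at $1. Fails if the container has no
# machine-id yet.
container_journal_dir() {
    root=$1
    id=$(cat "$root/etc/machine-id" 2>/dev/null) || return 1
    [ -n "$id" ] || return 1
    printf '/var/log/journal/%s\n' "$id"
}

# Example: read the container's journal from the host by directory.
# journalctl -D "$(container_journal_dir /var/lib/machines/container1)"
```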

You can view information about container starts and stops using the journalctl utility, which we already wrote about in one of our previous publications:

journalctl -u test_container.service

journalctl can also show the event logs from inside a container.
To do this, use the -M option (only a small fragment of the output is shown):

journalctl -M test_container
Sep 18 11:50:21 octavia.localdomain systemd-journal[16]: Runtime journal is using 8.0M (max allowed 197.6M, trying to leave 296.4M free of 1.9G available → current limit 197.6M).
Sep 18 11:50:21 octavia.localdomain systemd-journal[16]: Runtime journal is using 8.0M (max allowed 197.6M, trying to leave 296.4M free of 1.9G available → current limit 197.6M).
Sep 18 11:50:21 octavia.localdomain systemd-journal[16]: Journal started
Sep 18 11:50:21 octavia.localdomain systemd[1]: Starting Slices.
Sep 18 11:50:21 octavia.localdomain systemd[1]: Reached target Slices.
Sep 18 11:50:21 octavia.localdomain systemd[1]: Starting Remount Root and Kernel File Systems...
Sep 18 11:50:21 octavia.localdomain systemd[1]: Started Remount Root and Kernel File Systems.
Sep 18 11:50:21 octavia.localdomain systemd[1]: Started Various fixups to make systemd work better on Debian.

Resource allocation

We have reviewed the main features of systemd-nspawn. One important point remains: allocating resources to containers. As noted above, systemd-nspawn itself does not limit resources. You can restrict a container's resource consumption using systemctl, for example:

sudo systemctl set-property [container name] CPUShares=200 CPUQuota=30% MemoryLimit=500M

Resource restrictions for the container can also be specified in the unit file, in the [Slice] section.
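
To check that a limit has actually been applied, the properties can be read back with systemctl show. The sketch below pulls a single value out of that command's KEY=VALUE output; the function name prop_value is hypothetical, and the unit name in the usage comment is an assumption that depends on how the container was started. On newer systemd versions the property names themselves may differ.

```shell
#!/bin/sh
# Read KEY=VALUE lines (as printed by `systemctl show`) on stdin and print
# the value of the property named in $1.
prop_value() {
    sed -n "s/^$1=//p"
}

# Example (unit name is an assumption):
# systemctl show -p MemoryLimit -p CPUShares test_container.service | prop_value MemoryLimit
```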

Conclusion

Systemd-nspawn is an interesting and promising tool. Among its undoubted advantages it is worth highlighting:

tight integration with other systemd components;
the ability to work with images in different formats;
no need to install any additional packages or apply patches to the kernel.

Of course, it is too early to talk about full-scale use of systemd-nspawn in production: the tool is still in a "raw" state and is suitable only for testing and experimentation. However, as systemd continues to spread, improvements to systemd-nspawn can be expected as well.

Naturally, a review article cannot cover absolutely everything. Any questions, remarks and additions are welcome in the comments.
If we missed some details or did not mention some interesting features of systemd-nspawn, write to us, and we will definitely supplement our review.
And if any of you use systemd-nspawn, we invite you to share your experience.

Readers who for one reason or another cannot post comments here are welcome to our blog.
