Working with a Proxmox cluster: installation, network setup, ZFS, and solving common problems

  • Tutorial
Over the past few years I have worked closely with Proxmox clusters: many clients want their own infrastructure on which to develop their projects, so I can tell you about the most common mistakes and problems you are likely to run into. On top of that, we will of course configure a three-node cluster from scratch.


A Proxmox cluster can consist of two or more servers, up to a maximum of 32 nodes. Our cluster will consist of three nodes and use multicast (in the article I will also describe how to bring up a cluster over unicast - this matters if you build your cluster infrastructure on Hetzner or OVH, for example). In short, multicast delivers data to several nodes simultaneously, so with multicast we don't have to worry about the number of nodes in the cluster (within the limit mentioned above).

The cluster itself is built on an internal network (it is important that the IP addresses are on the same subnet). Hetzner and OVH can link nodes located in different data centers using Virtual Switch (Hetzner) and vRack (OVH); Virtual Switch is also covered later in the article. If your hosting provider does not offer anything similar, you can use OVS (Open vSwitch), which Proxmox supports natively, or a VPN. In that case, however, I recommend using unicast with a small number of nodes - with such a network infrastructure the cluster often simply "falls apart" and has to be restored. That is why I prefer OVH and Hetzner in my work: I have seen fewer incidents of this kind there. But first of all, study your hosting provider.

Install Proxmox


Proxmox can be installed in two ways: from the ISO installer or from the shell on top of an existing system. We choose the second method, so first install Debian on the server.

Now we proceed to installing Proxmox VE on each server. Installation is extremely simple and is described in the official documentation.

Add the Proxmox repository and the key of this repository:

echo "deb http://download.proxmox.com/debian/pve stretch pve-no-subscription" > /etc/apt/sources.list.d/pve-install-repo.list
wget http://download.proxmox.com/debian/proxmox-ve-release-5.x.gpg -O /etc/apt/trusted.gpg.d/proxmox-ve-release-5.x.gpg
chmod +r /etc/apt/trusted.gpg.d/proxmox-ve-release-5.x.gpg  # optional, if you have a changed default umask

Updating repositories and the system itself:

apt update && apt dist-upgrade

After a successful update, install the necessary Proxmox packages:

apt install proxmox-ve postfix open-iscsi

Note: Postfix and GRUB are configured during installation, and one of them may fail. This is often caused by the hostname not resolving; edit the /etc/hosts entries and run apt-get update again.

From this point on we can log into the Proxmox web interface at https://<external-ip-address>:8006 (the browser will warn about an untrusted certificate).


Image 1. Proxmox node web interface

Install Nginx and Let's Encrypt Certificate


I don't really like accessing the panel by IP address with an untrusted certificate, so I suggest installing Nginx and setting up a Let's Encrypt certificate. I won't describe the Nginx installation itself and will only show the files that matter for the Let's Encrypt certificate to work:

/etc/nginx/snippets/letsencrypt.conf
location ^~ /.well-known/acme-challenge/ {
  allow all;
  root /var/lib/letsencrypt/;
  default_type "text/plain";
  try_files $uri =404;
}



Command for issuing SSL certificate:

certbot certonly --agree-tos --email sos@livelinux.info --webroot -w /var/lib/letsencrypt/ -d proxmox1.domain.name

Site Configuration in NGINX
upstream proxmox1.domain.name  {
      server 127.0.0.1:8006;
}
server {
    listen 80;
    server_name proxmox1.domain.name;
    include snippets/letsencrypt.conf;
    return 301 https://$host$request_uri;
}
server {
    listen 443 ssl;
    server_name  proxmox1.domain.name;
    access_log  /var/log/nginx/proxmox1.domain.name.access.log;
    error_log  /var/log/nginx/proxmox1.domain.name.error.log;
    include snippets/letsencrypt.conf;
    ssl_certificate /etc/letsencrypt/live/proxmox1.domain.name/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/proxmox1.domain.name/privkey.pem;
    location / {
        proxy_pass  https://proxmox1.domain.name;
        proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
        proxy_redirect off;
        proxy_buffering off;
        proxy_set_header        Host            $host;
        proxy_set_header        X-Real-IP       $remote_addr;
        proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}


After installing the SSL certificate, do not forget to set it to auto-renew via cron:

0 */12 * * * test -x /usr/bin/certbot -a \! -d /run/systemd/system && perl -e 'sleep int(rand(3600))' && certbot -q renew --renew-hook "systemctl reload nginx"
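
Before relying on cron, you can check that renewal actually works with a dry run (this does not touch the real certificates):

certbot renew --dry-run
nginx -t && systemctl reload nginx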

Excellent! Now we can access our domain via HTTPS.

Note: to disable the subscription information pop-up, run this command:

sed -i.bak "s/data.status !== 'Active'/false/g" /usr/share/javascript/proxmox-widget-toolkit/proxmoxlib.js && systemctl restart pveproxy.service

Network settings

Before connecting to the cluster, configure the network interfaces on the hypervisor. It is worth noting that the configuration of the remaining nodes is no different, except for IP addresses and server names, so I will not duplicate their settings.

Let's create a network bridge for the internal network so that our virtual machines (in my case it will be an LXC container, for convenience) are, first, connected to the hypervisor's internal network and can talk to each other. Second, a bit later we will add a bridge for the external network so that virtual machines can have their own external IP addresses. For now, therefore, the containers will sit behind NAT.

There are two ways to work with the network configuration in Proxmox: through the web interface or through the /etc/network/interfaces configuration file. With the first option you will need to reboot the server (or you can simply rename the interfaces.new file to interfaces and restart the networking service through systemd). If you are just starting out and have no virtual machines or LXC containers yet, it is easiest to reboot the hypervisor after the changes.

Now create a network bridge called vmbr1 in the network tab in the Proxmox web panel.


Image 2. Network interfaces of the proxmox1 node


Image 3. Creating a network bridge


Image 4. Configuring the vmbr1 network configuration

The setup is extremely simple - we need vmbr1 so that the instances get access to the Internet.

Now restart our hypervisor and check if the interface has been created:


Image 5. Network interface vmbr1 in the output of the ip a command

Note: I already have an ens19 interface - this is the internal network interface on which the cluster will be built.

Repeat these steps on the other two hypervisors, and then proceed to the next step - preparing the cluster.

Another important step now is to enable packet forwarding - without it the instances will not get access to the external network. Open /etc/sysctl.conf, set the net.ipv4.ip_forward parameter to 1, and then run the following command:

sysctl -p

The output should show net.ipv4.ip_forward = 1 (unless it had already been set earlier).
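
If you configure the bridge through /etc/network/interfaces rather than the web panel, a NAT bridge of this kind might look roughly as follows. This is only a sketch: the 172.16.0.0/24 subnet matches the LXC container used later in the article, and vmbr0 as the outgoing interface is an assumption - adjust both to your environment:

auto vmbr1
iface vmbr1 inet static
        address 172.16.0.1/24
        bridge-ports none
        bridge-stp off
        bridge-fd 0
        # NAT the internal subnet out through the external bridge (vmbr0 assumed)
        post-up   iptables -t nat -A POSTROUTING -s 172.16.0.0/24 -o vmbr0 -j MASQUERADE
        post-down iptables -t nat -D POSTROUTING -s 172.16.0.0/24 -o vmbr0 -j MASQUERADE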

Configuring the Proxmox cluster

Now let's move on to the cluster itself. Each node must be able to resolve itself and the other nodes on the internal network; to do this, edit /etc/hosts as follows (each node must have records for the others):

172.30.0.15 proxmox1.livelinux.info proxmox1
172.30.0.16 proxmox2.livelinux.info proxmox2
172.30.0.17 proxmox3.livelinux.info proxmox3

You also need to add the SSH public key of each node to the others - this is required to create the cluster.
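
A quick way to do this - a sketch, assuming root SSH access over the internal network and the addresses from the hosts file above:

ssh-keygen -t rsa -b 4096        # only if the node does not have a key yet
ssh-copy-id root@172.30.0.16
ssh-copy-id root@172.30.0.17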

Let's create a cluster through the web panel:


Image 6. Creating a cluster through the web interface

After creating the cluster, we need to get information about it. Go to the same tab of the cluster and click the “Join Information” button:


Image 7. Information about the created cluster

We will need this information when joining the second and third nodes to the cluster. Connect to the second node and, in the Cluster tab, click the “Join Cluster” button:


Image 8. Connecting to the node cluster

Let's look at the connection parameters in more detail (a command-line alternative is shown right after the list):

  1. Peer Address: IP address of the first server (the one we are connecting to)
  2. Password: password of the first server
  3. Fingerprint: we get this value from cluster information
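
If you prefer the command line, the same cluster can be created and joined with pvecm (the cluster name and the IP address of the first node are examples):

pvecm create mycluster          # on the first node
pvecm add 172.30.0.15           # on the second and third nodes
pvecm status                    # check the result on any node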


Image 9. Cluster state after connecting the second node.

The second node has been connected successfully! However, it does not always go like that. If you perform the steps incorrectly or network problems get in the way, joining the cluster will fail and the cluster itself will end up “broken”. The best solution then is to remove the node from the cluster, delete all cluster information on it, reboot the server, and re-check the previous steps. But how do you safely remove a node from a cluster? First, delete it from the cluster on the first server:

pvecm delnode proxmox2

After that the node is removed from the cluster. Now go to the broken node and stop the following services on it:

systemctl stop pvestatd.service
systemctl stop pvedaemon.service
systemctl stop pve-cluster.service
systemctl stop corosync
systemctl stop pve-cluster

The Proxmox cluster stores information about itself in an SQLite database, which also needs to be cleaned up:

sqlite3 /var/lib/pve-cluster/config.db
delete from tree where name = 'corosync.conf';
.quit

The corosync data has been deleted successfully. Now delete the remaining files; to do this, start the cluster file system in standalone (local) mode:

pmxcfs -l
rm /etc/pve/corosync.conf
rm /etc/corosync/*
rm /var/lib/corosync/*
rm -rf /etc/pve/nodes/*

Reboot the server (this is not strictly necessary, but it is safer: in the end all services have to come up and run correctly, and a reboot makes sure nothing is missed). After it comes back up, we have a clean node with no information about the previous cluster, and we can start the join procedure again.
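
Before rejoining, it is worth making sure on the first node that the old membership really is gone (a quick check, not a required step):

pvecm nodes     # the removed node should no longer be listed
pvecm status    # shows the quorum state of the remaining cluster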

Install and configure ZFS


ZFS is a file system that can be used with Proxmox. With it you can replicate data to another hypervisor, migrate virtual machines and LXC containers, access an LXC container's data from the host system, and so on. Installation is quite simple; let's walk through it. My servers have three SSDs, which we will combine into a ZFS pool.

Add repositories:

nano /etc/apt/sources.list.d/stretch-backports.list
deb http://deb.debian.org/debian stretch-backports main contrib
deb-src http://deb.debian.org/debian stretch-backports main contrib
nano /etc/apt/preferences.d/90_zfs
Package: libnvpair1linux libuutil1linux libzfs2linux libzpool2linux spl-dkms zfs-dkms zfs-test zfsutils-linux zfsutils-linux-dev zfs-zed
Pin: release n=stretch-backports
Pin-Priority: 990

Updating the list of packages:

apt update

Install the required dependencies:

 apt install --yes dpkg-dev linux-headers-$(uname -r) linux-image-amd64

Install ZFS itself:

apt-get install zfs-dkms zfsutils-linux

If you later get the error "fusermount: fuse device not found, try 'modprobe fuse' first", run the following command:

modprobe fuse

Now let's proceed to the setup itself. First we need to partition the SSDs with parted:

Configure /dev/sda
parted /dev/sda
(parted) print
Model: ATA SAMSUNG MZ7LM480 (scsi)
Disk /dev/sda: 480GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number  Start   End     Size    Type     File system  Flags
 1      1049kB  4296MB  4295MB  primary               raid
 2      4296MB  4833MB  537MB   primary               raid
 3      4833MB  37,0GB  32,2GB  primary               raid
(parted) mkpart
Partition type?  primary/extended? primary
File system type?  [ext2]? zfs
Start? 33GB
End? 480GB
Warning: You requested a partition from 33,0GB to 480GB (sectors 64453125..937500000).
The closest location we can manage is 37,0GB to 480GB (sectors 72353792..937703087).
Is this still acceptable to you?
Yes/No? yes
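
On the remaining drives the same partition can be created non-interactively - a sketch, assuming they have the same layout as /dev/sda (parted may still adjust the exact boundaries):

parted -s /dev/sdb mkpart primary zfs 37GB 480GB
parted -s /dev/sdc mkpart primary zfs 37GB 480GB
parted -s /dev/sdb print     # verify the number of the new partition (sdb4 here)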


Similar actions must be performed on the other drives (see the sketch above). After all the disks are ready, proceed to the next step:

zpool create -f -o ashift=12 rpool /dev/sda4 /dev/sdb4 /dev/sdc4

We choose ashift=12 for performance reasons - this is the recommendation of the zfsonlinux project itself; you can read more about it on their wiki: github.com/zfsonlinux/zfs/wiki/faq#performance-considerations

Let's apply some settings for ZFS:

zfs set atime=off rpool
zfs set compression=lz4 rpool
zfs set dedup=off rpool
zfs set snapdir=visible rpool
zfs set primarycache=all rpool
zfs set aclinherit=passthrough rpool
zfs inherit acltype rpool
zfs get -r acltype rpool
zfs get all rpool | grep compressratio

Now we need to calculate a value for zfs_arc_max; I do it as follows:

mem=`free --giga | grep Mem | awk '{print $2}'`   # total RAM in GB
partofmem=$(($mem/10))                            # give the ARC about 10% of RAM
setzfscache=$(($partofmem*1024*1024*1024))        # convert GB to bytes for zfs_arc_max
grep c_max /proc/spl/kstat/zfs/arcstats           # current ARC limit, for comparison
zfs create rpool/data
cat > /etc/modprobe.d/zfs.conf << EOL
options zfs zfs_arc_max=$setzfscache
EOL
echo $setzfscache > /sys/module/zfs/parameters/zfs_arc_max
grep c_max /proc/spl/kstat/zfs/arcstats           # make sure the new limit has been applied

At this point the pool has been created successfully, and we have also created a data dataset (rpool/data). You can check the state of your pool with the zpool status command. These actions must be performed on all hypervisors; then move on to the next step.
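
A few commands to verify the result (purely informational, run on each hypervisor):

zpool status rpool                            # pool health and member devices
zfs list -r rpool                             # datasets, including rpool/data
cat /sys/module/zfs/parameters/zfs_arc_max    # the ARC limit currently in effect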

Now let's add ZFS to Proxmox. Go to the datacenter settings (the datacenter itself, not an individual node), open the "Storage" section, click the "Add" button and choose the "ZFS" option, after which you will see the following parameters (a command-line equivalent is shown after the list):

ID: the name of the storage. I named it local-zfs
ZFS Pool: we created rpool/data earlier, so specify it here
Nodes: specify all available nodes
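
The same storage can also be added from the shell with pvesm - a sketch using the names from above:

pvesm add zfspool local-zfs --pool rpool/data --content images,rootdir --nodes proxmox1,proxmox2,proxmox3
pvesm status       # the new storage should show up on every node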

After this, a new storage called local-zfs appears on each hypervisor, and you can migrate your virtual machines from the local storage to ZFS.

Replicating instances to a neighboring hypervisor


A Proxmox cluster can replicate data from one hypervisor to another: this lets you switch an instance from one server to another. The data will be current as of the last synchronization, whose interval can be set when creating the replication job (15 minutes by default). There are two ways to migrate an instance to another Proxmox node: manual and automatic. We will look at the manual option first, and at the end I will give you a Python script that creates the virtual machine on a reachable hypervisor when one of the hypervisors becomes unavailable.

To set up replication, go to the Proxmox web panel and create a virtual machine or LXC container. In the previous sections we configured the vmbr1 bridge with NAT, which gives access to the external network. I will create an LXC container with MySQL, Nginx and PHP-FPM and a test site to check replication. Step-by-step instructions are below.

Download the appropriate template (go to the storage -> Content -> Templates); an example is in the screenshot:


Image 10. Local storage with templates and VM images

Press the “Templates” button and load the LXC container template we need:


Image 11. Select and download the template

Now we can use it when creating new LXC containers. Select the first hypervisor and click the “Create CT” button in the upper right corner: we will see the panel for creating a new instance. The installation steps are quite simple and I will give only the configuration file for this LXC container:

arch: amd64
cores: 3
memory: 2048
nameserver: 8.8.8.8
net0: name=eth0,bridge=vmbr1,firewall=1,gw=172.16.0.1,hwaddr=D6:60:C5:39:98:A0,ip=172.16.0.2/24,type=veth
ostype: centos
rootfs: local:100/vm-100-disk-1.raw,size=10G
swap: 512
unprivileged: 
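
For reference, roughly the same container can be created from the shell with pct create. The VMID, the template file name and the storage are assumptions here - substitute the template you actually downloaded:

pct create 100 local:vztmpl/centos-7-default_20190926_amd64.tar.xz \
    --hostname test-ct --cores 3 --memory 2048 --swap 512 \
    --net0 name=eth0,bridge=vmbr1,ip=172.16.0.2/24,gw=172.16.0.1,firewall=1 \
    --rootfs local:10 --unprivileged 1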

The container was created successfully. You can connect to LXC containers with the pct enter command; I also added the hypervisor's SSH key beforehand so that I can connect directly via SSH (pct has some issues with terminal display). I prepared the server and installed all the necessary applications; now we can move on to setting up replication.

Click on the LXC container and go to the “Replication” tab, where we create a replication job using the “Add” button:


Image 12. Creating replication in the Proxmox interface


Image 13. Replication job creation window

I created a job that replicates the container to the second node. As you can see in the next screenshot, the replication succeeded: pay attention to the “Status” field, which reports the replication state, and to the “Duration” field, which shows how long the data replication takes.


Image 14. VM synchronization list
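
Replication jobs can also be created and monitored from the shell with pvesr - a sketch; the job ID 100-0 (container 100, job 0) and the 15-minute schedule are examples:

pvesr create-local-job 100-0 proxmox2 --schedule "*/15"
pvesr status        # lists jobs with their last sync time and duration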

Now let's try to migrate the machine to the second node using the “Migrate” button.

The container migration will start; its log can be viewed in the task list, where our migration will appear. After that, the container will be moved to the second node.
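
The same migration can also be started from the shell (100 is the container ID from the example above; --restart briefly stops the container and starts it again on the target node):

pct migrate 100 proxmox2 --restart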

“Host Key Verification Failed” Error

Sometimes this problem arises when configuring a cluster - it prevents machines from migrating and replication jobs from being created, which negates the advantages of a cluster. To fix the error, delete the known_hosts file and connect via SSH to the conflicting node:

/usr/bin/ssh -o 'HostKeyAlias=proxmox2' root@172.30.0.16

Accept the host key and try running this command; it should connect you to the server:

/usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=proxmox2' root@172.30.0.16

Features of network settings on Hetzner


Go to the Robot panel and click the “Virtual Switches” button. On the next page you will see a panel for creating and managing Virtual Switch interfaces: first create a switch, then “connect” your dedicated servers to it. Use the search to add the servers you want to connect - they do not need to be rebooted; you just have to wait 10-15 minutes until the connection to the Virtual Switch becomes active.

After adding the servers to the Virtual Switch through the web panel, connect to the servers and open the network interface configuration files, where we create a new interface:

auto enp4s0.4000
iface enp4s0.4000 inet static
        address  10.1.0.11/24
        mtu 1400
        vlan-raw-device enp4s0

Let's take a closer look at what this is. Essentially it is a VLAN attached to the physical interface enp4s0 (yours may be named differently); the VLAN number (4000 here) is the number of the Virtual Switch you created in the Hetzner Robot web panel. You can use any address, as long as it is a private one.

Note that enp4s0 itself should be configured as usual; it should carry the external IP address assigned to your physical server. Repeat these steps on the other hypervisors, restart the networking service on them, and then ping a neighboring node using its Virtual Switch IP address. If the ping succeeds, you have established a connection between the servers over the Virtual Switch.

I will also attach my sysctl.conf configuration; it will come in handy if you have problems with packet forwarding or other network parameters:

net.ipv6.conf.all.disable_ipv6=0
net.ipv6.conf.default.disable_ipv6 = 0
net.ipv6.conf.all.forwarding=1
net.ipv4.conf.all.rp_filter=1
net.ipv4.tcp_syncookies=1
net.ipv4.ip_forward=1
net.ipv4.conf.all.send_redirects=0

Adding IPv4 subnets to Hetzner

Before starting, you need to order a subnet from Hetzner; this can be done through the Robot panel.

Create a network bridge with an address from this subnet. Configuration example:

auto vmbr2
iface vmbr2 inet static
        address ip-address
        netmask 29
        bridge-ports none
        bridge-stp off
        bridge-fd 0

Now go to the settings of the virtual machine in Proxmox and create a new network interface that will be attached to the vmbr2 bridge. I use the LXC container, its configuration can be changed immediately in Proxmox. Final configuration for Debian:

auto eth0
iface eth0 inet static
        address ip-address
        netmask 26
        gateway bridge-address

Please note: I specified a /26 netmask, not /29 - this is required for the network to work inside the virtual machine.

Adding IPv4 Addresses to Hetzner

The situation with a single IP address is different - Hetzner usually gives us an additional address from the server's own subnet. That means that instead of vmbr2 we need to use vmbr0, which we do not have yet. The point is that vmbr0 must carry the IP address of the physical server (i.e. the address that used to sit on the physical interface enp2s0). The address has to be moved to vmbr0, and the following configuration does that (I advise ordering a KVM console so you can restore the network if something goes wrong):

auto enp2s0
iface enp2s0 inet manual
auto vmbr0
iface vmbr0 inet static
        address  ip-address
        netmask  255.255.255.192
        gateway  ip-gateway
        bridge-ports enp2s0
        bridge-stp off
        bridge-fd 0

Restart the server, if possible (if not, restart the networking service), and then check the network interfaces via ip a:

2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UP group default qlen 1000
    link/ether 44:8a:5b:2c:30:c2 brd ff:ff:ff:ff:ff:ff

As you can see here, enp2s0 is connected to vmbr0 and does not have an IP address, since it was reassigned to vmbr0.

Now in the settings of the virtual machine, add the network interface that will be connected to vmbr0. For the gateway, specify the address attached to vmbr0.
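
A sketch of such a guest interface, following the scheme just described (the addresses are placeholders; the netmask mirrors the /26 used on vmbr0):

auto eth0
iface eth0 inet static
        address additional-ip-address
        netmask 255.255.255.192
        gateway vmbr0-ip-address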

In conclusion


I hope this article comes in handy when you set up a Proxmox cluster at Hetzner. If time permits, I will expand the article and add instructions for OVH - there, too, not everything is as obvious as it seems at first glance. The material turned out to be quite long; if you find mistakes, please write in the comments and I will correct them. Thank you all for your attention.

Posted by Ilya Andreev, edited by Alexei Zhadan and Live Linux Team
