
Cluster storage: Pacemaker + DRBD (dual primary) + Samba
- Tutorial
Continuing the article “Cluster storage: Pacemaker + DRBD (dual primary) + ctdb”, here is a fully finished and working HA file-share cluster for 2-4 nodes on CentOS 6 and CentOS 7. If you want to build this, you are either a masochist or you were given no choice and have to implement it somehow.
I’ll briefly describe the layer cake we are going to assemble:
On the block device we create a GPT partition table => a single partition over the whole space for LVM => an LVM volume group over all available space => an LVM volume over all available space => a DRBD device => DLM => mark it as an LVM physical volume over all available space => a clustered LVM volume group on top of it => an LVM volume over all available space => format it as GFS2 => attach it to the mount point.
All of this will be driven by Pacemaker together with a virtual IP address.

If you still want to continue, read on under the cut.
To start with, we need the following:
CPU: 1 core
RAM: 1 GB minimum
Disk: 15 GB + the space where you are going to store data
Any number of disks will do, even one.
If you have a single disk, it is better to partition it as follows: GPT partition table => a 200 MB partition for EFI (optional) => a 1 GB partition for /boot => everything else as LVM space.
In the LVM space, create two volume groups. The first volume group, for the OS, should be 10 GB plus twice the size of RAM, with no more than 4 GB of that going to swap.
Whatever anyone says, swap sometimes helps a lot, so in that volume group create an LVM volume for swap equal to twice the size of RAM, but no larger than 4 GB, and give the remaining space to the OS root.
The second LVM volume group is for data storage. Create an LVM volume over all of its remaining space. A rough sketch of this layout is shown below.
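As an illustration of that single-disk layout, here is a sketch only: the device name /dev/sda, the volume group names vg_os and vg_data and the assumption of 1 GB RAM are mine, not part of the original setup.
parted -s /dev/sda mklabel gpt
parted -s /dev/sda mkpart esp fat32 1MiB 201MiB      # 200 MB for EFI (optional)
parted -s /dev/sda set 1 boot on
parted -s /dev/sda mkpart boot ext4 201MiB 1225MiB   # 1 GB for /boot
parted -s /dev/sda mkpart os 1225MiB 13GiB           # OS volume group: 10 GB + 2 GB swap
parted -s /dev/sda set 3 lvm on
parted -s /dev/sda mkpart data 13GiB 100%            # everything else: data volume group
parted -s /dev/sda set 4 lvm on
vgcreate vg_os /dev/sda3
lvcreate -L 2G -n swap vg_os                         # swap = 2x RAM, but no more than 4 GB
lvcreate -l 100%FREE -n root vg_os
vgcreate vg_data /dev/sda4
lvcreate -l 100%FREE -n data vg_data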
Under the given constraints we had two virtual machines, and that was it. For correct operation Ceph really wants 6 nodes, 4 at the very least, and it helps to already have some experience with it, otherwise it will work out the way it did for CloudMouse. Performance-wise, Gluster is a non-starter for hundreds of thousands of small files; this has been chewed over on Habr many times. IPFS, Lustre and the like have the same requirements as Ceph, or even higher.
Let's get down to business! I had two virtual machines running CentOS 7, each with two disks.
1) Pacemaker version 1.1 does not work correctly with IP addresses, so for reliability we add hostname entries to /etc/hosts:
192.168.0.1 node1
192.168.0.2 node2
2) DRBD is not in the standard repositories, so we need to add a third-party one (ELRepo).
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
yum localinstall -y http://ftp.nluug.nl/os/Linux/distr/elrepo/elrepo/el7/x86_64/RPMS/$(curl -s http://ftp.nluug.nl/os/Linux/distr/elrepo/elrepo/el7/x86_64/RPMS/ | grep -oP ">elrepo-release.*rpm" | cut -c 2-)
3) Install drbd version 8.4
yum install -y kmod-drbd84 drbd84-utils
4) Load the drbd kernel module and enable it at boot
modprobe drbd
echo drbd > /etc/modules-load.d/drbd.conf
5) Create a disk partition and configure lvm
echo -e "g\nn\n\n\n\nt\n8e\nw\n" | fdisk /dev/sdb
vgcreate drbd_vg /dev/sdb1
lvcreate -l +100%FREE --name r0 drbd_vg
6) Create the DRBD resource configuration file /etc/drbd.d/r0.res
resource r0 {
    protocol C;
    device /dev/drbd1;
    meta-disk internal;
    disk /dev/mapper/drbd_vg-r0;
    net {
        allow-two-primaries;
    }
    disk {
        fencing resource-and-stonith;
    }
    handlers {
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
    startup {
        become-primary-on both;
    }
    on node1 {
        address 192.168.0.1:7788;
    }
    on node2 {
        address 192.168.0.2:7788;
    }
}
7) Remove the drbd service from autostart (Pacemaker will be responsible for it later), create metadata for the DRBD disk and bring the resource up
systemctl disable drbd
drbdadm create-md r0
drbdadm up r0
8) On the first node, make the resource primary
drbdadm primary --force r0
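Before going further it is worth making sure the peers see each other and the initial sync is running. Nothing exotic, just the standard DRBD 8.4 status commands, run on either node:
cat /proc/drbd        # expect cs:Connected and ro:Primary/Secondary; ds becomes UpToDate once synced
drbdadm cstate r0     # connection state of the resource
drbdadm dstate r0     # disk state of the local and peer backing device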
9) Install Pacemaker
yum install -y pacemaker corosync pcs resource-agents fence-agents-all
10) Set a password for the hacluster user for authorization on the nodes
echo CHANGEME | passwd --stdin hacluster
11) Run pcsd on both nodes
systemctl enable pcsd
systemctl start pcsd
12) Authenticate the cluster nodes. From this point on, we do everything on one node
pcs cluster auth node1 node2 -u hacluster -p CHANGEME --force
13) Create a cluster named samba_cluster
pcs cluster setup --force --name samba_cluster node1 node2
14) Enable the nodes, add the services to autostart and start them
pcs cluster enable --all
pcs cluster start --all
systemctl start corosync pcsd pacemaker
systemctl enable corosync pcsd pacemaker
15) Since our servers are virtual machines and we have no means of fencing them, we disable the STONITH mechanism. We also have only 2 machines, so we disable quorum as well; it only makes sense with 3 or more machines.
pcs property set stonith-enabled=false
pcs property set no-quorum-policy=ignore
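A quick check that both properties took effect and the nodes are online (standard pcs commands, exact output varies by version):
pcs property list     # should show stonith-enabled: false and no-quorum-policy: ignore
pcs status            # both node1 and node2 should be reported as Online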
16) Create VIP
pcs resource create virtual_ip ocf:heartbeat:IPaddr2 ip=192.168.0.10 cidr_netmask=32 nic=eth0 clusterip_hash=sourceip-sourceport op monitor interval=1s
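To see where the VIP actually landed, standard tooling is enough:
pcs status resources          # shows the node on which virtual_ip is Started
ip -4 addr show dev eth0      # on that node, 192.168.0.10 should appear on eth0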
17) Create a drbd resource
pcs resource create DRBD1 ocf:linbit:drbd drbd_resource=r0 op monitor interval=60s master master-max=2 master-node-max=1 clone-node-max=1 clone-max=2 notify=true op start interval=0s timeout=240 promote interval=0s timeout=130 monitor interval=150s role=Master monitor interval=155s role=Slave
18) Install the packages needed for CLVM and prepare CLVM
yum install -y lvm2-cluster gfs2-utils
/sbin/lvmconf --enable-cluster
19) Add the dlm and clvmd resources to Pacemaker
pcs resource create dlm ocf:pacemaker:controld allow_stonith_disabled=true clone meta interleave=true
pcs resource create clvmd ocf:heartbeat:clvm clone meta interleave=true
20) Forbid LVM from writing its cache and clear the existing one. On both nodes
sed -i 's/write_cache_state = 1/write_cache_state = 0/' /etc/lvm/lvm.conf
rm /etc/lvm/cache/*
21) Create the CLVM volume. Do this on one node only
vgcreate -A y -c y cl_vg /dev/drbd1
lvcreate -l 100%FREE -n r0 cl_vg
22) Format the volume as GFS2; here it is important that the lock table name matches the name of our cluster in Pacemaker. Do this on one node only
mkfs.gfs2 -j 2 -p lock_dlm -t samba_cluster:r0 /dev/cl_vg/r0
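If you want to double-check that the lock table really matches the cluster name, the GFS2 superblock can be inspected; as far as I remember, tunegfs2 from gfs2-utils prints it:
tunegfs2 -l /dev/cl_vg/r0     # the lock table field should read samba_cluster:r0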
23) Next, add the mount of this volume to Pacemaker and tell it to start after clvmd
pcs resource create fs ocf:heartbeat:Filesystem device="/dev/cl_vg/r0" directory="/mnt" fstype="gfs2" clone interleave=true
24) Now it is CTDB's turn; it will run Samba
yum install -y samba ctdb cifs-utils
25) Edit the config /etc/ctdb/ctdbd.conf
CTDB_RECOVERY_LOCK="/mnt/ctdb/.ctdb.lock"
CTDB_NODES=/etc/ctdb/nodes
CTDB_MANAGES_SAMBA=yes
CTDB_LOGGING=file:/var/log/ctdb.log
CTDB_DEBUGLEVEL=NOTICE
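The Samba side is not shown here, but with CTDB_MANAGES_SAMBA=yes it is CTDB that starts smbd, and smb.conf must have clustering enabled, with the share living on the GFS2 mount. A minimal sketch for both nodes (the share name "share" and its options are my own example, not part of the original setup):
cat > /etc/samba/smb.conf <<'EOF'
[global]
    # clustering is required when smbd is managed by CTDB
    clustering = yes
    security = user

[share]
    # the GFS2 mount added to Pacemaker in step 23
    path = /mnt
    writable = yes
    browsable = yes
EOF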
26) Create the file with the list of nodes, /etc/ctdb/nodes
ATTENTION! There must be a line feed after each address in the list, including the last one. Otherwise the node will fail to start during initialization.
192.168.0.1
192.168.0.2
27) Finally, create the ctdb resource
pcs resource create samba systemd:ctdb clone meta interleave=true
28) Set the startup order and the dependencies between resources
pcs constraint colocation add dlm-clone with DRBD1-master
pcs constraint colocation add clvmd-clone with dlm-clone
pcs constraint colocation add fs-clone with clvmd-clone
pcs constraint colocation add samba-clone with fs-clone
pcs constraint colocation add virtual_ip with samba-clone
pcs constraint order promote DRBD1-master then dlm-clone
pcs constraint order start dlm-clone then clvmd-clone
pcs constraint order start clvmd-clone then fs-clone
pcs constraint order start fs-clone then samba-clone
29) Set the stop order for the resources; without this, your machine may hang at shutdown
pcs constraint order stop fs-clone then stop clvmd-clone
pcs constraint order stop clvmd-clone then stop dlm-clone
pcs constraint order stop dlm-clone then stop DRBD1-master
pcs constraint order stop samba-clone then stop fs-clone
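Once everything is up, a sanity check and a test mount through the VIP do not hurt; the share name and user below are whatever you configured in Samba, so treat them as placeholders:
pcs status                    # all clones Started on both nodes, virtual_ip Started on one of them
pcs constraint                # lists the colocation and ordering rules from steps 28-29
mkdir -p /mnt/test
mount -t cifs //192.168.0.10/share /mnt/test -o username=smbuser    # from a client machine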
PS
The share itself can be served over NFS or over Samba, but client connections to it fail over only by IP, even though the storage itself is HA. If you want full HA, then instead of Samba or NFS you need to export iSCSI and connect via multipath. Besides, you can catch a split brain if one of the nodes dies and, when it comes back up, there is no master around. I checked that if the OS shuts down cleanly, then after the node comes back up with no master present it goes into Outdated state and does not become master, precisely to avoid split brain. Quorum options (in DRBD and/or Pacemaker) and any contortions with cascading DRBD setups are not viable on top of this configuration because of their complexity; another admin would take a long time to untangle them. Not that what I have described is much better, so don't do it this way.
Links:
There is a similar instruction with syntax for pacemaker 1.0.