Cluster storage Pacemaker + DRBD (Dual primary) + samba

In continuation of the article “Pacemaker Cluster Storage + DRBD (Dual primary) + ctdb”, I present a fully finished and working version of an HA clustered file share for 2–4 nodes on CentOS 6 and CentOS 7. If you want to implement this, you are either a pervert or you were given no choice and have to implement it somehow.

I will briefly describe the layer cake that we are going to assemble:

On the block device, create a GPT partition table => one partition over the entire space for LVM => an LVM volume group over all available space => an LVM volume over all available space => a DRBD device => DLM => mark it as an LVM physical volume over all available space => a clustered LVM volume group on top of it => an LVM volume over all available space => format it as GFS2 => attach it to a mount point.
And all of this will be managed by Pacemaker together with a virtual IP address.


If you still want to continue, read on under the cut.

As a starting point we need the following:
1 CPU core
1 GB of RAM minimum
15 GB of disk + the space where you will store your data
There can be any number of disks, even just one.

If you have a single disk, it is better to partition it as follows: GPT partition table => 200 MB partition for EFI (optional) => 1 GB partition for /boot => everything else as LVM space.

In the LVM space you need to create two volume groups. The first volume group, for the OS, is 10 GB plus twice the amount of RAM (but no more than 4 GB).

Say what you will, but swap sometimes helps a lot, so in this volume group we create an LVM volume for swap equal to twice the amount of RAM, but no more than 4 GB, and the remaining space goes to the OS root.

The second LVM volume group is for data storage. In it, create an LVM volume over the remaining space.
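As a rough sketch of the LVM side of this layout (the partition names /dev/sda3 and /dev/sda4, 4 GB of RAM, and the group names vg_os and vg_data are my assumptions, not part of the original setup), it could look like this:

vgcreate vg_os /dev/sda3                # ~14 GB partition: root + swap
lvcreate -L 4G -n swap vg_os            # swap = 2x RAM, capped at 4 GB
lvcreate -l 100%FREE -n root vg_os      # the rest goes to the OS root
vgcreate vg_data /dev/sda4              # the rest of the disk
lvcreate -l 100%FREE -n data vg_data    # data volume for storage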

Under the given conditions we were handed two virtual machines and nothing more. For correct operation Ceph should be deployed on 6 nodes, 4 at a minimum, and it also helps to have some experience with it, otherwise it will work about as well as CloudMouse did. Gluster will not cope performance-wise with hundreds of thousands of small files; this has been taken apart on Habr many times. IPFS, Lustre and the like have the same requirements as Ceph, or even higher.

Let's get down to business! I had two virtual machines on CentOS 7, each with two disks.


1) Pacemaker version 1.1 does not work correctly with IP addresses, so for reliability we add entries to /etc/hosts:

192.168.0.1 node1
192.168.0.2 node2

2) DRBD is not in the standard repositories, so we need to add a third-party one.

rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
yum localinstall -y http://ftp.nluug.nl/os/Linux/distr/elrepo/elrepo/el7/x86_64/RPMS/$(curl -s http://ftp.nluug.nl/os/Linux/distr/elrepo/elrepo/el7/x86_64/RPMS/ | grep -oP ">elrepo-release.*rpm" | cut -c 2-)

3) Install drbd version 8.4

yum install -y kmod-drbd84 drbd84-utils

4) Load the drbd kernel module and enable it at startup

modprobe drbd
echo drbd > /etc/modules-load.d/drbd.conf
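
To make sure the module has actually loaded, a quick check (my addition, not part of the original steps):

lsmod | grep drbd
cat /proc/drbd                # the first line shows the loaded drbd version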

5) Create a disk partition and configure lvm

echo -e "g\nn\n\n\n\nt\n8e\nw\n" | fdisk /dev/sdb
vgcreate drbd_vg /dev/sdb1
lvcreate -l +100%FREE --name r0 drbd_vg

6) Create the configuration file for the resource drbd /etc/drbd.d/r0.res

resource r0 {
  protocol C;
  device /dev/drbd1;
  meta-disk internal;
  disk /dev/mapper/drbd_vg-r0;
  net {
    allow-two-primaries;
  }
  disk {
    fencing resource-and-stonith;
  }
  handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
  startup { 
    become-primary-on both;
  }
  on node1 {
    address 192.168.0.1:7788;
  }
  on node2 {
    address 192.168.0.2:7788;
  }
}

7) Remove the drbd service from autostart (Pacemaker will be responsible for it later), create the metadata for the DRBD disk, and bring the resource up. On both nodes:

systemctl disable drbd
drbdadm create-md r0
drbdadm up r0

8) On the first node, make the resource primary

drbdadm primary --force r0
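
After forcing the primary, the initial sync starts; its progress can be watched like this (my addition):

watch -n1 cat /proc/drbd      # wait until both sides report UpToDate/UpToDate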

9) Install Pacemaker

yum install -y pacemaker corosync pcs resource-agents fence-agents-all

10) Set a password for the hacluster user for authorization on the nodes

echo CHANGEME | passwd --stdin hacluster 

11) Run pcsd on both nodes

systemctl enable pcsd
systemctl start pcsd

12) Authorize the nodes in the cluster. From this point on we do everything on one node

pcs cluster auth node1 node2 -u hacluster -p CHANGEME --force 

13) Create a cluster named samba_cluster

pcs cluster setup --force --name samba_cluster node1 node2

14) Activate the nodes, add the services to startup, and start them

pcs cluster enable --all
pcs cluster start --all
systemctl start corosync pcsd pacemaker
systemctl enable corosync pcsd pacemaker
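
At this point the empty cluster should already be up; a quick sanity check (my addition):

pcs status
corosync-cfgtool -s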

15) Since our servers are virtual machines, we disable the STONITH mechanism, because we have no way of fencing them. We also have only two machines, so we also disable quorum; it only makes sense with three or more machines.

pcs property set stonith-enabled=false
pcs property set no-quorum-policy=ignore

16) Create VIP

pcs resource create virtual_ip ocf:heartbeat:IPaddr2 ip=192.168.0.10 cidr_netmask=32 nic=eth0 clusterip_hash=sourceip-sourceport op monitor interval=1s

17) Create a drbd resource

pcs resource create DRBD1 ocf:linbit:drbd drbd_resource=r0 op monitor interval=60s master master-max=2 master-node-max=1 clone-node-max=1 clone-max=2 notify=true op start interval=0s timeout=240 promote interval=0s timeout=130 monitor interval=150s role=Master monitor interval=155s role=Slave

18) Install the necessary packages for clvm and prepare clvm

yum install -y lvm2-cluster gfs2-utils
/sbin/lvmconf --enable-cluster
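
lvmconf --enable-cluster simply switches LVM to clustered locking; if you want to verify it, check lvm.conf (my addition):

grep locking_type /etc/lvm/lvm.conf      # should now show locking_type = 3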

19) Add the dlm and clvmd resources to Pacemaker

pcs resource create dlm ocf:pacemaker:controld allow_stonith_disabled=true clone meta interleave=true
pcs resource create clvmd ocf:heartbeat:clvm clone meta interleave=true

20) Prohibit LVM from writing its cache and clear the existing cache. On both nodes

sed -i 's/write_cache_state = 1/write_cache_state = 0/' /etc/lvm/lvm.conf
rm /etc/lvm/cache/*


21) Create the clustered LVM volume group and volume. Do this on one node only

vgcreate -A y -c y cl_vg /dev/drbd1
lvcreate -l 100%FREE -n r0 cl_vg

22) Format the volume as GFS2; here it is important that the lock table name matches the name of our cluster in Pacemaker. Do this on one node only

mkfs.gfs2 -j 2 -p lock_dlm -t samba_cluster:r0 /dev/cl_vg/r0
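
Note that -j 2 creates two journals, one for each node that will mount the filesystem. For the 3- or 4-node variant of this setup the journal count has to match the node count; as an illustration (not from the original article), journals can also be added later with gfs2_jadd:

mkfs.gfs2 -j 4 -p lock_dlm -t samba_cluster:r0 /dev/cl_vg/r0    # example for 4 nodes
gfs2_jadd -j 2 /mnt                                             # or add 2 more journals to a mounted fs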

23) Next, add the mount of this volume to Pacemaker and tell it to start after clvmd

pcs resource create fs ocf:heartbeat:Filesystem device="/dev/cl_vg/r0" directory="/mnt" fstype="gfs2" clone interleave=true

24) Now it is CTDB's turn, which will run Samba

yum install -y samba ctdb cifs-utils

25) Edit the config /etc/ctdb/ctdbd.conf

CTDB_RECOVERY_LOCK="/mnt/ctdb/.ctdb.lock"
CTDB_NODES=/etc/ctdb/nodes 
CTDB_MANAGES_SAMBA=yes
CTDB_LOGGING=file:/var/log/ctdb.log
CTDB_DEBUGLEVEL=NOTICE
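
The article does not show the Samba configuration itself. As a minimal sketch (the share name and the exact options are my assumptions), /etc/samba/smb.conf on both nodes could look roughly like this; the important part is clustering = yes, so that smbd works through CTDB:

[global]
    clustering = yes
    security = user

[share]
    path = /mnt
    writable = yes
    browsable = yes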

26) Create a file with the list of nodes, /etc/ctdb/nodes
ATTENTION! Every address in the list must be followed by a newline, including the last one. Otherwise the node will fail to come up during initialization. A sketch for generating the file safely follows the list.

192.168.0.1
192.168.0.2
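
An easy way to avoid the missing trailing newline is to generate the file instead of editing it by hand (my addition):

printf '192.168.0.1\n192.168.0.2\n' > /etc/ctdb/nodes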

27) Finally, create the ctdb resource

pcs resource create samba systemd:ctdb clone meta interleave=true

28) Set the startup order and the colocation dependencies between resources

pcs constraint colocation add dlm-clone with DRBD1-master
pcs constraint colocation add clvmd-clone with dlm-clone
pcs constraint colocation add fs-clone with clvmd-clone
pcs constraint colocation add samba-clone with fs-clone
pcs constraint colocation add virtual_ip with samba-clone
pcs constraint order promote DRBD1-master then dlm-clone
pcs constraint order start dlm-clone then clvmd-clone
pcs constraint order start clvmd-clone then fs-clone
pcs constraint order start fs-clone then samba-clone

29) Set the stop order for resources; without this your machine may hang at shutdown

pcs constraint order stop fs-clone then stop clvmd-clone
pcs constraint order stop clvmd-clone then stop dlm-clone
pcs constraint order stop dlm-clone then stop DRBD1-master
pcs constraint order stop samba-clone then stop fs-clone
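
Once everything is configured, the resulting picture can be checked with (my addition):

pcs status --full
pcs constraint
cat /proc/drbd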

P.S.

The share itself can be served over NFS just as well as over Samba, but clients fail over to it only by IP, even though the storage itself is HA. If you want full HA, then instead of Samba and NFS you need to set up iSCSI and connect via multipath. In addition, you can get a split-brain if one of the nodes dies hard; a node that comes back up while there is no master will not become master by itself. I checked that if the OS shuts down correctly, then after the node is brought back up while there is no master, it goes into an Outdated state and does not become master, precisely to avoid split-brain. Quorum options (for DRBD and/or Pacemaker) and any exotic cascaded DRBD constructions on top of this configuration are not worth it because of their complexity: another admin would take a long time to figure them out. Although what I have described here is not much better, so don't do this either.

Links:

There is a similar instruction with syntax for pacemaker 1.0.
