HA cluster, network-based file systems
What this is about: I built a high-availability cluster on two nodes using Heartbeat, serving a web stack (Apache, nginx, PHP, MySQL). This is not a guide to setting up such a cluster, but rather notes on using clustered file systems: the things that are missing from the usual articles, plus a description of the pitfalls I stepped on.
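For context, a minimal sketch of what the Heartbeat side of such a setup could look like (the node name, DRBD resource name r0, device, mount point and virtual IP here are assumptions for illustration, not the actual configuration used):

# /etc/ha.d/haresources (heartbeat v1 style, hypothetical values)
# node1 owns the resources by default: promote DRBD resource r0, mount it,
# bring up the service IP and start Apache
node1 drbddisk::r0 Filesystem::/dev/drbd1::/var/www::ext3 IPaddr::192.168.0.100/24/eth0 apache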
First, what is missing from the description of DRBD configuration at http://www.opennet.ru/base/sys/drbd_setup.txt.html:
To set up DRBD on an existing partition you first have to shrink the file system on it (to leave room for DRBD's internal metadata); use:
resize2fs <path_to_partition> <desired_size>
where the size itself is given in kilobytes (e.g. 1000K).
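A concrete sequence might look like this (the device name, target size and resource name r0 are hypothetical; the point is to shrink the file system, create the DRBD metadata and bring the resource up):

# shrink the file system so DRBD's internal metadata fits at the end of the device
resize2fs /dev/sda3 71000000K
# create the metadata and attach/connect the resource (assuming it is named r0 in drbd.conf)
drbdadm create-md r0
drbdadm up r0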
Once the device is brought up, /proc/drbd reports a state like this:
cs:Connected st:Secondary/Secondary ds:Inconsistent/Inconsistent
When you then try to promote the resource with
drbdadm primary <resource_name>
it refuses with an error like:
/dev/drbd1: State change failed: (-2) Refusing to be Primary without at least one UpToDate disk
Command 'drbdsetup /dev/drbd1 primary' terminated with exit code 17
Older articles recommend forcing it with
drbdadm -- --do-what-I-say primary <resource_name>
but DRBD no longer understands that option, and the man page says nothing about it; the right way is:
drbdadm -- --overwrite-data-of-peer primary <resource_name>
Then happiness arrives: the disk starts syncing, which shows up in /proc/drbd like this:
1: cs:SyncSource st:Primary/Secondary ds:UpToDate/Inconsistent C r---
ns:225808 nr:0 dw:0 dr:225808 al:0 bm:13 lo:0 pe:895 ua:0 ap:0
[>....................] sync'ed: 0.4% (71460/71676)M
finish: 0:27:25 speed: 44,444 (44,444) K/sec
resync: used:0/61 hits:111995 misses:14 starving:0 dirty:0 changed:14
act_log: used:0/127 hits:0 misses:0 starving:0 dirty:0 changed:0
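For reference, the resource definition behind the commands above might look roughly like this (host names, addresses, devices and the resource name r0 are assumptions, not the configuration actually used here):

resource r0 {
  protocol C;
  on node1 {
    device    /dev/drbd1;
    disk      /dev/sda3;
    address   192.168.0.1:7789;
    meta-disk internal;
  }
  on node2 {
    device    /dev/drbd1;
    disk      /dev/sda3;
    address   192.168.0.2:7789;
    meta-disk internal;
  }
}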
Now, about GlusterFS. A wonderful file system... at first glance: replication looks like master-master, there are lots of add-on translators that can be combined, and the main difference from DRBD is that the volume can be mounted on all nodes at the same time.

Gotcha number 1 (according to the developers it should be fixed in version 2.0.1; I have not checked whether it really is): storing a MySQL database on a GlusterFS volume is contraindicated. MySQL puts locks on the database files and does not release them immediately when it shuts down or, for example, when a node dies. When MySQL on the second node then tries to work with that database, the whole node locks up because of the glusterfsd server process, and as a result the cluster is not functional at all.
Gotcha number 2: performance. I cannot speak for other configurations, but on a two-node replicated volume with the configuration from the example on the Gluster site, Apache throughput (all of www lives on the Gluster volume) drops by up to 10 times. I measured with the ab utility at 10 concurrent requests. After long experiments with the configs, the best client config for my case (where the two nodes each export their own brick) turned out to be the following. In the stock example both bricks are attached over the network, then combined into a mirror, and the cache and threads translators are stacked on top of that mirror; with that layout performance is 10 times worse than Apache working with the disks directly. The reworked config looks like this: the current node's brick is attached through posix (exactly as the server attaches it), the remote brick over the network as in the example, a cache translator is put on the remote brick only, and the mirror is then assembled from the cached remote brick and the local brick (see the sketch after this paragraph). Threads translators only slow things down, read-ahead gives no measurable gain, and write-behind was not needed in my case, since writes are very rare. With this layout the performance loss relative to working with a local partition is only about 50%. Even so, I ended up dropping GlusterFS in favour of a second DRBD resource (the first one is set up for MySQL, the second is mounted on the second node for Apache). I will also note that in straight read tests GlusterFS shows almost no difference from a local file system, but in my case... alas.
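A sketch of the client volfile layout described above, in GlusterFS 2.0-era syntax (the directory, remote host address and volume names are hypothetical, not the actual config):

# local brick, attached directly through the posix translator, just as the server does
volume local-brick
  type storage/posix
  option directory /mnt/brick
end-volume

# remote brick, attached over the network
volume remote-brick
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.0.2
  option remote-subvolume brick
end-volume

# the cache goes on the remote brick only
volume remote-cached
  type performance/io-cache
  option cache-size 64MB
  subvolumes remote-brick
end-volume

# the mirror is assembled from the cached remote brick and the local brick
volume mirror
  type cluster/replicate
  subvolumes local-brick remote-cached
end-volume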