CentOS 6: migrating from md raid1 to md raid10

Good day.
There is a server running CentOS 6 with a high %iowait. The system has to be migrated from md raid1 to md raid10 without downtime, or with at most 10 minutes of it, preferably at night. In my case it really was possible to stay within 10 minutes, or even less, since the server has hot-swap drive bays; however, the engineers of the data center where the server is rented had originally connected the drives to the second and third bays instead of the first and second. Because of that I had to power the server off, swap the old disks around and install the two new ones.

Before starting, I will mention one detail without which this migration is impossible: the entire system, except for /boot, must be installed on LVM. The /boot mount point sits on the md0 array, because GRUB 0.97 cannot boot from LVM.

So, the server is up and it has four disks. The /dev/md0 array holds /boot, and /dev/md1 is the LVM physical volume the system lives on.
# ls -l /dev/vd*
brw-rw----. 1 root disk 252,  0 Mar 23 22:34 /dev/vda
brw-rw----. 1 root disk 252,  1 Mar 23 22:34 /dev/vda1
brw-rw----. 1 root disk 252,  2 Mar 23 22:34 /dev/vda2
brw-rw----. 1 root disk 252, 16 Mar 23 22:34 /dev/vdb
brw-rw----. 1 root disk 252, 17 Mar 23 22:34 /dev/vdb1
brw-rw----. 1 root disk 252, 18 Mar 23 22:34 /dev/vdb2
brw-rw----. 1 root disk 252, 32 Mar 23 22:34 /dev/vdc
brw-rw----. 1 root disk 252, 48 Mar 23 22:34 /dev/vdd
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 vda1[0] vdb1[1]
      204788 blocks super 1.0 [2/2] [UU]
md1 : active raid1 vdb2[1] vda2[0]
      5937144 blocks super 1.1 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk
# df -h
Filesystem                     Size  Used Avail Use% Mounted on
/dev/mapper/vg_vmraid10-root   2.0G  591M  1.3G  32% /
tmpfs                          376M     0  376M   0% /dev/shm
/dev/md0                       194M   33M  151M  18% /boot
/dev/mapper/vg_vmraid10-part  1008M   34M  924M   4% /mnt/part
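
To double-check that everything except /boot really sits on LVM on top of /dev/md1, the physical and logical volumes can be listed, for example:
# pvs -o pv_name,vg_name,pv_size
# lvs vg_vmraid10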

In this example the drives show up as /dev/vdX because a virtual machine with paravirtualized VirtIO disk drivers was used for the demonstration.
First, remove the /dev/vdb2 partition from /dev/md1:
# mdadm /dev/md1 -f /dev/vdb2
mdadm: set /dev/vdb2 faulty in /dev/md1
# mdadm /dev/md1 -r /dev/vdb2
mdadm: hot removed /dev/vdb2 from /dev/md1
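
At this point /dev/md1 keeps running on the single remaining disk; /proc/mdstat should now show it as [2/1] [U_]:
# cat /proc/mdstat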

Next, we need to wipe the partition's md superblock and zero out a small part of the partition. If this is not done, the data on /dev/md1 and on the raid10 partition will, as I understand it, still be considered consistent; because of that there will be problems creating the raid10 array (mdadm will complain that /dev/vdb2 is already in use by /dev/md1, even though we removed it from the array earlier), and after a reboot the system will try to boot not from /dev/md1 but from /dev/md2, which ends in a kernel panic.
# dd if=/dev/zero of=/dev/vdb2 bs=512 count=1
# dd if=/dev/zero of=/dev/vdb2 bs=1M count=100

I know that the second command alone would be enough, but I started out doing it this way and did not want to experiment on a live machine.
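
For what it is worth, mdadm can also wipe its own metadata from the partition; this should accomplish the same thing as the dd above:
# mdadm --zero-superblock /dev/vdb2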

Next, we need to copy the partition table from /dev/vdb to /dev/vdc and /dev/vdd using the sfdisk utility. sfdisk will complain and refuse to do anything because the partition does not start on a cylinder boundary, so we add the -f flag:
# sfdisk -d /dev/vdb | sfdisk -f /dev/vdc
# sfdisk -d /dev/vdb | sfdisk -f /dev/vdd
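
To make sure the copy worked, the new partition tables can be dumped and compared against the source disk:
# sfdisk -d /dev/vdc
# sfdisk -d /dev/vdd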

The partitions are ready, so it is time to create the new raid10 array in degraded mode. I specify my own UUID as extra insurance, because I ran into problems with it during the experiments:
# mdadm --create /dev/md2 --uuid=3846bee5:d9317441:f8fb6391:4c024445 --level=10 --raid-devices=4 --chunk=2048 missing /dev/vd[bcd]2
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md2 started.
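
Before going any further it is worth verifying that the array really came up with three devices out of four:
# mdadm --detail /dev/md2
# cat /proc/mdstat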

Add a line for the new array to /etc/mdadm.conf:
# cat /etc/mdadm.conf
# mdadm.conf written out by anaconda
MAILADDR root
AUTO +imsm +1.x -all
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=7872830a:c480f8c4:ac316f53:c6ea2b52
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=3846bee5:d9317441:f8fb6391:c4024454
ARRAY /dev/md2 level=raid10 num-devices=4 UUID=3846bee5:d9317441:f8fb6391:4c024445
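
If you do not want to type the ARRAY line by hand, mdadm can generate it; the output format differs slightly from the anaconda-written lines (it prints metadata= rather than level= and num-devices=), but it works just as well:
# mdadm --detail --brief /dev/md2 >> /etc/mdadm.conf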

Reboot. During boot we can see that the /dev/md2 array comes up in degraded mode:
md/raid10:md2: active with 3 out of 4 devices

Create a physical volume on the newly made array. Here I also specify my own UUID, to avoid ending up with a duplicate UUID:
# pvcreate --uuid I0OAVm-27U4-KFWZ-4lMB-F3r9-X2kx-LnWADB --norestorefile /dev/md2
  Writing physical volume data to disk "/dev/md2"
  Physical volume "/dev/md2" successfully created
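
The UUID can be any valid one as long as it does not collide with an existing physical volume; the PV UUIDs already in use can be listed, for example, with:
# pvs -o pv_name,vg_name,pv_uuid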

Now extend the vg_vmraid10 volume group:
# vgextend vg_vmraid10 /dev/md2
  Volume group "vg_vmraid10" successfully extended

In LVM, data on a physical volume is stored in blocks called physical extents (PEs). These PEs can be moved between physical volumes, which is exactly what we need to do now:
# pvmove /dev/md1 /dev/md2
...
  /dev/md1: Moved: 100.0%
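
Once the move finishes, /dev/md1 should report all of its space as free and /dev/md2 as mostly allocated:
# pvs -o pv_name,pv_size,pv_free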

Now open /boot/grub/menu.lst and, on the kernel command line, change the rd_MD_UUID parameter to the UUID of the /dev/md2 array, in my case 3846bee5:d9317441:f8fb6391:4c024445. If this is not done, the system will try to find the root partition on /dev/md1, which is now an empty volume. It is also a good idea to add the panic=10 parameter to this line; it is useful for remote work and makes the machine reboot automatically 10 seconds after a kernel panic. A rough example of the resulting kernel line is shown below. After that, we restart the server.
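
For reference, after the edit the menu.lst entry for this example setup would look roughly like the following; the kernel version here is only a placeholder, and any other rd_* and console parameters from your original line should be kept as they are:
title CentOS (2.6.32-xxx.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-xxx.el6.x86_64 ro root=/dev/mapper/vg_vmraid10-root rd_LVM_LV=vg_vmraid10/root rd_MD_UUID=3846bee5:d9317441:f8fb6391:4c024445 panic=10
        initrd /initramfs-2.6.32-xxx.el6.x86_64.img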
# reboot

And here a surprise awaits us, and so far I have not figured out why it happens. After the reboot, the /dev/md2 array gets renamed to /dev/md127, but the system still boots normally because the array is found by UUID. As far as I know, this is related to mdadm version 3.0 and higher, where you can create partitionable arrays and give md arrays names. If anyone knows why this happens, please write in the comments. In the meantime, all I can do is correct the array number in /etc/mdadm.conf and delete the line responsible for /dev/md1.
Next, remove /dev/md1 from the volume group and from the list of physical volumes. Then stop /dev/md1, wipe the superblock on /dev/vda2 and add that partition to /dev/md127:

# vgreduce vg_vmraid10 /dev/md1
  Removed "/dev/md1" from volume group "vg_vmraid10"
# pvremove /dev/md1
  Labels on physical volume "/dev/md1" successfully wiped
# mdadm -S /dev/md1
mdadm: stopped /dev/md1
# dd if=/dev/zero of=/dev/vda2 bs=512 count=1
# mdadm /dev/md127 -a /dev/vda2
mdadm: added /dev/vda2
# cat /proc/mdstat
Personalities : [raid10] [raid1]
md0 : active raid1 vda1[0] vdb1[1]
      204788 blocks super 1.0 [2/2] [UU]
md127 : active raid10 vda2[4] vdb2[1] vdc2[2] vdd2[3]
      11870208 blocks super 1.2 2048K chunks 2 near-copies [4/3] [_UUU]
      [>....................]  recovery =  1.1% (68352/5935104) finish=2.8min speed=34176K/sec
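
The rebuild can be watched until the array shows [UUUU], for example with:
# watch -n 5 cat /proc/mdstat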

That's it, the system has been migrated to raid10.
This should work on CentOS 5 as well, except that there you will not have to change the UUID in menu.lst, because that parameter does not exist there. On Debian Squeeze the only difference will be in the bootloader, since it ships GRUB 2.
In conclusion, I decided for myself that it is always better to use LVM, since it makes working with disks far easier: I can grow partitions, move data from one physical disk to another transparently, take snapshots, and so on.
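
For illustration, using the volume group from this article (the snapshot name is arbitrary, and the resize2fs step assumes an ext3/ext4 filesystem on the volume), growing a logical volume and taking a snapshot look roughly like this:
# lvextend -L +1G /dev/vg_vmraid10/part
# resize2fs /dev/vg_vmraid10/part
# lvcreate -s -L 512M -n part_snap /dev/vg_vmraid10/part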
This is my first article; it may have turned out a bit long, but I wanted to describe everything in detail so that it would be clear.
Thank you for reading to the end.
