Removing disks from a live RAID on an Adaptec 6805

Task:


Remove 2 hard drives from a RAID-10 at the logical level (i.e., without physically pulling them out of the server), build a RAID-1 from them, move the system onto it, and get everything ready for the reboot, thereby minimizing the number and duration of downtimes.

What is the difficulty?


On the 5-series Adaptec controllers the problem was solved with two commands:
1. Fail the disk: arcconf setstate 1 device 0 0 ddd
2. Set it to Ready status: arcconf setstate 1 device 0 0 rdy
3. Do whatever we want with the disks.

On the 6-series this no longer works. Whether failover is enabled or not, the disks return to the Present state and nothing can be done with them (obviously, the array itself stays Degraded until the rebuild completes).
An attempt to reach official technical support was fruitless: I did get an answer, but it felt as if they assumed I was dealing with a home box rather than a server that cannot simply be bounced back and forth:

After you run the command “arcconf setstate 1 device 0 0 ddd”, was the system rebooted? If not, reboot and initialize both disks in the controller BIOS; the RAID-1 can be created right there as well.

To erase the metadata on a disk from within arcconf, the disk can be initialized with the arcconf task command. For example: arcconf task start 1 device 0 0 initialize

After that, the disk should be available for creating other logical drives.

However, if you remove two disks from the RAID-10, it will remain in the “Degraded” status. If one of the remaining disks fails, the array may fall apart. So perhaps it is better to simply back up all the data, delete the RAID-10 array, and create two separate RAID-1s.


I gave it some thought and, after a series of experiments, was able to complete the task.



Description:


We have a logical device, a RAID-10 on 4 disks:
 Logical device segment information
   --------------------------------------------------------
   Group 0, Segment 0                       : Present (0,0)             J0VV3R8N
   Group 0, Segment 1                       : Present (0,1)             J0VV3ZBN
   Group 1, Segment 0                       : Present (0,2)             J0VV3YEN
   Group 1, Segment 1                       : Present (0,3)             J0VX2WXN
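
The listing above is the segment section of the logical-device report. Both the logical and physical device states referred to in the steps below can be checked at any time; a minimal sketch, assuming the standard arcconf getconfig subcommands:

# logical devices: RAID level, status, per-segment states
arcconf getconfig 1 ld
# physical devices: per-disk state (Online / Failed / Ready)
arcconf getconfig 1 pd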


We need to pull 2 drives out of it (one from each group) and build a RAID-1 from them.

Solution:


1. Make sure failover is enabled
arcconf failover 1 on
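
To verify, the controller summary can be inspected; a sketch, assuming the failover setting is reported in the AD (adapter) section of the getconfig output:

# look for the automatic failover setting in the adapter summary
arcconf getconfig 1 ad | grep -i failover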


2. Fail 2 disks from different groups
arcconf setstate 1 device 0 0 ddd
arcconf setstate 1 device 0 2 ddd

Disks will become Inconsistent in logicaldevice and Failed in physicaldevice

3. Set these disks to Ready status
arcconf setstate 1 device 0 0 rdy
arcconf setstate 1 device 0 2 rdy

Disks will become Missing in logicaldevice and Ready in physicaldevice

4. Wait until failover starts the rebuild
Group 0, Segment 0                       : Rebuilding (0,0)             J0VV3R8N

They are rebuilt one at a time, so as soon as one of them enters the Rebuilding state, we immediately perform step 5 for it, and then repeat for the next one.
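
To avoid watching the output by hand, a small polling loop can catch the moment a segment flips to Rebuilding; a sketch, assuming the state string appears verbatim in the LD report as shown above:

# poll the logical device report until some segment reports Rebuilding
while ! arcconf getconfig 1 ld | grep -q Rebuilding; do
    sleep 2
done
echo "Rebuilding detected - run step 5 now"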

5. Fail the disks again and very quickly go to step 6
arcconf setstate 1 device 0 0 ddd
arcconf setstate 1 device 0 2 ddd

Disks will become Inconsistent in logicaldevice and Failed in physicaldevice

6. Set the disks to Ready status and very quickly go to step 7
arcconf setstate 1 device 0 0 rdy
arcconf setstate 1 device 0 2 rdy

Disks will become Missing in logicaldevice and Ready in physicaldevice

7. Turn off failover and very quickly go to step 8
arcconf failover 1 off


8. Initialize the disks
arcconf task start 1 device 0 0 initialize
arcconf task start 1 device 0 2 initialize
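
The progress of the initialization (and of the rebuild in step 4) can be followed through the controller's task status; a sketch, assuming the usual getstatus subcommand:

# show running background tasks (rebuild, initialize) and their progress
arcconf getstatus 1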


Hooray, now we can build a RAID-1 from them.
arcconf CREATE 1 LOGICALDRIVE MAX 1 0 0 0 2


The reader may wonder why we performed the same actions twice and why we did not turn off failover right away.
To repeat: the 6-series controllers do not let us safely remove disks from the array while failover is disabled. After the command
arcconf setstate 1 device 0 0 rdy
we would get a drive status of Present in logicaldrive and an array status of Degraded, while in physicaldrive the drive would be Online rather than Ready.
And why do we do everything quickly starting from step 5? It's simple: within a few seconds the controller manages to recover and flip the disk states back, so the commands have to go through before it does (a chained one-liner like the sketch below helps).
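
Since the window before the controller re-engages the disks is only a few seconds, it is safer to chain steps 5-8 into a single command line so there is no typing delay between them. A sketch using the same device numbering as above; if your arcconf build asks for confirmation on setstate, a noprompt option (where supported) keeps the chain non-interactive:

# steps 5-8 back to back; && aborts the chain if any command fails
arcconf setstate 1 device 0 0 ddd && \
arcconf setstate 1 device 0 2 ddd && \
arcconf setstate 1 device 0 0 rdy && \
arcconf setstate 1 device 0 2 rdy && \
arcconf failover 1 off && \
arcconf task start 1 device 0 0 initialize && \
arcconf task start 1 device 0 2 initialize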

I could not find a ready-made solution and had to come up with my own; I hope it will be useful to someone, since I am surely not the only one using the 6-series Adaptec controllers.

UPD: I migrated 10 servers this way and everything went well. The only correction: you can pull only one drive out of the array at a time; then the steps have to be repeated for the second one. If you have managed to eject a disk but the persistent controller keeps trying to pull it back in, just put it into a JBOD, remove the second disk, then take the first one out of the JBOD, and you can create a RAID-1 on the two freed disks.
