What to do if a RAID controller fails?
When we talk about backups, we usually assume that a hard drive may fail at any moment, and that assumption is entirely justified. Unfortunately, the reliability of modern HDDs leaves much to be desired, but drives are not the only thing that can cause data loss.
Naturally, I'm talking about RAID arrays, and specifically about the failure of the RAID controller itself. What do you do in that situation?
In fact, things are not as scary as they might seem at first glance. The RAID configuration is stored on the HDDs that make up the array. It is usually located in the first or last sectors of each disk, and the RAID controller firmware writes it there when the array is created. The configuration block is almost certainly duplicated on every disk in the array: apart from the disk number, the service data on all disks should be identical, and this can be used when restoring the array. Accordingly, all we need to do is take a new controller and connect the disks in the same order in which they were connected to the failed one.
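To make the idea of duplicated service data more concrete, here is a minimal Python sketch that compares the first and last sectors of several disk images and reports the byte offsets at which they differ. Everything in it is an assumption for illustration: the 512-byte sector size, the choice to look at a single sector on each edge, and the function names `read_edge_sectors`, `differing_offsets` and `compare_members` are all hypothetical. Real metadata layouts are vendor-specific and may occupy many more sectors.

```python
import os

SECTOR = 512  # assumed sector size; many modern drives use 4096 instead


def read_edge_sectors(path, sectors=1, sector_size=SECTOR):
    """Return (head, tail): the first and last `sectors` sectors of an image."""
    span = sectors * sector_size
    with open(path, "rb") as f:
        head = f.read(span)
        f.seek(0, os.SEEK_END)       # jump to the end to learn the size
        size = f.tell()
        f.seek(max(size - span, 0))  # step back by one span (or to the start)
        tail = f.read(span)
    return head, tail


def differing_offsets(a, b):
    """Byte offsets at which two equally sized blocks differ."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def compare_members(paths, sectors=1):
    """Compare every member's edge sectors against those of the first member."""
    ref_head, ref_tail = read_edge_sectors(paths[0], sectors)
    report = {}
    for path in paths[1:]:
        head, tail = read_edge_sectors(path, sectors)
        report[path] = {
            "head": differing_offsets(ref_head, head),
            "tail": differing_offsets(ref_tail, tail),
        }
    return report
```

If the edge sectors of two members differ only in a handful of bytes, that would be consistent with a shared configuration block plus a per-disk field such as the disk number. It is safer to run something like this against read-only images of the disks (made with `dd`, for example) than against the live drives.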
But this is all theory; let's check whether it really works that way.
I have 2 servers with integrated RAID controllers:
An HP ProLiant DL380 G7 with a P410i controller.
An IBM x3650 M4 with a ServeRAID M5110e controller.
And I also have 2 controllers:
An LSI Logic 9260-8i.
An Adaptec controller.
The results of the test, on the whole, did not surprise me:
When I replaced the controller in the DL380 with the Adaptec, the new controller saw the array and even tried to boot the system from it (this ended in a kernel panic, because the already installed system had no driver for the new controller). In any case, data integrity was not affected and the data could be recovered. The test was run on both RAID1 and RAID0. With the LSI, the outcome was simpler and sadder: the controller saw the disks but not the array, and a rebuild and other tricks gave no positive result.
With the x3650 the picture was the opposite. Since the M5110e is built on a chip made by LSI Logic, the 9260-8i saw the array after the swap and, just as in the first case, we got our data back safe and sound. The Adaptec, however, refused to recognize the array, and again no tricks helped.
The conclusion from this is that an array's service information is tied to a particular brand of RAID controller. My personal recommendation is to avoid built-in RAID controllers where possible, since finding a compatible replacement after a failure is a rather difficult task that may not succeed. Things are different if you use a standalone controller from a specific manufacturer. Makers of expensive controllers (LSI Logic, Adaptec, Intel, Promise) are quite conservative: the same models stay in production for a long time, and there is almost a 100% chance that your array will be recognized and fully operational on a newer version of your controller (in general, this is practically how companies work,
Posted by KorP