The Tale of the Fallen RAID 0

    ... or, more precisely, how I recovered data from the nVidia RAID 0 that I myself brought down.

    The Setup

    Actually, first a prelude. In my computer there lives an IDE controller implemented in the nVidia MCP65 chipset. This controller has an option that turns it into a RAID controller with support for RAID 0 and RAID 1. In my case it ran RAID 0 across two 250 GB Samsung drives. An onboard RAID like this was nothing to be surprised at even 5 years ago, never mind today. If anyone is wondering about the prefix "fake", that is what such semi-hardware RAID implementations are called. Let's break it down in simple terms.

    In OSes such as Windows 2003 or, say, Debian Lenny GNU/Linux, a software RAID array can be built with the OS's own tools. In the first case that means converting the disks into so-called dynamic disks and manipulating the partitions on them; in the second, for example, using the LVM utilities (broadly speaking, also a kind of "dynamic disks"). The trade-offs: no extra controller is needed, but the CPU and RAM do the work, and so on.

    At the other end of the spectrum are hardware implementations, say, the Intel SRCSATAWB: up to 16 SATA drives, RAID levels up to 60, a controller costing about 15,000 rubles... Compared with the first case: the load is taken off the CPU and RAM, everything is implemented in hardware, but you pay the price of the device.

    The concepts of "hardware RAID" and "software RAID" are opposites of each other, but they leave room for intermediate variants, whose common name is "semi-hardware RAID" or "fake RAID". Supposedly only part of the work of such a RAID is delegated to a driver in the OS. In reality almost all of it is delegated: a fake RAID can only pretend to be a RAID, telling the OS, say, that the PC has not 2 disks but 1 disk that is 2 times larger; it requires a driver to work; and it does basic monitoring of the disks, treacherously throwing them out of the array (which, to be fair, is also a duty of hardware RAID). Everything else - the actual implementation of RAID 0, RAID 1, RAID 5 (underline as applicable) - is the job of the RAID controller's driver.

    Phew... Now, the setup itself. =) Having gotten hold of a "non-working" 400 GB WD drive abandoned by a client at our technical department, I decided to take another look at some fresh Linux distribution. The choice this time fell on Ubuntu 9.04 x86. Once the system was set up, something prompted me to try to get access to my fake RAID. I could not remember the name of the utility I had used a year earlier (later I did remember: it was dmraid), but within a couple of minutes I googled another piece of software - mdadm. I found it in the Ubuntu repositories, installed it, started figuring it out, and somewhere along the way stumbled on a Google result saying "it would be nice to restore the space reserved by the drive for the HPA". No sooner said than done (with the help of Victoria). As it turned out - in vain. =)

    That is what I love RAID for - the certainty. Once a disk has dropped out of a RAID 0 array, the most you can do is check whether the drive is properly connected to power and to the controller / motherboard. So it was with me: both disks were connected, yet the RAID BIOS showed only one of them in the array. The only option it offered was to delete the array.

    By the end of the setup I had: a deleted RAID 0 array, a disabled RAID BIOS, two perfectly healthy 250 GB drives with the data still present on them (that is, smeared across both drives), a 400 GB boot drive with Ubuntu installed, and an undiminished desire to get back everything "acquired through back-breaking labor".

    The Action

    Fundamentally, there is nothing complicated about recovering data from RAID 0 if the array degraded not because a drive failed, but only because the array was taken apart by hand (say, through human error, as in my case). All the data is still in place, except perhaps the metadata about the array itself (which, however, is no longer needed in digital form to recover the data); the main thing is to remember how RAID 0 is organized and what the MBR and partition table are, collect the data needed to reconstruct the structure of the array, and merge the data from the disks.

    The data recovery algorithm in my case turned out to be as follows:
    1. Find out which of the 2 disks is the first in the array, and which is the second.
    2. Find out the size of the stripe in the array.
    3. Find software that takes this data as input and produces the recovered data as output (or at least a reasonably simple way to access it).
    4. Get access to recovered data.
    And it would be nice to do all of this in Linux - just for fun. =)

    Let's go through the items in order.

    1.
    In a RAID 0 array of two drives, data is written to the disks in turn, in equal-sized blocks (stripes), across the entire array.

    For example:
    the first X KB go to disk 1 at LBA offset Y (for simplicity we can take Y = 0, i.e. sector 0 of the disk; I was merging a specific partition of the disk, and Y turned out to be the number of that partition's starting sector),
    the next X KB - to disk 2 at the same offset Y,
    the next X KB - to disk 1 at offset Y + 1 * (X / 512),
    the next X KB - to disk 2 at offset Y + 1 * (X / 512),
    the next X KB - to disk 1 at offset Y + 2 * (X / 512),
    the next X KB - to disk 2 at offset Y + 2 * (X / 512),
    the next X KB - to disk 1 at offset Y + 3 * (X / 512),
    the next X KB - to disk 2 at offset Y + 3 * (X / 512),
    ...
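
    To make the arithmetic concrete, here is a tiny Perl sketch (purely illustrative, not part of the recovery itself) that maps a logical sector of such an array onto a disk number and an LBA on that disk, assuming two disks, a stripe of X = 64 KB and Y = 0:

    #!/usr/bin/perl
    # Purely illustrative: map a logical sector of a two-disk RAID 0 array onto
    # (disk number, LBA on that disk), assuming stripe X = 64 KB and offset Y = 0.
    use strict;
    use warnings;

    my $stripe_sectors = 128;          # X = 64 KB = 128 sectors of 512 bytes
    my $array_sector   = shift // 0;   # logical sector number within the array

    my $stripe_no = int($array_sector / $stripe_sectors);   # which stripe overall
    my $in_stripe = $array_sector % $stripe_sectors;        # position inside the stripe
    my $disk      = $stripe_no % 2 ? 2 : 1;                 # even stripes on disk 1, odd on disk 2
    my $disk_lba  = int($stripe_no / 2) * $stripe_sectors + $in_stripe;

    print "array sector $array_sector -> disk $disk, LBA $disk_lba\n";

    Run it with an array sector number as the argument; with these parameters, for instance, array sector 300 maps to disk 1, LBA 172.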

    You can tell which disk is the first by looking at sector zero of both disks in a hex editor: sector zero of the first disk in the array will contain the array's MBR with all the usual MBR attributes - the "55AA" signature at byte offset 510 (1FEh) of the sector, text along the lines of "Invalid partition table" - while sector zero of the second disk will almost certainly have none of that. In my case the first disk in the array turned out to be /dev/sdb and the second /dev/sdc; this order of devices is used below.

    sudo dd if=/dev/sdb of=/home/f/mbr01 count=1
    sudo dd if=/dev/sdc of=/home/f/mbr02 count=1
    ghex2 /home/f/mbr01
    ghex2 /home/f/mbr02
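
    If you do not want to rely on your eyes alone, the signature check can be scripted. A rough sketch (my own aid, an assumption rather than part of the original toolchain), working on the mbr01/mbr02 dumps made above:

    #!/usr/bin/perl
    # Sketch: check which of the two sector-zero dumps carries the MBR
    # signature 55 AA at byte offset 510 (0x1FE).
    use strict;
    use warnings;

    for my $file ('/home/f/mbr01', '/home/f/mbr02') {
        open my $fh, '<:raw', $file or die "$file: $!";
        read($fh, my $sector, 512) == 512 or die "short read on $file";
        close $fh;
        my $sig = unpack 'v', substr($sector, 510, 2);   # little-endian word at 0x1FE
        print "$file: ", $sig == 0xAA55
            ? "has the 55AA signature (likely the first disk in the array)\n"
            : "no MBR signature here\n";
    }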







    2.
    I vaguely remembered that the stripe size in my array was 64 KB, but that needed checking. Checking it is not hard, though a bit more troublesome than step 1. The analysis requires dumps like the ones from step 1, only larger - 32 MB each (65536 sectors). It makes sense to start from the largest stripe size, which as a rule is 64 KB. Converting the stripe size, 65536 bytes, into hexadecimal (the hex editor naturally shows offsets in hex) gives 10000h. Moving through the dump at offsets that are multiples of this number, we verify the following: the data immediately above such an offset looks visually quite different from the data immediately below it.

    sudo dd if=/dev/sdb of=/home/f/dump01 count=65536
    sudo dd if=/dev/sdc of=/home/f/dump02 count=65536
    ghex2 /home/f/dump01
    ghex2 /home/f/dump02



    Of course, at such offsets (especially near the beginning of the dump) you will also run into patterns where everything is zeros, or, on the contrary, the entropy both above and below the offset is so high that it is impossible to tell whether the offset is the start of a new stripe. In the hex editor you need to find a data region, at an offset that is a multiple of the stripe size (10000h), in which you can actually orient yourself and draw a conclusion.

    Having confirmed that at offsets that are multiples of 10000h we see sharp transitions, as if some data had been skipped (and at such an offset data really is "missing" - it went to the other disk), the next step is to suppose that the stripe might actually be 2 times smaller - 32 KB, or 8000h - and repeat the same check for that offset size. If several offsets again show the same effect - the data pattern changes right at the boundary - we conclude that the stripe size could indeed be 32 KB... Halve the candidate stripe size again and repeat... We stop when, for the current candidate size, we find at least one offset where the data below the boundary is, with a high degree of probability, a direct continuation of the data above it - in other words, a clearly continuous sequence. This is especially obvious when you land in the middle of a text file. In that case we discard the current candidate, go back to the previous one (2 times larger), and check it carefully: no text document or other obviously sequential data (this of course does not apply to runs of zeros or to chaotic garbage) should straddle a boundary; if it does, double the stripe size once more.
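
    For those who prefer not to scroll through ghex2 by hand, the same boundary inspection can be roughly scripted. The sketch below is a hypothetical helper (not what I actually used): for a candidate stripe size it prints 16 bytes on each side of every boundary in a dump, so abrupt transitions - or suspiciously continuous data - are easy to spot. Run it against each dump with the candidate size in kilobytes as the second argument (64, then 32, and so on).

    #!/usr/bin/perl
    # Hypothetical helper: for a candidate stripe size, print 16 bytes on each
    # side of every stripe boundary in a dump file, so sharp transitions (a real
    # boundary) or clearly continuous data (stripe is probably larger) stand out.
    use strict;
    use warnings;

    my $file      = shift // '/home/f/dump01';
    my $stripe_kb = shift // 64;
    my $stripe    = $stripe_kb * 1024;

    open my $fh, '<:raw', $file or die "$file: $!";
    my $size = -s $file;

    for (my $off = $stripe; $off + 16 <= $size; $off += $stripe) {
        seek $fh, $off - 16, 0 or die "seek: $!";
        read $fh, my $before, 16;
        read $fh, my $after,  16;
        printf "%08Xh  %s | %s\n", $off, unpack('H*', $before), unpack('H*', $after);
    }
    close $fh;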

    In my case the stripe size was confirmed: 64 KB, or 128 sectors. This number is used below.

    3.
    A Google search for a ready-made solution turned up nothing, neither for Windows nor for Linux: maybe I searched badly, maybe I was too drunk (which, come to think of it, is the same thing). So I decided the most suitable approach would be to assemble the two disks into a single image file of the array, and figure the rest out from there. The medium for storing the image file I, being an employee of the technical department, borrowed from the warehouse of the company where I have the honor of working together with comrade tozx - hi, buddy! =)

    I had to dust off my modest Perl scripting skills. The size of a single disk in sectors can be found with Victoria, or simply read off the disk's label (the LBA figure).

    #! /usr/bin/perl

    use strict;
    use warnings;

    my $innnum = 0;          # counter: sector number to read from the disks of the ex-array
    my $outnum = 0;          # counter: sector number to write to in the image file
    my $multipler = 128;     # stripe size in sectors; how many sectors we read at a time
    my $lbasize = 488395008; # size of a physical disk from the ex-array in sectors, rounded down to a multiple of the stripe size (128)
    my $timert1 = 0;         # optional auxiliary counter
    my $timert2 = 0;         # another optional counter

    while ($lbasize - $innnum >= $multipler) # until the whole disk has been read
    {
        system ("dd if=/dev/sdb of=/media/recovery/image count=$multipler skip=$innnum seek=$outnum 2> /dev/null");
        $outnum += $multipler;
        system ("dd if=/dev/sdc of=/media/recovery/image count=$multipler skip=$innnum seek=$outnum 2> /dev/null");
        $outnum += $multipler;
        $innnum += $multipler;
        $timert1++;
        $timert2++;
        if ($timert2 == 256) {   # print progress from time to time
            $timert2 = 0;
            print ("$innnum OF $lbasize\n");
        }
        if ($timert1 == 16192) { # pause from time to time to let the disks cool down
            $timert1 = 0;
            print ("So hot! Sleeping for 33 sec...\n");
            sleep (33);
        }
    }
    print ("DONE.\n");





    4.
    Here is the recipe for mounting the HDD image that I used, with minor changes. First, attach the image file to a loop device and have fdisk list the partitions it finds in the image. In my case the output was:

    sudo losetup /dev/loop0 /media/recovery/image
    sudo fdisk -lu /dev/loop0




    Disk /dev/loop0: 500.1 GB, 500116488192 bytes
    255 heads, 63 sectors/track, 60802 cylinders, total 976790016 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Disk identifier: 0xdf39af97

    Device        Boot      Start        End     Blocks  Id  System
    /dev/loop0p1  *          2048   83891429   41944691   7  HPFS/NTFS
    /dev/loop0p2         83892224  488395055  202251416   7  HPFS/NTFS




    For the desired partition (in my case I was interested in the second one) take the number from the "Start" column and multiply it by the sector size (512): 83892224 * 512 = 42952818688. Then pass it to losetup as the offset:

    sudo losetup -o42952818688 /dev/loop1 /dev/loop0

    Next, mount /dev/loop1 as a partition with the appropriate file system (not forgetting to create a mount point beforehand):

    sudo mkdir /media/yuppi
    sudo mount -t ntfs -o ro /dev/loop1 /media/yuppi
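
    By the way, the offset is easy to recompute for any partition with a one-liner (plug in the start sector from your own fdisk output; the value here is mine):

    perl -e 'print 83892224 * 512, "\n"'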


    Afterword


    All in all, the whole thing took four evenings. Copying the 500 GB (250 GB from each disk) took almost a day: performance suffers because the script reads only 64 KB from a disk at a time, and every read spawns a separate dd process. The read speed was about 15 MB/s. To speed things up, the Perl script could be improved by adding two buffers of several megabytes each (one per physical disk), reading into them in large chunks and then splitting the data from the buffers into the image file. But I was too lazy to spend time on that - implementing it in Perl would have taken me longer than the slow copy done the way I described.
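
    Just for the record, the buffered variant might look something like the sketch below - an untested illustration of the idea, not the script I actually ran. It reads 4 MB from each disk per pass with sysread and interleaves the stripes in memory, so no dd processes are spawned at all:

    #!/usr/bin/perl
    # Untested sketch of the buffered variant described above: read a few
    # megabytes from each disk per pass and interleave the stripes in memory,
    # instead of spawning dd for every 64 KB.
    use strict;
    use warnings;

    my $stripe  = 128 * 512;     # stripe size in bytes (64 KB)
    my $chunk   = 64;            # stripes per pass => 4 MB read from each disk at a time
    my $lbasize = 488395008;     # disk size in sectors, rounded down to the stripe size

    open my $d1,  '<:raw', '/dev/sdb'              or die "sdb: $!";
    open my $d2,  '<:raw', '/dev/sdc'              or die "sdc: $!";
    open my $out, '>:raw', '/media/recovery/image' or die "image: $!";

    # read exactly $len bytes from a handle (sysread may return short reads)
    sub read_full {
        my ($fh, $len) = @_;
        my $data = '';
        while (length($data) < $len) {
            my $n = sysread($fh, $data, $len - length($data), length($data));
            die "read failed: $!" unless defined $n;
            die "unexpected end of device" if $n == 0;
        }
        return $data;
    }

    my $pos = 0;                 # current byte offset on each source disk
    my $end = $lbasize * 512;
    while ($pos < $end) {
        my $want = $chunk * $stripe;
        $want = $end - $pos if $end - $pos < $want;

        my $buf1 = read_full($d1, $want);
        my $buf2 = read_full($d2, $want);

        # interleave stripe by stripe: disk 1, disk 2, disk 1, disk 2, ...
        for (my $off = 0; $off < $want; $off += $stripe) {
            syswrite($out, substr($buf1, $off, $stripe));
            syswrite($out, substr($buf2, $off, $stripe));
        }
        $pos += $want;
        printf "%d of %d sectors\n", $pos / 512, $lbasize;
    }
    close $_ for ($d1, $d2, $out);
    print "DONE.\n";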

    Need I even mention that nothing was lost... =)

    Success.
