
Samsung SSDs are vindicated. The problem was in the Linux kernel

Remember the translation of the article “When Solid State Drives are not that solid”? In it, Algolia employees blamed Samsung SSDs for data corruption in a RAID0 configuration.
The problem was eventually resolved after a lengthy investigation, during which Algolia employees even had to write software that emulated their RAID workload so that Samsung engineers could reproduce the problem on their own hardware. The fix touched the Linux kernel, specifically the bio.c file, which is responsible for basic block I/O operations.
The problem was this: the kernel I/O subsystem can split a block I/O operation (BIO) into several smaller ones where appropriate. The bio_split() function does the splitting. It creates a new BIO object and adjusts the old one to account for the fact that part of the address range being read or written has moved to the new object. To save memory, the new object is created by copying values from the old one, so the pointers in the new and old objects refer to the same memory area. For read and write operations this works fine, because those operations do not change the contents of the BIO fields that are reached through pointers. This is not the case for the DISCARD operation, however: the bio_vec field of the bio structure contains a pointer to ancillary data, which gets modified while the request is being processed.
The raid0 and raid10 kernel modules use bio_split() and send the split requests to the SCSI/SATA driver. That driver, however, does not expect different requests to share the same memory area, and it overwrites the contents at the address specified in bio_vec. As a result, the next request arrives with a pointer to incorrect data, which causes DISCARD to be issued for the wrong addresses.
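The hazard can be illustrated with a small user-space model in C. This is only a sketch, not kernel code: the toy_bio and toy_range structures, toy_split_shallow() and toy_driver_prepare() are hypothetical names invented for this illustration, and the "driver" is assumed to simply write the range it is about to discard into the shared buffer.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the ancillary data that bio_vec points to. */
struct toy_range {
    uint64_t start;  /* first sector to discard */
    uint64_t len;    /* number of sectors to discard */
};

/* Toy stand-in for struct bio; this is not the kernel's real layout. */
struct toy_bio {
    uint64_t sector;            /* first sector covered by the request */
    uint64_t nr_sectors;        /* number of sectors covered */
    struct toy_range *payload;  /* shared between both halves after a split */
};

/* Shallow split: the front part is a copy of the original, pointer included. */
static struct toy_bio toy_split_shallow(struct toy_bio *orig, uint64_t front_sectors)
{
    assert(front_sectors > 0 && front_sectors < orig->nr_sectors);

    struct toy_bio front = *orig;       /* copies the pointer, not the data */
    front.nr_sectors = front_sectors;
    orig->sector += front_sectors;      /* the original keeps the remainder */
    orig->nr_sectors -= front_sectors;
    return front;
}

/* Hypothetical driver step: record the range to discard in the payload. */
static void toy_driver_prepare(struct toy_bio *b)
{
    b->payload->start = b->sector;
    b->payload->len = b->nr_sectors;
}
```

If a request for sectors 1000 to 1099 is split after 40 sectors and the model driver prepares the front half and then the remainder, the front half's payload ends up describing sectors 1040 to 1099 instead of 1000 to 1039, because both halves write into the same buffer.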
The first version of the patch, proposed by Samsung engineers, modified the source code of the raid0 driver. A more general version was eventually merged into the kernel: it makes a full copy of the bio structure, along with the memory pages it occupies, when the operation is a DISCARD.
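The idea behind the more general fix can be sketched in user-space C as follows. This is an illustrative model only and not the code that was merged into the kernel: toy_bio, toy_range, toy_split() and toy_driver_prepare() are hypothetical names, and the "driver" is assumed to write the range it is about to discard into the payload buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for the ancillary data that bio_vec points to. */
struct toy_range {
    uint64_t start;
    uint64_t len;
};

/* Toy stand-in for struct bio; this is not the kernel's real layout. */
struct toy_bio {
    uint64_t sector;
    uint64_t nr_sectors;
    int is_discard;             /* non-zero for DISCARD requests */
    struct toy_range *payload;
};

/*
 * Split off the first front_sectors sectors into *front.
 * Reads and writes keep sharing the payload; a DISCARD gets its own copy.
 * Returns 0 on success, -1 if the copy could not be allocated.
 * For a DISCARD the caller must free(front->payload) when done.
 */
static int toy_split(struct toy_bio *orig, uint64_t front_sectors, struct toy_bio *front)
{
    assert(front_sectors > 0 && front_sectors < orig->nr_sectors);

    *front = *orig;
    if (orig->is_discard) {
        front->payload = malloc(sizeof *front->payload);
        if (front->payload == NULL)
            return -1;
        memcpy(front->payload, orig->payload, sizeof *front->payload);
    }
    front->nr_sectors = front_sectors;
    orig->sector += front_sectors;
    orig->nr_sectors -= front_sectors;
    return 0;
}

/* Hypothetical driver step: record the range to discard in the payload. */
static void toy_driver_prepare(struct toy_bio *b)
{
    b->payload->start = b->sector;
    b->payload->len = b->nr_sectors;
}
```

With separate payloads, preparing one half of a split DISCARD no longer changes what the other half will send to the drive. The cost is one extra allocation per split, which is why the copy is made only for DISCARD.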
This problem affects all drives that support TRIM, regardless of model, when they are used in a RAID0 or RAID10 configuration.
It remains unclear why the problem did not appear on Intel drives. Perhaps it is a matter of timing.