HostingManager July 9, 2016 at 18:18

The case for server drives and RAID arrays: is it worth saving money, and when?

A large number of drives with different speeds from different manufacturers are available on the market. Not everyone clearly understands which drive to buy for which task, why it is sometimes better to pay more, and when you can save. In this article I will try to clarify the main points and make the choice easier. The article will be useful not only to those who want to buy or rent a dedicated server, but also to those who want reliable data storage at home. After reading it, you will see why renting desktop-based solutions in low-cost data centers is not always advisable and why it is often better to opt for more reliable server hardware.

To begin with, all the drives available on the market can be clearly divided into classes:

- disks for ordinary desktops (used in home PCs, laptops and the desktop-based servers of low-cost data centers);
- server disks with a speed of 7200 revolutions per minute (RPM);
- Enterprise disks with speeds of 10,000 and 15,000 RPM;
- solid state drives.

We will probably cover the choice of solid-state drives in a separate article; here we will focus mainly on hard drives and consider which drive to use where and when.

Let's start with regular PC drives. These are excellent drives with fairly large capacity and good performance, but their main drawback is that, by design, they are not meant to work in a RAID array. In these drives the vibrations caused by spindle rotation are hardly compensated at all. These vibrations are small, and with one or two drives at home they are not a problem. In a server chassis with many drives, however, their influence can be significant: the drives shake each other, and resonance amplifies the effect. When 12 drives are installed in one chassis alongside powerful server fans running at 5000-9000 RPM, the vibration level rises considerably, and with it the percentage of errors and losses, which hurts performance. Desktop-class drives can become several times slower under these conditions, because they struggle to position their heads and lose track. This is clearly visible in the well-known graph of performance versus vibration load.

SATA RE (RAID Edition) drives, that is, server drives with a speed of 7200 RPM, are another matter. They are less prone to vibration and less affected by it. According to the graph, the probability of vibration-induced errors is 50% lower for them.

But vibration is not the only problem. Another major problem of all drives is the rate of unrecoverable read errors. What does this mean in practice?

For SATA PC drives, the unrecoverable error rate is 1 error per 10¹⁴ bits, or 1 error per 12.5 TB of data. A 1 TB drive holds (1000 / 12500) × 10¹⁴ = 8 × 10¹² bits. Five such drives hold 5 × (1000 / 12500) × 10¹⁴ = 4 × 10¹³ bits, so the probability of an error when these drives work in a RAID5 array will be (5 × (1000 / 12500) × 10¹⁴) / 10¹⁴ × 100% = 40%.

As you can see, using 5 PC drives in RAID5 is simply not viable: the probability of an unrecoverable error during a rebuild is very high, and the rebuild is quite likely to fail. We end up with an array that may well fail at the very moment it has to be rebuilt, and the data will be lost. I did not know about this earlier, and in 2008, when I assembled my first server on PC drives, I built a RAID5 array to save disk space and money; less than a month later the data was lost. Now I am amazed that the array lived that long :)
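The 40% figure above is a linear estimate (bits read divided by bits per error). As a minimal sketch, assuming bit errors are independent and that the vendor rating is taken at face value, the same numbers can be checked against the probability of hitting at least one error. The function names are illustrative only.

```python
import math

BITS_PER_TB = 8e12  # 1 TB = 10^12 bytes = 8 * 10^12 bits


def expected_errors(disks, tb_per_disk, bits_per_error):
    """Linear estimate used in the text: bits read / bits per error."""
    return disks * tb_per_disk * BITS_PER_TB / bits_per_error


def p_at_least_one_error(disks, tb_per_disk, bits_per_error):
    """1 - (1 - 1/bits_per_error)^bits, assuming independent bit errors."""
    bits = disks * tb_per_disk * BITS_PER_TB
    # log1p/expm1 keep precision for probabilities this small
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))


linear = expected_errors(5, 1, 1e14)
exact = p_at_least_one_error(5, 1, 1e14)
print(f"linear estimate: {linear:.0%}, at least one error: {exact:.0%}")
```

Under these assumptions the linear estimate gives 40% and the probability of at least one error comes out at roughly 33%. Either way the risk is far too high for an array that is supposed to survive a rebuild.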

Of course, you can use more reliable RAID levels, such as RAID10 or, as a last resort, RAID6, but with a large number of drives there is still a fairly high probability of an unrecoverable error during a rebuild.

Server drives with a speed of 7200 RPM, SATA RE or Near Line (NL) SAS, are a different story. Thanks to their design, the probability of an unrecoverable error is an order of magnitude lower: 1 error per 10¹⁵ bits of data. Nevertheless, with a large number of drives that are also large in capacity, even this may be insufficient, and in such cases you still have to use Enterprise-class SAS drives, which are rated at 1 unrecoverable error per 10¹⁶ bits of data.
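To see how the drive class limits the usable volume, here is a rough sketch that inverts the calculation: for a chosen acceptable risk of an unrecoverable error, how much data can be read in one pass? The 10% risk threshold and the function name are assumptions made for illustration, and bit errors are again assumed to be independent.

```python
import math

BITS_PER_TB = 8e12  # 1 TB = 8 * 10^12 bits


def max_tb_for_risk(bits_per_error, risk):
    """Largest volume (TB) readable with P(at least one error) <= risk."""
    # Solve 1 - (1 - 1/bits_per_error)^bits = risk for bits
    bits = math.log(1.0 - risk) / math.log1p(-1.0 / bits_per_error)
    return bits / BITS_PER_TB


# Error ratings quoted in the text for each drive class
classes = {
    "SATA PC": 1e14,
    "SATA RE / NL SAS": 1e15,
    "Enterprise SAS": 1e16,
}
for name, rating in classes.items():
    print(f"{name}: about {max_tb_for_risk(rating, 0.10):.1f} TB at 10% risk")
```

Each step up in drive class multiplies the volume that can be read at the same risk by ten, which is why large arrays of large drives push you towards the higher classes.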

It is also worth noting that for SATA RE, Near Line (NL) SAS and Enterprise-class SAS drives, that is, drives that can interact effectively with a RAID controller, the real probability of an unrecoverable error is even lower, precisely because of this ability. When an array is under load (databases used by many users at once, active writing and reading), recoverable errors begin to matter as well, and ordinary drives handle them inefficiently. They try to re-read the problem sector over and over: on Western Digital drives, for example, the limit is 64 head passes with different height and angle parameters, and only then does the head move on to other tasks. This greatly increases the response time, which RAID does not tolerate: the controller will consider the drive lost and start a rebuild. The load on the array then becomes critical, because the rebuild runs on top of the normal workload. The result is predictable: the collapse of the entire array.

Drives designed for RAID can tell the RAID controller that there is a problem reading a data block, so that the block is requested from the other drives while other requests are processed at the same time; once the block is received, it is rewritten to another location on the problem drive. As a result, the performance of the RAID array does not drop and the likelihood of data loss is reduced significantly. Note, however, that not all software RAID controllers built into chipsets are able to "understand" such drives, so RE drives alone are sometimes not enough for a reliable array; you also need a hardware controller or another platform that works correctly with RAID.

Nevertheless, if you want storage that is more reliable than storage on PC drives, you can buy drives cheaper than RE, for example Constellation CS. They are designed to work exclusively with software RAID and do not have the main drawback of desktop drives (repeated re-read attempts at the expense of other tasks). They do not, of course, fully interact with controllers, so RAID failures are not completely excluded.

Regardless of which drive you use, remember that drives have a cache of 32 MB, 64 MB or more. What does this mean for a RAID array? For performance, the cache is a plus for both reading and writing. For write reliability, it is a minus. With the cache enabled, the RAID controller believes that data has already been written to the array, while in reality it may exist only in the cache and will reach the disk later. The combined cache grows with the size of the array, and with 12 drives it is already close to a gigabyte. What happens to that data when the power goes out? Right: it is lost. For a file dump this is probably not so critical, but for databases it will be "fun". Therefore, for especially critical data such as databases, it is recommended to disable the write cache. This reduces disk performance in database workloads by 8-15%, but significantly increases reliability. For the same reason, in large-capacity storage systems from major manufacturers the cache is turned off by default and cannot be enabled. When using drives in servers, especially in a low-cost data center where power to the server is not redundant, you need to remember this risk and take it into account.
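The "almost a gigabyte" figure is simple arithmetic. A small sketch, assuming 64 MB of cache per drive as in the example above (the function name is illustrative):

```python
def total_drive_cache_mb(drive_count: int, cache_mb_per_drive: int) -> int:
    """Upper bound on data that may exist only in volatile drive caches."""
    return drive_count * cache_mb_per_drive


at_risk_mb = total_drive_cache_mb(12, 64)
print(f"up to {at_risk_mb} MB may be unwritten when the power fails")
```

This is an upper bound: how much of the cache actually holds unwritten data at any moment depends on the workload.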

Another key feature of Enterprise-class SAS drives is that data is stored on them even more reliably: the sector size is 520 bytes rather than 512, with the 8 extra bytes used for parity. The drive also applies a large number of data recovery algorithms on its own, without the controller. For this reason, the capacity of these drives is not very large.

Speaking of capacity, one final recommendation: if your task is to store data reliably, do not use larger drives than necessary, because a rebuild will then take longer. As a rule, controllers do not analyze how much of the drive is actually occupied and restore the entire drive, so the difference in recovery time between a 1 TB and a 6 TB drive will be more than sixfold.
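A back-of-the-envelope sketch of the rebuild time, assuming the controller rewrites the whole drive at a constant sustained rate. The 150 MB/s default is a hypothetical value chosen for illustration, not a measured figure.

```python
def rebuild_hours(capacity_tb: float, rate_mb_s: float = 150.0) -> float:
    """Time to rewrite an entire drive at a constant sustained rate."""
    total_bytes = capacity_tb * 1e12      # decimal terabytes
    seconds = total_bytes / (rate_mb_s * 1e6)
    return seconds / 3600


for size_tb in (1, 6):
    print(f"{size_tb} TB: about {rebuild_hours(size_tb):.1f} hours")
```

At a constant rate the time scales linearly with capacity, so this model gives exactly a sixfold difference. In a working array the rebuild competes with the normal load, so the effective rate may drop and the real gap may be larger.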

To summarize: for a small RAID array, the most expensive Enterprise-class drives are not critical and do not give any advantage in reliability. Using server drives, however, is highly desirable, since the rebuild is then an order of magnitude more likely to complete successfully. Do not use larger drives than necessary, except when you need higher IOPS (some larger drives can still be faster thanks to more heads and platters). When you need large capacity, many drives and a sufficient level of reliability at the same time, look at NL SAS drives, which are essentially SATA RE drives with a SAS interface and the same 7200 RPM. To increase reliability further, it is advisable to use a higher RAID level. When the capacity of the array is not important and maximum reliability is required, you should definitely use 15,000 RPM Enterprise SAS drives.

Now, when you choose a server to rent in the Netherlands at our Switch site, whether through the configurator at the bottom of the page http://www.ua-hosting.company/servers or by modifying one of the special offers, you will understand which drives and which server are better for which tasks, and when it is better to use drives in RAID and when separately, distributing files in software according to popularity (a balancer script driven by load). You will also see why 4 larger drives may be better than 12 smaller ones in terms of reliability, but worse in terms of recovery time in case of a rebuild. Most importantly, you will see why our offer is really good for the server segment: we have brought the price close to desktop platforms while keeping reliability an order of magnitude higher, without exaggeration! So if you or your friends need a good server, you are welcome. The sale of some configurations is limited and their prices will soon be higher; we are generous, but not unlimited :)

And if anyone has real experience of using these or other drives for particular tasks, do not hesitate to share it in the comments. Everything is interesting, right down to failure statistics. We will try to publish more material on this topic, as well as on choosing an SSD, later.
