
SCT Error Recovery Control
... or what a "RAID edition" hard drive really is

A bit of theory
There are two strategies for how a hard drive behaves when it detects an error:
- standalone / desktop: keep retrying the read to the bitter end. From the outside it looks like a sluggish drive that still works: on an isolated failure it stalls for a while but gets through, accompanied by the characteristic clatter of the heads recalibrating.
- RAID: give up right away. From the outside it looks like "suddenly there was a disk error, but then MHDD and the like FOUND NOTHING, WHAT DO I DO?"
Control over the error-handling strategy is a feature of expensive hard drives. In desktop series it often simply does not exist, or it exists but cannot be enabled, so the drive stalls on an error for as long as it sees fit. The second important point: on RAID drives this option is enabled by default, which can cause problems.
Decoding the name
The ability to control disk behavior on errors has a very, very confusing name: SCT ERC. This stands for SCT Error Recovery Control. SCT, in turn, is the name of the general SMART Command Transport protocol. SMART, in turn, stands for Self-Monitoring, Analysis and Reporting Technology, so the full expansion of SCT ERC is: Self-Monitoring, Analysis and Reporting Technology Command Transport Error Recovery Control (exhale).
Quick reference
You can check whether a hard drive supports error recovery control with the command
smartctl -a /dev/sdxx
and look for the SCT capabilities section:
SCT capabilities: (0x303f) SCT Status supported.
SCT Error Recovery Control supported. *****
SCT Feature Control supported.
If the SCT Error Recovery Control line (marked with asterisks here) is missing, the disk does not support the command.
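The check described above can be scripted. This is a minimal sketch; `supports_scterc` is a hypothetical helper name, not part of smartmontools, and it assumes the capability line is worded as in the sample output above.

```shell
# Hypothetical helper: reads `smartctl -a` output on stdin and succeeds
# (exit status 0) only if the drive advertises SCT Error Recovery Control.
supports_scterc() {
    grep -q 'SCT Error Recovery Control supported'
}

# Usage (as root):
#   smartctl -a /dev/sda | supports_scterc && echo "ERC is available"
```

Reading from stdin keeps the helper independent of any particular device name, so it can also be tried on saved smartctl reports.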
Next comes the actual management. The disks I have seen expose two parameters: the read timeout and the write timeout. Below I list the values for every disk I could get my hands on.
To view the current timeouts, use the command
smartctl -l scterc /dev/sda
The output looks like this:
# smartctl -l scterc /dev/sda
SCT Error Recovery Control:
Read: 70 (7.0 seconds)
Write: 70 (7.0 seconds)
# smartctl -l scterc /dev/sde
SCT Error Recovery Control:
Read: Disabled
Write: Disabled
# smartctl -l scterc /dev/sdd
Warning: device does not support SCT Error Recovery Control command
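The three kinds of output shown above can be told apart in a script. The sketch below is only an illustration: `erc_state` is a hypothetical helper, and it assumes smartctl words its output as in the samples above.

```shell
# Hypothetical helper: reads `smartctl -l scterc` output on stdin and prints
# one of: unsupported, enabled, disabled, unknown.
# The body runs in a subshell so the variable does not leak to the caller.
erc_state() (
    out=$(cat)
    if printf '%s\n' "$out" | grep -q 'does not support SCT Error Recovery Control'; then
        echo unsupported
    elif printf '%s\n' "$out" | grep -Eq '(Read|Write):[[:space:]]*[0-9]+ \('; then
        # At least one of the two timeouts has a numeric value.
        echo enabled
    elif printf '%s\n' "$out" | grep -Eq '(Read|Write):[[:space:]]*Disabled'; then
        echo disabled
    else
        echo unknown
    fi
)

# Usage (as root):
#   smartctl -l scterc /dev/sda | erc_state
```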
To set the timeouts, give the values separated by commas after scterc:
smartctl -l scterc,120,60 /dev/sde
(the values are in tenths of a second, so 120 corresponds to 12 seconds; the first number is the read timeout, the second is the write timeout). 0 means "to the bitter end", that is, with no time limit.
Default values
Here is the data from the various drives I have around:
Title | Model | ERC (supported or not; default values if supported) |
---|---|---|
Western Digital VelociRaptor | WDC WD1500HLFS-01G6U1 | Yes, 7/7 |
Western Digital RE4 Serial ATA | WDC WD1500HLFS-01G6U1 | Yes, 7/7 |
Western Digital RE3 Serial ATA family | WD1002FBYS-02A6B0 | Yes, 7/7 |
Western Digital Caviar Green (Adv. Format) | WDC WD20EARS-00MVWB0 | Not supported |
Western Digital Caviar Green | WD7500AACS-00D6B0 | Yes, 0/0, cannot be enabled |
Seagate Maxtor DiamondMax 22 | STM3500320AS | Yes, 0/0, can be enabled |
Seagate Barracuda 7200.9 | ST3400633AS | No (Maxtor/Seagate drives of the same years have it, but this Seagate does not; go figure) |
Seagate Barracuda 7200.10 | ST3500630AS | No |
Seagate Barracuda 7200.11 | ST31500341AS | (surprise!) Yes, 0/0, can be enabled |
Seagate Barracuda LP | ST31500541AS | Yes, 0/0 (that is, off), can be enabled |
SAMSUNG SpinPoint F4 EG (AFT) | SAMSUNG HD204UI | Yes, 0/0 (off), can be enabled |
Hitachi Deskstar 7K3000 | HDS723030ALA640 | Yes, 0/0, cannot be enabled (SCSI error: aborted command) |
Hitachi Deskstar T7K500 | HDT725032VLA360 | Yes, 0/0, cannot be enabled |
(Just don't ask where I got so many drives at home.)
Moral
People who buy RE4 disks (and RAID editions from the other remaining manufacturers) or VelociRaptors to use as their only hard disk and do not set ERC to zero are committing a gigantic folly, comparable only to the folly of people who put desktop drives into a RAID without configuring ERC and hope that the RAID will save them when a drive fails.
In short: if you bought a single fancy drive for home use, turn ERC off (0,0). If you bought a drive for a RAID, check that its ERC is non-zero, preferably set to a reasonable value of around 3-10 s (30-100, since the values are in tenths of a second).
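Since smartctl takes the timeouts in tenths of a second, the conversion is easy to get wrong by a factor of ten. The sketch below wraps it in two hypothetical helpers, `scterc_arg` and `erc_for_role` (neither is part of smartmontools). The 7-second RAID value is an assumption, picked because it lies in the 3-10 s range and matches the WD defaults in the table.

```shell
# Hypothetical helper: prints the argument for `smartctl -l`, given the read
# and write timeouts in whole seconds (smartctl expects tenths of a second).
scterc_arg() {
    printf 'scterc,%d,%d\n' "$(( $1 * 10 ))" "$(( $2 * 10 ))"
}

# Hypothetical helper: maps a drive's role to an ERC setting.
#   standalone -> ERC off (0,0), the drive retries as long as it likes
#   raid       -> 7 s for reads and writes (assumed value in the 3-10 s range)
erc_for_role() {
    case "$1" in
        standalone) scterc_arg 0 0 ;;
        raid)       scterc_arg 7 7 ;;
        *)          echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Usage (as root):
#   smartctl -l "$(erc_for_role raid)" /dev/sde
```

On the drives listed above as "cannot be enabled", the smartctl call itself would be expected to fail, so its exit status is worth checking.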
Models that need attention when used on the desktop: WD RE3, RE4, Raptor, Seagate NS.
PS Besides ERC, manufacturers promise higher quality and reliability for the RE / NS series, but we cannot verify that, whereas the presence or absence of ERC is an objective, easily checked property. A drive without ERC should never be in a RAID: when it fails, it will do more harm than good.
PPS I have no idea how to work with SMART in Microsoft Windows. Call the manufacturer's support and ask. Phone: 8 (800) 200-8001.
For Mac OS X, as far as I know, there is a smartmontools port, so these commands (run as root) should work there.
PPPS (from the comments) For WD there is the WDTLER utility (TLER stands for Time-Limited Error Recovery); on some Green-series drives it can still enable ERC / TLER: blog.agdunn.net/?p=208