
Empirical analysis of hardware failures on a million PCs
Microsoft has conducted the first-ever large-scale study of hardware failures on a million personal computers (PDF). Several interesting facts came to light.
Contrary to the widespread opinion that hardware failures are trivial, they matter: a first failure is quite rare, but failures tend to recur, with 99% of failures being repeated. For example, a machine with 30+ days of accumulated CPU operation over a period of 8 months has a 1 in 190 probability of failure due to an error in the CPU subsystem. If this happens, the probability of a repeated failure on that machine is 1 in 2.9.
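
To put the two figures side by side, here is a minimal Python sketch of the arithmetic. It assumes that the 1 in 2.9 figure is the conditional probability of a second failure given a first one over the same observation period. The function names are illustrative and do not come from the study.

```python
from fractions import Fraction

# Figures quoted above for machines with 30+ days of accumulated CPU operation.
P_FIRST = Fraction(1, 190)    # probability of a first CPU subsystem failure
P_REPEAT = Fraction(10, 29)   # 1 / 2.9: probability of a repeat after a first failure


def repeat_risk_ratio(p_first, p_repeat):
    """How many times likelier a repeat failure is than a first failure."""
    return p_repeat / p_first


def p_two_failures(p_first, p_repeat):
    """Probability that a machine fails and then fails again.

    Assumption: p_repeat is conditional on a first failure having occurred.
    """
    return p_first * p_repeat


ratio = repeat_risk_ratio(P_FIRST, P_REPEAT)
both = p_two_failures(P_FIRST, P_REPEAT)

print(f"A repeat failure is about {float(ratio):.1f} times likelier than a first one")
print(f"Chance of failing twice: about 1 in {float(1 / both):.0f}")
```

Under these assumptions, a machine that has already failed once is roughly 65 times more likely to fail than a machine with no failure history, which is why repeated failures dominate even though a first failure is rare.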

If the first failure occurred within five days of CPU operation, then 84% of such computers show a repeated failure within 10 days, and 97% within a month.

Overclocking the CPU increases the probability of failure by a factor of 4 to 19, depending on the brand of processor.

Overclocking increases the probability of DRAM failure fivefold.

On the other hand, running the processor at a lower frequency increases the reliability of the equipment.

The study calculated the likelihood of CPU, DRAM and disk subsystem failures on desktop computers and laptops, and on brand-name and self-assembled computers. It also shows how the number of failures depends on the age of the computer, memory size and CPU performance.

Another interesting fact: it turns out that laptops usually work more reliably than desktop computers.

Methodology
The analysis was carried out in 2008 on the basis of crash and status reports sent by the Windows Error Reporting system, either at the time of a normal failure or after a restart of Windows (two samples, RAC and ATLAS, respectively). The crash reports indicate how long the machine had been running without failure. The study is considered conservative because it does not take into account cases where crash reports are not sent because failures occur too frequently.
Information was collected from 950,000 computers.