"Memory Component Issue", or large-scale defective network equipment

    The problem that many had suspected has now been confirmed.

    Cisco announced that for five years (from 2005 to 2010) an unnamed memory manufacturer supplied it with defective components. The nature of the defect: equipment with this memory can accumulate years of uptime without giving any cause for complaint, but as soon as it is rebooted (by a power cycle or even a simple reload), the memory stops working correctly, and the device either fails to boot or boots and then crashes periodically. The cause is degradation of the memory chips. According to the vendor, the main problems begin after two years of operation.

    Before the rotten tomatoes fly at Cisco, I hasten to warn you: the memory is a standard part that many vendors bought, so a great deal of equipment may be affected. There is confirmation of similar problems at Juniper. But only Cisco has owned up, despite the inevitable damage to its reputation. Its financial losses from this disaster amount to about $655 million.

    Sit down, get out the validol, and look at the list of affected equipment.

    Specific part numbers and a detailed description of the symptoms can be found in the Field Notices or directly via the links.

    To repeat: the risk zone is equipment that was manufactured 5-10 years ago and has worked perfectly so far. The failure occurs precisely on a reboot of any kind, not during normal operation.

    Replacement follows the standard RMA procedure: either the whole unit or the memory module, once it fails. Apparently the defective memory is present in far fewer than 100% of the devices mentioned above, and even if it is in your hardware, it may die not at today's reboot but ten years from now.

    It is impossible to check by serial number which units are at risk. No way. I've tried.

    Colleagues, I think by now everyone understands that the attitude I have seen, “I once bought a Cisco router for a lot of money, it has worked for years and will last for many more, so no redundancy is needed”, is criminal. And even a hot standby may no longer help. Imagine the lights flicker in the data center and that's it: your network equipment is dead and needs replacing, purely because of a brief power loss and reboot. Even a simple scheduled overnight reload of a non-redundant device can end in a frantic search for a replacement and a long outage. Assess the risks, sign service contracts with fast delivery, find or buy replacement memory in advance, or replace the hardware itself with something newer. Assume that after its next reboot any device from the list above (and not only those) may fail to come back up, and plan your escape route accordingly.
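
    As a starting point for that risk assessment, here is a minimal sketch of how an inventory could be triaged. It is only an illustration: the `Device` fields, the `reboot_risk` function and the sample inventory are all hypothetical, and the only fact taken from the announcement is the 2005 to 2010 delivery window. Since serial numbers cannot tell you which units carry the bad memory, the sketch treats everything built in that window as suspect.

```python
from dataclasses import dataclass
from datetime import date

# Delivery window of the defective memory as stated in the article.
# Treating the whole calendar years as affected is an assumption.
AFFECTED_FROM = date(2005, 1, 1)
AFFECTED_TO = date(2010, 12, 31)


@dataclass
class Device:
    """Hypothetical inventory record; field names are illustrative."""
    name: str
    manufactured: date
    has_spare: bool  # a replacement unit or memory module is on hand


def reboot_risk(dev: Device) -> str:
    """Classify a device by how painful its next reboot could be."""
    if not (AFFECTED_FROM <= dev.manufactured <= AFFECTED_TO):
        return "outside window"
    if dev.has_spare:
        return "at risk, spare available"
    return "at risk, no spare"


# Made-up sample inventory, for illustration only.
inventory = [
    Device("edge-rtr-1", date(2008, 6, 1), has_spare=False),
    Device("core-sw-1", date(2009, 3, 15), has_spare=True),
    Device("access-sw-7", date(2012, 11, 2), has_spare=False),
]

for dev in inventory:
    print(f"{dev.name}: {reboot_risk(dev)}")
```

    Anything that comes out as "at risk, no spare" is a candidate for a service contract, spare memory, or a hardware refresh before its next planned reload.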

    Finally, let us observe a minute of silence for one of the many memory modules that died before their time after serving faithfully in 2811 routers.