How to check the reliability of a data center: 3 main points to pay attention to

    When choosing an IaaS provider, a company usually focuses on the characteristics of the cloud: availability, scalability and so on. However, the performance of any virtualized environment is determined by the hardware installed in the data center. For the most part, the reliability of cloud services depends on this infrastructure (and on the place where it is located).

    Today we will look at what to pay attention to when evaluating an IaaS provider's data center.

    / photo Arthur Caranta CC

    Reliability and redundancy

    First of all, when assessing an IaaS provider's data center, pay attention to the redundancy of the engineering infrastructure, in particular the power supply systems. This parameter affects the level of availability, that is, the time of continuous operation without failures.

    To assess the level of redundancy, you can use the Uptime Institute classification.

    • Tier 1 - no redundancy (N). Reliability depends on each individual element of the infrastructure, and the failure of one piece of equipment leads to downtime of the entire data center.
    • Tier 2 - N + 1 redundancy. One additional element is added to the N infrastructure elements, reducing the risk of failure.
    • Tier 3 - also N + 1 redundancy, but maintenance work can be carried out in parallel, without stopping the data center.
    • Tier 4 - 2N redundancy. Every element is fully duplicated.
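
    The effect of the extra unit in an N + 1 scheme can be estimated with a simple probability model. The sketch below is an illustration and not part of the Uptime Institute classification: it assumes identical units that fail independently, and the values of N and of the per-unit availability are hypothetical. The 2N scheme is left out because its result depends on how the two duplicated sides are wired.

```python
from math import comb


def system_availability(required: int, installed: int, unit_availability: float) -> float:
    """Probability that at least `required` of `installed` units are working.

    Assumes identical units that fail independently of each other.
    """
    p = unit_availability
    return sum(
        comb(installed, k) * p**k * (1 - p) ** (installed - k)
        for k in range(required, installed + 1)
    )


N = 4          # hypothetical number of units needed to carry the load
P_UNIT = 0.99  # hypothetical availability of a single unit

print(f"N    : {system_availability(N, N, P_UNIT):.4f}")
print(f"N + 1: {system_availability(N, N + 1, P_UNIT):.4f}")
```

    With these made-up inputs, the scheme without redundancy is available only if all four units work, while the N + 1 scheme also survives the loss of any single unit.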

    The Tier classification treats the engineering systems as a single whole. If at least one component has no redundancy, the overall fault tolerance level under the Uptime Institute (UI) classification is lowered. The higher the Tier, the higher the availability. However, the UI classification has no “best” or “worst” level, and no single Tier suits every situation. Therefore, choose a provider whose data center has the level of redundancy that matches the challenges facing the company.

    For large organizations for which downtime is unacceptable, it makes sense to look at data centers with 2N redundancy. Facebook, for example, took this path: the company's data center in the Swedish city of Luleå has 2N redundancy. The power systems of Sberbank's data center in Skolkovo are made redundant in the same way.

    However, in some cases such a scheme may be excessive, since the higher the Tier, the more expensive it is to rent the cloud provider's equipment. Therefore, companies for which an hour of IT infrastructure and service downtime per year is not critical should choose a data center with a lower Tier.
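
    To judge whether a given amount of downtime per year is acceptable, it helps to convert it into an availability percentage and back. A minimal sketch, assuming a 365-day year; the function names are our own.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, leap years ignored


def availability_from_downtime(downtime_hours_per_year: float) -> float:
    """Share of the year (0..1) during which the service is up."""
    return 1 - downtime_hours_per_year / HOURS_PER_YEAR


def downtime_from_availability(availability: float) -> float:
    """Hours of downtime per year for a given availability (0..1)."""
    return (1 - availability) * HOURS_PER_YEAR


for hours in (1.0, 1.5):
    share = availability_from_downtime(hours)
    print(f"{hours} h of downtime per year -> {share:.3%} availability")
```

    One hour of downtime per year corresponds to roughly 99.989% availability, and an hour and a half to roughly 99.983%.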

    For example, IaaS provider IT-GRAD places equipment in the DataSpace (Moscow) and Xelent (St. Petersburg) data centers. These are Tier III data centers, with downtime of about one and a half hours per year. Redundancy in them follows the N + 1 scheme. At the Moscow site, for instance, uninterrupted power is supplied from two city substations over six independent lines. The data center has six independent 2 MVA transformers installed, each of which is the connection point of an independent electrical circuit.

    In the event of force majeure or voltage drops, the disconnection of one power supply branch does not affect the operation of the system as a whole, since the entire load is transferred to the backup branch. As a backup plan, there are automatic diesel generators with six fuel tanks of 950 liters each. At full load, this reserve provides the data center with 84 hours of continuous operation.
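
    The fuel figures above can be cross-checked with simple arithmetic. The sketch below only uses the numbers from the text and assumes that all fuel in the tanks is usable and that consumption is constant; the resulting burn rate is an implied average, not a published specification.

```python
TANK_COUNT = 6
TANK_VOLUME_L = 950
AUTONOMY_HOURS = 84  # stated run time on the full reserve

# Total diesel reserve across all tanks, in liters.
total_fuel_l = TANK_COUNT * TANK_VOLUME_L

# Average consumption implied by the stated run time, in liters per hour.
implied_burn_l_per_h = total_fuel_l / AUTONOMY_HOURS


def autonomy_hours(fuel_l: float, burn_l_per_h: float) -> float:
    """Run time on generators for a given fuel reserve and burn rate."""
    return fuel_l / burn_l_per_h


print(total_fuel_l)                     # 5700
print(round(implied_burn_l_per_h, 1))   # 67.9
```

    The same helper can be used to estimate how the run time changes if a provider quotes a different tank volume or generator consumption.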

    Microclimate maintenance

    The next important aspect is how well the data center's cooling works. The ability of the cooling systems to maintain an optimal microclimate in the machine room affects the reliability of the hardware, the amount of electricity consumed and, accordingly, the price of the equipment hosting services.

    For example, when the temperature in the data center rises from 22 °C to 35 °C, server power consumption increases by an average of 20%. And according to ASHRAE, the engineering society that develops standards for heating, ventilation and air conditioning, temperatures below 18 °C and above 27 °C can significantly reduce the output power and battery life of uninterruptible power systems (see page 29 of the report).

    However, you also need to consider exactly how the required temperature is maintained in the data center: if the cooling system is inefficient, it will consume a large amount of electricity. In some cases, up to 40% of a data center's total energy consumption goes to air conditioning. This, in turn, affects the cost of renting equipment.

    Therefore, free cooling technology is often used to control the microclimate and air temperature in the data center. It reduces energy consumption. According to the latest data, the Xelent data center has the best power usage effectiveness (PUE) on the Russian market: 1.29. Google's data center is considered the record holder in this area: the IT giant has managed to achieve a PUE of 1.11.
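
    PUE is the ratio of the total energy consumed by the facility to the energy consumed by the IT equipment alone, so a value closer to 1 means less overhead. A minimal sketch of the calculation; the function names and the kWh inputs are illustrative, and only the PUE values 1.29 and 1.11 come from the text.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_share(pue_value: float) -> float:
    """Share of total energy (0..1) spent on everything except IT equipment."""
    return 1 - 1 / pue_value


for value in (1.29, 1.11):
    print(f"PUE {value}: {overhead_share(value):.1%} of energy is overhead")
```

    At a PUE of 1.29, about 22.5% of all consumed energy goes to cooling and other non-IT needs; at 1.11, about 9.9%.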

    In the Xelent data center, the temperature for all IT equipment is maintained in accordance with ASHRAE recommendations. A rotary heat exchanger is responsible for the microclimate in the data center. This is a large five-meter wheel that transfers heat from the machine rooms to the outside with almost no mixing of indoor and outdoor air.

    It is also necessary to make sure that the data center maintains a specified humidity level. Condensation can be dangerous for server equipment and damage it. This is what happened at Facebook's first data center in Prineville, where errors in the operation of the microclimate system led to liquid getting into the equipment. It literally “rained condensate” in the server room, and the equipment had to be shut down urgently. ASHRAE notes that the humidity level in a data center should not exceed 60%. In Facebook's case, this figure reached 95%.
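
    The thresholds mentioned in this section (18 to 27 °C and humidity of no more than 60%) are easy to turn into an automatic check of sensor readings. The sketch below is a hypothetical illustration: the `Reading` structure and the sample values are invented for the example and do not describe any real monitoring system.

```python
from dataclasses import dataclass

# Limits taken from the figures quoted in the text.
TEMP_MIN_C = 18.0
TEMP_MAX_C = 27.0
HUMIDITY_MAX_PCT = 60.0


@dataclass
class Reading:
    temperature_c: float
    humidity_pct: float


def check_reading(reading: Reading) -> list[str]:
    """Return a list of problems found in one sensor reading (empty if none)."""
    problems = []
    if reading.temperature_c < TEMP_MIN_C:
        problems.append(f"temperature {reading.temperature_c} °C is below {TEMP_MIN_C} °C")
    elif reading.temperature_c > TEMP_MAX_C:
        problems.append(f"temperature {reading.temperature_c} °C is above {TEMP_MAX_C} °C")
    if reading.humidity_pct > HUMIDITY_MAX_PCT:
        problems.append(f"humidity {reading.humidity_pct}% is above {HUMIDITY_MAX_PCT}%")
    return problems


# Hypothetical readings: a normal one and one with 95% humidity.
for sample in (Reading(22.0, 45.0), Reading(22.0, 95.0)):
    print(sample, check_reading(sample) or "ok")
```

    During a tour of the data center, it is worth asking what limits the provider's own monitoring uses and what happens when they are exceeded.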

    / photo by Tim Dorr CC

    Physical security

    Today there are data centers located in underground bunkers and guarded by armed soldiers. There are data centers protected from nuclear explosions or electromagnetic pulses (EMP). However, they are most often used by the largest transnational companies or military organizations. For most organizations, such measures are excessive and uneconomical. Still, the issue of security and physical intrusion remains relevant for everyone.

    There are three points to take into account: access control, the presence of video cameras and alarm sensors around the perimeter, and the security of the “cages” with server racks. Perhaps the best way to check each of them is a tour of the data center. That way you can assess for yourself how difficult it is to get into the machine rooms of a particular data center.

    For example, the Xelent data center has an access control system at the entrance. All visitors and cars are inspected at the checkpoint. Everyone who enters the data center site (including employees) must be registered. Two hundred video cameras are spread across the site and also monitor the situation in the server rooms. Access to machine rooms is possible only when accompanied by data center employees who have special access cards (these can be key cards or biometric cards).

    When checking the physical protection of machine rooms, evaluate not only the perimeter and server security, but also the fire safety at the facility. For example, the DataSpace data center uses a very early fire detection system. Sensors throughout the building analyze air samples, which helps prevent a fire. The data center also uses a gas fire suppression system that is safe for equipment and that, in an emergency, can keep possible damage to a minimum.

    Let's sum up

    When assessing the reliability of a cloud provider’s data center, the following things should be done:

    • Pay attention to the redundancy of the engineering infrastructure. The level of availability depends on it. Choose the redundancy scheme that fits the requirements and objectives of the company.
    • Assess the cooling system and how it maintains the microclimate in the machine room. It is good if the data center uses technologies aimed at reducing PUE. That way the data center spends more of its electricity on computation rather than on cooling servers, which saves customers money.
    • Check that physical protection of server rooms is organized inside the data center (security, fire suppression systems, video surveillance) and that strict procedures for admitting visitors to the site are in place.

    However, the security and reliability of a data center are determined not only by physical security measures, but also by software: firewalls, DDoS protection mechanisms, data encryption, etc. We will discuss these aspects in our next article.
