Three stories of data center modernization

    Hi, Habr! This year marks 10 years since we launched our first data center, OST-1. Over that time, my colleagues from the operations and capital construction teams and I have carried out more than one upgrade of the data center's engineering infrastructure. Today I will tell you about the most interesting cases.

    A 200-ton crane sets a new Stulz chiller on its frame. Upgrade of the OST-1 data center cooling system, 2015.

    A data center is a living organism: it grows, changes, breaks :) I roughly divide everything that counts as modernization into three kinds:

    • Scheduled replacements and repairs. Equipment becomes obsolete and reaches the end of its service life. We budget and plan this work and do it without haste, when it suits us (for example, a full refresh of a UPS's internals or a battery replacement).
    • Design errors. According to the precepts of Uptime, all resources should run out at the same time. With poor design, the balance of “cooling, power, space” gets upset: for example, there is room for more racks, but the hall can no longer supply the power or the cooling. The nastiest thing about these errors is that they do not show up right away, only when the data center approaches its design capacity.
    • Accidents. Sometimes equipment is damaged suddenly and beyond repair, and it has to be replaced.
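    The “cooling, power, space” balance from the second bullet can be illustrated with a small sketch. It is only an illustration: the `Hall` structure, the function names, and all numbers are assumptions made up for this example, not figures from our halls.

```python
from dataclasses import dataclass


@dataclass
class Hall:
    """Design limits of a machine hall (all values are illustrative)."""
    rack_positions: int   # how many racks physically fit
    power_kw: float       # electrical capacity available for IT load
    cooling_kw: float     # cooling capacity available for IT load


def rack_limits(hall: Hall, kw_per_rack: float) -> dict[str, int]:
    """Return how many racks each resource could support on its own."""
    return {
        "space": hall.rack_positions,
        "power": int(hall.power_kw // kw_per_rack),
        "cooling": int(hall.cooling_kw // kw_per_rack),
    }


def bottleneck(hall: Hall, kw_per_rack: float) -> tuple[str, int]:
    """Return the resource that runs out first and the rack count it allows."""
    limits = rack_limits(hall, kw_per_rack)
    name = min(limits, key=limits.get)
    return name, limits[name]


# Hypothetical hall: room for 130 racks, but power and cooling sized for 100.
hall = Hall(rack_positions=130, power_kw=500.0, cooling_kw=500.0)
print(rack_limits(hall, kw_per_rack=5.0))
print(bottleneck(hall, kw_per_rack=5.0))
```

    In a balanced design all three limits would come out roughly equal. A large gap between them is the kind of design error that stays hidden until the hall fills up.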

    I will not dwell on scheduled replacements and repairs: there, almost everything is in our hands. Instead, here are three stories about design errors and post-accident upgrades.

    Story 1. The machine hall ran short of cooling.

    This is a story about one of our first halls on Borovaya. It is still in operation. The hall has a design capacity of 80 racks at 5 kW each.

    As the hall filled up, the cooling stopped keeping up: the temperature in the cold aisles was higher than it should be, and local hot spots kept appearing. Only later, with more experience, did we realize that we had made design mistakes and that the cooling suffered because of them.

    • A long row of racks, more than 20 in a row: hot air stagnated in the middle of the row.
    • Low ceilings, up to 3 meters: there was not enough space for proper ventilation, and local hot spots appeared.
    • A low raised floor with a lot of cabling and piping under it: this obstructed the circulation of cold air under the floor.

    The row is so long that the air conditioners at the far end are barely visible. Photo from 2009.

    We saw no “magic pill” for these problems at the time, so we decided to act in stages and on all fronts.

    First, we checked that all the equipment was installed correctly and that blanking panels were fitted in the empty rack units. We also rechecked the layout of the perforated tiles, removed the surplus ones, and installed additional air guides under the raised floor. We tried to find and seal every gap where cold air could escape. I advise you to check, too, what you have between the air conditioner and the wall: a gap of 5-7 cm is already a lot.

    This is the result we got simply by fitting blanking panels in the empty units.

    Things got better, but not good enough. Then we decided to contain the cold aisles: we built a roof and doors out of polycarbonate. It turned out cheap and effective. As a result, we got rid of the parasitic mixing of hot and cold air and raised the efficiency of the cooling system.

    Contained cold aisle in the same hall.

    We understood that this would only be enough for a while: as the IT load grew, the shortage of cooling capacity would make itself felt again.

    We tried to solve this by adding a freon air conditioner, even though the hall ran on glycol cooling. We were very worried about the unit's dimensions (whether it would fit through the door, whether there would be enough room to turn it), so we chose a model that could be partially disassembled. We installed it not on the hot-aisle side, as is usually done, but where we could squeeze it in. This gave us an extra 80 kW of cooling.

    Here it is, the “flexible” Emerson air conditioner.

    The whole job turned out to be difficult: we had to figure out how to run the freon lines to the outdoor units, how to bring power to the air conditioner, and where to put the outdoor units. All of this in a working hall.

    Just to give an idea of how little space there is.

    After all these measures, we got rid of the local hot spots, and the temperature evened out across the cold and hot aisles. We managed to raise the hall's capacity and fit the declared 5 kW racks into it.

    The moral of this story is that you should not be afraid to solve a problem in small steps. Taken alone, each step may seem ineffective (and did seem so to us at the time), but together they produce a result.

    Story 2. The machine hall ran out of cooling and power.

    The machine hall was designed for a specific client: 100 racks of 5 kW each, 800 mm wide, 10 racks per row. Then the client changed his mind about moving in, and the hall was offered on general terms. In practice, 800 mm racks are needed mainly for network equipment; for everything else, 600 mm racks are enough. As a result, instead of 10 racks per row we fitted 13, and there was still room left. But there was not enough power or cooling.

    During the upgrade, a new room was set aside for two additional 300 kW UPSs.

    Additional distribution boards appeared in the hall.

    The new power had to be distributed evenly. To keep the new and old power feeds apart, separate cable trays were laid under the raised floor. Some of the running IT equipment was moved to the new distribution boards by switching over its power feeds one at a time.

    To solve the cooling shortage, we installed one additional air conditioner with 100 kW of cooling capacity.

    Throughout the rigging, installation, and commissioning of all the equipment, the hall kept operating normally. This was the hardest part of the project.

    As a result of the upgrade, we added enough power and cooling to the hall for another 30 racks of 5 kW each.

    The design capacity of the hall increased by 30%.
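    The arithmetic behind these figures is easy to check. The sketch below uses only the numbers given in this story; the row length is an assumption, taken as 10 racks of 800 mm with any gaps and end panels ignored, and the function names are made up for the example.

```python
def racks_per_row(row_length_mm: int, rack_width_mm: int) -> int:
    """How many racks of the given width fit into a row of the given length."""
    return row_length_mm // rack_width_mm


def capacity_growth(design_racks: int, added_racks: int, kw_per_rack: float) -> float:
    """Relative growth of the hall's IT load capacity after adding racks."""
    design_kw = design_racks * kw_per_rack
    added_kw = added_racks * kw_per_rack
    return added_kw / design_kw


# Assumption: the row was sized exactly for 10 racks of 800 mm.
row_length_mm = 10 * 800

print(racks_per_row(row_length_mm, 600))   # 600 mm racks per row
print(row_length_mm - 13 * 600)            # millimetres left over after 13 racks
print(30 * 5.0)                            # added IT load, kW
print(f"{capacity_growth(100, 30, 5.0):.0%}")
```

    Under that assumption, thirteen 600 mm racks fit into the row with some length to spare, which matches what we saw in the hall, and 30 extra racks of 5 kW come to 150 kW on top of the original 500 kW.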

    Story 3. Replacing the chillers

    A little background. It all started in 2010, when three chillers at the OST data center were badly damaged during a hurricane. To survive, we had to run the chillers without protection for several days, and the compressors quickly failed. We started by replacing those.

    The IT load grew as the data center filled up, and the Emicon chillers did not reach their declared cooling capacity. In 2012, an additional Hiref chiller was installed in the same hydraulic circuit. We lived like that for another three years.

    Over time, the operational problems with the Emicon chillers got worse. Their capacity was not enough, so in hot weather we had to spray them with water from a Karcher pressure washer. Over the years, the heat exchangers became coated with limescale. Poplar fluff and other debris got packed into the gap between the free-cooling heat exchanger and the freon condenser, and the design of the heat exchangers made it impossible to remove. It formed a dense layer of felt that air could barely pass through.

    In 2015, we happened to be buying a batch of Stulz chillers for NORD-4. We decided to take the opportunity to replace two of the three Emicon chillers as well. Now for the details.

    Installing an additional Hiref chiller without retrofitting pumps. The IT load grew, and the efficiency of the hurricane-damaged chillers fell. In summer, the reserve was barely enough. We decided to add another chiller to raise the total capacity. The cooling system had to keep running during the work. The hardest part of this operation was organizing the glycol circuit. We did the glycol piping as follows: a branch was run from each existing chiller to the new one. The chillers were taken out of service one at a time while the glycol pipe was brought to the new chiller.

    Fragment of the hydraulic schematic. It shows the branches made from each of the three chillers to the new one.

    The main job of this chiller is to support the cooling system in summer. Thanks to the Hiref, we had a guaranteed N+1 reserve in the hot months. But the chillers damaged in the hurricane slowly began to fail, and we had to think about replacing them.

    That same “summer” Hiref chiller.
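    What an N+1 reserve means in practice can be sketched in a few lines: the heat load must still be covered after any single chiller is lost, so the check removes the largest unit. The function name and all capacities below are hypothetical, chosen only to illustrate the idea; they are not the real figures for our chillers.

```python
def has_n_plus_1(capacities_kw: list[float], load_kw: float) -> bool:
    """True if the load is still covered after losing the largest unit."""
    if len(capacities_kw) < 2:
        return False  # a single unit has no reserve at all
    remaining = sum(capacities_kw) - max(capacities_kw)
    return remaining >= load_kw


# Hypothetical summer capacities of three degraded chillers, in kW.
old_chillers = [250.0, 250.0, 250.0]
load_kw = 600.0  # hypothetical heat load

print(has_n_plus_1(old_chillers, load_kw))            # reserve without the extra chiller
print(has_n_plus_1(old_chillers + [400.0], load_kw))  # reserve with a fourth unit added
```

    With these made-up numbers, the three degraded units alone do not provide the reserve, while adding a fourth unit restores it. This is the role the “summer” chiller played for us.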

    Replacing Emicon with Stulz. Such replacements are best done in autumn or spring: in summer, running without a reserve is downright scary, and in winter the work is simply unpleasant. The operation was planned for February-March, but preparations began back in October.

    During those preparatory months, we laid new cables, welded pipeline sections, worked out a route for the truck carrying the equipment (our backyard is cramped), and cleared a site for the crane. The chillers had to be replaced in a working data center, which stayed without a backup chiller for about 1.5 days. At the preparatory stage, we ran tests to understand how the data center would behave without a reserve, thought through the situations in which something could go wrong during the work (for example, a long power outage during the replacement), and developed an action plan. Here is a brief chronicle of the work.

    The chiller arrived at night. Once the crane had made it onto the data center site, we could start disconnecting the old chiller.

    The old chiller is still in place while preparations are underway. The frame for the new chiller is being welded.

    Then the truck with the chiller had to reach the work site itself. It is, to put it mildly, cramped there. We had to sweat to make all those tight turns in the limited space.

    The old chiller, disassembled and sawn in half, has been removed.

    The old and new chillers differ in size, so it took some time to prepare the metal frame. All that remains is to lift the chiller and set it in place.

    In the background of the photo, you can see sections of the glycol circuit for the new chiller being welded in parallel.

    After the chiller is set on the frame, all the hydraulics are mounted and the chiller is connected to power. Overnight, a pressure test is done. The next day, commissioning and connection to the monitoring system take place.

    The whole operation took less than two days: the old chiller was switched off in the morning, and the new one was switched on at the end of the following day.

    Two weeks later, we replaced the second chiller. It seemed we just had to follow the established procedure, but something went wrong. It snowed all night. First we had to spend time clearing the site so that the crane could drive in. We had already started dismantling the old chiller when the truck with the new one broke down two hundred meters from us. The point of no return had been passed, and the steering mechanism of the trailer wheels had failed (more precisely, its remote control).

    It could not be repaired on the spot, so they drove off for a spare remote control, which by a miracle turned out to be in the company's office on a Saturday. With the remote, the truck managed to make the turn. In the end, a single turn took us more than 3 hours. With all these logistical hiccups, the work dragged on into the night. Fortunately, we had planned lighting for working in the dark. The rest of the work went as normal, and from Monday another new chiller was running in the data center.

    In March of this year, my colleagues replaced the third chiller, the last of the hurricane survivors. Three Stulz chillers and one Hiref now run on Borovaya. Thanks to this phased upgrade, we now have a large cooling reserve and are not afraid of the hottest weather or of poplar fluff. The new chillers support free cooling over a wider temperature range, consume less energy, and run very quietly. They are also convenient to maintain thanks to separate compressor compartments: repairs can be done without shutting the chiller down completely.
