Google uses machine learning to increase data center efficiency

The Internet giant is using machine learning and artificial intelligence to increase the efficiency of its data centers. According to Joe Kava, Google's vice president of data centers, the company has begun using neural networks to analyze the huge volume of data it collects about its servers and to recommend ways to improve their operation.

In effect, Google has built a computer that knows more about its data centers than the engineers themselves do. People are not being written off, but Kava believes the neural network will let Google reach new levels of server farm efficiency by going beyond what engineers can see and analyze.

Google already operates some of the most energy-efficient data centers on the planet. Artificial intelligence will let Google look into the future and simulate how its data centers would behave under thousands of scenarios.

In early use, the neural networks allowed Google to predict PUE (power usage effectiveness) with an accuracy of 99.6%. The recommendations derived from the model, though seemingly minor, led to significant cost savings because they were applied across thousands of servers.
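
For readers unfamiliar with the metric, PUE is conventionally defined as total facility energy divided by the energy consumed by IT equipment, so a value close to 1.0 means little overhead. The sketch below computes PUE and shows one possible reading of "99.6% accuracy", namely one minus the mean relative prediction error. That reading is an assumption; the article does not say how the accuracy figure was calculated. The function names and all readings are hypothetical.

```python
def pue(total_facility_kwh, it_equipment_kwh):
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def prediction_accuracy(predicted, actual):
    """One minus the mean relative error (an assumed definition of accuracy)."""
    relative_errors = [abs(p - a) / a for p, a in zip(predicted, actual)]
    return 1.0 - sum(relative_errors) / len(relative_errors)


# Hypothetical meter readings (kWh) for two measurement intervals.
actual = [pue(1120.0, 1000.0), pue(1100.0, 1000.0)]
# Hypothetical model predictions for the same intervals.
predicted = [1.116, 1.104]

accuracy = prediction_accuracy(predicted, actual)
print(f"actual PUE: {actual}, accuracy: {accuracy:.4f}")
```

Under this reading, an accuracy of 99.6% at a PUE near 1.1 corresponds to an average error of roughly 0.004 PUE points.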

Why did Google turn to machine learning and neural networks? The main reason is that its data centers keep growing, which is a challenge for a company that uses sensors to collect millions of data points on infrastructure and energy consumption.

“In a dynamic environment such as a data center, it’s sometimes difficult for a person to see all the relationships between system variables,” Kava says. “We have been working on data center optimization for a long time. All the obvious best practices have certainly been implemented already, but we should not stop!”


Meet the boy genius

Google’s neural network was created by Jim Gao, a Google engineer whom colleagues nicknamed the “boy genius” for his ability to analyze large amounts of data. Gao had been analyzing cooling systems, using fluid dynamics and monitoring data to build a 3D model of air flows inside the server room.

Gao thought it was possible to create a model that monitors an even larger set of variables, including IT equipment utilization, weather conditions, cooling towers, water pumps, and heat exchangers that maintain the normal temperature of Google’s servers.

“Computers are good because they can see the whole story hidden in the data. Jim took the information that we collect daily and ran it through his model to make sense of complex chains of interaction, to find the meaning that his team, as mere mortals, might not notice,” Kava writes on his blog. “After a series of trials and errors, Jim’s model now predicts PUE with 99.6% accuracy. This means that he can now apply the model in search of new ways to make our operations more efficient.” The image below shows the correlation between the predicted (black curve) and the actual (yellow curve) PUE changes.

How it works

Gao began working on machine learning as a “20 percent project.” By tradition, Google lets its employees spend part of their working time on innovative projects outside their core responsibilities. Gao was not an expert in artificial intelligence. To learn the fundamentals of machine learning, he took a Stanford course taught by Professor Andrew Ng.

A neural network imitates the functioning of the human brain, allowing the computer to understand and “learn” tasks without explicit programming. The Google search engine is often cited as an example of this type of learning, which is also one of the company’s key research areas. “This model is nothing more than a series of differential calculus equations,” Kava explained. “But you must understand the math. The model begins by learning how the variables interact.”

To get started, Gao needed to identify the key factors affecting energy efficiency at Google's data centers. He narrowed the number of these indicators to 19 and designed a neural network, a machine learning system that can recognize patterns in large data sets.

“The sheer number of combinations of equipment and settings makes it difficult to find optimal performance,” Gao writes in his report. “In a working data center, tasks can be implemented with a variety of combinations of equipment (mechanical and electrical) and software (control strategies and setpoints). It’s almost impossible to test every combination to increase efficiency, given time constraints, frequent fluctuations in IT load, weather conditions, and the need to keep the data center running stably.”
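
To make the idea concrete, here is a minimal sketch of a neural network that maps 19 input features to a predicted PUE. It is not Gao's model: the single hidden layer, its size, the learning rate, and the synthetic data are all assumptions chosen to keep the example short and dependency-free. Only the count of 19 inputs comes from the article.

```python
import math
import random

N_FEATURES = 19  # the article says Gao narrowed the inputs to 19 factors
N_HIDDEN = 8     # hypothetical hidden-layer size, chosen only for this sketch


def make_synthetic_data(n_rows, rng):
    """Fabricated stand-in for sensor data: features in [0, 1], PUE near 1.1."""
    true_w = [rng.uniform(-0.02, 0.02) for _ in range(N_FEATURES)]
    rows = []
    for _ in range(n_rows):
        x = [rng.random() for _ in range(N_FEATURES)]
        rows.append((x, 1.1 + sum(w * xi for w, xi in zip(true_w, x))))
    return rows


def init_model(rng):
    """Small random weights; biases start at zero."""
    return {
        "w1": [[rng.uniform(-0.1, 0.1) for _ in range(N_FEATURES)]
               for _ in range(N_HIDDEN)],
        "b1": [0.0] * N_HIDDEN,
        "w2": [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)],
        "b2": 0.0,
    }


def forward(model, x):
    """Return hidden activations (tanh) and the scalar PUE prediction."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(model["w1"], model["b1"])]
    out = sum(w * h for w, h in zip(model["w2"], hidden)) + model["b2"]
    return hidden, out


def train_epoch(model, rows, lr):
    """One pass of stochastic gradient descent on squared error."""
    for x, target in rows:
        hidden, out = forward(model, x)
        err = out - target
        for j in range(N_HIDDEN):
            # Gradient at hidden unit j, computed before w2[j] is updated.
            grad_h = err * model["w2"][j] * (1.0 - hidden[j] ** 2)
            model["w2"][j] -= lr * err * hidden[j]
            for i in range(N_FEATURES):
                model["w1"][j][i] -= lr * grad_h * x[i]
            model["b1"][j] -= lr * grad_h
        model["b2"] -= lr * err


def mean_abs_error(model, rows):
    return sum(abs(forward(model, x)[1] - y) for x, y in rows) / len(rows)


rng = random.Random(0)
data = make_synthetic_data(200, rng)
train, test = data[:150], data[150:]  # hold out rows to check generalization
model = init_model(rng)
error_before = mean_abs_error(model, test)
for _ in range(50):
    train_epoch(model, train, lr=0.05)
error_after = mean_abs_error(model, test)
print(f"mean absolute PUE error: {error_before:.4f} -> {error_after:.4f}")
```

Once such a model fits historical data well, candidate settings can be fed through `forward` to estimate their effect on PUE without trying them on live equipment, which is the kind of use the article describes.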

Runs on a single server

As for hardware, Kava says the system does not need enormous computing power: it runs on a single server and could even run on a high-end desktop computer.

The system has been deployed at several Google data centers. The machine learning tool proposed several changes that led to incremental improvements in PUE, including better load balancing as infrastructure capacity was increased, and small changes to the water temperature in the cooling system.

“Recent tests at Google’s data centers have shown that machine learning is an effective method of using existing sensor readings to model the distribution of energy in a data center and leads to significant cost savings,” Gao writes.

The machines won't take over

Kava believes that this tool will help Google model and improve other projects in the future. But do not worry: Google's data centers will not become self-aware any time soon. The company is interested in automation and has recently acquired several robotics companies, but so far none of Google’s data centers runs entirely under automated control.

“We still need people to draw the right conclusions from all this,” Kava says. “And I still want our engineers to review these recommendations.”

The greatest benefits of the neural network will appear in the coming years, as Google builds new server facilities. “I foresee this principle being used in data center design,” says Kava. “This advanced technology can be used both in design and in future improvements. I think we will find other ways to use it.”

Google describes its approach to machine learning in a paper by Gao, in the hope that others who run large data centers will be able to put it into practice. “This mechanism is not something special that only Google or Jim Gao can use,” Kava says. “I would really like to see a wider application of this technology. I think the whole industry will benefit from this. It’s an amazing tool for becoming as efficient as possible.”
