CooperMaster October 26, 2016 at 15:09

Machine Learning and Intel Xeon: Tencent In-Game Shopping Recommender

Translation

Online games are very popular these days, especially among young people. Games fill free time, and family members or friends often become virtual allies or enemies. In many cases, players need to buy something to improve their character and gain an advantage over other gamers.


To improve how it interacts with users, Tencent has introduced a recommendation system. This system is based on machine learning methods and is designed to help users make decisions about in-game purchases.

Tencent's business is based on the Internet. The company offers many services, including a social network, web portals, e-commerce solutions and online multiplayer games. Here we discuss what recommender systems are, which machine learning algorithm Tencent uses, and how Intel Xeon processors have helped improve the performance of Tencent's system.

Recommender Systems

A recommender system is a mechanism that presents a user with a list of recommended items to choose from. Such systems are widely used to help users decide what they need, and they are not limited to games: they are also used to select music, films, scientific publications and so on.

In Tencent's case, one application of the recommendation system is to suggest in-game items that suit the player's needs.

The recommender system builds its list of items using one of the following methods: collaborative filtering, content filtering, or a hybrid approach.

Collaborative filtering is an algorithm that makes recommendations based on user ratings or behavior. It analyzes activities or preferences and predicts what new users might like based on their similarity to users it has already learned about.
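To make the idea concrete, here is a minimal sketch of user-based collaborative filtering. It is not Tencent's implementation: the players, items, ratings and the choice of cosine similarity over shared items are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical ratings: player -> {item: rating}. Illustrative data only.
ratings = {
    "alice": {"sword": 5, "shield": 4, "potion": 1},
    "bob": {"sword": 4, "shield": 5, "armor": 5},
    "carol": {"potion": 5, "scroll": 4, "sword": 1},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both players rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def recommend(target, ratings, top_n=1):
    """Rank items the target has not rated by similarity-weighted ratings."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = cosine_similarity(ratings[target], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(scores, key=lambda i: scores[i] / weights[i], reverse=True)
    return ranked[:top_n]


print(recommend("alice", ratings))
```

In this toy data set, "alice" rates items much like "bob" does, so the item that "bob" rated highly and "alice" has not seen yet comes out on top.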

The content filtering algorithm bases its recommendations on the properties of items and on information about the user and the user's interests.
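A content-based recommender can be sketched in a few lines as well. The item tags, the player profile and the use of Jaccard overlap as the matching score below are assumptions chosen for illustration, not details from Tencent's system.

```python
# Hypothetical item properties; the tags are illustrative only.
items = {
    "flame_sword": {"weapon", "melee", "fire"},
    "ice_staff": {"weapon", "magic", "ice"},
    "healing_potion": {"consumable", "healing"},
}


def jaccard(a, b):
    """Overlap between two tag sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_by_content(profile, items, top_n=1):
    """Rank items by how closely their tags match the player's interests."""
    ranked = sorted(items, key=lambda name: jaccard(profile, items[name]), reverse=True)
    return ranked[:top_n]


# Hypothetical interest profile for one player.
player_profile = {"melee", "fire", "weapon"}
print(recommend_by_content(player_profile, items))
```

Unlike collaborative filtering, this approach needs no data about other players, only a description of the items and of the user.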

The hybrid algorithm combines the strengths of the two approaches described above. Tencent's recommendation system uses a machine learning algorithm based on logistic regression. Here is a brief description of that algorithm.

Logistic Regression

Logistic regression is a statistical method of predictive analysis. It is one of the most popular machine learning methods for binary classification, that is, classification into two classes: for example, “win” or “lose”, “yes” or “no”, “true” or “false”, “1” or “0”. Betting on horse races is a typical case. A horse either wins the race or loses it, so there are two classes: “win” and “lose”. Here the target (dependent) variable is the outcome of the race. It has a value of 1 if the horse wins and 0 otherwise.

Logistic regression models the logarithm of the odds of an event (the log-odds) as a linear function of the independent variables, using the following equation:

$$\ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$$

p is the probability of the event.
1-p is the probability that the event does not occur.
β are the weights (coefficients).
x are the independent variables.

During training, the coefficients β in the formula above are estimated. They are then used to predict the probability of an event.
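The sketch below shows one common way to estimate the coefficients: batch gradient descent on the log-loss. It assumes the standard logistic model, in which ln(p / (1 - p)) equals β_0 plus the weighted sum of the x values. The toy data, learning rate and number of epochs are hypothetical, and the article does not say which optimizer Tencent uses.

```python
from math import exp, log


def sigmoid(z):
    """Inverse of the logit: maps log-odds z to a probability p."""
    return 1.0 / (1.0 + exp(-z))


def predict(beta, x):
    """p = sigmoid(beta_0 + beta_1 * x_1 + ... + beta_n * x_n)."""
    z = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    return sigmoid(z)


def fit(xs, ys, lr=0.1, epochs=2000):
    """Estimate beta by batch gradient descent on the log-loss."""
    n_features = len(xs[0])
    beta = [0.0] * (n_features + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for x, y in zip(xs, ys):
            error = predict(beta, x) - y  # derivative of log-loss w.r.t. z
            grad[0] += error
            for j, xj in enumerate(x):
                grad[j + 1] += error * xj
        beta = [b - lr * g / len(xs) for b, g in zip(beta, grad)]
    return beta


# Hypothetical toy data: one feature per sample, label 1 = "win", 0 = "lose".
xs = [[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]]
ys = [0, 0, 0, 1, 1, 1]

beta = fit(xs, ys)
print("P(win | x=0.5) =", round(predict(beta, [0.5]), 3))
print("P(win | x=4.0) =", round(predict(beta, [4.0]), 3))
```

Taking `log(p / (1 - p))` of any predicted probability recovers the linear expression in β and x, which is exactly the relationship the equation describes.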

Tencent Recommender System and Intel Xeon E5 v4

The Tencent machine learning system analyzes huge amounts of data on player behavior in order to recommend game items that players might want to use. Training the model as quickly as possible therefore requires serious computing power. The calculation of the logistic regression coefficients relies heavily on DGEMM, a function that multiplies matrices of double-precision floating-point numbers.
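For readers unfamiliar with BLAS, the following is an unoptimized reference sketch of what DGEMM computes: C ← alpha·A·B + beta·C. The real routine also takes transpose flags and leading-dimension arguments, which are left out here, and the sample matrices are hypothetical. Python floats are double precision, which matches the "D" in the name.

```python
def dgemm(alpha, a, b, beta, c):
    """Reference version of the DGEMM operation: alpha * A @ B + beta * C.

    Matrices are lists of rows. Returns a new matrix instead of
    overwriting C, unlike the BLAS routine.
    """
    m, k, n = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    return [
        [
            alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
            for j in range(n)
        ]
        for i in range(m)
    ]


# Hypothetical batch: 3 samples x 2 features, and a 2 x 1 column of weights.
X = [[1.0, 2.0], [0.0, 1.0], [3.0, 1.0]]
w = [[0.5], [-1.0]]
zeros = [[0.0] for _ in X]

# One matrix product yields the linear score of every sample in the batch.
scores = dgemm(1.0, X, w, 0.0, zeros)
print(scores)  # [[-1.5], [-1.0], [0.5]]
```

Expressing the per-sample sums as a single matrix product is what lets an optimized library spread the work across vector units and cores.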

The machine learning system calls the DGEMM function through the Intel Math Kernel Library (Intel MKL). Intel Xeon E5 v4 processors support the Intel Advanced Vector Extensions 2 (Intel AVX2) instruction set. Intel MKL is well optimized and can achieve very high performance using Intel AVX2. The library automatically detects the capabilities of the processor it runs on and uses any that it has been optimized for. Therefore, if a project uses Intel MKL, keeping the library up to date is enough to get the best performance on new processors.
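The general idea of runtime dispatch can be illustrated with a small sketch. This is not how Intel MKL is implemented internally: the kernel names are hypothetical, and the feature detection simply parses text in the format of Linux `/proc/cpuinfo`.

```python
def parse_cpu_flags(cpuinfo_text):
    """Collect feature flags from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def choose_kernel(flags):
    """Pick the most capable code path the CPU supports (hypothetical names)."""
    preference = (
        ("avx2", "dgemm_avx2"),
        ("avx", "dgemm_avx"),
        ("sse2", "dgemm_sse2"),
    )
    for feature, kernel in preference:
        if feature in flags:
            return kernel
    return "dgemm_generic"


# Sample text standing in for the contents of /proc/cpuinfo.
sample = "processor : 0\nflags : fpu sse2 avx avx2 fma\n"
print(choose_kernel(parse_cpu_flags(sample)))  # dgemm_avx2
```

Because the choice is made at run time, the calling application does not change when a new code path is added to the library.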

Performance testing

Here we compare the performance of dual-processor systems when working with an application that implements Tencent's machine learning algorithm.

The first system is based on Intel Xeon E5-2699 v3 processors (2.3 GHz, 18 cores, 45 MB cache) and is equipped with 128 GB of RAM (DDR4, 2133 MT/s).

The second uses Intel Xeon E5-2699 v4 processors (2.2 GHz, 22 cores, 55 MB cache) with RAM of the same capacity and specifications.

The test machines run Red Hat Enterprise Linux 7.2 (kernel 3.10.0-327). The following software is used: GNU Compiler Collection (GCC) 4.8.2, OpenJDK 7, Spark 1.5.2 and Intel MKL 11.3.

The test results below show an improvement in the performance of the application and, accordingly, of the coefficient calculation module.

Comparison of application performance using Intel Xeon E5-2699 v3 and v4 processors.

The application scales with the number of cores. On the Intel Xeon E5-2699 v4, which has more cores, it can run more parallel processes than on the Intel Xeon E5-2699 v3, which shortens training time and increases performance.
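As a rough sanity check, the core counts and base clocks from the test configuration give a naive upper bound on what extra cores alone could contribute. This back-of-the-envelope sketch assumes perfect scaling and ignores turbo frequencies, cache size, memory bandwidth and microarchitectural changes, so it is not a prediction of the measured result.

```python
# Per-processor figures taken from the test configuration above.
v3 = {"cores": 18, "base_ghz": 2.3}
v4 = {"cores": 22, "base_ghz": 2.2}


def aggregate_ghz(cpu):
    """Cores times base clock: a crude proxy for parallel throughput."""
    return cpu["cores"] * cpu["base_ghz"]


core_ratio = v4["cores"] / v3["cores"]
throughput_ratio = aggregate_ghz(v4) / aggregate_ghz(v3)

print(f"core ratio: {core_ratio:.2f}")
print(f"cores x clock ratio: {throughput_ratio:.2f}")
```

The v4 part has about 22% more cores, but its slightly lower base clock brings the combined cores-times-clock figure down to roughly 17% above the v3 part.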

Here is the test result on the Intel Xeon E5-2699 v4 with the Intel AVX2 instruction set disabled and enabled. With AVX2 enabled, the logistic regression coefficient calculation module runs 44% faster.

Comparison of the performance of the logistic regression coefficient calculation module on Intel Xeon E5-2699 v4 with Intel AVX2 disabled and enabled

Please note that the test results above are valid for a specific combination of software and hardware, and that they reflect software optimization for Intel processors. Any change in configuration can change these results. The same applies to general-purpose benchmarks, whose results also depend on configuration. Therefore, when deciding which equipment to purchase, it is worth consulting several sources of performance information and considering how a given combination of components, such as the processor and RAM or other hardware and software, can affect performance. Details on Intel system performance can be found here.

Conclusions

Tencent games have a built-in in-game recommendation system. Optimizing its performance speeds up decision-making, which lets the system quickly recommend the most suitable game items to players. The Intel MKL library uses the Intel AVX2 instruction set, which improves the performance of applications that use the library on systems equipped with Intel Xeon processors.
