Ingran85 November 9, 2015 at 10:12

Using mathematical analysis in computer games


Introduction

In many games, especially RPGs, stats are very important. Attack, defense, resistance, damage, armor penetration, misses and so on affect the damage you deal to the enemy or take from it. Most often, players stick to the tactic "the more of everything, the better." This approach usually comes not from a well-thought-out character development strategy but from the lack of a detailed analysis of the game, from laziness, or from not knowing exactly how (by what formula) stats influence particular indicators. Moreover, the game's creators very often make it impossible to increase all characteristics at once, so it is especially important to choose which stats to raise and by how much.

Below we describe a method that in some cases lets you obtain an explicit formula for the dependence of one parameter on another (for example, spell power on intelligence, or the percentage of damage reduction on the amount of protection). The method is applicable wherever we can change one parameter and observe how the second one changes in response. It also applies when the second parameter is itself a random variable, as long as its average value depends strictly on the first.

The method is illustrated with two examples from the game ArcheAge: the dependence of a pet's spell power on intelligence, and the dependence of the percentage of damage reduction on the player's total protection. The basis of the method is the least squares method, which is widely known and used in many fields. The calculations use Wolfram Mathematica (any version). The main value of this article is the step-by-step description of what needs to be done to obtain the formula of interest. Those familiar with the least squares method and Wolfram Mathematica can go directly to the examples.

Least squares method (OLS)

The least squares method is described in great detail in the literature, so I will only outline its essence. Suppose we know the general form of the dependence of one quantity on another (how to find that general form is explained later). For now, take as an example a dependence of the form y = a * x ^ 2 + b * x + c, where y is one quantity, x is the other, and a, b, c are some parameters. To determine the dependence completely, it is exactly these parameters that must be found, because the form of the dependence itself is assumed to be known.

In the simplest case, the value of one quantity at a particular value of the other can be found from observations, experiments, or other sources. Three such pairs are enough to compose a complete system of equations and solve it for the three unknown parameters a, b, c. In some games this is all that is needed.
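The three-pair case can be sketched in Mathematica as follows. The three points are made up for illustration (they are not measurements from any game), and the names pts and eqs are my own.

```mathematica
(* Three hypothetical (x, y) pairs, chosen only for illustration. *)
pts = {{1, 6}, {2, 11}, {3, 18}};

(* One equation a*x^2 + b*x + c == y per observed pair. *)
eqs = Table[a*p[[1]]^2 + b*p[[1]] + c == p[[2]], {p, pts}];

(* Three equations, three unknowns: the solution is exact. *)
Solve[eqs, {a, b, c}]
(* {{a -> 1, b -> 2, c -> 3}} *)
```

This only works when the observed values contain no noise and no rounding. Otherwise three points give three slightly wrong parameters.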

Everything gets more complicated when another term, a random variable, is added to the dependence: y = a * x ^ 2 + b * x + c + Random_Value. The developers may introduce it deliberately, as with the spread in damage, but it can also have another cause. The exact value of a function may have more digits than fit in the output field of the game menu. In that case a rounded value is displayed, and you cannot tell whether the exact value is larger or smaller than what the interface shows. So rounding effectively adds a random value (it can be negative or positive, but on average it is zero).
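A small sketch of rounding acting as noise. The formula 100 * x / (1000 + x) and the two-decimal display are assumptions made up for this illustration, not something taken from a real game.

```mathematica
(* Hypothetical "exact" values of some stat formula. *)
exact = Table[100.*x/(1000 + x), {x, 50, 500, 50}];

(* What an interface with two decimal places would display. *)
shown = Round[exact, 0.01];

(* The rounding error plays the role of Random_Value. *)
errors = shown - exact;
{Min[errors], Max[errors], Mean[errors]}
```

Each error should be at most about 0.005 in absolute value, with both signs present, which is the behaviour described above.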

When a random variable is added to the "exact" dependence y = a * x ^ 2 + b * x + c, the values of y that are actually observed will not lie on the curve a * x ^ 2 + b * x + c, whatever the parameters are. If the dispersion (average scatter) of the random variable is not too large, the observed points plotted on the coordinate plane will lie quite close to the curve a * x ^ 2 + b * x + c with the true parameters a, b and c, but generally not on it. Some points may even fall on the curve, because the random variable may simply have taken the value zero at that point. But how, then, can we find the parameters of our function if even the points we know do not lie on it? The least squares method chooses the parameters so that the distance from the points to the curve is minimal. This is the essence of the least squares method.

Here it is worth clarifying that the distance from a point to the curve does not mean the distance to the nearest point of the curve. A point is a pair (value of the variable, observed value of the function), and the distance is the difference between the observed value and the "exact" value of the function at the same value of the variable. In the ideal case this difference equals the value that Random_Value took at that point. It is also worth clarifying that what must be minimal is the sum of the differences over all points, without exception. It is often tempting to throw out the most "inconvenient" points, those that lie far from the presumed "exact" curve, so that everything else looks better. This must not be done unless you are absolutely sure that the measurement was erroneous. Another important point: the distance here is measured as a non-negative value. It does not matter whether the observed point is above or below the curve, only how far away it is.

The general form of the dependence, as stated earlier, is y = a * x ^ 2 + b * x + c + Random_Value. From observations we know pairs of values x, y. We can measure as many pairs as we want, and the more the better (how many are enough is covered in the literature). To find the difference between the observed and the "exact" value, subtract a * x_measure ^ 2 + b * x_measure + c from the measured value y_measure. That is, we assume the variable x is known exactly and treat a * x_measure ^ 2 + b * x_measure + c, with the right parameters a, b, c, as the "exact" value of our function. Then take the absolute value of the difference, as already mentioned. (Note that "exact" here really means the curve that is closest to all the points. The truly exact value cannot be calculated in any way. If you are not used to this turn of events, you should get used to it: when there is no ideal, we use the best of what there is.)

But the absolute value is inconvenient to work with, and it is easier to square the difference instead. We then sum the squares of all the differences. The resulting sum is one very long function with many terms but only three variables in total. It remains to find the values of a, b, c at which this function is minimal, which is equivalent to finding the extrema of a function of several variables, which in turn is (at first glance) a trivial calculus problem. This is the least squares method.
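The procedure just described can be written out literally. This is only a sketch on made-up data (roughly y = x^2 + 2x + 3 with small invented noise). The names data, model, s, byHand and builtIn are my own.

```mathematica
(* Hypothetical noisy measurements, invented for illustration. *)
data = {{0, 3.1}, {1, 5.9}, {2, 11.2}, {3, 17.8}, {4, 27.1}, {5, 37.9}};

(* Assumed form of the dependence with unknown a, b, c. *)
model[x_] := a*x^2 + b*x + c;

(* Sum of squared differences between measured and model values. *)
s = Total[(#[[2]] - model[#[[1]]])^2 & /@ data];

(* At the minimum all three partial derivatives vanish. *)
(* s is quadratic in a, b, c, so this is a linear system. *)
byHand = Solve[{D[s, a] == 0, D[s, b] == 0, D[s, c] == 0}, {a, b, c}]

(* The built-in routine should give the same parameters. *)
builtIn = FindFit[data, model[x], {a, b, c}, x]
```

For a polynomial model the "by hand" route works because the system is linear. For models such as a fraction it is not, which is one reason to rely on a ready-made routine.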

The apparent simplicity of the method plays a rather cruel joke on you when you need to write the algorithm yourself. It is actually quite a difficult task with many pitfalls. However, implemented algorithms already exist, and we will use them, because our goal is to apply the least squares method, not to write yet another implementation of it.

Wolfram Mathematica

If you have never worked with programs like Mathematica, MATLAB and Maple, now is the time to start. Put very simply (so that you are not afraid to learn them), these programs can solve systems of differential equations symbolically, integrate symbolically, or draw a graph, and each of these is done with a "polite request" in one line. Just think: you write the equation in symbolic form (NOT NUMERIC: everything is in letters, variables, parameters) and you get the answer in the same general symbolic form. They can also find parameters by the least squares method (and they are not the only programs that can, but those are details). I advise you to try playing with Wolfram Mathematica. Some readers may be more interested to learn that Mathematica has access to data from social networks, for example VKontakte. (Only open data, of course, but still.) You can do some research yourself using data from real people: their tastes, professions, interests, frequency of posts, and whatever else you want to know about sociology and human behavior. There are many articles on how to work with Mathematica, but what is especially nice is the huge number of examples in the built-in help, literally for all occasions. This makes learning Mathematica much easier (I will not be as flattering about MATLAB: it has its advantages too, but my choice is Mathematica).

A few words about the functions used below. To find the parameters of a function of a known form from a set of observed (measured) data containing some random component, use the FindFit function. To display an array of points, use ListPlot. For plotting a function, use Plot. Arrays are written in curly braces {}, and an array element is accessed through double square brackets [[ ]]. Arrays can also be created with various functions, for example Table. To display a graph and an array of points in one figure, use Show[{...}], which takes an array of two (or more) elements, each of which can be the output of any graphics function.
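A tiny warm-up with these building blocks, on a made-up array (the name pts and the function x^2 are chosen only for illustration):

```mathematica
(* Build an array of {x, y} pairs with Table. *)
pts = Table[{x, x^2}, {x, 0, 5}];

pts[[3]]      (* third element of the array: {2, 4} *)
pts[[3, 2]]   (* second number of the third element: 4 *)

(* Points and a curve combined in one figure. *)
Show[{ListPlot[pts], Plot[x^2, {x, 0, 5}]}]
```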

Examples

In most cases, games do not use very complex functions for the dependence of one parameter on another. Most often they use linear functions ax + b or a relation of the form (ax + b) / (cx + d). There is no clear rule for guessing the function. A developer can, if desired, construct a very complex and confusing function that is almost impossible to guess, but such cases are extremely rare. A relation of the form (ax + b) / (cx + d) is often used where the computed value should be bounded from above, for example the reduction of received damage as a function of protection. Indeed, a damage reduction of more than 100% makes no sense. In such cases, when a quantity is bounded above, it is best to start with a function of the form y = (ax + b) / (cx + d).

Let's find the dependence of the percentage of damage reduction on the value of protection. To do this, we need pairs of numbers (protection, percentage). They are obtained quite simply: remove all items from the character, then put them back on one at a time, in different numbers and combinations. This varies the protection value, and we record how the percentage changes. We write the results to an array in this form.
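Why this form suits bounded quantities can be checked directly: for large x the fraction approaches the ratio a / c. A small sketch with made-up coefficients (2, 1, 3, 4 are for illustration only and do not come from any game):

```mathematica
(* Hypothetical coefficients, for illustration only. *)
f[x_] := (2*x + 1)/(3*x + 4);

(* The value the function approaches as x grows without bound. *)
Limit[f[x], x -> Infinity]   (* 2/3 *)

(* Numeric values for growing x stay below that limit. *)
N[{f[10], f[1000], f[100000]}]
```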

OurDefSource = {{637, 10.73}, {689, 11.5}, {462, 8.02}, {585, 9.94}, {358, 6.33}, {317, 5.64}, {281, 5.03}, {99, 1.83}, {0, 0}, {3668, 40.9}, {1287, 19.54}, {495, 8.54}, {2471, 31.8}, {4596, 46.44}};

After that, for convenient plotting later, it is useful to sort the data and find the minimum and maximum protection values. Transpose turns an array of two-element arrays into an array of two one-dimensional arrays. The first of them holds the protection values, and since the data is sorted, its first and last elements are the minimum and maximum.

Def = Sort[OurDefSource, #1[[1]] < #2[[1]] &];
MaxDef = Last[Transpose[Def][[1]]];
MinDef = First[Transpose[Def][[1]]];

Then we define the general form of our function and create an array of all the parameters to be found.

Fdef[x_] := (a*x + b)/(c*x + d);
CoefsFdef = {a, b, c, d};

Then we use FindFit to search for the parameters of a function of the given form, using the experimentally obtained data.

CoefsFdefFit = FindFit[Def, Fdef[x], CoefsFdef, x]

After that we display the function with the found parameters substituted (using "/.").

Fdef[x] /. CoefsFdefFit

As a result, we get: (-1.1499 + 7.32728 x) / (388.234 + 0.0732995 x)

I must say, this does not look very pretty. I should also say right away that such ugly coefficients are rarely used in games, because it is much easier for developers to work with compact, readable coefficients. But our result is a fraction, and a fraction can be reduced: multiplying the numerator and denominator by the same number does not change its value. Let us bring it to a nicer form. If you look closely, you can see that the coefficients of x in the numerator and denominator have almost the same digits (they differ by a factor of about 100). We divide both the numerator and the denominator by the coefficient of x in the numerator.

(-0.156933 + x) / (52.9848 + 0.0100036 x)

As we recall, the variable x is the amount of protection in units. Typical values of x are of the order of thousands. Therefore the numerical constant in the numerator can be neglected, which means that the general form of the function is actually slightly different from what we expected, namely without a constant in the numerator. In the denominator, the coefficient of x is very close to 0.01 and the constant term is very close to 53.00. With these assumptions, and after multiplying the numerator and denominator by 100, we see that the percentage of damage reduction is 100 * x / (5300 + x), where x is the total amount of protection.
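The simplification can be double-checked numerically. The sketch below uses only the fitted coefficients quoted above. The names fitted and clean are my own, and the step of 100 protection units is an arbitrary choice.

```mathematica
(* The fitted fraction and the simplified formula from the text. *)
fitted[x_] := (-1.1499 + 7.32728*x)/(388.234 + 0.0732995*x);
clean[x_] := 100*x/(5300 + x);

(* Divide the other coefficients by the x-coefficient of the numerator. *)
{-1.1499, 388.234, 0.0732995}/7.32728

(* Largest gap between the two formulas over the measured range. *)
Max[Table[Abs[fitted[x] - clean[x]], {x, 0, 4596, 100}]]
```

The first result should reproduce the reduced coefficients shown above. The second shows how much is lost by rounding them to "nice" numbers.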

To check how well the "pretty" formula matches reality, we find the differences between the experimental points and the values of our function.

Fgood[x_] := 100*x/(5300 + x);
OurDiff = Table[Fgood[Def[[i]][[1]]] - Def[[i]][[2]], {i, 1, Length[Def]}]
Max[Abs[OurDiff]]

The maximum difference is 0.00493997, which is less than half a unit of the last displayed digit of the percentage (0.005). This is quite satisfactory.

The result can be displayed on the chart.

Show[{ListPlot[Def, PlotStyle -> {Blue}], Plot[Fgood[x], {x, MinDef, MaxDef}, PlotStyle -> {Green}]}]

Full code for Mathematica:

(* Measured pairs {protection, damage reduction in %} *)
OurDefSource = {{637, 10.73}, {689, 11.5}, {462, 8.02}, {585, 9.94}, {358, 6.33}, {317, 5.64}, {281, 5.03}, {99, 1.83}, {0, 0}, {3668, 40.9}, {1287, 19.54}, {495, 8.54}, {2471, 31.8}, {4596, 46.44}};
(* Sort by protection and find its range *)
Def = Sort[OurDefSource, #1[[1]] < #2[[1]] &];
MaxDef = Last[Transpose[Def][[1]]];
MinDef = First[Transpose[Def][[1]]];
(* Assumed form of the dependence and its unknown parameters *)
Fdef[x_] := (a*x + b)/(c*x + d);
CoefsFdef = {a, b, c, d};
(* Least squares fit, then the fitted formula *)
CoefsFdefFit = FindFit[Def, Fdef[x], CoefsFdef, x]
Fdef[x] /. CoefsFdefFit
(* Simplified formula and its deviation from the measurements *)
Fgood[x_] := 100*x/(5300 + x);
OurDiff = Table[Fgood[Def[[i]][[1]]] - Def[[i]][[2]], {i, 1, Length[Def]}]
Max[Abs[OurDiff]]
(* Measured points and the simplified formula on one plot *)
Show[{ListPlot[Def, PlotStyle -> {Blue}], Plot[Fgood[x], {x, MinDef, MaxDef}, PlotStyle -> {Green}]}]

A similar approach also works for finding the formula for the damage of a spell. As an example, take the pet's Poisonous Breath spell, whose damage depends on intelligence and on the spell power bonus from gear. Here the general form of the function is slightly different. If we assume that the formula for the pet is similar to the formula for the character, then intelligence is first converted to spell power, then something else is added to this spell power (from gear or buffs), and the resulting spell power is then converted by some formula into the damage of a particular spell. Spell damage also often has a constant part that does not depend on spell power. Thus, the formula should presumably have the form

a51 + b51 * (c51 * x + y)

where a51 is the constant in the formula of the specific spell; b51 is the multiplier in the spell formula; c51 is the coefficient of proportionality between intelligence and spell power; x is intelligence and y is the bonus to spell power from gear. FindFit can also work with functions of several variables. As a result, we get:

Poisonous Breath damage = 1669 + 4.8 * (1.25 * Intelligence + SpellPowerBonus);

It is worth noting that, unlike for the character, the coefficient of proportionality between intelligence and spell power is not 0.2 but 1.25, which means that a pet's spell power grows with intelligence about 6 times faster (1.25 / 0.2 = 6.25).
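The article does not list the measurements for this example, so the sketch below shows only the mechanics of a two-variable fit. The triples are synthetic: they are generated from the formula quoted above and rounded to whole numbers, as an interface might display them. The names truth, samples and fit and the starting values are my own assumptions.

```mathematica
(* Synthetic data, NOT the author's measurements: {intelligence, bonus, damage}. *)
truth[x_, y_] := 1669 + 4.8*(1.25*x + y);
samples = Flatten[
   Table[{x, y, Round[truth[x, y]]}, {x, 100, 500, 100}, {y, 0, 300, 100}], 1];

(* Three unknown parameters, two independent variables x and y. *)
(* Rough starting values help because the model is nonlinear in b51*c51. *)
fit = FindFit[samples, a51 + b51*(c51*x + y),
  {{a51, 1000}, {b51, 1}, {c51, 1}}, {x, y}]
```

With data of this kind the fitted values should come out close to 1669, 4.8 and 1.25. Note that y must actually vary in the data, otherwise b51 and c51 cannot be separated.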

For comparison:

"Lesovik hurricane": 654 + 1.92 * (1.25 * x + y);
"Arrow of the forester": 980 + 2.88 * (1.25 * x + y);

Conclusions

Thus, using a fairly simple technique, you can learn the laws and principles by which the game works. Along the way, as shown, we not only found the dependence of one parameter on another but also noticed something we were not initially looking for, such as the coefficient c51. Now, knowing the exact calculation formula, you can analyze and model which parameters to improve and by how much. The character development strategy thus becomes more meaningful.

This method has nothing to do with hacking the code or with anything else illegal. But you must admit that it lets you see what is not shown explicitly to everyone else.

Conclusion

Mathematical analysis helps even in games. I hope the described example of using OLS in games will increase your interest in it.
