Magic constant 0x5f3759df

Original author: Christian Plesner Hansen
  • Translation
In this article we will talk about the “magic” constant 0x5f3759df, which underlies an elegant algorithmic trick for quickly computing the inverse square root.

Here is a complete implementation of this algorithm:

float FastInvSqrt(float x) {
  float xhalf = 0.5f * x;
  int i = *(int*)&x;  // reinterpret the bits of the float as an integer
  i = 0x5f3759df - (i >> 1);  // what the hell is going on here?
  x = *(float*)&i;
  x = x*(1.5f-(xhalf*x*x));
  return x;
}

This code computes a (fairly good) approximation of the formula

$$y = \frac{1}{\sqrt{x}}$$
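For a quick sanity check of the function above, here is a tiny test harness (my own illustration, not part of the original code):

#include <math.h>
#include <stdio.h>

int main(void) {
  float x = 2.0f;
  printf("FastInvSqrt(2) = %f\n", FastInvSqrt(x));   // within about 0.2% of the true value
  printf("1/sqrtf(2)     = %f\n", 1.0f / sqrtf(x));  // 0.707107
  return 0;
}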

Today this implementation is well known, and it became so after the Quake III Arena source code was released in 2005. Its creation was once attributed to John Carmack, but it turned out that its roots go much further back, to Ardent Computer, where Greg Walsh wrote it in the mid-80s. The specific version of the code shown above (with the funny comments) is indeed from the Quake source.
In this article we will try to unravel this hack, mathematically derive this very constant, and generalize the method to computing arbitrary powers from -1 to 1.

Yes, it will take a bit of mathematics, but nothing beyond a school course.

What for?


Why do we even care about the inverse square root, and why do we need to compute it so fast that special hacks are warranted? Because it is one of the main operations in 3D programming. 3D graphics works with surface normals: vectors (with three coordinates) of length one, needed to describe lighting, reflection, and so on. There are many such vectors. No, not just many, but A LOT. How do we normalize a vector (bring its length to one)? We divide each coordinate by the vector's current length. Or, to rephrase, we multiply each coordinate of the vector by the value:

$$\frac{1}{\sqrt{x^2 + y^2 + z^2}}$$

Calculating $x^2 + y^2 + z^2$ is relatively simple and runs fast on all types of processors. But computing the square root and dividing by it are expensive operations. This is where the fast inverse square root algorithm, FastInvSqrt, comes in.
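To make this concrete, here is a sketch of the typical usage (my own illustration; Vec3 is a hypothetical type, and FastInvSqrt is the function from the beginning of the article):

typedef struct { float x, y, z; } Vec3;

Vec3 Normalize(Vec3 v) {
  // multiply each coordinate by 1/sqrt(x^2 + y^2 + z^2)
  float inv = FastInvSqrt(v.x * v.x + v.y * v.y + v.z * v.z);
  Vec3 r = { v.x * inv, v.y * inv, v.z * inv };
  return r;
}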

What does it do?


What does the above function do to compute the result? It consists of four main steps. First, it takes the input number (which arrived in float format) and interprets its bits as the value of a new variable i of integer type.

int i = *(int*)&x;         // reinterpret the bits of the float as an integer

Next, an arithmetic operation is performed on the resulting integer; it is quite fast and gives us a first approximation of the desired result:

i = 0x5f3759df - (i >> 1);  // what the hell is going on here?

What we get from this operation is not, in fact, the result yet. It is just an integer whose bits represent some other floating point number, the one we actually need. So we perform the reverse conversion from int to float:

x = *(float*)&i;

And finally, one iteration of the Newton method is performed to improve the approximation.

x = x*(1.5f-(xhalf*x*x));

And now we have an excellent approximation of the inverse square root. The last part of the algorithm (Newton's method) is fairly standard, so I will not dwell on it. The key part of the algorithm is step 2: performing a tricky arithmetic operation on the integer obtained by interpreting the bits of a floating-point number as an int. That is where we will focus.
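For reference, here is the standard derivation behind that Newton step (not spelled out in the original): we apply Newton's method to $f(y) = y^{-2} - x$, whose positive root is exactly $y = 1/\sqrt{x}$:

$$y_{n+1} = y_n - \frac{f(y_n)}{f'(y_n)} = y_n + \frac{y_n^{-2} - x}{2\,y_n^{-3}} = y_n\left(\frac{3}{2} - \frac{x}{2}\,y_n^2\right)$$

With xhalf = 0.5f * x this is exactly the line x = x*(1.5f-(xhalf*x*x)).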

What the hell is going on here?


To understand the rest of the text, we need to recall the format in which floating point numbers are stored in memory. I will describe only what matters for us here (the rest can always be found on Wikipedia). A floating point number is stored as a combination of three components: sign, exponent, and mantissa. Here are the bits of a 32-bit floating-point number:

[ sign (1 bit) | exponent (8 bits) | mantissa (23 bits) ]

First comes the sign bit, then 8 bits of the exponent and 23 bits of the mantissa. Since we are dealing with the calculation of the square root, I will assume that we will only deal with positive numbers (the first bit will always be 0).

Considering a floating point number as just a set of bits, the exponent and the mantissa can be treated as two positive integers. Let us denote them E and M respectively (we will refer to them often below). On the other hand, under the floating-point interpretation we will treat the mantissa as a number between 0 and 1: all zeros in the mantissa mean 0, and all ones mean a number very close to (but still not equal to) 1. And instead of interpreting the exponent as an unsigned 8-bit integer, we subtract an offset (denote it B) to get a signed integer ranging from -127 to 128. Let us denote the floating-point interpretations of these values by e and m. To avoid confusion, we will use uppercase notation (E, M) for the integer interpretations and lowercase (e, m) for the floating-point ones.

The conversion from one to the other is trivial:

$$e = E - B$$

$$m = \frac{M}{L}$$

In these formulas, for 32-bit numbers, $L = 2^{23}$ and $B = 127$. Given values of e and m, we can compute the number they represent:

$$x = (1 + m)\cdot 2^e$$

and the value of the corresponding integer interpretation of the number:

$$I = E\cdot L + M = L\,(e + B + m)$$
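A small sketch to make E, M, e and m concrete (my own illustration; it assumes 32-bit ints and IEEE 754 floats, as does the rest of the article):

#include <math.h>
#include <stdio.h>

int main(void) {
  float x = 6.0f;
  int I = *(int*)&x;           // the whole bit pattern as an integer
  int E = (I >> 23) & 0xff;    // 8 exponent bits
  int M = I & 0x7fffff;        // 23 mantissa bits
  int e = E - 127;             // e = E - B
  float m = M / 8388608.0f;    // m = M / L, where L = 2^23
  // reassemble the value: x = (1 + m) * 2^e
  printf("E=%d M=%d e=%d m=%f x=%f\n", E, M, e, m, ldexpf(1.0f + m, e));
  return 0;
}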

Now we have almost all the pieces of the puzzle needed to explain the “hack” in the code above. So, we get some number x as input and need to compute its inverse square root:

$$y = \frac{1}{\sqrt{x}}$$

For reasons that will soon become clear, I'll start by taking the base-2 logarithm of both sides of this equation:

$$\log_2 y = -\frac{1}{2}\,\log_2 x$$

Since the numbers we work with are actually floating point numbers, we can represent x and y according to the above formula for representing such numbers:

$$e_y + \log_2(1 + m_y) = -\frac{1}{2}\left(e_x + \log_2(1 + m_x)\right)$$

Ah, logarithms. You may not have used them since your school days and may have forgotten them a little. Don't worry, we will get rid of them shortly, but for now we still need them, so let's see how they work here.
In both parts of the equation we have an expression of the form:

$$\log_2(1 + v)$$

where v is in the range from 0 to 1. You can notice that for v from 0 to 1 this function is pretty close to a straight line:

[figure: plot of log2(1 + v) against v on [0, 1], next to the straight line v + σ]

Or, written as a formula:

$$\log_2(1 + v) \approx v + \sigma$$

where σ is some constant. This is not an ideal approximation, but we can try to choose σ so that it is quite good. Using it, we can transform the equality with logarithms above into another expression: not strictly equal, but quite close and, most importantly, STRICTLY LINEAR:

$$m_y + e_y + \sigma \approx -\frac{1}{2}\left(m_x + e_x + \sigma\right)$$

This is already something! Now is the time to stop working with floating point representations and move on to the integer representation of the mantissa and exponent:

$$\frac{M_y}{L} + E_y - B + \sigma \approx -\frac{1}{2}\left(\frac{M_x}{L} + E_x - B + \sigma\right)$$

After doing a few more trivial transformations (you can skip the details), we get something that is already pretty familiar:

$$\frac{M_y}{L} + E_y \approx \frac{3}{2}\left(B - \sigma\right) - \frac{1}{2}\left(\frac{M_x}{L} + E_x\right)$$

$$M_y + L\,E_y \approx \frac{3}{2}L\left(B - \sigma\right) - \frac{1}{2}\left(M_x + L\,E_x\right)$$

$$I_y \approx \frac{3}{2}L\left(B - \sigma\right) - \frac{1}{2}\,I_x$$

Look carefully at the left and right sides of the last equation. We have obtained the integer form of the desired value y, expressed as a linear formula involving the integer form of the value x:

$$I_y \approx K - \frac{1}{2}\,I_x, \qquad K = \frac{3}{2}L\left(B - \sigma\right)$$

In simple words: “y (in integer form) is some constant minus half of the integer form of x”. In code form this is:

i = K - (i >> 1);

Very similar to the formula in the function code at the beginning of the article, right?

It remains to find the constant K. We already know the values of B and L, but do not yet know what σ equals.
As you recall, σ is a “correction value” that we introduced to improve the straight-line approximation of the logarithm on the segment from 0 to 1. That means we can pick this number ourselves. I will take 0.0450465, a value that gives a good approximation and was used in the original implementation. Using it, we get:

$$K = \frac{3}{2}\cdot 2^{23}\cdot\left(127 - 0.0450465\right) = 1597463007$$

Guess how the number 1597463007 looks in hex? Of course: 0x5f3759df. Well, that is how it should be, since I chose σ precisely so as to get this number.
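If you would like to double-check the arithmetic, here is a one-liner (my own sketch):

#include <stdio.h>

int main(void) {
  double sigma = 0.0450465;
  // K = 3/2 * L * (B - sigma), with L = 2^23 and B = 127
  long K = (long)(1.5 * 8388608.0 * (127.0 - sigma));
  printf("%ld = 0x%lx\n", K, K);  // 1597463007 = 0x5f3759df
  return 0;
}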

Thus, this number is not a bitmask (as some people think, simply because it is written in hex form), but the result of calculating the approximation.

But, as Knuth would say: “So far we have only proved that this should work; we have not verified that it actually works.” To evaluate the quality of our formula, let us plot the inverse square root computed this way together with its real, exact implementation:

[figure: the approximation and the exact 1/sqrt(x) plotted together]

The graph is built for numbers from 1 to 100. Not bad, right? And this is not magic or a trick, but simply the correct use of a few, perhaps exotic, techniques for representing floating point numbers as integers and vice versa.
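Instead of eyeballing the graph, we can also measure the worst relative error over the same range (a sketch of mine, reusing FastInvSqrt from above):

#include <math.h>
#include <stdio.h>

int main(void) {
  float worst = 0.0f;
  for (float x = 1.0f; x <= 100.0f; x += 0.01f) {
    // relative error: |approx - exact| / exact, with exact = 1/sqrt(x)
    float err = fabsf(FastInvSqrt(x) - 1.0f / sqrtf(x)) * sqrtf(x);
    if (err > worst) worst = err;
  }
  printf("max relative error: %g\n", worst);  // on the order of 0.2%
  return 0;
}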

But that's not all!


All the above transformations and expressions gave us not only an explanation of the constant 0x5f3759df, but also a few more valuable conclusions.

First, let's talk about the values of L and B. They are determined not by our task of computing the inverse square root, but by the storage format of floating point numbers. This means the same trick can be done for 64-bit and 128-bit floating point numbers; you just need to redo the calculations to obtain the other constants.
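For example, here is a sketch of the same computation for 64-bit doubles (my own illustration; L = 2^52, B = 1023, σ as above; the lowest bits of the result depend on rounding, and commonly cited 64-bit constants such as 0x5fe6eb50c7b537a9 were derived slightly differently):

#include <stdint.h>
#include <stdio.h>

int main(void) {
  double sigma = 0.0450465;
  double L = 4503599627370496.0;  // 2^52
  // K64 = 3/2 * L * (B - sigma), with B = 1023
  uint64_t K64 = (uint64_t)(1.5 * L * (1023.0 - sigma));
  printf("0x%016llx\n", (unsigned long long)K64);  // roughly 0x5fe6eb5...
  return 0;
}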

Secondly, the chosen value of σ is not all that important. It may not (and in fact does not) give the best approximation of the function v + σ to the logarithm. This σ was chosen because it gives the best result in combination with the subsequent Newton step. Without that step, the choice of σ would be a separate interesting problem in itself; the topic has been covered in other publications.

Finally, let's look at the coefficient -1/2 in the final formula. It appeared because of what we set out to compute (the inverse square root), but in general the power here can be anything from -1 to 1. If we denote the power by p and redo the same transformations in the general form, then instead of -1/2 we get:

$$I_y \approx (1 - p)\,L\left(B - \sigma\right) + p\,I_x$$

Let's set p = 0.5. This gives us the ordinary (not inverse) square root of the number:

$$I_y \approx \frac{1}{2}L\left(B - \sigma\right) + \frac{1}{2}\,I_x$$

$$\frac{1}{2}L\left(B - \sigma\right) = 532487669 = \texttt{0x1fbd1df5}$$

In code form it will be:

i = 0x1fbd1df5 + (i >> 1);

So does it work? Of course, it works:

[figure: the fast square root approximation against the exact sqrt(x)]
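Wrapped into a function, it could look like this (a sketch; the name is mine, and there is no Newton refinement step here, since the article does not give one for this case):

float FastSqrtApprox(float x) {
  int i = *(int*)&x;           // reinterpret the bits of the float as an integer
  i = 0x1fbd1df5 + (i >> 1);   // the constant derived above
  return *(float*)&i;          // reinterpret back as a float
}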

This is probably a well-known method of fast square root approximation, but a quick googling did not reveal its name to me. Perhaps you can suggest it?
The method also works for more “exotic” powers, such as the cube root:

$$I_y \approx \frac{2}{3}L\left(B - \sigma\right) + \frac{1}{3}\,I_x$$

$$\frac{2}{3}L\left(B - \sigma\right) \approx \texttt{0x2a517d3c}$$

In code this is expressed as:

i = (int) (0x2a517d3c + (0.333f * i));

Unfortunately, because the power is 1/3, we cannot use bit shift operations and have to multiply by 0.333f here. The approximation is still good enough:

[figure: the cube root approximation against the exact value]

And even more than that!


At this point you may have already noticed that changing the power of the function being computed changes our calculations trivially: we just compute a new constant. That is not a costly operation at all, and we can do it even at runtime, for whatever powers are required. If we precompute the product of the constants we already know:

$$L\left(B - \sigma\right) = 1064975338 = \texttt{0x3f7a3bea}$$

then we can compute the required values on the fly, for an arbitrary power from -1 to 1:

i = (1 - p) * 0x3f7a3bea + (p * i);

By simplifying the expression a little, we can even save on one multiplication:

i = 0x3f7a3bea + p * (i - 0x3f7a3bea);

This formula gives us a “magic” constant with which we can compute various powers of numbers on the fly (for powers from -1 to 1). For complete happiness we lack only the confidence that an approximate value computed this way can be improved by Newton's method as effectively as in the original implementation for the inverse square root. I have not studied this topic more deeply, and it is likely to be the subject of a separate publication (probably not mine).
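As a function it could look like this (a sketch with my naming; raw approximation only, without a refinement step):

float FastPowApprox(float x, float p) {
  // works for x > 0 and p in [-1, 1]
  int i = *(int*)&x;
  // i = C + p * (i - C), with C = L * (B - sigma)
  i = (int)(0x3f7a3bea + (double)p * (i - 0x3f7a3bea));
  return *(float*)&i;
}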

The expression above contains the new “magic constant” 0x3f7a3bea. In a sense (due to its universality) it is even “more magical” than the constant in the original code. Let's call it C and look at it a little more closely.

Let's check how our formula behaves for p = 0. As you remember from school mathematics, any number to the power 0 equals one. What happens to our formula? Very simple: multiplication by 0 destroys the second term and we are left with:

i = 0x3f7a3bea;

This is indeed a constant and, when converted to float format, it gives us 0.977477, i.e. “almost 1”. Since we are dealing with approximations, that is a good approximation. But it also tells us something more: the value of our constant C is not random at all. It is the number one in floating-point format (well, “almost one”).
This is interesting. Let's take a closer look. The integer representation of C is:

$$C = L\left(B - \sigma\right) = L\,B - L\,\sigma$$

This is almost, but still not quite, the form of the integer representation of a floating point number. The only problem is that we subtract the second term, whereas we should be adding it. But that can be fixed:

$$C = L\left((B - 1) + (1 - \sigma)\right)$$

Now it looks exactly like the integer representation of a floating point number. To determine which number it is, we compute the exponent and the mantissa, and then C itself. Here is the exponent:

$$e = (B - 1) - B = -1$$

And here is the mantissa:

$$m = 1 - \sigma$$

And therefore the value of the number itself equals:

$$C_{float} = 2^{-1}\left(1 + (1 - \sigma)\right) = 1 - \frac{\sigma}{2}$$

Indeed, if we divide our σ (which was 0.0450465) by 2 and subtract the result from one, we get the already familiar 0.97747675, the one that is “almost 1”. This lets us look at C from another angle and compute it at runtime:

float sigma = 0.0450465f;
float c_sigma = 1.0f - (0.5f * sigma);
int C_sigma = *(int*)&c_sigma;

Note that for a fixed σ all these numbers are constants, and the compiler can evaluate them at compile time. The result is 0x3f7a3beb, which is not exactly the 0x3f7a3bea from the calculations above, but differs from it by only one (least significant) bit. You can obtain the original constant (the one in the title of this article) from this code by additionally multiplying the result by 1.5.
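A quick check of that last claim (my own sketch): since $K = \frac{3}{2}L(B - \sigma) = 1.5 \cdot C$, we have

int K = (int)(1.5 * 0x3f7a3bea);  // 1597463007 == 0x5f3759df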

With all these calculations we have come to see that the code at the beginning of the article contains no “magic”, “tricks”, “intuition”, “guesswork” or other dirty hacks, only pure mathematics in all its pristine beauty. For me the main takeaway from this story was that converting float to int and back by reinterpreting the same set of bits is not a programmer's error and not a “hack”, but a perfectly reasonable operation in some situations. A little exotic, of course, but very fast and practically useful. And it seems to me that other applications can be found for it; we shall see.
