Optimizing code, or outrunning Firefox in speed
I read the post about the new super-optimizations in Firefox and thought about it for a long time.
It is not very clear to me why this kind of work is being celebrated with fireworks and a Snow Maiden. Let's take a closer look at what has actually been done.
* Function Inlining: Removing the overhead of function calls by simply replacing them with their resulting native code.
Almost every compiler can inline functions. It is a very simple and, you could say, free way to speed up a program. It has certain limitations:
1) Excessive inlining leads to code bloat. This is practically no problem nowadays, since there is plenty of memory, but the compiler still enforces an upper limit on the size of a function it will inline.
2) Without inter-module optimization, the compiler is NOT able to inline functions from a neighboring module, and it cannot inline functions from precompiled system libraries or dynamic libraries.
In addition, a normal (modern) compiler can work with so-called "intrinsics": functions that it recognizes on sight and has ready-made code for. These are usually mathematical functions such as sin(). So that very sine will not be emitted as call sin(), but will be inserted as a piece of code, i.e. automatically inlined.
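To make the inlining point concrete, here is a minimal sketch in C. The names square, sum_of_squares_call and sum_of_squares_inlined are made up for illustration, and the second function only approximates what an optimizer might produce; the real result depends on the compiler and its settings.

```c
#include <assert.h>

/* Hypothetical helper, small enough that most optimizing compilers
   would be willing to inline it. */
static inline int square(int x) {
    return x * x;
}

/* As the programmer writes it: two calls to square(). */
int sum_of_squares_call(int a, int b) {
    return square(a) + square(b);
}

/* Roughly what remains after inlining: the body of square() is
   substituted at each call site, so no call overhead is left. */
int sum_of_squares_inlined(int a, int b) {
    return a * a + b * b;
}
```

Both functions compute the same value; the only difference is whether a call instruction survives in the generated code.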
* Type Inference: Removing checks surrounding common operators (like "+") when the types contained within a variable are already known. This means that the engine will have already pre-determined, for example, that two strings need to be concated when it sees the "+" operator.
Well, this usually amounts to skipping run-time type checks (RTTI): type checking is thrown out where it is not needed. For a JIT compiler this may be super cool, but ordinary compilers have been able to do this for a long time. Naturally, all responsibility for the types, or rather for their possible mismatch, lies with the programmer :)
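As a rough illustration of what dropping type checks buys, here is a sketch in C of a dynamically typed "+". The Value type, its tags and all function names are assumptions invented for this example; this is not how any particular JavaScript engine represents values.

```c
#include <assert.h>

/* Hypothetical tagged value standing in for a dynamically typed variable. */
typedef enum { TAG_INT, TAG_DOUBLE } Tag;

typedef struct {
    Tag tag;
    union { long i; double d; } as;
} Value;

Value make_int(long i)      { Value v; v.tag = TAG_INT;    v.as.i = i; return v; }
Value make_double(double d) { Value v; v.tag = TAG_DOUBLE; v.as.d = d; return v; }

/* Generic "+": has to inspect both tags on every single call. */
Value add_generic(Value a, Value b) {
    if (a.tag == TAG_INT && b.tag == TAG_INT) {
        return make_int(a.as.i + b.as.i);
    }
    double x = (a.tag == TAG_INT) ? (double)a.as.i : a.as.d;
    double y = (b.tag == TAG_INT) ? (double)b.as.i : b.as.d;
    return make_double(x + y);
}

/* Specialized "+": valid only when both operands are already known
   to be integers, so all tag checks disappear. */
long add_int_int(long a, long b) {
    return a + b;
}
```

A statically typed compiler picks the equivalent of add_int_int at compile time; a JIT has to first observe or infer that the types are stable before it can do the same.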
* Looping: The overhead of looping has been grossly diminished. It's one of the most common areas of overhead in JavaScript applications (common repetition of a task) and the constant determining of bounds and the resulting inner code is made negligible.
Nine times out of ten, what was done is plain loop unrolling. Unrolling duplicates the loop body N times. With an unroll factor of 4, the loop
    for (int i = 0; i < maxI; i++) {
        a[i] = b[i] + c[i];
    }
turns into
    for (int i = 0; i < maxI; i += 4) {
        a[i]   = b[i]   + c[i];
        a[i+1] = b[i+1] + c[i+1];
        a[i+2] = b[i+2] + c[i+2];
        a[i+3] = b[i+3] + c[i+3];
    }
Well, plus an additional check that maxI is a multiple of 4 :)
Now we take our program and the Intel C Compiler (or the Intel Fortran Compiler, whichever you prefer) and build the project with the following options:
-O3 (yes, aggressive optimization);
-axT (enables vectorization, i.e. SSEx code, plus a generic fallback code path for the hard cases);
-ip (interprocedural optimization, including partial inlining; for enthusiasts there is also -ipo, inter-module optimization, not portable !!!);
-ansi-alias -fno-alias (tells the compiler to assume there is no pointer aliasing, which improves the vectorizability :) of loops).
On top of that we get automatic loop unrolling by at least 4, inlining, and intrinsics for all mathematical functions.
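For the curious, here is a hand-written sketch of what unrolling by 4 looks like once the "maxI is a multiple of 4" check is handled properly. The function name add_unrolled4 and the use of float are my assumptions. The restrict qualifiers are the per-pointer, source-level analogue of the no-aliasing promise that -fno-alias makes for the whole program. With -O3 the compiler normally does this itself, so this is only an illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Element-wise a = b + c, unrolled by 4 by hand.
   `restrict` promises that the three arrays do not overlap. */
void add_unrolled4(float *restrict a, const float *restrict b,
                   const float *restrict c, size_t maxI) {
    size_t i = 0;
    size_t limit = maxI - (maxI % 4);   /* largest multiple of 4 <= maxI */

    /* Main loop: four elements per iteration, one bounds check per four. */
    for (; i < limit; i += 4) {
        a[i]     = b[i]     + c[i];
        a[i + 1] = b[i + 1] + c[i + 1];
        a[i + 2] = b[i + 2] + c[i + 2];
        a[i + 3] = b[i + 3] + c[i + 3];
    }

    /* Remainder loop: handles the last 0..3 elements
       when maxI is not a multiple of 4. */
    for (; i < maxI; i++) {
        a[i] = b[i] + c[i];
    }
}
```

Assuming the file is called add.c (a hypothetical name), the build line with the options above would look something like icc -O3 -axT -ip -ansi-alias -fno-alias add.c.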
And Firefox will not catch up ;)
Yes, in the general case there is a difference between an ordinary compiler and a JIT compiler, but it is not so large that long-known features should be presented as discoveries (like: “Oh! We came up with inlining!”)
P.S. This is my first pancake, so please do not hit me in the face if it came out lumpy.