Testing a program compiled with the Intel Compiler on an AMD system: "before" and "after" the patch
Hi Habr! After reading the recent article “Will Intel have to remove from its compiler a function that intentionally produces bad code for AMD processors?” and all the comments on it, I was surprised not to find the main thing: tests of “live” applications before and after applying a patch that disables the CPU dispatcher in code produced by the Intel compiler.
Having a few days off in reserve and being, as it happens, the happy owner of an AMD processor, I decided to study the issue in more detail: is there really a noticeable difference in the performance of real applications compiled by the Intel compiler?
I started by reading the original article in English, along with the discussions of this topic on various forums. Surprisingly, even there I did not find the answer to my question: what is the actual gain in real applications? But I did learn that, depending on the compiler version, the CPU dispatcher code may use different registers, and not just eax, ebx, ecx, edx and ebp, as handled for example by the patch intel_patch-ppro.pl. The esi and edi registers can be used too. Not being a Perl expert and not having its interpreter on my computer, I decided to knock together a new patcher in Pascal (Virtual Pascal, though it should compile easily with Free Pascal). The patcher and its source code can be found here: icc_patch.rar
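To illustrate why the choice of register matters to a byte-level patcher, here is a minimal Python sketch (not the Pascal patcher itself, whose source is not reproduced here). It assumes the patcher recognizes the dispatcher by the machine code of a `cmp reg, imm32` instruction; the helper names are hypothetical. The same comparison encodes to different bytes for each register, so a pattern written for one register will not match another.

```python
import struct

# Register numbers used in the x86 ModRM byte (32-bit general-purpose registers).
REG_NUMBERS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
               "ebp": 5, "esi": 6, "edi": 7}

def cmp_reg_imm32(reg: str, imm: int) -> bytes:
    """Encode `cmp reg, imm32` in the general form 81 /7 id."""
    modrm = 0xC0 | (7 << 3) | REG_NUMBERS[reg]  # mod=11 (register), /7, rm=reg
    return bytes([0x81, modrm]) + struct.pack("<I", imm)

def cmp_eax_short(imm: int) -> bytes:
    """Encode the shorter eax-only form 3D id."""
    return b"\x3D" + struct.pack("<I", imm)

GENU = 0x756E6547  # first four bytes of the vendor string, as a little-endian dword

if __name__ == "__main__":
    for reg in REG_NUMBERS:
        print(f"cmp {reg}, 0x{GENU:08X} -> {cmp_reg_imm32(reg, GENU).hex()}")
    print(f"cmp eax (short form)  -> {cmp_eax_short(GENU).hex()}")
```

Only the opcode and ModRM bytes change between registers; the four immediate bytes stay the same, which is what makes the immediate a more stable thing to search for than the whole instruction.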
The second task was to determine whether a particular program was compiled with the Intel Compiler at all. It would seem that tools like PEiD should come to the rescue. I don’t know what is going on with their signatures, but the results are about as meaningful as the average distance from a beer stall to the moon. PEiD assured me that a compiled Delphi program with an empty form and a stripped Relocation Table was written with the Borland C++ compiler, and so on. Naturally, I had to abandon this option. The second obvious solution is to go through all the EXE files on the computer and look for that very comparison, cmp eax, 'Genu', and so on. But such checks appear in many programs that have nothing to do with the Intel compiler. So I had to be patient and look in the “heavyweight” programs on the computer for code resembling the following:
mov eax,[ebp][-0008]
cmp eax,0756E6547 ;"uneG" ; check for "Genu"
jne not_intel ; if not equal, go to not_intel
mov eax,[ebp][-0010]
cmp eax,049656E69 ;"Ieni" ; check for "ineI"
jne not_intel ; not equal: not_intel
mov eax,[ebp][-0014]
cmp eax,06C65746E ;"letn" ; check for "ntel"
jne not_intel ; not equal: not_intel
mov edx,000000001 ; that very secret byte
jmps next
not_intel:
xor edx,edx ; and here it is 0, for all non-Intel CPUs
next:
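The search described above can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: the function names and the 64-byte window are invented for illustration, and it looks only for the three immediate values from the listing, ignoring which register or instruction form carries them. As noted earlier, vendor checks also occur in programs unrelated to the Intel compiler, so a hit marks a candidate for manual inspection, not a verdict.

```python
import struct

# The three immediates from the listing, in the order they are compared.
VENDOR_DWORDS = (0x756E6547, 0x49656E69, 0x6C65746E)

def decode_vendor(dwords=VENDOR_DWORDS) -> str:
    """Lay the dwords out little-endian, the way they sit in memory."""
    return b"".join(struct.pack("<I", d) for d in dwords).decode("ascii")

def find_vendor_checks(data: bytes, window: int = 64) -> list:
    """Offsets of the first immediate where the other two follow, in order,
    each starting within the next `window` bytes after the previous one."""
    needles = [struct.pack("<I", d) for d in VENDOR_DWORDS]
    hits = []
    start = data.find(needles[0])
    while start != -1:
        pos, matched = start, True
        for needle in needles[1:]:
            nxt = data.find(needle, pos + 4, pos + 4 + window)
            if nxt == -1:
                matched = False
                break
            pos = nxt
        if matched:
            hits.append(start)
        start = data.find(needles[0], start + 1)
    return hits

if __name__ == "__main__":
    print(decode_vendor())  # GenuineIntel
    # Synthetic buffer imitating the listing: cmp eax,imm32 / jne, three times.
    sample = b"\x90" * 5
    for d in VENDOR_DWORDS:
        sample += b"\x3D" + struct.pack("<I", d) + b"\x75\x20"
    print(find_vendor_checks(sample))  # [6]
```

On a real file one would pass the result of `open(path, "rb").read()` as `data` and then inspect each reported offset in a disassembler.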
As a result, only one such program turned up among all the software installed on the computer, and lo and behold, it was a benchmark: Cinebench R10 from Maxon. It contains exactly the code sections shown above. Without further ado, we run the patcher on the program's main EXE, not forgetting to patch all of its other libraries, which, by the way, have the non-standard extension .CDL. The test itself measures how fast a picture is rendered and reports the result as a score in arbitrary points (“parrots”). Now is the time to take a look at the chart:
Three runs of the program were made with the patch and without it. As the graph shows, the results differ. In fact they differ very little; the graph slightly exaggerates the gap. Take the render time, for example: the patched program needs about 7 minutes 30-40 seconds, and the unpatched benchmark 7 minutes 50 seconds to 8 minutes. The average difference is no more than 15-20 seconds.
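To put these render times in proportion, the difference can be expressed as a fraction of the unpatched time. The sketch below uses only the range endpoints quoted above; it has no access to the individual run results, and the helper names are illustrative.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes:seconds render time to seconds."""
    return minutes * 60 + seconds

def relative_gain(patched_s: float, unpatched_s: float) -> float:
    """Fraction of the unpatched render time saved by the patch."""
    return (unpatched_s - patched_s) / unpatched_s

# Endpoints of the ranges quoted in the text.
patched = (to_seconds(7, 30), to_seconds(7, 40))    # 450 s .. 460 s
unpatched = (to_seconds(7, 50), to_seconds(8, 0))   # 470 s .. 480 s

smallest = relative_gain(patched[1], unpatched[0])  # slowest patched vs fastest unpatched
largest = relative_gain(patched[0], unpatched[1])   # fastest patched vs slowest unpatched

if __name__ == "__main__":
    print(f"gain between {smallest:.3f} and {largest:.3f} of the unpatched time")
```

Taken at the endpoints, the gap is between 10 and 30 seconds, that is, a few percent of a render that lasts almost eight minutes.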
What do we see in the end? Yes, there is some difference, but it is so small that it can safely be neglected. If the difference does not exceed 20 seconds on large (long-running) tasks, it will not be noticeable at all on simple (office) tasks.
On the other hand, it is important not to fool ourselves: after all, nothing prevented the developers of this program from writing all the critical loops in pure assembler and thereby being independent of the compiler. I suspect that this is the case, so unfortunately this program did not let me see the full picture.
What did I learn along the way? First, there are not many programs in an ordinary user's everyday use that were compiled with the Intel Compiler. To be more precise, there are vanishingly few. Second, it is very hard to evaluate the performance gain on other people's programs, because their internals are unknown. The trial version of the compiler is currently (slowly) downloading, so that I can experiment directly on my own programs. Only then will it be possible to draw more or less reliable conclusions.
The purpose of this article is not to incite holy wars on the subject of Intel vs AMD or the like. The goal is to find out how much the notorious CPU dispatcher really affects the execution time of programs, as the original article presents it. And I would like to do this with your help, %username%.