Andrey2008 January 19, 2010 at 15:06

64-bit horse that can count

The article is devoted to the behavior of the Visual C ++ compiler when generating 64-bit code and the potential errors associated with this.

Introduction

The phenomenon of “Smart Hans”, the horse of Mr. von Osten, was described in 1911 [1]. Clever Hans was famous for his ability to read and solve mathematical problems, tapping the answer with his front hoof. Of course, there were many skeptics. Therefore, the ability of Hans was tested by a commission of experts, which established that the horse demonstrates them without the help of Mr. von Osten. But how could such a human exist! - the intelligence level of a simple horse? Psychologist O. Pfangst, with extreme care, performed a series of experiments, as a result of which he discovered that Hans received subtle unintentional clues from those who asked him questions. For example, after Hans was asked about something, people fixed their eyes on his front hoof, with the help of which the horse “answered”. But as soon as Hans kicked the right number of times, the questioners raised their eyes or head quite a bit, waiting for the completion of his answer. And the horse, which was trained to notice and use these movements, almost elusive to observers, perceived them as signals for the cessation of their actions. From the outside, it always looked like the right answer to the question.

Here was such a wonderful horse, who considered and solved problems, although he did not know how to do it. The digital horses of the beginning of the 21st century were 64-bit programs, many of which also do not know how to count, although they successfully pretend. Consider this phenomenon in more detail.

1. Potential errors

I will be the author and co-author of a number of articles devoted to the problems of developing 64-bit applications. Articles you can find on our website: http://www.viva64.com/en/articles/64-bit-development . In these articles, I try to use the term “potential error” or “hidden error”, and not just “error” [ 2 , 3 , 4 ].

I explain this by the fact that the same code can be considered both correct and incorrect depending on its purpose. A simple example is the use of an int variable for indexing array elements. If with this variable we refer to an array of graphic windows, then everything is correct. There is no need, and it will not work with billions of windows. But indexing using an int variable to array elements in 64-bit mathematical programs or databases may well be a problem when the number of elements falls outside the range 0..INT_MAX.

But there is another, much more subtle reason to call the errors "potential." The fact is whether the error will manifest itself or not, depends not only on the input data, but also on the mood of the compiler optimizer. I went around this topic for a long time, since most of these errors work well in the debug version, and are “potential” only in release versions. However, not every program built as debug can be debugged on large amounts of data. There is a situation when the debug version is tested only on the simplest data sets. And load testing and testing by end users on real data is performed on release versions, where errors can be temporarily hidden. Therefore, I decided to share the knowledge that is. I hope I can convince that when porting a program to another platform, it is dangerous to rely only on verification of the execution phase (unit tests, dynamic analysis, manual testing). You will say all this to promote the Viva64 tool. Yes, for this, but still listen to the terrible stories that I will now tell you. I love to tell them.

2. How it all began

- And why do you have two identical JMPs in your code in a row?
- What if the first does not work.

For the first time, I came across Visual C ++ 2005 compiler optimization features when preparing a PortSample program. This is a project that is part of the Viva64 distribution kit and is intended to demonstrate all the errors that the Viva64 analyzer diagnoses. The examples contained in this project should work correctly in 32-bit mode and lead to errors in the 64-bit version. Everything worked fine in the debug version, but there were some difficulties with the release version. That code, which in 64-bit mode was supposed to freeze or crash, worked successfully! The reason was optimization. The solution was an additional excessive complication of the code for the examples and the arrangement of the “volatile” keywords, which you will be able to observe in plenty in the PortSample project.

The same applies to Visual C ++ 2008. The code will of course be slightly different, but everything that will be written in this article can be attributed to both Visual C ++ 2005 and Visual C ++ 2008. And further in the article there will be no differences.

If it seems to you that this is only good, if some errors do not manifest themselves, then rather drive this thought. Code with similar errors becomes extremely unstable. And the slightest code change that is not directly related to the error can lead to a change in behavior. Just in case, I emphasize that this is not the compiler's fault, but hidden code defects. Next, we will show approximate phantom errors that disappear and appear in release versions with the slightest code changes and which you can hunt for a long time.

3. Phantoms

The chapter will be long and boring, so I’ll start with a joke, which is its brief summary:
Ilya Muromets walked through the woods and went into the clearing on which Zmey Gorynych was sitting. Ilya Muromets ran up to the Serpent Gorynych and cut down his only head. And the Serpent of Gorynych instead of this head grew two. Ilya cut two heads, grew 4. He cut 4, grew 8 ... And so it went on for an hour, two, three ... And then Ilya Muromets cut off the Snake Gorynych 32768 heads and died the Snake Gorynych, because he was 16-bit.

As in the joke, the problems lie in the type overflow, which may or may not occur, depending on what code the compiler generates when optimization is turned on. Consider the first example code that works in release mode, although it should not:

int index = 0;
size_t arraySize = ...;
for (size_t i = 0; i! = arraySize; i ++)
  array [index ++] = BYTE (i);

This code correctly fills the entire array with values, even if the size of the array is much larger than INT_MAX. Theoretically, this is not possible, since the variable index is of type int. After some time, due to overflow, access to elements at a negative index should occur. However, optimization leads to the generation of the following code:

0000000140001040 mov byte ptr [rcx + rax], cl 
0000000140001043 add rcx, 1 
0000000140001047 cmp rcx, rbx 
000000014000104A jne wmain + 40h (140001040h)

As you can see, 64-bit registers are used and overflow does not occur. But let's make a very small code fix:

int index = 0;
for (size_t i = 0; i! = arraySize; i ++)
{
  array [index] = BYTE (index);
  ++ index;
}

We assume that this way the code looks more beautiful. Agree that it functionally remained the same. But the result will be significant - the program will crash. Consider the code generated by the compiler:

0000000140001040 movsxd rcx, r8d 
0000000140001043 mov byte ptr [rcx + rbx], r8b 
0000000140001047 add r8d, 1 
000000014000104B sub rax, 1 
000000014000104F jne wmain + 40h (140001040h)

The same overflow occurs, which should have been in the previous example. The value of the register r8d = 0x80000000 is expanded in rcx as 0xffffffff80000000. And as a result, writing outside the array.

Let's look at another optimization example and how easy it is to ruin everything. Example:

unsigned index = 0;
for (size_t i = 0; i! = arraySize; ++ i) {
  array [index ++] = 1;
  if (array [i]! = 1) {
    printf ("Error \ n");
    break;
  }
}

Assembly code:

0000000140001040 mov byte ptr [rdx], 1 
0000000140001043 add rdx, 1 
0000000140001047 cmp byte ptr [rcx + rax], 1 
000000014000104B jne wmain + 58h (140001058h) 
000000014000104D add rcx, 1 
0000000140001051 cmp rcx, rdi 
0000000140001054 jne wmain + 40h (140001040h)

The compiler decided to use the 64-bit rdx register to store the index variable. As a result, the code can correctly process arrays larger than UINT_MAX.

But the world is fragile. It is enough to complicate the code a little and it will become incorrect:

volatile unsigned volatileVar = 1;
...
unsigned index = 0;
for (size_t i = 0; i! = arraySize; ++ i) {
  array [index] = 1;
  index + = volatileVar;
  if (array [i]! = 1) {
    printf ("Error \ n");
    break;
  }
}

Using instead of index ++ the expression "index + = volatileVar;" leads to the fact that 32-bit registers begin to participate in the code, because of which overflows occur:

0000000140001040 mov ecx, r8d 
0000000140001043 add r8d, dword ptr [volatileVar (140003020h)] 
000000014000104A mov byte ptr [rcx + rax], 1 
000000014000104E cmp byte ptr [rdx + rax], 1 
0000000140001052 jne wmain + 5Fh (14000105Fh) 
0000000140001054 add rdx, 1 
0000000140001058 cmp rdx, rdi 
000000014000105B jne wmain + 40h (140001040h)

In the end I will give an interesting, but great example. Unfortunately, I could not reduce it in order to maintain the necessary behavior. This is precisely why such errors are dangerous, since it is impossible to predict what a simple code change leads to.

ptrdiff_t UnsafeCalcIndex (int x, int y, int width) {
  int result = x + y * width;
  return result;
}

...
int domainWidth = 50000;
int domainHeght = 50000;
for (int x = 0; x! = domainWidth; ++ x)
  for (int y = 0; y! = domainHeght; ++ y)
    array [UnsafeCalcIndex (x, y, domainWidth)] = 1;

This code cannot correctly fill an array consisting of 50,000 * 50,000 elements. This is impossible for the reason that when calculating "int result = x + y * width;" overflow should occur.

Thanks to a miracle, the array is still correctly filled in the release version. The UnsafeCalcIndex function is built into the loop, using 64-bit registers:

0000000140001052 test rsi, rsi 
0000000140001055 je wmain + 6Ch (14000106Ch) 
0000000140001057 lea rcx, [r9 + rax] 
000000014000105B mov rdx, rsi 
000000014000105E xchg ax, ax 
0000000140001060 mov byte ptr [rcx], 1 
0000000140001063 add rcx, rbx 
0000000140001066 sub rdx, 1 
000000014000106A jne wmain + 60h (140001060h) 
000000014000106C add r9.1 
0000000140001070 cmp r9, rbx 
0000000140001073 jne wmain + 52h (140001052h)

All this happened because the UnsafeCalcIndex function is simple and can be easily integrated. It’s worth complicating it a bit or the compiler should consider that it’s not worth embedding it, and an error will occur that will manifest itself on large volumes of data.

We slightly modify (complicate) the UnsafeCalcIndex function. Note that the function logic has not changed a bit:

ptrdiff_t UnsafeCalcIndex (int x, int y, int width) {
  int result = 0;
  if (width! = 0)
    result = y * width;
  return result + x;
}

The result is an abnormal termination of the program when it goes beyond the boundaries of the array:

0000000140001050 test esi, esi 
0000000140001052 je wmain + 7Ah (14000107Ah) 
0000000140001054 mov r8d, ecx 
0000000140001057 mov r9d, esi 
000000014000105A xchg ax, ax 
000000014000105D xchg ax, ax 
0000000140001060 mov eax, ecx 
0000000140001062 test ebx, ebx 
0000000140001064 cmovne eax, r8d 
0000000140001068 add r8d, ebx 
000000014000106B cdqe             
000000014000106D add rax, rdx 
0000000140001070 sub r9.1 
0000000140001074 mov byte ptr [rax + rdi], 1 
0000000140001078 jne wmain + 60h (140001060h) 
000000014000107A add rdx, 1 
000000014000107E cmp rdx, r12 
0000000140001081 jne wmain + 50h (140001050h)

I think you are already bored. I apologize. I just wanted to show how an easily working 64-bit program can become broken after you make the most harmless edits to it or compile it with another version of the compiler.

4. Diagnosis of potential errors

A program is a sequence of error handling.
(c) Unknown author

I suppose that many existing 64-bit applications, or those that will soon be ported to 64-bit systems, may suddenly begin to bring more and more unpleasant surprises. They can reveal a large number of defects with an increase in the input data volume, which was not available for processing on 32-bit systems. Hidden defects can unexpectedly manifest themselves during the further modernization of the program code or when changing the version of libraries or the compiler.

As with the horse, the first impression may be deceiving. And the fact that your program has successfully begun to process a large amount of data can only seem to you. A much more thorough check is needed, which will be able to accurately show whether your 64-bit horse really believes or not.

To be sure that the 64-bit program is correct, the most minimal step will be to use not only release, but also debug versions at all stages of testing. Keep in mind that this is a necessary but not sufficient condition. If the tests do not use data sets that, for example, do not use a large amount of RAM, then the error may not manifest itself in both release and debug version [ 5]. It is necessary to expand unit tests, expand data sets for load and manual testing. It is necessary to force the algorithms to process new combinations of data that are available only on 64-bit systems [ 6 ].

An alternative way to diagnose 64-bit errors is to use static analysis tools. It is much more radical and reliable than guessing about enough tests added or not. It is convenient because it does not require the use of a debug version to grind gigabytes of data.

The meaning of the method is to perform a full analysis of the project once during the porting of the program and view all diagnostic messages about suspicious places in the code. A list of thousands and tens of thousands of warnings scares people away. But the total time immediately spent on analyzing them will be an order of magnitude less than correcting for years a variety of bug reports that arise literally from nowhere. These will be exactly the same phantoms described earlier. In addition, when you start working with the list of warnings, it quickly becomes clear that most of them can be filtered out and there will be not so much analysis work as it seems. In the future, it will be enough for you to use static analysis, only for newly created code, which does not take much time.

Of the tools for finding 64-bit phantoms, I will certainly offer the tool that we are developing - Viva64 . By the way, soon this tool will become part of PVS-Studio, which will combine all our static analysis tools.

To be more objective and less criticized with this article, as an advertisement, I will mention some other tools. It should be called Gimpel PC-Lint and Parasoft C ++ Test . They also implement rules for checking 64-bit errors, although in this they have less diagnostic capabilities than the highly specialized Viva64 tool [ 7 ]. There is still Abraxas CodeCheck, the new version of which (14.5) also implements diagnostic functions for 64-bit errors, but I do not have more detailed information.

Conclusion

I will be glad if the article helps you learn new platforms more easily, knowing what hidden problems may arise. Thanks for attention.

Bibliographic list

Roger R. Hawk. 40 studies that shocked psychology. 4th int. ed. St. Petersburg: Prime Euroznak, 2006, ISBN: 5-93878-237-6.
Andrey Karpov. 64 bits, / Wp64, Visual Studio 2008, Viva64 and everything, everything, everything ... http://www.viva64.com/art-1-1-253695945.html
Andrey Karpov, Evgeny Ryzhkov. Static code analysis for verification of 64-bit applications. http://www.viva64.com/art-1-1-1630333432.html
Andrey Karpov. 7 steps to port a program to a 64-bit system. http://www.viva64.com/art-1-1-1148261225.html
Andrey Karpov, Evgeny Ryzhkov. 20 pitfalls of porting C ++ code to a 64-bit platform. http://www.viva64.com/art-1-1-1958348565.html
Andrey Karpov, Evgeny Ryzhkov. Finding traps in C / C ++ code when porting applications to a 64-bit version of Windows. http://www.viva64.com/art-1-1-329725213.html
Andrey Karpov. Comparison of the diagnostic capabilities of analyzers when checking 64-bit code. http://www.viva64.com/art-1-1-1441719613.html

Tags: