Andrey2008 June 29, 2010 at 15:33

A collection of examples of 64-bit errors in real programs - part 1

I dedicate this article to the Habr user f0b0s, who constantly follows our activity and accompanies it with subtle humor, which keeps us on our toes.

Readers of our articles on developing 64-bit applications often reproach us for not substantiating the problems we describe, namely for not giving examples of errors in real applications.

I decided to collect examples of various types of errors that we found in real programs ourselves, read about on the Internet, or heard about from PVS-Studio users. So here is an article that collects 30 examples of 64-bit errors in C and C++.


Introduction

Our company, OOO "Program Verification Systems", develops Viva64, a specialized static analyzer that detects 64-bit errors in C/C++ application code. In the course of this work our collection of examples of 64-bit defects keeps growing, and we decided to gather the errors we find most interesting in this article. Some examples are taken directly from the code of real applications; others are composed synthetically on the basis of real code, because in the original they are spread over too much code.

The article only demonstrates various types of 64-bit errors and does not describe methods for detecting and preventing them. The methods for diagnosing and fixing defects in 64-bit programs are covered in detail in the following resources:

You can also try the demo version of the PVS-Studio tool, which includes the Viva64 static code analyzer that detects almost all the errors described in the article. The demo version is available for download at: http://www.viva64.com/en/pvs-studio/download/ .

Example 1. Buffer overflow

struct STRUCT_1
{
  int *a;
};
struct STRUCT_2
{
  int x;
};
...
STRUCT_1 Abcd;
STRUCT_2 Qwer;
memset(&Abcd, 0, sizeof(Abcd));
memset(&Qwer, 0, sizeof(Abcd)); // error: should be sizeof(Qwer)

Two objects, of types STRUCT_1 and STRUCT_2, are declared in the program and must be zeroed before first use. When implementing the initialization, the programmer copied a similar line and replaced "&Abcd" with "&Qwer" in it, but forgot to replace "sizeof(Abcd)" with "sizeof(Qwer)". By a lucky coincidence, the STRUCT_1 and STRUCT_2 structures had the same size on a 32-bit system, and the code worked correctly for a long time.

When the code was ported to a 64-bit system, the size of the Abcd structure increased, and a buffer overflow resulted (see Figure 1). Such an error can be difficult to detect if the corrupted data is used much later.

Figure 1 - Schematic explanation of the buffer overflow example
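
One way to rule out this kind of copy-paste slip is to let the compiler derive the size from the object being cleared. The sketch below is only an illustration: ZeroObject and BothCleared are hypothetical names that do not come from the original program.

```cpp
#include <cstring>
#include <type_traits>

struct STRUCT_1 { int *a; };
struct STRUCT_2 { int x; };

// Hypothetical helper (not from the article): the size is deduced from
// the object's own type, so it cannot be copied from another object.
template <typename T>
void ZeroObject(T &object)
{
  static_assert(std::is_trivially_copyable<T>::value,
                "memset is only safe for trivially copyable types");
  std::memset(&object, 0, sizeof(T));
}

// Zeroes one object of each type and reports whether both are cleared.
bool BothCleared()
{
  STRUCT_1 Abcd;
  STRUCT_2 Qwer;
  ZeroObject(Abcd);
  ZeroObject(Qwer);
  return Abcd.a == nullptr && Qwer.x == 0;
}
```

With such a helper there is no second argument to forget, whatever the sizes of the two structures are on a given platform.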

Example 2. Extra casts

char *buffer;
char *curr_pos;
int length;
...
while ((*(curr_pos++) != 0x0a) &&
       ((UINT)curr_pos - (UINT)buffer < (UINT)length));

The code is bad, but it is real code. Its task is to find the end of a line, marked by the character 0x0A. The code will not work with lines longer than INT_MAX characters, since the variable length has the type int. However, we are interested in another error, so let us assume that the program works with a small buffer and that using the int type is correct here.

The problem is that on a 64-bit system the buffer and curr_pos pointers can lie outside the first 4 gigabytes of the address space. In this case, the explicit cast of the pointers to the UINT type discards the significant high bits, and the algorithm breaks (see Figure 2).

Figure 2 - Incorrect calculations when searching for the terminating character

The error is unpleasant in that the code can work correctly for a long time, as long as the memory for the buffer is allocated within the lower four gigabytes of the address space. The fix is to remove the completely unnecessary explicit type conversions:

while (curr_pos - buffer < length && *curr_pos != '\n')
  curr_pos++;
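
The same search can also be written without any manual pointer arithmetic. The following is a minimal sketch, assuming the buffer length is available as a size_t; FindLineEnd is a hypothetical name, not a function from the original program.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns the offset of the first 0x0A byte in
// buffer[0..length), or length if there is none. No pointer is ever
// converted to a 32-bit integer.
std::size_t FindLineEnd(const char *buffer, std::size_t length)
{
  const void *hit = std::memchr(buffer, 0x0A, length);
  if (hit == nullptr)
    return length;
  return static_cast<std::size_t>(static_cast<const char *>(hit) - buffer);
}
```

Because the offset is a size_t, the result stays correct wherever the buffer is located in the 64-bit address space.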

Example 3. Incorrect #ifdef

In programs with a long history you can often find sections of code wrapped in #ifdef - #else - #endif constructs. When a program is ported to a new architecture, an incorrectly written condition can cause a different code fragment to be compiled than the developers originally intended (see Figure 3). Example:

#ifdef _WIN32 // Win32 code
  cout << "This is Win32" << endl;
#else // win16 code
  cout << "This is Win16" << endl;
#endif
// Alternative incorrect option:
#ifdef _WIN16 // Win16 code
  cout << "This is Win16" << endl;
#else // win32 code
  cout << "This is Win32" << endl;
#endif

Figure 3 - Two options are too few

Relying on the #else option is dangerous in such situations. It is better to explicitly consider the behavior for each case (see Figure 4), and put a compilation error message in the #else branch:

#if defined _M_X64 // Win64 code (Intel 64)
  cout << "This is Win64" << endl;
#elif defined _WIN32 // Win32 code
  cout << "This is Win32" << endl;
#elif defined _WIN16 // Win16 code
  cout << "This is Win16" << endl;
#else
  static_assert (false, "Unknown platform");
#endif

Figure 4 - All possible compilation paths are checked

Example 4. Confusion with int and int *

In old programs, especially in C, snippets of code where the pointer is stored in int type are not rare. However, sometimes this is not done intentionally, but rather by inattention. Consider an example containing confusion arising from the use of type int and a pointer to type int:

int GlobalInt = 1;
void GetValue(int **x)
{
  *x = &GlobalInt;
}
void SetValue(int *x)
{
  GlobalInt = *x;
}
...
int XX;
GetValue((int **)&XX);
SetValue((int *)XX);

In this example, the variable XX is used as a buffer to hold the pointer. This code works correctly on 32-bit systems where the size of a pointer matches the size of the int type. On a 64-bit system the code is incorrect, and the call

GetValue((int **)&XX);

will corrupt 4 bytes of memory next to the variable XX (see Figure 5).

Figure 5 - Memory corruption next to variable XX

The above code was written either by a novice or in a hurry. Moreover, the explicit type conversions indicate that the compiler resisted to the last, hinting to the developer that a pointer and an int are different entities. However, brute force won. The fix is elementary: choose the right type for the variable XX. The explicit casts then become unnecessary:

int *XX;
GetValue(&XX);
SetValue(XX);
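
If a pointer really has to be kept in an integer variable, an integer type of pointer width can be used instead of int. This is a minimal sketch under that assumption; PointerSurvivesRoundTrip is a hypothetical helper introduced only for the illustration.

```cpp
#include <cstdint>

// intptr_t is an integer type wide enough to hold a pointer
// on platforms that provide it.
static_assert(sizeof(std::intptr_t) >= sizeof(int *),
              "intptr_t must be able to hold a pointer");

int GlobalInt = 1;

// Hypothetical helper: stores a pointer in an integer and restores it.
bool PointerSurvivesRoundTrip()
{
  std::intptr_t stored = reinterpret_cast<std::intptr_t>(&GlobalInt);
  int *restored = reinterpret_cast<int *>(stored);
  return restored == &GlobalInt && *restored == 1;
}
```

Using a real pointer type, as in the corrected code above, remains the cleaner choice whenever it is possible.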

Example 5. Using deprecated features

A number of API functions, although left for compatibility, constitute a danger when developing 64-bit applications. A classic example is the use of functions such as SetWindowLong and GetWindowLong. In programs, you can find code similar to the following:

SetWindowLong(window, 0, (LONG)this);
...
Win32Window *this_window = (Win32Window *)GetWindowLong(window, 0);

The programmer who once wrote this code is not to blame. At the time of development, some 5-10 years ago, the programmer relied on experience and MSDN and wrote code that was completely correct for 32-bit Windows. The prototypes of these functions are as follows:

LONG WINAPI SetWindowLong(HWND hWnd, int nIndex, LONG dwNewLong);
LONG WINAPI GetWindowLong(HWND hWnd, int nIndex);

The explicit cast of the pointer to the LONG type is also justified, since a pointer and the LONG type have the same size on Win32 systems. In Win64, however, LONG remains 32 bits wide while a pointer grows to 64 bits, so after the program is recompiled as a 64-bit version such casts can cause the application to crash or malfunction.

What makes the error unpleasant is that it shows up irregularly or even extremely rarely. Whether it occurs depends on the area of memory in which the object that "this" points to is created. If the object is created within the lower 4 gigabytes of the address space, the 64-bit program can function correctly. The error can appear unexpectedly after a long time, when memory allocation starts placing objects outside the first four gigabytes.
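
The dependence on the address can be shown with a portable simulation. The sketch below does not call the Windows API: Long32 is only a stand-in for the 32-bit LONG type, and StoreAndLoadThroughLong is a hypothetical function that models storing an address in a LONG-sized slot.

```cpp
#include <cstdint>

// Stand-in for the 32-bit LONG type (an assumption made for portability).
using Long32 = std::int32_t;

// Simulates storing a 64-bit address in a LONG-sized slot and reading it
// back. Only the low 32 bits of the address survive.
std::uint64_t StoreAndLoadThroughLong(std::uint64_t address)
{
  Long32 slot = static_cast<Long32>(static_cast<std::uint32_t>(address));
  return static_cast<std::uint32_t>(slot);
}
```

An address such as 0x00401000 comes back unchanged, while 0x0000000100401000 comes back as 0x00401000 and now points to a different place.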

In a 64-bit system, you can use the SetWindowLong / GetWindowLong functions only if the program really stores values of types such as LONG, int, or bool. If you need to work with pointers, use the extended versions of these functions: SetWindowLongPtr / GetWindowLongPtr. It is arguably better to use the new functions in any case, so as not to provoke new errors in the future.

Examples with the SetWindowLong and GetWindowLong functions are classic and appear in almost every article on developing 64-bit applications. However, the problem is not limited to these functions. Also pay attention to: SetClassLong, GetClassLong, GetFileSize, EnumProcessModules, GlobalMemoryStatus (see Figure 6).

Figure 6 - Table with the names of some obsolete and modern functions

Example 6. Trimming Values with Implicit Type Coercion

Implicit casts of size_t to unsigned and similar casts are well diagnosed with compiler warnings. However, in large programs, such warnings can easily be lost. Consider an example similar to real code, where the warning was ignored, because it seemed that nothing bad could happen when working with short lines.

bool Find(const ArrayOfStrings &arrStr)
{
  ArrayOfStrings::const_iterator it;
  for (it = arrStr.begin(); it != arrStr.end(); ++it)
  {
    unsigned n = it->find("ABC"); // truncation
    if (n != string::npos)
      return true;
  }
  return false;
}

The above function searches for the text “ABC” in an array of strings and returns true if at least one string contains the sequence “ABC”. When compiling a 64-bit version of the code, this function will always return true.

The constant "string::npos" has the value 0xFFFFFFFFFFFFFFFF of type size_t on a 64-bit system. When this value is placed in the unsigned variable "n", it is truncated to 0xFFFFFFFF. As a result, the condition "n != string::npos" is always true, since 0xFFFFFFFF is not equal to 0xFFFFFFFFFFFFFFFF (see Figure 7).

Figure 7 - Schematic explanation of the value truncation error

The fix is elementary: just heed the compiler warnings:

for (auto it = arrStr.begin(); it != arrStr.end(); ++it)
{
  auto n = it->find("ABC");
  if (n != string::npos)
    return true;
}
return false;
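
The difference between the two variants can be observed directly. In this sketch ArrayOfStrings is assumed to be a vector of std::string, and the names FindTruncated and FindCorrect are introduced only for the comparison.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the article's ArrayOfStrings type.
using ArrayOfStrings = std::vector<std::string>;

// Buggy variant: the result of find() is narrowed to unsigned.
bool FindTruncated(const ArrayOfStrings &arrStr)
{
  for (const std::string &s : arrStr)
  {
    unsigned n = static_cast<unsigned>(s.find("ABC"));
    if (n != std::string::npos)
      return true;
  }
  return false;
}

// Fixed variant: the result keeps its full width.
bool FindCorrect(const ArrayOfStrings &arrStr)
{
  for (const std::string &s : arrStr)
  {
    std::string::size_type n = s.find("ABC");
    if (n != std::string::npos)
      return true;
  }
  return false;
}
```

For an array that holds only the string "xyz", FindCorrect returns false on any platform, while FindTruncated returns true wherever size_t is wider than unsigned.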

Example 7. Undeclared functions in C

Despite their age, programs or parts of programs written in C remain very much alive. The code of these programs is much more prone to 64-bit errors because of the less strict type checking rules of the C language.

In C, you can use functions without declaring them first. Let us analyze an interesting 64-bit error related to this. To begin with, consider a correct version of the code, which allocates and uses three arrays of one gigabyte each:

#include <stdlib.h>
void test()
{
  const size_t Gbyte = 1024 * 1024 * 1024;
  size_t i;
  char *Pointers[3];
  // Allocate
  for (i = 0; i != 3; ++i)
    Pointers[i] = (char *)malloc(Gbyte);
  // Use
  for (i = 0; i != 3; ++i)
    Pointers[i][0] = 1;
  // Free
  for (i = 0; i != 3; ++i)
    free(Pointers[i]);
}

This code will correctly allocate memory, write to the first element of each array one by one, and free up occupied memory. The code works correctly on a 64-bit system.

Now delete or comment out the line "#include <stdlib.h>". The code will still build, but the program will crash when run. If the header file "stdlib.h" is not included, the C compiler assumes that the malloc function returns the int type. The first two memory allocations will most likely succeed. On the third call, malloc returns the address of an array that lies beyond the first 2 gigabytes. Since the compiler believes that the function result is an int, it misinterprets the result and stores an invalid pointer value in the Pointers array.

Consider the assembly code generated by the Visual C++ compiler for the 64-bit Debug build. First comes the correct code, generated when the malloc function is declared (the header file "stdlib.h" is included):

Pointers[i] = (char *)malloc(Gbyte);
mov rcx, qword ptr [Gbyte]
call qword ptr [__imp_malloc (14000A518h)]
mov rcx, qword ptr [i]
mov qword ptr Pointers [rcx * 8], rax

Now consider the incorrect code, generated when there is no declaration of the malloc function:

Pointers[i] = (char *)malloc(Gbyte);
mov rcx, qword ptr [Gbyte]
call malloc (1400011A6h)
cdqe
mov rcx, qword ptr [i]
mov qword ptr Pointers [rcx * 8], rax

Note the presence of the CDQE (Convert Doubleword to Quadword) instruction. The compiler assumed that the result is in the eax register and sign-extended it to a 64-bit value before writing it to the Pointers array. Accordingly, the high bits of the rax register are lost. Even if the address of the allocated memory lies within the first four gigabytes, we still get an incorrect result when the highest bit of the eax register is 1. For example, the address 0x81000000 turns into 0xFFFFFFFF81000000.
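
The effect of the sign extension can be reproduced in a few lines. This is only a model of what CDQE does to the value; WidenAsInt is a hypothetical name, and the sketch assumes a two's complement platform such as x86-64.

```cpp
#include <cstdint>

// Mimics what CDQE does to a 32-bit result: the value is treated as a
// signed int and sign-extended to 64 bits.
std::uint64_t WidenAsInt(std::uint32_t eax)
{
  std::int32_t as_int = static_cast<std::int32_t>(eax);
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(as_int));
}
```

WidenAsInt(0x81000000) yields 0xFFFFFFFF81000000, matching the example above, while a value with the highest bit clear, such as 0x01000000, passes through unchanged.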

Example 8. The remains of dinosaurs in large and old programs

Large old software systems that have been developing for decades are replete with a variety of atavisms and simply pieces of code written using popular paradigms and styles over the years. In such systems, one can observe the evolution of the development of programming languages, when the oldest parts are written in the style of the C language, and in the latest ones you can find complex templates in the style of Alexandrescu.

Figure 8 - Dinosaur excavations

Some of these atavisms are related to 64 bits; more precisely, they prevent modern 64-bit code from working. Consider an example:

// beyond this, assume a programming error
#define MAX_ALLOCATION 0xc0000000
void *malloc_zone_calloc(malloc_zone_t *zone,
  size_t num_items, size_t size)
{
  void *ptr;
  ...
  if (((unsigned)num_items >= MAX_ALLOCATION) ||
      ((unsigned)size >= MAX_ALLOCATION) ||
      ((long long)size * num_items >=
       (long long)MAX_ALLOCATION))
  {
    fprintf(stderr,
      "*** malloc_zone_calloc[%d]: arguments too large: %d,%d\n",
      getpid(), (unsigned)num_items, (unsigned)size);
    return NULL;
  }
  ptr = zone->calloc(zone, num_items, size);
  ...
  return ptr;
}

Firstly, the function code contains a check for the permissible size of the allocated memory, which is strange for a 64-bit system. And secondly, the diagnostic message that is issued will be incorrect, because if we ask to allocate memory for 4,400,000,000 elements, due to the explicit casting to unsigned, we will get a strange message about the impossibility of allocating memory for only 105,032,704 elements.
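
The number in the misleading message is simply the low 32 bits of the requested count: 4,400,000,000 - 4,294,967,296 = 105,032,704. A minimal sketch of that narrowing, assuming a 32-bit unsigned type (ReportedCount is a hypothetical name):

```cpp
#include <cstdint>

// Reproduces the narrowing performed by the (unsigned) cast in the
// diagnostic message, assuming a 32-bit unsigned type.
std::uint32_t ReportedCount(std::uint64_t num_items)
{
  return static_cast<std::uint32_t>(num_items); // keeps the low 32 bits
}
```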

Example 9. Virtual functions

One of the beautiful examples of 64-bit errors is the use of invalid argument types in virtual function declarations. And usually this is not someone’s sloppiness, but simply an “accident”, where there are no guilty parties, but there is a mistake. Consider the following situation.

Since time immemorial, the MFC library has had a CWinApp class with a WinHelp function:

class CWinApp {
  ...
  virtual void WinHelp (DWORD dwData, UINT nCmd);
};

To show its own help, a user application had to override this function:

class CSampleApp: public CWinApp {
  ...
  virtual void WinHelp (DWORD dwData, UINT nCmd);
};

And everything was fine until 64-bit systems appeared. MFC developers had to change the interface of the WinHelp function (and some other functions) as follows:

class CWinApp {
  ...
  virtual void WinHelp (DWORD_PTR dwData, UINT nCmd);
};

In 32-bit mode the types DWORD_PTR and DWORD were the same, but in 64-bit mode they are not. Naturally, the developers of user applications must also change the type to DWORD_PTR, but to do that they first have to find out about the change. As a result, an error occurs in the 64-bit program: the WinHelp function in the user class is no longer called (see Figure 9).

Figure 9 - Error related to virtual functions
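
The mechanism is easy to reproduce outside MFC. In the sketch below all class names are hypothetical, and the fixed-width types uint64_t and uint32_t merely stand in for DWORD_PTR and DWORD in a 64-bit build.

```cpp
#include <cstdint>
#include <string>

// Hypothetical classes; the parameter types stand in for
// DWORD_PTR (64-bit here) and DWORD (32-bit).
struct AppBase
{
  virtual ~AppBase() = default;
  virtual std::string Help(std::uint64_t) { return "base"; }
};

// Signature no longer matches: this declares a new function
// that hides AppBase::Help instead of overriding it.
struct StaleApp : AppBase
{
  virtual std::string Help(std::uint32_t) { return "stale"; }
};

// Matching signature; 'override' makes the compiler verify the match.
struct FixedApp : AppBase
{
  std::string Help(std::uint64_t) override { return "fixed"; }
};

// Calls Help through the base class, as a framework would.
std::string CallThroughBase(AppBase &app)
{
  return app.Help(std::uint64_t{0});
}
```

For a StaleApp object the call through the base class reaches AppBase::Help, for a FixedApp object it reaches the derived version. Had StaleApp::Help been marked with override, the mismatch would have been reported at compile time.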

Example 10. Magic numbers as parameters

Magic numbers contained in the body of programs are bad style and cause errors. An example of magic numbers is 1024 and 768, which rigidly indicate the size of the screen resolution. In the framework of this article, we are interested in those magic numbers that can lead to problems in a 64-bit application. The most common numbers that are dangerous for 64-bit programs are presented in the table in Figure 10.

Figure 10 - Magic numbers that are dangerous for 64-bit programs

Let us demonstrate an example of working with the CreateFileMapping function found in one of the CAD systems:

HANDLE hFileMapping = CreateFileMapping(
  (HANDLE)0xFFFFFFFF,
  NULL,
  PAGE_READWRITE,
  dwMaximumSizeHigh,
  dwMaximumSizeLow,
  name);

The number 0xFFFFFFFF is used instead of the correct reserved constant INVALID_HANDLE_VALUE. This is incorrect in a Win64 program, where the INVALID_HANDLE_VALUE constant has the value 0xFFFFFFFFFFFFFFFF. The correct way to call the function is:

HANDLE hFileMapping = CreateFileMapping(
  INVALID_HANDLE_VALUE,
  NULL,
  PAGE_READWRITE,
  dwMaximumSizeHigh,
  dwMaximumSizeLow,
  name);

Note. Some believe that the value 0xFFFFFFFF turns into 0xFFFFFFFFFFFFFFFF when extended to a pointer. This is not true. According to the C/C++ rules, the value 0xFFFFFFFF has the type "unsigned int", since it cannot be represented by the type "int". Accordingly, when extended to a 64-bit type, the value 0xFFFFFFFFu turns into 0x00000000FFFFFFFFu. But if you write (size_t)(-1), you get the expected 0xFFFFFFFFFFFFFFFF. Here "int" is first extended to "ptrdiff_t" and then converted to "size_t".
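
Both conversions from the note can be checked directly. The function names in this sketch are hypothetical; the sketch assumes a platform where int is 32 bits wide.

```cpp
#include <cstddef>
#include <cstdint>

// Zero extension: the literal 0xFFFFFFFF is an unsigned int,
// so the upper 32 bits of the result are zero.
std::uint64_t WidenLiteral()
{
  return static_cast<std::uint64_t>(0xFFFFFFFF);
}

// Conversion of -1: all bits set, whatever the width of size_t.
std::size_t AllBitsSet()
{
  return static_cast<std::size_t>(-1);
}
```

WidenLiteral() gives 0x00000000FFFFFFFF, while AllBitsSet() gives the largest size_t value of the platform.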

Example 11. Magic constants denoting size

Another common mistake is to use magic numbers to set the size of an object. Consider an example of allocating and zeroing a buffer:

size_t count = 500;
size_t *values = new size_t[count];
// Only part of the buffer will be filled
memset(values, 0, count * 4);

In this case, on a 64-bit system more memory is allocated than is then filled with zeros (see Figure 11). The error lies in assuming that size_t is always four bytes in size.

Figure 11 - Filling only part of the array

The correct version:

size_t count = 500;
size_t *values = new size_t[count];
memset(values, 0, count * sizeof(values[0]));

Similar errors can be encountered when calculating the size of the allocated memory or data serialization.
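
In C++ the size calculation can be avoided altogether by letting a container zero its elements. This is a sketch of one possible alternative, not the approach of the original program; MakeZeroedValues is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// std::vector value-initializes its elements, so every element is zero
// and no byte count has to be computed by hand.
std::vector<std::size_t> MakeZeroedValues(std::size_t count)
{
  return std::vector<std::size_t>(count);
}
```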

Example 12. Stack Overflow

In many cases, a 64-bit program consumes more memory and more stack. Allocating more memory on the heap is not dangerous, since a 64-bit program has many times more of this memory available than a 32-bit one. But increased use of stack memory can lead to an unexpected stack overflow.

The mechanism for using the stack is different in different operating systems and compilers. We will consider the peculiarity of using the stack in Win64 code of applications built by the Visual C ++ compiler.

When the calling conventions for Win64 systems were being developed, it was decided to put an end to the variety of ways to call functions. Win32 had a number of calling conventions: stdcall, cdecl, fastcall, thiscall, and so on. Win64 has only one "native" calling convention. Modifiers such as __cdecl are ignored by the compiler.

The x86-64 calling convention is similar to the fastcall convention in x86. In the x64 convention, the first four integer arguments (from left to right) are passed in 64-bit registers selected specifically for this purpose:

RCX: 1st integer argument
RDX: 2nd integer argument
R8: 3rd integer argument
R9: 4th integer argument

The remaining integer arguments are passed through the stack. The this pointer is considered an integer argument, so it is always placed in the RCX register. If floating-point values are passed, then the first four of them are transferred in the XMM0-XMM3 registers, and the subsequent ones through the stack.

Although arguments can be passed in registers, the compiler still reserves space for them on the stack, decreasing the value of the RSP register (stack pointer). At a minimum, each function should reserve 32 bytes on the stack (four 64-bit values corresponding to the registers RCX, RDX, R8, R9). This space on the stack makes it easy to save the contents of the registers passed to the function on the stack. The function being called is not required to dump the input parameters passed through the registers onto the stack, but reserving the place on the stack allows this, if necessary. If more than four integer parameters are passed, the corresponding additional space is reserved on the stack.

This feature leads to a substantial increase in the rate of stack consumption. Even if a function has no parameters, 32 bytes are still "bitten off" the stack and then not used in any way. The reason for such an uneconomical mechanism is unification and simpler debugging.

One more point deserves attention. The RSP stack pointer must be aligned on a 16-byte boundary before the next function call. Thus, the total stack usage when calling a function without parameters in 64-bit code is 48 bytes: 8 (return address) + 8 (alignment) + 32 (reserve for arguments).

Is it really that bad? No. Keep in mind that the larger number of registers available to the 64-bit compiler makes it possible to build more efficient code and to avoid reserving stack memory for some local variables. Thus, in some cases the 64-bit version of a function uses less stack than the 32-bit version. This issue and various examples are discussed in more detail in the article "Reasons why 64-bit programs require more stack memory".

It is impossible to predict whether a 64-bit program will consume more stack or less. Since a Win64 program can use 2-3 times more stack memory, it is worth playing it safe and changing the project setting that controls the size of the reserved stack. In the project settings, select the Stack Reserve Size parameter (the /STACK:reserve switch) and triple the size of the reserved stack. By default, this size is 1 megabyte.

Example 13. A function with a variable number of arguments and buffer overflow

Although using functions with a variable number of arguments, such as printf, scanf is considered a bad style in C ++, they are still widely used. These functions create many problems when porting applications to other systems, including 64-bit systems. Consider an example:

int x;
char buf[9];

sprintf(buf, "%p", &x);

The author of the code did not take into account that the size of a pointer might one day exceed 32 bits. As a result, on a 64-bit architecture this code leads to a buffer overflow (see Figure 12). This error can be attributed to the use of the magic number '9', but in a real application buffer overflows can occur without magic numbers.

Figure 12 - Buffer overflow when working with the sprintf function

There are different ways to fix this code. The most effective is to refactor the code so as to get rid of the dangerous functions. For example, you can replace printf with cout, and sprintf with boost::format or std::stringstream.
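
A minimal sketch of the std::stringstream variant is shown below; PointerToString is a hypothetical helper name. The stream grows as needed, so the length of the printed address no longer matters.

```cpp
#include <sstream>
#include <string>

// Formats an address without any fixed-size buffer. The textual form
// of a pointer is implementation-defined.
std::string PointerToString(const void *p)
{
  std::ostringstream out;
  out << p;
  return out.str();
}
```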

Note. Linux developers often criticize this recommendation, arguing that gcc checks the format string against the actual arguments passed to functions such as printf, and that using printf is therefore safe. However, they forget that the format string may be passed in from another part of the program or loaded from resources. In other words, in a real program the format string is rarely present explicitly in the code, so the compiler cannot check it. And a developer using Visual Studio 2005/2008/2010 will get no warning for code like void *p = 0; printf("%x", p); even with the /W4 and /Wall switches.

Example 14. A function with a variable number of arguments and an invalid format

Programs often contain incorrect format strings in calls to printf and similar functions. Because of this, incorrect values are printed. Although this does not cause the program to crash, it is of course an error:

const char *invalidFormat = "%u";
size_t value = SIZE_MAX;
// Wrong value will be printed
printf(invalidFormat, value);

In other cases, an error in the format string will be critical. Consider an example based on the implementation of the UNDO / REDO subsystem in one of the programs:

// Here the pointers were stored as a string
int *p1, *p2;
....
char str[128];
sprintf(str, "%X %X", p1, p2);
// In another function, this string
// was processed as follows:
void foo(char *str)
{
  int *p1, *p2;
  sscanf(str, "%X %X", &p1, &p2);
  // Result - incorrect value of pointers p1 and p2.
  ...
}

The "%X" format is not intended for working with pointers, and as a result such code is incorrect on 64-bit systems. On 32-bit systems it is quite functional, although not pretty.
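
If pointers really must travel through a string, an integer type of pointer width together with the matching format macros can be used. The sketch below is one possible approach, not the fix applied in the original program; RoundTripPointers is a hypothetical name.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>

// Hypothetical helper: writes two pointers to a string and reads them
// back. uintptr_t and the PRIXPTR / SCNxPTR macros match the pointer
// width of the target platform.
bool RoundTripPointers(int *p1, int *p2)
{
  char str[128];
  std::snprintf(str, sizeof(str), "%" PRIXPTR " %" PRIXPTR,
                reinterpret_cast<std::uintptr_t>(p1),
                reinterpret_cast<std::uintptr_t>(p2));

  std::uintptr_t u1 = 0, u2 = 0;
  if (std::sscanf(str, "%" SCNxPTR " %" SCNxPTR, &u1, &u2) != 2)
    return false;

  return reinterpret_cast<int *>(u1) == p1 &&
         reinterpret_cast<int *>(u2) == p2;
}
```

The same source then works in both 32-bit and 64-bit builds, because the macros expand to the correct length modifier for each platform.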

Example 15. Storage of integer values in double

We have not encountered this error ourselves. It is probably rare, but quite real.
The double type has a size of 64 bits and conforms to the IEEE-754 standard on both 32-bit and 64-bit systems. Some programmers use the double type to store and work with integer values:

size_t a = size_t (-1);
double b = a;
--a;
--b;
size_t c = b; // x86: a == c
              // x64: a != c

You can still try to justify this example on a 32-bit system, since the double type has 52 significant bits and is capable of storing a 32-bit integer value without loss. But when trying to save a 64-bit integer into double, the exact value may be lost (see Figure 13).

Figure 13 - The number of significant bits in the types size_t and double
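
The loss of precision can be checked with a short sketch. SurvivesDouble is a hypothetical helper; it is meant for values well below 2^64, so that the conversion back to an integer always stays in range.

```cpp
#include <cstdint>

// True if the value survives a conversion to double and back.
// Values up to 2^53 always fit; larger ones may be rounded.
// Intended for arguments well below 2^64.
bool SurvivesDouble(std::uint64_t value)
{
  double stored = static_cast<double>(value);
  return static_cast<std::uint64_t>(stored) == value;
}
```

Any 32-bit value and the value 2^53 pass the check, while 2^53 + 1 does not: it is rounded to 2^53 when stored in a double.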

The second part of the article.
