How to detect overflow of a 32-bit variable in long loops in a 64-bit program

    One of the problems faced by developers of 64-bit applications is the overflow of 32-bit variables in very long loops. The PVS-Studio code analyzer (its Viva64 diagnostic set) handles this task well. StackOverflow.com has a number of questions about variable overflow in loops. Since my answers there might be taken as pure advertising rather than useful information, I decided to describe the capabilities of PVS-Studio in an article.

    The loop is a typical C/C++ language construct. When programs are ported to a 64-bit architecture, loops unexpectedly become a weak point, because few developers thought in advance about what would happen if the program had to perform billions of iterations.

    In our articles we call such situations 64-bit errors. They are really just ordinary mistakes, but their peculiarity is that they show up only in a 64-bit program. In a 32-bit program such long loops simply do not occur, and in practice it is impossible to create an array with more than INT_MAX elements.

    So, the problem: 32-bit integer types overflow in a 64-bit program. The types in question are int, unsigned, and long (on Win64, where long remains 32 bits wide). All such dangerous places have to be identified somehow. The PVS-Studio analyzer can do this, and that is what this article is about.
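    Which types count as "too narrow" depends on the data model of the target platform. The sketch below is only an illustration: narrower_than_pointer is a hypothetical helper, not part of PVS-Studio, and it simply compares the size of a type with the size of a pointer.

```cpp
#include <cstddef>

// narrower_than_pointer is a hypothetical helper written for this sketch.
// It reports whether T occupies fewer bytes than a pointer, which is a
// rough indication that T cannot count every element of a large array.
template <typename T>
constexpr bool narrower_than_pointer()
{
    return sizeof(T) < sizeof(void*);
}
```

    On Win64 both narrower_than_pointer<int>() and narrower_than_pointer<long>() are true. On 64-bit Linux only the int case is true, because long is 64 bits wide there.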

    Let us consider the different ways in which variables can overflow in long loops.

    First situation. It is described on StackOverflow in the question "How can elusive 64-bit portability issues be detected?". The code there has the following form:
    int n;
    size_t pos, npos;
    /* ... initialization ... */
    while((pos = find(ch, start)) != npos)
    {
        /* ... advance start position ... */
        n++; // this will overflow if the loop iterates too many times
    }

    The program processes very long strings. In a 32-bit program the length of a string cannot exceed INT_MAX, so no error can occur. Yes, the program cannot process large amounts of data, but that is a limitation of the 32-bit architecture rather than an error.

    In a 64-bit program the string may be longer than INT_MAX and, accordingly, the variable n may overflow. This leads to undefined behavior. Do not assume that the overflow will simply turn the number 2147483647 into -2147483648. Signed overflow is undefined behavior, and its consequences cannot be predicted. For those who do not believe that overflowing a signed variable leads to unexpected changes in the program, I suggest reading my article "Undefined behavior is closer than you think".
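    If a counter has to remain an int for some reason, the overflow has to be prevented before the increment, because checking afterwards already relies on undefined behavior. A minimal sketch, where checked_increment is a hypothetical name chosen for this example:

```cpp
#include <limits>

// checked_increment is a hypothetical helper written for this sketch.
// It increments n only when the result is representable in an int.
// Returns false and leaves n unchanged when n is already at its maximum.
bool checked_increment(int& n)
{
    if (n == std::numeric_limits<int>::max())
        return false;   // ++n would be signed overflow, i.e. undefined behavior
    ++n;
    return true;
}
```

    This only makes the failure visible at run time. It does not let the program count more than INT_MAX matches.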

    So, we need to find out that the variable n can overflow. Nothing could be easier. We run PVS-Studio and get a warning:

    V127 An overflow of the 32-bit 'n' variable is possible inside a long cycle which utilizes a memsize-type loop counter. mfcapplication2dlg.cpp 190

    If you change the type of the variable n to size_t, the error, and with it the analyzer message, disappears.
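    As a sketch of that fix, here is a complete version of the loop with a size_t counter. The function name count_char and the use of std::string::find are assumptions made for the example, since the original fragment does not show how find, ch, and start are defined.

```cpp
#include <cstddef>
#include <string>

// count_char is a hypothetical function written for this sketch.
// It counts how many times ch occurs in text. The counter has the same
// width as the positions returned by find(), so it cannot overflow
// before the string itself is exhausted.
std::size_t count_char(const std::string& text, char ch)
{
    std::size_t n = 0;       // was: int n;
    std::size_t start = 0;
    std::size_t pos;
    while ((pos = text.find(ch, start)) != std::string::npos)
    {
        start = pos + 1;     // advance the start position past the match
        ++n;
    }
    return n;
}
```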

    The same question gives another example of code that needs to be identified:
    int i = 0;
    for (iter = c.begin(); iter != c.end(); iter++, i++)
    {
        /* ... */
    }

    We start PVS-Studio and again we get warning V127:

    V127 An overflow of the 32-bit 'i' variable is possible inside a long cycle which utilizes a memsize-type loop counter. mfcapplication2dlg.cpp 201
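    The same kind of fix applies here: the index that travels alongside the iterator should be a memsize type. A minimal sketch, assuming a std::list<int> container and a hypothetical function first_negative that needs the index:

```cpp
#include <cstddef>
#include <list>

// first_negative is a hypothetical function written for this sketch.
// It returns the position of the first negative element, or the number
// of elements if there is none.
std::size_t first_negative(const std::list<int>& c)
{
    std::size_t i = 0;       // was: int i = 0;
    for (auto iter = c.begin(); iter != c.end(); ++iter, ++i)
    {
        if (*iter < 0)
            break;
    }
    return i;
}
```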

    The topic on StackOverflow also raises the question of what to do if the code base is huge and how to find all such errors.

    As you can see, these errors can be detected with the PVS-Studio static code analyzer, and for a large project static analysis is the only realistic way to cope. PVS-Studio also provides a convenient interface for working with a large number of diagnostic messages: you can interactively filter messages, mark them as false positives, and so on. However, a description of PVS-Studio features is beyond the scope of this article; those who are interested in the tool can turn to its documentation.

    I would also note that we have experience porting a large project of 9 million lines of code to a 64-bit platform, and PVS-Studio performed excellently on that project.

    Let's move on to the next topic on the StackOverflow website: " Can Klocwork (or other tools) be aware of types, typedefs and #define directives? ".

    As I understand it, the author of the question set out to find a tool that would locate all loops organized with 32-bit loop counters, that is, all loops where the int type is used.

    This task is somewhat different from the previous one, but such loops really should be found. After all, an int variable cannot be used to process huge arrays.

    The author chose the wrong approach to the problem. This is not his fault: he simply does not know that PVS-Studio exists. You will see in a moment why I say so.

    So he plans to look for:
    for (int i = 0; i < 10; i++)
        // ...

    This is terrible. You have to look through an incredible number of loops to understand whether each of them can lead to an error. It is a huge job, and it can hardly be done without attention slipping. Most likely, many dangerous places will be missed.

    Editing every loop in turn and replacing int with, for example, intptr_t is also a bad option. It means a lot of work and a lot of changes in the code.

    The PVS-Studio analyzer can help. It will not flag the loop above, because there is no need to: there is simply no room for an error. The loop performs 10 iterations, and no overflow can occur in it. So there is no reason for the programmer to waste time on this piece of code.

    But the analyzer will point out loops like this one:
    void Foo(std::vector<float> &v)
    {
      for (int i = 0; i < v.size(); i++)
        v[i] = 1.0;
    }

    The analyzer will immediately display 2 warnings. The first warns that the expression compares a 32-bit type with a memsize type:

    V104 Implicit conversion of 'i' to memsize type in an arithmetic expression: i < v.size() mfcapplication2dlg.cpp 210

    And indeed, the type of the variable i is not suitable for organizing long loops.
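    The underlying limit can be stated as a simple check. In the sketch below, count_fits_in_int is a hypothetical helper that tells whether an int counter is able to reach a given element count at all:

```cpp
#include <cstddef>
#include <limits>

// count_fits_in_int is a hypothetical helper written for this sketch.
// It returns true when every index below count is representable as an
// int, i.e. when an int loop counter can reach count without overflow.
bool count_fits_in_int(std::size_t count)
{
    return count <= static_cast<std::size_t>(std::numeric_limits<int>::max());
}
```

    For a container with more than INT_MAX elements the check fails, and a loop of the form for (int i = 0; i < v.size(); i++) overflows i before it can terminate.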

    The second warning says that it is strange to use a 32-bit variable as an index. If the array is large, then the code is erroneous.

    V108 Incorrect index type: v[not a memsize-type]. Use memsize type instead. mfcapplication2dlg.cpp 211

    The correct code should look like this:
    void Foo(std::vector<float> &v)
    {
      for (std::vector<float>::size_type i = 0; i < v.size(); i++)
        v[i] = 1.0;
    }

    The code has become long and ugly, so it is tempting to use the auto keyword. But that does not work: code changed in this way is incorrect again:
    for (auto i = 0; i < v.size(); i++)
      v[i] = 1.0;

    Since the constant 0 is of type int, the variable i will also be of type int, and we are back where we started. By the way, since we have touched on the new features of the C++ language standard, I suggest looking at the article "C++11 and 64-bit errors".
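    The deduction rule can be checked at compile time. The sketch below also shows one way to keep the declaration short without falling back to int: taking the counter type from v.size() itself. The function name fill_ones is a hypothetical name for this example.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// The literal 0 has type int, so `auto i = 0` declares an int.
static_assert(std::is_same<decltype(0), int>::value,
              "the literal 0 is an int");
// An initializer that is already a size_t makes auto deduce size_t.
static_assert(std::is_same<decltype(std::size_t{0}), std::size_t>::value,
              "std::size_t{0} is a size_t");

// fill_ones is a hypothetical function written for this sketch.
void fill_ones(std::vector<float>& v)
{
    // decltype(v.size()) is std::vector<float>::size_type, so the counter
    // is exactly as wide as the value it is compared against.
    for (decltype(v.size()) i = 0; i < v.size(); ++i)
        v[i] = 1.0f;
}
```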

    I think a compromise is possible: code that is not perfect, but correct:
    for (size_t i = 0; i < v.size(); i++)
      v[i] = 1.0;

    Note. Of course, iterators or the std::fill() algorithm would be even more correct. But we are talking about finding overflows of 32-bit variables in old programs, so I do not consider such ways of fixing the code here. That is a completely different topic.

    I want to emphasize that the analyzer is smart enough and tries not to bother the programmer in vain. For example, it will not issue warnings if it sees that a small array is being processed:
    void Foo(int n)
    {
      float A[100];
      for (int i = 0; i < n; i++)
        A[i] = 1.0;
    }

    Conclusion

    The PVS-Studio analyzer is a leader in finding 64-bit errors. It was originally created precisely to help programmers port their programs to 64-bit systems, and at that time it was called Viva64. Only later did it turn into a general-purpose analyzer, but the existing 64-bit diagnostics have not gone anywhere and are still ready to help you.

    Download the demo here.

    Read more about the development of 64-bit programs.
