How we searched and found an error in Visual Studio C ++

    It was a wonderful summer day. Outside the window, clouds were shining, crows were singing in gentle voices, a car was cheerfully stained with shampoo on a car wash, a perforator quietly scrubbed behind the wall - in general, an idyll.

    Nothing foreshadowed the troubles exactly until the moment I launched the program just assembled in release mode in order to check the changes before transferring them to the testing department.

    Background


    Our company has existed for a relatively long time, and the main product is already older than some of the company's employees, so there is enough ancient code. Nevertheless, we try to keep up to date, Modern C ++ is actively used, so about a year ago the main project was transferred to VC2015. It was a separate circus with horses, tambourines, blackjack and valerian. The helper code is translated as time and desire appear. In this case, I decided to transfer to VC2015 one of such auxiliary projects, which is very actively used by our technical support.

    I was sure that I already know the pitfalls of such a transition, and that the task will take no more than an hour of work.

    Outwardly, the program is simple: it shows a list of rows taken from different database tables. When a user selects a row, the column headings of the list change to the column names in the corresponding table.

    And then I notice that this is not happening.

    In vivo


    I check several times, reproducibility is absolute. I collect the project from scratch, run, check. Zero to mass.

    This was strange, because I absolutely remember that the columns switched correctly during debugging, this was part of the test case. In order to verify my own sanity, I launch the Debug version and see that the columns are switched. I also make sure that the error appears as soon as I turn on any optimization.

    What a wonderful day.

    The header update feature itself is pretty simple. First, she considers a certain condition, and then something like this code works:
    int flag = !application->settings.showsize; // showsize имеет тип BYTE
    int first = columnData - flag;
    int last = ALL_COLUMNS - flag;
    if (condition)
    {
        for (int i = first; i < last; i++)
        {
            listctrl.SetColumn(i, "Какой-то текст");
        }
    }
    else
    {
        for (int i = first; i < last; i++)
        {
            listctrl.SetColumn(i, "Какой-то другой текст");
        }
    }
    

    I ask you not to throw uranium slippers, I already said that the code is quite old.

    Thus, in any situation, at least something was supposed to happen. Moreover, the debugger, getting to this code, simply jumps to the end of the function. I open the disassembler and see that all this piece of code (and another ten lines in front of it) is simply missing in the listing. Of course, optimized code sometimes looks cooler than any spaghetti or even felt boots code. But in this case, nowhere from the very beginning of the function is there even a hint of any transitions elsewhere. The listing shows the calculation of the condition, and immediately after it, the output.

    I’m beginning to understand that the error is much more interesting than it seemed at the beginning, and I’m calling our specialist, who in the company deals with situations like “any unknown crap”, and who don’t feed dinner, let me find some kind of bug in Windows. We have no illusions on the topic “there are no errors in compilers”, but experience suggests that 99.99% of “errors in compilers” come down to the crooked hands of program developers.

    In vitro


    When an error occurs only in optimized code, this may mean that we stumbled upon an unknown UB somewhere, and the line becomes the first candidate
    int flag = !application->settings.showsize;
    It quickly becomes clear that the problem is really somewhere here, but "not everything is so simple." It was enough to replace the expression with a constant, another variable, put the ternary operator instead of negation, or at least put the structure on the stack, as the code magically appeared in the listing.

    For lack of better ideas, we pull this function into a separate clean project and ruthlessly throw out all that is unnecessary. Then we find out that shamanism with structures and pointers can be replaced with the usual volatile:
    #include 
    int main()
    {
        volatile int someVar = 1;
        const int indexOffset = someVar ? 0 : 1;    // Цикл выбрасывается
        // const int indexOffset = !someVar;        // Цикл выбрасывается
        // const int indexOffset = 0;               // Всё хорошо
        // const int indexOffset = 1;               // Всё хорошо
        // const int indexOffset = someVar;         // Всё хорошо
        // const int indexOffset = someVar + 1;     // Всё хорошо
        for (int i = 1 - indexOffset; i < 2 - indexOffset; ++i)
        {
            printf("Test passed\n");
        }
        return 1;
    }

    Here we were quite surprised, because in the original code the replacement of negation with the ternary operator in the line
    int flag = !application->settings.showsize;
    returned a piece of code in place, but for volatile it didn’t work anymore.

    There was almost no doubt that we stumbled upon an error in the compiler, but it seemed incredible that in dozens of megabytes of code there was no similar piece.

    Investigation


    It is worth noting that the main program is now going to vs2015 Update 2, because the antivirus suddenly began to swear at the program compiled in Update 3, and the one that we install for the clients could turn out ... ugly. However, some developers, including myself, have Update 3. installed. We checked on several different computers and versions of VS, and it turned out that the error was present only in Update 3. Later it turned out that it would be more correct to write “starting with Update 3”.

    Google made it clear that he was out of business, so the next logical step was to write a
    questionon stackoverflow. I must say, send a question to StackOverflow starting with the phrase “We found a mistake in the compiler” - this is the same as cutting your hand and jumping into the pool with sharks, but in this case the sharks were full and friendly. In just a few minutes, the test case was simplified even more, they suggested a tool that allowed you to see the result of translating this code by different compilers, and, more importantly, the magic phrase “ new SSA optimizer introduced in VS2015 Update 3sounded . It also mentioned the magic key -d2SSAOptimizer-, which disables the new optimizer.

    This time google brought us to the blog Introducing a new, advanced Visual C ++ code optimizerdeveloper from the Visual C ++ Optimizer team with its coordinates and a proposal to send him error messages, which we used. And literally after 10-15 minutes they received the following answer:
    Yes, this is definitely a mistake in the SSA optimizer itself - usually most of the errors that we are told about as optimizer errors are in other places and sometimes appear after 20 years only now.

    The error is in a small optimization that tries to remove comparisons of the form (a - Const1) CMP (a - Const2) if overflow does not occur. The error arises because your code contains the expression (1 - indexOffset) CMP (2 - indexOffset), and although subtraction is, of course, not commutable, the optimizer does not take this into account and processes (1 - indexOffset), as if it were ( indexOffset - 1).

    A fix for this error will be published in the next big update for VS2017. Until this time, disabling the SSA optimizer for this function may be a good solution if it does not cause a strong slowdown. This can be done using #pragma optimize ("", off): msdn.microsoft.com/en-us/library/chh3fb0k.aspx
    Original
    Yes, this is indeed a bug in the SSA Optimizer itself - usually most bugs reported as being in the new optimizer are in other parts, sometimes exposed now after 20 years.

    It's in a small opt. that tries to remove a comparison looking like (a - Const1) CMP (a - Const2), if there is no overflow. The issue is that your code has (1 - indexOffset) CMP (2 - indexOffset) and subtraction is not commutative, of course - but the optimizer code disregards that and handles (1 - indexOffset) as if it's (indexOffset - 1).

    A fix for this issue will be released in the next larger update for VS2017. Until then, disabling the SSA Optimizer would be a decent workaround. Disabling optimizations for only this function may be a better approach if it doesn't slow down things too much. This can be done with #pragma optimize ("", off): msdn.microsoft.com/en-us/library/chh3fb0k.aspx

    Epilogue


    As the investigation showed, currently all VC ++ compilers are subject to this error, starting from version 2015 Update 3 and ending with the most modern versions. It is still unknown when the patch will be released, so if you find that pieces of code miraculously disappear from your program, check if the new optimizer decided that it needs this code more than you?

    It is somewhat disappointing that the fix will be released only for VS2017, however, now we know what to do with it.

    What a wonderful day!

    Thanks to Codeguard for helping me find this error.

    Also popular now: