Why did the compiler turn my loop into an infinite condition?

Original author: Raymond Chen
  • Transfer
One of the users of the Visual C ++ compiler gave the following code example and asked why its conditional loop runs infinitely, although at some point the condition should stop and the loop should end:

#include<windows.h>int x = 0, y = 1;
int* ptr;
DWORD CALLBACK ThreadProc(void*){
  Sleep(1000);
  ptr = &y;
  return0;
}
intmain(int, char**){
 ptr = &x; // starts out pointing to x
 DWORD id;
 HANDLE hThread = CreateThread(nullptr, 0, ThreadProc, 0, &id);
 // Ждём, пока другой поток изменит значение по указателю ptr// на некоторое ненулевое числоwhile (*ptr == 0) { }
 return0;
}

For those who are not familiar with platform-specific functions, here is the equivalent in pure C ++:

#include<chrono>#include<thread>int x = 0, y = 1;
int* ptr = &x;
voidThreadProc(){
  std::this_thread::sleep_for(std::chrono::seconds(1));
  ptr = &y;
}
intmain(int, char**){
 ptr = &x; // starts out pointing to xstd::thread thread(ThreadProc);
 // Ждём, пока другой поток изменит значение по указателю ptr// на некоторое ненулевое числоwhile (*ptr == 0) { }
 return0;
}

Next, the user brought his understanding of the program:
The conditional loop has been turned by the compiler into an infinite one. I see it by the generated assembler code, which once loads the value of the ptr pointer into the register (at the start of the cycle), and then at each iteration compares the value of this register with zero. Since the reloading of a value from ptr never happens again, the loop never ends.

I understand that declaring a ptr as “volatile int *” should cause the compiler to discard the optimization and read the ptr value at each iteration of the loop, which will fix the problem. But still I would like to know why the compiler cannot be smart enough to do such things automatically? Obviously, a global variable used in two different threads can be changed, which means it cannot simply be cached in a register. Why can't the compiler generate the correct code right away?


Before answering this question, let's start with a little niggle: “volatile int * ptr” does not declare the variable ptr as “a pointer for which optimization is prohibited”. This is "a regular pointer to a variable for which optimization is prohibited." What the author of the above question had in mind was to be declared as “int * volatile ptr”.

And now back to the main issue. What is going on here?

Even a quick glance at the code will tell us that there are neither variables of std :: atomic type, nor the use of std :: memory_order (neither explicit nor implicit). This means that any attempt to access ptr or * ptr from two different streams leads to undefined behavior. Intuitively, you can think of it this way: “The compiler optimizes each stream as if it works in the program alone. The only points where the standard compiler MUST think about accessing data from different streams is using std :: atomic or std :: memory_order. ”

This explains why the program behaved differently than the programmer expected. From the moment you allow for undefined behavior, absolutely nothing can be guaranteed.

But okay, let's think about the second part of his question: why is the compiler smart enough not to recognize such a situation and automatically turn off optimization with loading the pointer value into the register? Well, the compiler automatically applies all possible optimizations that do not contradict the standard. It would be strange to require him to be able to read the thoughts of the programmer and turn off some optimizations that do not contradict the standard, which, according to the programmer, should have changed the logic of the program for the better. “Oh, what if this cycle expects a change in the value of a global variable in another thread, although it has not announced this explicitly? I'll take it, I will slow it down a hundred times to be ready for this situation! ” Should it be so? Hardly.

But suppose that we add to the compiler a rule like “If optimization has led to the appearance of an infinite loop, then you need to cancel it and build code without optimization”. Or even like this: “Sequentially cancel individual optimizations until the result is a non-infinite loop.” In addition to the amazing surprises that this will bring, will it give any benefit at all?

Yes, in this theoretical case we will not get an infinite loop. It will abort if some other thread writes a non-zero value to * ptr. It will also be interrupted if another thread writes a non-zero value to the variable x. It is becoming unclear how deeply the dependency analysis should be worked out in order to “catch” all the cases that may affect the situation. Since the compiler does not actually run the created program and does not analyze its behavior at runtime, the only solution would be to assume that no references to global variables, pointers and links can be optimized.

int limit;
voiddo_something(){
    ...
    if (value > limit)
        value = limit; // перечитываем переменную limit
    ...
    for (i = 0; i < 10; i++)
      array[i] = limit; // перечитываем переменную limit
    ...
}

This is completely contrary to the spirit of C ++. The language standard says that if you modify a variable and expect to see this modification in another thread, you must EXACTLY say: use an atomic operation or streamlining memory access (usually using the synchronization object).

So please do just that.

Also popular now: