Rouse May 31, 2013 at 22:10

Atomic operations

Just the other day I was asked a question.

Why is the LOCK prefix (or its analogue, InterlockedDecrement) needed in the _LStrClr procedure from the System unit? This procedure decrements the string's reference counter and, when the counter reaches zero, frees the memory the string occupied.

The gist of the question was this: it is almost impossible to imagine a string losing references from two threads at the same moment, so the atomic operation here is redundant.

In principle, the premise is interesting, but ...

But suppose we pass the string to a thread class.
At the very least this increments refCnt, so two threads now hold references to the same string. If the reference counter were decremented without an atomic operation, we could end up with a memory leak.
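
To make the counter bump visible, here is a minimal sketch. TWorker and RefCountOf are hypothetical names, not RTL code. RefCountOf assumes that refCnt sits 8 bytes before the first character (the StrRec layout that _LStrClr relies on), and TThread.Start assumes Delphi 2010 or later.

```pascal
program RefCountDemo;
{$APPTYPE CONSOLE}
uses
  Classes;

type
  // Hypothetical worker that keeps its own reference to the string.
  TWorker = class(TThread)
  private
    FText: string;
  protected
    procedure Execute; override;
  public
    constructor Create(const AText: string);
  end;

constructor TWorker.Create(const AText: string);
begin
  inherited Create(True);  // created suspended, started later
  FText := AText;          // second reference: refCnt is incremented
end;

procedure TWorker.Execute;
begin
  FText := '';             // reference released in the worker thread
end;

// Assumption: refCnt is stored 8 bytes before the first character.
function RefCountOf(const S: string): Integer;
begin
  if Pointer(S) = nil then
    Result := 0
  else
    Result := PInteger(PAnsiChar(Pointer(S)) - 8)^;
end;

procedure Run;
var
  S: string;
  W: TWorker;
begin
  S := StringOfChar('x', 16);  // heap string, not a literal
  Writeln('before passing: ', RefCountOf(S));
  W := TWorker.Create(S);
  Writeln('held by thread: ', RefCountOf(S));
  W.Start;
  W.WaitFor;
  Writeln('after thread:   ', RefCountOf(S));
  W.Free;
end;

begin
  Run;
end.
```

The second value should be one higher than the first, and the release happens in a different thread from the one that still owns S. That is exactly the situation in which two decrements can meet.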

Here is the _LStrClr code:

procedure _LStrClr(var S);
{$IFDEF PUREPASCAL}
var
  P: PStrRec;
begin
  if Pointer(S) <> nil then
  begin
    P := Pointer(Integer(S) - Sizeof(StrRec));
    Pointer(S) := nil;
    if P.refCnt > 0 then
      if InterlockedDecrement(P.refCnt) = 0 then
        FreeMem(P);
  end;
end;
{$ELSE}
asm
        { ->    EAX pointer to str      }
        MOV     EDX,[EAX]                       { fetch str                     }
        TEST    EDX,EDX                         { if nil, nothing to do         }
        JE      @@done
        MOV     dword ptr [EAX],0               { clear str                     }
        MOV     ECX,[EDX-skew].StrRec.refCnt    { fetch refCnt                  }
        DEC     ECX                             { if < 0: literal str           }
        JL      @@done
   LOCK DEC     [EDX-skew].StrRec.refCnt        { threadsafe dec refCount       }
        JNE     @@done
        PUSH    EAX
        LEA     EAX,[EDX-skew].StrRec.refCnt    { if refCnt now zero, deallocate}
        CALL    _FreeMem
        POP     EAX
@@done:
end;
{$ENDIF}

With a non-atomic decrement there is a very real chance that JNE will branch on a wrong result: DEC on a memory operand is a read-modify-write sequence, and without LOCK two cores can interleave those steps. (And it really will misbehave if you remove the LOCK prefix.)
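
One such interleaving can be replayed deterministically in a single thread. This is only an illustrative sketch: RegA and RegB are hypothetical stand-ins for the private registers of two threads, and RefCnt stands in for StrRec.refCnt.

```pascal
program LostDecrement;
{$APPTYPE CONSOLE}
var
  RefCnt: Integer;             // shared counter, stands in for StrRec.refCnt
  RegA, RegB: Integer;         // private "registers" of threads A and B
  FreedByA, FreedByB: Boolean;
begin
  RefCnt := 2;                 // two threads hold the string

  RegA := RefCnt;              // A reads 2
  RegB := RefCnt;              // B reads 2 before A has written back

  Dec(RegA);                   // A computes 1
  RefCnt := RegA;              // A writes 1
  FreedByA := RegA = 0;        // A's zero test fails, no FreeMem

  Dec(RegB);                   // B computes 1
  RefCnt := RegB;              // B writes 1 again: one decrement is lost
  FreedByB := RegB = 0;        // B's zero test fails as well

  Writeln('refCnt = ', RefCnt, ', freed = ', FreedByA or FreedByB);
end.
```

Both references are gone, yet the counter stays at 1 and neither side calls FreeMem: the string's memory leaks.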

Of course, I first tried to explain this with examples from the Intel manual, which describes how LOCK works, but in the end I decided to write the following example (which did convince the author of the question):

program interlocked;
{$APPTYPE CONSOLE}
uses
  Windows;
const
  Limit = 1000000;
  DoubleLimit = Limit shl 1;
var
  SameGlobalVariable: Integer;
function Test1(lpParam: Pointer): DWORD; stdcall;
var
  I: Integer;
begin
  for I := 0 to Limit - 1 do
  asm
    lea eax, SameGlobalVariable
    inc [eax] // plain (non-atomic) increment
  end;
  Result := 0; // thread exit code
end;
function Test2(lpParam: Pointer): DWORD; stdcall;
var
  I: Integer;
begin
  for I := 0 to Limit - 1 do
  asm
    lea eax, SameGlobalVariable
    lock inc [eax] // atomic increment
  end;
  Result := 0; // thread exit code
end;
var
  I: Integer;
  hThread: THandle;
  ThreadID: DWORD;
begin
  // Non-atomic increment of SameGlobalVariable
  SameGlobalVariable := 0;
  hThread := CreateThread(nil, 0, @Test1, nil, 0, ThreadID);
  for I := 0 to Limit - 1 do
  asm
    lea eax, SameGlobalVariable
    inc [eax] // plain (non-atomic) increment
  end;
  WaitForSingleObject(hThread, INFINITE);
  CloseHandle(hThread);
  if SameGlobalVariable <> DoubleLimit then
    Writeln('Step one failed. Expected: ', DoubleLimit, ' but current: ', SameGlobalVariable);
  // Atomic increment of SameGlobalVariable
  SameGlobalVariable := 0;
  hThread := CreateThread(nil, 0, @Test2, nil, 0, ThreadID);
  for I := 0 to Limit - 1 do
  asm
    lea eax, SameGlobalVariable
    lock inc [eax] // atomic increment
  end;
  WaitForSingleObject(hThread, INFINITE);
  CloseHandle(hThread);
  if SameGlobalVariable <> DoubleLimit then
    Writeln('Step two failed. Expected: ', DoubleLimit, ' but current: ', SameGlobalVariable);
  Readln;
end.

The example revolves around a global variable, SameGlobalVariable (it plays the role of the string reference counter from the original question). Its value is changed from two threads at once, first with ordinary increments and then with atomic ones.

Here you can clearly see the differences between the two modes of operation.
In the console, you will see something like the following:

Step one failed. Expected: 2000000 but current: 1018924

You will never see an error in the second case.
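
The atomic step can also be written without assembler, using the InterlockedIncrement function from the Windows unit (the counterpart of the InterlockedDecrement mentioned at the start). The sketch below is an assumed variant of the test above, not part of the original program; Worker and Counter are hypothetical names.

```pascal
program InterlockedApi;
{$APPTYPE CONSOLE}
uses
  Windows;
const
  Limit = 1000000;
var
  Counter: Integer;  // shared between the main thread and the worker

function Worker(lpParam: Pointer): DWORD; stdcall;
var
  I: Integer;
begin
  for I := 0 to Limit - 1 do
    InterlockedIncrement(Counter);  // atomic read-modify-write
  Result := 0;
end;

var
  I: Integer;
  hThread: THandle;
  ThreadID: DWORD;
begin
  Counter := 0;
  hThread := CreateThread(nil, 0, @Worker, nil, 0, ThreadID);
  for I := 0 to Limit - 1 do
    InterlockedIncrement(Counter);
  WaitForSingleObject(hThread, INFINITE);
  CloseHandle(hThread);
  if Counter = 2 * Limit then
    Writeln('OK: ', Counter)
  else
    Writeln('Failed: ', Counter);
end.
```

Since every increment is atomic, the final value should always equal 2 * Limit, just as in the LOCK INC variant.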

By the way, the first option can be used as a good enough randomizer (which I talked about in previous articles).

Summarizing:

Analyzing the Delphi source code, the system units and the VCL in particular, can sometimes tell you far more than assumptions about how things actually work, and this is a fact, but ...

No, this is not a fact, it is more than a fact: this is how it really was.

Tags:

nuances and facts

Atomic operations
