tyomitch May 1, 2011 at 00:31

Lockless Algorithms: Make, Write, (Try Again) Model

Transfer

The atomic multiplication we implemented last time is an example of a more general model, which Raymond called "do, write, (try again)."

for (;;) {
 // take the initial value of the shared variable,
 // which we are going to change
 oldValue = sharedVariable;
 ... take the initial values of other parameters ...
 newValue = ... compute the new value using
                oldValue and copies of other parameters ...
 // instead of Xxx there may be Acquire, Release, or nothing
 if (InterlockedCompareExchangeXxx (
            & sharedVariable,
            newValue, oldValue) == oldValue) {
  break; // record failed
 }
 ... remove newValue ...
} // try again

We calculate the new value, and then call InterlockedCompareExchangeit to write to the general variable only if its value has not changed. If it has changed, then another stream has passed us; in this case, we will try to perform the operation in a new way, from the very beginning, in the hope that next time no one will “cut us up”.

It is important to understand that the “do, write, try again” model does not guarantee that the thread that first starts updating the value of the variable will win the “race” first. The “unsuccessful” stream may, at each attempt, again and again concede to faster rivals; it is theoretically possible that he neverwill not complete the operation begun (but each of the streams overtaking it will complete its own). This feature is typical of lockless algorithms - unlike locks, which usually implement the waiting queue: the first one comes in, the first one passes.

The translator dared to come up with his own analogy: the box office at McDonald's is a castle, behind which there is a line of waiting customers. If a client standing at the checkout has rummaged through his pockets for little things, then the rest of the clients and the cashier languish in anticipation; the system as a whole is idle. On the other hand, the waiter in the restaurant is a castleless resource. Each client calls him to pay for lunch; but another client can “shake” and intercept the waiter along the way. It is not known how many times the waiter will be distracted until he finally reaches your table; but on the other hand, if he does not suit you, then now he is serving someone else. In general, the system is not idle.

The function InterlockedMultiplyfrom the last post corresponds to the template given here almost verbatim: “another parameter” is the factor transmitted by the function parameter; "New meaning" is a work. If the recording of the new value failed, then someone else changed it, and we start the operation again.

Initialization of static variables in the last post also corresponds to the “do, write, (try again)” model, just the “try again” stage is optimized: we know that its result would be the same values that were already written into the shared variables by the overtaking stream; therefore, this step may not be performed. In the case of variable initialization, it is InterlockedCompareExchange used in the Release variant, because the new value is valid only together with other data in memory (only together witha), and should be recorded only after them.

Another “optimized” version of the model is “do, write, (despair)”. There is no loop in it either: if writing a new value fails, the function considers the failure to be fatal, and refrains from further attempts. That's the way, for example TryEnterCriticalSection. It is used in it InterlockedCompareExchangeAcquire: changes made from inside the critical section should not be written to memory before it is recorded that the section is busy.

If the common variable is not a number, but a pointer, then you need to take care that the already mentioned “ABA problem” does not occur when the same pointer points to a new object. Initialization of static objects does not require additional difficulties: the pointer can only change from NULL to something, and once it becomes non-NULL, the pointer no longer changes. If the pointer can be changed arbitrarily, then, as a rule, this pointer and the “version number” are combined in a common variable, which increases with each change of the pointer.

In the following example of the use of the “do, write, (try again)” model, the version number will be stored in a common variable. For starters, two helper functions:

LONG InterlockedReadAcquire (__ in LONG * pl, __in LONG lUnlikely)
{
  return InterlockedCompareExchangeAcquire (pl, lUnlikely, lUnlikely);
}
LONG InterlockedReadRelease (__ in LONG * pl, __in LONG lUnlikely)
{
  return InterlockedCompareExchangeRelease (pl, lUnlikely, lUnlikely);
}

Unlike reading LONG variables directly, these functions impose restrictions on the order in which they are written to memory. Since the compared and the new value in the operation InterlockedCompareExchangecoincide, then at any outcome of the operation, the checked value does not change. The actions performed are similar to the code:

if (* pl == lUnlikely) * pl = lUnlikely;

As lUnlikelyRaymond, he advises choosing an unlikely value so that in most cases the assignment is not performed, and the cache block does not get dirty. (Not on all processors, this allows you to prevent writing to memory, but such a trick will not be superfluous.)

So, our example:

LONG g_lColorChange; // color version number
...
case WM_SYSCOLORCHANGE:
 InterlockedIncrement (& g_lColorChange);
 ...
int CalculateSomethingAboutSystemColors ()
{
 LONG lColorChangeStart;
 do {
  lColorChangeStart = InterlockedReadAcquire (& g_lColorChange, -1);
  COLORREF clrWindow = GetSysColor (COLOR_WINDOW);
  COLORREF clrHighlight = GetSysColor (COLOR_HIGHLIGHT);
  ... other calculations using GetSysColor (...)
 } while (InterlockedReadRelease (
                       & g_lColorChange, -1)! = lColorChangeStart);
 return iResult;
}

We take the old version number value and begin our calculations. We take it with the Acquire operation, so the read version number is written into memory before the color values that we use for calculations.
When the calculations are completed, we compare the current version number with the saved one. If they do not match, then the colors have changed during our calculations, so they will have to be redone.
An ABA problem is possible if the colors have changed four billion times during the calculations: we won’t notice this because of the counter overflow. It is clear that in reality this is unlikely to happen.

Tags:

Lockless Algorithms: Make, Write, (Try Again) Model

Also popular now: