Change ConcurrentDictionary during iteration

Recently I decided to deal with the internal device of thread-safe collections, the publication on Habré was the starting point in studying the ConcurrentDictionary device . The principle of its operation is described simply and clearly, for which special thanks to the author.

It seemed to me that one moment in the publication was not fully covered and I decided to fill this gap.

Thread-safe collections are designed for use in a multi-threaded environment and should be able to change at any time. Accordingly, they can be changed even at the time of their enumeration. From here I had a question, if the collection is modified during iteration, will the iterator see these changes?

Let us turn to the article indicated above:
GetEnumerator - can return old values ​​if changes were made by another thread after a method call and how the iterator passed this element.

Well, it’s quite logical that changes to the elements that the iterator has already passed will not be taken into account when sorting through the collection. And what happens if you change an element to which the iterator has not yet "reached" or if you insert a new element into the collection?
We turn to MSDN (the Russian translation of this note is not very well done, so I will also insert the note in the original language):
The enumerator returned from the dictionary is safe to use while reading from and writing to the dictionary, but it does not provide a snapshot of the dictionary. Content accessible through an enumerator may contain changes made to the dictionary after calling GetEnumerator.

The enumerator returned from the dictionary is safe to use concurrently with reads and writes to the dictionary, however it does not represent a moment-in-time snapshot of the dictionary. The contents exposed through the enumerator may contain modifications made to the dictionary after GetEnumerator was called.

I, as a person with a technical education, are confused by the wording “may contain”. Those. may or may not contain? Let's check:

ConcurrentDictionary dictionary = new ConcurrentDictionary();
dictionary.TryAdd(0, "item0");
int x = 1;
foreach (var element in dictionary)
    var tmp = x++;
    if (!dictionary.TryAdd(tmp, "item" + tmp.ToString()))
        throw new Exception("Вставить элемент не удалось");

What will be displayed in the console? Will one element or program enter an infinite loop? Neither one nor the other. The following will be displayed:

[0, item0]
[1, item1]
[2, item2]
[3, item3]
[4, item4]
[5, item5]
[6, item6]
[7, item7]
[8, item8]
[ 9, item9]
[10, item10]
[11, item11]
[12, item12]
[13, item13]
[14, item14]
[15, item15]
[16, item16]

There were no exceptions, respectively, 18 items in the collection were inserted successfully, but the iterator did not see it, why?

Let's take a look at the sources of this collection, namely the implementation of the GetEnumerator method:

public IEnumerator> GetEnumerator()
    Node[] buckets = m_tables.m_buckets;
    for (int i = 0; i < buckets.Length; i++)
        // The Volatile.Read ensures that the load of the fields of 'current' doesn't move before the load from buckets[i].
        Node current = Volatile.Read(ref buckets[i]);
        while (current != null)
            yield return new KeyValuePair(current.m_key, current.m_value);
            current = current.m_next;

The m_tables field is marked with the volatile keyword, so changes to the Node [] m_buckets array contained in it are visible to all threads. Each element of this array represents the first element in a singly linked list and contains a link to the next element in the list. Further, it is easy to guess that as long as adding / changing elements leads to a change in the simply connected lists themselves, the iterator “sees” these changes, but the changes to the array itself are not visible to the iterator.

The m_buckets array changes in two cases. The first is an increase in size when inserting elements, the second is a call to the Clear () method (resets the size of the array to the default value).
In order to answer the question when the size of the m_buckets array increases, I will make a small remark about the internal structure of ConcurrentDictionary.
To provide locks on the Add / Change / Delete operation from the collection, there is an object [] m_locks array, its default size is 4 * number of processors (each time m_tables.m_buckets is increased, the size of the array with objects for locks is doubled).
There is also an int m_budget field that defines the maximum number of elements per lock. It is calculated as follows: m_buckets.Length / m_locks.Length.
The number of elements for each lock is contained in the int [] m_countPerLock field, which is incremented when a new element is added to the linked list, and decremented when an element is removed from the list.
Now back to the condition for increasing the m_buckets array. It increases after the condition tables.m_countPerLock [lockNo]> m_budget is fulfilled, i.e. when the number of elements per lock exceeds the maximum allowed number. It should be noted that this check occurs at the end of the element insertion method, and resizing of the internal m_buckets collection will occur after the current element has been inserted into it.
In my example: I have 4 processors, respectively, the size of the array is m_locks = 16, and m_budget = 31/16 = 1. As soon as we insert 17 elements, 2 elements now fall on one lock and the collection expands.
/ Update The
Update and Remove operations do not change the size of the array and, accordingly, these changes will always be visible to the iterator (of course, if we are talking about changing an element that the iterator hasn’t reached yet).


Despite the fact that we now know when the changes made during the enumeration of the collection will be visible, and when not, we should not take this knowledge into account when programming using ConcurrentDictionary. It is best to adhere to the rule described on MSDN that the changes made may or may not be visible.

Also popular now: