Concurrency structures in .NET: ConcurrentDictionary from the inside

It all started with an interview that prompted me to write this article. Quite a few developers on the .NET platform do not understand the basics even though they use them every day: for example, every call to a ConcurrentDictionary is wrapped in a lock, although in that case a plain Dictionary<> would have been enough.

In the literature, there are three main ways to implement concurrent data structures:
• lock-free data structures;
• fine-grained locking;
• transactional memory.

ConcurrentDictionary is the thread-safe counterpart of Dictionary. It is built on fine-grained locking.

General information


There are two main configuration parameters:
• capacity - the initial number of elements the dictionary can hold. The default is 31.
• concurrencyLevel - the estimated number of threads that will write to the dictionary concurrently. The default is 4 * the number of processors.

Let's look at these parameters in more detail to understand what they really mean.

Capacity

This parameter is similar to the one in the regular Dictionary: it is the initial size of the array that stores the elements.
If we look at the source code, we will see this line:
ConcurrentDictionary.Node[] buckets = new ConcurrentDictionary.Node[capacity];

As you know, Dictionary is a classic hash table, so a hash value is used to store elements; in .NET it is computed by the GetHashCode() function and has type int. The elements end up in the buckets array shown above, and the target bucket is calculated as:
bucketNo = (hashcode & int.MaxValue) % bucketCount;

This is what gives O(1) access time.
When the table overflows, the bucket array is grown (newLength = oldLength * 2 + 1) and all elements are redistributed across the new buckets.
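
To make the formula above concrete, here is a tiny standalone illustration of the same bucket selection (the key and the bucket count are arbitrary choices for the example):

using System;

class BucketDemo
{
    static void Main()
    {
        // The same bucket selection the dictionary performs; 31 is the default capacity.
        int bucketCount = 31;
        int hashcode = "some key".GetHashCode();

        // Masking with int.MaxValue clears the sign bit so the index is never negative.
        int bucketNo = (hashcode & int.MaxValue) % bucketCount;

        Console.WriteLine($"\"some key\" falls into bucket {bucketNo} of {bucketCount}");
    }
}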

ConcurrencyLevel

The estimated number of threads that will write to the dictionary concurrently. The definition immediately raises the question: why should the dictionary know the number of threads? In fact, it is simple: this is nothing other than the number of independent locks (the lock pool), i.e. the maximum number of threads that can write to the dictionary at the same time. Each lock is assigned roughly the same group of buckets, and it synchronizes the writing threads for that group.
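
A minimal sketch of how both parameters are typically supplied (the numbers below are arbitrary): concurrencyLevel sizes the lock pool, capacity sizes the bucket array.

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class CtorDemo
{
    static void Main()
    {
        // A lock pool of 8 entries and a bucket array pre-sized for ~10,000 items.
        var map = new ConcurrentDictionary<int, int>(concurrencyLevel: 8,
                                                     capacity: 10_000);

        // Writers whose buckets map to different locks proceed in parallel;
        // writers that land on the same lock serialize on it.
        Parallel.For(0, 100_000, i => map[i % 10_000] = i);

        Console.WriteLine(map.Count);
    }
}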

Operations on ConcurrentDictionary


The main operations on the dictionary fall into three groups:
  • completely non-blocking;
  • taking a single lock from the lock pool;
  • locking the entire dictionary.


The completely non-blocking operations are (a short usage sketch follows the list):
  • ContainsKey
  • TryGetValue
  • this[] (the indexer getter)
  • GetEnumerator - this operation does not guarantee a consistent view of the data (it does not take a snapshot), i.e. the contents may change while the enumeration is running.
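
A short usage sketch of these operations (the keys and values are made up for the example). Note that modifying the dictionary while enumerating neither throws nor guarantees that the change will be observed:

using System;
using System.Collections.Concurrent;

class ReadDemo
{
    static void Main()
    {
        var cache = new ConcurrentDictionary<string, int>();
        cache["a"] = 1;

        // Lock-free reads.
        if (cache.TryGetValue("a", out int value))
            Console.WriteLine(value);                 // 1
        Console.WriteLine(cache.ContainsKey("b"));    // False, no lock taken

        // Enumeration is not a snapshot: a concurrent write does not invalidate it.
        foreach (var pair in cache)
        {
            cache.TryAdd("b", 2);   // safe while iterating; may or may not be seen
            Console.WriteLine(pair.Key);
        }
    }
}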

All read operations (Get / ContainsKey) follow roughly the same algorithm (a simplified sketch follows the list):
  • compute the key hash via GetHashCode();
  • compute the bucket where the element lives;
  • compare the key stored in the bucket with the requested key;
  • read the value via Volatile.Read.
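
To make these steps concrete, here is a heavily simplified, self-contained sketch of such a read path. The TinyMapReadSketch class, its Node type and its field names are illustrative stand-ins and do not reproduce the real ConcurrentDictionary source:

using System.Collections.Generic;
using System.Threading;

// Illustrative only: a tiny hash table with a lock-free read path.
class TinyMapReadSketch<TKey, TValue>
{
    private sealed class Node
    {
        public readonly TKey Key;
        public readonly TValue Value;
        public readonly int Hashcode;
        public readonly Node Next;
        public Node(TKey key, TValue value, int hashcode, Node next)
        { Key = key; Value = value; Hashcode = hashcode; Next = next; }
    }

    private readonly Node[] _buckets = new Node[31];
    private readonly IEqualityComparer<TKey> _comparer = EqualityComparer<TKey>.Default;

    public bool TryGetValue(TKey key, out TValue value)
    {
        // Hash the key and pick the bucket.
        int hashcode = _comparer.GetHashCode(key);
        int bucketNo = (hashcode & int.MaxValue) % _buckets.Length;

        // Walk the chain; Volatile.Read guarantees we see a fully published node.
        for (Node n = Volatile.Read(ref _buckets[bucketNo]); n != null; n = n.Next)
        {
            if (n.Hashcode == hashcode && _comparer.Equals(n.Key, key))
            {
                value = n.Value;   // no lock is taken anywhere on this path
                return true;
            }
        }

        value = default(TValue);
        return false;
    }
}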


The operations that take a single lock from the lock pool are:
  • TryAdd
  • TryUpdate
  • TryRemove

Below is an outline of the algorithm (a simplified sketch follows the list):
  1. Compute the hash of the new item's key.
  2. Compute the bucket bucketNo the item will go into and the lock number from the pool:
    bucketNo = (hashcode & int.MaxValue) % bucketCount;
    lockNo = bucketNo % lockCount;
  3. Acquire lock number lockNo via Monitor.Enter.
  4. Write the element via Volatile.Write.
  5. Release the lock via Monitor.Exit.
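
And here is an equally simplified, self-contained sketch of that write path. Again, TinyMapWriteSketch, its Node type and its fixed-size lock pool are illustrative assumptions rather than the real implementation:

using System.Collections.Generic;
using System.Threading;

// Illustrative only: the write path of a tiny hash table with a striped lock pool.
class TinyMapWriteSketch<TKey, TValue>
{
    private sealed class Node
    {
        public readonly TKey Key;
        public readonly TValue Value;
        public readonly int Hashcode;
        public readonly Node Next;
        public Node(TKey key, TValue value, int hashcode, Node next)
        { Key = key; Value = value; Hashcode = hashcode; Next = next; }
    }

    private readonly Node[] _buckets = new Node[31];
    private readonly object[] _locks;
    private readonly IEqualityComparer<TKey> _comparer = EqualityComparer<TKey>.Default;

    public TinyMapWriteSketch(int concurrencyLevel)
    {
        _locks = new object[concurrencyLevel];
        for (int i = 0; i < _locks.Length; i++) _locks[i] = new object();
    }

    public bool TryAdd(TKey key, TValue value)
    {
        // 1-2. Hash the key, pick the bucket and the lock guarding it.
        int hashcode = _comparer.GetHashCode(key);
        int bucketNo = (hashcode & int.MaxValue) % _buckets.Length;
        int lockNo = bucketNo % _locks.Length;

        bool lockTaken = false;
        try
        {
            // 3. Only the one lock covering this group of buckets is acquired.
            Monitor.Enter(_locks[lockNo], ref lockTaken);

            for (Node n = _buckets[bucketNo]; n != null; n = n.Next)
                if (n.Hashcode == hashcode && _comparer.Equals(n.Key, key))
                    return false;   // the key is already present

            // 4. Publish the new head of the chain so lock-free readers see a complete node.
            var newNode = new Node(key, value, hashcode, _buckets[bucketNo]);
            Volatile.Write(ref _buckets[bucketNo], newNode);
            return true;
        }
        finally
        {
            // 5. Release the lock.
            if (lockTaken) Monitor.Exit(_locks[lockNo]);
        }
    }
}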


The least efficient operations, which lock the entire dictionary, are:
  • Count, IsEmpty. Yes, these operations take every lock in the pool. If you only need the number of elements for, say, a log entry, you can enumerate the dictionary instead (see the sketch after this list).
  • Keys, Values - the list of keys and the list of values, respectively. Here, by the way, you do get consistent data: these are snapshots.
  • CopyTo - the explicit ICollection implementation.
  • Clear, ToArray.
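
For instance, if an exact, fully consistent count is not required (say, for a log line), walking the enumerator avoids taking every lock, at the cost of getting an approximate value while writers are active. One caveat: LINQ's Count() shortcuts to the ICollection.Count property, so the sketch below enumerates by hand:

using System;
using System.Collections.Concurrent;

class CountDemo
{
    static void Main()
    {
        var items = new ConcurrentDictionary<string, int>();
        items["a"] = 1;
        items["b"] = 2;

        // Count acquires every lock in the pool to produce a consistent result.
        int exact = items.Count;

        // Enumerating takes no locks, but the result is only approximate
        // if other threads are writing at the same time.
        int approximate = 0;
        foreach (var pair in items) approximate++;

        Console.WriteLine($"exact={exact}, approximate={approximate}");
    }
}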


The AddOrUpdate and GetOrAdd methods

These methods are interesting in that they are built from the atomic Add / Get / Update operations, yet they themselves are not atomic: no lock is held for the whole operation. This is roughly what the AddOrUpdate implementation looks like:
      TValue comparisonValue;
      TValue newValue;
      do
      {
        while (!TryGetValue(key, out comparisonValue))
        {
          // The key is not present yet: try to add it atomically.
          TValue obj = addValueFactory(key);
          TValue resultingValue;
          if (TryAddInternal(key, obj, false, true, out resultingValue))
            return resultingValue;
        }
        // The key exists: compute the new value and try to swap it in atomically.
        newValue = updateValueFactory(key, comparisonValue);
      }
      while (!TryUpdate(key, newValue, comparisonValue));
      return newValue;


MSDN:
Also, although all methods of ConcurrentDictionary are thread-safe, not all methods are atomic, specifically GetOrAdd and AddOrUpdate. The user delegate that is passed to these methods is invoked outside of the dictionary's internal lock. (This is done to prevent unknown code from blocking all threads.)

Usage example:
 items.AddOrUpdate(srcKey, srcValue,
                (key, existingVal) =>
                {
                    // compare the value being added with the value already stored in the dictionary
                    if (srcValue != existingVal)
                        throw new ArgumentException("...");
                    return existingVal;
                });
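
Because the delegates run outside the lock, the valueFactory passed to GetOrAdd can be invoked by several threads at once even though only one produced value ends up stored. A common workaround, shown here only as a sketch with a hypothetical LoadExpensiveValue helper, is to cache Lazy<TValue> so the expensive work itself runs at most once:

using System;
using System.Collections.Concurrent;

class GetOrAddDemo
{
    static readonly ConcurrentDictionary<string, Lazy<byte[]>> _cache =
        new ConcurrentDictionary<string, Lazy<byte[]>>();

    // Hypothetical stand-in for genuinely expensive work (I/O, computation, ...).
    static byte[] LoadExpensiveValue(string key) => new byte[1024];

    static byte[] Get(string key)
    {
        // Several threads may race to create the Lazy<> wrapper, but Lazy<T>
        // guarantees LoadExpensiveValue runs at most once for the stored entry.
        return _cache.GetOrAdd(key,
            k => new Lazy<byte[]>(() => LoadExpensiveValue(k))).Value;
    }

    static void Main() => Console.WriteLine(Get("report").Length);
}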


Finally, I want to note the complexity of the main methods:

  1. TryAdd, TryUpdate, TryRemove - O(1)
  2. Get / Contains, the indexer Item[] by key - O(1)
  3. ContainsValue, ToArray(), Keys, Values - O(n)


P.S. In this article I wanted to draw attention to the subtleties of using ConcurrentDictionary and to the main points of its implementation.
