Top 20 mistakes when working with multithreading in C++ and ways to avoid them

Hello, Habr! I bring to your attention a translation of the article "Top 20 C++ multithreading mistakes and how to avoid them" by Deb Haldar.


Scene from the movie "Looper" (2012)

Multithreading is one of the most difficult areas in programming, especially in C++. Over my years of development I have made many mistakes. Fortunately, most of them were caught in code review and testing. Nevertheless, some slipped into production anyway, and we had to patch the running systems, which is always expensive.

In this article, I tried to categorize all the errors I know with possible solutions. If you are aware of any other pitfalls, or have suggestions for resolving the errors described, please leave your comments under the article.

Mistake #1: Not calling join() to wait for background threads before exiting the application


If you forget to join a thread (join()) or detach it (detach()) — that is, to make it non-joinable — before the program terminates, the program will crash. (This translation uses "join" for join() and "detach" for detach(), although this is not entirely precise. In fact, join() is the point at which one thread of execution waits for another to finish; no actual joining or merging of threads occurs. [Translator's note])

In the example below, we forgot to call join() on thread t1 in the main thread. Why did the program crash? Because at the end of the main() function

#include "stdafx.h"
#include <iostream>
#include <thread>
 
using namespace std;
 
void LaunchRocket()
{
   cout << "Launching Rocket" << endl;
}

int main()
{
   thread t1(LaunchRocket);
   //t1.join(); // we forgot to join, so the program terminates abnormally
   return 0;
}


the variable t1 went out of scope and the thread destructor was called. The destructor checks whether t1 is still joinable; a thread is joinable if it has been neither joined nor detached. If it is still joinable, the destructor calls std::terminate. Here, for example, is what the MSVC++ implementation does:

~thread() _NOEXCEPT
{  // clean up
    if (joinable())
        _XSTD terminate();
}

There are two ways to fix the problem, depending on the task: (1) join thread t1 in the main thread, or (2) detach t1 from the main thread and let it continue as a "daemonized" thread. Both options are shown below.

int main()
{
  thread t1(LaunchRocket);
  t1.join(); // join thread t1: wait in the main thread for it to finish
  return 0;
}




int main()
{
    thread t1(LaunchRocket);
    t1.detach(); // detach t1 from the main thread
    return 0;
}


Mistake #2: Trying to join a thread that was previously detached


If you have detached a thread at some point in the program, you cannot join it back to the main thread. This seems like a very obvious mistake, but the problem is that you may detach a thread, then write a few hundred lines of code, and then try to join it again. After all, who remembers what they wrote 300 lines back, right?

The problem is that this will not cause a compilation error; instead, the program will crash at run time, as in the first listing below. The solution, shown in the second listing, is to always check joinable() before trying to join a thread to the calling thread.

#include "stdafx.h"
#include <iostream>
#include <thread>
 
using namespace std;
 
void LaunchRocket()
{
    cout << "Launching Rocket" << endl;
}
 
int main()
{
    thread t1(LaunchRocket);
    t1.detach();
    //..... 100 lines of some code
    t1.join(); // CRASH !!!
    return 0;
}




int main()
{
  thread t1(LaunchRocket);
  t1.detach();
  //..... 100 lines of some code
 
  if (t1.joinable())
  {
    t1.join(); 
  }
 
  return 0;
}


Mistake #3: Not understanding that std::thread::join() blocks the calling thread of execution


In real applications you often need to offload long-running operations, such as network I/O or waiting for the user to press a button, to worker threads. Calling join() on such worker threads from, say, the UI render thread can make the user interface hang. There are better ways to handle this.

For example, in GUI applications a worker thread can, upon completion, post a message to the UI thread. The UI thread has its own event loop that processes mouse movements, key presses, and so on. This loop can also receive messages from worker threads and react to them without ever calling the blocking join().

For this very reason, in Microsoft's WinRT platform almost all user interactions are made asynchronous, and synchronous alternatives are not available. These decisions were made to ensure that developers use the APIs that provide the best possible end-user experience. You can refer to the "Modern C++ and Windows Store Apps" book for more information on this topic.

Mistake #4: Assuming that thread function arguments are passed by reference by default


Thread function arguments are passed by value by default. If you need changes to propagate to the arguments you passed, you must pass them by reference using std::ref().

Under the spoiler are examples from another article, "C++11 Multithreading Tutorial via Q&A - Thread Management Basics" (Deb Haldar), illustrating parameter passing. [Translator's note]

More details:

When the first listing below is executed, the terminal shows the output that follows it. As you can see, the value of the targetCity variable was not changed through the reference received by the thread function. The second listing rewrites the code using std::ref() to pass the argument; its output shows that changes made in the spawned thread now affect the value of targetCity declared and initialized in main().
#include "stdafx.h"
#include <iostream>
#include <thread>
#include <string>
#include <functional>
 
using namespace std;
 
void ChangeCurrentMissileTarget(string& targetCity)
{
  targetCity = "Metropolis";
  cout << " Changing The Target City To " << targetCity << endl;
}
 
int main()
{
  string targetCity = "Star City";
  thread t1(ChangeCurrentMissileTarget, targetCity);
  t1.join();
  cout << "Current Target City is " << targetCity << endl;
 
  return 0;
}
 



Changing The Target City To Metropolis
Current Target City is Star City






#include "stdafx.h"
#include <iostream>
#include <thread>
#include <string>
#include <functional>
 
using namespace std;
 
void ChangeCurrentMissileTarget(string& targetCity)
{
  targetCity = "Metropolis";
  cout << " Changing The Target City To " << targetCity << endl;
}
 
int main()
{
  string targetCity = "Star City";
  thread t1(ChangeCurrentMissileTarget, std::ref(targetCity));
  t1.join();
  cout << "Current Target City is " << targetCity << endl;
 
  return 0;
}



Changing The Target City To Metropolis
Current Target City is Metropolis




Mistake #5: Not protecting shared data and resources with a critical section (for example, a mutex)


In a multithreaded environment, more than one thread usually competes for resources and shared data. The result is often an indeterminate state of those resources and data, unless access to them is protected by a mechanism that allows only one thread of execution to operate on them at a time.

In the example below, std::cout is a shared resource used by six threads (t1-t5 plus main). Running this program produces the output shown after the listing. This happens because the six threads access the output stream concurrently, in arbitrary order. To make the output more deterministic, you must protect access to the shared resource using std::mutex. Just change the CallHome() function

#include "stdafx.h"
#include <iostream>
#include <string>
#include <thread>
#include <mutex>
 
using namespace std;
 
std::mutex mu;
 
void CallHome(string message)
{
  cout << "Thread " << this_thread::get_id() << " says " << message << endl;
}
 
int main()
{
  thread t1(CallHome, "Hello from Jupiter");
  thread t2(CallHome, "Hello from Pluto");
  thread t3(CallHome, "Hello from Moon");
 
  CallHome("Hello from Main/Earth");
 
  thread t4(CallHome, "Hello from Uranus");
  thread t5(CallHome, "Hello from Neptune");
 
  t1.join();
  t2.join();
  t3.join();
  t4.join();
  t5.join();
 
  return 0;
}
 




Thread 0x1000fb5c0 says Hello from Main/Earth
Thread Thread Thread 0x700005bd20000x700005b4f000 says says Thread Thread Hello from Pluto0x700005c55000Hello from Jupiter says 0x700005d5b000Hello from Moon
0x700005cd8000 says says Hello from Uranus

Hello from Neptune


so that it acquires the mutex before using std::cout and releases it afterwards:

void CallHome(string message)
{
  mu.lock();
  cout << "Thread " << this_thread::get_id() << " says " << message << endl;
  mu.unlock();
}


Mistake #6: Forgetting to release the lock after leaving the critical section


In the previous section you saw how to protect a critical section with a mutex. However, calling lock() and unlock() directly on the mutex is not the preferred approach, because you may forget to release a lock you hold. What happens then? All other threads waiting for the resource are blocked indefinitely, and the program may hang.

In our synthetic example, if you forget to unlock the mutex in CallHome(), the first message from thread t1 is printed to the standard output stream and the program hangs. This is because thread t1 acquired the mutex lock and the remaining threads wait forever for it to be released.

void CallHome(string message)
{
  mu.lock();
  cout << "Thread " << this_thread::get_id() << " says " << message << endl;
  //mu.unlock();  // we forgot to release the lock
}


Below is the output of this code: the program printed a single message to the terminal, hung, and never finished. Mistakes like this are common, which is why it is undesirable to call the lock()/unlock() methods directly on the mutex. Instead, use the std::lock_guard class template, which uses the RAII idiom to manage the lock's lifetime. When a lock_guard object is created, it attempts to acquire the mutex. When the program leaves the lock_guard object's scope, its destructor runs and releases the mutex. We rewrite CallHome() using std::lock_guard:

Thread 0x700005986000 says Hello from Pluto







void CallHome(string message)
{
  std::lock_guard<std::mutex> lock(mu);  // try to acquire the lock
  cout << "Thread " << this_thread::get_id() << " says " << message << endl;
}// the lock_guard object is destroyed and releases the mutex


Mistake #7: Making the critical section larger than necessary


While one thread executes inside a critical section, all others trying to enter it are essentially blocked. We should therefore keep as few instructions in the critical section as possible. To illustrate, the first listing below shows bad code with an oversized critical section. The ReadFiftyThousandRecords() method does not modify any data, so there is no reason to execute it under the lock. If this method takes 10 seconds to read fifty thousand rows from the database, all other threads are needlessly blocked for that entire period, which can seriously hurt program performance. The correct solution, in the second listing, is to keep only the work with std::cout in the critical section.

void CallHome(string message)
{
  std::lock_guard<std::mutex> lock(mu); // start of the critical section: protect access to std::cout
 
  ReadFiftyThousandRecords();
 
  cout << "Thread " << this_thread::get_id() << " says " << message << endl;
 
}// when the lock_guard object is destroyed, the lock on mutex mu is released






void CallHome(string message)
{
  ReadFiftyThousandRecords(); // no need to keep this call in the critical section: it does not modify data
  std::lock_guard<std::mutex> lock(mu); // start of the critical section: protect access to std::cout
  cout << "Thread " << this_thread::get_id() << " says " << message << endl;
 
}// when the lock_guard object is destroyed, the lock on mutex mu is released


Mistake #8: Acquiring multiple locks in different orders



This is one of the most common causes of deadlock , a situation in which threads are infinitely blocked due to waiting for access to resources blocked by other threads. Consider an example:

Thread 1                        Thread 2
lock A                          lock B
// ... some operations          // ... some operations
lock B                          lock A
// ... some other operations    // ... some other operations
unlock B                        unlock A
unlock A                        unlock B

A situation can arise in which thread 1 tries to acquire lock B and blocks, because thread 2 has already acquired it. Meanwhile, thread 2 tries to acquire lock A but cannot, because thread 1 holds it. Thread 1 cannot release lock A until it acquires lock B, and so on. In other words, the program freezes.

The code example below will help you reproduce the deadlock. If you run it, it hangs. If you break into the debugger and look at the threads window, you will see that thread 1 (inside CallHome_Th1()) is trying to acquire mutex B, while thread 2 (inside CallHome_Th2()

#include "stdafx.h"
#include <iostream>
#include <string>
#include <thread>
#include <mutex>
#include <chrono>
 
using namespace std;
 
std::mutex muA;
std::mutex muB;
 
void CallHome_Th1(string message)
{
  muA.lock();
  // perform some operations
  std::this_thread::sleep_for(std::chrono::milliseconds(100));
  muB.lock();
 
  cout << "Thread " << this_thread::get_id() << " says " << message << endl;
 
  muB.unlock();
  muA.unlock();
}
 
void CallHome_Th2(string message)
{
  muB.lock();
  // some other operations
  std::this_thread::sleep_for(std::chrono::milliseconds(100));
  muA.lock();
 
  cout << "Thread " << this_thread::get_id() << " says " << message << endl;
 
  muA.unlock();
  muB.unlock();
}
 
int main()
{
  thread t1(CallHome_Th1, "Hello from Jupiter");
  thread t2(CallHome_Th2, "Hello from Pluto");
 
  t1.join();
  t2.join();
 
  return 0;
}


) is trying to acquire mutex A. Neither thread can make progress, which results in deadlock. What can you do about it? The best solution is to restructure the code so that the locks are always acquired in the same order. Depending on the situation, you can use other strategies: 1. Acquire several locks jointly with the std::scoped_lock wrapper class: 2. Use the std::timed_mutex class, whose try_lock_for() gives up after a timeout if the lock could not be acquired:










std::scoped_lock lock{muA, muB};



std::timed_mutex m;
 
void DoSome(){
    std::chrono::milliseconds timeout(100);
 
    while(true){
        if(m.try_lock_for(timeout)){
            std::cout << std::this_thread::get_id() << ": acquired mutex successfully" << std::endl;
            m.unlock();
        } else {
            std::cout << std::this_thread::get_id() << ": can't acquire mutex, do something else" << std::endl;
        }
    }
}


Mistake #9: Trying to acquire a std::mutex lock twice


Trying to acquire a lock twice leads to undefined behavior; in most debug implementations it results in a crash. For example, in the code below, LaunchRocket() locks the mutex and then calls StartThruster(). Curiously, you will not hit the problem during normal operation of the program: it appears only when an exception is thrown, at which point you get undefined behavior or abnormal termination. The proper fix is to restructure the code so that it never re-acquires a lock it already holds. As a stopgap you can use std::recursive_mutex

#include "stdafx.h"
#include <iostream>
#include <thread>
#include <mutex>
 
std::mutex mu;
 
static int counter = 0;
 
void StartThruster()
{
  try
  {
    // some operations
  }
  catch (...)
  {
    std::lock_guard<std::mutex> lock(mu);
    std::cout << "Launching rocket" << std::endl;
  }
}
 
void LaunchRocket()
{
  std::lock_guard<std::mutex> lock(mu);
  counter++;
  StartThruster();
}
 
int main()
{
  std::thread t1(LaunchRocket);
  t1.join();
  return 0;
}
 


, but such a solution almost always indicates poor program design.

Mistake #10: Using mutexes when std::atomic types are enough



When you need to modify simple data types, for example a boolean flag or an integer counter, std::atomic will usually give better performance than a mutex.

For example, instead of the first construction below, it is better to declare the variable as std::atomic, as in the second. For a detailed comparison of mutexes and atomics, refer to the article "Comparison: Lockless programming with atomics in C++11 vs. mutex and RW-locks".

int counter;
...
mu.lock();
counter++;
mu.unlock();




std::atomic<int> counter;
...
counter++;




Mistake #11: Creating and destroying many threads directly instead of using a thread pool


Creating and destroying threads is expensive in terms of CPU time. Imagine trying to create a thread while the system is performing computationally intensive work, for example rendering graphics or computing game physics. A common approach for such tasks is to create a pool of pre-allocated threads that handle routine jobs, such as writing to disk or sending data over the network, throughout the entire lifetime of the process.

Another advantage of a thread pool over spawning and destroying threads yourself is that you do not have to worry about thread oversubscription (a situation in which the number of threads exceeds the number of available cores, so a significant share of CPU time is spent on context switching [translator's note]), which can hurt system performance.

In addition, the use of the pool saves us from the pangs of managing the life cycle of threads, which ultimately translates into more compact code with fewer errors.

The two most popular libraries that implement the thread pool are Intel Thread Building Blocks (TBB) and Microsoft Parallel Patterns Library (PPL) .

Mistake #12: Not handling exceptions that occur in background threads


Exceptions thrown in one thread cannot be caught in another thread. Suppose we have a function that throws. If we execute this function on a separate thread forked from the main thread and expect to catch in the main thread any exception thrown from the extra thread, it will not work. Consider the example below: when this program runs, it crashes, but the catch block in main() never executes and never handles the exception thrown in thread t1. The solution is a C++11 facility: std::exception_ptr can carry an exception thrown in a background thread. Here are the steps you need to take:

#include "stdafx.h"
#include <iostream>
#include <thread>
#include <exception>
#include <stdexcept>
 
static std::exception_ptr teptr = nullptr;
 
void LaunchRocket()
{
  throw std::runtime_error("Catch me in MAIN");
}
 
int main()
{
  try
  {
    std::thread t1(LaunchRocket);
    t1.join();
  }
  catch (const std::exception &ex)
  {
    std::cout << "Thread exited with exception: " << ex.what() << "\n";
  }
 
  return 0;
}
 






  • Create a global std::exception_ptr instance initialized to nullptr
  • Inside the function that runs on the separate thread, catch all exceptions and assign the result of std::current_exception() to the global std::exception_ptr declared in the previous step
  • Check the value of that global variable in the main thread
  • If the value is set, use std::rethrow_exception(exception_ptr p) to rethrow the previously captured exception

Rethrowing the exception does not have to happen on the thread where it originated, which makes this feature a great fit for handling exceptions across threads.

In the code below, you can safely handle the exception thrown in the background thread.

#include "stdafx.h"
#include <iostream>
#include <thread>
#include <exception>
#include <stdexcept>
#include <chrono>
 
static std::exception_ptr globalExceptionPtr = nullptr;
 
void LaunchRocket()
{
  try
  {
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    throw std::runtime_error("Catch me in MAIN");
  }
  catch (...)
  {
    // when an exception occurs, store it in the pointer
    globalExceptionPtr = std::current_exception();
  }
}
 
int main()
{
  std::thread t1(LaunchRocket);
  t1.join();
 
  if (globalExceptionPtr)
  {
    try
    {
      std::rethrow_exception(globalExceptionPtr);
    }
    catch (const std::exception &ex)
    {
      std::cout << "Thread exited with exception: " << ex.what() << "\n";
    }
  }
 
  return 0;
}


Mistake #13: Using raw threads to simulate asynchronous operations instead of std::async


If you need code to execute asynchronously, i.e. without blocking the main thread of execution, the best choice is usually std::async(). This is equivalent to creating a thread and passing it the code to run via a function pointer or a lambda. With a raw thread, however, you have to manage its creation, joining/detaching, and the handling of any exceptions that may occur on it. With std::async() you avoid all of these problems and also sharply reduce your chances of running into a deadlock.

Another significant benefit of std::async is the ability to return the result of the asynchronous operation to the calling thread through a std::future object. Imagine we have a function ConjureMagic() that returns an int. We can start an asynchronous operation that sets the future's value when the task completes, and extract the result from that object in the thread that started the operation, as in the example below. Getting the result back from a raw thread to the caller is more cumbersome. Two ways are possible:

// start the asynchronous operation and obtain a handle to the future
std::future<int> asyncResult2 = std::async(&ConjureMagic);
 
//... do some work until the future is set
 
// get the result from the future
 int v = asyncResult2.get();




  1. Pass a reference to an output variable into the spawned thread, which stores the result in it.
  2. Store the result in a member variable of the worker object, which can be read once the thread has finished executing.

Kurt Guntheroth found that, performance-wise, the overhead of creating a thread is about 14 times higher than that of using async.

Bottom line: use std::async() by default until you find strong arguments in favor of using std::thread directly.

Mistake #14: Not using std::launch::async when asynchrony is required


std::async() is a somewhat misleading name, because by default it may not run asynchronously at all!

There are two std::async launch policies:

  1. std::launch::async: the passed function starts executing immediately on a separate thread
  2. std::launch::deferred: the passed function does not start immediately; its launch is deferred until get() or wait() is called on the std::future object returned by std::async. At the point of those calls, the function executes synchronously.

When we call std::async() with default parameters, it runs with a combination of these two policies, which in effect makes the behavior unpredictable. There are further difficulties associated with using std::async() with the default launch policy:

  • it is impossible to predict whether thread-local variables are accessed correctly
  • the asynchronous task may never run at all, because get() or wait() may never be called during program execution
  • loops whose exit condition waits for the std::future to become ready may never finish, since the std::future returned by std::async may be in the deferred state

To avoid all these difficulties, always call std::async with the std::launch::async launch policy.

Don't do it the first way below; do it the second way instead. This point is discussed in more detail in Scott Meyers' book "Effective Modern C++".

// run myFunction using std::async with the default launch policy
auto myFuture = std::async(myFunction);




// run myFunction asynchronously
auto myFuture = std::async(std::launch::async, myFunction);




Mistake #15: Calling get() on a std::future object in a performance-critical code path


The code below processes the result obtained from the std::future of an asynchronous operation. However, the while loop blocks until the asynchronous operation completes (10 seconds in this case). If you use this loop to render information on the screen, it can introduce nasty delays in the user interface. Note: another problem with this code is that it tries to access the std::future a second time, even though its state was consumed on the first iteration of the loop and cannot be retrieved again. The correct solution is to check the validity of the std::future

#include "stdafx.h"
#include <iostream>
#include <future>
#include <thread>
#include <chrono>
 
int main()
{
  std::future<int> myFuture = std::async(std::launch::async, []()
  {
    std::this_thread::sleep_for(std::chrono::seconds(10));
    return 8;
  });
 
  // update loop for the displayed data
  while (true)
  {
    // print some information to the terminal
    std::cout << "Rendering Data" << std::endl;
    int val = myFuture.get(); // this call blocks for 10 seconds
    // do something with val
  }
 
  return 0;
}
 




object before calling get(). That way we neither block on the completion of the asynchronous task nor query a std::future whose result has already been retrieved.

This code snippet allows you to achieve this:

#include "stdafx.h"
#include <iostream>
#include <future>
#include <thread>
#include <chrono>
 
int main()
{
  std::future<int> myFuture = std::async(std::launch::async, []()
  {
    std::this_thread::sleep_for(std::chrono::seconds(10));
    return 8;
  });
 
  // update loop for the displayed data
  while (true)
  {
    // print some information to the terminal
    std::cout << "Rendering Data" << std::endl;
 
    if (myFuture.valid())
    {
      int val = myFuture.get(); // this call blocks for 10 seconds
 
      // do something with val
    }
  }
 
  return 0;
}
 


Mistake #16: Not understanding that an exception thrown inside an asynchronous operation is propagated to the calling thread only when std::future::get() is called


Imagine we have the first code fragment below: what do you think will be the result of calling std::future::get()? If you guessed that the program will crash, you are absolutely right! The exception thrown inside the asynchronous operation is rethrown only when get() is called on the std::future object. If get() is never called, the exception is ignored and discarded when the std::future object goes out of scope. So if your asynchronous operation can throw, you should always wrap the call to std::future::get() in a try/catch block. The second listing shows how this might look:

#include "stdafx.h"
#include <iostream>
#include <future>
#include <stdexcept>
 
int main()
{
  std::future<int> myFuture = std::async(std::launch::async, []()
  {
    throw std::runtime_error("Catch me in MAIN");
    return 8;
  });
 
  if (myFuture.valid())
  {
    int result = myFuture.get();
  }
 
  return 0;
}
 








#include "stdafx.h"
#include <iostream>
#include <future>
#include <stdexcept>
 
int main()
{
  std::future<int> myFuture = std::async(std::launch::async, []() 
  {
    throw std::runtime_error("Catch me in MAIN");
    return 8;
  });
 
  if (myFuture.valid())
  {
    try
    {
      int result = myFuture.get();
    }
    catch (const std::runtime_error& e)
    {
       std::cout << "Async task threw exception: " << e.what() << std::endl;
    }
  }
  return 0;
}
 


Mistake #17: Using std::async when you need precise control over thread execution


Although std::async() is sufficient in most cases, there are situations where you need fine-grained control over the thread executing your code. For example, you may want to pin a particular thread to a specific CPU core in a multiprocessor system (such as an Xbox).

The code fragment below sets the thread's ideal processor to core 5 of the system. This is possible thanks to the native_handle() method of the std::thread object, whose result is passed to a Win32 thread API function. There are many other capabilities exposed through the Win32 thread API that are not available in std::thread or std::async(). When working through

#include "stdafx.h"
#include <windows.h>
#include <iostream>
#include <thread>
 
using namespace std;
 
void LaunchRocket()
{
  cout << "Launching Rocket" << endl;
}
 
int main()
{
  thread t1(LaunchRocket);
 
  DWORD result = ::SetThreadIdealProcessor(t1.native_handle(), 5);
 
  t1.join();
 
  return 0;
}
 


std::async(), these underlying platform capabilities are not available, which makes it unsuitable for such more demanding tasks.

An alternative is to create a std::packaged_task and move it onto the desired thread of execution after setting that thread's properties.

Mistake #18: Creating many more "running" threads than there are available cores


From an architectural point of view, threads can be classified into two groups: "running" and "waiting".

Running threads utilize 100% of the CPU time of the core they run on. When more than one running thread is scheduled onto a single core, CPU utilization efficiency drops: we gain no performance from running several such threads on one core, and in fact performance falls because of the extra context switches.

Waiting threads consume only a few cycles of the core they run on while waiting for system events, network I/O, and so on, leaving most of the core's available CPU time unused. One waiting thread can process data while the others wait for their events to fire, which is why it pays to assign several waiting threads to one core. Scheduling multiple waiting threads per core can yield much greater program throughput.

So how do you find out how many running threads the system supports? Use std::thread::hardware_concurrency(). This function usually returns the number of processor cores, counting cores that behave as two or more logical cores due to hyper-threading.

Use the value obtained on your target platform to plan the maximum number of concurrently running threads in your program. You can also dedicate one core to all waiting threads and use the remaining cores for running threads. For example, on a quad-core system, use one core for ALL waiting threads and the remaining three cores for three running threads. Depending on the efficiency of your thread scheduler, some of your running threads may get context-switched out (due to page faults and the like), leaving a core idle for some time. If you observe this situation while profiling, create slightly more running threads than the number of cores and tune that value for your system.

Mistake #19: Using the volatile keyword for synchronization


The volatile keyword, applied to a variable's type, does not make operations on that variable atomic or thread-safe. What you probably want is std::atomic.

See the discussion on stackoverflow for more details.

Mistake #20: Using lock-free architecture unless absolutely necessary


There is something about complexity that appeals to every engineer. Lock-free programming sounds very tempting compared to conventional synchronization mechanisms such as mutexes, condition variables, and async primitives. However, every experienced C++ developer I have spoken to shares the opinion that reaching for lock-free programming as a first resort is a form of premature optimization, one that can bite you at the worst possible moment (think of a failure in production when you do not have a full heap dump!).

In my C++ career there has been only one situation that truly required lock-free code: we were working on a resource-constrained system where each transaction in our component had to take no more than 10 microseconds.

Before considering a lock-free approach, please answer three questions:

  • Have you tried to design your system architecture so that it does not need a synchronization mechanism at all? As a rule, the best synchronization is no synchronization.
  • If you do need synchronization, have you profiled your code to understand its performance characteristics? If so, have you tried to optimize the bottlenecks?
  • Can you scale horizontally instead of vertically?

In summary, for ordinary application development, please consider lock-free programming only when you have exhausted all other alternatives. Another way to look at it: if you are still making some of the 19 mistakes above, you should probably stay away from lock-free programming.

[Translator's note: many thanks to vovo4K for helping me prepare this article.]
