podkolzzzin May 23, 2019 at 11:22

.NET: Tools for working with multithreading and asynchrony. Part 1

I publish the original article on Habr, the translation of which is posted on the Codingsight blog .
The second part is available here.

The need to do something asynchronously, without waiting for the result here and now, or to share a lot of work between several units performing it, was even before the advent of computers. With their appearance, such a need has become very tangible. Now, in 2019, typing this article on a laptop with an 8-core Intel Core processor, on which not one hundred processes work at the same time, but even more threads. Next to it lies a slightly battered phone, bought a couple of years ago, with an 8-core processor on board. The thematic resources are full of articles and videos where their authors admire this year's flagship smartphones where they put 16-core processors. For less than $ 20 / hour, MS Azure provides a virtual machine with 128 core processors and 2 TB RAM.

Terminology

Process - An OS object, an isolated address space, contains threads.
Thread (Thread) - an OS object, the smallest unit of execution, part of a process, threads share memory and other resources among themselves within the process.
Multitasking is a property of the OS, the ability to execute several processes at the same time.
Multicore is a property of a processor, the ability to use several cores for data processing.
Multiprocessing is a property of a computer, the ability to simultaneously work with several processors physically.
Multithreading is a property of a process, the ability to distribute data processing between multiple threads.
Parallelism- performing several actions physically at the same time per unit of time
Asynchrony - performing an operation without waiting for the completion of this processing, the result of the execution can be processed later.

Metaphor

Not all definitions are good and some need additional explanation, so I will add a metaphor for cooking breakfast to the formally introduced terminology. Cooking breakfast in this metaphor is a process.

Cooking breakfast in the morning I ( CPU ) come to the kitchen ( Computer ). I have 2 hands ( Cores ). The kitchen has a number of devices ( IO ): oven, kettle, toaster, refrigerator. I turn on the gas, put a frying pan on it and pour oil in there, without waiting until it warms up ( asynchronously, Non-Blocking-IO-Wait ), I take the eggs out of the refrigerator and break them into a plate, and then beat them with one hand ( Thread # 1 ), and the second ( Thread # 2) I hold the plate (Shared Resource). Now I would still turn on the kettle, but there are not enough hands ( Thread Starvation ) During this time, the frying pan is heated (Processing the result) where I pour what I whipped. I reach for the kettle and turn it on and stupidly watch how the water in it boils ( Blocking-IO-Wait ), although I could wash the plate during this time, where I beat the omelet.

I cooked an omelet using only 2 hands, and I don’t have more, but at the same time, 3 operations took place at the moment of whipping an omelet: whipping an omelet, holding a plate, heating a frying pan. CPU is the fastest part of the computer, IO is that more often slows down everything, so often an effective solution is to take something CPU while receiving data from IO.

Continuing the metaphor:

If in the process of preparing an omelet, I would also try to change clothes, this would be an example of multitasking. An important nuance: computers with this are much better than people.
A kitchen with several chefs, for example in a restaurant, is a multi-core computer.
Many food court restaurants in a shopping center - data center

.NET Tools

In working with threads, as in many other things, .NET is good. With each new version, he presents more and more new tools for working with them, new layers of abstraction over OS threads. In working with the construction of abstractions, the framework developers use the approach that leaves the possibility when using high-level abstraction, it will go down one or several levels below. Most often this is not necessary, moreover, this opens up the possibility of a shotgun being shot in the foot, but sometimes, in rare cases, this may be the only way to solve a problem that does not solve at the current level of abstraction.

By tools, I mean both the program interfaces (APIs) provided by the framework and third-party packages, and a whole software solution that simplifies the search for any problems associated with multi-threaded code.

Stream start

The Thread class, the most basic class in .NET for working with threads. The constructor accepts one of two delegates:

ThreadStart - No Parameters
ParametrizedThreadStart - with one parameter of type object.

The delegate will be executed in the newly created thread after calling the Start method, if a delegate of the ParametrizedThreadStart type was passed to the constructor, then an object must be passed to the Start method. This mechanism is needed to transfer any local information to the stream. It is worth noting that creating a stream is an expensive operation, and the stream itself is a heavy object, at least because 1MB of memory is allocated to the stack, and requires interaction with the OS API.

new Thread(...).Start(...);

The ThreadPool class represents the concept of a pool. In .NET, the thread pool is a work of art and developers from Microsoft have put a lot of effort into making it work optimally in a wide variety of scenarios.

General concept:

From the start, the application in the background creates several threads in reserve and provides the opportunity to take them into use. If threads are used frequently and in large numbers, the pool expands to meet the need of the calling code. When there are no free flows in the pool at the right time, it will either wait for one of the flows to return or create a new one. It follows that the thread pool is great for some short actions and poorly suited for operations that operate as a service throughout the entire application.

To use a thread from the pool, there is a QueueUserWorkItem method that accepts a WaitCallback delegate, which is the same signature as ParametrizedThreadStart, and the parameter passed to it performs the same function.

ThreadPool.QueueUserWorkItem(...);

The lesser-known thread pool method RegisterWaitForSingleObject is used to organize non-blocking IO operations. The delegate passed to this method will be called when the WaitHandle passed to the method is “Released”.

ThreadPool.RegisterWaitForSingleObject(...)

.NET has a stream timer and it differs from WinForms / WPF timers in that its handler will be called in a stream taken from the pool.

System.Threading.Timer

There is also a rather exotic way to send a delegate to the thread from the pool - the BeginInvoke method.

DelegateInstance.BeginInvoke

I also want to dwell in passing on a function that calls many of the above methods - CreateThread from Kernel32.dll Win32 API. There is a way, thanks to the mechanism of extern methods, to call this function. I saw such a challenge only once in a terrible example of legacy code, and the motivation of the author to do just that is still a mystery to me.

Kernel32.dll CreateThread

View and debug threads

The threads you created personally by all third-party components and the .NET pool can be viewed in the Threads Visual Studio window. This window will display information about flows only when the application is under debugging and in break mode (Break mode). Here you can conveniently view the stack names and priorities of each thread, switch debugging to a specific thread. The Priority property of the Thread class allows you to set the priority of the thread, which OC and CLR will perceive as a recommendation when dividing CPU time between threads.

Task parallel library

Task Parallel Library (TPL) appeared in .NET 4.0. Now it is the standard and the main tool for working with asynchrony. Any code using an older approach is considered legacy. The basic unit of TPL is the Task class from the System.Threading.Tasks namespace. Task is an abstraction over a thread. With the new version of C #, we got an elegant way to work with Task - async / await operators. These concepts made it possible to write asynchronous code as if it were simple and synchronous, this made it possible even for people with little understanding of the internal kitchen of threads to write applications that use them, applications that do not freeze during long operations. Using async / await is a topic for one or even several articles, but I will try to get the gist of a few sentences:

async is a modifier of the method returning Task or void
and await is the Task non-blocking wait statement.

Once again: the await operator, in the general case (there are exceptions), will release the current thread of execution further, and when the Task finishes its execution, and the thread (in fact it is more correct to say the context, but more on that later) will be free to continue the method further. Inside .NET, this mechanism is implemented in the same way as yield return, when a written method turns into a whole class, which is a state machine and can be executed in separate pieces depending on these states. Anyone interested can write any simple code using asynс / await, compile and view the assembly using JetBrains dotPeek with Compiler Generated Code enabled.

Consider the options for launching and using Task. Using the example code below, we create a new task that does nothing useful (Thread.Sleep (10000) ), but in real life it should be some kind of complex CPU-involving work.

using TCO = System.Threading.Tasks.TaskCreationOptions;
public static async void VoidAsyncMethod() {
    var cancellationSource = new CancellationTokenSource();
    await Task.Factory.StartNew(
        // Code of action will be executed on other context
        () => Thread.Sleep(10000),
        cancellationSource.Token,
        TCO.LongRunning | TCO.AttachedToParent | TCO.PreferFairness,
        scheduler
    );
    //  Code after await will be executed on captured context
}

Task is created with a number of options:

LongRunning is a hint that the task will not be completed quickly, which means that it might be worth considering not to take a thread from the pool, but to create a separate one for this Task so as not to harm the others.
AttachedToParent - Task 's can be arranged in a hierarchy. If this option was used, then the Task may be in a state when it has completed itself and is waiting for the children to complete.
PreferFairness - means that it would be nice to execute the tasks sent earlier for execution before those that were sent later. But this is just a recommendation and the result is not guaranteed.

The second parameter to the method passed CancellationToken. In order to correctly process the cancellation of an operation after its launch, the executed code must be filled with status checks of CancellationToken. If there are no checks, then the Cancel method called on the CancellationTokenSource object will be able to stop the execution of the Task only before it starts.

The last parameter passed the scheduler object of type TaskScheduler. This class and its descendants are designed to control the strategies for distributing Task'ov by thread, by default, Task will be executed on a random thread from the pool.

The await operator is applied to the created Task, which means the code written after it, if any, will be executed in the same context (often this means that it is on the same thread) as the code before await.

The method is marked as async void, which means that you can use the await operator in it, but the calling code cannot wait for execution. If this feature is necessary, then the method should return Task. Methods marked async void are quite common: as a rule, these are event handlers or other methods that work on the principle of fire and forget. If you need to not only give the opportunity to wait until the completion of execution, but also return the result, then you must use Task.

On the Task that the StartNew method returned, however, as on any other, you can call the ConfigureAwait method with the false parameter, then execution after await will continue not on the captured context, but on an arbitrary one. This should always be done when the execution context is not important for the code after await. It is also a recommendation from MS when writing code that it will come packaged in a library form.

Let's dwell a little more on how you can wait until the completion of the Task. Below is a sample code, with comments, when the wait is done conditionally good and when conditionally bad.

public static async void AnotherMethod() {
    int result = await AsyncMethod(); // good
    result = AsyncMethod().Result; // bad
    AsyncMethod().Wait(); // bad
    IEnumerable tasks = new Task[] {
        AsyncMethod(), OtherAsyncMethod()
    };
    await Task.WhenAll(tasks); // good
    await Task.WhenAny(tasks); // good
    Task.WaitAll(tasks.ToArray()); // bad
}

In the first example, we wait for the Task to complete and without blocking the calling thread, we will return to processing the result only when it is already there, until the calling thread is left to itself.

In the second option, we block the calling thread until the result of the method is calculated. This is bad not only because we took the thread, such a valuable resource of the program, with simple idleness, but also because if the method code that we call has await, and the synchronization context involves returning to the calling thread after await, then we will get deadlock : the calling thread waits until the result of the asynchronous method is calculated, the asynchronous method tries in vain to continue its execution in the calling thread.

Another drawback of this approach is the complicated error handling. The fact is that errors in asynchronous code when using async / await are very easy to handle - they behave as if the code was synchronous. While, if we apply ~~exorcism,~~ synchronous expectation to Task, the original exception turns into an AggregateException, i.e. To handle an exception, you will have to examine the InnerException type and write the if chain inside one catch block or use the catch when construct instead of the more familiar catch block chain in C #.

The third and last examples are also marked bad for the same reason and contain all the same problems.

WhenAny and WhenAll methods are extremely convenient in waiting for a group of Task'ov, they wrap a group of Task'ov in one, which will work either on the first operation of Task'a from the group, or when everyone completes their execution.

Flow stop

For various reasons, it may be necessary to stop the stream after it starts. There are a number of ways to do this. The Thread class has two methods with appropriate names - Abort and Interrupt . The first is not recommended for use, as after it is called at any random moment, during the processing of any instruction, a ThreadAbortedException will be thrown . You don’t expect such an exception to crash when incrementing an integer variable, right? And when using this method, this is a very real situation. If it is necessary to prevent the CLR from throwing such an exception in a certain section of the code, you can wrap it in calls to Thread.BeginCriticalRegion ,Thread.EndCriticalRegion . Any code written in a finally block is wrapped with such calls. For this reason, in the bowels of the framework code, you can find blocks with an empty try, but not an empty finally. Microsoft so does not recommend using this method that they did not include it in .net core.

The Interrupt method works more predictably. It can interrupt a thread with the exception of ThreadInterruptedException only when the thread is in the idle state. In this state, it goes into suspension while waiting for WaitHandle, lock, or after calling Thread.Sleep.

Both of the options described above are bad for their unpredictability. The solution is to use the CancellationToken structure and the CancellationTokenSource class.. The bottom line is: an instance of the class CancellationTokenSource is created and only the person who owns it can stop the operation by calling the Cancel method . Only the CancellationToken is passed to the operation itself. Owners of the CancellationToken cannot cancel the operation themselves, but can only check whether the operation has been canceled. To do this, there is a Boolean property IsCancellationRequested and the ThrowIfCancelRequested method . The latter will raise a TaskCancelledException if the Cancel method is called on the canceled CancellationToken instance of the CancellationTokenSource. And it is this method that I recommend using. This is better than the previous options by gaining full control over at what points the exception operation can be interrupted.

The most cruel option to stop the thread is to call the Win32 API TerminateThread function. The behavior of the CLR after calling this function can be unpredictable. On MSDN, the following is written about this function: “TerminateThread is a dangerous function that should only be used in the most extreme cases. “

Convert legacy-API to Task Based using FromAsync method

If you were lucky enough to work on a project that was started after the tasks were introduced and ceased to cause quiet horror for most developers, then you won’t have to deal with a lot of old APIs, both third-party and your team’s tortured in the past. Fortunately, the .NET Framework development team took care of us, although perhaps the goal was to take care of ourselves. Be that as it may, .NET has a number of tools for painlessly converting code written in old asynchronous programming approaches to a new one. One of them is the FromAsync method of TaskFactory. Using the code example below, I wrap the old asynchronous methods of the WebRequest class in Task using this method.

object state = null;
WebRequest wr = WebRequest.CreateHttp("http://github.com");
await Task.Factory.FromAsync(
    wr.BeginGetResponse,
    we.EndGetResponse
);

This is just an example and you are unlikely to do this with built-in types, but any old project is simply teeming with BeginDoSomething methods that return IAsyncResult and EndDoSomething methods that accept it.

Convert legacy-API to Task Based using TaskCompletionSource class

Another important tool to consider is the TaskCompletionSource class . In terms of functions, purpose and principle of operation, it can somehow remind the RegisterWaitForSingleObject method of the ThreadPool class about which I wrote above. Using this class, you can easily and conveniently wrap old asynchronous APIs in Task.

You will say that I already spoke about the FromAsync method of the TaskFactory class intended for these purposes. Here we will have to recall the entire history of the development of asynchronous models in .net that Microsoft has offered over the past 15 years: before the Task-Based Asynchronous Pattern (TAP) there were Asynchronous Programming Pattern (APP), which was about Begin DoSomething methods returning IAsyncResult and End methodsDoSomething is its host and the FromAsync method is just fine for the legacy of these years, but over time, it was replaced by Event Based Asynchronous Pattern ( EAP ), which assumed that an event would be triggered when the asynchronous operation was completed.

TaskCompletionSource is just great for wrapping in Task and legacy-API built around the event model. The essence of his work is as follows: an object of this class has a public property of type Task whose state can be controlled through the methods SetResult, SetException, etc. of the TaskCompletionSource class. In places where the await operator was applied to this Task, it will be executed or crashed with an exception, depending on the method applied to the TaskCompletionSource. If everything is still not clear, then let's look at this code example, where some old EAP API is wrapped in Task using TaskCompletionSource: when the event is fired, the Task will be transferred to the Completed state, and the method that applied the await operator to this Task will resume execution getting the result object .

public static Task DoAsync(this SomeApiInstance someApiObj) {
    var completionSource = new TaskCompletionSource();
    someApiObj.Done += 
        result => completionSource.SetResult(result);
    someApiObj.Do();
    result completionSource.Task;
}

TaskCompletionSource Tips & Tricks

Wrapping older APIs is not all you can do with TaskCompletionSource. Using this class opens up an interesting possibility of designing various APIs on tasks that do not occupy threads. And the flow, as we recall, is an expensive resource and their number is limited (mainly by RAM). This limitation is easily achieved by developing, for example, a loaded web-application with complex business logic. Consider the possibilities that I’m talking about implementing such a trick as Long-Polling.

In short, the gist of the trick is this: you need to get information from the API about some events occurring on its side, while the API for some reason can not report the event, but can only return the state. An example of such is all APIs built on top of HTTP before the times of WebSocket or when it is not possible for some reason to use this technology. The client may ask the HTTP server. An HTTP server cannot itself provoke communication with a client. A simple solution is to interrogate the server by timer, but this creates additional load on the server and an additional delay on average TimerInterval / 2. To work around this, a trick called Long Polling was invented, which involves delaying the response from the server until Timeout expires or an event will happen. If an event has occurred, then it is processed; if not,

while(!eventOccures && !timeoutExceeded)  {
  CheckTimout();
  CheckEvent();
  Thread.Sleep(1);
}

But such a solution will show itself terribly as soon as the number of clients waiting for the event grows, because Each such client, in anticipation of the event, takes up a whole stream. Yes, and we get an additional delay of 1ms on the triggering of the event, most often it is not significant, but why make the software worse than it can be? If you remove Thread.Sleep (1), then in vain we will load one processor core at 100% idle, spinning in a useless cycle. Using TaskCompletionSource, you can easily redo this code and solve all the problems identified above:

class LongPollingApi {
    private Dictionary> tasks;
    public async Task AcceptMessageAsync(int userId, int duration) {
        var cs = new TaskCompletionSource();
        tasks[userId] = cs;
        await Task.WhenAny(Task.Delay(duration), cs.Task);
        return cs.Task.IsCompleted ? cs.Task.Result : null;
    }
    public void SendMessage(int userId, Msg m) {
        if (tasks.TryGetValue(userId, out var completionSource))
            completionSource.SetResult(m);
    }
}

This code is not production-ready, but just a demo. To use it in real cases, you need to at least handle the situation when a message arrives at a time when no one is expecting it: in this case, the AsseptMessageAsync method should return an already completed Task. If this case is the most frequent, then you can think about using ValueTask.

Upon receipt of a request for a message, we create and place TaskCompletionSource in the dictionary, and then we wait for what happens first: the specified time interval expires or a message is received.

ValueTask: why and how

Async / await operators, like the yield return operator, generate a state machine from the method, which is creating a new object, which is almost always not important, but in rare cases it can create a problem. This case may be a method called really often, talking about tens and hundreds of thousands of calls per second. If such a method is written so that in most cases it returns a result bypassing all await methods, then .NET provides a tool to optimize this - the ValueTask structure. To make it clear, consider an example of its use: there is a cache that we go to very often. There are some values in it and then we just return them, if not, then we go to some slow IO behind them. I want to do the latter asynchronously, which means the whole method is asynchronous. Thus, the obvious way to write a method is as follows:

public async Task GetById(int id) {
    if (cache.TryGetValue(id, out string val))
        return val;
    return await RequestById(id);
}

Because of the desire to optimize a little, and a slight fear of what Roslyn will generate by compiling this code, we can rewrite this example as follows:

public Task GetById(int id) {
    if (cache.TryGetValue(id, out string val))
        return Task.FromResult(val);
    return RequestById(id);
}

Indeed, the optimal solution in this case is to optimize the hot-path, namely, getting the value from the dictionary without any extra allocations and load on the GC, while in those rare cases when we still need to go to the IO, everything will remain plus / minus old:

public ValueTask GetById(int id) {
    if (cache.TryGetValue(id, out string val))
        return new ValueTask(val);
    return new ValueTask(RequestById(id));
}

Let's take a closer look at this code fragment: if there is a value in the cache, we create a structure, otherwise the real task will be wrapped in a significant one. The calling code doesn’t care in what way this code was executed: ValueTask from the point of view of C # syntax will behave just like the usual Task in this case.

TaskSchedulers: Managing Task Launch Strategies

The next API that I would like to consider is the TaskScheduler class and its derivatives. I already mentioned above that in TPL there is the ability to control the strategies for distributing Task'ov by thread. Such strategies are defined in the descendants of the TaskScheduler class. Almost any strategy that may be needed will be found in the ParallelExtensionsExtras library , developed by microsoft, but not part of .NET, but delivered as a Nuget package. Let's briefly consider some of them:

CurrentThreadTaskScheduler - Performs Task on the current thread
LimitedConcurrencyLevelTaskScheduler - limits the number of simultaneously executed tasks to the N parameter, which is accepted in the constructor
OrderedTaskScheduler - defined as LimitedConcurrencyLevelTaskScheduler (1), therefore, tasks will be performed sequentially.
WorkStealingTaskScheduler - implements a work-stealing approach to task distribution. Essentially a separate ThreadPool. It solves the problem that in .NET ThreadPool is a static class, one for all applications, which means that overloading or improper use in one part of the program can lead to side effects in another. Moreover, it is extremely difficult to understand the cause of such defects. T.O. there may be a need to use separate WorkStealingTaskSchedulers in parts of the program where the use of ThreadPool can be aggressive and unpredictable.
QueuedTaskScheduler - allows you to perform tasks according to the queue rules with priorities
ThreadPerTaskScheduler - creates a separate thread for each Task that runs on it. It can be useful for tasks that run unpredictably long.

There is a good detailed article on TaskSchedulers on the microsoft blog.

For convenient debugging of everything related to Tasks in Visual Studio there is a Tasks window. In this window, you can see the current status of the task and go to the currently executing line of code.

PLinq and the Parallel class

In addition to Task and everything that was said with them in .NET, there are two more interesting tools: PLinq (Linq2Parallel) and the Parallel class. The first promises parallel execution of all Linq operations on multiple threads. The number of threads can be configured with the WithDegreeOfParallelism extension method. Unfortunately, most often PLinq in the run mode by default will not have enough information about the insides of your data source to provide a significant speed gain, on the other hand, the attempt price is very low: you just need to call the AsParallel method in front of the Linq method chain and conduct performance tests. Moreover, it is possible to transfer to PLinq additional information about the nature of your data source using the Partitions mechanism. You can read more here and here..

The Parallel static class provides methods for iterating over a Foreach collection in parallel, executing a For loop, and executing multiple delegates in parallel to Invoke. The execution of the current thread will be stopped until the end of the calculations. The number of threads can be configured by passing ParallelOptions as the last argument. Using options, you can also specify TaskScheduler and CancellationToken.

conclusions

When I began to write this article based on the materials of my report and the information that I collected during the work after it, I did not expect that it would turn out so much. Now, when the text editor in which I am typing this article reproachfully tells me that the 15th page has gone, I will summarize the intermediate results. Other tricks, APIs, visual tools, and pitfalls will be discussed in a future article.

Conclusions:

You need to know the tools for working with threads, asynchrony and parallelism in order to use the resources of modern PCs.
.NET has many different tools for this purpose.
Not all of them appeared at once, because legacy can often be found, however there are ways to convert old APIs without much effort.
Work with threads in .NET is represented by the classes Thread and ThreadPool
Методы Thread.Abort, Thread.Interrupt, функция Win32 API TerminateThread опасны и не рекомендуются к использованию. Вместо них лучше использовать механизм CancellationToken’ов
Поток — ценный ресурс, их количество ограничено. Нужно избегать ситуаций, когда потоки заняты ожиданием событий. Для этого удобно использовать класс TaskCompletionSource
Наиболее мощным и продвинутым инструментов .NET для работы с параллелизмом и асинхронностью являются Task’и.
Операторы c# async/await реализуют концепцию неблокирующего ожидания
Управлять распределением Task’ов по потокам можно с помощью производных TaskScheduler’у классов
Структура ValueTask может быть полезна в оптимизации hot-paths и memory-traffic
Окна Tasks и Threads Visual Studio предоставляют много полезной для отладки многопоточного или асинхронного кода информации
PLinq is a cool tool, but it may not have enough information about your data source, however this can be fixed using the partitioning mechanism.
To be continued ...

Tags: