Threads Are the Goto of Parallel Programming

    Let me state the point of the heading right away. Using threads and the tools for manipulating them directly (creation, destruction, synchronization) to write parallel applications has the same detrimental effect on the complexity of algorithms, the quality of the code, and the speed of debugging that the goto operator once had on sequential programs.
    Just as programmers once abandoned unstructured jumps, we now need to abandon the direct use of threads. And just as each of us uses structured blocks instead of goto, we should use structures built on top of threads instead of threads themselves. Fortunately, all the tools for this have appeared in quite traditional languages.
    Photo by: Rainer Zenz


    First, a bit of history and links to earlier discussions.

    Goto considered harmful


    Probably the most authoritative nail in the coffin of the ill-fated operator was driven in by Edsger Dijkstra in his short 1968 letter "A Case against the GO TO Statement", better known as "Go To Statement Considered Harmful."

    On Habré, the topic of using (or banishing) goto in high-level languages has come up repeatedly:
    habrahabr.ru/post/114211
    habrahabr.ru/post/114470
    habrahabr.ru/post/114326
    Undoubtedly, the existence of goto is a source of endless flame wars. However, modern "general-purpose" languages, starting roughly with Java, no longer include goto in their syntax, at least not in its original form.

    Where goto still lives on

    Let me note one frequently used but rarely mentioned application of jumps to labels, one that concerns me personally: assembly languages and machine code. Almost all microprocessor architectures have instructions for conditional and unconditional jumps, and I cannot recall an assembly language in which a for or while statement is implemented in hardware. As a result, programmers working at this level of abstraction are forced to deal with the whole tangle of non-local jumps. Dijkstra has a remark about this: "...goto must be banished from all high-level languages (that is, from everywhere except, perhaps, plain machine code)" [in the original: "everything except —perhaps— plain machine code"].

    I will omit the well-known arguments against goto; anyone can find them at the links above. I will state the conclusion right away, as I understand it: using goto significantly lowers the "level" of the code, burying the algorithm in the details of a sequential implementation. Now, on to threads.

    What is the problem with threads?


    To formulate what trouble to expect from threads, let us turn to Edward A. Lee's article "The Problem with Threads". Its author introduced some formalism (superfluous, in my opinion) to explain the following fact: the direct use of threads requires analyzing all possible interleavings of the elementary operations that make up the individual threads of execution. The number of such combinations grows like an avalanche as the application grows and quickly exceeds both human comprehension and the capabilities of analysis tools. That is, fully debugging such a parallel program becomes impossible, to say nothing of a formal proof of correctness.
    Beyond this crucial point, thread-based programming (with Pthreads, for example) is suboptimal both for programmer productivity and for the performance of the resulting application.
    1. Lack of composability. When calling a library function from a thread, you cannot tell without analyzing its code whether it will spawn several more parallel threads of execution and thereby exceed the capacity of the hardware (so-called oversubscription).
    2. Concurrency expressed with threads cannot be made optional. It is always present and hardwired into the program logic, even though in reality two related tasks do not always have to run simultaneously; often the decision should be made dynamically, taking into account the current situation and the available resources.
    3. Difficulty of load balancing. Even a slight skew in the speeds of different threads can significantly degrade the performance of the whole application ("the caravan moves at the speed of the slowest camel"). All the work of keeping the hardware evenly loaded falls on the application programmer, who may not have enough information about the state of the system; and, frankly, it is not their job: they should be solving the application problem.


    The conclusion almost literally repeats the one made a little earlier: using threads significantly lowers the "level" of the code, burying the algorithm in the details of a parallel implementation. "Manual control" of threads in a program written in a high-level language exposes many details of the underlying hardware that one would rather not see.

    Well, if not threads, then what?


    How can we use the capabilities of multicore hardware without resorting to threads? There are, of course, programming languages that were designed from the start for writing parallel programs efficiently: Erlang, for one, and the functional languages. If extreme scalability is needed, the answer should be sought there, in the mechanisms they offer. But what about programmers who use more traditional languages, such as C++, and/or work with existing code?

    OpenMP is good, but it is not the answer

    For quite a long time, neither C nor C++ (unlike, say, the younger Java) reflected the presence of parallelism in programs in any way; it was effectively delegated to "third-party" libraries such as Pthreads. OpenMP has long been known as a way to introduce structured fork-join parallelism into these languages, as well as into Fortran. In my opinion, though, this standard does not solve the thread problems listed above; that is, OpenMP is still too low-level a mechanism. The latest revision of the standard did not raise the level of abstraction; instead it added features (and difficulties) for those who want to use OpenMP to run code on heterogeneous systems (version 4.0 has already been covered in more detail on Habré).

    Extensions and Libraries

    Between the new languages, which try to support parallelism from the start, and the traditional languages, which ignore it entirely, lie extensions, which attempt to add the necessary abstractions and fix them in the syntax, and libraries, which wrap the solutions in existing language concepts (such as subroutine calls). In theory, language extensions can achieve better results than libraries, because they break out of the limitations of the source language by creating a new one. But such extensions rarely gain popularity with a wide audience; recognition often comes only after the extension is standardized as part of the language.

    Many companies, universities, and combinations thereof work on language extensions and libraries, including ones for parallel programming. Intel, naturally, has examples of both, and they have been mentioned on Habré more than once: Intel Cilk Plus and Intel Threading Building Blocks. In my opinion, Cilk (Plus) is more interesting than TBB as a means of raising the level of abstraction for parallelism. I am glad that it has support in GCC.

    C++11

    In the latest C++ standards, the parallel nature of modern computing has finally gained recognition; the possibility that code may execute concurrently with something else is taken into account in the description of many language constructs and standard classes. Moreover, the programmer can choose from a wide range of abstraction levels: from direct manipulation of threads via std::thread, through asynchronous calls via std::packaged_task, to asynchronous/lazy calls via std::async. Much of the work of making this machinery function correctly shifts from third-party libraries to the standard library that ships with a compiler implementing the new standard. An open question (for me, at least): are there C++11 implementations that already provide all three properties of high-level parallelism (composability, optionality, and load balancing) and thereby free the application programmer from these worries?

    What else to read


    Finally, I want to share one book. Its main idea, for me, is that an understanding of the structure of parallel applications must be built into the design process. Moreover, students should be taught this as early as possible, at about the same time they are told why "goto is bad."

    Michael McCool, Arch Robison, James Reinders. Structured Parallel Programming. 2012. parallelbook.com.

    The book, in particular, shows solutions to the same problems using several parallel programming libraries and languages: Intel Cilk Plus, OpenMP, Intel TBB, OpenCL, and Intel ArBB. This makes it possible to compare the expressiveness and efficiency of these approaches on a variety of practical problems.

    Thank you for your attention!