alan008 May 28, 2012 at 11:48

OmniThreadLibrary: Simple Multithreading in Delphi

Writing an interesting article on a technical topic is hard: you have to avoid both getting lost in the technical jungle and saying nothing at all. Today I will try to describe, in general terms and without going into detail, the state of multithreaded desktop application development in Delphi, an environment that is not especially popular today but is certainly familiar to many Russian developers. The article is aimed at readers who are not new to programming but are new to writing multithreaded applications.

The topic in the title is vast. Everything written below is not even the tip of the iceberg; it is more like a flight at 10,000 meters above the ocean in which those icebergs float. Why write such an article? Mainly to draw attention to powerful capabilities that have long been available but that many developers, for some reason, are wary of and avoid.

Why Delphi?

I have been programming in Delphi for a very long time and still enjoy it. It is in many ways a wonderful language. What makes it unusual is that it lets you write code at an arbitrarily high level while staying "close to the hardware", because the output is a native application rather than code for the Java or .NET virtual machine. At the same time, the Delphi language is simple and concise, and Delphi code is pleasant to read and easy to understand, which I cannot say about code in C or C++ (with all due respect to C developers, and although some will say it is just a matter of habit).
Today Delphi has lost its former popularity. This probably happened because in the 2000s the product was practically abandoned by its developers for several years and dropped out of the race among development environments for a while. Indeed, after Delphi 7, released by Borland in 2002, the next more or less stable product appeared only in 2007: CodeGear Delphi 2007, released by CodeGear, a Borland subsidiary. All versions between Delphi 7 and Delphi 2007 were practically unusable. In 2008 Borland sold the CodeGear division to Embarcadero Technologies, which (special thanks to them for that!) immediately set about turning what it had acquired into a modern, high-quality development environment. The current version of Delphi at the time of writing is Embarcadero Delphi XE2, released in September 2011. Thanks to the fairly high quality of the latest versions, Delphi is gradually winning back its lost ground.

Why do we need multithreading?

People wanted to do several things on a computer at the same time. This is called multitasking, and it is implemented by the operating system. But if the OS can run several applications at once, why shouldn't a single application also perform several tasks at once internally? For example, when archiving a large list of files, an archiver can read the next file while it compresses the file already in memory and writes the previous result to the output file on disk. That is, instead of performing "read" -> "compress" -> "write the result to disk" sequentially for each file in a single thread, you can start three threads: the first reads files into memory, the second compresses them, and the third saves the results to disk.
If processors had kept increasing their clock speed at the same pace as in the 90s and early 2000s, nobody would have to bother with multithreading, and we could keep writing classic single-threaded code. In recent years, however, processors have largely stopped increasing the speed of a single core and have started increasing the number of cores instead. To use the full potential of modern processors, multithreading is indispensable.

Why is it difficult to write multithreaded code?

1) It is easy to make a mistake.
When several applications run on a computer at the same time, the operating system reliably isolates the address space (memory) of each process from the others, and it is quite hard to get into someone else's address space. Threads inside one process are the opposite: they all share the process's address space and can modify it freely. In a multithreaded application you therefore have to implement memory protection and thread synchronization yourself, which means writing relatively complex code that does no useful work of its own. Such code is called "boilerplate" (I like to read it as "frying pan": the pan has to be prepared before you can fry anything in it). It is the need to write this non-standard boilerplate code that holds back the spread of multithreaded computing.
2) The code of a multithreaded application is difficult to analyze.
When you look at the code of a multithreaded application, it is not visually obvious whether a particular method can be called (or is called) from different threads. That is, you have to keep in your head which methods may be called from several threads and which may not. Since making absolutely every method thread-safe is not an option, there is always a chance of running into an error by calling a method that is not thread-safe from several threads.
3) A multithreaded application is difficult to debug.
In a multithreaded application, many errors occur only in a particular state of the threads running in parallel (as a rule, with a particular interleaving of the instructions executed in different threads). An interesting example is described here (http://www.thedelphigeek.com/2011/08/multithreading-is-hard.html). Reproducing such a situation artificially is often very difficult, almost impossible. In addition, Delphi does not offer many tools for debugging multithreaded applications; Visual Studio is the clear leader in this respect.
4) In a multithreaded application, it is difficult to handle errors.
If the application has a graphical user interface, only one thread can interact with the user. Usually, when an error occurs in an application, we either handle it internally or show a message to the user. If the error occurs in a background thread, that thread cannot tell the user anything "right away". The error has to be stored until the background thread synchronizes with the main thread, and only then can it be shown to the user. This can lead to a relatively complex and confusing code structure.
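
Points 1 and 4 can be sketched with nothing but the standard Delphi RTL (TThread from Classes and TCriticalSection from SyncObjs), before any library comes into play. This is a minimal sketch under stated assumptions, not production code: TCounterThread, CountWithLock, Counter and CounterLock are hypothetical names invented for the illustration, and a console application stands in for a real GUI program. It shows how much auxiliary code is needed just to increment one shared counter safely and to hold an error until the main thread can report it.

```pascal
program SharedCounterDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, SyncObjs;

type
  // Hypothetical worker thread: increments a shared counter under a lock and
  // stores the text of any exception so the main thread can report it later.
  TCounterThread = class(TThread)
  private
    FIterations: Integer;
    FErrorMessage: string;
  protected
    procedure Execute; override;
  public
    constructor Create(AIterations: Integer);
    property ErrorMessage: string read FErrorMessage;
  end;

var
  Counter: Integer;               // shared by all worker threads
  CounterLock: TCriticalSection;  // protects Counter

constructor TCounterThread.Create(AIterations: Integer);
begin
  FIterations := AIterations;
  inherited Create(False); // not suspended: the thread starts right after construction
end;

procedure TCounterThread.Execute;
var
  i: Integer;
begin
  try
    for i := 1 to FIterations do
    begin
      CounterLock.Acquire;
      try
        Inc(Counter); // without the lock, concurrent updates could be lost
      finally
        CounterLock.Release;
      end;
    end;
  except
    on E: Exception do
      FErrorMessage := E.Message; // keep the error for the main thread
  end;
end;

function CountWithLock(ThreadCount, Iterations: Integer): Integer;
var
  Workers: array of TCounterThread;
  i: Integer;
begin
  Counter := 0;
  CounterLock := TCriticalSection.Create;
  try
    SetLength(Workers, ThreadCount);
    for i := 0 to High(Workers) do
      Workers[i] := TCounterThread.Create(Iterations);
    for i := 0 to High(Workers) do
    begin
      Workers[i].WaitFor; // synchronization point with the main thread
      if Workers[i].ErrorMessage <> '' then
        Writeln('Worker ', i, ' failed: ', Workers[i].ErrorMessage);
      Workers[i].Free;
    end;
    Result := Counter;
  finally
    CounterLock.Free;
  end;
end;

begin
  Writeln(CountWithLock(4, 100000)); // prints 400000
end.
```

Almost every line here is synchronization or bookkeeping; the useful work is the single Inc call.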

Is there any way to simplify my life a little?

Let me introduce OmniThreadLibrary (OTL for short), a library for creating multithreaded applications in Delphi. Its author, Primoz Gabrijelcic from Slovenia, is a first-class professional with many years of experience developing applications in Delphi. OmniThreadLibrary is completely free and open source. By now the library is fairly mature and quite suitable for use in serious projects.

Where can I find OTL information?

In this forum.
In the blog of the author of the library.
On the project page in GoogleCode.

In addition, the author of the library is currently writing a wiki book about OmniThreadLibrary and multithreading; the chapters on most of the high-level OTL primitives are already available.

What features does OTL provide?

The library contains low-level and high-level classes that simplify working with threads, so that you do not have to deal with the details of creating, releasing and synchronizing threads at the WinAPI level.
The high-level primitives are of particular interest. They are notable for being relatively easy to integrate into an existing single-threaded application, with practically no change to the structure of the source code. These primitives let you build multithreaded applications while concentrating on the useful application code rather than on auxiliary thread-management code.
The main high-level primitives are Future (asynchronous function), Pipeline (pipeline), Join (parallel call of several methods), ForkJoin (recursion with parallelism), Async (asynchronous method) and ForEach (parallel loop).
In my opinion, the most interesting and useful primitives are Future and Pipeline, because existing code needs almost no rewriting in order to use them.

Future

This primitive lets you call a function asynchronously and then, at the moment you need it, wait for the calculation to finish and obtain its result. With it, a call to any procedure or function can easily be made asynchronous.
It looks something like this:

uses
  SysUtils, // IntToStr
  Dialogs,  // ShowMessage
  OtlParallel;
...
procedure TestFuture;
var
  vFuture: IOmniFuture<int64>;
begin
  // Start the calculation in a parallel thread
  vFuture := Parallel.Future<int64>(
    function: int64
    var
      i: integer;
    begin
      // The sum of 1..100000 does not fit in a 32-bit integer, hence int64
      Result := 0;
      for i := 1 to 100000 do
        Result := Result + i;
    end
  );
  // Do some other work in the main thread here. Meanwhile the parallel thread is running
  // (it may not have started yet, or it may already have finished; we do not know yet).
  // Now we need the result computed in the parallel thread
  ShowMessage(IntToStr(vFuture.Value));
end;

Note that the call to vFuture.Value is the point at which the main thread synchronizes with the background thread: until we access Value, we know nothing about the state of the other thread. As soon as we access Value, the main thread is suspended until the calculation in the background thread has finished.

If required, you can implement a non-blocking wait for the result in the main thread:

while not vFuture.IsDone do
  Application.ProcessMessages;

Thus, the Future primitive allows you to perform some task asynchronously and return the result to the main thread exactly at the moment when it is needed there.
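
Between blocking on Value and polling IsDone there is a middle ground: waiting with a timeout. The sketch below assumes that IOmniFuture<T> provides TryValue(timeout_ms, value), which to the best of my knowledge OTL 3.x does; check the OtlParallel unit in your copy of the library. The helper SumTo, the 50 ms timeout and the console setting are arbitrary choices made for the illustration.

```pascal
program FutureTimeoutDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  OtlParallel;

// Hypothetical helper: the sum 1 + 2 + ... + n, accumulated in int64.
function SumTo(n: integer): int64;
var
  i: integer;
begin
  Result := 0;
  for i := 1 to n do
    Result := Result + i;
end;

var
  future: IOmniFuture<int64>;
  total: int64;
begin
  // Start the calculation in a background thread
  future := Parallel.Future<int64>(
    function: int64
    begin
      Result := SumTo(100000);
    end);

  // Assumed API: TryValue waits up to the given number of milliseconds and
  // returns True (filling total) once the result is available.
  while not future.TryValue(50, total) do
    Writeln('still working...'); // the main thread is free to do something here

  Writeln(total); // the last line printed is 5000050000
end.
```

How many times "still working..." appears depends on timing; for a calculation this small it may not appear at all.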

Pipeline

Pipeline is a much more powerful primitive than Future.
Imagine an algorithm that runs in a loop over many elements, for example some processing of the files in a directory. A single-threaded program takes the next file, reads it, performs some actions and saves the modified file to disk. With a pipeline, you can split the original algorithm into stages (reading, processing, saving) and run these stages in parallel threads. At the very beginning only the first stage runs and reads the first file. As soon as the reading is complete, the second stage starts processing the file that was read, or a portion of it (if the first stage reads files in portions rather than whole). Meanwhile the first stage is already reading the second file. As soon as the second stage has processed the first file, the third stage joins in and starts saving. At that moment we reach a state in which all three stages are working in parallel.
A real-life Pipeline example would overload the article, so to illustrate the use of Pipeline I will limit myself to a copy of a purely synthetic example from OtlBook (please don't judge too harshly!).

uses
  OtlCommon,
  OtlCollections,
  OtlParallel;
var
  sum: int64;
begin
  sum := Parallel.Pipeline
  // Stage 1: generate the numbers 1..1000000
  .Stage(
    procedure (const input, output: IOmniBlockingCollection)
    var
      i: integer;
    begin
      for i := 1 to 1000000 do
        output.Add(i);
    end)
  // Stage 2: multiply each number by 3
  .Stage(
    procedure (const input: TOmniValue; var output: TOmniValue)
    begin
      output := input.AsInteger * 3;
    end)
  // Stage 3: add everything up and emit a single value
  .Stage(
    procedure (const input, output: IOmniBlockingCollection)
    var
      sum: int64; // the total (about 1.5e12) does not fit in a 32-bit integer
      value: TOmniValue;
    begin
      sum := 0;
      for value in input do
        Inc(sum, value.AsInt64);
      output.Add(sum);
    end)
  .Run.Output.Next;
end;

In this example, the first stage generates a million numbers and passes them one at a time to the next stage. The second stage multiplies each number by 3 and passes it on to the third stage. The third stage sums the results and returns a single number. Each stage runs in its own thread. Moreover, OTL lets you specify how many threads a stage should use (if one is not enough) with the simple modifier .NumTasks(N). The capabilities of OTL are really very broad.
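
As a rough illustration of .NumTasks, the sketch below runs the multiplication stage of a smaller version of the same pipeline in two threads. It assumes that NumTasks applies to the stage added immediately before it, which is my understanding of the OTL documentation; TripleSum is a hypothetical wrapper introduced only for this example. The final sum does not depend on the order in which the two tasks deliver their values.

```pascal
program PipelineNumTasksDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  OtlCommon,
  OtlCollections,
  OtlParallel;

// Hypothetical wrapper: returns 3 * (1 + 2 + ... + count), computed by a pipeline.
function TripleSum(count: integer): int64;
begin
  Result := Parallel.Pipeline
    // Stage 1: generate the numbers 1..count
    .Stage(
      procedure (const input, output: IOmniBlockingCollection)
      var
        i: integer;
      begin
        for i := 1 to count do
          output.Add(i);
      end)
    // Stage 2: multiply each number by 3 ...
    .Stage(
      procedure (const input: TOmniValue; var output: TOmniValue)
      begin
        output := input.AsInteger * 3;
      end)
    // ... using two parallel tasks (assumption: NumTasks affects the stage just added)
    .NumTasks(2)
    // Stage 3: add everything up in a single thread
    .Stage(
      procedure (const input, output: IOmniBlockingCollection)
      var
        sum: int64;
        value: TOmniValue;
      begin
        sum := 0;
        for value in input do
          sum := sum + value.AsInt64;
        output.Add(sum);
      end)
    .Run.Output.Next;
end;

begin
  Writeln(TripleSum(1000)); // prints 1501500
end.
```

A stage is a good candidate for several tasks only when its items can be processed independently of one another, as the multiplication here can.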

The class underlying data exchange between pipeline stages is a thread-safe queue, TOmniBlockingCollection. It allows several threads to add and read items at the same time. The collection owes its high speed to clever memory management and to locking based on atomic processor instructions rather than on OS synchronization objects. You can read about the implementation details of TOmniBlockingCollection here, here and here.
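
The collection can also be used on its own, outside a pipeline. The sketch below is a minimal producer/consumer pair: the main thread adds numbers while a Future drains the queue in the background. It relies on Add, CompleteAdding and Take as I understand them from the OTL sources (Take blocks until an item arrives and returns False once the collection is completed and empty); SumViaQueue is a hypothetical name used only for this illustration.

```pascal
program BlockingCollectionDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  OtlCommon,
  OtlCollections,
  OtlParallel;

// Hypothetical helper: sends 1..count through a queue and sums them in another thread.
function SumViaQueue(count: integer): int64;
var
  queue: IOmniBlockingCollection;
  consumer: IOmniFuture<int64>;
  i: integer;
begin
  queue := TOmniBlockingCollection.Create;

  // Consumer: runs in a background thread and reads until the queue is completed
  consumer := Parallel.Future<int64>(
    function: int64
    var
      value: TOmniValue;
    begin
      Result := 0;
      while queue.Take(value) do
        Result := Result + value.AsInt64;
    end);

  // Producer: the calling thread adds items while the consumer is already reading
  for i := 1 to count do
    queue.Add(i);
  queue.CompleteAdding; // tells the consumer that no more items will arrive

  Result := consumer.Value; // wait for the consumer to finish
end;

begin
  Writeln(SumViaQueue(1000)); // prints 500500
end.
```

Forgetting CompleteAdding would leave the consumer blocked in Take forever, so the call is as much a part of the protocol as Add itself.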

Conclusion

Looking at the examples above, someone will say, "I have seen all this before." Indeed, the Task Parallel Library in .NET Framework 4 contains roughly the same classes. There are a number of differences between how threads are executed inside the .NET virtual machine and how they are executed on a real processor, but those differences are beyond the scope of this article. I only wanted to draw attention to a wonderful library and to the broad capabilities it gives Delphi developers. I should add that the library ships with a large number of examples illustrating both the low-level and the high-level classes.

To dispel doubts about the maturity and reliability of the library, I can only say this: by using Pipeline in a complex commercial multi-user application (not a web application), we were able to cut the execution time of an operation on a group of files almost in half, by splitting the processing of the files on the client and their transfer to the server across separate threads. Whether to use the Delphi + OmniThreadLibrary combination in your own projects is up to you ;)
