Ogoun March 7, 2013 at 13:13

C# programming paradigm shift: transition to signals and queues (slots)

In this post I describe a concept and its implementation (still at an early but working stage) that has recently become very attractive to me. I had no previous experience with signal-based programming, so I may have missed something or designed something suboptimally, which is why I am writing here: I hope for qualified reviews and advice. Although the library has only just begun to develop, I already use it in real projects under real load, which helps me quickly understand what is really needed and where to go next. So all the code below is operational, compiles, and is ready to use. Everything targets .NET Framework 4.5 only, but I do not think that will be an obstacle for anyone; if the idea proves worthwhile, rebuilding for 3.5 should not be a problem.

What is wrong with the current paradigm

A standard .NET application is built as a set of classes; the classes hold data and the methods that process this data. Our classes also need to know about each other: about public methods, properties, and events. In other words, we have a tightly coupled architecture. Of course, we can reduce the coupling: build interaction exclusively through interfaces and factories (which will increase the code size by a factor of two and significantly hurt readability), or remove public methods and build everything on events. You can come up with many things, but you still do not arrive at a loosely coupled architecture; at best you get "medium" coupling.

There is also something that becomes more and more relevant as processors develop: asynchrony. Microsoft does a lot of good in this direction (PLINQ, syntactic sugar such as await), but all of it still lives within the usual OOP framework, and you still have to create the threads yourself, albeit in the form of tasks. You also need to track the completion of tasks to determine when resources are no longer needed.

All of this gradually becomes tiresome: you grow reluctant to write the same things in each new project when it would be more correct to focus on the logic of the task.

Formalization of the new rules of the game

First, we introduce a hard separation: there is data, and there is business-logic code (hereinafter simply "logic"). Data are classes that (surprise) contain data and, since this is .NET and not Erlang, methods and properties that make the data easier to present. It makes no sense to remove methods completely when we can combine the advantages of the two approaches.

Logic classes operate on data and communicate with each other using signals. They contain no public methods, properties, or events other than the constructor and destructor (or an implementation of the IDisposable interface).
It is also natural to make logic classes singletons, but this is not required; it all depends on the task and your solution.
A logic class contains its own internal methods and signal handlers, and it may also generate new signals that can be processed in any other logic class (or even in the same class).

A signal is an identifier plus, optionally, some payload (a reference to data in memory).
Anything can serve as a signal identifier: a string, a GUID, and so on. For myself I chose an enumeration value, mainly because I love IntelliSense; I have not come up with anything better. With this approach you also cannot mistype an identifier when generating or subscribing to a signal, as you can with string identifiers, because the compiler checks it.

The payload is most often data passed by reference, and here another important rule applies: handlers must not modify the data. For now this is not enforced and is left to the programmer's conscience. The rule follows from the fact that any number of handlers in different classes can be subscribed to one signal, and changing the data in one of them can cause a failure in another. (I think this is why Erlang prohibits reassigning variables.)
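One way to make the "do not modify the payload" rule harder to break by accident is to make the payload type itself immutable. The following is only a minimal sketch: the FileArrived class and its members are hypothetical names invented for illustration and are not part of the library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical payload type (not part of the library): all state is set in
// the constructor and exposed through read-only members, so a handler has
// no supported way to change what other handlers see.
public sealed class FileArrived
{
    private readonly string path;
    private readonly IReadOnlyList<string> tags;

    public FileArrived(string path, IEnumerable<string> tags)
    {
        if (path == null) throw new ArgumentNullException("path");
        if (tags == null) throw new ArgumentNullException("tags");
        this.path = path;
        // Copy the input so that later changes to the caller's collection
        // are not visible through this payload.
        this.tags = new List<string>(tags).AsReadOnly();
    }

    public string Path { get { return path; } }

    public IReadOnlyList<string> Tags { get { return tags; } }
}
```

An instance of such a class can be passed as the signal state to any number of handlers; each of them can read Path and Tags, but none can replace or extend them.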

Another important rule: inside logic classes we must forget about threads, tasks, and any other parallelization of code. The library is responsible for this; the next paragraph shows why this matters. This requirement is especially important when we need to establish that all subscribers have finished processing a signal.

Take the example from the code below. We have a temporary file storage (a buffer) in which files live only while they are being processed. When a file appears, the storage emits a signal, and all subscribers start processing in parallel: one reads the file, another writes a log entry about its appearance. After processing is completed, the storage automatically receives a signal that everyone who wanted to do something with this file has done so and nobody needs it anymore, which means it can be safely deleted. If some handler creates a thread and returns control to the library, the file may be deleted before the code in the manually created thread reads it, and we get a hard-to-catch failure.

Applying New Rules

Initialization:

// Specify the assembly in which the business-logic classes are marked with the [rtSignalParticipator] attribute
rtSignalCore.AppendAssembly(Assembly.GetEntryAssembly());

// Alternative: pass an instance of a business-logic class explicitly; in this case the [rtSignalParticipator] attribute is not needed
rtSignalCore.AppendTypeInstance(new FileHandler());

An enumeration whose value is used as a signal:

/// <summary>
/// Signals from the buffer
/// </summary>
public enum BufferSignal
{
        /// <summary>
        /// A new file has appeared in the buffer
        /// </summary>
        FileInBuffer
}

An example of a class containing a signal handler (the link to the code is at the end of the article; the code itself is taken from the task below):

// The attribute indicates that the class contains signal handlers;
// it can be omitted if the class is registered explicitly
[rtSignalParticipator]
class FileHandler
{
    // The attribute indicates that this method handles
    // the BufferSignal.FileInBuffer signal
    // and that the handler must be called asynchronously
    [rtSignalAsyncHanlder(BufferSignal.FileInBuffer)]
    void ProcessFileInBuffer(rtSignal signal)
    {
	...
    }
}

An example of signal generation, and of a handler that runs once all synchronous and asynchronous handlers have finished processing the signal:

[rtSignalAsyncHanlder(DirectoryWatcherSignal.ChangedDirectory)]
void NewFileHandler(rtSignal signal)
{
    string path = (string)signal.State;
    ......................................................................
    // Read the files from the input directory and generate a signal for each file
    // Signal generation with state passed along
    rtSignalCore.Signal(BufferSignal.FileInBuffer, filePath);
    ......................................................................
}
/// <summary>
/// Removes the file from the buffer once all handlers have finished processing it
/// </summary>
[rtSignalCompletedAsyncHanlder(BufferSignal.FileInBuffer)]
void RemoveFileFromBuffer(rtSignal signal)
{
    string path = (string)signal.State;
    if (File.Exists(path))
        File.Delete(path);
}

The following attributes are available for specifying signal handlers:

[rtSignalHanlder(SignalID)] - marks a signal handler that is called synchronously
[rtSignalAsyncHanlder(SignalID)] - marks a signal handler that is called asynchronously
[rtSignalCompletedHanlder(SignalID)] - marks a method that receives the signal once all of its handlers (including asynchronous ones) have finished their work
[rtSignalCompletedAsyncHanlder(SignalID)] - marks a method that receives the signal once all of its handlers (including asynchronous ones) have finished their work; the method itself runs asynchronously
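To make the mechanics behind such attributes more concrete, here is a rough sketch of how attribute-marked handlers could be discovered through reflection and how a "completed" handler could be called after all ordinary handlers return. This is not the library's implementation: OnSignalAttribute, OnCompletedAttribute, SignalHub, DemoSignal, and DemoParticipant are hypothetical names, every handler is assumed to take a single object argument, and error handling is omitted (if a handler throws, the completion handlers in this sketch are not called).

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Threading.Tasks;

// Hypothetical stand-ins for the handler attributes; not the library's types.
[AttributeUsage(AttributeTargets.Method)]
public sealed class OnSignalAttribute : Attribute
{
    public OnSignalAttribute(object signalId, bool runAsync = false)
    {
        SignalId = signalId;
        RunAsync = runAsync;
    }
    public object SignalId { get; private set; }
    public bool RunAsync { get; private set; }
}

[AttributeUsage(AttributeTargets.Method)]
public sealed class OnCompletedAttribute : Attribute
{
    public OnCompletedAttribute(object signalId) { SignalId = signalId; }
    public object SignalId { get; private set; }
}

public sealed class SignalHub
{
    private sealed class Slot
    {
        public object Target;
        public MethodInfo Method;
        public bool RunAsync;
    }

    private readonly Dictionary<object, List<Slot>> handlers = new Dictionary<object, List<Slot>>();
    private readonly Dictionary<object, List<Slot>> completions = new Dictionary<object, List<Slot>>();

    // Scans an instance for attributed methods. Register everything before the
    // first Signal call: this sketch does not synchronize registration with dispatch.
    public void Register(object instance)
    {
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;
        foreach (MethodInfo method in instance.GetType().GetMethods(flags))
        {
            foreach (OnSignalAttribute a in method.GetCustomAttributes(typeof(OnSignalAttribute), false))
                Add(handlers, a.SignalId, new Slot { Target = instance, Method = method, RunAsync = a.RunAsync });
            foreach (OnCompletedAttribute a in method.GetCustomAttributes(typeof(OnCompletedAttribute), false))
                Add(completions, a.SignalId, new Slot { Target = instance, Method = method });
        }
    }

    private static void Add(Dictionary<object, List<Slot>> map, object id, Slot slot)
    {
        List<Slot> list;
        if (!map.TryGetValue(id, out list)) map[id] = list = new List<Slot>();
        list.Add(slot);
    }

    // Runs synchronous handlers inline and asynchronous ones on the thread pool,
    // then calls the completion handlers once every handler has returned.
    public async Task Signal(object id, object state)
    {
        var running = new List<Task>();
        List<Slot> slots;
        if (handlers.TryGetValue(id, out slots))
        {
            foreach (Slot slot in slots)
            {
                if (slot.RunAsync)
                    running.Add(Task.Run(() => { slot.Method.Invoke(slot.Target, new[] { state }); }));
                else
                    slot.Method.Invoke(slot.Target, new[] { state });
            }
        }
        await Task.WhenAll(running);
        if (completions.TryGetValue(id, out slots))
            foreach (Slot slot in slots) slot.Method.Invoke(slot.Target, new[] { state });
    }
}

public enum DemoSignal { FileInBuffer }

// A logic class with no public methods; handlers are found through reflection.
public sealed class DemoParticipant
{
    public readonly List<string> Log = new List<string>();

    [OnSignal(DemoSignal.FileInBuffer, true)]
    private void Process(object state) { lock (Log) Log.Add("processed " + state); }

    [OnCompleted(DemoSignal.FileInBuffer)]
    private void Cleanup(object state) { lock (Log) Log.Add("completed " + state); }
}
```

In this sketch the completion handler can only be trusted if every handler does all of its work before returning, which is exactly why the rule against creating threads or tasks by hand inside handlers matters.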

To generate a signal, the following format is used:

rtSignalCore.Signal(identifier);

rtSignalCore.Signal(identifier, payload);

It is probably worth coming up with something prettier, but nothing has come to mind yet.

What the signal-based approach solves

Asynchrony becomes a consequence of the model and requires no additional effort: there is no need to create threads or tasks, everything is achieved by marking handlers with the appropriate attributes.
Loose coupling: business-logic classes do not need to know about each other at all, it is enough to describe the possible signals.
Ease of testing individual components, because hard links between classes are removed.
Simpler and more readable code.

Trying it in practice

For convenience, let us invent a problem and solve it both with the proposed approach and in the usual way, for comparison.
Task:
There is an input directory in which files appear. When a file appears, we move it to the buffer and perform some actions on it, after which we move it to the archive and write the result of processing to the log. What kind of processing is done does not matter here.
As an example, I wrote two applications that solve this problem, one using signals and the other written the usual way. The programs are completely identical except for the way the classes interact.

Diagram of the version without signals:

Diagram of the version with signals, where classes know nothing about each other's methods and events, or about their environment in general:

Profiler graphs for a test on 11210 small files:
Without signals:

Using signals:

The graphs and actual use show that introducing signals did not hurt performance; if anything, the opposite: processing stays stable regardless of the number of files.

Conclusion

Once again I was convinced of the versatility of C#: you really can program in it using any paradigm, and where you cannot, you can extend it until you can.

For now it is hard to judge how effective signals are in the .NET environment, and it is hard to drop the familiar writing style at once and start thinking within the new model. Subjectively, the code becomes lighter, and asynchrony comes as a consequence of the new model, which is also pleasing. Objectively, time will tell. At the moment it is clear that performance does not get worse. I have decided that I will try to switch to this programming model and continue to develop the library and tools.

Just before publication I found materials on signals and slots in Qt. In general the ideas are similar, but I did not find out whether Qt lets you detect that all slots have finished processing a signal.

I do not know whether there have already been attempts to implement a similar model on .NET. If you can share links, it would be interesting to compare the approaches.

The project is on SourceForge (my English is poor, so if you find errors there, please let me know).

UPD #1: thanks to user mayorovp for a practical remark on signal generation and handlers. Handlers can now be written with any number of arguments of any type, and these arguments are passed when the signal is generated (the types of the passed arguments are checked against the handler at run time).
Examples:

No argument

// Generation
rtSignalCore.Signal(SignalIdentifierEnum.One);
...................
// Handler
[rtSignalHanlder(SignalIdentifierEnum.One)]
void HandlerSignalClassBOne()

An example with passing one argument of type string

// Generation
"Hello world".SendSignal(SignalIdentifierEnum.One);
...................
// Handler
[rtSignalHanlder(SignalIdentifierEnum.One)]
void HandlerSignalClassBOne(string line)

Two argument example

// Generation
rtSignalCore.Signal(SignalIdentifierEnum.One, filePath, new FileInfo(filePath));
...................
// Handler
[rtSignalHanlder(SignalIdentifierEnum.One)]
void HandlerSignalClassBOne(string filePath, FileInfo file)

No performance drop was observed in the tests.
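The run-time check of passed arguments against a handler's parameters can be pictured roughly as follows. This is a sketch under assumptions, not the library's code: the HandlerSignature class and its Accepts method are hypothetical names, and only positional matching by count and type is shown.

```csharp
using System;
using System.Reflection;

// Hypothetical helper (not part of the library): decides whether a set of
// signal arguments can be passed positionally to a handler method.
public static class HandlerSignature
{
    public static bool Accepts(MethodInfo method, params object[] args)
    {
        if (method == null) throw new ArgumentNullException("method");
        args = args ?? new object[0];

        ParameterInfo[] parameters = method.GetParameters();
        if (parameters.Length != args.Length) return false;

        for (int i = 0; i < parameters.Length; i++)
        {
            Type expected = parameters[i].ParameterType;
            if (args[i] == null)
            {
                // null fits reference types and Nullable<T>, but not plain value types.
                if (expected.IsValueType && Nullable.GetUnderlyingType(expected) == null)
                    return false;
            }
            else if (!expected.IsInstanceOfType(args[i]))
            {
                return false;
            }
        }
        return true;
    }
}
```

A dispatcher could call such a check before invoking each handler and either skip mismatching handlers or report an error; which of the two the library does is not stated here.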


Your assessment of work and approach

68.8% Worth continuing, has the right to exist 93
8.8% The approach is good, the implementation is incorrect (indicate your option in the comments) 12
13.3% Approach not applicable in .NET 18
8.8% The approach is not applicable in the .NET environment and poor implementation (indicate your option in the comments) 12
