eao197 April 28, 2018 at 09:24

Let's take a look under the hood of SObjectizer

We continue to acquaint readers with an open-source C++ framework called SObjectizer. Our framework simplifies the development of complex multi-threaded applications by giving a C++ programmer higher-level tools borrowed from the Actor Model, CSP and Publish-Subscribe. At the same time, however arrogant it may sound, SObjectizer is one of the few open-source, live and actively developed actor frameworks for C++.

We have already devoted more than ten articles on Habr to SObjectizer. Still, readers complain about "blank spots" in their understanding of how SObjectizer works and how the various kinds of entities it operates with relate to each other.

In this article, we will try to look under the hood of SObjectizer and explain, in simple terms and with pictures, what it consists of and how, in general terms, it works.

SObjectizer Environment

Let's start with a thing called SObjectizer Environment (or SOEnv for short). SOEnv is a container inside which all entities related to SObjectizer are created and work: agents, cooperations, dispatchers, mailboxes, timers, etc. This can be illustrated by the following picture:

In fact, to start working with SObjectizer, you need to create and start an instance of SOEnv. In the following example, the programmer manually creates an instance of SOEnv as an object of type so_5::wrapped_env_t:

int main() {
  so_5::wrapped_env_t sobj{...};
  ... // Some application logic.
  return 0;
}

This instance starts working immediately and shuts down automatically when the so_5::wrapped_env_t object is destroyed.

We needed SOEnv as a separate concept in order to be able to run several independent instances of SObjectizer within the same application.

int main() {
  so_5::wrapped_env_t first_soenv{...};
  so_5::wrapped_env_t second_soenv{...};
  ...
  so_5::wrapped_env_t another_soenv{...};
  ... // Some application logic.
  return 0;
}

This makes it possible to get the following picture in your application:

Fun fact. Our closest and much more popular competitor, the C++ Actor Framework (aka CAF), until not so long ago could run only one actor subsystem per application. We even came across a discussion in which the CAF developers were asked why. But over time CAF gained the concept of actor_system and the ability to run several actor_system instances in one application simultaneously.

What is SObjectizer Environment responsible for?

In addition to being a container that stores cooperations, dispatchers, etc., SOEnv also manages these entities.

For example, when SOEnv starts, the following must be started:

a timer that will serve delayed and periodic messages;
the default dispatcher, on which all agents that were not explicitly bound to other dispatchers will work;
user-created public dispatchers.

Accordingly, when SOEnv is stopped, all of these entities must be stopped.

Also, when a user wants to add agents to SOEnv, SOEnv must carry out the registration of the cooperation that contains the user's agents. And when the user wants to remove the agents, SOEnv must deregister the cooperation.

SOEnv has two main repositories that it owns and whose contents it is responsible for. The first repository, which may disappear completely in the next major version of SObjectizer, is the repository of public dispatchers. Each public dispatcher must have a unique string name by which the dispatcher can be found and reused.

The second and most important repository is the cooperation repository. Each cooperation must also have a unique string name under which it is stored in the repository. An attempt to register a cooperation with a name that is already taken will fail.

Perhaps cooperation names are a vestige inherited from SObjectizer-4. Currently, cooperation names are regarded as a rather questionable feature, and over time cooperations in SObjectizer may become anonymous. But that is not certain yet.

So, in summary:

SOEnv owns such entities as the timer, the default and public dispatchers, and cooperations;
at start, SOEnv starts the timer and the default and public dispatchers;
during operation, SOEnv is responsible for the registration and deregistration of cooperations;
at shutdown, SOEnv deregisters all cooperations that are still alive, stops the public and default dispatchers, and then stops the timer.
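
The lifecycle summarized in this list can be mimicked with a small toy class. This is only a sketch under stated assumptions: lifecycle_env, register_coop and run_lifecycle are hypothetical names invented for illustration and are not the SObjectizer API, and the real SOEnv does far more work at each step. The sketch only records the order of events described above.

```cpp
#include <string>
#include <vector>

// Toy model only: hypothetical names, NOT the SObjectizer API.
// It records the order in which an imaginary environment starts and stops its parts.
class lifecycle_env {
public:
    explicit lifecycle_env(std::vector<std::string>& log) : log_(log) {
        // Start order from the summary: the timer first, then the dispatchers.
        log_.push_back("start timer");
        log_.push_back("start default dispatcher");
        log_.push_back("start public dispatchers");
    }

    // During operation: register a cooperation under a name.
    void register_coop(const std::string& name) {
        coops_.push_back(name);
        log_.push_back("register " + name);
    }

    ~lifecycle_env() {
        // Shutdown: deregister the cooperations that are still alive,
        // stop the dispatchers, and stop the timer last.
        // (The order among cooperations is an arbitrary choice of this sketch.)
        for (const auto& name : coops_)
            log_.push_back("deregister " + name);
        log_.push_back("stop public dispatchers");
        log_.push_back("stop default dispatcher");
        log_.push_back("stop timer");
    }

private:
    std::vector<std::string>& log_;
    std::vector<std::string> coops_;
};

// Runs one environment lifetime and returns the recorded events.
std::vector<std::string> run_lifecycle() {
    std::vector<std::string> log;
    {
        lifecycle_env env(log);
        env.register_coop("demo");
    } // env is destroyed here, which triggers the shutdown sequence.
    return log;
}
```

Calling run_lifecycle() yields the start events, the registration of "demo", its deregistration, and finally the stop events with the timer stopped last.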

Environment Infrastructure

Inside SOEnv there is another interesting thing that makes SOEnv more complex than it might seem: the SObjectizer Environment Infrastructure (or env_infrastructure for short). To explain what it is and why it exists, we need to describe the interesting conditions we ran into as SObjectizer was used in completely different kinds of tasks.

When SObjectizer-5 appeared, SOEnv used multithreading to do its job. The timer was implemented as a separate timer thread. There was a separate worker thread on which SOEnv completed the deregistration of cooperations and freed all resources associated with them. And the default dispatcher was another worker thread, on which the requests of agents bound to the default dispatcher were serviced.

Since SObjectizer is designed to simplify the implementation of complex multi-threaded applications, the use of multithreading inside SObjectizer itself was considered (and still is) a perfectly normal and acceptable solution.

However, as time went on, SObjectizer began to be used in projects we had never thought about before, and situations began to arise in which a multi-threaded SOEnv is too redundant and expensive.

For example, consider a small application that periodically wakes up, checks whether some information is present, processes it when it appears, stores the result somewhere and falls asleep again. All of these operations can well be performed on a single worker thread. In addition, the application itself should be lightweight, so we would like to avoid the cost of creating additional worker threads inside SOEnv.

Another example: a small single-threaded application that actively works with the network through Asio, while some part of its logic is easier to implement with SObjectizer agents than with Asio. In this case we would like both Asio and SObjectizer to work on the same working context. Moreover, we would also like to avoid duplicating functionality: since Asio is used and Asio has its own timers, it makes no sense to run the same mechanism in SOEnv as well; it is better to let SObjectizer use Asio timers to serve delayed and periodic messages.

To make it possible to use SObjectizer in such specific conditions, the concept of env_infrastructure appeared in SOEnv. At the C++ level, env_infrastructure is an interface with some set of methods. When SOEnv starts, an object that implements this interface is created, and SOEnv then uses this object to do its job.

SObjectizer includes several ready-made implementations of env_infrastructure: the regular multi-threaded one; a single-threaded one that is not thread-safe; and a single-threaded one that is thread-safe. In addition, so_5_extra contains a couple of single-threaded env_infrastructure implementations based on Asio: one is thread-safe and the other is not. If they really want to, users can write their own env_infrastructure, although this is not an easy task, and a thankless one too, because we, the developers of SObjectizer, cannot guarantee that the env_infrastructure interface will remain unchanged. This thing is too deeply integrated with SOEnv.

Agents, cooperations, dispatchers and disp_binders. And also event_queue

When working with SObjectizer, the developer basically has to deal with the following entities:

agents, in which the business logic of the application (or of parts of the application) is implemented;
dispatchers, which determine how and where agents will work;
messages and mailboxes, through which agents exchange information with each other and with other parts of the application.

In this section we will talk about agents and dispatchers, and in the next one we will go through mailboxes.

Dispatchers and event_queue

We start the conversation about agents with dispatchers because, once it is clear what dispatchers are for, it will be easier to sort out the jumble of agents, agent cooperations and disp_binders.

The key point in the implementation of SObjectizer is that SObjectizer itself delivers messages to agents. An agent does not need to call some receive method in a loop and then analyze the type of the message returned from receive. Instead, the agent subscribes to the messages it is interested in, and when such a message appears, SObjectizer itself calls the agent's handler method for this message.

However, the most important question in this scheme is: where exactly does SObjectizer make the call to the handler method? That is, in the context of which worker thread will the agent process the messages addressed to it?

This is exactly what the dispatcher is for: it is the entity in SObjectizer that is responsible for providing a working context in which agents process messages. Roughly speaking, the dispatcher owns one or more worker threads on which the handler methods of the agents are called.

SObjectizer includes eight standard dispatchers, from the most primitive (for example, one_thread or thread_pool) to advanced ones (such as adv_thread_pool or prio_dedicated_threads::one_per_prio). A developer can create as many dispatchers in an application as needed.

For example, imagine that you need to build an application that polls several devices connected to a computer, processes the received information in some way, stores it in a database and sends it to the outside world through some MQ broker. Interaction with the devices is synchronous, and data processing can be quite complex and multi-level.

You can create one one_thread dispatcher per device. All actions with a device will then be performed on a separate thread, and blocking this thread with a synchronous operation will not affect the rest of the application. A separate one_thread dispatcher can also be allocated for working with the database. For the rest of the tasks, a single thread_pool dispatcher can be created.

Thus, when a developer chooses SObjectizer as a tool, one of the developer's main tasks is to create the necessary dispatchers and bind the agents to the appropriate dispatchers.

event_queue

So, in SObjectizer an agent does not need to check whether a message is waiting to be processed. Instead, the dispatcher to which the agent is bound calls the agent's handler methods for the messages the agent has received.

But here the question arises: how does the dispatcher find out that some kind of message is addressed to the agent?

The question is by no means an idle one, because in the "classic" Actor Model each actor has its own queue of messages addressed to it. In the first versions of SObjectizer-5 we followed the same path: each agent had its own message queue. When a message was sent to an agent, the message was stored in this queue, and then a request to process this message was placed with the dispatcher to which the agent was bound. As a result, sending a message to an agent meant adding an item to two queues: the agent's own message queue and the dispatcher's request queue.

This scheme had its advantages, but all of them were outweighed by one huge drawback: inefficiency. Therefore, we soon abandoned agents' own message queues in SObjectizer-5. Now the queue in which messages addressed to an agent are placed belongs not to the agent but to the dispatcher.

The logic is simple: if the dispatcher determines where and when the agent processes its messages, then let the dispatcher own the agent's message queue as well. So now the picture in SObjectizer is as follows:

The connecting element between the agent and the dispatcher is event_queue, an object with a specific interface that stores a message addressed to the agent in the appropriate queue of the dispatcher.

The event_queue object is owned by the dispatcher. It is the dispatcher that determines how exactly event_queue is implemented, how many event_queue objects it has, whether each agent gets its own event_queue or several agents share a common event_queue object, and so on.

Initially the agent has no connection with a dispatcher; the connection appears when the agent is bound to the dispatcher. After binding, the agent holds a reference to an event_queue. When a message addressed to the agent is delivered to it, the message is passed to event_queue, and event_queue is responsible for making sure that the request to process the message ends up in the right queue of the dispatcher.

At the same time, there are several moments in the agent's life when the agent has no connection with the dispatcher, i.e. the agent has no reference to its event_queue. The first is the interval between the creation of the agent and its binding to the dispatcher at the time of registration. The second is the period during deregistration when the agent is already unbound from the dispatcher but not yet destroyed. If a message is addressed to the agent at these moments, delivery reveals that the agent has no event_queue, and the message is simply discarded.
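
The relationship between an agent, its event_queue and the dispatcher can be sketched with a few toy classes. All names here (queue_iface, owning_dispatcher, bound_agent and their methods) are hypothetical and chosen only for illustration; they are not the SObjectizer API, and messages are plain strings to keep the sketch short. What it shows is that the queue lives in the dispatcher, and that an agent without a reference to an event_queue silently drops messages.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Toy model only: hypothetical names, NOT the SObjectizer API.

// Interface through which an agent hands its messages over to a dispatcher.
struct queue_iface {
    virtual ~queue_iface() = default;
    virtual void push(const std::string& msg) = 0;
};

// The dispatcher owns the real queue and exposes it through queue_iface.
class owning_dispatcher : public queue_iface {
public:
    void push(const std::string& msg) override { demands_.push_back(msg); }
    std::size_t pending() const { return demands_.size(); }

private:
    std::deque<std::string> demands_;
};

class bound_agent {
public:
    // Called at registration: from now on messages can reach the agent.
    void bind(queue_iface& q) { queue_ = &q; }

    // Called at deregistration: the agent still exists but is unreachable.
    void unbind() { queue_ = nullptr; }

    // Returns true if the message was queued, false if it was discarded.
    bool deliver(const std::string& msg) {
        if (!queue_) return false; // No event_queue: the message is dropped.
        queue_->push(msg);
        return true;
    }

private:
    queue_iface* queue_ = nullptr; // Not bound right after creation.
};
```

With these classes, a deliver() call before bind() or after unbind() returns false and leaves the dispatcher's queue untouched, while a call between them adds one pending item.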

Agents, cooperations and disp_binder

Launching agents in SObjectizer takes place in four stages.

At the first stage, the programmer creates an empty cooperation (more details below).

At the second stage, the programmer creates an instance of the agent. An agent in SObjectizer is implemented as a regular C++ class, and an agent is created in the same way as any other instance of that class.

At the third stage, the programmer must add the agent to the cooperation. A cooperation is another unique thing that, as far as we know, exists only in SObjectizer. It is a group of agents that must appear in SOEnv and disappear from SOEnv simultaneously and transactionally. That is, if a cooperation contains three agents, then either all three successfully start working in SOEnv or none of them does. Likewise, either all three agents are removed from SOEnv together, or all three continue to work together.

The need for cooperations arose almost at the very beginning of work on SObjectizer, when it became clear that in most cases agents would be created in an application not one by one but in interconnected groups. Cooperations were invented so that the developer would not have to devise schemes for controlling the start of a group and for rolling back when two of the three required agents started successfully and the third did not.

So, at the third stage, the programmer fills the cooperation with agents. Once the cooperation is filled, the fourth stage follows: registration of the cooperation. In code, it might look like this:

so_5::environment_t & env = ...; // The SOEnv in which the cooperation will live.
// Step 1: create the cooperation.
auto coop = env.create_coop("demo");
// Step 2: create the agent we want to place into the cooperation.
auto a = std::make_unique<my_agent>(... /* arguments for the my_agent constructor */);
// Step 3: hand the agent over to the cooperation.
coop->add_agent(std::move(a));
...
// Step 4: register the cooperation.
env.register_coop(std::move(coop));

But usually this is done in a more compact form:

so_5::environment_t & env = ...; // The SOEnv in which the cooperation will live.
env.introduce_coop("demo", [](so_5::coop_t & coop) { // Step 1 has already been done automatically.
  // Steps 2 and 3 are performed right here.
  coop.make_agent<my_agent>(... /* arguments for the my_agent constructor */);
  ...
}); // Step 4 is performed automatically.

When registering a cooperation, the developer hands the created and filled cooperation over to SOEnv. SOEnv performs a number of actions: it checks that the cooperation name is unique, requests from the dispatchers the resources needed by the cooperation's agents, calls the so_define_agent() method on the agents, binds the agents to dispatchers, and sends a special message to each agent so that the so_evt_start() method is called on it. Naturally, the actions already performed are rolled back if some operation from this list fails.
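
The all-or-nothing nature of registration can be illustrated with a toy model. Everything below (tx_agent, tx_coop, tx_env and the can_bind flag that simulates a failure) is hypothetical and invented for this sketch; it is not how SObjectizer is implemented, and it collapses the real sequence of steps into "check the name, bind every agent, commit". The point is only the rollback: if any agent cannot be bound, the agents bound so far are unbound and the name is not recorded.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model only: hypothetical names, NOT the SObjectizer API.

struct tx_agent {
    bool can_bind = true; // Simulates whether binding to a dispatcher succeeds.
    bool bound = false;   // Set while the agent is bound.
};

struct tx_coop {
    std::string name;
    std::vector<tx_agent> agents;
};

class tx_env {
public:
    // All-or-nothing registration: returns true only if every step succeeded.
    bool register_coop(tx_coop& coop) {
        // Step 1: the cooperation name must be unique.
        if (names_.count(coop.name) != 0) return false;

        // Step 2: bind the agents one by one.
        for (std::size_t i = 0; i < coop.agents.size(); ++i) {
            if (!coop.agents[i].can_bind) {
                // Roll back: unbind everything bound so far, in reverse order.
                while (i > 0) coop.agents[--i].bound = false;
                return false;
            }
            coop.agents[i].bound = true;
        }

        // Step 3: commit. Only now does the name become taken.
        names_.insert(coop.name);
        return true;
    }

    bool is_registered(const std::string& name) const {
        return names_.count(name) != 0;
    }

private:
    std::set<std::string> names_;
};
```

A cooperation whose third agent fails to bind leaves its first two agents unbound again and its name free, so a later cooperation may still register under that name.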

Once the cooperation is registered, the agents are inside SObjectizer (more precisely, inside a specific SOEnv) and can work fully.

One of the most important parts of registering a cooperation is the binding of agents to dispatchers. It is only after binding that the agent has a real reference to event_queue, which makes it possible to deliver messages to the agent.

After the successful registration of a cooperation, we get roughly the following picture:

disp_binder

We have already mentioned several times “binding agents to dispatchers”, but we have never yet explained how this binding is performed. How does SObjectizer understand which dispatcher each agent should be bound to?

This is where special objects called disp_binders come into play. They serve precisely to bind an agent to a dispatcher when the agent's cooperation is registered, and to unbind the agent from the dispatcher when the cooperation is deregistered.

SObjectizer defines an interface that all disp_binders must support. Concrete disp_binder implementations depend on the particular type of dispatcher. And each dispatcher implements its own disp_binder.

To bind an agent to a dispatcher, the developer must create a disp_binder and specify it when adding the agent to the cooperation. In practice, the code that fills the cooperation looks something like this:

auto & disp = ...; // A reference to the dispatcher to which the agent must be bound.
env.introduce_coop("demo", [&](so_5::coop_t & coop) {
  // Create the agent and specify which disp_binder it needs.
  coop.make_agent_with_binder<my_agent>(disp->binder(),
     ... /* arguments for the my_agent constructor */);
  ...
});

An important point: it is the cooperation that owns the disp_binders, and only the cooperation knows which agent uses which disp_binder. Therefore, the real picture of a registered cooperation looks like this:

Mailboxes

Another key element of SObjectizer that is worth considering at least superficially is mailboxes (or mboxes, in SObjectizer terminology).

The presence of mailboxes also distinguishes SObjectizer from other actor frameworks that implement the "classic" Actor Model. In the "classic" Actor Model, messages are addressed to a specific actor, so the sender of a message must have a reference to the recipient actor.

SObjectizer's roots lie not only (and not so much) in the Actor Model but also in the Publish-Subscribe mechanism. That is why sending a message in 1:N mode has been built into SObjectizer from the start, and why messages in SObjectizer are sent not directly to agents but to mboxes. Behind an mbox there may be a single receiving agent, or several (even several hundred thousand), or none at all.

Since messages are sent not directly to the recipient agent but to a mailbox, we needed to introduce another concept that is absent from the "classic" Actor Model but is the cornerstone of Publish-Subscribe: subscribing to messages from an mbox. In SObjectizer, if an agent wants to receive messages from an mbox, it must subscribe to them. Without a subscription, messages do not reach the agent. With a subscription, they do.

Native mbox types

There are two types of mboxes in SObjectizer. The first type is multi-producer/multi-consumer (MPMC). This type of mbox is used to implement interaction in M:N mode. The second type is multi-producer/single-consumer (MPSC). This type of mbox appeared later and is designed for efficient interaction in M:1 mode.

Initially, SObjectizer-5 had only MPMC mboxes, since the M:N delivery mechanism is enough to solve any task: both those where M:N interaction is required and those where M:1 interaction is required (in the latter case a separate mbox is created that is owned by a single recipient). But in M:1 mode MPMC mboxes have too much overhead compared to MPSC mboxes, so MPSC mboxes were added to SObjectizer to reduce the overhead of M:1 interaction.

A curious point. The presence of MPSC mboxes later helped to add a feature such as mutable messages to SObjectizer. This functionality initially seemed impossible, but since users needed it, we came up with a way to implement it. And it was MPSC mboxes that became one of the foundations for mutable messages.

Multi-Producer / Multi-Consumer mbox

MPMC-mbox is responsible for delivering a message to all agents that have subscribed to it. Whether there are many such agents, just one, or none at all is merely a detail. Therefore MPMC-mbox stores a list of subscribers for each message type. The general scheme of MPMC-mbox can be represented as follows:

Here Msg1, Msg2, ..., MsgN are the types of messages that agents subscribe to.
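
The "list of subscribers per message type" structure can be sketched in a few lines of standard C++. This is a toy model under stated assumptions: toy_mpmc_mbox, subscribe, send, ping and pong are hypothetical names, not the SObjectizer API, and the sketch calls the subscriber's handler directly, whereas in SObjectizer the message travels through event_queue and the dispatcher first.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <typeindex>
#include <typeinfo>
#include <vector>

// Toy model only: hypothetical names, NOT the SObjectizer API.
class toy_mpmc_mbox {
public:
    // Subscribes a handler to messages of type Msg.
    template <typename Msg>
    void subscribe(std::function<void(const Msg&)> handler) {
        subscribers_[std::type_index(typeid(Msg))].push_back(
            [handler](const void* raw) {
                // Safe because this entry is stored under typeid(Msg) only.
                handler(*static_cast<const Msg*>(raw));
            });
    }

    // Delivers msg to every subscriber of type Msg.
    // Returns how many subscribers received it (possibly zero).
    template <typename Msg>
    std::size_t send(const Msg& msg) const {
        auto it = subscribers_.find(std::type_index(typeid(Msg)));
        if (it == subscribers_.end()) return 0; // Nobody is subscribed.
        for (const auto& s : it->second) s(&msg);
        return it->second.size();
    }

private:
    // One subscriber list per message type (Msg1, Msg2, ..., MsgN).
    std::map<std::type_index,
             std::vector<std::function<void(const void*)>>> subscribers_;
};

// Two example message types.
struct ping { int value; };
struct pong { int value; };
```

Sending a ping to an mbox with two ping subscribers reaches both of them, while sending a pong that nobody subscribed to reaches no one and is not an error.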

Multi-Producer / Single-Consumer mbox

MPSC-mbox is much simpler than MPMC-mbox, which is why it works more efficiently. MPSC-mbox stores only a reference to the agent with which it is associated:
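
As a rough sketch of the same idea in code: an MPSC-style mailbox needs no subscriber tables, only a reference to its single owner. The names owner_agent and toy_mpsc_mbox are hypothetical and are not the SObjectizer API; messages are plain strings, and the sketch hands them straight to the owner instead of going through event_queue.

```cpp
#include <string>
#include <vector>

// Toy model only: hypothetical names, NOT the SObjectizer API.
struct owner_agent {
    std::vector<std::string> inbox;
    void receive(const std::string& msg) { inbox.push_back(msg); }
};

// An MPSC-style mailbox: any number of senders, exactly one receiver.
class toy_mpsc_mbox {
public:
    explicit toy_mpsc_mbox(owner_agent& owner) : owner_(owner) {}

    // No subscriber lookup is needed: the only possible receiver is the owner.
    void send(const std::string& msg) { owner_.receive(msg); }

private:
    owner_agent& owner_; // The single reference the mailbox has to store.
};
```

Any number of callers may invoke send(), and every message ends up with the same owner, which is what makes the M:1 case cheaper than a general subscriber lookup.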

The mechanism for delivering a message to an agent, in simple terms

Described very briefly, the way messages in SObjectizer are delivered to the recipient looks like this:

The message is sent to mbox. Mbox selects the recipients (in the case of MPMC-mbox, these are all subscribers to this type of message; in the case of MPSC-mbox, this is the sole owner of the mbox) and hands the message to each recipient.

The recipient checks whether it has a valid reference to event_queue. If so, the message is passed to event_queue. If there is no reference to event_queue, the message is ignored.

If the message was passed to event_queue, then event_queue saves the message in the appropriate dispatcher queue. What this queue will be depends on the type of dispatcher.

When the dispatcher, while working through its queues, reaches this message, it calls the agent in its working context (roughly speaking, in the context of one of its worker threads). The agent finds the handler method for this message and calls it (we emphasize once again that the call takes place in the context provided by the dispatcher).
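
The whole path described above (mbox, recipient, event_queue, dispatcher, handler) can be traced in a compact toy model. All names (path_dispatcher, path_agent, path_mbox, hello and their methods) are hypothetical and are not the SObjectizer API. The sketch is single-threaded: run_pending() is called by hand, whereas a real dispatcher performs this loop on its own worker threads.

```cpp
#include <deque>
#include <functional>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <vector>

// Toy model only: hypothetical names, NOT the SObjectizer API.

// The dispatcher owns the queue of pending handler calls.
class path_dispatcher {
public:
    void push(std::function<void()> demand) {
        demands_.push_back(std::move(demand));
    }

    // A real dispatcher runs this loop on its worker threads.
    void run_pending() {
        while (!demands_.empty()) {
            auto demand = std::move(demands_.front());
            demands_.pop_front();
            demand(); // The handler runs here, in the dispatcher's context.
        }
    }

private:
    std::deque<std::function<void()>> demands_;
};

class path_agent {
public:
    void bind(path_dispatcher& d) { queue_ = &d; }

    // Registers the handler method for messages of type Msg.
    template <typename Msg>
    void on(std::function<void(const Msg&)> handler) {
        handlers_[std::type_index(typeid(Msg))] = [handler](const void* raw) {
            handler(*static_cast<const Msg*>(raw));
        };
    }

    // Check for event_queue, then store a demand in the dispatcher's queue.
    template <typename Msg>
    void accept(Msg msg) {
        if (!queue_) return; // No event_queue: the message is ignored.
        queue_->push([this, msg] {
            // Later, in the dispatcher's context: find the handler and call it.
            auto it = handlers_.find(std::type_index(typeid(Msg)));
            if (it != handlers_.end()) it->second(&msg);
        });
    }

private:
    path_dispatcher* queue_ = nullptr;
    std::map<std::type_index, std::function<void(const void*)>> handlers_;
};

// The mbox selects the recipients and hands the message to each of them.
class path_mbox {
public:
    void add_recipient(path_agent& a) { recipients_.push_back(&a); }

    template <typename Msg>
    void send(const Msg& msg) {
        for (auto* a : recipients_) a->accept(msg);
    }

private:
    std::vector<path_agent*> recipients_;
};

struct hello { std::string text; };
```

Note that send() never runs a handler itself: the handler is invoked only when the dispatcher drains its queue, and a message sent before the agent is bound is lost.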

That, in fact, is all that can be said about the principle of operation of the message delivery mechanism in SObjectizer in general terms. Although the details are somewhat more complicated, we will not look into the details today.

Conclusion

In this article, we tried to give an understandable, albeit superficial, overview of the main mechanisms and features of SObjectizer. We hope this article helps someone better understand how SObjectizer works, and perhaps also what SObjectizer might be useful for.

But if you do not understand something or want to know more about something, then ask questions in the comments. We like it when we are asked questions and we are happy to answer them. At the same time, many thanks to everyone who asks questions - you force us to improve and develop both SObjectizer itself and its documentation.

Also, taking this opportunity, we want to invite everyone who is not yet familiar with SObjectizer to get acquainted with our framework. It is written in C++11 (the minimum requirements are gcc-4.8 or VC++ 12.0) and works on Windows, Linux, FreeBSD, macOS and, with CrystaxNDK, on Android. It is distributed under the BSD-3-Clause license (i.e. for free). You can get it from GitHub or from SourceForge. The currently available documentation is here. In addition, SObjectizer includes a large number of examples and yes, they are all up to date :)

Take a look, you might like it. And if you don't like something, let us know and we will try to fix it. Feedback is very important to us now, so if you did not find something you need in SObjectizer, tell us about it. Perhaps we can add it in future versions of SO-5.
