eao197 April 12, 2019 at 10:14

A bit of C++ template magic and CRTP to check the correctness of the programmer's actions at compile time

Recently, while working on a new version of SObjectizer, I faced the task of checking the developer's actions at compile time. The gist was that previously a programmer could write calls of the form:

receive(from(ch).empty_timeout(150ms), ...);
receive(from(ch).handle_n(2).no_wait_on_empty(), ...);
receive(from(ch).empty_timeout(2s).extract_n(20).stop_on(...), ...);
receive(from(ch).no_wait_on_empty().stop_on(...), ...);

The receive() operation takes a set of parameters, which is specified by a chain of method calls such as from(ch).empty_timeout(150ms) or from(ch).handle_n(2).no_wait_on_empty() shown above. Calling handle_n() / extract_n(), which limit the number of messages to be extracted or handled, was optional. Therefore all the chains shown above were correct.

But in the new version the user had to be forced to state explicitly how many messages to extract and/or handle. That is, a chain of the form from(ch).empty_timeout(150ms) now becomes incorrect. It has to be replaced by from(ch).handle_all().empty_timeout(150ms).

And I wanted the compiler to slap the programmer on the wrist if the programmer forgot to call handle_all(), handle_n() or extract_n().

Can C++ help with this?

Yes. And if you are interested in exactly how, read on.

There is more than just the receive() function

The receive() function was shown above; its parameters are set by a chain of calls (also known as the builder pattern). But there is also a select() function, which takes almost the same set of parameters:

select(from_all().empty_timeout(150ms), case_(...), case_(...), ...);
select(from_all().handle_n(2).no_wait_on_empty(), case_(...), case_(...), ...);
select(from_all().empty_timeout(2s).extract_n(20).stop_on(...), case_(...), case_(...), ...);
select(from_all().no_wait_on_empty().stop_on(...), case_(...), case_(...), ...);

Accordingly, I wanted one solution that would suit both select() and receive(). Moreover, the parameters for select() and receive() were already represented in the code in a way that avoids copy-paste. But more on this below.

Possible solutions

So, the task is to make sure the user calls handle_all(), handle_n() or extract_n().

In principle, this can be achieved without resorting to anything complicated. For example, one could add an extra argument to select() and receive():

receive(handle_all(), from(ch).empty_timeout(150ms), ...);
select(handle_n(20), from_all().no_wait_on_empty(), ...);

Or one could force the user to write the receive() / select() call differently:

receive(handle_all(from(ch).empty_timeout(150ms)), ...);
select(handle_n(20, from_all().no_wait_on_empty()), ...);

But the problem is that when switching to a new version of SObjectizer the user would have to rewrite their code, even code that in principle required no changes. Say, in this situation:

receive(from(ch).handle_n(2).no_wait_on_empty(), ...);
select(from_all().empty_timeout(2s).extract_n(20).stop_on(...), case_(...), case_(...), ...);

And this, in my opinion, is a very serious problem, one that makes you look for another way. That way is described below.

So where does CRTP come in?

The title of the article mentions CRTP, the Curiously Recurring Template Pattern (those who want to get acquainted with this interesting but slightly mind-bending technique can start with this series of posts on the Fluent C++ blog).

CRTP was mentioned because the handling of the receive() and select() parameters was implemented through CRTP. Since the lion's share of the parameters for receive() and select() is the same, the code used something like this:

template<typename Derived>
class bulk_processing_params_t
   {
      ...; // Parameters common to all operations.
      Derived & self_reference() { return static_cast<Derived &>(*this); }
      ...
   public:
      auto & handle_n(int v)
         {
            to_handle_ = v;
            return self_reference();
         }
      ...
      auto & extract_n(int v)
         {
            to_extract_ = v;
            return self_reference();
         }
      ...
   };
class receive_processing_params_t final
   : public bulk_processing_params_t<receive_processing_params_t>
   {
      ...; // Parameters specific to receive.
   };
class select_processing_params_t final
   : public bulk_processing_params_t<select_processing_params_t>
   {
      ...;
   };

Why is CRTP here at all?

CRTP was needed here so that the setter methods defined in the base class could return a reference to the derived type rather than to the base type.

That is, if ordinary inheritance were used instead of CRTP, we could only write this:

class bulk_processing_params_t
   {
   public:
      // We can only return a reference to bulk_processing_params_t,
      // not to the derived class.
      bulk_processing_params_t & handle_n(int v) {...}
      bulk_processing_params_t & extract_n(int v) {...}
      ...
   };
class receive_processing_params_t final
   : public bulk_processing_params_t
   {
   public:
      // The inherited methods here return a reference
      // to bulk_processing_params_t, not to
      // receive_processing_params_t.
      ...
      // The class's own methods can return a reference to
      // receive_processing_params_t.
      receive_processing_params_t & receive_payload(int v) {...}
   };
class select_processing_params_t final
   : public bulk_processing_params_t
   {
   public:
      // The inherited methods here return a reference
      // to bulk_processing_params_t, not to
      // select_processing_params_t.
      ...
   };

But such a primitive mechanism does not allow us to use the same builder pattern, because:

receive_processing_params_t{}.handle_n(20).receive_payload(0)

does not compile. The handle_n() method returns a reference to bulk_processing_params_t, and the receive_payload() method is not defined there.

With CRTP there is no such problem with the builder pattern.
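To make the difference concrete, here is a minimal self-contained sketch of the CRTP variant. The names bulk_params_t, receive_params_t and check_builder_chain are invented for this illustration and are not the SObjectizer types; the sketch only shows that setters from the base class and from the derived class can be mixed in one chain.

```cpp
#include <cassert>

// Base class: the Derived parameter lets the setters return the final type.
template<typename Derived>
class bulk_params_t
   {
      int to_handle_{};
      int to_extract_{};
   protected:
      Derived & self_reference() { return static_cast<Derived &>(*this); }
   public:
      Derived & handle_n(int v) { to_handle_ = v; return self_reference(); }
      Derived & extract_n(int v) { to_extract_ = v; return self_reference(); }
      int to_handle() const { return to_handle_; }
      int to_extract() const { return to_extract_; }
   };

// Final class (hypothetical): adds one setter of its own.
class receive_params_t final : public bulk_params_t<receive_params_t>
   {
      int receive_payload_{};
   public:
      receive_params_t & receive_payload(int v)
         { receive_payload_ = v; return *this; }
      int receive_payload() const { return receive_payload_; }
   };

// A base setter, then a derived setter, then a base setter again:
// every call returns receive_params_t &, so the chain compiles.
inline void check_builder_chain()
   {
      receive_params_t params;
      params.handle_n(20).receive_payload(7).extract_n(3);
      assert(params.to_handle() == 20);
      assert(params.receive_payload() == 7);
      assert(params.to_extract() == 3);
   }
```

With ordinary inheritance the second call in that chain would be rejected, because handle_n() would return a reference to the base class.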

The final solution

The final solution is to turn the final types, such as receive_processing_params_t and select_processing_params_t, into templates themselves, parameterized by a scalar of the following form:

enum class msg_count_status_t
   {
      undefined,
      defined
   };

And to make it possible to convert the final type from T<msg_count_status_t::undefined> to T<msg_count_status_t::defined>.

This allows, for example, the receive() function to take receive_processing_params_t<Status> and check the Status value at compile time. Something like:

template<
   msg_count_status_t Msg_Count_Status,
   typename... Handlers >
inline mchain_receive_result_t
receive(
   const mchain_receive_params_t<Msg_Count_Status> & params,
   Handlers &&... handlers )
   {
      static_assert(
         Msg_Count_Status == msg_count_status_t::defined,
         "message count to be processed/extracted should be defined "
         "by using handle_all()/handle_n()/extract_n() methods" );

In short, everything is simple, as usual: just go and do it ;)
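The idea can be tried out in isolation with a deliberately reduced sketch. The names demo_params_t, demo_receive and check_demo_receive below are hypothetical stand-ins, not SObjectizer's API; the point is only that the status is part of the type, so static_assert can inspect it.

```cpp
#include <cassert>

enum class msg_count_status_t { undefined, defined };

// Hypothetical stand-in for a parameter type: the status is a template
// argument, so it is known at compile time.
template<msg_count_status_t Status>
struct demo_params_t
   {
      int to_handle_{};

      // Returns an object of the 'defined' type carrying the new limit.
      demo_params_t<msg_count_status_t::defined> handle_n(int v) const
         { return { v }; }
   };

// Hypothetical stand-in for receive(): compiles only for 'defined'.
template<msg_count_status_t Status>
int demo_receive(const demo_params_t<Status> & params)
   {
      static_assert(
         Status == msg_count_status_t::defined,
         "message count should be defined by calling handle_n()");
      return params.to_handle_;
   }

inline void check_demo_receive()
   {
      demo_params_t<msg_count_status_t::undefined> params;
      // OK: handle_n() yields demo_params_t<msg_count_status_t::defined>.
      assert(demo_receive(params.handle_n(5)) == 5);
   }
```

Passing params itself to demo_receive(), without the handle_n() call, would stop compilation with the static_assert message.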

Description of the solution

Let's see what this looks like in a minimal example detached from the specifics of SObjectizer.

So, we already have a type that tells whether the limit on the number of messages is set or not:

enum class msg_count_status_t
   {
      undefined,
      defined
   };

Next, we need a structure in which all common parameters will be stored:

struct basic_data_t
   {
      int to_extract_{};
      int to_handle_{};
      int common_payload_{};
   };

By and large, it doesn't matter what basic_data_t contains. The minimal set of fields shown above is enough, for example.

What matters about basic_data_t is that each specific operation (be it receive(), select(), or something else) gets its own concrete type that inherits from basic_data_t. For receive() in our abstract example it is the following structure:

struct receive_specific_data_t final : public basic_data_t
   {
      int receive_payload_{};
      receive_specific_data_t() = default;
      receive_specific_data_t(int v) : receive_payload_{v} {}
   };

We assume that the basic_data_t structure and its descendants cause no difficulties, so we move on to the more complex parts of the solution.

Now we need a wrapper around basic_data_t that provides getter methods. It is a template class of the following form:

template<typename Basic_Data>
class basic_data_holder_t
   {
   private :
      Basic_Data data_;
   protected :
      void set_to_extract(int v) { data_.to_extract_ = v; }
      void set_to_handle(int v) { data_.to_handle_ = v; }
      void set_common_payload(int v) { data_.common_payload_ = v; }
      const auto & data() const { return data_; }
   public :
      basic_data_holder_t() = default;
      basic_data_holder_t(Basic_Data data) : data_{std::move(data)} {}
      int to_extract() const { return data_.to_extract_; }
      int to_handle() const { return data_.to_handle_; }
      int common_payload() const { return data_.common_payload_; }
   };

This class is a template so that it can hold any descendant of basic_data_t, although it implements getter methods only for the fields that are in basic_data_t.

Before we move on to the even more complex parts of the solution, note the data() method in basic_data_holder_t. It is an important method, and we will meet it again later.

Now we can move on to the key template class, which can look rather scary to people who are not deeply immersed in modern C++:

template<typename Data, typename Derived>
class basic_params_t : public basic_data_holder_t<Data>
   {
      using base_type = basic_data_holder_t<Data>;
   public :
      using actual_type = Derived;
      using data_type = Data;
   protected :
      actual_type & self_reference()
         { return static_cast<actual_type &>(*this); }
      decltype(auto) clone_as_defined()
         {
            return self_reference().template clone_if_necessary<
                  msg_count_status_t::defined >();
         }
   public :
      basic_params_t() = default;
      basic_params_t(data_type data) : base_type{std::move(data)} {}
      decltype(auto) handle_all()
         {
            this->set_to_handle(0);
            return clone_as_defined();
         }
      decltype(auto) handle_n(int v)
         {
            this->set_to_handle(v);
            return clone_as_defined();
         }
      decltype(auto) extract_n(int v)
         {
            this->set_to_extract(v);
            return clone_as_defined();
         }
      actual_type & common_payload(int v)
         {
            this->set_common_payload(v);
            return self_reference();
         }
      using base_type::common_payload;
   };

This basic_params_t is the main CRTP template. Only now it takes two parameters.

The first parameter is the type of data to be held inside. For example, receive_specific_data_t or select_specific_data_t.

The second parameter is the derived type, as usual for CRTP. It is used in the self_reference() method to get a reference to the derived type.

The key point in the implementation of the basic_params_t template is its clone_as_defined() method. This method expects the derived class to implement the clone_if_necessary() method. And clone_if_necessary() is designed precisely to transform an object of type T<msg_count_status_t::undefined> into an object of type T<msg_count_status_t::defined>. This transformation is triggered in the setter methods handle_all(), handle_n() and extract_n().

Note also that clone_as_defined(), handle_all(), handle_n() and extract_n() declare their return type as decltype(auto). This is another trick, which we will discuss shortly.

Now we can look at one of the final types, for which all this was conceived:

template< msg_count_status_t Msg_Count_Status >
class receive_specific_params_t final
   : public basic_params_t<
         receive_specific_data_t,
         receive_specific_params_t<Msg_Count_Status> >
   {
      using base_type = basic_params_t<
            receive_specific_data_t,
            receive_specific_params_t >;
   public :
      template< msg_count_status_t New_Msg_Count_Status >
      std::enable_if_t<
            New_Msg_Count_Status != Msg_Count_Status,
            receive_specific_params_t<New_Msg_Count_Status> >
      clone_if_necessary() const
         {
            return { this->data() };
         }
      template< msg_count_status_t New_Msg_Count_Status >
      std::enable_if_t<
            New_Msg_Count_Status == Msg_Count_Status,
            receive_specific_params_t& >
      clone_if_necessary()
         {
            return *this;
         }
      receive_specific_params_t(int receive_payload)
         :  base_type{ typename base_type::data_type{receive_payload} }
         {}
      receive_specific_params_t(typename base_type::data_type data)
         :  base_type{ std::move(data) }
         {}
      int receive_payload() const { return this->data().receive_payload_; }
   };

The first thing to note here is the constructor that takes base_type::data_type. This constructor carries the current parameter values over during the transformation from T<msg_count_status_t::undefined> to T<msg_count_status_t::defined>.

By and large, this receive_specific_params_t is something like this:

template<typename V, int K>
class holder_t {
  V v_;
public:
  holder_t() = default;
  holder_t(V v) : v_{std::move(v)} {}
  const V & value() const { return v_; }
};
holder_t<std::string, 0> v1{"Hello!"};
holder_t<std::string, 1> v2;
v2 = v1; // This will not work: v1 and v2 formally have different types.
v2 = holder_t<std::string, 1>{v1.value()}; // But this works just fine.

And the constructor just mentioned allows a receive_specific_params_t<msg_count_status_t::defined> to be initialized with the values from a receive_specific_params_t<msg_count_status_t::undefined>.

The second important thing in receive_specific_params_t is the two clone_if_necessary() methods.

Why are there two? And what does all the SFINAE magic in their definitions mean?

The two clone_if_necessary() methods exist to avoid unnecessary transformations. Say the programmer has called handle_n() and already received a receive_specific_params_t<msg_count_status_t::defined>. Then they call extract_n(). This is allowed, since handle_n() and extract_n() set slightly different limits. The call to extract_n() should also give us a receive_specific_params_t<msg_count_status_t::defined>. But we already have one. So why not reuse the existing one?

That's why there are two clone_if_necessary() methods. The first is used when the transformation is really needed:

      template< msg_count_status_t New_Msg_Count_Status >
      std::enable_if_t<
            New_Msg_Count_Status != Msg_Count_Status,
            receive_specific_params_t<New_Msg_Count_Status> >
      clone_if_necessary() const
         {
            return { this->data() };
         }

The compiler selects it, for example, when the status changes from undefined to defined. This method returns a new object. And note the data() call in its implementation, the method defined back in basic_data_holder_t.

The second method:

      template< msg_count_status_t New_Msg_Count_Status >
      std::enable_if_t<
            New_Msg_Count_Status == Msg_Count_Status,
            receive_specific_params_t& >
      clone_if_necessary()
         {
            return *this;
         }

is used when the status does not need to change. It returns a reference to the existing object.
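The selection between the two overloads can be observed on a reduced type. In the sketch below tagged_t and check_overload_selection are invented names and the class keeps a single int instead of real parameters; the static_assert lines check which return type, and therefore which overload, the compiler picks.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

enum class msg_count_status_t { undefined, defined };

// Hypothetical reduced type: only the status tag and one data field.
template<msg_count_status_t Status>
class tagged_t
   {
      int value_{};
   public:
      tagged_t() = default;
      tagged_t(int v) : value_{v} {}
      int value() const { return value_; }

      // Enabled only when the status changes: returns a new object.
      template<msg_count_status_t New_Status>
      std::enable_if_t< New_Status != Status, tagged_t<New_Status> >
      clone_if_necessary() const
         { return { value_ }; }

      // Enabled only when the status stays the same: returns *this.
      template<msg_count_status_t New_Status>
      std::enable_if_t< New_Status == Status, tagged_t & >
      clone_if_necessary()
         { return *this; }
   };

using undefined_t = tagged_t<msg_count_status_t::undefined>;
using defined_t = tagged_t<msg_count_status_t::defined>;

static_assert(std::is_same<
      decltype(std::declval<undefined_t &>()
            .clone_if_necessary<msg_count_status_t::defined>()),
      defined_t >::value, "a status change must produce a new object");
static_assert(std::is_same<
      decltype(std::declval<defined_t &>()
            .clone_if_necessary<msg_count_status_t::defined>()),
      defined_t & >::value, "the same status must yield a reference");

inline void check_overload_selection()
   {
      undefined_t u{42};
      auto d = u.clone_if_necessary<msg_count_status_t::defined>();
      assert(d.value() == 42);
      // No copy here: the very same object comes back.
      auto & same = d.clone_if_necessary<msg_count_status_t::defined>();
      assert(&same == &d);
   }
```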

Now it should be clear why the return type of a number of methods in basic_params_t is declared as decltype(auto). These methods depend on which version of clone_if_necessary() is called in the derived type, and it can return either an object or a reference... You cannot tell in advance. This is where decltype(auto) comes to the rescue.

A small disclaimer

The minimal example described here was meant to demonstrate the chosen solution as simply and clearly as possible. Therefore it lacks some fairly obvious things that beg to be included in the code.

For example, the basic_data_holder_t::data() method returns a const reference to the data, which leads to copying of the parameter values during the transformation from T<msg_count_status_t::undefined> to T<msg_count_status_t::defined>. If copying the parameters is expensive, move semantics is worth considering, and the data() method could look like this:
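The difference between auto and decltype(auto) as a return type is easy to check on a tiny example. The names source_t, forward_reference, forward_value, decayed_copy and check_forwarding are made up for this illustration only.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct source_t
   {
      int value_{};
      int & by_reference() { return value_; }
      int by_value() const { return value_; }
   };

// decltype(auto) keeps exactly what the wrapped call returns...
inline decltype(auto) forward_reference(source_t & s)
   { return s.by_reference(); }
inline decltype(auto) forward_value(source_t & s)
   { return s.by_value(); }
// ...while plain auto drops the reference and returns a copy.
inline auto decayed_copy(source_t & s)
   { return s.by_reference(); }

static_assert(std::is_same<
      decltype(forward_reference(std::declval<source_t &>())),
      int & >::value, "the reference is preserved");
static_assert(std::is_same<
      decltype(forward_value(std::declval<source_t &>())),
      int >::value, "the value stays a value");
static_assert(std::is_same<
      decltype(decayed_copy(std::declval<source_t &>())),
      int >::value, "plain auto decays to a value");

inline void check_forwarding()
   {
      source_t s;
      forward_reference(s) = 5; // Writes through the returned reference.
      assert(s.value_ == 5);
      assert(forward_value(s) == 5);
   }
```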

auto data() { return std::move(data_); }
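A small sketch of how such a moving data() could behave, using a payload that cannot be copied at all. All names here (heavy_data_t, moving_holder_t, demo_params_t, transfer, peek, check_transfer) are assumptions made for the illustration, and transfer() only plays the role that clone_if_necessary() plays in the article.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical data type that is expensive (here: impossible) to copy.
struct heavy_data_t
   {
      std::unique_ptr<int> payload_;
   };

template<typename Basic_Data>
class moving_holder_t
   {
      Basic_Data data_;
   protected:
      // Read-only access that leaves the holder intact.
      const Basic_Data & peek() const { return data_; }
      // Moves the content out; the holder should not be used
      // for reading parameters afterwards.
      auto data() { return std::move(data_); }
   public:
      moving_holder_t() = default;
      moving_holder_t(Basic_Data data) : data_(std::move(data)) {}
   };

class demo_params_t final : public moving_holder_t<heavy_data_t>
   {
   public:
      demo_params_t(heavy_data_t data)
         : moving_holder_t<heavy_data_t>(std::move(data)) {}
      // Stand-in for clone_if_necessary(): the new object takes over the data.
      demo_params_t transfer() { return { this->data() }; }
      bool has_payload() const { return peek().payload_ != nullptr; }
   };

inline void check_transfer()
   {
      demo_params_t a{ heavy_data_t{ std::make_unique<int>(42) } };
      assert(a.has_payload());
      demo_params_t b = a.transfer();
      assert(b.has_payload());
      assert(!a.has_payload()); // The source was emptied, nothing was copied.
   }
```

The price of this variant is that the source object is left empty after the transformation, which is acceptable only if the chain never touches it again.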

Also, every final type (like receive_specific_params_t and select_specific_params_t) currently has to include implementations of the clone_if_necessary() methods. That is, copy-paste is still used here. Perhaps something should be devised here as well to avoid duplicating the same kind of code.

And yes, noexcept is omitted from the code to reduce the "syntactic overhead".
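One possible direction, shown purely as a sketch and not as what SObjectizer does: if C++17 is available, the base class can take the final class template as a template template parameter and implement clone_if_necessary() once with if constexpr. The names cloning_params_t, demo_data_t, demo_params_t and check_shared_clone are hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

enum class msg_count_status_t { undefined, defined };

struct demo_data_t { int to_handle_{}; };

// Hypothetical shared base: Final is the final class template itself,
// so the base can name Final<New_Status> without help from the heir.
template<
   typename Data,
   template<msg_count_status_t> class Final,
   msg_count_status_t Status >
class cloning_params_t
   {
      Data data_;
   protected:
      template<msg_count_status_t New_Status>
      decltype(auto) clone_if_necessary()
         {
            if constexpr( New_Status == Status )
               // Same status: hand back the existing object by reference.
               return static_cast<Final<Status> &>(*this);
            else
               // Different status: build a new object from the same data.
               return Final<New_Status>{ data_ };
         }
   public:
      cloning_params_t() = default;
      cloning_params_t(Data data) : data_(std::move(data)) {}

      decltype(auto) handle_n(int v)
         {
            data_.to_handle_ = v;
            return clone_if_necessary<msg_count_status_t::defined>();
         }
      int to_handle() const { return data_.to_handle_; }
   };

// The final type no longer repeats the two clone_if_necessary() overloads.
template<msg_count_status_t Status>
class demo_params_t final
   : public cloning_params_t<demo_data_t, demo_params_t, Status>
   {
      using base_type = cloning_params_t<demo_data_t, demo_params_t, Status>;
   public:
      using base_type::base_type;
   };

inline void check_shared_clone()
   {
      demo_params_t<msg_count_status_t::undefined> u;
      auto d = u.handle_n(3); // New object of the 'defined' type.
      static_assert(std::is_same<
            decltype(d),
            demo_params_t<msg_count_status_t::defined> >::value,
            "the status must become 'defined'");
      auto & same = d.handle_n(7); // Same object, no copy.
      assert(&same == &d);
      assert(d.to_handle() == 7);
   }
```

Whether this is simpler than two SFINAE overloads per final type is a matter of taste; it trades the duplication for a template template parameter in the base.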

That's all

The source code of the minimal example discussed here can be found here. You can also play with it in an online compiler, for example, here (try commenting out the call to handle_all() on line 163 and see what happens).

I don't want to claim that the approach I implemented is the only correct one. But, first, the only alternative I saw was copy-paste. And, second, it turned out not to be difficult at all and, fortunately, did not take much time. And the compiler's slaps helped a lot right away, as the old tests and examples were being adapted to the new features of the latest version of SObjectizer.

So, as for me, C++ has once again confirmed that it is complex. But it is complex not for its own sake, but to give the developer more possibilities. And I won't be surprised if all this could be achieved in modern C++ in an even simpler way than mine.

PS. If any readers follow SObjectizer, I can say that the new version 5.6, which substantially breaks compatibility with the 5.5 branch, is already quite alive. You can find it on BitBucket. The release is still a long way off, but SObjectizer-5.6 is already what it was meant to be. You can take it, try it and share your impressions.
