Eggs.Variant - Part I

Original author: K-ballo
  • Transfer
To publish this translation, I was motivated by a comment by @encyclopedist on a recent article , Factory Method Without Dynamic Memory Allocation . The article interested me, but a quick googling did not reveal a translation. “The mess.” - I thought - “Such an interesting article on C ++ is not translated into Russian. We need to fix it. ”

Table of contents
  1. Introduction
  2. Design
  3. Implementation

  4. What is not said yet


Reflections on the development of Eggs.Variant  , a generic type-safe markup union in C ++ 11/14 .

Introduction


A union  is a special type of class that only one of its non-static members can store at one time. It takes up as much space as is required to accommodate the largest of its members.
9 [class] / 5 A union is a class defined with a keyword union; at the same time, he can keep only one of his members (9.5). [...]
9.5 [class.union] / 1 Only one of the non-static members can be active in the union, that is, at a given moment in the union the value of only one of its non-static members can be stored. [...] The size of the pool is sufficient to accommodate the largest of its non-static members. Each non-static member is stored in memory as if it were the only member of the structure. All non-static members of a union object have the same address.

Original
9 [class] / 5 A union is a class defined with the class-key union; it holds at most one data member at a time (9.5). [...]
9.5 [class.union] / 1 In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time. [...] The size of a union is sufficient to contain the largest of its non-static data members. Each non-static data member is allocated as if it were the sole member of a struct. All non-static data members of a union object have the same address.





In C ++ 98 , union members are limited to trivial object types. For these types, their life time begins when the repository is received and ends when it is reused or released.
3.8 [basic.life] / 1 [...] The lifetime of an object of type Tbegins when:
  • a repository with type matching Tand size is obtained and
  • initialization of the object, if it is nontrivial, is completed.

The lifetime of an object of type Tends when:
  • if it Tis a class type with a nontrivial destructor (12.4), a call to the destructor begins, or
  • The storage that the object occupies has been reused or freed.


Original
3.8 [basic.life] / 1 [...] The lifetime of an object of type Tbegins when:
  • storage with the proper alignment and size for type Tis obtained, and
  • if the object has non-trivial initialization, its initialization is complete.

The lifetime of an object of type Tends when:
  • if Tis a class type with a non-trivial destructor (12.4), the destructor call starts, or
  • the storage which the object occupies is reused or released.



This special guarantee allows you to change the active member of the association simply by assigning a new value to the association, allowing you to effectively reuse the storage - which is in good agreement, if not with the letter, then with the spirit of the standard.
In addition, the association does not know which member, if any, is active, so its special member functions must be implemented without knowing this information. Since union members are limited to trivial types, special member functions can be implemented in terms of the bytes underlying the union, independent of the active member.
9.5 [class.union] / 1 [...] A class object with a nontrivial constructor (12.1), a nontrivial copy constructor (12.8), a nontrivial destructor (12.4), or a nontrivial copy assignment operator (13.5.3, 12.8) cannot be a member unions, or an array element lying in the union. [...]

Original
9.5 [class.union] / 1 [...] An object of a class with a non-trivial constructor (12.1), a non-trivial copy constructor (12.8), a non-trivial destructor (12.4), or a non -trivial copy assignment operator (13.5.3, 12.8) cannot be a member of a union, nor can an array of such objects. [...]


In C ++ 11, this restriction was removed; union members can now be of any type. Switching between non-trivial members requires explicitly destroying the current active member and using the placement operatornew to construct the new active member.
9.5 [class.union] / 4 [Note: in the general case, it is necessary to use one explicit call to the destructor and one explicit call to the host operator newto change the active member of the enumeration. —The end of the note] [Example: Consider an object uhaving an enumeration type Uwith non-static members of mtype Mand ntype N. If it Mhas a non-trivial destructor and Nhas a non-trivial constructor (for example, if they are declared or inherit virtual functions), the active member ucan be safely changed from mto nby the following use of the destructor and the placement operator new:
u.m.~M();
new (&u.n) N;

- end of example]

Original
9.5 [class.union] / 4 [Note: In general, one must use explicit destructor calls and placement new operators to change the active member of a union. –End note] [Example: Consider an object uof a union type Uhaving non-static data members mof type Mand nof type N. If Mhas a non-trivial destructor and Nhas a non-trivial constructor (for instance, if they declare or inherit virtual functions), the active member of ucan be safely switched from mto nusing the destructor and placement new operator as follows:
u.m.~M();
new (&u.n) N;

—End example]


If a special member function of any of the members of the association is nontrivial, then the special member function of the association itself will be implicitly defined as deleted, unless it was provided by the user himself.
9.5 [class.union] / 2 [Note: if any non-static member of the union has a non-trivial default constructor (12.1), copy constructor (12.8), move constructor (12.8), copy assignment operator (12.8), move assignment operator (12.8) or a destructor (12.4), the corresponding member function of the union must be provided by the user, or it will be implicitly removed (8.4.3) from the union. - end of note]
9.5 [class.union] / 3 [Example: consider the following union:
union U {
  int i;
  float f;
  std::string s;
};

Since type std::string(21.3) declares non-trivial versions of all special member functions, the Udefault constructor, copy / move constructors, copy / move assignment operators, and destructor will be implicitly removed from the type . To use a type, Usome of these member functions must be provided by the user. - end of example]

Original
9.5 [class.union] / 2 [Note: If any non-static data member of a union has a non-trivial default constructor (12.1), copy constructor (12.8), move constructor (12.8), copy assignment operator (12.8) , move assignment operator (12.8), or destructor (12.4), the corresponding member function of the union must be user-provided or it will be implicitly deleted (8.4.3) for the union. –End note]
9.5 [class.union] / 3 [Example: Consider the following union:
union U {
  int i;
  float f;
  std::string s;
};

Since std::string(21.3) declares non-trivial versions of all of the special member functions, Uwill have an implicitly deleted default constructor, copy / move constructor, copy / move assignment operator, and destructor. To use U, some or all of these member functions must be user-provided. —End example]


Эти нетривиальные функции-члены могут быть предоставлены — с соблюдением их обычной семантики — только если будет известен активный член перечисления, чтобы его можно было передать дальше. Размеченное объединение — это объединение или класс, подобный объединению, которое обладает знанием о себе, то есть, оно содержит некоторый идентификатор, который позволяет узнать, какой член — если он есть — сейчас активен. Размеченное объединение может предоставлять все специальные функции-члены независимо от их тривиальности.
Экземпляр класса eggs::variant является размеченным объединением объектов с типами Ts. Он обеспечивает естественный интерфейс для переключения активного члена и предоставляет все специальные функции-члены с их обычной семантикой:
eggs::variants u; // u не имеет активного члена
u = M{}; // теперь активный член u имеет тип M
u = N{}; // теперь активный член u имеет тип N, предыдущий активный член был разрушен
// предоставляются все специальные функции-члены
using U = eggs::variants;


Design


The ultimate goal of design is to generalize and improve the design of a marked union without compromising its functionality. That is, it should not have a special member to select the current active member of the association, or it should take up very little space:
struct U {
  union { T0 m0; ...; TN mN; };
  std::size_t which;
} u;

replaced by this:
using V = eggs::variant;
V v;

In particular:
  • The type size Vmust match the size of the corresponding type U. Any active member vmust be located in the area Vappropriately aligned for types T0, ... TN; the use of additional storage, for example, dynamic memory, is not allowed.
  • Semantics vshould match or improve well-defined semantics u. An interface vshould not allow undefined behavior, for example, a reference to an inactive member u.
  • A type Vmust provide all special member functions with their expected semantics.

Basically, the interface is based on how it is defined in the Technical specification of the main library . The conceptual model consists of a labeled union of types and . Design decisions made for are easily transferable to , whose conceptual model consists of a marked combination of types and those that are hidden behind . The semantics of all special member functions and associated operators, as well as the interface for switching an active member — through construction, assignment, or placement — are borrowed from . Access to the active member is done in the stylestd::experimental::optionaloptionalnullopt_tToptionalvariantnullvariant_tTsoptional
std::function, which returns a pointer to the target, if requested with the correct type of target - something like dynamic_castfor the poor. In addition, it also allows you to get a null pointer to the active member (if any), which turned out to be very useful to simplify the implementation of auxiliary functions.
Finally, helper classes are provided, similar std::tuple, as well as access to elements by index or by type - albeit with non-obvious semantics closer to the semantics of casting at runtime.
Reference documentation can be found here .

Implementation


A direct implementation as storage would use the underlying relaxed pool :variant
template 
union storage {
  nullvariant_t nullvariant;
  Ts... members;
};

However, the above code is not correct from the point of view of C ++ syntax , since the set of template parameters cannot be expanded in this context - it must contain member names so that they can be referenced. Instead, a recursive approach should be taken to build the underlying storage:
template 
union storage;
template 
union storage {
  nullvariant_t nullvariant;
  T head;
  storage tail;
};
template <>
union storage<> {
  nullvariant_t nullvariant;
};

Unfortunately, this is not so simple, given that any non-trivial special member function for a type from Tsresults in the removal of the corresponding special member function from the repository. In order for this implementation to be used, at least the default constructor and destructor must be implemented for the resulting list, although the destructor cannot do anything useful.
The simplest implementation, which was used before relaxed associations appeared in C ++ , would use bare storage, suitable for storing any of the types in Ts - attention, spoiler: in some cases, it will not work. The standard even provides a special property to facilitate our work:
10.20.7.6 [meta.trans.other]
template 
  struct aligned_union;

  • Condition: At least one type must be provided.
  • Comments: typedef per member type must be a POD type applicable for use as uninitialized storage for any object whose type is listed Types; its size must be at least Len. The static member alignment_valuemust be an integer type constant std::size_twhose value defines the strictest alignment for all types listed in the list Types.


Original
10.20.7.6 [meta.trans.other]
template 
  struct aligned_union;

  • Condition: At least one type is provided.
  • Comments: The member typedef type shall be a POD type suitable for use as uninitialized storage for any object whose type is listed in Types; its size shall be at least Len. The static member alignment_valueshall be an integral constant of type std::size_twhose value is the strictest alignment of all types listed in Types.



It should be noted that this property has already been removed from the working draft  - along with the advent of relaxed associations - and is now a potential candidate for obsolescence. A possible replacement in C ++ 14 might be:
template 
struct aligned_union {
  static constexpr std::size_t alignment_value = std::max({alignof(Types)...});
  struct type {
    alignas(alignment_value) unsigned char _[std::max({Len, sizeof(Types)...})];
  };
};

Using aligned_unionas a type for storage, a simplified version can be implemented as follows:variant
template 
class variant {
  template  struct _index_of { /*...*/ }; // индекс T в Ts..., начинающийся с 0
public:
  static constexpr std::size_t npos = std::size_t(-1);
  variant() noexcept
    : _which{npos}
  {}
  template 
  variant(T const& v)
    : _which{_index_of::value}
  {
    new (target()) T(v); // Конструирует тип T в хранилище при помощи
                            // размещающего оператора new
  }
  /*...*/
  std::size_t which() const noexcept {
    return _which;
  }
  template 
  T* target() noexcept {
    return _which == _index_of::value ?
      static_cast(static_cast(&_storage)) : nullptr;
  }
  template 
  T const* target() const noexcept {
    return _which == _index_of::value ?
      static_cast(static_cast(&_storage)) : nullptr;
  }
private:
  std::size_t _which;
  typename std::aligned_union<0, Ts...>::type _storage;
};

Special member functions must be redirected to the active member (if any). And again, we can’t just use the instruction switchto achieve this goal - although if we could do this, it is unlikely that everything would be significantly simplified - and immediate substitution would have a recursive implementation:
struct _destructor {
  template 
  static void call(void* ptr) {
    static_cast(ptr)->~T();
  }
};
variant::~variant() {
  apply<_destructor, Ts...>(_which, &_storage);
}
template 
void apply(std::size_t /*which*/, void* /*storage*/) {}
template 
void apply(std::size_t which, void* storage) {
  if (which == 0) { F::template call(storage); }
  else { apply(which - 1, storage); }
}

A non-recursive implementation of a method applycan be constructed using a transition table, similar to how an instruction does it switch, followed by a transition to a suitable record:
template 
void apply(std::size_t which, void* storage) {
  using fun_ptr = void(*)(void*);
  static constexpr fun_ptr table[] = {&F::template call...};
  if (which < sizeof...(Ts)) { table[which](storage); }
}

A non-recursive implementation not only provides faster generated code, but also significantly speeds up compilation with the case of a large number of types in the list Ts - in your case, the estimate can change.

Trivially copied types


A trivially copied type is one that can be copied by copying its constituent bits - that is, with std::memcpy.
3.9 [basic.types]/2 Для любого объекта (кроме подобъектов базового класса) тривиально копируемого типа T, содержат ли они или нет допустимые значения типа T, составляющие его байты (1.7) можно скопировать в массив char или unsigned char. Если содержимое массива char или unsigned char скопировать обратно в объект, в результате объект должен получить своё первоначальное значение. [...]
3.9 [basic.types]/3 Для любого тривиально копируемого типа T, если два указателя на T указывают на различные объекты obj1 и obj2 типа T, где либо obj1, либо obj2 являются подобъектами базового класса и составляющие obj1 байты (1.7) копируются в obj2, в результате obj2 должен содержать тоже самое значение, что и obj1. [...]

Оригинал
3.9 [basic.types]/2 For any object (other than a base-class subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes (1.7) making up the object can be copied into an array of char or unsigned char. If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value. [...]
3.9 [basic.types]/3 For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a base-class subobject, if the underlying bytes (1.7) making up obj1 are copied into obj2, obj2shall subsequently hold the same value as obj1. [...]


The union of trivially copied members is itself trivially copied, which puts it in candidates for additional optimization, which cannot be done for a nontrivially copied type. It follows that variantwith trivially copied types it should also tend to be trivially copied.
9 [class] / 6 A trivially copied class is a class that:
  • does not have non-trivial copy constructors (12.8),
  • has no non-trivial displacement constructors (12.8),
  • no nontrivial copy assignment operators (13.5.3, 12.8),
  • has no nontrivial moving assignment operators (13.5.3, 12.8) and
  • has a trivial destructor (12.4).

[...]

Original
9 [class]/6 A trivially copyable class is a class that:
  • has no non-trivial copy constructors (12.8),
  • has no non-trivial move constructors (12.8),
  • has no non-trivial copy assignment operators (13.5.3, 12.8),
  • has no non-trivial move assignment operators (13.5.3, 12.8), and
  • has a trivial destructor (12.4).

[...]


To achieve this, a separate specialization must be selected variantif all types in Tsare trivially copied. The special member functions listed above should not be provided by the user for this specialization - they should either be provided implicitly or explicitly specified when they are first defined, as generated by default. The implementation will provide them with implicit definitions that will be trivial; copy and move operators will simply copy the storage bits containing them along with the discriminator, but the destructor will do nothing.
But there is one catch: a trivially copied class can be uncopyable at all, although the special member function that was deleted when it was first defined is trivial. Take a look, for example, at how a class is defined boost::noncopyable:
class noncopyable {
protected:
  constexpr noncopyable() = default;
  noncopyable(noncopyable const&) = delete;
  noncopyable& operator=(noncopyable const&) = delete;
  ~noncopyable() = default;
};

It may come as a surprise to what is being std::is_trivially_copyablededuced truefor the class noncopyable. An even bigger surprise may be that the instance can be successfully copied, since remote member functions are not used at all. This, in essence, a violation of type safety stems from the decision to use untyped bare storage to store the active member.variantnoncopyable

Trivially destructible types


Another important category of types are trivially destructible types.
12.4 [class.dtor]/5 [...] Деструктор является тривиальным, если он не предоставлен пользователем и если:
  • он не является виртуальным,
  • все непосредственные базовые классы даного класса имеют тривиальные деструкторы и
  • для всех нестатических членов данного класса, имеющих своим типом класс (или массив классов), каждый такой класс имеет тривиальный деструктор.

В противном случае деструктор является нетривиальным.

Оригинал
12.4 [class.dtor]/5 [...] A destructor is trivial if it is not user-provided and if:
  • the destructor is not virtual,
  • all of the direct base classes of its class have trivial destructors, and
  • for all of the non-static data members of its class that are of class type (or array thereof), each such class has a trivial destructor.

Otherwise, the destructor is non-trivial.


This is especially important because one of the requirements for the literal type  — whose instances can be used in the context of an constexprexpression — is its trivial destructibility.
3.9 [basic.types] / 10 A type is a literal type if it is:
  • [...]
  • class type (clause 9), possessing all of the following properties:
    • he has a trivial destructor,
    • it is a composite type (8.5.1) or has at least one constexprconstructor or template constructor that is not a copy or move constructor and
    • all its non-static members and base classes are immutable literal types.



Original
3.9 [basic.types] / 10 A type is a literal type if it is:
  • [...]
  • a class type (Clause 9) that has all of the following properties:
    • it has a trivial destructor,
    • it is an aggregate type (8.5.1) or has at least one constexpr constructor or constructor template that is not a copy or move constructor, and
    • all of its non-static data members and base classes are of non-volatile literal types.




A union can be a literal type, provided that at least one of its members is a literal type and the remaining members are trivially destructible. It follows that variantunder these conditions it should also strive to be a literal type. Another specialization variantshould be selected if all types in Tsare trivially destructible. However, it is not so useful, because among the restrictions on constant expressions there is a restriction on casting a pointer voidto a pointer to an object:
5.19 [expr.const] / 2 A conditional expression eis the core of a constant expression only if the calculation efollowing the rules of the abstract machine (1.9) cannot be calculated in one of the following expressions:
  • [...]
  • conversion from type cv void*to type of a pointer to an object;
  • [...]


Original
5.19 [expr.const] / 2 A conditional-expression eis a core constant expression unless the evaluation of e, following the rules of the abstract machine (1.9), would evaluate one of the following expressions:
  • [...]
  • a conversion from type cv void*to a pointer-to-object type;
  • [...]



And again, the decision to use untyped raw storage limits the completeness of the implementation of the generalized labeled pool to the same level that can be achieved by implementing it manually.

What is not said yet


The implementation on the basis of raw storage - although it justifies its price - does not hold water, it is worth going beyond the usual framework. The storage of values ​​in a generalized type-safe marked-up union must be a regular union, otherwise it will not work to use all its functionality.

Also popular now: