Test Data Generator for C++


    Anyone who unit-tests code runs into the question of test data sooner or later. Sometimes a few hard-coded variables are enough; in other cases you need large amounts of random data. In the managed world, generating user-defined types is not a problem (take AutoFixture, for example), but in C++ it often brings pain and suffering (correct me if I am wrong). Not long ago I got acquainted with the wonderful boost::di library, and under its influence an idea began to take shape: a library that lets C++ programmers generate user-defined types filled with random values, without describing those types in advance. The result looks something like this:

    struct dummy_member{
        float a;
        int b;
    };
    struct dummy{
        explicit dummy(dummy_member val, std::string c) : val_(val), c_(c) {}
        dummy_member val_;
        std::string c_;
    };
    int main(int argc, char* argv[]){
        auto d = datagen::random<dummy>();
        return 0;
    }

    Link to the code. It is a header-only library and requires C++14. Anyone interested, read on.

    Key library features

    • Built-in Type Generation
    • Generation of user-defined types (that is, anything not built in)
    • Limiters on the set of generated values

    Built-in Type Generation

    Generation of built-in types (char, wchar_t, and so on) is naturally supported. Integer types are generated simply as a set of random bits; float and double are generated as the sum of a random integer (int32_t and int64_t, respectively) and a random value in the range from -1 to 1. A bool value is produced by comparing two random integers.

    std::cout << "The answer to the question of everything is:" << datagen::random<int>() << std::endl;
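    The scheme described above can be sketched with nothing but <random>. This is a minimal illustration and not the library's code: the function names are hypothetical, std::mt19937_64 stands in for the library's entropy source, and the use of "<" for the bool comparison is an assumption, since the article does not say which comparison is used.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <random>
#include <type_traits>

// All names below are hypothetical; they are not datagen's API.

// Integer: reinterpret random bits as the requested integer type.
template <class Int>
Int random_bits(std::mt19937_64& rng) {
    static_assert(std::is_integral<Int>::value, "integer types only");
    static_assert(sizeof(Int) <= sizeof(std::uint64_t), "at most 64 bits");
    const std::uint64_t bits = rng();
    Int value;
    std::memcpy(&value, &bits, sizeof(value));  // copy sizeof(Int) random bytes
    return value;
}

// float: a random int32_t plus a random fraction from [-1, 1).
inline float random_float(std::mt19937_64& rng) {
    std::uniform_real_distribution<float> fraction(-1.0f, 1.0f);
    const float whole = static_cast<float>(random_bits<std::int32_t>(rng));
    return whole + fraction(rng);
}

// double: a random int64_t plus a random fraction from [-1, 1).
inline double random_double(std::mt19937_64& rng) {
    std::uniform_real_distribution<double> fraction(-1.0, 1.0);
    const double whole = static_cast<double>(random_bits<std::int64_t>(rng));
    return whole + fraction(rng);
}

// bool: compare two random integers (the choice of "<" is an assumption).
inline bool random_bool(std::mt19937_64& rng) {
    const int lhs = random_bits<int>(rng);
    const int rhs = random_bits<int>(rng);
    return lhs < rhs;
}
```

    Building a float this way covers the whole int32_t range of magnitudes rather than just [0, 1), which is usually what a test wants from "any float".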

    Generation of custom types

    Custom types are generated with the same idea as in boost::di (thanks to its author): you can write a universal type, any_type, that is implicitly convertible to any other type (with rare exceptions). Add a LITTLE template code and you get something that generates custom types by the following means:

    1. A user-defined generation algorithm.
    2. Generation through the public constructor with the largest number of parameters, as in the example at the beginning of the article (struct dummy).
    3. Generation through {}-initialization, again as in the first example (struct dummy_member).
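    The any_type trick behind items 2 and 3 can be sketched as follows. This is an illustration of the general technique under my own assumptions, not the library's implementation: every name in namespace sketch is hypothetical, the conversion operator returns a value-initialized T where a real generator would return a random one, and the search limit of 10 arguments is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {

// Implicitly convertible to any type except Parent itself, so the copy and
// move constructors of Parent are never mistaken for a one-argument match.
template <class Parent>
struct any_type {
    template <class T,
              class = std::enable_if_t<!std::is_same<std::decay_t<T>, Parent>::value>>
    operator T() const { return T{}; }  // placeholder: a real generator returns a random T
};

// any_arg<T, I> is any_type<T> for every I; it only helps expand a pack.
template <class T, std::size_t>
using any_arg = any_type<T>;

// True when T{any, any, ...} with sizeof...(I) arguments is well-formed.
// Brace syntax covers both constructors and aggregate initialization.
template <class T, class Seq, class = void>
struct brace_constructible : std::false_type {};

template <class T, std::size_t... I>
struct brace_constructible<T, std::index_sequence<I...>,
                           decltype(void(T{any_arg<T, I>{}...}))> : std::true_type {};

// Largest N (searching downward from the start value) that T accepts.
template <class T, std::size_t N,
          bool = brace_constructible<T, std::make_index_sequence<N>>::value>
struct max_arity : std::integral_constant<std::size_t, N> {};

template <class T, std::size_t N>
struct max_arity<T, N, false> : max_arity<T, N - 1> {};

template <class T>
struct max_arity<T, 0, false> {};  // T cannot be built this way at all

template <class T, std::size_t... I>
T make_impl(std::index_sequence<I...>) {
    return T{any_arg<T, I>{}...};
}

// Builds T through its widest constructor or aggregate initializer.
template <class T, std::size_t Limit = 10>
T make() {
    return make_impl<T>(std::make_index_sequence<max_arity<T, Limit>::value>{});
}

// The two types from the example at the beginning of the article.
struct dummy_member {
    float a;
    int b;
};

struct dummy {
    explicit dummy(dummy_member val, std::string c) : val_(val), c_(std::move(c)) {}
    dummy_member val_;
    std::string c_;
};

}  // namespace sketch
```

    With these definitions, sketch::make<sketch::dummy>() finds the two-argument constructor and sketch::make<sketch::dummy_member>() finds the two-field aggregate initializer, without either type being described anywhere.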

    To generate objects with a user-defined procedure, you need to partially or fully specialize the template:

    template<> struct datagen::value_generation_algorithm<TType> { 
        TType get_random(random_source_base&); 
    };

    This makes it possible to turn generation parameters into members of the class, which in turn lets you influence how the type is generated. For example, the generation algorithm for std::basic_string looks like this:

    namespace datagen{
        template <class CharType, class Traits, class Allocator>
        struct value_generation_algorithm<std::basic_string<CharType, Traits, Allocator>>{
            using string_t = std::basic_string<CharType, Traits, Allocator>;
            string_t get_random(random_source_base& r_source){...};
            size_t min_size{0};   // lower bound on the generated length
            size_t max_size{30};  // upper bound on the generated length
            std::basic_string<CharType> alphabet{"abcd...6789"};  // characters to draw from
        };
    }
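    To show how such member parameters steer generation, here is a self-contained sketch of the same pattern. It does not use datagen: random_source is a stand-in for random_source_base (whose interface the article does not show), the body of get_random is my guess at a reasonable implementation, and the alphabet is a placeholder of my own.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

namespace sketch {

// Stand-in for the library's random_source_base; its real interface is unknown.
struct random_source {
    std::mt19937 engine{42};
    // Uniform integer in the closed range [lo, hi].
    std::size_t uniform(std::size_t lo, std::size_t hi) {
        assert(lo <= hi);
        return std::uniform_int_distribution<std::size_t>(lo, hi)(engine);
    }
};

// Primary template is left undefined; each supported type specializes it.
template <class T>
struct value_generation_algorithm;

template <>
struct value_generation_algorithm<std::string> {
    // Generation parameters are plain members, so callers can change them.
    std::size_t min_size{0};
    std::size_t max_size{30};
    std::string alphabet{"abcdefghijklmnopqrstuvwxyz0123456789"};  // placeholder

    std::string get_random(random_source& src) {
        const std::size_t length = src.uniform(min_size, max_size);
        assert(length == 0 || !alphabet.empty());
        std::string out(length, '\0');
        for (char& ch : out) {
            ch = alphabet[src.uniform(0, alphabet.size() - 1)];
        }
        return out;
    }
};

}  // namespace sketch
```

    Setting min_size, max_size, or alphabet on an instance before calling get_random changes what comes out.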

    Limiters on the set of generated values

    The library supports limiters on generated values, for example:

    std::cout << "The answer to the question of everything is:" << random<int>(between(42,42)) << std::endl;

    There are 2 types of limiters:

    1. Limiters on the generation algorithm. They let you change the parameter values in the value_generation_algorithm<T> class.
    2. Limiters (correctors, rather) on an already generated value.

    They can be used in two ways:

    1. Pass them as a parameter to the random function, as in the example above. They are then applied only to the current algorithm or value.
    2. Create a scoped_limit from them and apply it to a set of types. The limiter then applies to all the specified types, through the entire depth of the generated type tree, for the lifetime of the scoped_limit.
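    The lifetime-bound behaviour of a scoped limiter is easiest to picture as an RAII guard. The sketch below is only an illustration of that idea, reduced to int values: the class and function names are hypothetical, the global registry is my assumption about how active limiters might be stored, and the real scoped_limit works on a set of types rather than on int alone.

```cpp
#include <cassert>
#include <functional>
#include <random>
#include <utility>
#include <vector>

namespace sketch {

// Correctors that are active right now (an assumed storage scheme).
inline std::vector<std::function<void(int&)>>& active_limits() {
    static std::vector<std::function<void(int&)>> limits;
    return limits;
}

// RAII guard: the corrector is active from construction to destruction.
class scoped_limit {
public:
    explicit scoped_limit(std::function<void(int&)> corrector) {
        active_limits().push_back(std::move(corrector));
    }
    ~scoped_limit() { active_limits().pop_back(); }
    scoped_limit(scoped_limit const&) = delete;
    scoped_limit& operator=(scoped_limit const&) = delete;
};

// Generates an int and lets every active scoped corrector adjust it.
inline int random_int(std::mt19937& rng) {
    int value = std::uniform_int_distribution<int>(-1000, 1000)(rng);
    for (auto const& fix : active_limits()) {
        fix(value);
    }
    return value;
}

}  // namespace sketch
```

    Every call to sketch::random_int made while a guard is alive sees its corrector, including calls made deep inside the generation of a larger object, and the corrector disappears when the guard goes out of scope.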

    To create a custom limiter, declare a limiter structure or class and implement one or both of these functions:

    struct dummy_algorithm_limit{};
    struct dummy_value_limit{};
    namespace datagen{
        namespace limits{
            void adjust_algorithm(random_source_base&, dummy_algorithm_limit const& l, value_generation_algorithm<dummy>& a){
                // here you can adjust the generation parameters for dummy
            }
            void adjust_value(random_source_base&, dummy_value_limit const& l, dummy& a){
                // here you can adjust the generated dummy value
            }
        }
    }

    The limiters are applied in the following order:

    1. scoped_limit for algorithms
    2. parametric limiters on the algorithm
       (the object tree is generated here)
    3. scoped_limit for values
    4. parametric limiters on values
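    The ordering above, restricted to parametric limiters, can be sketched like this. It is an illustration under my own assumptions rather than the library's code: all names in namespace sketch are stand-ins, between and even are simplified re-implementations for int only, and the generic no-op overloads are my guess at how a limiter can get away with implementing just one of the two hooks.

```cpp
#include <cassert>
#include <random>

namespace sketch {

struct random_source {  // stand-in for random_source_base
    std::mt19937 engine{7};
};

// Stand-in generation algorithm for int with two tunable parameters.
struct int_algorithm {
    int min{-1000};
    int max{1000};
    int get_random(random_source& src) {
        assert(min <= max);
        return std::uniform_int_distribution<int>(min, max)(src.engine);
    }
};

struct between { int lo; int hi; };  // limiter on the algorithm
struct even {};                      // corrector of the generated value

// Generic fallbacks: a limiter that does not implement a hook is a no-op there.
template <class Limit, class Algo>
void adjust_algorithm(random_source&, Limit const&, Algo&) {}
template <class Limit, class Value>
void adjust_value(random_source&, Limit const&, Value&) {}

// between narrows the range before generation.
inline void adjust_algorithm(random_source&, between const& l, int_algorithm& a) {
    a.min = l.lo;
    a.max = l.hi;
}

// even rounds an odd value toward zero after generation.
inline void adjust_value(random_source&, even const&, int& v) {
    v -= v % 2;
}

template <class... Limits>
int random_int(random_source& src, Limits const&... limits) {
    int_algorithm algo;
    // First, every limiter gets a chance to tune the algorithm.
    int tune[] = {0, (adjust_algorithm(src, limits, algo), 0)...};
    // Then the value is generated.
    int value = algo.get_random(src);
    // Finally, every limiter gets a chance to correct the value.
    int fix[] = {0, (adjust_value(src, limits, value), 0)...};
    (void)tune;
    (void)fix;
    return value;
}

}  // namespace sketch
```

    Because correctors run last, a corrector can move a value that an algorithm limiter had already constrained: sketch::random_int(src, sketch::between{10, 21}, sketch::even{}) yields even numbers from 10 to 20, since a generated 21 is rounded down.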

    General information

    The source of entropy in the library is the random_source_impl class, which uses <random>. You can override it at compile time by providing a specialization of the random_source_instance<int> structure.
    So far, generation is implemented for the following STL containers (the ones I actually need for work):

    • std::array
    • std::map
    • std::set
    • std::string
    • std::vector

    a few types from Boost:

    • boost::asio::ip::address (v4, v6)
    • boost::optional
    • boost::posix_time::ptime
    • boost::posix_time::time_duration

    Limiters for them:

    • between, for built-in types and more
    • greater_than, less_than, odd, even
    • container_size::between, container_size::less_than, etc.
    • alphabet::consists_of, alphabet::does_not_contain

    Tested with the msvc-14.0 compiler; requires C++14. Unfortunately, gcc behaves a little differently, so the library did not compile under MinGW (gcc-6.3.0), but I think those who work with it regularly can fix that quickly.
    The library is in the public domain. Ideas and implementations of new types are welcome.


    And how do you generate random data for C++ unit tests?
