Writing Your Mapper for .NET Standard 2.0

In today's post I would like to tell you about a short adventure in writing my own mapper for .NET Standard 2.0. A link to GitHub and benchmark results are included.

I think it is no secret to any of you what a mapper is and what it is for. At practically every step of our work we run into mapping (or transforming) data from one type to another: mapping records from the repository to the domain model, mapping a remote service's response to a view model and only then to the domain model, and so on. Abstraction layers often have their own input and output data formats at their boundaries, and it is where those layers interact that a mapper can show itself in all its glory, saving the developer significant time and effort and, as a result, taking a share of the overall system performance for itself.

Based on this, one can describe the MVP requirements:

Speed of work (less performance & memory impact);
Ease of use (clean & easy to use API).

As for the first point, BenchmarkDotNet and a thoughtful implementation, not without optimizations, will help us here. For the second, I wrote a simple unit test that, in a way, serves as documentation of our mapper's API:

[TestMethod]
public void WhenMappingExist_Then_Map()
{
    var dto = new CustomerDto
    {
        Id = 42,
        Title = "Test",
        CreatedAtUtc = new DateTime(2017, 9, 3),
        IsDeleted = true
    };
    mapper.Register<CustomerDto, Customer>();
    var customer = mapper.Map<CustomerDto, Customer>(dto);
    Assert.AreEqual(42, customer.Id);
    Assert.AreEqual("Test", customer.Title);
    Assert.AreEqual(new DateTime(2017, 9, 3), customer.CreatedAtUtc);
    Assert.AreEqual(true, customer.IsDeleted);
}

In total, we need to implement only 2 simple methods:

void Register<TSource, TDest>();
TDest Map<TSource, TDest>(TSource source);

Registration

In fact, registration could be performed the first time the Map method is called, which would make a separate step superfluous. However, I made it a separate method for the following reasons:

Validation. If there is no default constructor (or the destination type cannot be mapped for some other reason), I believe this should be reported as early as possible, at the configuration stage, in keeping with the fail-fast principle. Otherwise, the failure to create an instance of the type may catch up with us at run time, in the middle of infrastructure code or business logic;
Extensibility. At the moment the API is extremely simple and maps by naming convention under the hood. However, it is likely that we will soon want to introduce rules for mapping particular fields, where the assigned value may be the result of a method call. In that case, such a separation seems quite logical to me in light of the single responsibility principle.
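
The fail-fast check described in the first point could look roughly like the following. This is a minimal sketch, not the project's actual code: the `RegistrationChecks` class and the `EnsureCanCreate` method are hypothetical names introduced only for illustration.

```csharp
using System;

// Hypothetical helper (not part of FsMapper): validates a destination type
// at configuration time, so that a bad registration fails immediately.
public static class RegistrationChecks
{
    public static void EnsureCanCreate<TDest>()
    {
        var type = typeof(TDest);

        // Interfaces and abstract classes can never be instantiated.
        if (type.IsAbstract)
            throw new InvalidOperationException(
                $"{type} is abstract and cannot be used as a mapping destination.");

        // Value types always have a default value; reference types need
        // a public parameterless constructor.
        if (!type.IsValueType && type.GetConstructor(Type.EmptyTypes) == null)
            throw new InvalidOperationException(
                $"{type} has no public parameterless constructor.");
    }
}
```

Calling such a check from Register means that a type like string, which has no parameterless constructor, is rejected when the mapper is configured rather than on the first call to Map.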

While Map is the main method of any mapper and accounts for the lion's share of execution time, Register, by contrast, is called only once per type pair at the configuration stage. That makes it an excellent candidate for all the necessary "heavy" work: generating the optimal mapping execution plan and then caching the result.

Thus, its implementation should include:

Construction of a plan for creating and initializing an instance of the required type;
Caching Results.

Execution plan

In C#, there are not many ways to create and initialize an instance of a type at run time, and the higher the level of abstraction of an approach, the slower it is at run time. I had already faced a similar choice in another small project of mine, FsContainer, so the following results came as no surprise.

BenchmarkDotNet=v0.10.9, OS=Windows 8.1 (6.3.9600)
Processor=Intel Core i5-5200U CPU 2.20GHz (Broadwell), ProcessorCount=4
Frequency=2143473 Hz, Resolution=466.5326 ns, Timer=TSC
.NET Core SDK=2.0.0
  [Host]     : .NET Core 2.0.0 (Framework 4.6.00001.0), 64bit RyuJIT
  DefaultJob : .NET Core 2.0.0 (Framework 4.6.00001.0), 64bit RyuJIT

|                      Method |       Mean |     Error |    StdDev |     Median |
|---------------------------- |-----------:|----------:|----------:|-----------:|
| ExpressionCtorObjectBuilder |   8.548 ns | 0.2764 ns | 0.4541 ns |   8.608 ns |
|     ActivatorCreateInstance |  79.379 ns | 1.6812 ns | 3.1987 ns |  78.890 ns |
|       ConstructorInfoInvoke | 164.445 ns | 3.3355 ns | 4.3371 ns | 164.016 ns |
|    DynamicMethodILGenerator |   5.859 ns | 0.2455 ns | 0.3015 ns |   5.819 ns |
|                     NewCtor |   6.989 ns | 0.2615 ns | 0.5741 ns |   6.756 ns |

Although ConstructorInfo.Invoke and Activator.CreateInstance are quite easy to use, they trail the rest of the list by a wide margin, because their implementations rely on RuntimeType and System.Reflection. This is perfectly acceptable in everyday tasks but unsuitable for our requirements, where creating an instance of a type is the tightest performance bottleneck.

As for Expression and DynamicMethod, there are no surprises here: each produces a reference to a compiled function, which only needs to be called with the appropriate arguments.
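
To make the difference concrete, here is a small sketch that contrasts a reflection-based factory with a compiled one. The `InstanceFactories` class and its method names are assumptions made for this example; they are not the benchmark code behind the table above.

```csharp
using System;
using System.Linq.Expressions;

// Illustrative only: two ways to obtain a new instance of T at run time.
public static class InstanceFactories
{
    // Reflection-based: the runtime resolves the constructor on each call.
    public static T ViaActivator<T>() => (T)Activator.CreateInstance(typeof(T));

    // Expression-based: pay the compilation cost once, then reuse the
    // returned delegate, which behaves like a plain "() => new T()".
    public static Func<T> CompileCtor<T>() =>
        Expression.Lambda<Func<T>>(Expression.New(typeof(T))).Compile();
}
```

The delegate returned by CompileCtor is meant to be created once and stored; compiling it on every call would cost far more than either reflection-based option.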

Although the delegate compiled by generating IL code on the fly runs somewhat faster, it does not include the initialization code for the type instance. Moreover, for me personally, writing IL instructions through ilgen.Emit is a decidedly non-trivial exercise.

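// The method takes no parameters, so its signature matches Func<TDest>;
// a mismatched parameter list would make CreateDelegate throw.
var dynamicMethod = new DynamicMethod("Create_" + ctorInfo.Name, ctorInfo.DeclaringType, Type.EmptyTypes);
var ilgen = dynamicMethod.GetILGenerator();
ilgen.Emit(OpCodes.Newobj, ctorInfo); // call the constructor (assumed parameterless: no arguments are pushed)
ilgen.Emit(OpCodes.Ret);              // return the new instance
return dynamicMethod.CreateDelegate(typeof(Func<TDest>));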

That is why I settled on the implementation using Expression:

var body = Expression.MemberInit(
    Expression.New(typeof(TDest)), props
);
return Expression.Lambda<Func<TSource, TDest>>(body, orig).Compile();
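
The fragment above leaves out how `props` and `orig` are produced. One possible way to build them by naming convention is sketched below. The `ConventionMapBuilder` class is a hypothetical name, the rule of matching properties by identical name and identical type is an assumption, and the shapes of `CustomerDto` and `Customer` are inferred from the unit test earlier in the article.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Assumed shapes, inferred from the unit test in the article.
public class CustomerDto
{
    public int Id { get; set; }
    public string Title { get; set; }
    public DateTime CreatedAtUtc { get; set; }
    public bool IsDeleted { get; set; }
}

public class Customer
{
    public int Id { get; set; }
    public string Title { get; set; }
    public DateTime CreatedAtUtc { get; set; }
    public bool IsDeleted { get; set; }
}

// Hypothetical builder: compiles "src => new TDest { P1 = src.P1, ... }".
public static class ConventionMapBuilder
{
    public static Func<TSource, TDest> Build<TSource, TDest>()
    {
        // The lambda parameter that represents the source object.
        var orig = Expression.Parameter(typeof(TSource), "src");

        // Pair every publicly settable destination property with a readable
        // source property of the same name and exactly the same type.
        var props = typeof(TDest).GetProperties()
            .Where(d => d.GetSetMethod() != null && d.GetIndexParameters().Length == 0)
            .Select(d => new { Dest = d, Src = typeof(TSource).GetProperty(d.Name) })
            .Where(p => p.Src != null && p.Src.CanRead
                        && p.Src.PropertyType == p.Dest.PropertyType)
            .Select(p => (MemberBinding)Expression.Bind(
                p.Dest, Expression.Property(orig, p.Src)))
            .ToArray();

        var body = Expression.MemberInit(Expression.New(typeof(TDest)), props);
        return Expression.Lambda<Func<TSource, TDest>>(body, orig).Compile();
    }
}
```

Properties that have no counterpart in the source type are simply skipped in this sketch; a real mapper might prefer to report them at registration time.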

Caching

To cache the compiled delegate, which will later be used to perform the mapping, I chose between Dictionary and Hashtable. Looking ahead, I would note that what matters is not only the type of collection but also the type of key used for the lookup. To verify this, I wrote a separate benchmark and obtained the following results:

BenchmarkDotNet=v0.10.9, OS=Windows 8.1 (6.3.9600)
Processor=Intel Core i5-5200U CPU 2.20GHz (Broadwell), ProcessorCount=4
Frequency=2143473 Hz, Resolution=466.5326 ns, Timer=TSC
.NET Core SDK=2.0.0
  [Host]     : .NET Core 2.0.0 (Framework 4.6.00001.0), 64bit RyuJIT
  DefaultJob : .NET Core 2.0.0 (Framework 4.6.00001.0), 64bit RyuJIT

|              Method |      Mean |     Error |    StdDev |
|-------------------- |----------:|----------:|----------:|
|     DictionaryTuple |  80.37 ns | 1.6473 ns | 1.6179 ns |
| DictionaryTypeTuple |  49.35 ns | 0.6235 ns | 0.5832 ns |
|      HashtableTuple | 103.07 ns | 2.6081 ns | 2.4397 ns |
|  HashtableTypeTuple |  71.51 ns | 0.8679 ns | 0.7694 ns |

Taking this into account, we can draw the following conclusions:

Using Dictionary is preferable to Hashtable in terms of the time spent retrieving an item from the collection;
Using the TypeTuple type as the key is preferable to Tuple<Type, Type> in terms of the time spent in Equals and GetHashCode.
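
For readers who have not seen such a key type, a dedicated pair-of-types key might be written as follows. This is only a sketch of the idea: the actual TypeTuple in the project may be implemented differently, and the member names here are assumptions.

```csharp
using System;

// A sketch of a dedicated key type for (source type, destination type).
// It compares two Type references directly and combines their hash codes
// by hand, without going through the general-purpose Tuple machinery.
public readonly struct TypeTuple : IEquatable<TypeTuple>
{
    public TypeTuple(Type source, Type destination)
    {
        Source = source;
        Destination = destination;
    }

    public Type Source { get; }
    public Type Destination { get; }

    // Strongly typed Equals: Dictionary picks it up via IEquatable<T>,
    // so lookups do not box the key.
    public bool Equals(TypeTuple other) =>
        Source == other.Source && Destination == other.Destination;

    public override bool Equals(object obj) =>
        obj is TypeTuple other && Equals(other);

    public override int GetHashCode()
    {
        // Assumes both types are non-null, as they are when built from typeof.
        unchecked
        {
            return (Source.GetHashCode() * 397) ^ Destination.GetHashCode();
        }
    }
}
```

Two keys built from the same pair of types compare equal and hash identically, which is all a Dictionary needs to find the cached delegate.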

Mapping

The internal implementation of the Map method should be extremely simple and optimized, because this method accounts for 99.9% of all calls. Therefore, all we need to do is find the reference to the previously compiled delegate in the cache as quickly as possible and return the result of executing it:

public TDest Map<TSource, TDest>(TSource source)
{
    var key = new TypeTuple(typeof(TSource), typeof(TDest));
    var activator = GetMap(key);
    return ((Func<TSource, TDest>)activator)(source);
}
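
The GetMap lookup is not shown in the article, so the following sketch illustrates one way the cache and the lookup could fit together. Several things here are assumptions made to keep the example self-contained: the `CachedMapper` class name, a Register overload that accepts a ready-made delegate instead of compiling one, and a `(Type, Type)` value tuple standing in for the TypeTuple key.

```csharp
using System;
using System.Collections.Generic;

// Illustrative cache around precompiled mapping delegates.
public class CachedMapper
{
    // (Type, Type) is a stand-in for the article's TypeTuple key.
    private readonly Dictionary<(Type, Type), Delegate> _cache =
        new Dictionary<(Type, Type), Delegate>();

    // Hypothetical overload: stores an already built delegate. In the
    // article, Register compiles the delegate itself.
    public void Register<TSource, TDest>(Func<TSource, TDest> compiled)
    {
        _cache[(typeof(TSource), typeof(TDest))] = compiled;
    }

    public TDest Map<TSource, TDest>(TSource source)
    {
        var activator = GetMap((typeof(TSource), typeof(TDest)));
        return ((Func<TSource, TDest>)activator)(source);
    }

    // A single dictionary lookup; an unregistered pair is reported clearly.
    private Delegate GetMap((Type, Type) key)
    {
        if (_cache.TryGetValue(key, out var map))
            return map;

        throw new InvalidOperationException(
            $"No mapping registered for {key.Item1} -> {key.Item2}.");
    }
}
```

A plain Dictionary is not safe for concurrent writes, so this sketch assumes that all registrations complete during configuration, before Map is called from multiple threads.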

Results

To conclude, here are the final measurements against the mappers that are currently available and maintained:

BenchmarkDotNet=v0.10.9, OS=Windows 8.1 (6.3.9600)
Processor=Intel Core i5-5200U CPU 2.20GHz (Broadwell), ProcessorCount=4
Frequency=2143473 Hz, Resolution=466.5326 ns, Timer=TSC
.NET Core SDK=2.0.0
  [Host]     : .NET Core 2.0.0 (Framework 4.6.00001.0), 64bit RyuJIT
  DefaultJob : .NET Core 2.0.0 (Framework 4.6.00001.0), 64bit RyuJIT

|                 Method |       Mean |     Error |    StdDev |
|----------------------- |-----------:|----------:|----------:|
|      FsMapperBenchmark |  84.492 ns | 1.6972 ns | 1.6669 ns |
| ExpressMapperBenchmark | 251.161 ns | 4.6736 ns | 4.3717 ns |
|    AutoMapperBenchmark | 204.142 ns | 4.2002 ns | 9.1309 ns |
|       MapsterBenchmark |  90.949 ns | 1.6393 ns | 1.4532 ns |
|   AgileMapperBenchmark | 218.021 ns | 3.0921 ns | 2.7410 ns |
|    CtorMapperBenchmark |   7.806 ns | 0.2472 ns | 0.2312 ns |

The source code of the project is available on GitHub: https://github.com/FSou1/FsMapper

Thank you for reading to the end. I hope this article was useful to you. Let me know in the comments what, in your opinion, could still be optimized.
