Choosing an ORM Strategy (.NET)

Original author: Jimmy Bogard
  • Transfer

One of the mistakes that developers make (and I used to be one of them) is the statement that you must use exactly one ORM strategy for the application you are creating. This is generally not true. You can (and should) tie the choice of strategy to a specific scenario, and be sure that you choose the right tools for a particular case.

I note right away that 99.9% of the time you should not use ADO.NET directly. If you are still using it dataReader.MoveNext, stop!

Many people do not like ORM as such. After listening to their argument, I agree that Martin Fowler wrote in OrmHate :

The biggest disappointment from ORM lies in high expectations .

We must come to terms with the idea that ORMs are bad, ugly, and overloaded . ORMs are designed to solve the problem and have many different approaches for this. But, before we look at these approaches, let's examine what problem are we trying to solve?

Bridging the gap

If you must load or embed data in SQL, you must map ("map") your .NET data types in SQL. For .NET, this means using ADO.NET to send SQL commands to the SQL server. Then we need to map the SQL types to .NET types. And there are nuances - for example, dates in SQL are different from dates in .NET.

ADO.NET helps us with this, but leaves us the job of processing raw data sets and creating .NET objects. In the end, we get what we want - we work with objects and .NET types. And some code translates this into SQL queries and vice versa.

ORMs are designed to solve this problem by adding layers of various abstractions on top of ADO.NET. But there are many strategies for this. Let's take a look at each of them and see where which ones are best suited.

Entity-based relational mapping

In such a mapping, almost always the database tables correspond 1: 1 with the entities in your system. When you add a property to an object, you add a column to the table. The use of this method is based on loading an entity (or aggregate) by its identifier, managing this object and, possibly, related objects, and then saving this object to the database using ORM.

ORM in this case provides a lot of functionality, for example:
  • Change tracking
  • Lazy loading (lazy-loading)
  • Preload (eager fetching)
  • Cascading
  • Ensuring the uniqueness of objects ( Identity map )
  • Work with units of work ( Unit of work )

If I work with only one entity or aggregate at a time, then ORMs like NHibernate are very suitable for us. They use the specified configuration to track loaded entities and automatically save changes during transaction commit. And this is nice, because we should not carry our own layer of working with data. NHibernate does all the dirty work for us.

As long as we load the object by Id for the sole purpose of changing it, all this works fine. This eliminates the large amount of code that I would need to create to keep track of adding objects, saving them, etc.

The flip side of this approach is that ORM does not know if you are only going to read objects, or load an entity to modify it. We often see people stumble when they don’t realize that change tracking is enabled by default and how it works.

If you want to load an entity in order to change it and save changes (or create a new entity), this approach provides more flexibility from including a data access level in your infrastructure layer and allows your entity types to be relatively independent of their save method. This independence does not mean that my C # model and data schema may diverge. On the contrary, this means that the data access layer will not penetrate my object model, which instead I would rather prefer to load with business rules.

Mapping to datasets (Result-set-based relational mapping)

In most applications, data reading requirements far exceed the number of records. We saw a 100: 1 ratio between SELECT and INSERT / UPDATE / DELETE in our recent application. When we look at what SQL is really good at, it’s working with data in sets (sets). To select a data set from an SQL server, it often makes no sense to try to directly map this data to entities.

But we still prefer not to work directly with IDataReader or DataTable. These are poorly typed objects that are difficult to transfer to the upper layers of the application. On the contrary, we often construct objects adapted to data. These objects are often called DTO (Data-Transfer Objects), or Read Models. We create such DTOs for individual SQL samples - and rarely in order to reuse them in other queries.

Many ORMs have features optimized for such scenarios. In NHibernate, you can use projections to turn off tracking, and display data directly in the DTO. You can use SQL queries to do this and do not need a mapping configuration. Or you can use micro-ORM like PetaPoco.

These reads can also generate DTO objects as they read. Both NHibernate and several micro-ORMs allow you to receive individual DTO objects sequentially one by one while reading query result lines, thereby minimizing the amount of objects contained in memory.

In our applications, we still often use NHiberante for reading, but do not use entity objects, but instead use raw SQL. We rely on optimized NHiberanate mappers to simply provide the DTO type, and the result of the selection will be displayed automatically.
This approach does not work very well if we need to apply business rules and save the information back. As these models are usually displayed in separate data sets, but not in database tables.

Active record - This is another example of the essential display of data, in which the functionality for working with data is included in the object model itself.

DML-based relational mapping

If you know what SQL you need to implement CRUD, and would prefer to create it manually, then you are already looking for something to effectively abstract the DML commands to a level higher than ADO.NET.
And this is the micro-ORM arena. Frameworks such as PetaPoco , Dapper , Massive and others are created to help solve the problems of ADO.NET. They usually still allow us to work with ADO.NET objects, but our interaction is greatly simplified. We just need a connection, and these frameworks can allow working with all CRUD operations in a form that offers much simpler code than ADO.NET itself.

In the case when you do not have entities and the need to display them in tables and vice versa, micro-ORM will give a much easier approach. And since micro-ORMs do not require preliminary configuration, they rely on lazy execution and optimized caching techniques to map SQL parameters and query results on the fly. Many applications can start with a DML-based mappig, switching to a full-fledged ORM as soon as relationships or entities require it.

Bulk loading tools

This is what occupies a special place - sometimes you do not want to insert / load data in an object way. Instead, you would prefer to work with all sets as a whole. Tools like SQL Bulk Copy allow you to retrieve and upload data in CSV or tabular formats.

These utilities work just like a bazooka, tearing all the data back and forth, but not providing anything else. You cannot update or delete data, but in order to get large amounts of data from SQL, these utilities are what you need.

In many integration scenarios where you provide data files to external partners, or vice versa, these downloaders allow you to use files as tables and directly upload them to databases.

These utilities are much faster than traditional methods of parsing / loading data. In some of our tests, we saw a difference in orders compared to progressive loading. And in one case, we saw the difference between a few hours and a minute. The flip side of all this is that the functionality is limited only to INSERT and SELECT. Everything else requires other approaches.

The right tool for the task

In a recent project, I used each of the above approaches in working with one database and one code. NHibernate for entity / aggregate mapping, selection of ready-made result sets for reading data sets (and further preparation of messages / export / views from the results), DML-mapping for simple tables and modules, as well as bulk-load tools for downloading files from partners with many millions of lines.

The key point is that you don’t have to bind yourself to a particular tool or approach. No ORM strategy works in all scenarios, and should not. NHibernate can work with many other scenarios (except for direct mapping of entities), but does not do anything in the world. Complexity often arises from attempts to use the same approach always.

Every application written outside the SQL server uses ORM. Either this handwritten ADO.NET code, or NHibernate - you have to bridge the gap between .NET and SQL. This overcoming is a difficult task, and nothing solves the problem completely perfectly. And it should not!

Choose an approach that solves a specific problem. Do not worry that you will have several ORM strategies in one project. This does not mean that unsystematic solutions are acceptable. But on the contrary - the application of verified solutions based on knowledge of possible options is always a good idea.

Also popular now: