CQRS: the divide-and-conquer principle in the service of the programmer

Layered architecture is a lifesaver in the world of enterprise development. It lets you take load off the hardware, parallelize processes, and clean up the code. We tried the CQRS pattern while developing an enterprise project. Everything became more logical and ... more difficult. I recently spoke about what we ran into at the Panda-Meetup C#.Net meetup, and now I am sharing it with you.



Have you ever wondered why your enterprise application looks the way it does? Why can't it be like the ones from Apple and Google? Because we are constantly short on time. Requirements change frequently, and the deadline for those changes is usually "yesterday." And, most unpleasantly, the business really does not like mistakes.



To somehow live with this, developers started splitting their applications into parts. It all began simply, with the data. Many people know the scheme where the data lives separately, the client lives separately, and the logic is kept in the same place as the data.



It is a decent scheme. The major DBMSs have quite workable procedural SQL extensions. There is even a saying about Oracle: "Where there is Oracle, there is the logic." It is hard to argue with the convenience and speed of this configuration.

But we have an enterprise application, and there is a problem: such logic is hard to scale. And it is unwise to burden the DBMS, which already has enough work retrieving and updating data, with routine business tasks as well.

Besides, the business logic programming tools built into a DBMS are, frankly, rather weak for building a proper enterprise application. Maintaining business logic in T-SQL / PL-SQL is a pain. It is no accident that OOP languages such as C# and Java have spread so widely in enterprise development; you don't have to look far for examples.



The next step seems logical: extract the business logic. It will live on its own server, the database on its own, and the client separately.

What can be improved in this three-tier architecture? Infrastructure concerns are still mixed into the business logic layer, and we would like to avoid that. The business logic also does not want to know anything about data storage. The UI is a separate world too, with entities of its own that have nothing to do with the business logic.

Let's increase the number of layers. This solution looks almost perfect; it has a certain inner beauty.



We have a DAL (Data Access Layer): the data is separated from the logic, usually as a CRUD repository built on an ORM, plus stored procedures for complex queries. This option lets you develop quickly enough and still get acceptable performance.

The business logic can live inside the services or form a separate layer. Interaction between the layers can go through data transfer objects (DTOs).

A request from the UI goes to a service, which talks to the business logic, which in turn goes to the DAL to access the data. This approach is called N-tier, and it has clear advantages.

Each layer has its own clear goals and responsibilities, which we as programmers like so much. Every layer minds only its own business. Services can be scaled horizontally. The approach is understandable even to a novice developer; a person quickly grasps how the system works. It is very easy to trace all the interactions as a request travels from beginning to end.

Another advantage is consistency: all subsystems of the project work with the same data, so you do not need to worry that data written in one place is not yet visible to the user in another.

Layer Cake 1. N-Tier


Below is an example of a typical application fragment built on these principles. We have a cash claim entity; here I used an anemic model. And there is a classic repository that is accessed through an ORM.
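As a rough illustration only (the real entity and repository from the project are not shown here, so names such as CashClaim and ICashClaimRepository are invented), an anemic entity plus a classic CRUD repository might look like this:

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Anemic model: only data, no behaviour; the business logic lives in the services.
public enum CashClaimStatus { Draft, Registered, Paid }

public class CashClaim
{
    public Guid Id { get; set; }
    public DateTime Date { get; set; }
    public decimal Amount { get; set; }
    public Guid CustomerId { get; set; }
    public CashClaimStatus Status { get; set; }
}

// Classic CRUD repository; in a real project the implementation would sit on top of an ORM.
public interface ICashClaimRepository
{
    CashClaim GetById(Guid id);
    IReadOnlyList<CashClaim> Find(Expression<Func<CashClaim, bool>> predicate);
    void Add(CashClaim claim);
    void Update(CashClaim claim);
    void Delete(Guid id);
}
```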



This is a typical service; such classes are also called managers. It works with the repository, receives requests, and responds to clients. In this service we can already see some mixing: there is the processing flow, a flow for working with the UI, and a flow for some internal control units, and they are only weakly related to each other.

Here is a typical method of this service: for example, registering a cash claim.



We receive the data and perform some business checks. Then comes the update, and after it some post-actions, for example sending a notification or writing to the user log.
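A minimal sketch of what such a method might look like, assuming the hypothetical repository above plus illustrative INotificationService and IUserLog dependencies (none of these names come from the original code):

```csharp
using System;

public class CashClaimService
{
    private readonly ICashClaimRepository _repository;
    private readonly INotificationService _notifications;
    private readonly IUserLog _userLog;

    public CashClaimService(ICashClaimRepository repository,
                            INotificationService notifications,
                            IUserLog userLog)
    {
        _repository = repository;
        _notifications = notifications;
        _userLog = userLog;
    }

    public Guid RegisterCashClaim(DateTime date, decimal amount, Guid customerId)
    {
        // 1. Business checks.
        if (amount <= 0)
            throw new InvalidOperationException("Amount must be positive.");

        // 2. The update itself.
        var claim = new CashClaim
        {
            Id = Guid.NewGuid(),
            Date = date,
            Amount = amount,
            CustomerId = customerId,
            Status = CashClaimStatus.Registered
        };
        _repository.Add(claim);

        // 3. Post-actions: a notification and an entry in the user log.
        _notifications.NotifyClaimRegistered(claim.Id);
        _userLog.Write($"Cash claim {claim.Id} registered");

        return claim.Id;
    }
}

// Illustrative dependencies used above.
public interface INotificationService { void NotifyClaimRegistered(Guid claimId); }
public interface IUserLog { void Write(string message); }
```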

Despite all its beauty, this approach has problems. Very often in enterprise applications the load is not symmetrical: there are one or two orders of magnitude more reads than writes. Scaling the database itself is already a problem. Of course, it can be done, even with the DBMS's own means at the database level; it is called partitioning. But it is difficult. If it is done without the right expertise, or earlier than necessary, partitioning will fail.

For example, in one of our systems the data volume reached 25 TB and problems appeared. We tried to scale it ourselves, then invited some serious specialists from a well-known company. They looked at it and said: we will need 14 hours of complete database downtime. We thought about it and said: guys, that will not work, the business will not accept it.

Besides the database volume, the number of methods in services and repositories keeps growing. For example, the cash claim service has more than a hundred methods. It is hard to maintain, there are constant merge request conflicts, and code review becomes harder. And if you consider that the processes are different and different groups of developers work on them, tracking down all the changes related to a single problem becomes a real headache.

Layer Cake 2. CQRS


So what can we do? There is a solution that was invented back in ancient Rome: divide and conquer.



As they say, everything new is the well-forgotten old. Back in 1988, Bertrand Meyer formulated the CQS principle, Command-Query Separation, for imperative programming with objects. All methods are clearly divided into two kinds. The first is queries: they return a result without changing the state of the object. That is, when you look at a client's cash claims, nobody should write to the database that the client looked at this and that; a query should have no side effects.

The second is commands: they change the state of an object without returning data. That is, you ordered something to change and do not expect a ten-thousand-line report in response.



Here the data model for reading is clearly separated from the model for writing. Most of the business logic runs on write operations. Reads can work off materialized views or even a different database. The two sides can be split and synchronized through events or some internal services; there are many options.

CQRS itself is not complicated. We must clearly identify the commands: they change the state of the system and do not return anything. Here the approach can be a bit more relaxed: it is not a big deal if a command returns the result of its execution, an error or, say, the identifier of the created entity; there is no crime in that. What matters is that a command does not take on the job of a query: it should not search for data and return business entities.

With queries everything is simple: they do not change state, so there are no side effects. This means that if we call a query twice in a row and no commands ran in between, the state of the object must be identical in both cases. This allows queries to run in parallel. Interestingly, queries do not need the domain model to do their job, since there is no point in dragging business logic from the domain model into them.
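In code, this split can be expressed with a pair of marker interfaces and handler interfaces. This is only a sketch of the idea, not the interfaces from our project:

```csharp
// Commands change state and return nothing (or at most an acknowledgement).
public interface ICommand { }

public interface ICommandHandler<TCommand> where TCommand : ICommand
{
    void Handle(TCommand command);
}

// Queries return data and must not change state.
public interface IQuery<TResult> { }

public interface IQueryHandler<TQuery, TResult> where TQuery : IQuery<TResult>
{
    TResult Handle(TQuery query);
}
```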

Our CQRS project


Here is what we wanted to do in our project:



We have an existing application that has been running since 2006 with a classic layered architecture. Old fashioned, but still working. Nobody wants to change it, and nobody even knows what to replace it with. Then the moment came when we had to develop something new, practically from scratch. In 2011-2012 Event Sourcing and CQRS were a very fashionable topic. We liked the idea that this way we could keep the original state of an object together with the events that led to its current state.

That is, we do not update the object. There is an original state and a series of events applied to it. This has a huge plus: we can restore the state of the object at any point in history. In fact, a separate audit log is no longer needed. Since we store the events, we know exactly what happened. That is, it is not just that the client updated the value in the "address" field; we have an event, for example, a client move.
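A small sketch of the idea (the event and state types here are invented for illustration): the state is rebuilt by replaying events, so the state at any past moment is just a replay up to that moment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public abstract class ClientEvent
{
    public DateTime OccurredAt { get; set; }
}

public class ClientRegistered : ClientEvent
{
    public string Name { get; set; }
    public string Address { get; set; }
}

// Not "the address cell was updated", but a business event: the client moved.
public class ClientMoved : ClientEvent
{
    public string NewAddress { get; set; }
}

public class ClientState
{
    public string Name { get; private set; }
    public string Address { get; private set; }

    public void Apply(ClientEvent e)
    {
        switch (e)
        {
            case ClientRegistered r: Name = r.Name; Address = r.Address; break;
            case ClientMoved m: Address = m.NewAddress; break;
        }
    }

    // Replay the stored events up to a given moment to get the state as of that moment.
    public static ClientState AsOf(IEnumerable<ClientEvent> history, DateTime asOf)
    {
        var state = new ClientState();
        foreach (var e in history.Where(x => x.OccurredAt <= asOf).OrderBy(x => x.OccurredAt))
            state.Apply(e);
        return state;
    }
}
```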

Obviously such a scheme is slow for reading data, so we keep a separate database with materialized views for queries. Plus synchronization through events: every time an event changes the state, it is published. In theory everything looks fine, but... I have never met people who fully implemented this in production, under high load, with consistency acceptable to the business.



The scheme can be developed further by separating handlers from commands and queries. Here, as an example, we have a command to register a cash claim: it has a date, an amount, a customer, and other fields.

We put a constraint on the cash claim registration handler so that it accepts only our commands (where TCommand : ICommand). We can implement complex requirements simply by adding new handlers, without changing the old ones. For example: first update the date, then record the amount, and over here the client is sent a notification; all of this is written as different handlers for one command.
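Using the sketch interfaces from above, the command and a couple of independent handlers for it could look roughly like this (the names are again illustrative):

```csharp
using System;

public class RegisterCashClaimCommand : ICommand
{
    public DateTime Date { get; set; }
    public decimal Amount { get; set; }
    public Guid CustomerId { get; set; }
}

// First handler: persists the claim (through the ORM in a real project).
public class RegisterCashClaimHandler : ICommandHandler<RegisterCashClaimCommand>
{
    public void Handle(RegisterCashClaimCommand command)
    {
        // ... write the claim to the database ...
    }
}

// A requirement added later: notify the client. It becomes another handler
// for the same command, without touching the existing one.
public class NotifyClientOnRegistration : ICommandHandler<RegisterCashClaimCommand>
{
    public void Handle(RegisterCashClaimCommand command)
    {
        // ... send a notification to the client ...
    }
}
```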

How do we invoke all of this? There is a dispatcher that knows where all these handlers live.



The dispatcher is passed (for example, via the DI container) to the API. When a command arrives, the API simply executes it: the dispatcher knows where the container is and where the handlers are, and runs them. Queries work the same way.
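A dispatcher of that kind might look roughly like the sketch below. Here IServiceProvider stands in for whatever DI container the project actually used, and handlers are assumed to be registered so that all of them can be resolved as IEnumerable<ICommandHandler<TCommand>>; both are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class CommandDispatcher
{
    private readonly IServiceProvider _container;

    public CommandDispatcher(IServiceProvider container) => _container = container;

    public void Dispatch<TCommand>(TCommand command) where TCommand : ICommand
    {
        // Resolve every handler registered for this command type and run them in turn.
        var handlers = _container.GetService(typeof(IEnumerable<ICommandHandler<TCommand>>))
                           as IEnumerable<ICommandHandler<TCommand>>
                       ?? Enumerable.Empty<ICommandHandler<TCommand>>();

        foreach (var handler in handlers)
            handler.Handle(command);
    }
}

// Usage from the API layer:
// dispatcher.Dispatch(new RegisterCashClaimCommand { Amount = 100m, Date = DateTime.UtcNow });
```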

The problem with this scheme is that all interactions become less obvious. We build a hierarchy of types that are registered in containers and then react to the corresponding commands and queries. It requires designing the architecture very carefully. An action is no longer just one method with one parameter: you write a command, write a handler, register it in the container. The overhead grows. In a large project even basic navigation becomes a problem. So we decided to take a more classic route.

For asynchronous communication we used the Rebus service bus.



For simple tasks, it is more than enough.
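For context, a Rebus subscriber is just a class implementing IHandleMessages<T>. The event type and handler below are made up, and the actual transport and bus configuration of our project are not shown here.

```csharp
using System;
using System.Threading.Tasks;
using Rebus.Handlers;

// Event published on the bus when a claim is registered (illustrative type).
public class CashClaimRegistered
{
    public Guid ClaimId { get; set; }
    public DateTime RegisteredAt { get; set; }
}

// Any interested subscriber implements IHandleMessages<T>; Rebus delivers
// the published event to each of them.
public class CashClaimRegisteredLogger : IHandleMessages<CashClaimRegistered>
{
    public Task Handle(CashClaimRegistered message)
    {
        Console.WriteLine($"Claim {message.ClaimId} registered at {message.RegisteredAt:u}");
        return Task.CompletedTask;
    }
}
```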

CQRS makes you approach the code a little differently and focus on the process, because every action is born within some process. We allocated a repository for queries, and separated the commands related to processing from the queries related to processing. We did not use a separate repository on the write side; in the commands we simply work with the ORM.



Here, for example, is the method with everything unnecessary thrown out. In the cash claim registration command handler, we register the claim and publish an event to the bus saying that a cash claim has been registered.
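A sketch of such a trimmed-down handler, assuming Entity Framework as the ORM and Rebus's IBus for publishing (both are assumptions; only the general approach comes from the talk):

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Rebus.Bus;

public class RegisterCashClaimCommandHandler
{
    private readonly DbContext _db;
    private readonly IBus _bus;

    public RegisterCashClaimCommandHandler(DbContext db, IBus bus)
    {
        _db = db;
        _bus = bus;
    }

    public async Task Handle(RegisterCashClaimCommand command)
    {
        // Register the claim: the command works with the ORM directly.
        var claim = new CashClaim
        {
            Id = Guid.NewGuid(),
            Date = command.Date,
            Amount = command.Amount,
            CustomerId = command.CustomerId,
            Status = CashClaimStatus.Registered
        };
        _db.Set<CashClaim>().Add(claim);
        await _db.SaveChangesAsync();

        // Publish the fact to the bus; whoever is interested reacts to it.
        await _bus.Publish(new CashClaimRegistered
        {
            ClaimId = claim.Id,
            RegisteredAt = DateTime.UtcNow
        });
    }
}
```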



Whoever is interested in this event will react to it; for example, that is where user authentication and logging are hooked in.

Here is an example query. Everything became simple here too: we read from the repository and return the data.
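A sketch of what such a query might look like with the interfaces from above; the DTO and read-side repository are illustrative:

```csharp
using System;

public class CashClaimDto
{
    public Guid Id { get; set; }
    public DateTime Date { get; set; }
    public decimal Amount { get; set; }
}

public interface ICashClaimReadRepository
{
    CashClaimDto GetById(Guid id);
}

public class GetCashClaimQuery : IQuery<CashClaimDto>
{
    public Guid ClaimId { get; set; }
}

// No business logic: the query just reads from the read-side repository and returns the data.
public class GetCashClaimQueryHandler : IQueryHandler<GetCashClaimQuery, CashClaimDto>
{
    private readonly ICashClaimReadRepository _repository;

    public GetCashClaimQueryHandler(ICashClaimReadRepository repository) => _repository = repository;

    public CashClaimDto Handle(GetCashClaimQuery query) => _repository.GetById(query.ClaimId);
}
```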



I want to dwell separately on Rebus sagas. A saga is a pattern that lets you break a business transaction into atomic actions, so you do not lock everything at once but gradually, step by step.



The first participant performs its actions and sends a message; the second subscriber reacts to it, does its part, and sends its own message, to which a third part of the system responds. If everything finishes well, the saga produces its own message of a designated type, to which other subscribers respond in turn.
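A heavily simplified sketch of how a Rebus saga for such a flow might be declared; the message and data types are invented, and the real correlation and steps in our project were more involved:

```csharp
using System;
using System.Threading.Tasks;
using Rebus.Handlers;
using Rebus.Sagas;

// Illustrative follow-up message from the next participant in the flow.
public class NotificationSent
{
    public Guid ClaimId { get; set; }
}

public class ClaimProcessingSagaData : SagaData
{
    public Guid ClaimId { get; set; }
    public bool NotificationSent { get; set; }
}

public class ClaimProcessingSaga : Saga<ClaimProcessingSagaData>,
    IAmInitiatedBy<CashClaimRegistered>,
    IHandleMessages<NotificationSent>
{
    protected override void CorrelateMessages(ICorrelationConfig<ClaimProcessingSagaData> config)
    {
        // Both messages are tied to the same saga instance by the claim id.
        config.Correlate<CashClaimRegistered>(m => m.ClaimId, d => d.ClaimId);
        config.Correlate<NotificationSent>(m => m.ClaimId, d => d.ClaimId);
    }

    public Task Handle(CashClaimRegistered message)
    {
        // First step done: remember which claim this saga instance is about.
        Data.ClaimId = message.ClaimId;
        return Task.CompletedTask;
    }

    public Task Handle(NotificationSent message)
    {
        // Last step done: complete the saga (a real saga would publish its own final event here).
        Data.NotificationSent = true;
        MarkAsComplete();
        return Task.CompletedTask;
    }
}
```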

Let's see what the cash claim processing class looks like in this case. Everything is clear: there are the commands and the queries that relate to the registration process, plus the bus and logging.



In this case there is one handler. When the command to register a cash claim arrives, it reacts to it. Inside, everything is the same as before; the peculiarity is that everything is grouped by process.



Because of this, things became a little easier: there are fewer changes in each file.

Conclusions


What should you keep in mind when working with CQRS? You need to take design more seriously, because reworking a process is a bit more complicated. There is a small overhead: there are a few more classes, but that is not critical at all. The code became less coupled, although that is not so much due to CQRS as to the move to the bus. But it was CQRS that pushed us toward this event-based interaction. Code is now more often added than changed. There are more classes, but they are more specialized.

Should we all drop everything and switch to CQRS en masse? No; you need to look at which approach suits a specific project's workload better. For example, if your subsystem works with reference data, CQRS is not needed: the classic layered approach gives a simpler and more convenient result.

The full talk from Panda Meetup is available below.


If you want to dive deeper into the topic, it makes sense to explore these resources:

CQRS architecture style - from Microsoft

Byndyu's blog

Contoso University examples with CQRS, MediatR, AutoMapper and more - from Jimmy Bogard

CQRS - from Martin Fowler

Rebus
