Building blocks of distributed applications. Zeroth approximation
The world does not stand still. Progress creates new technological challenges, and the architecture of information systems has to evolve along with the changing requirements. Today we will talk about event-driven architecture, concurrency, parallelism, asynchrony, and how to live peacefully with all of this in Erlang.
Depending on the size of the system being designed and the requirements for it, we, the developers, choose how information is exchanged inside the system. In most cases, a scheme with a broker, for example one based on RabbitMQ or Kafka, is a workable way to organize the interaction of services. But sometimes the flow of events, the SLA, and the required level of control over the system are such that off-the-shelf messaging does not suit us. Of course, you can complicate the system a bit by taking responsibility for the transport layer and cluster formation yourself, for example with ZeroMQ or nanomsg. But if the throughput and capabilities of a standard Erlang cluster are sufficient for the system, then introducing an additional entity requires detailed study and an economic justification.
The topic of reactive distributed applications is quite extensive. To stay within the format of an article, today's discussion covers only homogeneous environments built on Erlang / Elixir. The Erlang / OTP ecosystem makes it possible to build a reactive architecture at low cost. But in any case, we need a messaging layer.
Design begins with the definition of goals and limitations. Development for its own sake is not the goal. We need a safe and scalable tool on which we can build and, most importantly, evolve modern applications of different scales: from single-server applications serving a small audience, which can later grow into clusters of up to 50-60 nodes, all the way to cluster federations. Thus, the main goal is to maximize profit by reducing the cost of developing and owning the final system.
There are 4 main requirements for the final system:
- Event orientation.
The system is always ready to accept a stream of events and perform the necessary actions;
- Scalability.
Individual blocks can be scaled both vertically and horizontally. The system as a whole should be capable of unlimited horizontal growth;
- Fault tolerance.
All levels and all services should be able to recover from failures automatically;
- Guaranteed response time.
Time is valuable, and users should not have to wait too long.
Remember the old fairy tale "The Little Engine That Could"? For the designed system to successfully leave the prototype stage and keep progressing, its foundation must meet the minimum requirements of SMOG (in the original Russian, the initials of the four requirements spell "смог", which means "could").
Because messaging is an infrastructure tool and the basis for all services, one more requirement applies to it: it must be convenient for programmers to use.
For an application to grow from a single server to a cluster, its architecture must provide loose coupling. The asynchronous model meets this requirement. In it, the sender and the recipient care only about the payload of the message and do not worry about how it is transmitted and routed within the system.
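As a minimal sketch of this asynchronous model, the module below uses plain Erlang processes and the `!` send operator. The module name `async_demo`, the message shapes `{event, From, Payload}` and `{handled, Payload}`, and the one-second timeout are assumptions made for illustration; they are not part of the messaging layer discussed in this article.

```erlang
-module(async_demo).
-export([start/0, loop/0]).

%% Receiver: takes messages from its mailbox one at a time.
%% It knows nothing about the sender except the pid carried in the message.
loop() ->
    receive
        {event, From, Payload} ->
            From ! {handled, Payload},
            loop();
        stop ->
            ok
    end.

%% Sender: `!` returns immediately, so the sender is not blocked
%% while the receiver works. The answer arrives later as an ordinary message.
start() ->
    Pid = spawn(?MODULE, loop, []),
    Pid ! {event, self(), <<"order_created">>},
    %% ... the sender is free to do other work here ...
    Result =
        receive
            {handled, Payload} -> {ok, Payload}
        after 1000 ->
            timeout
        end,
    Pid ! stop,
    Result.
```

Calling `async_demo:start()` from the shell should return `{ok, <<"order_created">>}`. The same code works unchanged if the receiver's pid belongs to a process on another node of the cluster, which is what makes this model a natural basis for growing from one server to many.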
Scalability and system performance go hand in hand. Application components must be able to utilize all available resources. The more efficiently we utilize capacity and the better our processing methods, the less money we spend on equipment.
Erlang creates a highly concurrent environment within a single machine. The balance between concurrency and parallelism can be tuned by choosing the number of operating system threads available to the Erlang VM and the number of schedulers that use these threads.
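The scheduler settings mentioned above can be inspected and adjusted at runtime. The following sketch uses the standard `erlang:system_info/1` and `erlang:system_flag/2` calls; the module name `sched_info` and its function names are hypothetical.

```erlang
-module(sched_info).
-export([report/0, limit_online/1]).

%% Snapshot of the scheduler configuration of the running VM.
%% logical_processors may be the atom `unknown` on some platforms.
report() ->
    #{schedulers         => erlang:system_info(schedulers),
      schedulers_online  => erlang:system_info(schedulers_online),
      logical_processors => erlang:system_info(logical_processors)}.

%% Change the number of schedulers that actually run Erlang processes.
%% The value is capped at the number of schedulers the VM was started with.
%% Returns the previous number of online schedulers.
limit_online(N) when is_integer(N), N >= 1 ->
    Max = erlang:system_info(schedulers),
    erlang:system_flag(schedulers_online, min(N, Max)).
```

The total number of schedulers is fixed when the VM starts, for example `erl +S 4:2` starts four scheduler threads with two of them online, while the number online can be changed later as shown in `limit_online/1`.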
Erlang processes share no state and work in non-blocking mode. This provides relatively low latency and higher throughput than traditional applications built on blocking synchronization. The Erlang scheduler takes care of fair distribution of CPU and IO, and the absence of locks lets the application keep responding even under peak load or during failures.
The utilization problem also exists at the cluster level. It is important that all machines in the cluster are evenly loaded and that the network is not overloaded. Imagine a situation: user traffic lands on the incoming balancers (haproxy, nginx, etc.), which distribute requests as evenly as possible among the available backends. Within the application infrastructure, the service that implements the required interface is only the last mile, and it will need to query a number of other services in order to answer the initial request. These internal requests also require routing and balancing.
To manage data flows effectively, messaging must give developers an interface for controlling routing and load balancing. With it, developers can use microservice patterns (aggregator, proxy, chain, branch, etc.) to solve both standard tasks and rare ones.
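To make the idea of controllable routing and balancing concrete, here is a deliberately small round-robin router built from plain processes. It is only an illustration under assumptions: the module `rr_router`, its functions, and the `{request, From, Ref, Body}` and `{reply, Ref, Worker, Body}` message shapes are invented for this sketch and do not describe the messaging interface presented later in this series.

```erlang
-module(rr_router).
-export([start/1, call/2, demo/0]).

%% Start a router that balances requests over N worker processes.
start(N) when is_integer(N), N >= 1 ->
    Workers = [spawn(fun worker/0) || _ <- lists:seq(1, N)],
    spawn(fun() -> route(Workers) end).

%% Router: forward each request to the first worker, then move that
%% worker to the end of the list (round-robin).
route([W | Rest]) ->
    receive
        {request, _From, _Ref, _Body} = Req ->
            W ! Req,
            route(Rest ++ [W])
    end.

%% Worker: reply directly to the original caller, bypassing the router.
worker() ->
    receive
        {request, From, Ref, Body} ->
            From ! {reply, Ref, self(), Body},
            worker()
    end.

%% Client side: send a request through the router and wait for the reply.
%% Returns the pid of the worker that handled the request.
call(Router, Body) ->
    Ref = make_ref(),
    Router ! {request, self(), Ref, Body},
    receive
        {reply, Ref, Worker, Body} -> {ok, Worker}
    after 1000 ->
        {error, timeout}
    end.

%% Send six requests through a three-worker router and count how many
%% distinct workers answered. The processes are left running; this is a demo.
demo() ->
    Router = start(3),
    Pids = [begin {ok, W} = call(Router, I), W end || I <- lists:seq(1, 6)],
    length(lists:usort(Pids)).
```

Here `rr_router:demo()` should return `3`, since the six sequential requests are spread over all three workers. Replacing the rotation in `route/1` with another selection rule, such as hashing on the request body, changes the balancing strategy without touching the clients or the workers.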
From a business perspective, scalability is one of the risk management tools. The main thing is to satisfy customers' demands by optimally using equipment:
- When equipment capacity grows as a result of progress, it will not sit idle because of software imperfections. Erlang scales very well vertically and can utilize all CPU cores and available memory;
- In cloud environments, we can control the amount of equipment depending on the current or predicted load and guarantee the SLA.
Consider two axioms: "Failures are unacceptable" and "Failures will always happen." For a business, a software failure means a loss of money and, worse, of reputation. By balancing potential losses against the cost of developing fault-tolerant software, you can often find a compromise.
In the short term, an architecture with built-in fault tolerance saves money on the purchase of turnkey clustering solutions. They are expensive, and they have bugs of their own.
In the long term, a fault-tolerant architecture pays for itself many times over at every stage of development.
Building messaging into the code base at the design stage lets you work out the interaction of components within the system in detail. This simplifies responding to and managing failures, since all critical components handle failures, and the resulting system knows by design how to return to normal automatically after a failure.
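In Erlang / OTP, automatic recovery of this kind is usually built on supervisors. The sketch below shows the general mechanism only; the module name `recover_demo`, the registered name `demo_worker`, the `crash` message, and the restart limits are assumptions chosen for the example, and the 100 ms sleep is a crude way to wait for the restart in a demo.

```erlang
-module(recover_demo).
-behaviour(supervisor).
-export([start_link/0, init/1, start_worker/0, demo/0]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Restart a failed child on its own; give up if it fails more than
%% 5 times within 10 seconds.
init([]) ->
    SupFlags = #{strategy => one_for_one, intensity => 5, period => 10},
    Child = #{id => worker,
              start => {?MODULE, start_worker, []},
              restart => permanent},
    {ok, {SupFlags, [Child]}}.

%% Called by the supervisor, both for the first start and for every restart.
start_worker() ->
    Pid = proc_lib:spawn_link(fun worker_loop/0),
    register(demo_worker, Pid),
    {ok, Pid}.

worker_loop() ->
    receive
        crash -> exit(simulated_failure);
        _Other -> worker_loop()
    end.

%% Crash the worker and check that a new process has taken its place.
%% A crash report is printed to the log as a side effect.
demo() ->
    {ok, _Sup} = start_link(),
    Old = whereis(demo_worker),
    Old ! crash,
    timer:sleep(100),
    New = whereis(demo_worker),
    is_pid(New) andalso New =/= Old.
```

`recover_demo:demo()` should return `true`: the worker exits, the supervisor notices it through the link, and `start_worker/0` is called again without any recovery code in the worker itself.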
Regardless of failures, the application must respond to requests and satisfy its SLA. The reality is that people do not want to wait, so the business must adapt. More and more applications are expected to be highly responsive.
Responsive applications work in near real time. The Erlang VM operates in soft real-time mode. For some areas, such as exchange trading, medicine, and industrial equipment control, hard real-time mode is important.
Responsive systems enhance UX and help businesses.
In planning this article, I wanted to share the experience of creating a messaging broker and building complex systems on its basis. But the theoretical and motivational part turned out to be quite extensive.
In the second part of the article I will talk about the nuances of the implementation of exchange points, messaging templates and their application.
In the third part, we will consider the general issues of service organization, routing, and balancing, and talk about the practical side of scalability and fault tolerance.
The end of the first part.
Photo: @lucabravo.