bucefal91 April 18, 2018 at 09:11

Code Architecture

In this article I want to share my personal experience related to the proper organization of code (architecture). Proper architecture greatly simplifies long-term support.
This is a very philosophical topic, so I can offer nothing more than my subjective analysis and experience.

Problems, symptoms

My early experience as a programmer was trouble-free: I churned out small business-card websites without any problems. I wrote code in a style I now call “in a line” or “a canvas”. For small projects and simple tasks, everything was fine.

But I changed jobs and ended up developing a single website for 4 years. Naturally, the complexity of this code was not comparable to the business-card sites from my previous job. At some point problems simply rained down on me: the number of regressions went off the scale. I felt I was walking in circles: while fixing something “here”, I broke something “there”. Then “here” and “there” simply swapped places and the cycle repeated.

I lost confidence that I was in control of the situation: however hard I tried to prevent bugs, they slipped through. For all 4 years the project was under active development; we improved, extended, and completed the existing functionality. I saw and felt the unit cost of each new change or refactoring grow: as the total amount of code increased, so did the cost of any edit. Simply put, I had reached a threshold that I could not cross while continuing to write code “in a line”, without any architecture. But at that moment I did not understand this yet.

Another important symptom was the books and video tutorials I was reading and watching at the time. The code in these sources looked glossy: beautiful, natural, and intuitive. Seeing such a gap between textbooks and real life, my first reaction was to think that this was normal: real life is always harder than theory, with more routine and more special cases.

Nevertheless, the product at work needed to be extended and improved; in short, it had to move forward. At the same time, I began to actively participate in an open source project. Together, these factors pushed me onto the path of architectural thinking.

What is architecture?

One of my university lecturers used to say: “you need to design in such a way as to maximize the number of objects and minimize the number of connections between them.” The longer I live, the more I agree with him. If you look closely at this quote, you can see that the 2 conditions pull against each other to some extent: the more we split a system into subsystems, the more connections we have to introduce in order to link each subsystem with the other actors. Finding the optimal balance between the two is a kind of art that, like other arts, can be mastered through practice.

A complex system is split into subsystems through interfaces. To isolate a subsystem from a complex system, you need to define an interface that declares the boundary between the two. Imagine we have a complex system in which some subsystems can be sensed, but they are “smeared” across different places in the main system, and there is no clear format (interface) for interaction between them. Picture the whole thing as one solid square.

Let's count: de facto we have 1 system and 0 connections. Minimizing connections is going great :) But the number of systems is very small.

Now suppose someone has analyzed the code, clearly identified 2 subsystems, and defined the interfaces through which they communicate. The boundaries of the subsystems are now defined, and the single square turns into separate blocks joined by bridges.

Here we have: 3 systems and 2 connections between them. Please note that the amount of functionality remains the same - the architecture neither increases nor decreases the functionality, this is just a way to organize the code.

What is the difference between the two alternatives? When refactoring in the first case, we need to “comb through” 100% of the code (the entire green square) to make sure that we did not introduce any regression. With the same refactoring in the second case, we first need to determine which system it belongs to. And then all our refactoring will be reduced to combing only one of the 3 systems. The task was simplified in the second case! Due to the successful fragmentation of the architecture, it is enough for us to concentrate on only part of the code, and not on all 100% of the code.

This example shows why it pays to split into the maximum number of objects. But the quotation has a second part: minimizing the connections between them. What if a new change request from management affects the interface itself (a red bridge between 2 systems)? Then things are bad: changing an interface means changing the code at both ends of the bridge. The fewer connections we have between systems, the less likely it is that a refactoring will touch any interface at all. And the simpler each interface is, the easier it will be to make the necessary changes on both sides of it.

Interface in the broadest sense

The key to the correct application of architecture, I think, is the interface, because it determines the format of interaction and, accordingly, the boundaries of each of the systems. In other words, the number of subsystems and their connectivity (the number of connections) depend on the selected interfaces. Let's take a closer look.

First of all, the interface must be honest. There should be no communication between systems outside the interface. Otherwise we slide back to the original version: diffusion (yes, it exists in programming too!) will merge the 2 systems back into one common system.
The interface must be complete. An actor on one side of the interface should know nothing about the internal structure of the actor on the other side of the bridge beyond what the interface itself states; that is, the interface should describe the partner on the “other side of the bridge” fully (sufficiently for our needs). By making the interface complete from the beginning, we significantly reduce the chances of having to edit it in the future. Remember, changing an interface is the most expensive operation, because it implies changes in more than one subsystem.

An interface does not have to be declared as an interface in the OOP sense. I believe that honesty, completeness, and your own clear understanding of the interface are enough. Moreover, the interface I mean in this article is something broader than an OOP interface. What matters is not form, but essence.

It is appropriate to mention microservice architecture here. The boundary between any two services is nothing other than the kind of interface this article is about.
As an example, consider the file reference count stored in an inode on *nix systems. The interface is this: if you use a file, increase its counter by 1; when you have finished using it, decrease its counter by 1. When the counter reaches zero, no one is using the file and it should be deleted. Such an interface is remarkably flexible, because it imposes no restrictions at all on the internal structure of the actor that uses it. Both uses of a file from within the file system and file descriptors held by running programs fit naturally into this interface.
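The counting protocol described above can be sketched in a few lines of PHP. This is only an illustration of the idea, not how any real file system is implemented; the class name `RefCounted` and the methods `acquire()` and `release()` are hypothetical names chosen for this sketch.

```php
<?php
// Hypothetical sketch of a reference-counted resource.
// RefCounted, acquire() and release() are illustrative names only.
final class RefCounted
{
    private $count = 0;
    private $onZero;

    // $onZero is called when the last user lets go of the resource.
    public function __construct(callable $onZero)
    {
        $this->onZero = $onZero;
    }

    // "I am starting to use this resource."
    public function acquire(): void
    {
        $this->count++;
    }

    // "I am done with this resource."
    public function release(): void
    {
        if ($this->count === 0) {
            throw new LogicException('release() called without a matching acquire()');
        }
        $this->count--;
        if ($this->count === 0) {
            ($this->onZero)();
        }
    }

    public function count(): int
    {
        return $this->count;
    }
}

$deleted = false;
$file = new RefCounted(function () use (&$deleted) {
    $deleted = true; // stand-in for actually freeing the file
});

$file->acquire(); // something inside the file system refers to the file
$file->acquire(); // a running program opens it
$file->release(); // the file system reference goes away
$file->release(); // the program closes it: the counter hits zero, cleanup runs
```

Note that the counter knows nothing about who its users are; that is exactly what makes the interface flexible.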

Solve the problem on an abstract rather than a specific level

Obviously, the ability to choose the right interface is a very important skill. My experience tells me that very often a successful interface comes to mind when you try to solve a problem at an abstract (general) level, rather than the current (specific) manifestation of it. Albert Einstein once said that the correct formulation of a problem is more important than its solution. In this light, I completely agree with him.

Which solution to the “open the front door” task seems more correct to you?

Go to the door;
Take a bunch of keys from your pocket;
Select the desired key;
Open the door with it.

Or:

Go to the door;
Call the subsystem “keystore” and get the available keychain from it;
Call the subsystem “search for the correct key” and ask it for the most suitable key for the current door from the bunch of available keys;
Open the door with the key suggested by the search subsystem for the correct key.

The second algorithm is far more abstract than the first and, as a result, more complete. It is simply far more likely to remain relevant even 50 years from now, when the concepts of “keys” and “doors” will differ from today's :)
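As a rough illustration, the second algorithm could look like this in PHP. Every name here (`KeyStore`, `KeyFinder`, `Door`, `PocketKeyStore`, `ExactMatchFinder`, `open_front_door`) is hypothetical and invented for the sketch; the point is only that the top-level function reads like the abstract list of steps.

```php
<?php
// All names below are hypothetical; they only illustrate the second algorithm.

interface KeyStore
{
    /** @return string[] every key currently available */
    public function availableKeys(): array;
}

interface KeyFinder
{
    /** Returns the most suitable key for the door, or null if none fits. */
    public function findKey(Door $door, array $keys): ?string;
}

final class Door
{
    private $lockId;
    private $open = false;

    public function __construct(string $lockId) { $this->lockId = $lockId; }
    public function lockId(): string { return $this->lockId; }
    public function isOpen(): bool { return $this->open; }

    public function openWith(string $key): bool
    {
        if ($key !== $this->lockId) {
            return false;
        }
        $this->open = true;
        return true;
    }
}

// One possible implementation of each interface.
final class PocketKeyStore implements KeyStore
{
    private $keys;
    public function __construct(array $keys) { $this->keys = $keys; }
    public function availableKeys(): array { return $this->keys; }
}

final class ExactMatchFinder implements KeyFinder
{
    public function findKey(Door $door, array $keys): ?string
    {
        foreach ($keys as $key) {
            if ($key === $door->lockId()) {
                return $key;
            }
        }
        return null;
    }
}

// The abstract algorithm: it knows nothing about pockets, key rings,
// or how a key is matched to a lock.
function open_front_door(Door $door, KeyStore $store, KeyFinder $finder): bool
{
    $key = $finder->findKey($door, $store->availableKeys());
    return $key !== null && $door->openWith($key);
}

$door = new Door('front');
$opened = open_front_door($door, new PocketKeyStore(['garage', 'front']), new ExactMatchFinder());
```

If keys become fingerprints one day, only the store and finder implementations would need to change in this sketch, not `open_front_door`.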

Looking at the problem from an abstract point of view, we naturally arrive at complete interfaces. When solving one particular manifestation of a problem, the most we can come up with is the projection of the interface onto that particular problem. Looking at the abstract problem, we are more likely to see the complete interface rather than its manifestation in one specific case.

At some point you begin to see these abstract operations behind their specific manifestations (implementations). That is already great! But do not forget that you need to minimize the number of connections, which means there is a risk of wandering too deep into the wilds of abstraction. You do not have to include in your architecture every abstraction you notice during analysis. Include only those that justify their presence, either by the extra flexibility they bring or by splitting an excessively complex system into subsystems.

Physics

There is such a science, and I love it as much as programming. In physics, many phenomena can be considered at different levels of abstraction. The collision of two objects can be described with Newtonian dynamics, or it can be described with quantum mechanics. The air pressure in a balloon can be described at the microscopic level or at the macroscopic, thermodynamic level. Physicists must have arrived at such a model for good reason.

The fact is that using different levels of detail in code architecture is also very beneficial. Any subsystem can be recursively split further into sub-subsystems: the subsystem then acts as a system, and we look for subsystems inside it. This is the divide-and-conquer approach. With it, a program of any complexity can be explained at a level of detail convenient for the listener, whether that is 5 minutes over a beer with a programmer friend or a corporate meeting with a non-technical boss.

As an example, what happens in our laptop when we play a movie? Everything can be considered at the media player level (read the contents of the movie file, decode it into video, show it on the monitor). It can be viewed at the operating system level (read from a block device, copy the data to the right memory pages, “wake up” the player's process and run it on one of the cores). Or it can be viewed at the disk driver level (optimize the I/O queue for the device, seek to the desired sector, read the data). By the way, in the case of an SSD the last list of steps would be different, and that is the whole beauty of it: because operating systems have a block storage device interface, we can pull out a magnetic disk, plug in a USB flash drive, and hardly notice the difference. Moreover, the block device interface was invented long before the advent of CDs, flash drives, and many other modern storage media. What is this if not an example of a successful abstract interface that has survived and stayed relevant for more than one generation of devices? Of course, someone may argue that the process was the opposite: new devices were forced to adapt to an existing interface. But if the block device interface had been frankly bad and inconvenient, it would not have survived in the market and would have been displaced by some alternative.
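To make the block device idea concrete, here is a toy sketch in PHP. It does not mirror any real operating system API: `BlockDevice`, `DiskImage`, `FlashDrive`, and `read_bytes` are names invented for this example, and both "devices" simply keep their data in memory.

```php
<?php
// Hypothetical sketch; nothing here corresponds to a real OS interface.

interface BlockDevice
{
    public function blockSize(): int;
    public function readBlock(int $index): string;
}

// "Magnetic disk": one long string, blocks are located by offset.
final class DiskImage implements BlockDevice
{
    private $bytes;
    private $blockSize;

    public function __construct(string $bytes, int $blockSize)
    {
        $this->bytes = $bytes;
        $this->blockSize = $blockSize;
    }

    public function blockSize(): int { return $this->blockSize; }

    public function readBlock(int $index): string
    {
        return (string) substr($this->bytes, $index * $this->blockSize, $this->blockSize);
    }
}

// "Flash drive": blocks are stored individually.
final class FlashDrive implements BlockDevice
{
    private $blocks;
    private $blockSize;

    public function __construct(array $blocks, int $blockSize)
    {
        $this->blocks = $blocks;
        $this->blockSize = $blockSize;
    }

    public function blockSize(): int { return $this->blockSize; }

    public function readBlock(int $index): string
    {
        return $this->blocks[$index] ?? '';
    }
}

// The consumer depends only on the interface, so it cannot tell
// which kind of device it is reading from.
function read_bytes(BlockDevice $device, int $offset, int $length): string
{
    $size = $device->blockSize();
    $first = intdiv($offset, $size);
    $last = intdiv($offset + $length - 1, $size);
    $data = '';
    for ($i = $first; $i <= $last; $i++) {
        $data .= $device->readBlock($i);
    }
    return (string) substr($data, $offset - $first * $size, $length);
}

$disk = new DiskImage('abcdefghij', 4);
$flash = new FlashDrive(['abcd', 'efgh', 'ij'], 4);
$fromDisk = read_bytes($disk, 2, 5);
$fromFlash = read_bytes($flash, 2, 5);
```

Both calls go through the same `read_bytes` code path; swapping the device changes nothing for the consumer.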

The human brain cannot hold many concepts or objects at the same time, regardless of whether we are talking about physics, programming, or something else. Accordingly, try to organize the vertical hierarchy of your architecture so that you have no more than a dozen actors at any level of abstraction. Compare two descriptions of the same system:

Here we process incoming orders. First comes the validation process - we check the availability of the ordered goods in the warehouse, check the correctness of the delivery address, the success of the payment. Then the notification process starts - an SMS with information about the new order comes to the operator. The department head receives an email with summary information.

Or:

Here we process incoming orders. First the validation system runs: we check the accuracy and correctness of all data. If you are interested in the details, it consists of internal validation (availability in stock, etc.) and external validation (correctness of the information specified in the order). After validation completes successfully, the notification system is launched; complete information about notifications is available via this link.

Do you feel the vertical orientation of the second description compared to the first? In addition, the second description highlights the abstractions “validation” and “notification” more clearly than the first.
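The second description maps naturally onto code in which each level mentions only the actors one level below it. The following PHP sketch is one possible reading; the function names and the order fields (`in_stock`, `address`, `id`) are assumptions made for the example, and the checks are deliberately simplified.

```php
<?php
// Hypothetical names and fields throughout.

// Top level: only two actors, "validation" and "notification".
function process_order(array $order, callable $validate, callable $notify): bool
{
    if (!$validate($order)) {
        return false;
    }
    $notify($order);
    return true;
}

// One level down: validation consists of internal and external checks.
function validate_order(array $order): bool
{
    return validate_internal($order) && validate_external($order);
}

// Internal: things we know ourselves, such as stock (simplified to a flag).
function validate_internal(array $order): bool
{
    return !empty($order['in_stock']);
}

// External: information supplied in the order itself.
function validate_external(array $order): bool
{
    return isset($order['address']) && trim($order['address']) !== '';
}

$sent = [];
$notify = function (array $order) use (&$sent) {
    $sent[] = 'New order #' . $order['id']; // stand-in for SMS and email
};

$accepted = process_order(
    ['id' => 7, 'in_stock' => true, 'address' => '1 Main St'],
    'validate_order',
    $notify
);
```

Someone reading `process_order` sees two steps and can stop there; the details of validation are one level down for those who need them.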

How do people usually fly to the moon?

Right! They first design a rocket (a rocket as a whole, and separately each of its components). Then they build a plant for the production of each component and a plant for the final assembly of the rocket from the produced components. And then they fly to the moon on an assembled rocket. Is there any parallel?

The output is a huge number of components that can be reused for other related purposes, and there are factories that mass-produce these components. The success of the entire enterprise depends most of all on successful design (if someone forgot to add an oxygen recovery module to the project and the rocket is already on the launch pad, things are bad), a little less on the quality of the factories built (factories can still be calibrated and tested somehow), and least of all on the specific instance of the rocket standing on the launch pad: if something happens to it, it will be easy to recreate on the basis of the existing infrastructure. Soon we will learn to clone people, and then even unsuccessful launches will involve no talk of human losses :)

In programming, everything is exactly the same. The role of the factories falls on the hardware, which executes our code. But both the design (creating the architecture) and the specific implementations (building the factories) rest with the programmer. These 2 stages very rarely stand out clearly from the general tangle. Yet thinking about them separately is very useful; in other fields, not separating them would even look illogical. After all, who would start building a factory without first deciding what it will produce?

Architecture benefits

I am only summarizing the concepts that I tried to describe above. With the successful use of architecture, we have:

Simple isolated testing of each system. Since each system communicates with the outside world through a strict interface, it is very easy to test it separately.
Simpler code maintenance: splitting into subsystems makes it easier to change existing code.
Better extensibility: thanks to interfaces, there are many places where we can easily plug in new functionality (or replace an existing implementation with an alternative one).
More code reuse: interfaces introduce loose coupling into the code, which means any system can easily be applied to some other task. Here again the completeness of the interface plays an important role. If the interface really was complete, it will be enough for the new task. Recall the Unix paradigm “Do one thing, but do it well”: reusing a well-written program with a complete interface is a pleasure!
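The first and third benefits can be shown together in a small PHP sketch. The names (`ExchangeRates`, `PriceConverter`, `FixedRates`) and the rate value are hypothetical; a real implementation of `ExchangeRates` would presumably talk to an external service, which is exactly what the test avoids.

```php
<?php
// Hypothetical example: testing one subsystem in isolation by plugging
// an alternative implementation into its interface.

interface ExchangeRates
{
    public function rate(string $from, string $to): float;
}

// The system under test depends only on the interface.
final class PriceConverter
{
    private $rates;

    public function __construct(ExchangeRates $rates)
    {
        $this->rates = $rates;
    }

    public function convert(float $amount, string $from, string $to): float
    {
        return round($amount * $this->rates->rate($from, $to), 2);
    }
}

// Alternative implementation for tests: fixed rates, no network, no database.
final class FixedRates implements ExchangeRates
{
    private $table;

    public function __construct(array $table)
    {
        $this->table = $table;
    }

    public function rate(string $from, string $to): float
    {
        $pair = $from . '/' . $to;
        if (!isset($this->table[$pair])) {
            throw new InvalidArgumentException('Unknown currency pair: ' . $pair);
        }
        return (float) $this->table[$pair];
    }
}

$converter = new PriceConverter(new FixedRates(['EUR/USD' => 1.25]));
$converted = $converter->convert(10.0, 'EUR', 'USD');
```

`PriceConverter` is exercised without any real rate provider existing yet, and the same constructor is where a production implementation would later be connected.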

Signs of Successful Architecture

The success of an architecture cannot be unambiguously assessed as “yes” or “no.” Moreover, the same architecture can be successful within the framework of one project (specification) and fail within another project, even if both projects nominally operate in the same subject area. At the time of design, you are required to have the deepest and most comprehensive understanding of the process that you automate / model with code.

Nevertheless, I dare to offer you some common features of successful architectures:

Separation between architecture code and implementation code. Someone solves the problem at an abstract level (like a boss who says “we need to increase sales next quarter”), and someone implements the specific steps needed to achieve one component of the overall result (a PR department employee starts placing ads in the newspaper).
No matter where in the program we stop, there should always be an abstract explanation of what we are doing there. At the boss's level this may be “we are increasing sales because last quarter was unprofitable”; at the level of a particular employee it may be “I am placing ads in the newspaper because this is part of my job responsibilities (interface), and I have just received an order from above to do it”. Such an explanation should be logical and consistent with the knowledge and horizons of the actor in question.
Most of the code looks like an interaction between a service provider and a consumer. The user notification system provides the service “notify the user of event X” and, in turn, as part of implementing this service, consumes the services “send SMS message” and “send email”.
All critical components can easily be replaced with alternative implementations. We simply disconnect the old component and connect a component with an alternative implementation to the same interface. By the way, your non-technical boss will be terribly glad of such an opportunity at some critical moment!
The architecture is easy to explain in words (additional documentation) and relatively hard to grasp by just looking at the code. Words explain the meaning carried by the interfaces that make up the architecture more easily, because at a high level of abstraction this meaning is not so obvious from the code. In addition, interfaces that are not formally declared in the programming language can easily slip past a programmer who is reading unfamiliar code.
With architecture, most of the functionality becomes available closer to the end of the development cycle. In the early stages the programmer writes the architecture code and implements individual subsystems. Only at the very end are they connected together in the right sequence to produce the final (business) result. When there is no architecture, or not enough of it, the functionality is delivered more or less linearly: say, a person takes 10 days to write the code, and every day writes 10% of the total code canvas. Plotted as task completeness versus development time, the first case stays low for a long time and rises sharply at the end, while the second case is roughly a straight line.

Tips for Building a Successful Architecture

Try to ask 3 questions when analyzing an architecture problem: What are we doing? Why are we doing it? How are we doing it? The interface answers “what?”, for example, “we notify the user of an event”. The consumer (the code that calls the subsystem) answers “why?”, and the specific implementation of the interface (the service provider) answers “how?”.

Try to wrap any self-sufficient operation in a subprogram (a function, a method, or something else, depending on the tools available to you), even if it is just one line of code that is used once in your program. This way you separate the architectural code (a list of abstract actions) from the implementation. In this context the function acts as an interface, and we get a consumer (the code that calls the function) and a supplier (the implementation of the function). Example:

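// Before: the caller reaches into the object's internal data layout.
function process_object($object) {
  $object->data['special'] = TRUE;
  $object->save();
  send_notifications($object);
}

// After: an alternative version of the same function (not a second
// definition for the same file). Only the object knows how "special" is stored.
function process_object($object) {
  $object->markAsSpecial();
  $object->save();
  send_notifications($object);
}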

Use more levels of detail in your architecture. With intensive “vertical” splitting, you will have a wide selection of components of different caliber. When you start on the next task within such a project, you can either use a high-level system (quick, possibly at the cost of flexibility) or assemble a solution from low-level components that fits the business need more precisely. Naturally, you will prefer higher-level components where possible, but you will always have the freedom to build a critical section from lower-level ones. For example, you might have a high-level “notify user of an event” component. Based on the settings in the user's profile, it selects the long or short version of the notification and sends it either by SMS or by email. This high-level component uses 2 lower-level ones: “send SMS to number X with content Y” and “send email to address X with content Y”. The next time you need to notify the user of an event, you will most likely use the high-level component. But you still have the option to send SMS messages and emails through the low-level layer directly, bypassing the high-level component. This can be useful for a critical notification, which is better sent straight to the phone by SMS regardless of user settings. The more levels of detail you single out, the more freedom you will have. It is like an atomic bomb versus a precision airstrike: sometimes it is more convenient to bomb the whole battlefield to hell, and sometimes it is more convenient to deliver 10 precision strikes against strategic targets.
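The two levels from this notification example might be laid out as follows. This is a sketch under assumptions: the names `SmsSender`, `EmailSender`, `UserNotifier`, and `RecordingGateway`, as well as the profile fields `channel` and `prefers_long`, are invented here, and the gateway only records messages instead of sending them.

```php
<?php
// Hypothetical names and profile fields; nothing is actually sent.

// Low level: "send SMS to number X with content Y".
interface SmsSender
{
    public function sendSms(string $number, string $text): void;
}

// Low level: "send email to address X with content Y".
interface EmailSender
{
    public function sendEmail(string $address, string $text): void;
}

// High level: "notify user of an event", driven by profile settings.
final class UserNotifier
{
    private $sms;
    private $email;

    public function __construct(SmsSender $sms, EmailSender $email)
    {
        $this->sms = $sms;
        $this->email = $email;
    }

    public function notify(array $user, string $short, string $long): void
    {
        $text = !empty($user['prefers_long']) ? $long : $short;
        if (($user['channel'] ?? 'email') === 'sms') {
            $this->sms->sendSms($user['phone'], $text);
        } else {
            $this->email->sendEmail($user['email'], $text);
        }
    }
}

// Stand-in implementation of both low-level interfaces: it just keeps a log.
final class RecordingGateway implements SmsSender, EmailSender
{
    public $log = [];

    public function sendSms(string $number, string $text): void
    {
        $this->log[] = 'sms:' . $number . ':' . $text;
    }

    public function sendEmail(string $address, string $text): void
    {
        $this->log[] = 'email:' . $address . ':' . $text;
    }
}

$gateway = new RecordingGateway();
$notifier = new UserNotifier($gateway, $gateway);
$user = ['channel' => 'email', 'prefers_long' => true,
         'email' => 'ann@example.com', 'phone' => '+100'];

// Ordinary event: the high-level component respects the user's settings.
$notifier->notify($user, 'Order shipped', 'Your order has shipped.');

// Critical event: go straight to the low-level layer, ignoring settings.
$gateway->sendSms($user['phone'], 'Security alert');
```

Because the low-level senders remain usable on their own, the critical path does not have to fight the high-level component's rules.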

Your imagination works many times faster than your fingers: validate and “try on” the architecture on a piece of paper before you start implementing it in code. It would be a shame to realize after 5 hours of coding that the interfaces you invented do not cover the needs of the subject area, when you could have foreseen the problem by spending 20 minutes analyzing and checking the architecture on paper. Sometimes I spend a full day sitting and looking at the sky, inventing an architecture and test-driving it on paper.

Do not overload your interfaces. In pursuit of interface completeness we may include redundant elements, and here you really can have too much of a good thing. The more elements an interface includes, the less freedom it leaves to whoever implements it. Also, do not forget that at some point you may need to change this interface in light of new business tasks. The simpler the interface, the easier it is to change both it and the actors on either side of it.

It may sound paradoxical, but an overloaded interface will be less complete than a well-balanced one. Excessive details narrow the interface rather than widen it, because some details lose their meaning in another context. For example, we could overdo it and introduce the concept of a time zone into the interface of our user notification system: “notify the user about an event, taking into account (or ignoring) his time zone”. In some contexts this will be the right interface, in others the wrong one. Suppose users of our system start living on the moon, where there is no “time zone” in the sense earthlings are used to. Then this extra load in the interface will be redundant and will work to the detriment of the entire architecture.

Do not forget about performance and scalability when designing your architecture. Ideally, interfaces should be as simple as possible: say, a couple of functions that let you modify and delete an entity in the storage. By packing only 2 functions into the interface, we get a high level of abstraction: we can use either a relational database or NoSQL for physical data storage. But if there are thousands of such entities, it becomes obvious that they need to be manipulated at the DBMS level rather than in the application. Then you need to consciously include in the interface the structure of the database where these entities are stored. Otherwise the interface will be beautiful but incomplete, because, given the performance requirements, a complete interface must provide fast and efficient tools for mass operations on entities.
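One way to read this advice is sketched below in PHP. Note that it takes a milder route than exposing the database structure itself: it adds a single mass operation to the interface, which a database-backed implementation could presumably translate into one query. All names (`EntityStorage`, `InMemoryStorage`, `updateWhere`) are hypothetical, and the in-memory class exists only to make the sketch runnable.

```php
<?php
// Hypothetical sketch: a storage interface that stays small but is
// "complete" with respect to mass operations.

interface EntityStorage
{
    public function save(int $id, array $fields): void;
    public function delete(int $id): void;

    // Mass operation: change every entity whose fields match $match.
    // Returns the number of entities changed.
    public function updateWhere(array $match, array $changes): int;
}

final class InMemoryStorage implements EntityStorage
{
    private $rows = [];

    public function save(int $id, array $fields): void
    {
        $this->rows[$id] = $fields;
    }

    public function delete(int $id): void
    {
        unset($this->rows[$id]);
    }

    public function updateWhere(array $match, array $changes): int
    {
        $updated = 0;
        foreach ($this->rows as $id => $row) {
            if ($this->matches($row, $match)) {
                $this->rows[$id] = array_merge($row, $changes);
                $updated++;
            }
        }
        return $updated;
    }

    // Helper for the demo only; not part of the interface.
    public function get(int $id): ?array
    {
        return $this->rows[$id] ?? null;
    }

    private function matches(array $row, array $match): bool
    {
        foreach ($match as $field => $value) {
            if (!array_key_exists($field, $row) || $row[$field] !== $value) {
                return false;
            }
        }
        return true;
    }
}

$storage = new InMemoryStorage();
$storage->save(1, ['status' => 'draft', 'title' => 'A']);
$storage->save(2, ['status' => 'published', 'title' => 'B']);
$storage->save(3, ['status' => 'draft', 'title' => 'C']);

// One call instead of loading and saving each entity in application code.
$changed = $storage->updateWhere(['status' => 'draft'], ['status' => 'archived']);
```

The trade-off is the one the paragraph describes: the interface is a little larger, and some storage backends will find `updateWhere` harder to implement efficiently than others.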

Architecture creation

I consider the ability to correctly understand the subject area and to identify successful interfaces an art. In my own case, I learned this craft through practice and through contemplating the architectures of other authors, always passing the architecture under study through the prism of my own critical thinking.

The next time you need to solve a relatively large task, move away from the computer and sit for an hour with a piece of paper. At first, perhaps, no thoughts will come to mind, but honestly keep reflecting on the problem and on the abstractions and interfaces that may be hidden inside it. Do not get distracted: depth of immersion and concentration are very important for thinking through and arranging all the actors and their connections in your imagination in as much detail as possible.

When you look at someone else's architecture (or your own, implemented some time ago) and need to make changes to the code, try to analyze whether the current architecture makes these changes convenient and whether it is flexible enough. What could be improved in it?

Upd.: I wrote this article while preparing to speak at a conference. The video can be viewed here: meduza.carnet.hr/index.php/media/watch/12326
