O-notation in software design

From the sandbox

Understanding SOLID, I often stumbled upon the fact that not following these principles can lead to problems. Problems are known, but poorly formalized. This article is written with the aim of formalizing the typical situations that arise in the process of writing code of possible solutions and the resulting consequences. We will talk about how bad code threatens us and how problems grow with the growth of the program. Unfortunately, in most cases, the assessment comes down to the “many” and “one” variants, which hints at the failure of the O-notation, but even such an analysis will allow us to better understand which code is really dangerous for the further development of the system, and which code is tolerable.

Definition

We will say that a change in a program requires “O” from f (n) actions if the programmer needs to make no more than f (n) logically separate changes in the program to implement the change up to a constant factor, where n is the size of the program.

Metrics

Consider some design properties according to Robert Martin and evaluate them in terms of O-notation.

A design is hard if a single change causes a whole cascade of other changes in dependent modules. The more modules you have to change, the harder the design.

The difference between the O (1) and O (n) changes is significant. Those. our design allows a constant number of changes, or as the program grows, the number of changes will increase. Further, the changes themselves should be considered - they can also be tough with some asymptotics. Thus, stiffness can be complex down to O (nm). The parameter m will be called the depth of rigidity. Even an approximate estimate of the depth of rigidity in a design that at least allows rigidity O (n) is a matter too complicated for a person, since each of the changes must be checked for even deeper changes.

Fragility is a property of a program to be damaged in many places when a single change is made. Often new problems arise in parts that have no conceptual connection with the one that has been changed.

We will not consider the question of the logical connection of the modules in which changes occur. Therefore, from the point of view of the Notation of the Difference between Fragility and Rigidity, there is also no reasoning valid for rigidity, applicable to fragility.

A design is stagnant if it contains parts that might be useful in other systems, but the efforts and risks involved in trying to separate these parts from the original system are too great.

Risks and efforts in this definition can be interpreted as non-constant as the module grows in the number of changes that occur in the module when trying to abstract it from the original system. However, as practice shows, abstraction is still worth it, since it brings order to the module itself and allows it to be transferred to other projects. Very often, after the first need to transfer the module to another project, other similar ones appear.

Viscosity.

Faced with the need to make a change, the developer usually finds several ways to do this. Some retain the design, others do not (that is, they are essentially a hack). If design-preserving approaches are harder to implement than a hack, then the design viscosity is high. Solving a problem is wrong, easy, and right is difficult. We want to design our programs so that it is easy to make changes that preserve the design.

Following viscosity can be called short-sighted in terms of O notation. Yes, at first the developer really has the opportunity to make a change in O (1) instead of O (n) (due to rigidity or fragility), but often such changes lead to even greater rigidity and fragility, i.e. increase the depth of rigidity. If you ignore this "bell", the following changes may be no longer possible to solve the "hack" and it is necessary to amend the conditions of stringency (previously used to proc eed than before the "bell") or lead system in good condition. That is, when viscosity is detected, it is better to rewrite the system right away.

It happens like this: Ivan needs to write a certain code, which kurdyachit bokrenka. Climbing through the different parts of the program, where, as he suspects, has been kurdyachili more than once, he finds a suitable fragment. It copies it, inserts it into its module and makes the necessary changes.

But Ivan does not know that the code he extracted with the mouse was put there by Peter, who took it from the module written by Sveta. Sveta first had kurdachit bokrenka, but she knew that this process is very similar to kurdächenie barmaglotov. She somewhere found the code that pulls the barmaglot, copied it into its module and modified it. When the same code in slightly different forms appears again and again, developers lose the idea of abstraction.

In this situation, it becomes obvious that when it becomes necessary to change the basis of the excavated action, this change should be made in n places. Given the possibility of unique modifications in each copy-paste, this may also result in logically unrelated changes. In this case, it is likely to just forget about the change elsewhere, since there is no protection at the compilation stage. This means that it can turn into O (n) test iterations.

Application About Notation to SOLID

SRP (Single Responsibility Principle). A software entity should have only one reason for change (responsibility). In other words, for example, a class should not follow business logic and display, since when changing one responsibility, we must make sure that the other liability is not damaged. That is, the inconsistency with the SRP principle translates into rigidity and fragility. Also, following this principle helps to get rid of inactivity and transfer modules from one program to another with potentially fewer dependencies.

The asymptotics of changes remains in general the same as without following the principle, but the constant factor decreases significantly. In both cases, we should forward the entire contents of the class and, in the case of changes in the interface of the entity, the place where they interact with this entity. Only following the SRP helps to reduce the interface and the likelihood of its change, as well as the amount of internal implementation, which after the change may be faulty. Similarly, the reasoning, trimmed before the argument about interfaces, is valid for the ISP (Interface Segregation Principle).

OCP (Open Close Principle). Software entities (classes, modules, functions, etc.) should be open for expansion and closed for modification. This should be understood in such a way that we should be able to modify the behavior of a module without affecting its source code. Typically, this is achieved through inheritance and polymorphism. Since the violation of the LSP principle is a violation of OCP, and while DIP is a means for maintaining OCP, the following can be applied to both the LSP and DIP. Inconsistency with the principle of openness-closeness forces to make changes in all entities unclosed relative to this change.

A rather trivial situation is, for example, the presence of a chain of Iphs that determine the type of variable among the list of child classes. Such structures may appear in the program more than once. And with every addition of a new child class, appropriate changes should be made in each such chain. Such situations can appear not only with child classes, but also with any consideration of not all possible particular cases [This refers to cases not at the time of writing, but in general. Cases may appear later].

Now consider the situation when we make m changes of the same type, which, because of inconsistency with the principle of openness-closeness, require n operations from us. Then, if we leave everything as it is, supporting the architecture of the consideration of special cases, and we do not generalize, then we obtain the total complexity for all changes O (mn). If we close all m places relative to this change, then subsequent changes will occupy O (1) instead of O (m). Thus, the overall complexity is reduced to O (m + n). This means that it is never too late to start following OCP.

Martin doesn’t have to guess about such a situation (if you don’t know exactly, of course) how to close from the first change, but after the first change you should close, because the first change was a marker that the system would not necessarily remain in its current state. This is logical, since we are doing O (1 * n) actions because of the non-closing, and then O (m) actions, to close on subsequent changes. Total we get the total complexity of O (n + m), but at the same time we do all the actions exactly when they become necessary and do not do anything in advance, not knowing if it will be needed.

Patterns and O-Notation

You can draw another analogy between O-notation in the theory of computation and O-notation in design. It lies in the fact that we reduce the number of calculations using algorithms and data structures, such as search trees and heaps, which solve typical tasks faster than “head-on solutions”, and the number of programmer’s operations is well-designed, which can use equally good typical solutions. which are called design patterns. You can evaluate the effect of patterns in the context of the SOLID principles and, as a result, in the context of O-notation.

For example, the Mediator pattern eliminates the possibility of breaking something in the program when the logic of the interaction of two objects changes, since it completely encapsulates it and guarantees the constant complexity of such a change.

The Adapter pattern allows us to use (read add) entities with different interfaces that we will use with a common purpose. Using this template, you can embed a new object with an incompatible interface into the system for a number of operations, independent of the size of the system.

As data structures can support some operations with good asymptotics, and some with poor, and patterns manifest themselves flexibly with respect to some changes and rigidly with respect to others.

Reasonable Limits

When dealing with O-notation, working on an optimization problem, we must remember that it is not always the algorithm with the best asymptotics that is best suited for solving the problem. It should be understood that the sorting of a bubble for an array of 3 elements will work faster than the pyramidal, despite the fact that pyramidal sorting has the best asymptotics. For small values of n, the constant factor plays an important role, which O-notation hides. O-notation in design also works in the same way. For small projects, it makes no sense to fence a lot of templates, since the costs of their implementation exceed the number of changes that should be made from “poor design”.

Tags: