How to explain to non-IT managers the principles of building a fault-tolerant IT infrastructure

    About a year ago I was given a rather serious task: to fit the story of both Agile and DevOps into a 2-hour lecture for managers.

    That is how my return from the soft-skills side of Agile training back to IT began. According to the organizers, over 1000 product managers have attended this lecture, and roughly 48 out of every 50 of them heard the words “load balancer” for the first time in my class.

    I even ended up with a joke deity: “the great balancer, master of zero-downtime updates, of A/B tests that are cheap to implement without programming, and of a manager's good night's sleep in general.”

    Of course, colleagues from IT may laugh at this simplification, or even be outraged: the word “balancer” is not the be-all and end-all, so how much attention can one possibly give it?

    But when 48 out of 50 people in my classroom have never heard of load balancing, it is a little sad. Besides, the backend developers of some mobile applications, even at large banks, are guilty of doing without such setups.

    My favorite yellow bank, for example, updates the backend server of its mobile application at 5 a.m. Moscow time about twice a week. How do I know that? Because in Novosibirsk, where I moved back to live for a year in 2016, it was already 9 a.m. at that moment, and I would get error 000. It is scary to think that in the Far East it is already lunchtime.

    Perhaps we have a chance to make this world a little better if managers think about fault tolerance when budgeting server capacity, so that instead of one server for everything there is capacity truly commensurate with the risk and the load on the system.

    Why?


    The very first question that comes up when setting any task is, of course: why?

    There is a simple framework for it:

    Why do we need it? | Why do they need it?

    Why do we need it?


    Let's imagine that “we” are a broad group of IT people: not only developers and related specialists, but also technology consultants, HR and Agile coaches, who are in daily contact with managers without an IT background.

    For myself, I answered the first question quite simply: improving managers' technical literacy greatly reduces the likelihood of unreasonable tasks being set and makes developers happier.

    Why do they need it?


    Why should managers who are really far from IT know about this?

    We are all human, and we all want to sleep peacefully. Managers often bear responsibility for things they cannot really influence. The stress level in that case is comparable to that of airplane passengers with a fear of flying.

    And this is probably the only argument that does not come across as snobbery of the “how can you not know such obvious things” or “anyone should be able to take an indefinite integral blindfolded in the middle of the night” kind. In my experience, a person who is “up to the elbows in the console” often falls back on such clichés, even if unconsciously.

    How to explain complex things with simple pictures


    The illustrations below do not claim to be the absolute truth and have no value on their own. These simplifications should certainly not be used as a guide to action when building fault-tolerant architectures, since I deliberately left out various subtle points, such as caching. This is just a simplified model.

    In adult learning (and absorbing new information is part of learning), it is important to understand that any information has to be repeated at least three times to increase the likelihood that it is actually retained.

    For example, a diagram like this one will most likely be associated with the “don't try to leave Omsk” meme and only reinforce the belief that “everything is complicated, and on top of that they want a lot of servers”.



    This diagram, on the other hand, if shown first, can help a person associate the word “balancer” with the idea of balancing load across servers. There is no guarantee that they will understand the process correctly, but they will know with confidence that it exists and why it is needed.
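
    For readers who prefer code to pictures, the same idea can be sketched in a few lines of Python. This is a toy model and not a real balancer: the class name, the server names “app-1” to “app-3” and the drain/restore methods are all made up for illustration. It only shows that requests are handed out in turn, and that a server taken out of rotation for an update simply stops receiving traffic.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy load balancer (hypothetical): hands out servers in turn."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.in_rotation = set(self.servers)
        self._ring = cycle(self.servers)

    def drain(self, server):
        """Take a server out of rotation, e.g. while it is being updated."""
        self.in_rotation.discard(server)

    def restore(self, server):
        """Put an updated server back into rotation."""
        self.in_rotation.add(server)

    def pick(self):
        """Return the next server in rotation, or fail if none is left."""
        for _ in range(len(self.servers)):
            server = next(self._ring)
            if server in self.in_rotation:
                return server
        raise RuntimeError("no servers in rotation")


# Server names are invented for the example.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
normal = [balancer.pick() for _ in range(6)]

balancer.drain("app-2")  # app-2 is being updated; traffic goes elsewhere
during_update = [balancer.pick() for _ in range(4)]

print(normal)
print(during_update)
```

    While “app-2” is drained, every request still gets an answer from one of the other two servers, which is the whole point of an update without downtime.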



    Let's mangle the Agile Manifesto a little here and say: “that is, while there is value in the items on the right, we value the items on the left more.”

    For example, because this diagram makes it clear how to set up A/B testing without writing tons of source code, and how to update a server without a drink for courage beforehand (for the manager, not the admin).
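
    To make the A/B part a bit more concrete, here is a minimal sketch of how a balancer-like component might split users between two versions. It assumes that every request carries some user identifier; the function name, the 20% share and the “user-N” identifiers are hypothetical. Hashing the identifier keeps each user on the same variant from request to request, so nobody jumps between versions.

```python
import hashlib


def ab_variant(user_id: str, b_share_percent: int = 50) -> str:
    """Assign a user to pool "A" or "B", stably across requests."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return "B" if bucket < b_share_percent else "A"


# Hypothetical user ids; in real life this would come from the request.
users = [f"user-{i}" for i in range(10_000)]
share_b = sum(ab_variant(u, 20) == "B" for u in users) / len(users)

print(f"share of users on variant B: {share_b:.1%}")
```

    The share of users on variant B comes out close to the requested 20%, and changing the experiment size means changing one number rather than the application code.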

    What's next?


    This understanding is what opens the way for a manager into the wonderful world of CI/CD: once we know how little effort it takes to make the infrastructure at least partially fault-tolerant, we are less afraid of frequent releases. And that fundamentally changes the approach to update policy in general.

    I hardly need to tell you that small changes rolled out to 1/10 of the capacity (even if that is 1 server out of 3, as long as it receives only 10% of the traffic) make updates far less dramatic. Even if the new version fails completely, only every 10th request goes unprocessed.
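
    The arithmetic behind this can be checked with a small sketch. The weights, the server names and the assumption that the new version fails on every request are all invented for the example; the point is only to compare an even split across 3 servers with a split where the updated server gets 10% of the traffic.

```python
import random


def failed_share(weights, failure_rate):
    """Expected share of failed requests for a weighted traffic split."""
    total = sum(weights.values())
    return sum(
        weight / total * failure_rate.get(name, 0.0)
        for name, weight in weights.items()
    )


# Worst case (assumption): the freshly updated server fails every request.
failure_rate = {"new": 1.0}

even_split = {"old-1": 1, "old-2": 1, "new": 1}         # 1 server out of 3
ten_percent = {"old-1": 45, "old-2": 45, "new": 10}     # only 10% of traffic

print(f"even split:    {failed_share(even_split, failure_rate):.0%} of requests fail")
print(f"10% to update: {failed_share(ten_percent, failure_rate):.0%} of requests fail")

# The same split, simulated as weighted random routing of 10,000 requests.
rng = random.Random(42)
picks = rng.choices(list(ten_percent), weights=list(ten_percent.values()), k=10_000)
simulated = picks.count("new") / len(picks)
print(f"simulated share hitting the updated server: {simulated:.1%}")
```

    With an even split a broken update takes down about a third of the requests; with a 10% weight it takes down about one in ten, and the other nine are served as usual.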

    We once had a 20% drop at 600 RPS, and it was fixed quickly, apparently even without human involvement. It was then that I, as the technical PM responsible for all the backends in our division, started repeating the word “balancer” to other managers almost reverently.

    In my experience, this knowledge is extremely useful precisely because it lets managers understand how to minimize release risks and get interested in CI/CD and various technological experiments.

    About 4 years ago I had to do much the same thing with developers: explain GitFlow-like branching models for stabilizing releases, and moratoriums on commits to the release branch enforced at the hook level. Lately this has been needed less and less.

    In my opinion, it is now really important to improve the technical literacy of non-technical managers. Not necessarily in this way, of course.

    Poll: had you heard the word “balancer” before this article?

    • I'm an IT professional, of course: 87.5% (70 votes)
    • I'm an IT professional, no: 5% (4 votes)
    • I'm a non-IT manager, yes: 6.2% (5 votes)
    • I'm a non-IT manager, no: 1.2% (1 vote)
