A Castanedian warrior in the field of risk management

    Want to know what risk management is and how to handle it with ninja dexterity?
    Then welcome under the cut!

    (This article outlines thoughts inspired by the work of Tom DeMarco; it will most likely be uninteresting to those already familiar with his books.)




    In 1988, the city of Denver decided to build a new airport to replace the old one. The existing airport no longer met the needs of the growing city and was unsuitable for expansion. A budget was allocated and construction began. But, unfortunately, the project could not be completed on time, and the reason for the delay was software that was not written on schedule. The airport was built with low, narrow tunnels for transporting luggage; the software was supposed to scan each bag's barcode and route it through the tunnel network. Since baggage could not be delivered without the software, the finished airport sat idle until the software was ready. Airport construction ties up enormous capital, and all of that capital stayed frozen while the programmers scrambled to catch up. And time is money: lost profit plus direct losses came to half a billion dollars. The media destroyed the reputation of the software vendor, putting all the blame on it.

    Now let's imagine that the managers of this project had spent half a day of their precious time on a risk-assessment brainstorm. Imagine they had gathered, analyzed what losses they would incur if the runway or the software were not ready on time, and thought about how to reduce those losses in case of force majeure. After all, a network of tiny tunnels is not the only way to move luggage! If a couple of million dollars had been spent on slightly more spacious tunnels, luggage could have been hauled by cheap labor or small trucks. Two million is far less than half a billion. Failing that, you wouldn't even have to invest in wider tunnels: you could use sled dogs! Hire a dog handler, make a deal with the owner of the nearest stray-dog shelter, and start training. If the software were ready on time, the price of hot dogs at the airport food stalls would stay the same, but in case of trouble the sled dogs could save the day. Either way, it is at least some kind of safety net. But there is one catch: you have to think about such things in advance. If, on the day the project is due, the contractor comes running and shouting "boss, everything is lost!", it is already too late to dig new tunnels or train dogs, because by the time you finished, the contractor's work would probably have come to its logical conclusion anyway.

    Terminology

    When you put into the project budget the potential losses from a risk multiplied by the probability of that risk, this is called risk containment. The risk will either materialize or it won't. If it doesn't, the money stays with you. If it does, the extra money you put into the budget may not even be enough. But if you evaluate all the risks, and some of them materialize while others don't, then on the whole you will probably meet the budget.
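
    This bookkeeping is easy to sketch in code. A minimal illustration (the risk register and all numbers here are invented, not taken from the Denver project):

```python
# Containment reserve = sum of probability-weighted losses over a risk register.
# All entries are illustrative.
risks = [
    {"name": "software late",    "probability": 0.5, "loss": 500_000_000},
    {"name": "runway not ready", "probability": 0.1, "loss": 80_000_000},
    {"name": "vendor fails",     "probability": 0.2, "loss": 20_000_000},
]

reserve = sum(r["probability"] * r["loss"] for r in risks)
print(f"containment reserve: ${reserve:,.0f}")

# Any single risk can still blow past the reserve on its own...
worst = max(r["loss"] for r in risks)
print(f"worst single loss:   ${worst:,.0f}")
# ...but across many risks, hits and misses tend to average out,
# which is exactly the "you will probably meet the budget" claim above.
```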

    When you spend money in advance to reduce the losses a risk would cause, this is called risk mitigation. For example, when you hire a dog handler to train the dogs: whether the risk materializes or not, the handler will eat and have fun at your expense.



    Let us compare mitigation and containment of risks for the construction of the airport.
    It is known that a similar program took another team of programmers twice as long to build. Besides, the programmers keep complaining that they cannot catch up with the schedule, so let's assume the probability of finishing on time is 50%. In truth, if you are familiar with risk charts, you will agree that the probability of the most optimistic scenario among all possible ones is about a nanopercent, but we, like the construction managers, will slightly inflate expectations and assume the probability of success is 50%. As in the old joke: it either happens or it doesn't, so 50%. In IT, in most cases, time and probability estimates are given intuitively and off the top of one's head anyway. So, the probability of the risk is 1/2, the price is half a billion, so we set aside half of half a billion to contain the risk. Mitigating the risk, by contrast, costs a few months of a dog handler's salary, a ton of dog food, plus the pay of a janitor cleaning up after dogs living in an unfinished airport. Or the price of more spacious tunnels, if you prefer. It will look something like this:
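
    The same arithmetic as a quick sketch (the $500M loss and the 50% probability are the estimates above; the mitigation figures are invented for illustration):

```python
# Containment vs. mitigation for the airport example.
# Loss and probability come from the estimates above; mitigation costs are guesses.
probability = 0.5
loss = 500_000_000

containment = probability * loss                    # $250M reserved "just in case"
mitigation_tunnels = 2_000_000                      # more spacious tunnels
mitigation_dogs = 6 * 5_000 + 10_000 + 6 * 3_000    # handler, food, janitor (invented)

print(f"containment:            ${containment:,.0f}")
print(f"mitigation via tunnels: ${mitigation_tunnels:,}")
print(f"mitigation via dogs:    ${mitigation_dogs:,}")
```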



    As you can see, in this particular case risk mitigation looks far more profitable. Of course, if the risk never materializes, you will have spent resources you didn't have to spend. But if it does materialize and you haven't hedged, the consequences will be disastrous. It's like skimping on a safety rope when you are about to walk a tightrope. Here mitigation is much cheaper than containment, and its benefit is obvious. But consider the opposite situation: what if containing a risk is much cheaper than mitigating it? Does that mean the risk should simply be contained? Not always. This is where the warrior's philosophy will help us.



    Meet the Carefree Farmer. He knows that the harder he waters and weeds his field, the richer his harvest will be. The Carefree Farmer's picture of the world looks something like this:



    The more you try, the more you get. The dependence is exponential: over time, the same effort brings a greater reward (your qualifications and standing in society grow, passive income increases, and so on), until some fixed ceiling is reached. Everything is predictable and simple.



    And this is the Fearless Warrior. A warrior may receive a thousand minor wounds and still stand firmly on his feet, until he receives a fatal blow. In a confrontation between two sides, each side tries to maximize the enemy's fatal risks: wearing the enemy down is far less effective than a single calculated blow to the heart. For a warrior, the formula "more effort, more reward" does not work. He may lose everything through one single mistake. Both philosophies, the farmer's and the warrior's, apply to risk, but the warrior's is closer to the mark. A risk will either materialize or it won't; it will either sink the project or it won't. With a shield or on a shield.
    On the other hand, we do not live in ancient Japan, and it is not our custom to chop off offenders' heads left and right in any unclear situation. Not every failed project means the collapse of the organization and the end of a manager's career. Even from a seemingly hopeless situation you can try to extract at least some benefit: complete the project with reduced functionality, change the target audience, reuse what was built in the next project, or draw conclusions about why the project failed and try to prevent it from happening again. The failure will not stop being a failure, but at least some of your efforts will pay off.

    Examples of risks in the confrontation of two forces


    White's next move is to take the knight to e5. If Black grabs the queen, mate follows:
    7... Bxd1. The black bishop captures the white queen.
    8. Bxf7+ Ke7. The white bishop, covered by the knight, gives check; Black's only legal move is the king to e7.
    9. Nd5#. The second knight lands on d5. Checkmate.

    You have the opportunity to win the opponent's queen at the cost of allowing a check, but if the opponent can exploit the risks this creates, you lose the game. This is a fatal (or critical) risk for your side.



    Consider an example from a team game of intellect, DotA, where each team consists of 5 players. The more local skirmishes a given player wins, the stronger his character becomes, and the better his team's chances of winning the game. However, even if one team wins over and over and the other seems to have no hope, a single mistake by a player on the dominant team can mean defeat for the whole team. If in chess the goal is to take down the king, in DotA the goal is to take down the source of the opposing team's strength (the World Tree of the "whites" or the magic ice throne of the "blacks"). A player will sometimes take a risk by attacking a weaker player from the opposing team, alone or with a partner. The risk is that it may actually be a trap, with the "weak" player covered from ambush. If two players walk into a well-organized ambush, the nearer one will most likely be "killed" and out of the game until he "resurrects", while the second can either try to save him (at the risk of being taken out of the game for a while too) or run while the opposing team deals with the first. If the second player takes the chance and also "dies", the team is left outnumbered. The enemy team can then launch a full-scale attack and, exploiting its numerical advantage, break through the defenses and destroy the enemy's "source of strength".

    This resembles the final fight in some action movie, where one character takes a beating throughout the bout and generally looks like a complete wimp next to the favorite, yet at the end, by some miracle, knocks his opponent out. Yes, that is the warrior's philosophy again. It is not enough to throw farther, jump higher, hit harder and have more experience. It takes wisdom and courage to anticipate possible risks.

    Let's see what the losing team did wrong. Maybe they shouldn't have attacked and risked an ambush at all? Actually, the attack was worth it. Risk and profit go hand in hand; risks that pay off bring victory closer. Nothing ventured, nothing gained. The point is that you have to take risks wisely. In this case, once the second player realized something smelled fishy, he should have been saving his team, not his partner. On the other hand, if you are forced to abandon a friend in need, that is a clear sign you are doing something wrong. Here the risk should have been mitigated by scouting first. Yes, that would have required extra resources and time, but it would have significantly reduced the risk. Moreover, the team was dominating and could afford to spend some resources on reconnaissance. Apparently the feeling of imminent victory went to their heads, and they forgot about the possible risks entirely. Disrespecting an adversary is a fatal mistake. Remember: any relationship between people is built on mutual respect.

    Now back to the question: if containing a risk is much cheaper than mitigating it, should the risk be contained? Let me remind you that the containment reserve = probability of the risk * losses if it materializes. If the risk does materialize, you will have to pay the full cost, which can be several times greater than the amount reserved for containment. If you have no way to pay that amount, and failing to pay means the failure of the project, then, in my opinion, it is better to mitigate. If mitigation is impossible, it is worth thinking hard about whether the project should be undertaken at all. In my opinion, it is wiser to take half the potential profit than to risk losing everything. Of course, you cannot control every risk. Risk and profit go hand in hand, so sometimes it makes sense to simply take the chance. You cannot guarantee that a meteorite will not fall on your house tomorrow, but that is no reason not to buy a home of your own. Maybe it will fall, maybe it won't; the home is worth it. On the other hand, that is no reason not to hedge in case it does fall: a small sum in the bank, enough to rent housing for a few months, can quite adequately contain the risk of freezing on the street.
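
    That decision logic can be written down as a short sketch. The function and thresholds here are my own illustration of the paragraph above, not a standard risk-management procedure:

```python
# Sketch of the decision rule described above; thresholds are illustrative.
from typing import Optional

def choose_strategy(probability: float, loss: float,
                    mitigation_cost: Optional[float],
                    max_affordable_loss: float) -> str:
    expected_loss = probability * loss  # the containment reserve

    if loss > max_affordable_loss:
        # If the risk materializes, it sinks the project outright.
        if mitigation_cost is not None:
            return "mitigate"                # pay up front to cap the damage
        return "reconsider the project"      # can't pay, can't mitigate
    if mitigation_cost is not None and mitigation_cost < expected_loss:
        return "mitigate"                    # cheaper than the reserve anyway
    return "contain"                         # budget the expected loss and go

# Airport example: a $500M loss is unaffordable, so mitigation wins.
print(choose_strategy(0.5, 500e6, mitigation_cost=2e6,
                      max_affordable_loss=100e6))
```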

    With innovative technologies, risk management becomes quite a subtle game. On the one hand, working with a new technology promises plenty of risks; on the other, not taking the risk of exploiting it means risking letting competitors get ahead of you. In that case, not taking the risk is a far greater risk than taking it (sounds convoluted, doesn't it?). As I said, in my opinion it is wiser to take half the potential profit than to risk losing everything, although of course each situation needs its own assessment: settling for reduced profit can guarantee that competitors squeeze you out of the market if you had no trump cards up your sleeve to begin with. No universal algorithm of actions can be devised here. As Field Marshal Helmuth von Moltke said, no plan survives contact with the enemy.

    Learning from your own mistakes in the risk management process


    In the Denver airport story, all the blame was pinned on specific contractors, although in fact the responsibility and the losses fell on the city municipality. How could the managers have realized they had done something wrong? I have noticed the following pattern: if a small failure leads to far larger and more critical consequences, then a mistake was made somewhere in risk management. For example, when an entire airport sits idle because the software isn't ready. Or when the signal lights go dark because of a power outage (for such cases you can stock up on backup generators). Or when half the globe flies off into space as debris because someone pressed the wrong button. But then, you don't have to mount the nuclear-briefcase button next to the coffee button on the machine in the lobby. To the manager it looks like this: some performer made a mistake, pressed the wrong button, and now there is no Earth, so the performer is entirely at fault. To the performer it seems he was barely mistaken at all: half asleep, he hit the wrong button. If there is a sense that the error was insignificant, and that sense is believable, it is a signal that risk management was done very badly. This, of course, does not excuse the performer who destroyed humanity. But you don't have to work in offices whose corporate policy requires a nuclear-briefcase button in every coffee machine. Or, at the very least, don't go within 3 meters of those machines. You can also try to get the corporate policy changed, or fence the machines off. Unfortunately, in most companies "initiative is punishable", so you would have to put up the fences in your own time and at your own expense, and also answer for all the risks associated with those fences, even though you are not a manager and have neither the authority nor the capacity to manage these risks.
    The opposite case is also possible: if a risk lies in plain sight and periodically causes a heap of problems, but everyone ignores it, the performers will come to believe that mistakes are as inevitable as the problems they bring. Since everyone grows habituated to the problems, there will be no sense that the errors are trivial while their consequences are huge; there will only be a sense of the hellishness of what is going on.

    Consider a few stories from IT.


    Story 1. The suspicious product owner

    Once upon a time, two teams in different cities worked for the benefit of one startup. The first team consisted of a product owner (a deputy director) and his assistant; both combined the roles of manager, analyst and tester. The second team consisted of several programmers, a tester and a team lead. They did automated collection and processing of information scraped from the Internet and published it on the project's website. Since the sources had different formats, and could change at the whim of their developers, it was not always possible to collect information from them correctly. For this reason the tester compared, around the clock, the information at the source with the information on the project website. On finding an error, the tester filed a bug card on the board, the programmers fixed it, and in the next version of the program that kind of information was collected correctly. Meanwhile, in the other city, another tester, the owner's assistant, also checked quality. He would find a long-standing, already-fixed bug, but instead of filing a card he would report it to the product owner. The programmers would then cheerfully reply that this bug had long been fixed: they had simply re-collected the information for that particular case with the latest algorithm, and it went through without errors. The owner did not trust the programmers and believed they were covering for themselves and for a tester who couldn't find bugs. Unfortunately, the specifics of the project made it impossible to re-collect all the information from the sources with every new release: first, the volumes were large, and second, the information lost its relevance and value over time. In general, after some time the mutual distrust grew into an open conflict.
    Could this situation have been predicted in advance? Not easily, but yes. It would have been enough to put yourself, in turn, in the place of each participant in the process, take his job responsibilities into account, and imagine what he would do and what difficulties he would run into. The owner in the other city must make sure the remote team does not slack and idle, and must also set the overall strategic direction of the project; the assistant must help him with this; the team must work. Naturally, in the absence of full process transparency, suspicions may creep into the owner's head. Moreover, full transparency was unattainable, because neither the owner nor his assistant had ever done any programming and so could not objectively evaluate the remote team's productivity. Accordingly, the only possible way to prevent a conflict was to make sure the "overseers" could find nothing to pick at. This risk should have been predicted and acted upon. But even when the risk materialized, the team lead and the owner did not look for ways to solve the problem; they went looking for someone to blame. A programmer partially solved the problem by proposing to delete and re-collect the previous month's data on the test site every week, so that the owner's assistant would be hunting for errors in data produced by the latest algorithm. The root problem, the lack of trust and respect between the teams in the two cities, was never resolved; but at least the owner now had fewer reasons for discontent.
    Moral: solve problems instead of shifting responsibility. (Responsibility for the project as a whole lies with the project manager in any case; more on this in the manager's black book.) Solving a problem means analyzing the situation, finding the root cause, assessing whether mitigating the risk is worth it to prevent a recurrence, assigning mitigation measures to subordinates, and monitoring their implementation. And before all of that, of course, you need to give instructions for cleaning up the consequences of the risk that has already materialized.

    Story 2. An abrupt switch to a new program

    There was a program for accounting; let's arbitrarily call it program A. It was decided to write a new program B that would feed data into program A through a database. More detailed production data was entered into the new program B than into the old program A; the necessary figures were then calculated in B and transferred to A. Users had to enter data in B and then continue working in the old program A. B was used not only to feed data into A; it also gave users some new features. The developer of the new program B knew neither program A's business logic nor Delphi, the language A was written in, so he checked with A's author which data to transfer and into which tables.
    Unfortunately, this organization had no testing department, and the analyst who knew the business logic was also, part-time, the boss, and preferred to hand testing, and partly the analysis itself, over to her subordinates: the programmers. If a programmer misunderstood the analyst's rather abstract specifications, an unverified and incorrectly implemented program went out to the users.
    So that users would not be tempted to keep entering data into the old program, it was decided to forcibly take that option away from them and abruptly move them onto the new, insufficiently tested program. The manager tasked the author of the old program with changing it to forbid data entry, and he promptly complied. I don't know; maybe the author of the old program naively believed that this time the testing had actually been done. Maybe the habituation to the inevitability of problems described above played its part. Be that as it may, he said nothing about the risks. Or perhaps he subconsciously understood that he, as the only person who knew the old program's business logic (apart from the manager-analyst, of course), would end up doing the testing himself and would blow the deadline on his current task. One way or another, the under-tested program was forcibly rolled out to the users. Predictably, it turned out that some corner cases had not been handled. Because of this, the users' work stopped several times for hours at a stretch while changes were hastily made to the program. The boss laid the responsibility for all of it on the developer of the new program.
    Let's see what mistakes were made here from a management point of view. If in the previous story the team lead could at least in theory have foreseen the risks and mitigated them, this leader, who had never done any programming, had no such opportunity. Since she had to manage IT processes she did not understand, she tried to shift more of the responsibility onto those who did understand them: the final executors. But she did it implicitly. That is, after a couple of incorrectly calculated risks, the programmers were supposed to divine the boss's perfectly clear but never stated position: you are a programmer, you do all sorts of magical things I understand nothing about, so you answer for everything that happens on the project. Yet no one was ever given the explicit task of taking up risk management. It happens that a leader is incompetent in some areas, and that in itself is fine, because a leader has subordinates who can be entrusted with specific tasks. If you can't draw, delegate to a designer; if you can't build websites, delegate to a programmer; if you can't manage risks, delegate to the senior programmer; if you can't audit the effectiveness of risk management, outsource it. The main thing is a clear understanding of who is responsible for what. It is considered good form to agree on responsibilities at hiring time. In this case, virtually no one was doing risk management. And don't forget that most managers have a boss of their own: if the project manager answers for the project as a whole, the higher manager answers for all the projects of the unit. Accordingly, the fact that the project was run by an analyst who could not lead, and who did her own analyst's job poorly because she was busy being the boss, is the higher manager's mistake. The Peter principle in action. Alas, good managers are even scarcer than good programmers.

    Story 3. "Continuous integration" via "broken telephone"

    The organization without a testing department from the previous story did have a mechanism for urgently shipping patches when users found bugs in production. The user finds a bug and files a request with the first line of support. The first line, if it cannot solve the problem itself, forwards the request to the project manager. The project manager makes sure it really is a bug, and not an unplanned user wish being smuggled in under the guise of one, and passes the ticket to the programmer; as a rule, each project is supported by one specific programmer. The programmer puts the executable into an agreed folder and tells the first line of support. The first line writes to the system administrators. The administrators deploy the patch and report back to the first line. The first line reports to the programmer. The programmer reports to the user. The user confirms that the bug is fixed. How does this "emergency" patch deployment work in practice? Information about a bug often reaches the programmer half a day late. The person on the first line assigned to the ticket can step out for half an hour at a time. And unless the programmer keeps asking the first line whether the patch has been deployed, he most likely won't even find out whether it was deployed at all that day.
    Once, a patch went undeployed for a long time. Then came word that it had been deployed, but the user's bug had not gone away. They connected remotely to the user's machine and discovered the patch still wasn't there: the person on the first line of support had given the admins incorrect information. The patch was redeployed and everything worked. And who gets the responsibility for the user sitting idle for half a day? Right, the programmer. Certainly not the testing department, which doesn't exist. Nobody blames themselves, so the analyst-manager, who cannot verify before release that the programmer did exactly what the analyst intended, has nothing to do with it either. The admins and the first line of support fall under someone else's jurisdiction, so they can't be made the scapegoat. That leaves the programmer. The most interesting part: for "emergency" patch deployment, a dedicated program had even been written that automatically emailed the programmer when a bug fix was assigned to him. It is unclear what prevented it from also automatically notifying the programmer and the user when the administrators mark a patch as deployed in that same program. In general, it is considered good practice for patches to be deployed at the click of a button, for example using TeamCity. Changing the deployment process solves the problem and mitigates the risks; finding someone to blame does not.
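
    The missing notification step is trivial to automate. A minimal sketch, assuming a hypothetical ticket record and a reachable SMTP server (all names, addresses and fields here are invented, not the organization's actual system):

```python
# Hypothetical sketch: when admins mark a patch as deployed, notify the
# programmer and the user directly instead of relaying by word of mouth.
import smtplib
from email.message import EmailMessage

def notify_patch_deployed(ticket: dict, smtp_host: str = "localhost") -> None:
    msg = EmailMessage()
    msg["Subject"] = f"Patch for ticket #{ticket['id']} deployed"
    msg["From"] = "deploy-bot@example.com"
    msg["To"] = ", ".join([ticket["programmer_email"], ticket["user_email"]])
    msg.set_content(
        f"The patch for '{ticket['summary']}' has been marked as deployed. "
        "Please verify that the bug is gone."
    )
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)

# Called from the deployment tool's "mark as deployed" handler:
notify_patch_deployed({
    "id": 4242,
    "summary": "report totals off by one",
    "programmer_email": "dev@example.com",
    "user_email": "user@example.com",
})
```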

    Consider several simplified algorithms for managers' behavior.

    Algorithm 1.




    Despite the absurdity and immorality of this tactic, in some cases it turns out to be effective. If the leader cannot influence the situation (senior management has not allocated funds, there is no understanding of how the processes should be managed, etc.), then by making life unpleasant for those who theoretically can influence it, the leader raises the chances of the problem being eliminated. In one store, all inventory shortages were deducted from the salaries of the movers and cashiers. After that, the store's employees voluntarily chipped in for video cameras and began taking turns reviewing the footage. Once several shoplifters who had been regularly robbing the store were caught red-handed, shortages dropped significantly. A positive result was achieved, but note the risks of this tactic: instead of chipping in for cameras and sacrificing their personal time, the staff could just as well have taken jobs at the store across the road. This makes the tactic extremely risky in IT. I do not recommend it.

    Algorithm 2.




    This is the algorithm I recommend.

    For a deeper understanding of risk management, I recommend the book Waltzing with Bears by Tom DeMarco and Timothy Lister.

    And in case it comes in handy for someone, here is another selection of books for managers.

    Please send all comments about typos, spelling errors and the like by private message. I do not promise to fix them, though, because in writing this article I receive no reward (apart from essay-writing experience) and risk nothing (apart from my account's rating). This would probably be the place to segue smoothly into motivation and self-identification, but, alas, the article ends here. And by the way, the fact that I took on no obligation to fix spelling errors does not mean I definitely won't fix them: write to me, and we'll sort it out.

    Thanks for your attention! :)
