The Boeing 737 Max crashes through the eyes of a software developer

Original author: Gregory Travis
I present to you a translation of the article "How the Boeing 737 Max Disaster Looks to a Software Developer" by Greg Travis. It is about how Boeing's desire to save money and cut corners for commercial gain, together with a culture of "incompetence and unethical behavior" in the development community, led to the deaths of 346 people. I do not share all of the author's views (in particular, I believe the human factor is a far greater evil than software), but it is hard to disagree with his main arguments.

What follows is a long read. If you are too lazy to read it all but still want to get acquainted with the topic, there is an earlier, shorter version of this article on Habr, translated by Vyacheslav Golovanov; it can be found here.

The views expressed in this article are solely those of the author and do not represent the position of IEEE Spectrum or the IEEE (translator's note: nor of the translator :) ).
I am a pilot with 30 years of experience and a software developer with 40 years of experience. I have written a lot about aviation and about software development. The time has come to write about both at once.

The news is now full of headlines about the crashes of the new Boeing 737 Max, which happened one after another with newly delivered aircraft. For an industry whose existence relies entirely on passengers' sense of complete control and safety, these two crashes pose a genuine existential threat. And although the number of deaths in plane crashes has fallen over the past decades, that achievement is no reason for complacency.

Image: WestJet Boeing 737 MAX 8, acefitt, Creative Commons Attribution 2.0 Generic

The original Boeing 737 first appeared in 1967, when I was three years old. It was a small plane with small engines and relatively simple control systems.

Airlines (Southwest in the United States is a prime example) loved it for its simplicity, reliability and flexibility. In addition, it needed only two crew members in the cockpit instead of the then-usual three or four, which let airlines save significantly. As the air travel market grew and new technologies arrived, the 737 grew rapidly in size, and the complexity of its electronics and mechanics grew with it. Of course, it was not only the 737 that grew. Airliners require enormous investments from both the aircraft manufacturers and the airlines that buy them, so both kept making them bigger.

Most of these market and technological forces, however, served the economic interests of the companies rather than the interests of passenger safety. Engineers worked tirelessly to reduce what the industry calls the "cost per passenger-kilometer", that is, the cost of carrying one passenger from point A to point B.

A lot of this optimization story has to do with engines. Carnot's theorem on the efficiency of heat engines says that the bigger and hotter you make an engine, the more efficient it becomes. This principle holds equally for a chainsaw engine and for a jet engine.

It's that simple. The easiest and quickest way to make an engine burn less fuel per unit of power is to make it bigger. That is why the Lycoming O-360 engine in my little Cessna has pistons the size of dinner plates, why marine diesel engines are built the size of a three-story house, and why Boeing wanted to hang the huge CFM International LEAP engines on the new 737.

There is only one small problem: the original 737 was equipped with very small engines by today's standards, which made it easy to place them under the wings. However, as the 737 increased in size, its engines also grew, and the clearance between them and the ground became smaller and smaller.

To get around this problem, many tricks were invented (or "hacks," as software developers would call them). The most noticeable and obvious one to the public, for example, was changing the shape of the engine intakes from round to oval to gain a bit more room under the engine.

In the case of the 737 Max, the situation became critical. The fan diameter of the engines on the original 737 was about 100 cm (40 inches); on the new engines for the 737 Max it grew to 176 cm. With the engine centerline shifted by more than 30 cm, you can no longer flatten the intake enough to keep the engine from scraping the ground.

So it was decided to mount the engine higher and to shift it forward and up relative to the wing. This, in turn, shifted the engine's thrust line. Now, when engine power increases, the aircraft tends to pitch up, that is, to raise its nose.

For reference: an aircraft's angle of attack is the angle between the oncoming airflow and the wing. Imagine sticking your hand out the open window of a car moving down the highway. If you hold your palm nearly parallel to the ground, that is a small angle of attack; tilt your palm relative to the ground and you increase the angle of attack. When the angle of attack becomes too large (supercritical), the airflow separates from the wing and an aerodynamic stall occurs. You can verify this yourself with your hand out the window of a moving car: slowly tilting your hand, you will feel the lift pushing it up grow stronger and stronger until your hand suddenly drops. That is flow separation followed by a stall.
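To make the hand-out-the-window analogy a bit more concrete, here is a deliberately simplified toy model of lift versus angle of attack (my own illustration, not real aerodynamic data; the slope and the critical angle are assumed numbers): lift grows roughly linearly with angle of attack up to a critical angle, then collapses when the flow separates.

```python
# Toy model of lift vs. angle of attack (illustration only, not real aerodynamics).
# Assumed numbers: lift-curve slope ~0.1 per degree, stall at ~15 degrees.

def lift_coefficient(aoa_deg: float) -> float:
    """Rough sketch: lift grows linearly up to a critical angle, then drops sharply."""
    CRITICAL_AOA = 15.0   # assumed critical angle of attack, degrees
    SLOPE = 0.1           # assumed lift-curve slope per degree
    if aoa_deg <= CRITICAL_AOA:
        return SLOPE * aoa_deg
    # Past the critical angle the flow separates and lift collapses.
    return max(0.0, SLOPE * CRITICAL_AOA - 0.3 * (aoa_deg - CRITICAL_AOA))

if __name__ == "__main__":
    for aoa in (0, 5, 10, 15, 18, 20):
        print(f"AoA {aoa:>2} deg -> CL ~ {lift_coefficient(aoa):.2f}")
```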

So it turns out that this tendency to pitch up with increasing engine power means, in practice, a risk of pushing the aircraft toward a stall if the pilots "punch it" (as my son likes to say). That becomes especially likely at low airspeed.

Worse, because the engine nacelles are so large and mounted so far forward, they themselves create lift, especially at high angles of attack. In other words, the nacelles made an already bad situation even worse.

Let me emphasize: on the 737 Max, at high angles of attack the engine nacelles themselves act like wings and produce lift. Moreover, that lift is applied far forward of the wing's center of lift, which means that as the angle of attack increases, the 737 Max tends to increase its angle of attack even further. And that is about the worst sin you can commit in aerodynamics.

A change in pitch with a change in engine power is, by itself, a fairly common thing in aircraft handling. Even my little Cessna raises her nose a bit when I add power. Pilots are taught about such quirks during training and learn to handle them. But there are safe limits, set by regulators, that pilots themselves are willing to live with.

It is quite another thing when pitch changes with increasing angle of attack. An airplane already approaching an aerodynamic stall should under no circumstances have a tendency to push itself further into it. This property is called dynamic instability, and the only class of aircraft in which it is tolerated, fighters, comes equipped with ejection seats.

Everyone in aviation wants an airplane that flies as naturally and easily as possible. Changing engine power, lowering the flaps or extending the landing gear should not noticeably change the flight attitude: the aircraft should not roll or pitch unexpectedly; its behavior should stay predictable.

The airframe (the hardware itself) should behave as predictably as possible from the start, without needing extra bells and whistles. This aviation canon goes back to the Wright brothers' first flights at Kitty Hawk.

Clearly, the new Boeing 737 Max pitches its nose up too much with increased thrust, especially at already high angles of attack. It violated that oldest law of aviation, and quite possibly the certification criteria of the FAA (Federal Aviation Administration) as well. But instead of going back to the drawing board and fixing the airframe, Boeing decided to rely on something called the "Maneuvering Characteristics Augmentation System" (MCAS).

Boeing solved a hardware problem with software.

I will leave the discussion of the rise of corporate language in the aviation lexicon for another article (translator's note: apparently the author means that the name "Maneuvering Characteristics Augmentation System" says nothing about what the system actually does, which is unusual for aviation terminology), but let's note that the system could have been called something else, for example "A cheap way to prevent a stall when the pilots punch it" (CWTPASWTPPI). On second thought, it is probably better to stick with MCAS.

Of course, MCAS is a far cheaper alternative to extensively reworking the airframe to accommodate the new, larger engines. Such a redesign could have required, for example, lengthening the nose landing gear (which might then no longer fit into the fuselage when retracted), increasing the upward angle of the wings, or other similar changes. That would have been monstrously expensive.

All development and production of the 737 Max took place under the banner of the myth that "this is the same good old 737." Admit that it is not, and recertification would take years and cost millions of dollars.

"In fact, pilots licensed to fly the Boeing 737 in 1967 can control all subsequent versions of the 737."
From a review of an earlier version of an article from one of the 737 pilots at one of the largest airlines.


Worse, such major changes would have required not just FAA recertification but effectively the development of an entirely new Boeing airframe. Now we are talking about really big money, for both the manufacturer and the airlines.

And all because Boeing's main selling point for the 737 Max was that it was the same 737, and any pilot who had flown earlier models could fly the Max as well, without expensive retraining, a new certificate or a new type rating. Airlines, and Southwest is a prime example, prefer to operate a fleet of a single "standard" aircraft type. They want one airplane model that any of their pilots can fly, because then both pilots and airplanes become interchangeable, maximizing flexibility and minimizing costs.

One way or another, it all comes down to money, and MCAS became another way for Boeing and its customers to keep money flowing in the right direction. Insisting that the 737 Max flew no differently from previous 737s was the key to fleet interchangeability, and it was probably the reason the very existence of MCAS was kept out of the documentation.

If this change had become too noticeable, if it had appeared in the flight manual, say, or been given special attention in pilot training, then someone, probably one of the pilots, would have stood up and said: "Hey, this doesn't quite look like a 737 anymore." And the money would have flowed in the wrong direction.

As I explained earlier, you can experiment with angle of attack yourself simply by putting your hand out the window of a moving car and tilting your palm. A machine as complex as an airplane also has a mechanical equivalent of the hand out the window: the angle of attack sensor.

You may notice them when boarding a plane. As a rule there are two, one on each side of the aircraft, usually right below the cockpit windows. Do not confuse them with pitot tubes (more about those below). The angle of attack sensor looks like a weather vane, while a pitot tube looks like... well, a tube. The angle of attack sensor looks like a weather vane precisely because it is one. Its small vane moves in response to changes in the angle of attack.

Pitot tubes measure how hard the airstream "presses" on the aircraft, while the angle of attack sensor determines which direction that stream is coming from. Because pitot tubes essentially measure pressure, they are used to determine the aircraft's speed relative to the air. The angle of attack sensor determines the aircraft's orientation relative to the airflow.

There are two sets of angle of attack sensors and two sets of pitot tubes, one on each side of the fuselage. Typically, the instruments on the captain's side take their readings from the sensors on that side of the aircraft; likewise, the co-pilot's instruments show values from the sensors on his side. This creates natural redundancy and allows either pilot to cross-check quickly and easily. If the co-pilot thinks his airspeed indicator looks odd, he can compare it against the corresponding instrument on the captain's side. If the readings diverge, the pilots work out which instrument is telling the truth and which one is lying.
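In software terms, this kind of cross-check is trivial. Here is a minimal sketch of the idea (my own illustration; the names, the tolerance value and the interface are assumptions, not anything from Boeing's actual avionics):

```python
# Sketch of a pilot-style cross-check between two redundant sensors.
# The 5-degree disagreement threshold is an arbitrary assumed value.

DISAGREEMENT_LIMIT_DEG = 5.0

def cross_check(captain_aoa_deg: float, copilot_aoa_deg: float) -> bool:
    """Return True if the two angle of attack readings roughly agree."""
    return abs(captain_aoa_deg - copilot_aoa_deg) <= DISAGREEMENT_LIMIT_DEG

# Usage: if the sensors disagree, the right move is to warn the crew and
# stop trusting either value blindly, not to act on one of them.
if not cross_check(22.0, 4.5):
    print("AOA DISAGREE: sensors conflict, do not act automatically")
```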

There is an old joke that when airplanes can finally fly themselves, the cockpit will still hold a pilot and a dog. The pilot is there so the passengers feel better knowing someone is up front. The dog is there to bite the pilot if he tries to touch anything.

In the 737, Boeing duplicated not only the instruments and sensors but also the flight computer, installing one on the captain's side and one on the co-pilot's side. A flight computer does many useful things, but its main jobs are to fly the plane when told to (autopilot) and to make sure the pilot does not make mistakes when flying by hand. The latter is called "flight envelope protection."

But let's call a spade a spade: this is the biting dog from the joke.

What does MCAS do? It is supposed to lower the nose of the aircraft if it decides the angle of attack has gone beyond acceptable limits, in order to avoid an aerodynamic stall. Boeing put MCAS in the 737 Max because the larger engines and their new placement made a stall more likely than in previous generations of the aircraft.

When MCAS decides that the angle of attack has become too large, it commands the aircraft's trim system (the system that pitches the aircraft up or down) to point the nose down. It also does something else: indirectly, through what Boeing calls the "Elevator Feel Computer" (EFC), it pushes the pilots' control columns (the yokes that pilots push or pull to lower or raise the nose) forward.
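To make the problem concrete, here is a deliberately naive sketch of the kind of single-sensor logic being described (my own illustration with assumed names and thresholds; it is not Boeing's actual code, whose details are not public):

```python
# Naive, single-source stall-protection logic of the kind described above
# (illustrative sketch only; the angle limit is an assumed value).

AOA_LIMIT_DEG = 14.0       # assumed "too high" angle of attack
NOSE_DOWN_TRIM_STEP = 0.8  # per-activation trim increment, the figure from the
                           # translator's note further below; treat as illustrative

def mcas_like_step(active_side_aoa_deg: float, current_trim_deg: float) -> float:
    """One control cycle: trusts a single sensor and trims nose-down if it reads high."""
    if active_side_aoa_deg > AOA_LIMIT_DEG:
        # No cross-check against the other side's sensor, no sanity check
        # against airspeed or attitude: one bad reading is enough to act on.
        return current_trim_deg - NOSE_DOWN_TRIM_STEP
    return current_trim_deg
```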

In the 737 Max, as in most modern airliners and even cars, a computer monitors everything, or even controls it directly. In many cases there is no longer a direct mechanical link (cables, pushrods, hydraulic lines) between the pilot's controls and the actual aerodynamic surfaces, the elevator, the rudder and everything else that makes the plane fly. And where such a mechanical link still exists, the computer decides what the pilot is allowed to do with it (again, that same biting dog).

It is important, though, that pilots get physical feedback about what is happening. In the good old days, when cables connected the pilots' controls to the tail surfaces, you had to pull hard on the yoke if the plane was nosing down, and push hard if it was climbing. With a computer supervising everything, those natural control forces disappear. The 737 Max no longer has a "natural feel."

Yes, the 737 does have redundant hydraulic systems linking the controls the pilot handles to the ailerons and the other control surfaces. But those hydraulic systems are so powerful that they transmit no direct feedback from the aerodynamic forces acting on the surfaces. Pilots feel only what the computer lets them feel. And sometimes what it lets them feel is not pleasant at all.

When the flight computer pushes the nose down because MCAS has decided the plane is about to stall, a chain of motors and actuators drives the control columns in the cockpit forward. And it turns out the computer can push on those columns so hard that the pilots, pulling back to tell the computer it is doing something completely, utterly wrong, quickly become exhausted.

In fact, not letting the pilot override the system simply by pulling back on the yoke was a deliberate design decision on the 737 Max. After all, if pilots could pull on the control column and raise the nose while MCAS says it should be pointing down, what would be the point of the system?

Although MCAS is part of the flight computer, it intervenes even when the autopilot is off and the pilots believe they are flying the aircraft themselves. In the struggle between the pilots and the flight computer over who is in charge in the cockpit, the computer wore the humans down to death, literally.

Finally, the very existence of MCAS had to be hidden, so that no one would say, "Hey, this is no longer the old 737," and the right bank accounts would not be affected.

A flight computer is just a computer. Inside it there are no aluminum parts, no cables, no fuel lines, none of the usual trappings of aviation. It is filled with lines of code. And this is where things get dangerous.

Those lines of code are, of course, written by people working under managers. Neither the programmers nor their managers are as steeped in the culture and customs of aviation as the people on the factory floor riveting wings, machining control brackets and fitting landing gear into fuselages. Those people carry a shared industry memory of what has worked in aviation and what has gone wrong. Software developers do not.

In the 737 Max, only one of the two flight computers is active at a time, either the one on the captain's side or the one on the co-pilot's side. And the active computer receives data only from the sensors on its own side of the aircraft.

When a human pilot notices that readings diverge, he scans the panel, weighs the readings of the other instruments and works out what is wrong. The computer installed in the Boeing does not "scan the other instruments." It trusts only the instruments on its own side. It does not do things the old-fashioned way. It is ultramodern. It is software.

This means that even if one particular angle of attack sensor fails, and that happens to devices subjected to swings between environmental extremes and to constant vibration and shaking, the flight computer will simply believe it.

It gets worse. There are several other instruments that determine the angle of attack directly or indirectly: the pitot tubes, the artificial horizon and so on. A pilot would check all of them to quickly diagnose a faulty angle of attack sensor.

In an extreme case, the pilot can always look out the window and see for himself that no, the nose is not pitched dangerously up. That is the ultimate check, and it must remain the pilot's exclusive and absolute prerogative. Unfortunately, the current version of MCAS takes that away. It strips pilots of the ability to respond to what they see with their own eyes.

Like a person with narcissistic personality disorder, MCAS overrides the pilots' decisions. And in the end that turned out badly for everyone.


- HAL, raise the nose.
- I'm sorry, Dave. I'm afraid I can't do that.


The flight computer running MCAS remains blind to any evidence that it is wrong, including what the pilot sees with his own eyes, and while he desperately pulls back on the servo-driven control column trying to level the plane, the computer "bites" the pilot and his passengers to death.

In the old days, the FAA had an army of aviation engineers. They worked shoulder to shoulder with aircraft manufacturers to make sure the aircraft was safe and ready for certification.

As aircraft grew more complex, the gap between what the FAA could pay its engineers and what manufacturers could pay kept widening. More and more engineers moved from the public sector to the private one. Before long, the FAA was no longer in a position to judge for itself how safe a particular aircraft model was, or whether it should be built at all.

So the FAA suggested to the aircraft makers: "What if your own people tell us how safe the design is?" The manufacturers replied: "Sounds good." And the FAA added: "And say hi to Joe for us, we miss him."

Thus was born the concept of the "Designated Engineering Representative" (DER). DERs are people on the payroll of the airframe makers, engine makers and software developers who certify to the FAA that everything is safe and sound.

This looks like an obvious conflict of interest, but it is not quite that simple: after all, nobody wants planes to crash. The airline industry depends entirely on public confidence, and every crash is an existential threat to it. No manufacturer hires a DER just to rubber-stamp paperwork. On the other hand, at the end of a long day, someone might just take the word of the guys in the software department that "everything is fine over there."

It is striking that none of the developers of the MCAS software for the 737 Max raised the question of using more than one data source to detect an impending stall, including the angle of attack sensor on the other side of the aircraft. As a lifelong member of the software development fraternity, I cannot fathom what explosive mixture of incompetence, arrogance and ignorance of aviation culture could lead to such a mistake.

Translator's note: in the comments to the article, one reader pointed out that MCAS did not simply apply a single 0.8-degree nose-down correction when it first detected an impending stall, as one might expect, but kept applying it in a loop, on every new measurement, all the way to the trim's limit.
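A minimal sketch of the difference between the two behaviors (my own illustration; the trim limit is an assumed value, and the 0.8-degree step is the figure from the note above):

```python
# One-shot correction vs. re-triggering on every measurement cycle
# (illustrative sketch; the trim limit is an assumed value).

TRIM_STEP_DEG = 0.8    # nose-down trim applied per activation
TRIM_LIMIT_DEG = -4.0  # assumed mechanical limit of nose-down trim

def run_to_the_stop(aoa_readings_deg, aoa_limit_deg=14.0):
    """If a (possibly faulty) sensor keeps reading high, the trim keeps accumulating."""
    trim = 0.0
    for reading in aoa_readings_deg:
        if reading > aoa_limit_deg:
            trim = max(TRIM_LIMIT_DEG, trim - TRIM_STEP_DEG)
    return trim

# A stuck sensor that reads 22 degrees forever drives the trim to its stop.
print(run_to_the_stop([22.0] * 20))  # -> -4.0
```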


But I do know that it points to a much deeper problem in the industry. The people who wrote the code for the original MCAS system were clearly nowhere near the level of professional maturity the task demanded, and were not even aware of it. How can they now be trusted to fix this software, and how can we trust the reliability and safety of the rest of the flight control software?

So Boeing produced an aerodynamically unstable airframe, the 737 Max. Big mistake number one. It then tried to mask the new 737's dynamic instability with software. Big mistake number two. Finally, the software relied on readings from systems notorious for failing (the angle of attack sensors) and had not even a primitive cross-check, not against other types of instruments, not even against the second set of the same sensors. Big mistake number three.

None of these should have survived proper review. Any one of them should have failed to get an "OK" not just from a DER, but from the most junior engineer on the team.

This is not just a big problem. This is a political, social, economic and technical sin.

As it happens, in the interval between the first and second 737 Max crashes I was having a new digital autopilot installed in my own aircraft, a 1979 Cessna 172, the most-produced aircraft in history. The 172 was first certified more than a decade before the first Boeing 737 (1955 versus 1967).

My new autopilot consists of several state-of-the-art components, including redundant flight computers (two Garmin G5s) and a communications bus (CAN, Controller Area Network) that lets the system's components talk to each other regardless of where they sit in the airframe. The CAN bus was developed in the automotive industry for drive-by-wire systems, but in purpose and implementation it is similar to the ARINC buses that link the components in the 737 Max.

My autopilot also includes electric trim. That means it can make the same kinds of adjustments to my 172's flight configuration as the MCAS-equipped flight computers make in the 737 Max. I remember telling a friend, during the installation and after the first 737 Max crash, that I was probably adding a potential source of danger similar to the one that brought down the Lion Air flight.

Finally, my new autopilot also has envelope protection (that same flight envelope protection), where the "envelope" is the boundary of the aircraft's operating limits. Even when the autopilot is not flying my Cessna, the system keeps monitoring the aircraft's state to make sure I do not put it into a spin, overspeed it with the landing gear down, or do any number of other things. Yes, it has a "biting dog" mode too.

As you can see, the similarities between my $20,000 autopilot and the multimillion-dollar autopilot in every 737 are direct, tangible and relevant. So what are the differences?

For starters, installing the new autopilot required a new certificate (a "Supplemental Type Certificate," STC). That is, both the autopilot manufacturer and the FAA agree that my 1979 Cessna 172 with a Garmin autopilot installed differs so significantly from the plane that once rolled off the assembly line that it is no longer the same Cessna 172. It is a different aircraft.

Besides the fact that my aircraft now carries a new (supplemental) type certificate (and went through a new certification process), we had to obtain, review and amend a pile of documentation for it, including the aircraft's operating handbook. As you would expect, most of those additions concern the autopilot.

It is worth noting in particular that this documentation, which anyone who wants to fly this plane must be familiar with, explains in detail how the autopilot works, how it drives the trim, and how its flight envelope protection behaves.

It also explains in detail how to tell that the system is misbehaving and how to shut it off quickly. The instruction to pull the autopilot's circuit breaker to disable it is repeated again and again on almost every page of the new documentation. Any pilot who wants to fly my 172 immediately understands that it is different from every other 172.

There is a huge difference between what pilots are told before they climb into that Cessna for the first time and what they were told before climbing into a 737 Max.

Translator's note: one reader of the original article pointed out that although the Boeing 737 Max has two flight computers, you cannot switch from one to the other in flight. Moreover, MCAS can be disabled only by cutting power to the motor that drives the stabilizer trim, yet the pilots may then lack the physical strength to move the trim without powered assistance. And MCAS has already driven it to its fully nose-down position...


Another difference between my autopilot and the MCAS-equipped system in the 737 Max is that the devices on my CAN bus constantly talk to each other and cross-check, which MCAS apparently does not. For example, the autopilot constantly polls both G5 computers for the aircraft's attitude. If their data diverge, the system notifies the pilot and disengages, handing back manual control. It does not fly the plane into the ground because it has suddenly decided a stall is imminent.
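Here is a minimal sketch of that "disagree, warn, and disengage" behavior (my own illustration of the idea; the names, tolerance and interface are assumptions, not Garmin's actual implementation):

```python
# Sketch of "cross-check, and if the sources disagree, warn and disengage"
# (illustrative only; the 3-degree tolerance is an assumed value).

from dataclasses import dataclass

ATTITUDE_TOLERANCE_DEG = 3.0

@dataclass
class AutopilotState:
    engaged: bool = True

def compare_and_act(state: AutopilotState, pitch_a_deg: float, pitch_b_deg: float) -> str:
    """If the two attitude sources disagree, hand control back to the human."""
    if abs(pitch_a_deg - pitch_b_deg) > ATTITUDE_TOLERANCE_DEG:
        state.engaged = False
        return "ATTITUDE MISCOMPARE: autopilot disengaged, pilot has control"
    return "sources agree: autopilot remains engaged"

state = AutopilotState()
print(compare_and_act(state, pitch_a_deg=2.0, pitch_b_deg=9.5))
```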

And probably the biggest difference is the force a pilot must exert to override the autopilot, on my plane versus on the 737 Max. In my 172 there are still cables that connect the controls directly to the aerodynamic surfaces. The computer has to push on the same levers I do, and it is much weaker than I am. If the computer wrongly decides the plane is about to stall, I can easily overpower it.

In my Cessna, the human still always wins the fight with the autopilot. Boeing has always professed exactly the same philosophy in designing its aircraft, and even used it as a selling point against its arch-rival Airbus, which took the opposite approach. But with the 737 Max, Boeing, it seems, quietly decided to change the relationship between man and machine, and just as quietly changed the operating manual.

The whole 737 Max saga should teach us not only that complexity brings added risk and that technology has its limits, but also where our real priorities lie. Today the priority is money, not safety. Safety gets thought about only to the extent needed to keep the money flowing in the right direction. And the problem grows sharper every day, because devices depend more and more on the thing that is easiest to change: software.

Defects in hardware, whether badly placed engines or O-rings that turn brittle in the cold, are obviously hard to fix. And when I say "hard," I mean "expensive." Software defects, on the other hand, can be fixed quickly and cheaply. All it takes is pushing out an update, releasing a patch. What's more, we have trained customers to see all this as normal, from operating system updates on their computers to the patches installed on my Tesla automatically while I sleep.

In the 1990s I once wrote an article comparing the relative complexity of Intel's Pentium processors, measured in transistors per chip, with the complexity of the new Microsoft Windows operating system, measured in lines of source code. It turned out they were roughly comparable in complexity.

Around the same time it emerged that early Pentium processors suffered from a flaw known as the FDIV bug. Only a small fraction of Pentium users (translator's note: scientists and mathematicians) would ever have run into it. Comparable defects were found in Windows, and they likewise affected only a small share of its users.

The consequences for Intel and Microsoft, however, were fundamentally different. Windows was steadily patched with small software updates, while Intel had to recall all the defective processors in 1994. That cost the company $475 million, more than $800 million in today's money.

In my view, the relative ease and negligible material cost of updating software has bred a culture of laziness in the developer community. And because software increasingly controls the hardware, that culture of laziness is seeping into hardware engineering as well, aircraft design included. Less and less attention goes into getting the hardware design right and simple in the first place, because it is so easy to paper over a defect with software.

Every time a new update arrives for my Tesla, for the Garmin flight computer in my Cessna, for the Nest thermostat or for the TV in my house, I am reminded once again that none of these things left the factory truly finished. Their makers have realized they don't have to be. The work can always be completed later, with the next update.
"I am a network engineer and a former programmer who wrote avionics software. It always struck me as odd that we had to jump through hoops just to put a new motherboard into a certified computer, while the software required no certification at all (apart from blanket restrictions like 'cannot run under Windows' or 'must be written in C++'). True, that was about ten years ago; I hope things are different now."
- Anonymous, from personal correspondence
Boeing is currently rolling out an update to the 737 Max's flight computer and MCAS software. I don't know for certain, but I believe the update is mainly aimed at two things:

The first is to teach the software to cross-check its instruments the way pilots do. That is, if one angle of attack sensor starts reporting that the plane is about to stall and the other does not, the hope is that the system will no longer immediately point the nose at the ground, but will first warn the pilots that the sensor readings conflict.

The second is to abandon the "shoot first, ask questions later" strategy, that is, to start consulting multiple sources instead of just one.
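Putting the two together, the fixed logic presumably looks something like this sketch (my own guess at its shape, with assumed names and thresholds; Boeing's actual update is not public in source form):

```python
# Sketch of the presumed revised behaviour: cross-check both AoA sensors,
# and if they disagree, warn the crew instead of trimming nose-down
# (illustrative guess only; thresholds are assumed values).

AOA_LIMIT_DEG = 14.0
DISAGREEMENT_LIMIT_DEG = 5.0

def revised_step(left_aoa_deg: float, right_aoa_deg: float, current_trim_deg: float):
    if abs(left_aoa_deg - right_aoa_deg) > DISAGREEMENT_LIMIT_DEG:
        # Conflicting sensors: do not act automatically, tell the pilots.
        return current_trim_deg, "AOA DISAGREE: crew notified, no automatic trim"
    if min(left_aoa_deg, right_aoa_deg) > AOA_LIMIT_DEG:
        # Both sensors agree the angle is too high: a limited correction is plausible.
        return current_trim_deg - 0.8, "both sensors high: limited nose-down trim"
    return current_trim_deg, "normal flight"
```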

For the life of me, I cannot understand how these two basic principles of aviation, cornerstones of the thinking that has served the industry faithfully until now, could have been forgotten when MCAS was developed. I don't know, and don't understand, what broke so badly in the DER process that such a fundamental defect made it into the design.

I suspect the reason lies in roughly the same place as Boeing's desire to fit larger engines while avoiding the large costs that came with them: the desire for free cheese, which, as everyone knows, is found only in a mousetrap.

The case for designing systems as simply as possible is well made by Charles Perrow, a Yale sociologist and the author of the 1984 book "Normal Accidents: Living With High-Risk Technologies." The book's whole thesis is in its title. Perrow argues that failure is a normal outcome of operating any complex system whose components are tightly coupled, where the behavior of one component directly affects the behavior of another. Although individually such failures may look like technical malfunctions or broken processes, they should in fact be seen as inherent features of the system itself. They are the "expected" accidents.

Nowhere is this problem felt more acutely than in systems designed to improve safety. Each new change, each added complication, yields less and less benefit and ultimately does outright harm. Piling one fix on top of another in the name of safety ends up reducing it.

This is exactly what the old engineering design principle tells us: "Keep it simple, stupid" (KISS), and its aviation cousin: "Simplify, then add lightness."

One of the FAA's core certification principles, dating back to the Eisenhower era, was a kind of covenant of simplicity: an aircraft must not exhibit significant pitch changes with changes in engine power. That requirement was written when there was still a direct link between the pilot's controls in the cockpit and the aircraft's tail surfaces, and at the time it rightly forced simplicity onto the design of the airframe itself. Now a layer of software sits between the human and the machine, and nobody knows for sure what is really going on in there. Things have become too complex to understand.

I cannot get the parallels between the 737 Max disasters and the Space Shuttle Challenger out of my head. The Challenger accident happened because people followed the rules, not in spite of them: another illustration of "normal" accidents. The rules required a flight-readiness review before launch. Nothing in them said that, in making the decision, too much weight should not be given to the political fallout of postponing the launch. All the inputs were duly weighed according to the established process, the majority agreed to launch, and seven people died.

In the case of the 737 Max, everything was likewise done by the rules. The rules say that the aircraft's pitch must not change too much with changes in engine power, and that a Designated Engineering Representative (DER) is authorized to sign off on any changes intended to fix that. Nothing in the rules says the DER must not be guided by business considerations in making that decision. And now 346 people are dead.

It is highly likely that MCAS, designed to make flying safer, has already killed more people than it could ever have saved. It should not be fixed with yet more complexity, yet more software. It should simply be removed.

About the author


Greg Travis is a writer, software development manager, pilot and aircraft owner. In 1977, at the age of 13, he created Note, one of the first social media platforms. He has logged more than 2,000 flight hours and has flown everything from gliders to a Boeing 757 (in a full-motion simulator).
