Woe from Wit, or Why Excellent Students Write Incomprehensible Code

    Most of us had excellent grades in mathematics at school and at university. Remember how we solved problems? Say we need to take the derivative of the function:

    $$f(x) = \frac{\ln x}{x^{2}}$$


    We thought for a few seconds and wrote down the finished result:

    $$f'(x) = \frac{1 - 2\ln x}{x^{3}}$$


    Weaker students wrote out the solution step by step and spent considerably more time:

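    $$f'(x) = \left(\frac{\ln x}{x^{2}}\right)' = \frac{(\ln x)'\,x^{2} - \ln x\,(x^{2})'}{(x^{2})^{2}} = \frac{\frac{1}{x}\,x^{2} - \ln x \cdot 2x}{x^{4}} = \frac{x - \ln x \cdot 2x}{x^{4}} = \frac{x\,(1 - 2\ln x)}{x^{4}} = \frac{1 - 2\ln x}{x^{3}}$$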


    We excellent students have no use for all this. Why write out so many unnecessary intermediate steps when you can give the answer right away? After all, we want to get this problem out of the way quickly and move on to the next one!

    Why can we do this, while others cannot?

    Mathematics and Short-Term Memory


    Well, obviously we can perform all the necessary operations in our heads, so we do not need to write the intermediate results on paper. But why is this possible?

    In cognitive psychology, the part of memory in which mental calculations are actually performed is called short-term or working memory. The details are complicated and contested, but whatever you call it, the capacity of this memory is very limited. Various researchers cite seven and five as the "magic numbers". These are average values. The capacity of working memory depends on the circumstances and correlates with intellectual ability.

    It turns out that the ability to skip writing down intermediate results comes from the capacity of our working (or short-term) memory. We techies have a larger working memory for "technical" items than humanities-minded people do. The more memory we have, the more intermediate steps we can accumulate in it without writing anything down.

    Programming and Short-Term Memory


    Now let's try to imagine how our superpowers work when we program.

    To solve a given task, you need to write a certain amount of code. But we are in a hurry, and we are able to keep a lot of details in our heads. See where I'm going with this? We do not write down the intermediate steps. Instead, we write algorithms that take the source data and immediately produce the finished result. These algorithms are long, and they do a lot of work: as much as fit into our own working memory at the time we wrote them.
    Hypothesis. The smarter the programmer (the larger their working memory), the longer the methods and functions they write.
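
    As a rough illustration of the contrast described above, here is a minimal Ruby sketch. The task (average discounted price of in-stock items), the data, and all method names are hypothetical and were invented for this example; they are not taken from any project mentioned in the article. The first method computes everything in one expression, the way we wrote down a derivative in one line; the second records each intermediate step as a small named method.

```ruby
# Hypothetical task: average discounted price of the items that are in stock.
# Both versions assume at least one item is in stock.

# "Excellent student" style: the whole solution in a single expression.
def average_discounted_price_dense(items, discount)
  items.select { |i| i[:stock] > 0 }.map { |i| i[:price] * (1 - discount) }.sum / items.count { |i| i[:stock] > 0 }
end

# Step-by-step style: every intermediate step gets a name.
def in_stock(items)
  items.select { |item| item[:stock] > 0 }
end

def discounted(price, discount)
  price * (1 - discount)
end

def average(values)
  values.sum / values.size
end

def average_discounted_price(items, discount)
  prices = in_stock(items).map { |item| discounted(item[:price], discount) }
  average(prices)
end

items = [
  { price: 100.0, stock: 2 },
  { price: 50.0,  stock: 0 },
  { price: 200.0, stock: 1 }
]

puts average_discounted_price_dense(items, 0.5) # prints 75.0
puts average_discounted_price(items, 0.5)       # prints 75.0
```

    Both versions return the same number. The difference is how many things a reader has to hold in mind at once: the second version can be read and checked one small method at a time.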

    Let's try to estimate how much working memory programming requires. Since we do not know exactly how thinking works or what kind of "objects" it operates on, we will simply count the independent objects that appear in the program code.

    As a test subject, I took a 150-line Ruby function from the BigBlueButton project. I admit up front that this code struck a nerve. I had to spend a couple of days and almost completely rewrite several methods just to make cosmetic changes to the functionality of a small part of the project. Not a single long sequence of lines could be reused.

    This code illustrates the hypothesis perfectly. It was written just the way we used to solve problems in class. The mass of intermediate steps was held directly in the head, and only the final solution made it onto paper: 150 lines that solve the whole damn task in one fell swoop. This code was clearly written by a very talented guy!

    We do not do this out of malice. A programmer's work is built on keeping a huge number of objects and the connections between them in mind. What is wrong with packing more of those objects into a single method, so that we can be done with them sooner and move on to the next large set of objects? Solving a problem consistently, "little by little", "step by step": that is for C students, right?

    So, I roughly counted the independent objects that appear in the code, that is, the objects I had to hold in my head more or less simultaneously in order to understand what was going on in this whole wall of code from top to bottom. Here is what I got:

    • 4 function arguments that are mentioned a total of 15 times;
    • 42 internal variables used 131 more times;
    • 52 accesses to 24 different elements of hashes passed to the function as arguments (to work with the code, you need to keep the internal structure of all these hashes in your head);
    • 9 references to 8 different external entities (constants and class methods).

    In total, counting conservatively, that gives 4 + 42 + 24 + 8 = 78 independent objects. And I have not yet counted the operations performed on those objects, which also occupy some share of working memory.

    78 objects against the "magic" seven - isn’t it too much for one function?
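
    The tally above can be reproduced with a few lines of Ruby. This is only a sketch of the arithmetic: the hash keys are labels made up for this example, the numbers are the ones listed above, and 7 is used as the "magic number" purely for comparison.

```ruby
# Counts from the list above: [distinct objects, total mentions in the code].
counts = {
  function_arguments: [4, 15],
  internal_variables: [42, 131],
  hash_elements:      [24, 52],
  external_entities:  [8, 9]
}

MAGIC_NUMBER = 7 # the classic working-memory estimate, used here only as a yardstick

distinct = counts.values.sum { |objects, _| objects }
mentions = counts.values.sum { |_, count| count }
ratio    = distinct / MAGIC_NUMBER.to_f

puts "distinct objects: #{distinct}"   # prints "distinct objects: 78"
puts "total mentions:   #{mentions}"   # prints "total mentions:   207"
puts "times over the magic number: #{format('%.1f', ratio)}"
```

    By this crude measure the function asks the reader to track roughly eleven times more objects than working memory is usually credited with holding.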

    Of course, here you can endlessly argue that since the code is written and works, then 78 objects are not a problem at all. This is not the longest method, is it? So there can be even more objects? In addition, who said that all 78 must be held strictly at the same time?

    The problem with a long method is not that it is impossible to understand. The problem is that it is hard to understand, because it takes effort to "warm up our cache". It takes time and effort to load into working memory the whole set of objects that make up the code fragment under study and to keep them there for a long time. The more objects there are, the more working memory you need, but its capacity differs from person to person, and for the same developer it depends on fatigue and mood...

    Metrics and Short-Term Memory


    Various metrics are used to evaluate code quality. For example, Maurice Halstead was studying numerical code metrics back in the 70s (!). It seems clear that measuring and evaluating code quality is a good thing. And no doubt, the more complex the code, the more mental effort it requires. But there is one catch with metrics. In the words of EvgeniyRyzhkov, "the main thing was never figured out about metrics: what to do with them."

    Programming is a complex intellectual process, and many factors influence its course. The most significant factor, in my opinion, is the properties and limitations of the main working tool for creating code: the programmer's intellect. The "internal structure" of the intellect is studied by cognitive psychology. Thanks to it, we reliably know that the capabilities of the intellect are limited, and the size of these limits has even been measured. And since the capabilities of the tool are limited, its "product" will have, and must have, certain specific properties. "Will", because code is a product of the work of the brain-tool. "Must", because code is also raw material for that same tool.

    Code as raw material must meet certain criteria so that the brain-tool does not break while processing it.

    Since the science that studies the properties of the intellect is called cognitive, a metric that correlates with the limitations of that intellect can also be called cognitive. I will call it, say, cognitive weight, since the name "cognitive complexity" is already taken. Incidentally, Halstead in his Elements of Software Science describes a metric that cognitive weight closely resembles, although Halstead does not appeal to the concept of "cognitive". (For reference, R. Solso's Cognitive Psychology was first published in 1979, and M. Halstead's Elements of Software Science in 1977.)

    So, here is my answer to the question about metrics.
    Practically useful code quality metrics should be constructed so that they show which code the brain-tool can handle easily and which code will make it "break". And they should do so not "intuitively", but on the basis of data obtained from cognitive science.
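
    To make the idea of cognitive weight a little more concrete, here is a minimal sketch that counts the distinct names in a piece of Ruby source using Ripper, the lexer from Ruby's standard library. Which token types count as "objects" is my assumption for this example, and the method names and the sample source are hypothetical. This is a crude proxy, not a validated metric and not Halstead's definition: it counts method names together with variables and ignores hash keys and operations.

```ruby
require 'ripper'

# Token types treated as "objects" the reader must keep in mind.
# This selection is an assumption made for the sketch.
OBJECT_TOKENS = %i[on_ident on_const on_ivar on_gvar on_cvar].freeze

# Returns the distinct names (identifiers, constants, variables) in the source.
def distinct_names(source)
  Ripper.lex(source)
        .select { |_pos, type, _token| OBJECT_TOKENS.include?(type) }
        .map { |_pos, _type, token| token }
        .uniq
end

# A rough "cognitive weight": the number of distinct names in the source.
def cognitive_weight(source)
  distinct_names(source).size
end

# Hypothetical sample; it is only lexed, never executed.
sample = <<~RUBY
  def total(price, qty)
    subtotal = price * qty
    subtotal + subtotal * TAX
  end
RUBY

puts distinct_names(sample).inspect
puts cognitive_weight(sample)
```

    For the sample the names are total, price, qty, subtotal, and TAX, so its weight is 5, comfortably under the "magic" seven. Running the same count over a 150-line method would show how far it is from that limit.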

    Summary


    Writing out the solution "step by step" is needed not only so that a C student can solve the problem. It is needed so that reading and verifying the solution takes so little mental effort that even a C student could do it. An excellent student will understand a solution without intermediate steps too, but if the problems are complex and you need to read thousands of them... Well, you get the idea.

    Next time I’ll talk about the techniques that I use to reduce the complexity of my code.

    P.S. One of my own monster methods, which I never get around to refactoring, is 120 lines long. I do not even want to compute its cognitive weight. I am ashamed.



    UPD

    Judging by the comments, I did not clearly convey what I meant by the terms "excellent student" and "C student". This pair of terms is just a literary device: metaphor and hyperbole. I needed short names for the concepts "a person with above-average abilities in the exact sciences" and "a person with less developed abilities in the exact sciences". Nowhere in the text, as far as I can tell, do I praise the former or disparage the latter. On the contrary, one of the ideas built into the text (neither new nor mine) is this: if "excellent students" use their excellent abilities thoughtlessly when coding, nothing good will come of it.

    School grades have never been an indicator of anything significant to me. I myself had extremely average grades in every subject except mathematics. I always tried to instill in my daughter the idea that grades are not the main thing in life. What matters is having something in your head: knowledge, understanding, an interest in something. She did not listen to me, though, and graduated from school with a gold medal.



    UPD2

    No, I meant precisely those excellent students who think and understand rather than cram. In academic subjects, our abilities let us reach correct solutions very quickly by skipping the irrelevant intermediate steps. And in real technical problems, solving quickly is, as a rule, a positive factor.

    But in programming, the intermediate steps suddenly turn out to be the most important part. And we are genuinely puzzled by the attacks on our beautiful 300-line functions. We tried so hard, we did the best we could! It is just that no one explained that programming calls for exactly the opposite: you need to write stupid code, and fast, clever solutions always have harmful consequences.

    At university I was taught only about Dijkstra's structured programming; I learned about the single responsibility principle much later. What is surprising is that I used to talk about method length by repeating someone's advice about fitting on one screen, without understanding the real reason why methods should be short.
