ph_piter July 19, 2019 at 23:04

Self-documenting code is (usually) nonsense

Translation

Hello!

Ahead of today's translated publication, we should point out right away that this text is a follow-up to the recently discussed article “Stop Being Zealous with Comments in the Code.” We were so impressed by the discussion that unfolded there, with 189 comments as of July 19, 2019, that we decided to give the floor to another Medium author, Christopher Lane, who argues on points of principle against the theses of Brian Norlander, the author of the first article. Note that the original of this article was published a month after the previous one (June 16 versus May 16), yet it has collected almost half as much applause (706 claps against 1.5K at the time this translation was published). Let's see what happens on Habr...

The picture is taken from the site rawpixels.com, by the author Pexels

I carefully read the excellent article by Cindy Cheung on technical documentation and why developers should explain their own code better - and I must say that I completely agree with it.

I’ve been working in IT for a damn long time, and my experience tells me there is one particular self-deception in our business that developers and engineers simply cannot resist.

“My code is self-documenting” - a misconception

In theory, the code of a good engineer should be so clear and readable that it just doesn't need any documentation.

You know, this is nonsense ... as a rule.

Why is "self-documenting code" nonsense?

Let's say you write code as well as Hemingway wrote prose. Perhaps your super-duper code is clean and clear (to another developer). In the end, though, this code was written by a techie for a techie, and no matter how clean and concise it may seem, it is still not meant to be read by non-programmers, who might grumble: “What the hell does all this mean?!”

Why do I think self-documenting code is all nonsense? Let me explain in detail.

Reason 1: Programming is full of all kinds of tricks that are not self-documenting.

Simply because most people, including developers, are not machines. Yes, I will most likely be able to wade through your code, understand the names of your methods and classes, and even understand what exactly you do in each method.

But code is written FOR machines. They understand far better than we do what to do with it, and programming languages exist precisely so that we can describe it to them. People need to be addressed in a more human language, so that a person can understand what your software does.

There is a very big difference between “reading the code and seeing what is happening in it” and documentation. You can write out in code, in full detail, everything that it does, but does that make it “self-documenting”? I think everyone understands that it does not.

Consider the following simple snippet in C#. I read a file, get its contents, and then get the file's encoding using StreamReader.

var fileContents = "";
Encoding fileEncoding;
using (var reader = new StreamReader(filePath, Encoding.Default, true))
{
    reader.Peek();
    fileEncoding = reader.CurrentEncoding;
    fileContents = reader.ReadToEnd();
}

Setting aside possible ambiguities with StreamReader, this code is quite simple, right? Then, pray tell, what is this line doing?

reader.Peek();

It turns out the reader has to perform this action before it can report the file's encoding. Tell me, where is the self-documentation in that? Yet it takes only some 10 seconds to make the code much easier to understand.

reader.Peek(); // You have to peek into the file like this to get its encoding.

This is just one example, and a damn simple one. As your code gets more complex, small details like this begin to pop up everywhere and gradually clutter the once-clean code. It becomes increasingly difficult for whoever reads it to grasp what is happening.
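To make the reason behind that reader.Peek() comment a little more concrete, here is a rough JavaScript sketch of byte-order-mark (BOM) detection, the kind of check a reader has to run on the first bytes of a file before it can report an encoding. This is not StreamReader's actual implementation; the function name detectEncodingFromBom, its parameters, and the returned labels are all assumptions made up for illustration.

```javascript
// Hypothetical helper (not part of any library): guess a text encoding
// from the byte-order mark at the start of the data.
// `bytes` is an array-like of numbers 0..255; `fallback` is returned
// when no known BOM is present.
function detectEncodingFromBom(bytes, fallback = "default") {
  const startsWith = (prefix) =>
    prefix.length <= bytes.length && prefix.every((b, i) => bytes[i] === b);

  // UTF-32 LE must be checked before UTF-16 LE: both begin with FF FE.
  if (startsWith([0xff, 0xfe, 0x00, 0x00])) return "utf-32le";
  if (startsWith([0x00, 0x00, 0xfe, 0xff])) return "utf-32be";
  if (startsWith([0xef, 0xbb, 0xbf])) return "utf-8";
  if (startsWith([0xff, 0xfe])) return "utf-16le";
  if (startsWith([0xfe, 0xff])) return "utf-16be";

  // Nothing recognisable: the caller's default stays in force.
  return fallback;
}
```

The point of the sketch is only that the answer depends on bytes that have to be read first, which is exactly the kind of "why" a one-line comment can carry and the code alone cannot.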

Reason 2: Complexity is inherently not self-documenting.

If you have ever written BASH or BAT files, then you know that the actions described in such a file are performed sequentially. One task after another. The file resembles a short story that is read from the first to the last line.

However, if you are writing a program, and a web application in particular, there will be no such sequential story, apart from the code for the initial loading and configuration of all the web services.

The classes that make up a modern web application are not executed sequentially. In essence, they are a collection of web or API controllers that are invoked only when a client interacts with the web application. Each web or API controller may spawn threads that fork new processes, send messages to other services, and wait for responses in order to trigger webhook listeners based on the results. Almost none of this can be laid out in a “plot” format. From all of your “self-documenting code”, a novice or non-programmer will come away with no more than “I think I understand what is happening here.” Again, hardly anyone would dare to trust such “documentation”.

The more complex your application, the more likely it is that its classes, methods, and the entire framework do not run sequentially. If you assume that anyone who encounters such an application can easily work out from the code what is happening in it, you are stepping onto increasingly slippery ground.
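A tiny sketch may help show what "not executed sequentially" means in practice. The mini-router below is entirely hypothetical (the routes, payloads, and the createRouter helper are invented for illustration and do not come from any framework); it only demonstrates that the order in which handlers appear in the file says nothing about the order in which they run.

```javascript
// Hypothetical mini-router: stores handlers by route and records
// what each handled request produced, in the order requests arrived.
function createRouter() {
  const handlers = new Map();
  const log = [];
  return {
    on(route, handler) {
      handlers.set(route, handler);
    },
    handle(route, payload) {
      const handler = handlers.get(route);
      if (!handler) {
        log.push(`404 ${route}`);
        return;
      }
      log.push(handler(payload));
    },
    log,
  };
}

const router = createRouter();

// Source order: orders first, then profile, then the payment webhook.
router.on("/orders", (p) => `orders:${p.id}`);
router.on("/profile", (p) => `profile:${p.user}`);
router.on("/webhook/payment", (p) => `payment:${p.status}`);

// Runtime order is decided by whoever calls in, not by the file layout.
router.handle("/webhook/payment", { status: "paid" });
router.handle("/profile", { user: "ann" });
router.handle("/orders", { id: 7 });
```

Reading the three `on` calls top to bottom tells you what the application can do, but not what it did or in which order, and that missing "plot" is what a sequence diagram or a short written description has to supply.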

Reason 3: The syntax of programming languages, in principle, cannot be called readable.

Just take a look at this jQuery call to an API endpoint.

var myURL="https://some.url/path?access_token=my_token";
$.ajax({
    url: myURL+"&callback=?",
    data: "message="+someOtherData,
    type: 'POST',
    success: function (resp) {
        alert(resp);
    },
    error: function(e) {
        alert('Error: '+e);
    }  
});

Phew ...

No, I am not saying there is anything wrong with the syntax. All of it is perfectly acceptable for a jQuery call. But I stress that, seen through the eyes of a novice programmer or a non-programmer, this listing may well be no clearer than bytecode. It will make just as little sense.

Syntactically, programming languages are designed to work within the constraints the language itself imposes, and to offer handy shortcuts that keep code compact and easy to change as needed. A programming language is not conceived as a universally reliable means of communication that everyone understands. It is designed for professionals who know the language itself, its syntax, and its shortcuts.

For everyone else, the programming language is incomprehensible.

What to do?

There are a few techniques that will help non-specialists understand your code.

Stage 1: Try writing documentation

Sounds blasphemous, right? Write documentation?! You couldn’t come up with anything funnier!

Seriously, no one is asking you to write War and Peace. But try to describe the main actions, validation, and error handling in the technical documentation in a simple, step-by-step style.

The client calls the API endpoint /someurl/object/{id}
The API controller uses {id} (of type int) to find the object in the database.
If the lookup returns null, the API controller sends the client an HTTP 404 (Not Found) response and logs this as a warning.
If the returned object is NOT null, the API controller converts it to JSON and returns it to the caller with an HTTP 200 (OK) response.

This is hardly difficult to do, but by writing such documentation you will make someone's life easier. If a more selfish motivation appeals to you, consider this: if people constantly ask you for help and explanations, you can simply point them to the documentation instead of explaining the same thing over and over.
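For comparison, the documented steps above might correspond to code along the following lines. This is only a sketch under stated assumptions: getObjectById, findById, and logger are hypothetical names standing in for a real controller, database lookup, and logging facility, none of which the article specifies.

```javascript
// Hypothetical handler: `findById` and `logger` are injected stand-ins
// for a real database lookup and a real logger.
function getObjectById(id, { findById, logger }) {
  const object = findById(id);

  if (object === null) {
    // Lookup came back empty: log a warning and answer 404 (Not Found).
    logger.warn(`Object ${id} not found`);
    return { status: 404, body: null };
  }

  // Object found: serialise it to JSON and answer 200 (OK).
  return { status: 200, body: JSON.stringify(object) };
}

// Usage with an in-memory "database" and a logger that collects warnings.
const warnings = [];
const deps = {
  findById: (id) => (id === 1 ? { id: 1, name: "demo" } : null),
  logger: { warn: (msg) => warnings.push(msg) },
};
const found = getObjectById(1, deps);
const missing = getObjectById(2, deps);
```

The code and the prose say the same thing, but only the prose can be handed to a tester or a business analyst as it stands.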

Stage 2: Draw diagrams

If writing even simple documentation is still too hard for you, then at least try to draw the most essential diagrams, since they often serve as the “glue” that helps an outsider connect your code with what it actually does.
Check out websequencediagrams.com, where you can describe great sequence diagrams in plain text and then generate them.

Text

title Service one to service two
Service two requester -> Service two http client: Get Contract
Service two http client -> Service one REST API: GET /api/contracts/{123}
Service one REST API -> Service one app logic: get Contract with id 123
Service one app logic -> Service one REST API: Contract
Service one REST API -> Service one REST API: Serialise to JSON / XML / etc.
Service one REST API -> Service two http client: Serialised data
Service two http client -> Service two http client : Deserialise to Contract
Service two http client -> Service two requester: Contract

The diagram generated from this text:

Beautiful!

The hackneyed phrase that a picture is worth a thousand words is nonetheless true. Diagrams and flowcharts like this help a non-programmer understand how your application behaves, and all it takes is a careful look at the picture. Colleagues will appreciate it, and for you it is a small investment in the common cause.

Stage 3: Name your classes and actions in the Ubiquitous Language

As you know, the Ubiquitous Language is a concept from DDD (domain-driven design), in which the team and the users work out a shared language that describes all classes and their interactions. Such a language is understandable to a layperson, so customers, testers, instructors, and business representatives can use it to read and understand what our program does and how it solves users' problems in the given domain.

Once the Ubiquitous Language is agreed upon, you must make sure that all your classes, their methods, events, and everything else are named as closely to it as possible.

/// <summary>
/// The customer is responsible for their own login/logout, as well as for retrieving and managing
/// their profile information and settings
/// </summary>
public interface ICustomer
{
     Task Login(string username, EncryptedString password);
     Task Logout();
     Task GetMyProfile();
     Task SaveMyProfile(CustomerProfileInformation profile);
     Task GetMySettings();
     Task SaveMySettings(CustomerSettings settings);
}

Although this is just a piece of code, the comment above it says in simple, understandable language what is going on here.

Stage 4: Just write comments

If all of the above seems far too burdensome, just give your code informative comments. You may not need them right now (you are currently up to your neck in the code and everything is clear to you), but in the future they can be very useful to you.

Just a few well-placed comment lines, or a comment on a class or method, will be enough to make the code much easier to understand. I am not urging you to comment every line (that would only clutter the code). Just comment the most difficult sections of the code, so that whoever works through them understands where the code is leading.
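As an illustration of that kind of targeted comment, here is a minimal sketch loosely modelled on the jQuery call shown earlier. The helper buildMessageRequest and its parameters are hypothetical, and the API it targets is assumed rather than known; the point is that only the non-obvious lines carry a comment, and each comment explains why, not what.

```javascript
// Hypothetical helper: builds a plain description of a POST request
// that sends a message to an (assumed) token-protected endpoint.
function buildMessageRequest(baseUrl, accessToken, message) {
  // The token travels in the query string because the assumed API
  // expects it there, not in a header.
  const url = `${baseUrl}?access_token=${encodeURIComponent(accessToken)}`;

  // Encode the message: a raw "&" or "=" in user text would otherwise be
  // read as a form-field separator and corrupt the message.
  const body = `message=${encodeURIComponent(message)}`;

  return { url, method: "POST", body };
}

// Usage: the message contains characters that would break an unencoded body.
const request = buildMessageRequest("https://some.url/path", "my_token", "a&b=c");
```

Two short comments are enough here; the return statement and the function signature need none.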

I hope you find these tips useful.
