10 principles of self-documenting code
Hello! Today I want to share some tips on writing clean, understandable code, taken from Pete Goodliffe's book "Code Craft: The Practice of Writing Excellent Code."
Of course, anyone who writes code would do well to read this entertaining book. But for those who are especially lazy and still want to torment and mislead their colleagues a little less (that is, those with a conscience), here are 10 principles of self-documenting code:
1. Write simple code with good formatting
Layout has a huge impact on how easy code is to understand. A sensible layout conveys the structure of the code: functions, loops, and conditional statements become clearer.
int fibonacci(int position)
{
    if (position < 2)
    {
        return 1;
    }
    int previousButOne = 1;
    int previous = 1;
    int answer = 2;
    for (int n = 2; n < position; ++n)
    {
        previousButOne = previous;
        previous = answer;
        answer = previous + previousButOne;
    }
    return answer;
}
- Make sure the normal execution path of your code is obvious. Error handling should not distract from the normal flow. Conditional if-then-else constructs should order their branches consistently (for example, always place the "regular" branch before the "error handling" branch, or vice versa).
- Avoid deeply nested statements. Otherwise the code becomes complex and needs extensive explanation. It is commonly held that each function should have only one exit point; this is known as Single Entry, Single Exit (SESE) code. In practice, though, this restriction usually makes the code harder to read and increases the nesting depth. I prefer the fibonacci function above to the following SESE-style version:
int fibonacci(int position)
{
    int answer = 1;
    if (position >= 2)
    {
        int previousButOne = 1;
        int previous = 1;
        answer = 2; // start from the same value as the early-return version
        for (int n = 2; n < position; ++n)
        {
            previousButOne = previous;
            previous = answer;
            answer = previous + previousButOne;
        }
    }
    return answer;
}
I would give up this excessive nesting in favor of an additional return statement: the function has become much harder to read. Hiding a return somewhere deep inside a function is of doubtful value, but a simple early exit at the beginning makes the function much easier to read.
- Beware of optimizations that obscure the underlying algorithm. Optimize code only when it is clear that it prevents the program from performing acceptably. When you do optimize, comment clearly on how that section of code works.
2. Choose meaningful names
The names of all variables, types, files and functions should be meaningful and not misleading. A name must correctly describe the thing it names. If you cannot find a meaningful name, you may not fully understand what your code does.
The naming system should be consistent and not cause unpleasant surprises. Make sure that the variable is always used only for the purpose that its name suggests.
A good choice of names is probably the best way to avoid unnecessary comments. Names are the best way to bring the code closer to the expressiveness of natural languages.
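As a small illustration of the point about names, here is a hypothetical sketch (the functions and their parameters are invented for this example, not taken from the book). Both functions do the same arithmetic; only the second tells the reader what is being computed.

```cpp
#include <cassert>

// Hard to read: the names say nothing about what is being computed.
double calc(double a, double b, int n)
{
    return a * n * (1.0 - b);
}

// Same arithmetic, but the names now carry the meaning,
// so no explanatory comment is needed at the call site.
double discountedTotal(double unitPrice, double discountRate, int quantity)
{
    assert(discountRate >= 0.0 && discountRate <= 1.0);
    return unitPrice * quantity * (1.0 - discountRate);
}
```

A call such as discountedTotal(price, rate, count) reads almost like a sentence, whereas calc(p, r, c) sends the reader off to find the definition.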
3. Break the code into independent functions
How you break the code into functions and what names you give them can make the code understandable or completely incomprehensible.
- One function, one action
Minimize any unexpected side effects, no matter how useful they may seem. They will require additional documentation.
Write short functions. They are easier to understand. You can find your way around a complex algorithm when it is divided into small fragments with meaningful names, but not in a shapeless mass of code.
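One possible way to picture "one function, one action" is the sketch below. The task (averaging sensor readings while ignoring negative ones) and all the names are assumptions made up for illustration. Each helper does a single thing and has no side effects, so the top-level function reads like a summary of the algorithm.

```cpp
#include <cassert>
#include <vector>

// A reading is considered valid when it is not negative (assumed rule).
bool isValidReading(double reading)
{
    return reading >= 0.0;
}

// Returns a copy containing only the valid readings.
std::vector<double> validReadings(const std::vector<double>& readings)
{
    std::vector<double> valid;
    for (double reading : readings)
    {
        if (isValidReading(reading))
        {
            valid.push_back(reading);
        }
    }
    return valid;
}

// Arithmetic mean of a non-empty sequence.
double average(const std::vector<double>& values)
{
    assert(!values.empty());
    double total = 0.0;
    for (double value : values)
    {
        total += value;
    }
    return total / values.size();
}

// Top level: filter, then average. Returns 0.0 when nothing is valid.
double averageValidReading(const std::vector<double>& readings)
{
    const std::vector<double> valid = validReadings(readings);
    if (valid.empty())
    {
        return 0.0;
    }
    return average(valid);
}
```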
4. Choose descriptive types
As far as possible, express constraints and behavior using the features the language gives you. For example:
- When defining a value that will never change, declare it as a constant (use const in C).
- If a variable should never hold negative values, use an unsigned type (if the language has one).
- Use enumerations to describe a set of related values.
- Choose variable types carefully. In C/C++, store sizes in variables of type size_t, and the results of pointer arithmetic in variables of type ptrdiff_t.
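The points in the list above could look roughly like this in C++. The oven example, its names and the value 250 are assumptions chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>

// An enumeration names the complete set of related values.
enum class OvenState { Off, Preheating, Ready };

// A value that never changes is const; it cannot be negative, so it is unsigned.
const unsigned int maxOvenTemperature = 250;

// Sizes and counts are std::size_t.
std::size_t countSpaces(const char* text, std::size_t length)
{
    assert(text != nullptr);
    std::size_t spaces = 0;
    for (std::size_t i = 0; i < length; ++i)
    {
        if (text[i] == ' ')
        {
            ++spaces;
        }
    }
    return spaces;
}

// The difference between two pointers is std::ptrdiff_t.
std::ptrdiff_t distanceBetween(const char* first, const char* last)
{
    return last - first;
}

bool canBake(OvenState state, unsigned int temperature)
{
    return state == OvenState::Ready && temperature <= maxOvenTemperature;
}
```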
5. Use named constants
Code like if (counter == 76) is perplexing. What is the magic meaning of 76? What is this check for? Magic numbers are a bad habit: they obscure the meaning of the code. It is much better to write this:
const size_t bananas_per_cake = 76;
...
if (counter == bananas_per_cake)
{
    // bake a banana cake
}
If the constant 76 (sorry, bananas_per_cake) appears often in the code, there is an additional benefit: when the number of bananas in a cake has to change, it is enough to modify the code in one place instead of running an error-prone global search and replace for the number 76.
This applies not only to numbers, but also to constant strings. Take a look at any literals in your code, especially if they occur repeatedly. Wouldn't it be better to use named constants instead?
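The same idea applied to string literals might look like the sketch below. The file name, the recipe name and both functions are hypothetical, picked only to continue the banana cake theme.

```cpp
#include <cassert>
#include <string>

// Each literal is defined once and referred to by name everywhere else.
const char pathSeparator = '/';
const std::string configFileName = "cake.cfg";
const std::string defaultRecipe = "banana";

// Builds the full path of the configuration file inside a directory.
std::string configPath(const std::string& directory)
{
    return directory + pathSeparator + configFileName;
}

bool isDefaultRecipe(const std::string& recipe)
{
    return recipe == defaultRecipe;
}
```

If the configuration file is ever renamed, only the definition of configFileName changes.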
6. Highlight important pieces of code
Try to highlight important code against the background of ordinary material. In the right place you should get the reader’s attention. There are a number of tricks for this. For example:
- Order the declarations in a class sensibly. Public information should come first, because that is what a user of the class needs. Private implementation details should go at the end, since they are of less interest to most readers.
- Hide all non-essential information where possible. Do not leave unnecessary clutter in the global namespace. C++ has the pimpl idiom, which lets you hide the implementation details of a class (Meyers 97).
- Do not hide important code. Do not write more than one statement per line, and keep that statement simple. The language lets you write very clever for loops in which all the logic fits on one line with many commas, but such statements are hard to read. Avoid them.
- Limit the nesting depth of conditional statements. Otherwise, it is difficult to notice the handling of really important cases behind a heap of ifs and parentheses.
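A minimal sketch of the first point in the list above, with the public interface first and the implementation details last. The IntStack class is an invented example, not code from the book.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class IntStack
{
public:
    // What a user of the class needs comes first.
    void push(int value)
    {
        items.push_back(value);
    }

    // Removes and returns the most recently pushed value.
    int pop()
    {
        assert(!items.empty());
        int top = items.back();
        items.pop_back();
        return top;
    }

    bool empty() const
    {
        return items.empty();
    }

    std::size_t size() const
    {
        return items.size();
    }

private:
    // Implementation details go last; most readers never need them.
    std::vector<int> items;
};
```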
7. Combine related data
All related information should be in one place. Otherwise, you will make the reader not only jump through hoops, but also look for these hoops with ESP. The API for each component must be represented by a single file. If there is too much interconnected information to be presented in one place, it is worth revising the code architecture.
Whenever possible, group related items using language constructs. In C++ and C#, you can group elements within a single namespace. In Java, the grouping mechanism is the package. Related constants can be defined in an enumeration.
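In C++ this grouping could be sketched as follows. The bakery namespace, the CakeSize enumeration and the scaling rule are assumptions invented for the example; only the number 76 echoes the earlier banana constant.

```cpp
#include <cassert>

namespace bakery
{
    // Related constants are defined together in one enumeration.
    enum class CakeSize { Small = 1, Medium = 2, Large = 3 };

    const unsigned int bananasPerSmallCake = 76;

    // Assumed rule for illustration: banana count scales with the size value.
    unsigned int bananasNeeded(CakeSize size)
    {
        return bananasPerSmallCake * static_cast<unsigned int>(size);
    }
}
```

Everything a reader needs to know about cake sizes now sits in one place and is reached through one prefix, bakery::.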
8. Label files
Place a comment block at the beginning of each file describing the contents of the file and the project it belongs to. This takes little effort but is of great benefit. Anyone who has to maintain the file will get a good idea of what they are dealing with. This header may also have legal significance: most software companies require every source file to carry a copyright notice. Typically, file headers look something like this:
/*********************************************************
* File: Foo.java
* Purpose: Foo class implementation
* Notice: (c) 1066 Foo industries. All rights reserved.
********************************************************/
9. Handle errors correctly
Handle every error in the most appropriate context. If a disk read or write fails, the failure should be handled in the code that deals with disk access. Handling it may mean raising a different error (such as a "cannot load file" exception) and passing it up to a higher level. In other words, at each level of the program the error must be an accurate description of the problem in that level's context. It makes no sense to handle a disk failure in the user interface code.
Self-documenting code helps the reader understand where the error occurred, what it means and what its consequences are for the program at the moment.
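The translation of a low-level error into a higher-level one might be sketched like this. Both exception classes, both functions and the ".txt" naming rule are hypothetical; the sketch only assumes the standard fstream and stdexcept facilities.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Low level: describes the problem in terms of disk access.
class DiskReadError : public std::runtime_error
{
public:
    explicit DiskReadError(const std::string& path)
        : std::runtime_error("cannot read file: " + path) {}
};

// Higher level: describes the problem in terms the caller cares about.
class DocumentLoadError : public std::runtime_error
{
public:
    explicit DocumentLoadError(const std::string& name)
        : std::runtime_error("cannot load document: " + name) {}
};

// Reads a whole file; throws DiskReadError if it cannot be opened.
std::string readFile(const std::string& path)
{
    std::ifstream file(path);
    if (!file)
    {
        throw DiskReadError(path);
    }
    std::ostringstream contents;
    contents << file.rdbuf();
    return contents.str();
}

// Loads a document by name (assumed to be stored as "<name>.txt").
std::string loadDocument(const std::string& name)
{
    try
    {
        return readFile(name + ".txt");
    }
    catch (const DiskReadError&)
    {
        // Translate the disk-level failure into a document-level one.
        throw DocumentLoadError(name);
    }
}
```

User interface code then only has to deal with DocumentLoadError and never sees the disk details.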
10. Write meaningful comments
So far we have tried to avoid writing comments by using other, indirect ways of documenting the code. But once you have made every effort to write understandable code, whatever remains unclear needs comments. To be easy to understand, code has to be supplemented with an appropriate amount of commentary. How much is appropriate?
Try other tricks first. For example, check if you can make the code more understandable by changing the name or creating a helper function, and thus avoid commenting.
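For instance, a condition that needs a comment can often be moved into a helper function whose name says the same thing. The functions below are an invented illustration of that trick.

```cpp
#include <cassert>

// Before: a comment has to explain what the condition means.
int daysInFebruaryCommented(int year)
{
    // is this a leap year?
    if ((year % 4 == 0 && year % 100 != 0) || year % 400 == 0)
    {
        return 29;
    }
    return 28;
}

// After: the helper's name replaces the comment.
bool isLeapYear(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

int daysInFebruary(int year)
{
    return isLeapYear(year) ? 29 : 28;
}
```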
I'm sure that once a few of these principles become habit, you will make one programmer happier. And that happy programmer will be you. When? When you come back to work on code you wrote six months ago.