Subversion: checklist for correct commits
We assume that the reader: a) works in a team; and b) realized the need to work correctly with version control systems, or at least was confronted with the need to use one.
Subversion will be used in the examples, although all recommendations are fully applicable to any other version control system.
Roughly divide the phases of project development into three - debut, middlegame and endgame.
In debut, new code is written in huge chunks, often whole chunks of the system are transferred from place to place. The release is far away, there are no special requirements for the general state of the system. It is permissible even to break the system to certain limits.
In the middlegame, the system as a whole has stabilized, the product is nearing release. Refactoring is clearer, although sometimes quite extensive. At this stage, it is already expected that the system as a whole is working - at least the broken repository is condemned.
Finally, the system goes into the endgame immediately before the release and immediately after. For web applications, new relatively small features are constantly being added, and large changes are first tested on branches. For more traditional applications, on the contrary, a branch is created for maintenance releases, and the development of the next large version continues on the trunk.
Commits are relatively rare in debut. For example, creating a new class, a new relationship, or writing a preliminary version of a sufficiently large piece of code can be a good commit point. For some programmers, especially high-performance ones, it may even be convenient to commit time-bound two to three times a day. The repository can be in a fairly “parsed” state, up to the point that the extracted code is simply not compiled.
In the middlegame, the commits stream is much more structured. Commits are more frequent, and are tied only to structural changes. Commits are basically no larger than average. Good points for a commit would be: closing a task or bug in the tracker, refactoring, adding or fixing another feature. The repository is almost always in a more or less operational state, and broken repositories are condemned (and the fact of breaking causes a karmic blow).
Finally, in the endgame, commits are somehow tightly controlled. In general, the fact of working on a project in this phase should have some justification - this is usually a bug or an important feature transferred for release. The commits in this phase quickly go to production, so they: are clearly tied to the bug tracker; relatively small; well documented; well tested and thought out - introducing additional bugs at this stage inflicts a serious karmic blow).
Based on many years of experience in the development, support and review of fairly large software systems, we have compiled some set of important recommendations. They are called, firstly, to increase the depth of awareness when working on a project; secondly, to facilitate the work of the code reviewer - and therefore the probability of detecting a bug or a flaw; thirdly, to facilitate the fate of the person who will be forced to understand the history of the project sometime in the future (often the author of the current commits will also be such a person).
In general, in our opinion, the depth of understanding by the programmer of the project on which he works is strongly correlated with the quality of the commits that he makes in the process of work.
Commits should be logical. A commit must correspond to one structural unit - a new file, a new block, a new includer, a new class, a new relationship, a new feature, a closed bug, a single refactoring, a chapter of documentation, a correction of documentation, etc.
The meaning of this requirement is to play for the good of the future . Observing it, you can be sure that one or another commit in the stream is either safe (for example, fix documentation), or it can be freely pumped out with the confidence that exactly one logical change was rolled back. This is a priceless feature in binary error searching. Moreover, it is obvious that logical changes are much easier to review in the code review process.
Commits must be tested (except for the debut phase). Ideal if the change is generally tested by unit tests. In extreme cases, the commit needs to be checked at least once - even the most harmless corrections can lead to disastrous consequences for karma. Of course, the commit should be syntactically valid - the fix to the fix strikes the image.
Commits should be well documented. We are silent about empty journal messages - this is generally the worst that could be. This is only permissible for employees who are completely far from programming (such as designers), who have hardly agreed to this sub-version of yours at all. And then, it makes sense to work on the cultural level of even such people! A commit with a description of “Fix” or “Fixed” is valid only if there are no more than two or three lines in diff. If more - you need to write in more detail.
Separately, it is necessary to consider the issue of a journal message, consisting of several unrelated items. It is clear that such a message is a symptom of a true problem - that your commit consists of several parts that are not connected to each other. The best solution in this case would be to cancel the attempt to commit and try to split it into several parts.
In the endgame phase, poorly documented commits should be strictly condemned. Also, if the commit is a bug closure, then the bug number must be mentioned in the journal message. The same Trac integrates with Subversion, determining ticket numbers in journal messages and automatically turns them into corresponding links.
Before committing, you need to view diffs, as well as keep track of new files not added to the repository and add them. If you use TortoiseSVN, then it has a special menu item called "Check for modifications". In the command-line client, svn st (for loose files) and svn diff | more (to review changes before committing).
This rule must be observed in order not to accidentally commit something wrong. It is known that, for example, in the process of correcting errors, subjectively, it seems that "there was a lot of work." However, when viewing the diff, it turns out that one or two lines were actually corrected. Viewing also allows you to eliminate garbage from commits - for example, temporary debugging statements and empty changes (adding and removing spaces). It often happens that only white space changes remain in a particular file - they must be rolled back to their original state (svn revert). In the endgame phase, reviewing changes before committing is strictly required.
Separately, mention should be made of fixing alignments and spaces in files. Such an operation (if it is very necessary) must be committed independently. In the endgame phase, in no case should you allow committing whitespace changes along with significant changes.
Also, smaller rules:
- when renaming a file and making changes to it, you should separately commit the rename, and the change separately;
- when adding a third-party component (Rails plugin, prototype.js file) to the project code, they should be committed separately;
- you should not leave commented lines of code - for storing old versions, naturally, there is a version control system (in some cases, this rule may be violated with one-two-line corrections).
Questions, wishes, suggestions?
PS We have a whole bunch of vacancies: undev.ru
Subversion will be used in the examples, although all recommendations are fully applicable to any other version control system.
Roughly divide the phases of project development into three - debut, middlegame and endgame.
In debut, new code is written in huge chunks, often whole chunks of the system are transferred from place to place. The release is far away, there are no special requirements for the general state of the system. It is permissible even to break the system to certain limits.
In the middlegame, the system as a whole has stabilized, the product is nearing release. Refactoring is clearer, although sometimes quite extensive. At this stage, it is already expected that the system as a whole is working - at least the broken repository is condemned.
Finally, the system goes into the endgame immediately before the release and immediately after. For web applications, new relatively small features are constantly being added, and large changes are first tested on branches. For more traditional applications, on the contrary, a branch is created for maintenance releases, and the development of the next large version continues on the trunk.
Commits are relatively rare in debut. For example, creating a new class, a new relationship, or writing a preliminary version of a sufficiently large piece of code can be a good commit point. For some programmers, especially high-performance ones, it may even be convenient to commit time-bound two to three times a day. The repository can be in a fairly “parsed” state, up to the point that the extracted code is simply not compiled.
In the middlegame, the commits stream is much more structured. Commits are more frequent, and are tied only to structural changes. Commits are basically no larger than average. Good points for a commit would be: closing a task or bug in the tracker, refactoring, adding or fixing another feature. The repository is almost always in a more or less operational state, and broken repositories are condemned (and the fact of breaking causes a karmic blow).
Finally, in the endgame, commits are somehow tightly controlled. In general, the fact of working on a project in this phase should have some justification - this is usually a bug or an important feature transferred for release. The commits in this phase quickly go to production, so they: are clearly tied to the bug tracker; relatively small; well documented; well tested and thought out - introducing additional bugs at this stage inflicts a serious karmic blow).
Based on many years of experience in the development, support and review of fairly large software systems, we have compiled some set of important recommendations. They are called, firstly, to increase the depth of awareness when working on a project; secondly, to facilitate the work of the code reviewer - and therefore the probability of detecting a bug or a flaw; thirdly, to facilitate the fate of the person who will be forced to understand the history of the project sometime in the future (often the author of the current commits will also be such a person).
In general, in our opinion, the depth of understanding by the programmer of the project on which he works is strongly correlated with the quality of the commits that he makes in the process of work.
Commits should be logical. A commit must correspond to one structural unit - a new file, a new block, a new includer, a new class, a new relationship, a new feature, a closed bug, a single refactoring, a chapter of documentation, a correction of documentation, etc.
The meaning of this requirement is to play for the good of the future . Observing it, you can be sure that one or another commit in the stream is either safe (for example, fix documentation), or it can be freely pumped out with the confidence that exactly one logical change was rolled back. This is a priceless feature in binary error searching. Moreover, it is obvious that logical changes are much easier to review in the code review process.
Commits must be tested (except for the debut phase). Ideal if the change is generally tested by unit tests. In extreme cases, the commit needs to be checked at least once - even the most harmless corrections can lead to disastrous consequences for karma. Of course, the commit should be syntactically valid - the fix to the fix strikes the image.
Commits should be well documented. We are silent about empty journal messages - this is generally the worst that could be. This is only permissible for employees who are completely far from programming (such as designers), who have hardly agreed to this sub-version of yours at all. And then, it makes sense to work on the cultural level of even such people! A commit with a description of “Fix” or “Fixed” is valid only if there are no more than two or three lines in diff. If more - you need to write in more detail.
Separately, it is necessary to consider the issue of a journal message, consisting of several unrelated items. It is clear that such a message is a symptom of a true problem - that your commit consists of several parts that are not connected to each other. The best solution in this case would be to cancel the attempt to commit and try to split it into several parts.
In the endgame phase, poorly documented commits should be strictly condemned. Also, if the commit is a bug closure, then the bug number must be mentioned in the journal message. The same Trac integrates with Subversion, determining ticket numbers in journal messages and automatically turns them into corresponding links.
Before committing, you need to view diffs, as well as keep track of new files not added to the repository and add them. If you use TortoiseSVN, then it has a special menu item called "Check for modifications". In the command-line client, svn st (for loose files) and svn diff | more (to review changes before committing).
This rule must be observed in order not to accidentally commit something wrong. It is known that, for example, in the process of correcting errors, subjectively, it seems that "there was a lot of work." However, when viewing the diff, it turns out that one or two lines were actually corrected. Viewing also allows you to eliminate garbage from commits - for example, temporary debugging statements and empty changes (adding and removing spaces). It often happens that only white space changes remain in a particular file - they must be rolled back to their original state (svn revert). In the endgame phase, reviewing changes before committing is strictly required.
Separately, mention should be made of fixing alignments and spaces in files. Such an operation (if it is very necessary) must be committed independently. In the endgame phase, in no case should you allow committing whitespace changes along with significant changes.
Also, smaller rules:
- when renaming a file and making changes to it, you should separately commit the rename, and the change separately;
- when adding a third-party component (Rails plugin, prototype.js file) to the project code, they should be committed separately;
- you should not leave commented lines of code - for storing old versions, naturally, there is a version control system (in some cases, this rule may be violated with one-two-line corrections).
Questions, wishes, suggestions?
PS We have a whole bunch of vacancies: undev.ru