How to embed static analysis into a project with more than 10 megabytes of source code?
So, you are a developer on a project with a lot (or even a whole lot) of source code. More than 10 megabytes, say. You have read articles about checking open-source projects and decided to try a code analyzer on your own project. You ran the check and got more than a thousand messages from the analyzer. And a thousand is the optimistic case; it could be more than ten thousand. But you're not a lazy developer, are you? You started going through them. And, horror of horrors, the fifth message turned out to be a real bug! So did the seventh, ninth, twelfth and fifteenth. You wrote down a dozen more real errors that the analyzer had pointed out and headed to your boss with these words:
“Boss, look. I downloaded a cool analyzer. It has already found ten real bugs for us in just half an hour. In total it issued a thousand (two, three, four thousand) messages. Let's buy this analyzer; the guys and I will sort out and fix all the messages in two to three weeks. And then, once we've fixed everything, it will give us 0 messages. That will mean we are cool programmers who write high-quality code!”
And although you are already mentally clearing space on your chest for a new medal (after all, you care about the quality of the project!), your boss will very likely answer something like this:
“Do you have nothing to do? You want to pull the whole team away for three weeks to fix errors when we have a release in a month! So what if the errors are real? We have lived with them somehow and nothing happened. Yes, I understand it would be great not to make mistakes in new code. Well, then don't make them! But I won't let you touch the old code! It has already been tested and customers have paid for it. And who is going to pay for three weeks of the team's time? So we are not buying the analyzer and we are not touching the old code. And apparently you have run out of tasks, so we'll give you some right now. And they'd better be done by tomorrow!”
A realistic scenario? It is. Worst of all, a good idea, introducing static analysis, has stalled because of the need to work through every message the code analyzer produces. After all, if each run of the analyzer produces a thousand messages, it is impossible to tell which messages are new and which are old. Or is there a way to solve this problem after all?
How is this problem solved in PVS-Studio?
A couple of releases ago PVS-Studio gained a mechanism for suppressing old or “uninteresting” analyzer messages. By version 5.22 we had finally debugged and polished it, and it turned out to be so convenient that we recommend it to anyone who is thinking about introducing static analysis into their project. Now I'll tell you how to use it.
So, here is a project of 5 million lines of code (just as an example), which is 150 megabytes of source. You checked it with the analyzer and got several thousand general-purpose (General Analysis) messages. You would be happy to fix them, but the project manager won't allocate the time.
OK, no problem. Once analysis of the entire solution has finished, go to the menu PVS-Studio -> Suppress Messages ... The dialog is simple: click "Suppress Current Messages", then Close. As a result, 0 messages remain in the PVS-Studio window. Even if you run the analysis on the whole code base again, there will still be 0 messages when it finishes. But if you start writing new code or modifying old code, the analyzer will report problems in that code. Unless, of course, you write it perfectly right away.
How does it work?
The mechanism is quite simple. Or rather, it is simple now; a lot of experiments went into arriving at this design... Well, okay, so what happens from the user's point of view when they click the “Suppress Current Messages” button?
A database made up of .suppress files is created in the project folders. For each project (.vcxproj file), its own .suppress file is created next to it. It stores information about all the messages the analyzer issued when checking that project. The results of every subsequent analysis are compared against these files.
Messages are, naturally, not compared verbatim. We take into account the message code, the text of the line of code the message points to, and the text of the previous and next lines. We do not take the line number into account. Suppose a message was issued near the end of a file (and was stored in the database that way): inserting lines at the beginning of the file changes the line number of the message, but we track this and do not show the “uninteresting” message again. If, however, one of the three lines has changed (previous, current or next), the analyzer reports the message again, because its context was affected and the message can be considered as issued for new (modified) code.
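The matching rule above can be illustrated with a small sketch. This is not the analyzer's actual implementation: the `Fingerprint` type, the function names and the exact-text comparison are assumptions made purely for illustration. It only shows the idea of identifying a message by its code plus three lines of text while ignoring the line number.

```python
from typing import NamedTuple


class Fingerprint(NamedTuple):
    """Hypothetical identity of a message: code plus surrounding text."""
    code: str        # diagnostic code, e.g. "V501"
    prev_line: str   # text of the line before the reported one
    cur_line: str    # text of the reported line
    next_line: str   # text of the line after the reported one


def fingerprint(code, lines, line_no):
    """Build a fingerprint for a message on 1-based line_no."""
    def text(i):
        # Lines outside the file are treated as empty (an assumption).
        return lines[i] if 0 <= i < len(lines) else ""

    idx = line_no - 1
    return Fingerprint(code, text(idx - 1), text(idx), text(idx + 1))


def new_messages(messages, lines, suppressed):
    """Keep only messages whose fingerprint is not in the suppressed set."""
    return [(code, n) for code, n in messages
            if fingerprint(code, lines, n) not in suppressed]


# Baseline: one V501 message on line 2 is stored as "uninteresting".
old = ["int a = 1;", "if (a == a) {", "  run();", "}"]
suppressed = {fingerprint("V501", old, 2)}

# Two lines inserted at the top: the message moves to line 4 but stays hidden.
shifted = ["// new header comment", ""] + old
still_hidden = new_messages([("V501", 4)], shifted, suppressed)

# The previous line was edited: the context changed, so the message is shown.
edited = ["int a = 2;", "if (a == a) {", "  run();", "}"]
reported = new_messages([("V501", 2)], edited, suppressed)
```

Here `still_hidden` is empty, while `reported` contains the V501 message again, because one of its three context lines no longer matches the stored one.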
What should you do with these .suppress files? If PVS-Studio runs on one and the same machine, say a build server, you can keep these files right in the project folders. If that is inconvenient (for example, because a clean build is performed), then before starting the analyzer you can copy the files into the project folders from another location with robocopy, which preserves the folder structure.
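If robocopy is not at hand, the same copy step can be scripted. The sketch below is only an illustration under stated assumptions: the function name `restore_suppress_files`, the storage layout and the file name `Core.suppress` are hypothetical. It copies every .suppress file from a storage folder into a fresh checkout, keeping relative paths.

```python
import shutil
import tempfile
from pathlib import Path


def restore_suppress_files(storage_root, project_root):
    """Copy all *.suppress files, preserving the relative folder structure."""
    storage_root = Path(storage_root)
    project_root = Path(project_root)
    copied = []
    for src in sorted(storage_root.rglob("*.suppress")):
        rel = src.relative_to(storage_root)
        dst = project_root / rel
        # Create the project subfolder if the clean checkout lacks it.
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(rel.as_posix())
    return copied


# Small self-contained demo on temporary folders (names are made up).
with tempfile.TemporaryDirectory() as tmp:
    storage = Path(tmp) / "storage"
    checkout = Path(tmp) / "checkout"
    (storage / "Core").mkdir(parents=True)
    (storage / "Core" / "Core.suppress").write_text("<placeholder/>")
    checkout.mkdir()

    copied = restore_suppress_files(storage, checkout)
    restored_ok = (checkout / "Core" / "Core.suppress").is_file()
```

Running this before the analyzer starts puts each .suppress file back next to its project, which is where the analyzer looks for it according to the description above.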
If several developers work with the analyzer, the .suppress files can be put into the repository. The question then arises: how do you synchronize these .suppress files between developers? On the one hand, the answer is simple: it's XML, so there is no problem. On the other hand, it turns out that you do not need to synchronize these files (or modify them at all). It is enough to create them once, when introducing the analyzer, and never modify them again. And then try to keep the project at 0 analyzer messages from then on.
Note. But how do you keep the project at 0 analyzer messages if the analyzer does, every now and then, give false warnings? There are several ways to suppress individual warnings in that case. They are covered below.
Well, what is this mechanism for?
In case anyone still doesn't see it, here is an absolutely concrete example. You add all existing messages to such a database (so a check yields 0 messages). Then the analyzer runs on the build server every night (see how to configure the launch of the analyzer on the build server) and outputs only new messages, that is, messages for the code the team wrote during the day. These messages are saved not only to a .plog file (the analyzer report in XML format) but also to a text file next to it. That text file can then be mailed to the project participants with any suitable program, for example SendEmail.
In the morning, people see the analyzer messages in their mail and can fix the errors even without PVS-Studio installed on their machines. And if someone does want to open the report (.plog), it is available on the build server. With this trick you can save quite a bit of money on PVS-Studio licenses.
By the way, you can configure which messages (more precisely, which levels) go into the text report. This is done with the OutputLogFilter option on the Specific Analyzer Settings tab of the PVS-Studio settings. We recommend including General Analysis Level 1 and Level 2 messages in the text file.
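The nightly mailing step can be done with any tool; as a rough sketch, here is how it might look with Python's standard library instead of SendEmail. The addresses, the sample report lines and the "one message per non-empty line" assumption are all made up for illustration and do not describe the real text report format.

```python
import smtplib
from email.message import EmailMessage


def build_report_mail(report_text, sender, recipients, solution):
    """Wrap the text report in an e-mail message.

    Assumes (hypothetically) one analyzer message per non-empty line.
    """
    count = sum(1 for line in report_text.splitlines() if line.strip())
    msg = EmailMessage()
    msg["Subject"] = f"{solution}: {count} new analyzer message(s)"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content(report_text)
    return msg


def send(msg, host="localhost", port=25):
    """Send the message through an SMTP server (host and port are placeholders)."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)


# Invented sample lines standing in for the contents of the text report.
report = "main.cpp (42): V501 ...\nutil.cpp (7): V547 ...\n"
mail = build_report_mail(
    report, "build@example.com", ["team@example.com"], "SolutionName"
)
```

On the build server the script would read the text file produced next to the .plog and call `send(mail)` after the analysis finishes.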
A little caveat about incremental analysis.
Still, we recommend that developers first of all use PVS-Studio on their local machines too, in incremental analysis mode. This is the “Analysis after Build (Modified files only)” checkbox in the PVS-Studio menu. When the checkbox is on, the analyzer follows your work and starts automatically for the files that have just been compiled (it tracks modification of .obj files). If the analyzer finds nothing, you won't even notice it ran. If it does find something, a pop-up message about the errors will appear.
Incremental analysis supports the database of "uninteresting" messages. If you have such a database, incremental analysis reports messages only for new code. If you don't, it reports everything found in the file being analyzed.
Incremental analysis mode lets you fix errors as soon as they appear, before they even reach the version control system. And as we know from McConnell, fixing an error at this stage is cheapest. So, where possible, we recommend using PVS-Studio both for daily checks on the server and in incremental analysis mode on programmers' machines.
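To give a feel for what "tracking the modification of .obj files" can mean, here is a minimal sketch. It is not how the plugin is implemented; the function name and the timestamp-based approach are assumptions. It simply lists the object files changed since the previous check, which is the set of translation units an incremental run would need to look at.

```python
import os
import tempfile
from pathlib import Path


def changed_objects(build_dir, since):
    """Return names of .obj files modified after the 'since' timestamp."""
    return sorted(
        p.name
        for p in Path(build_dir).rglob("*.obj")
        if p.stat().st_mtime > since
    )


# Demo with two fake object files and fixed modification times.
with tempfile.TemporaryDirectory() as tmp:
    old_obj = Path(tmp) / "old.obj"
    new_obj = Path(tmp) / "new.obj"
    old_obj.write_bytes(b"")
    new_obj.write_bytes(b"")
    os.utime(old_obj, (1000, 1000))   # built before the last check
    os.utime(new_obj, (3000, 3000))   # rebuilt after the last check

    changed = changed_objects(tmp, since=2000)
```

Only `new.obj` ends up in `changed`, so only the source file behind it would be re-analyzed.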
How do you fix the messages stored in the database?
So, you have introduced static analysis into the project, and the analyzer gives you no more than a few messages a day, which you and the team fix right away. Excellent. But then a free week turns up that can be spent on fixing old errors. How do you get to them? There are two options.
- You can open the dialog with the “Suppress Messages ...” command; it now contains a “Display Suppressed Messages” checkbox. Turn it on and the suppressed messages will appear.
- Or, if you have a daily run configured on the build server, just go to the folder with the results of the latest analysis, where you will see the following files:
- SolutionName.plog - a log with only new messages;
- SolutionName.plog.txt - a text log, the same as SolutionName.plog;
- SolutionName_WithSuppressedMessages.plog - all messages, including "uninteresting" ones.
This is important to understand: if you have the time and the opportunity, you can always go back to the old "uninteresting" messages and fix them. We recommend that you do, because there are real errors among them, and errors should be fixed.
Doesn't this conflict with the Mark As False Alarm function?
PVS-Studio has a Mark As False Alarm command, which marks a message as a false positive. It adds a comment of the form //-V501 to the line the message points to. When the analyzer encounters such a comment, it does not issue the V501 message for that line.
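As an illustration of how such a per-line marker can be honored, here is a minimal sketch. It assumes the marker is written as `//-V501` at the end of the line; the regular expression and the function name are hypothetical and say nothing about how the analyzer parses comments internally.

```python
import re

# Matches markers such as //-V501 and captures the numeric part.
MARKER = re.compile(r"//-V(\d+)")


def is_marked_false_alarm(source_line, code):
    """True if the line carries a false-alarm marker for this message code."""
    marked = {"V" + number for number in MARKER.findall(source_line)}
    return code in marked


marked_line = "if (a == a) { //-V501"
plain_line = "if (a == a) {"

hidden = is_marked_false_alarm(marked_line, "V501")
other_code = is_marked_false_alarm(marked_line, "V547")
unmarked = is_marked_false_alarm(plain_line, "V501")
```

The marker silences only the named diagnostic on that one line: `hidden` is true, while `other_code` and `unmarked` are false.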
The mechanism described in this article and Mark As False Alarm do not conflict with each other, but they serve slightly different purposes. Mass suppression of uninteresting messages is for marking messages in bulk when the analyzer is introduced into a project. Mark As False Alarm is for making the analyzer stop complaining about individual code fragments.
In principle, it was possible even before to mark all the messages as False Alarm, but people are usually scared to make that many changes to the code. Besides, it is unclear how to get back to the old errors afterwards: remove everything that is marked as False Alarm? And what if some of those really are false alarms?
But using the mass suppression mechanism to suppress individual messages is also wrong.
In general, these are two mechanisms that solve different problems. Do not confuse them.
So, the approach to introducing static analysis into a live project looks like this:
- Mark all existing messages as “uninteresting” with the “Suppress Messages ...” command.
- From the next run on, the analyzer issues messages only for new code.
- When needed, you can always go back and fix the errors that were hidden during the introduction.