EvgeniyRyzhkov September 8, 2010 at 11:45

Regular use of static code analysis in team development

Intel is expected to release a static analyzer called Advisor as part of Intel Parallel Studio 2011 around September, so this is a good moment to talk about static code analysis technology and its application in general. In our experience, static analysis is not used often in Russia, apparently because we do not have that many complex software projects. So I hope a short text on what it is and who can benefit from it will be useful. And who better to write it than the authors of the PVS-Studio analyzer? :-)

Abstract

Static code analysis technologies are used in companies with mature software development processes. However, the degree to which code analysis tools are adopted and built into the development process varies: from running the analyzer manually "from time to time" or when hunting for subtle errors, to running it automatically every day or whenever new source code is added to the version control system.

The article discusses the levels at which static code analysis can be used in team development and shows how to move the process from one level to the next. The PVS-Studio code analyzer, developed by the authors, is used as an example.

Introduction

A static code analyzer is a tool for finding software errors in source code. Using such a tool helps detect errors at the development stage rather than during testing or after release.

However, companies do not always manage to benefit from such tools, and the reasons vary. For some projects a code analyzer does not pay off economically; others are not large enough for the effect to be noticeable. Therefore, before introducing static code analysis into the development process, you need to understand when it can be useful and when it cannot.

Drawing on the authors' experience (developing, promoting and selling their own static code analyzer), the article formulates the main considerations that should guide the introduction of such tools into the development process.

What is static code analysis?

Static code analysis is a technology for finding errors in programs by parsing the source code and searching for patterns (templates) of known errors in it. This technology is implemented by special tools called static code analyzers.

The word "static" means that the code is analyzed without executing the program. Tools that analyze a program while it is running are called dynamic code analyzers.

The best-known static analyzers are produced by Coverity, Klocwork and Gimpel Software. Popular dynamic analyzers are made by Intel (Intel Parallel Inspector) and Micro Focus (DevPartner Bounds Checker). We should also mention the specialized static code analyzer PVS-Studio, which the authors of this article develop and promote.

The output of a static analyzer is a list of potential problems found in the code, each with a file name and a specific line. In other words, it is a list of errors very similar to the one a compiler produces. The term "potential problems" is used here deliberately. Unfortunately, a static analyzer cannot say with certainty whether a potential error in the code is a real problem. Only the programmer can know that. Therefore, alas (and this is inevitable), code analyzers give false positives.

Static code analysis tools are classified by the programming languages they support (Java, C#, C, C++) and by the problems they diagnose (general-purpose analyzers or specialized ones, for example for developing 64-bit or parallel programs).
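
To make the idea concrete, here is a minimal sketch of pattern-based checking. It is a toy, not a description of how PVS-Studio or any other real analyzer works internally (real tools parse the code properly rather than matching text). The names Warning, findSuspiciousAssignments and formatWarning are hypothetical, and the single regular expression is an assumption chosen for illustration. The sketch reads the source text without running it and reports a file name and a line for each potential problem.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// One diagnostic: where the potential problem is and what it is.
struct Warning {
    std::string file;
    int line;
    std::string message;
};

// Hypothetical toy check: flag "if (a = b)", where "=" was probably meant
// to be "==". The source text is only read, never compiled or executed.
std::vector<Warning> findSuspiciousAssignments(const std::string& fileName,
                                               const std::string& source) {
    // "if (" + identifier + a single "=" that is not followed by another "=".
    static const std::regex pattern(R"(\bif\s*\(\s*\w+\s*=[^=])");
    std::vector<Warning> warnings;
    std::istringstream in(source);
    std::string text;
    int lineNo = 0;
    while (std::getline(in, text)) {
        ++lineNo;
        if (std::regex_search(text, pattern)) {
            warnings.push_back({fileName, lineNo,
                                "assignment inside 'if' condition"});
        }
    }
    return warnings;
}

// Format a warning in a compiler-like style: file(line): warning: message.
std::string formatWarning(const Warning& w) {
    return w.file + "(" + std::to_string(w.line) + "): warning: " + w.message;
}
```

A line such as "if (ok = tryOpen())" is flagged too, even when the assignment is intentional. The check cannot tell the difference, which is exactly the kind of false positive discussed above.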

For which projects is static code analysis relevant

Static code analysis is not worth using in every project, only in medium and large ones. What counts as a small, medium or large project is beyond the scope of this article, but from our own experience we recommend considering static analysis for projects larger than 30 person-months. For a smaller project, a few qualified developers are enough instead of static analysis: a team of two to four qualified people can handle such a project and produce good-quality code. But if more people work on the project, or it lasts longer than six months, then hoping that "you just need to write without errors" is rather naive.

Scenarios for using static code analyzers

Consider situations in which a development team may need static code analysis. We deliberately look at the case where static analysis is only just being introduced into the development process: if it has long been in place and in use, there is nothing left to discuss about adoption.

So, suppose a team of 5 people is porting the code of a software project to 64-bit computers. Assume also that the project is written in C/C++. We admit up front that these assumptions are chosen so that our PVS-Studio code analyzer fits the example. The developers fixed the main compilation errors and built the application and the distribution kit. They started testing and found that the program has extremely mysterious errors that appear only in the 64-bit version. The developers go to Google, enter "64-bit platform c++ issues", and among the 8.5 million results they find, on the first page, a link to our article "20 issues of porting C++ code on the 64-bit platform" (in the Russian version, "20 traps of porting C++ code to a 64-bit platform"). From it they learn that developing 64-bit versions of C/C++ applications brings out various problems that were previously invisible. There they also learn that there is a tool, PVS-Studio, that helps find and fix these problems.
Next, the developers download the tool and try the evaluation version; if it suits them, they buy a license, use the tool to find a number of errors in their code and fix them, and the program is released without those errors. After that, the developers consider the task of creating the 64-bit version complete and stop using the analyzer, since they believe they no longer need it.
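
A typical example of such a "mysterious" error, given here as an illustrative sketch rather than code from the project in this scenario (the function names containsBuggy and containsFixed are hypothetical): the result of std::string::find is stored in a variable of type unsigned. Where unsigned and size_t have the same width the code works; on a 64-bit platform where size_t is wider than unsigned, the "not found" value std::string::npos is truncated and the check always passes.

```cpp
#include <cstddef>
#include <string>

// Buggy: 'unsigned' is 32 bits on common 64-bit platforms, while
// std::string::npos has type size_t (64 bits there). The "not found"
// value is truncated on assignment, so the comparison is always true.
bool containsBuggy(const std::string& text, const std::string& part) {
    unsigned pos = text.find(part);
    return pos != std::string::npos;
}

// Fixed: keep the result in the type that find() actually returns.
bool containsFixed(const std::string& text, const std::string& part) {
    std::string::size_type pos = text.find(part);
    return pos != std::string::npos;
}
```

On a platform where size_t is wider than unsigned, containsBuggy("hello", "z") reports that the substring is present although it is not, while containsFixed gives the right answer on both kinds of platform. The code compiles in either case, so an error like this surfaces only in testing or with a tool that looks for this pattern.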

The second scenario is similar. While developing a Java application, a team of 5 developers ran into an error in one of the third-party modules. They could not find the error by reading the code, so they downloaded a trial version of a code analyzer for Java, used it to find the error in the third-party module and fixed it, but did not buy a license for the tool because of project budget limits. The error is fixed, the application is released, and the tool's license terms are not violated. Everything seems fine, yet this way of using a static analyzer cannot be called correct either.

The third scenario. The developers switched to Visual Studio Team Foundation Server, which can run code analysis on files being added to the version control system. A few weeks later the developers turned the check off, because adding new code had turned into a game of "convince the analyzer to let the file in".

None of these three scenarios is a good example of using static analysis, even though in the first two the analyzer helped find real errors and in the third the programmers' code was apparently plainly bad. What are the reasons for these failures?

What prevents full use of a static code analyzer

Let us look at why the three scenarios above are not success stories.

If the team uses a specialized code analyzer (as in the scenario above, to search for 64-bit code problems), it is very tempting to abandon the tool once the problems seem to have been found and fixed. Indeed, once the 64-bit version of the product is released, it might seem that there is no point in using a special tool any further. That is not so. If you stop using the analyzer, then after several months the new code will contain errors that the analyzer could have detected. That is, although the 64-bit version of the application exists and was debugged once, new code may still contain errors specific to 64-bit applications. The conclusion from the first scenario: abandoning a specialized code analyzer after the initial problems are fixed lets the same kind of errors return in new code.

In the second scenario, the team turned to a specialized tool only once it was obvious that the project contained hard-to-detect errors, and abandoned the tool after fixing them. The problem with this approach is that hard-to-detect errors will appear in the project again sooner or later, but this time they may be seen first by users rather than by developers or testers. The conclusion is the same as for the first scenario: abandoning the tool will certainly lead to new hard-to-detect errors.

In the third scenario, static analysis at check-in was abandoned because adding new code to the version control system had become difficult. Here the problem is not the static analyzer but the insufficient skill level of the team. First, the team was unable to configure the tool so that its messages were useful. Second, the code apparently really was not very good, since the analyzer issued so many diagnostic messages.

So, let us formulate the main problems that prevent static code analysis tools from being used continuously:

The high price of code analysis tools rules them out for small (above all, small-budget) projects. You simply need to accept that there are projects for which static analysis is unsuitable for economic rather than technological reasons.
A code analysis tool gives many false positives. Alas, any code analyzer gives false positives, and often a lot of them. The reason lies in the philosophy of such tools: better to issue ten to one hundred false messages than to miss one real error. Do not count on finding an analyzer that generates few false positives. It is better to choose a tool that provides some way of working with them. For example, our PVS-Studio analyzer has a "Mark as False Alarm" function. With it you can mark false positives directly in the code, that is, indicate that the analyzer should not issue a given type of message on a given line.
Poor integration into the development environment. If the code analysis tool does not integrate seamlessly into the development environment, it is unlikely to be used regularly.
No automated launch from the command line. Without it, the entire project cannot be analyzed regularly, for example during daily builds.
No integration with the version control system. Although in the scenario above checking new code at check-in was the very reason the tool was abandoned, the possibility of such integration is useful in itself.
Analyzer settings that are too complex or, on the contrary, too simple.
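
The idea of marking false positives in the code, combined with an automated run during the daily build, can be sketched as follows. This illustrates the general mechanism only: the marker text "//-FALSE-ALARM" and the names Warning, dropMarkedWarnings and buildExitCode are hypothetical and do not reproduce the actual comment format or command-line interface of PVS-Studio or any other tool.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Warning {
    int line;             // 1-based line number in the checked file
    std::string message;
};

// Hypothetical marker a programmer puts on a line to say:
// "the analyzer's message for this line is a false alarm".
const std::string kFalseAlarmMarker = "//-FALSE-ALARM";

// Keep only the warnings whose source line does not carry the marker.
std::vector<Warning> dropMarkedWarnings(const std::vector<std::string>& lines,
                                        const std::vector<Warning>& warnings) {
    std::vector<Warning> kept;
    for (const Warning& w : warnings) {
        bool inRange = w.line >= 1 &&
                       static_cast<std::size_t>(w.line) <= lines.size();
        bool marked = inRange &&
            lines[w.line - 1].find(kFalseAlarmMarker) != std::string::npos;
        if (!marked) kept.push_back(w);
    }
    return kept;
}

// Exit code for a daily build script: non-zero if any warning remains.
int buildExitCode(const std::vector<Warning>& remaining) {
    return remaining.empty() ? 0 : 1;
}
```

Because the marker lives in the source file, the decision "this message is a false alarm" is stored next to the code and is shared by the whole team through the version control system. The daily build then fails only on warnings that nobody has reviewed yet.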

The solution is cooperation between the company that wants to use static code analysis and the company that supplies the technology. That is, the relationship changes from "buy a tool and use it" to "buy a solution, implement it, and only then use it". Like it or not, in most cases simply buying an "analyzer program" and expecting to benefit from it will not work. The company has to tighten up its development process and, together with the static analysis vendor, build the tool into its ongoing, regular team development process.

Static analysis market leaders such as Coverity and Klocwork work this way. Incidentally, this has an outward sign that may not be entirely obvious: it is not easy to get any kind of trial version from these companies' websites, and you will not get an answer to "how much does it cost?" until the sales managers have learned as much as possible about the client.

Conclusion

If your company plans to use static code analysis, then consider the following:

The implementation of static code analysis affects the entire development process.
A static analyzer is not a small utility or an ordinary copy of Windows that you can buy and use without any interaction with the supplier. Always expect that you will need to work closely with the developers of the analyzer, and that introducing the tool takes time and effort.
A static analyzer raises the overall software development culture of a team, but only if the team itself is ready for that. It is a mutual process.
Raising the development culture through static code analyzers is expensive. You must be prepared for this and understand that it will require substantial investment.
