Static analysis of the HPX library with PVS-Studio
- Transfer
This article is a translation of Hartmut Kaiser's blog post about the experience of using PVS-Studio on HPX - C ++ library for distributed / parallel computing of any scale.
Once we already used the trial version of the PVS-Studio analyzer for HPX, and then it was remembered to us by its verbosity. Recently, there have been many articles about this utility, and, because A lot of time has passed since its use, we decided to contact the developers in order to find out if they are ready to support our open-source product. We were very happy when we received a license for 1 year in exchange for an article about the problems found.
After downloading PVS-Studio V5.26 , I had no problems installing it as an extension for Visual Studio 2013 Professional, although I would prefer to test it with the 2015RC1 version - my main development environment. Unfortunately, at the moment VS 2015 is not supported, but I hope that this will happen soon after the new release from Microsoft.
The integration of PVS-Studio into Visual Studio made a very good impression. One additional menu button provides access to all commands and options. All diagnostic messages are displayed in a special window from which you can directly go to the source code. You can also open a web-based context-sensitive help, explaining each of the diagnostics separately. All as I would like.
Messages have 3 different levels of importance (severity levels): high, medium and low, and, in turn, are grouped into 3 categories of analysis: general, optimization and analysis of 64-bit compatibility. The user interface allows you to limit the displayed diagnostics and filter the result. For the main HPX module, the analyzer generated about 70 messages from approximately 1000 .h and .cpp files (~ 140,000 lines of code), which was, in my opinion, not bad (high severity: 5, medium: 44, low: 21). The initial analysis on my laptop took about 10 minutes.
I wanted to see hidden bugs in HPX with impatience. Our team is very sensitive to the quality of the code, and each pool of requests is reviewed by at least one developer before the merge in the main branch. So I was sure that the analyzer could not find anything.
First, let's look at high-severity errors. Four of them were very similar and had a common nature:
This code deserializes a polymorphic object through a pointer to the base class. We know that raw_ptr (act) dynamically creates a new object, assigning its address to the passed pointer. We also know that in case of an error, the function throws an exception. All this could be visible to the static analyzer, since The entire serialization module in HPX is located in headers. The analyzer, apparently, did not see this and generated errors. Fortunately, you can explicitly tell PVS-Studio to ignore this error with just one click, which will add a magic comment like // - V522 - very convenient. Moreover, PVS-Studio gives many options for suppressing messages based on files or directories on patterns in file names — all options are easily accessible and well explained.
The second mistake alarmed me:
To understand what the analyzer wanted to say, it took me some time, because operator% from Boost. Format for me looked completely inconspicuous. However, even if operator% was not overloaded, and the code was indeed problematic, the error message would still be incomprehensible. I “solved” this problem in the same way - by suppressing the message.
The last “high severity” message was the result of the optimization:
And the analyzer was right, the hostname variable was not used at all in this context. None of the compilers, which we use for regular testing on more than 20 platforms, have not discovered this before - great remark!
Messages of medium and low importance were mainly related to benign problems about 32bit / 64bit compatibility, such as the implicit conversion of sign integers into large unsigned integers (for example, int32_t -> uint64_t).
But two errors turned out to be real bugs:
The message itself indicates a problem: once we changed the return type of the function from bool to int, but forgot to change one of the return expressions. In the future, this could create complex problems for reproduction.
Another mistake turned out to be more serious:
This is a really good point, especially because we added the constructor declaration as a workaround for older versions of gcc that incorrectly instantiated default implementations.
In the end, I was glad to spend time launching the analyzer for the entire HPX codebase. I was glad to see that no serious problems were found, as confirmation of our policy review, which we introduced a long time ago. I also decided to add PVS-Studio to our contiguous integration process that starts at every commit.
Static analysis definitely matters. Running utilities like PVS-Studio takes some time and effort, but, in my opinion, is the right investment of time and money. Compilers are not suitable for the deep analysis that PVS-Studio does, since otherwise, it would significantly increase the compile time.
One of the most useful opportunities for us was the easy integration into the CI process. Since we run our daily tests on various platforms (including Windows), the analysis results are available for developers working under other operating systems (Linux, Mac OS).
Another pleased option of PVS-Studio is to launch an analysis every time after a successful project build. Surprisingly, this does not add much overhead to the normal build process, and also gives feedback to subtle code problems in the very early stages of implementation. I left the option enabled to evaluate the utility of the tool in the process.
Analyzing the generated messages, I was surprised that, even though the utility has time to analyze as deeply as possible, PVS-Studio could not analyze the overloads of certain operators to determine the non-default semantics. An example with operator% from Boost.Format demonstrated this. I agree - this is indeed a very delicate moment, and I am not sure that it is possible to ensure correct diagnosis in this case. On the other hand, it is here, in an in-depth analysis of the semantics of the code, the tool must have all the power.
If you are interested in HPX, forknite the library with github .
Once we already used the trial version of the PVS-Studio analyzer for HPX, and then it was remembered to us by its verbosity. Recently, there have been many articles about this utility, and, because A lot of time has passed since its use, we decided to contact the developers in order to find out if they are ready to support our open-source product. We were very happy when we received a license for 1 year in exchange for an article about the problems found.
General impressions
After downloading PVS-Studio V5.26 , I had no problems installing it as an extension for Visual Studio 2013 Professional, although I would prefer to test it with the 2015RC1 version - my main development environment. Unfortunately, at the moment VS 2015 is not supported, but I hope that this will happen soon after the new release from Microsoft.
The integration of PVS-Studio into Visual Studio made a very good impression. One additional menu button provides access to all commands and options. All diagnostic messages are displayed in a special window from which you can directly go to the source code. You can also open a web-based context-sensitive help, explaining each of the diagnostics separately. All as I would like.
Messages have 3 different levels of importance (severity levels): high, medium and low, and, in turn, are grouped into 3 categories of analysis: general, optimization and analysis of 64-bit compatibility. The user interface allows you to limit the displayed diagnostics and filter the result. For the main HPX module, the analyzer generated about 70 messages from approximately 1000 .h and .cpp files (~ 140,000 lines of code), which was, in my opinion, not bad (high severity: 5, medium: 44, low: 21). The initial analysis on my laptop took about 10 minutes.
Error examples
I wanted to see hidden bugs in HPX with impatience. Our team is very sensitive to the quality of the code, and each pool of requests is reviewed by at least one developer before the merge in the main branch. So I was sure that the analyzer could not find anything.
First, let's look at high-severity errors. Four of them were very similar and had a common nature:
template <typename Archive>
voidload(Archive& ar){
actions::manage_object_action_base* act = 0;
ar >> hpx::serialization::detail::raw_ptr(act);
// V522: Dereferencing of the null pointer 'act' might take place.
HPX_ASSERT(act->is_valid());
// ...
}
This code deserializes a polymorphic object through a pointer to the base class. We know that raw_ptr (act) dynamically creates a new object, assigning its address to the passed pointer. We also know that in case of an error, the function throws an exception. All this could be visible to the static analyzer, since The entire serialization module in HPX is located in headers. The analyzer, apparently, did not see this and generated errors. Fortunately, you can explicitly tell PVS-Studio to ignore this error with just one click, which will add a magic comment like // - V522 - very convenient. Moreover, PVS-Studio gives many options for suppressing messages based on files or directories on patterns in file names — all options are easily accessible and well explained.
The second mistake alarmed me:
#define HPX_VERSION_MAJOR 0#define HPX_VERSION_MINOR 9#define HPX_VERSION_SUBMINOR 11std::stringfull_version_as_string(){
// V609 Mod by zero. Denominator ‘0’ == 0.return boost::str(
boost::format("%d.%d.%d") %
HPX_VERSION_MAJOR % HPX_VERSION_MINOR %
HPX_VERSION_SUBMINOR);
}
To understand what the analyzer wanted to say, it took me some time, because operator% from Boost. Format for me looked completely inconspicuous. However, even if operator% was not overloaded, and the code was indeed problematic, the error message would still be incomprehensible. I “solved” this problem in the same way - by suppressing the message.
The last “high severity” message was the result of the optimization:
// V808 'hostname' object of 'basic_string' type was created // but was not utilized.std::string hostname = boost::asio::ip::host_name();
And the analyzer was right, the hostname variable was not used at all in this context. None of the compilers, which we use for regular testing on more than 20 platforms, have not discovered this before - great remark!
Messages of medium and low importance were mainly related to benign problems about 32bit / 64bit compatibility, such as the implicit conversion of sign integers into large unsigned integers (for example, int32_t -> uint64_t).
But two errors turned out to be real bugs:
int runtime_support::load_components(util::section& ini)
{
// load all components as described in the configuration informationif (!ini.has_section("hpx.components")) {
// V601 The 'true' value is implicitly cast to the integer type.returntrue; // no components to load
}
// ...
}
The message itself indicates a problem: once we changed the return type of the function from bool to int, but forgot to change one of the return expressions. In the future, this could create complex problems for reproduction.
Another mistake turned out to be more serious:
structwhen_each_frame
{// ...private:
// V690 Copy constructor is declared as private in the // 'when_each_frame' class, but the default '=' operator // will still be generated by compiler. It is dangerous // to use such a class.
when_each_frame();
when_each_frame(when_each_frame const&);
public:
// ...
};
This is a really good point, especially because we added the constructor declaration as a workaround for older versions of gcc that incorrectly instantiated default implementations.
In the end, I was glad to spend time launching the analyzer for the entire HPX codebase. I was glad to see that no serious problems were found, as confirmation of our policy review, which we introduced a long time ago. I also decided to add PVS-Studio to our contiguous integration process that starts at every commit.
Conclusion
Static analysis definitely matters. Running utilities like PVS-Studio takes some time and effort, but, in my opinion, is the right investment of time and money. Compilers are not suitable for the deep analysis that PVS-Studio does, since otherwise, it would significantly increase the compile time.
One of the most useful opportunities for us was the easy integration into the CI process. Since we run our daily tests on various platforms (including Windows), the analysis results are available for developers working under other operating systems (Linux, Mac OS).
Another pleased option of PVS-Studio is to launch an analysis every time after a successful project build. Surprisingly, this does not add much overhead to the normal build process, and also gives feedback to subtle code problems in the very early stages of implementation. I left the option enabled to evaluate the utility of the tool in the process.
Analyzing the generated messages, I was surprised that, even though the utility has time to analyze as deeply as possible, PVS-Studio could not analyze the overloads of certain operators to determine the non-default semantics. An example with operator% from Boost.Format demonstrated this. I agree - this is indeed a very delicate moment, and I am not sure that it is possible to ensure correct diagnosis in this case. On the other hand, it is here, in an in-depth analysis of the semantics of the code, the tool must have all the power.
If you are interested in HPX, forknite the library with github .