
We go mushrooming after cppcheck

After heated discussions about the Big Calculator , I wanted to check something else out of the projects related to the research. The first thing that was found was an open project, OpenMS, associated with protein mass spectrometry. This project turned out to be written with a serious approach. In development, at least Cppcheck is used. So nothing sensational had to wait. However, there was an interest in what errors PVS-Studio could find after Cppcheck. Interested persons I invite you to continue reading the article.
There is an OpenMS project. Why is it needed, I do not presume to retell in my own words. Still blurt out stupidity. Just to quote a fragment of the description from Wikipedia:
OpenMS is an open-source project for data analysis and processing in protein mass spectrometryand is released under the 2-clause BSD license. OpenMS has tools for many common data analysis pipelines used in proteomics, providing algorithms for signal processing, feature finding (including de-isotoping), visualization in 1D (spectra or chromatogram level), 2D and 3D, map mapping and peptide identification. It supports label-free and isotopic-label based quantification (such as iTRAQ and TMT and SILAC). Furthermore, it also supports metabolomics workflows and DIA / SWATH targeted analysis.
Source: Wikipedia. OpenMS
The project is medium in size, but very complex. The size of the source code is more than 20 megabytes. Plus a large number of third-party libraries (Boost, Qt, Zlib, and so on). The project uses templates very actively. The source code of the project can beDownload from SourceForge .
When developing an OpenMS project, static code analysis is used. The presence of “cppcheck.cmake” and the following comments:
if (i != peptide.size()) // added for cppcheck
indicates that at least Cppcheck is used. Cpplint is also mentioned and there is a file called “cpplint.py”. Professional approach. Well done.
Now let's see what we can find using the PVS-Studio analyzer .
Note. For some reason, C ++ files carry the extension * .c. So do not be confused when you see an example of C ++ code located in the '* .c' file.
1. Shortcomings with OpenMP
I rarely come across projects that use OpenMP technology . In general, sometimes thoughts visit me to remove these diagnostics from the analyzer. Therefore, I was genuinely surprised when I saw warnings related to OpenMP. Over the past year, having checked dozens of projects, I have never seen a single warning on that topic. Wow, someone still uses this technology.
Among the warnings issued there are false positives, but there were warnings in the case.
DoubleReal ILPDCWrapper::compute(....) const
{
....
DoubleReal score = 0;
....
#pragma omp parallel for schedule(dynamic, 1)
for (SignedSize i = 0; i < (SignedSize)bins.size(); ++i)
{
score += computeSlice_(fm, pairs, bins[i].first,
bins[i].second, verbose_level);
}
return score;
}
Warning PVS-Studio: V1205 Data race risk. Unprotected concurrent operation with the 'score' variable. ilpdcwrapper.c 213 The
amount is considered incorrect. The variable 'score' is not protected from simultaneous use in different threads.
Other comments are not so critical, but still worth paying attention to. Inside parallel sections, all exceptions must be caught. An exception going beyond the parallel section is likely to lead to an abnormal termination of the program. You can read more about this topic in the notes: " OpenMP and exceptions ", " Handling exceptions inside parallel sections ".
An exception can be thrown explicitly using the throw statement. It can also occur when calling the new operator (std :: bad_alloc).
First option. The getTheoreticalmaxPosition () function may throw an exception.
Size getTheoreticalmaxPosition() const
{
if (!this->size())
{
throw Exception::Precondition(__FILE__, __LINE__,
__PRETTY_FUNCTION__,
"There must be at least one trace to ......");
}
....
}
virtual void run()
{
....
#pragma omp parallel for
for (SignedSize i = 0; i < (SignedSize)seeds.size(); ++i)
{
....
f.setMZ(
traces[traces.getTheoreticalmaxPosition()].getAvgMZ());
....
}
....
}
PVS-Studio Warning: V1301 The 'throw' keyword cannot be used outside of a try..catch block in a parallel section. featurefinderalgorithmpickedhelperstructs.h 199
Second option. Calling the 'new' operator may raise an exception.
TraceFitter* chooseTraceFitter_(double& tau)
{
// choose fitter
if (param_.getValue("feature:rt_shape") == "asymmetric")
{
LOG_DEBUG << "use asymmetric rt peak shape" << std::endl;
tau = -1.0;
return new EGHTraceFitter();
}
....
}
virtual void run()
{
....
#pragma omp parallel for
for (SignedSize i = 0; i < (SignedSize)seeds.size(); ++i)
{
....
TraceFitter* fitter = chooseTraceFitter_(egh_tau);
....
}
....
}
PVS-Studio Warning: V1302 The 'new' operator cannot be used outside of a try..catch block in a parallel section. featurefinderalgorithmpicked.h 1926
Other similar warnings:
- V1301 featurefinderalgorithmpicked.h 1261
- V1301 mzmlfile.h 114
- V1301 rawmssignalsimulation.c 598
- V1301 rawmssignalsimulation.c 1152
- V1301 chromatogramextractor.h 103
- V1301 chromatogramextractor.h 118
- V1302 featurefinderalgorithmpicked.h 1931
- V1302 rawmssignalsimulation.c 592
- V1302 rawmssignalsimulation.c 601
- V1302 openswathanalyzer.c 246
2. Typos
std::vector< std::pair > spectra_offsets;
std::vector< std::pair > chromatograms_offsets;
template
void MzMLHandler::writeFooter_(std::ostream& os)
{
....
int indexlists;
if (spectra_offsets.empty() && spectra_offsets.empty() )
{
indexlists = 0;
}
else if (!spectra_offsets.empty() && !spectra_offsets.empty() )
{
indexlists = 2;
}
else
{
indexlists = 1;
}
....
}
PVS-Studio
Warnings : V501 There are identical sub-expressions 'spectra_offsets.empty ()' to the left and to the right of the '&&' operator. mzmlhandler.h 5288
V501 There are identical sub-expressions '! spectra_offsets.empty ()' to the left and to the right of the '&&' operator. mzmlhandler.h 5292
Very strange checks. The container 'spectra_offsets' is checked twice. Most likely, this is a typo and two different containers should be checked: spectra_offsets and chromatograms_offsets.
template
void MzMLHandler::characters(
const XMLCh* const chars, const XMLSize_t)
{
....
if (optionalAttributeAsString_(data_processing_ref,
attributes,
s_data_processing_ref))
{
data_.back().meta.setDataProcessing(
processing_[data_processing_ref]);
}
else
{
data_.back().meta.setDataProcessing(
processing_[data_processing_ref]);
}
....
}
PVS-Studio Warning: V523 The 'then' statement is equivalent to the 'else' statement. mzmlhandler.h 534
If you look at similar code fragments, you can conclude that you had to use:
- processing_ [data_processing_ref]
- processing_ [default_processing_]
Many typos turned out to be related to the generation of exceptions. Mistakes are commonplace. The 'throw' keyword is forgotten. A temporary object is simply created and destroyed immediately. An example of such an error:
inline UInt asUInt_(const String & in)
{
UInt res = 0;
try
{
Int tmp = in.toInt();
if (tmp < 0)
{
Exception::ConversionError(
__FILE__, __LINE__, __PRETTY_FUNCTION__, "");
}
res = UInt(tmp);
}
catch (Exception::ConversionError)
{
error(LOAD,
String("UInt conversion error of \"") + in + "\"");
}
return res;
}
PVS-Studio Warning: V596 The object was created but it is not being used. The 'throw' keyword could be missing: throw ConversionError (FOO); xmlhandler.h 247
Identical typos here:
- inclusionexclusionlist.c 281
- inclusionexclusionlist.c 285
- precursorionselectionpreprocessing.c 257
- modificationsdb.c 419
- modificationsdb.c 442
- svmtheoreticalspectrumgeneratorset.c 103
- logconfighandler.c 285
- logconfighandler.c 315
- suffixarraytrypticcompressed.c 488
- tooldescription.c 147
- tofcalibration.c 147
And the last typo I noticed:
inline typename Value::Type const & operator*() {
tmp.i1 = *in.in1;
tmp.i2 = *in.in2;
tmp.i3 = *in.in2;
return tmp;
}
Warning PVS-Studio: V525 The code containing the collection of similar blocks. Check items 'in1', 'in2', 'in2' in lines 112, 113, 114. pipe_joiner.h 112
It should be written:
tmp.i1 = *in.in1;
tmp.i2 = *in.in2;
tmp.i3 = *in.in3;
3. A strange condition
CompressedInputSource::CompressedInputSource(
const String & file_path, const char * header,
MemoryManager * const manager)
: xercesc::InputSource(manager)
{
if (sizeof(header) / sizeof(char) > 1)
{
head_[0] = header[0];
head_[1] = header[1];
}
else
{
head_[0] = '\0';
head_[1] = '\0';
}
....
}
PVS-Studio warning: V514 Dividing sizeof a pointer 'sizeof (header)' by another value. There is a probability of logical error presence. compressedinputsource.c 52
If we divide the size of the pointer by the size of the byte, we always get a value greater than one. At least I don’t know fancy architecture where this is not so. So here is some kind of mistake.
A similar weird check is available here: compressedinputsource.c 104
4. Returning a link to a local object
template
inline Iter > const &
operator++(Iter > & me, int)
{
Iter > before = me;
goNext(me);
return before;
}
PVS-Studio Warning: V558 Function returns the reference to temporary local object: before. iter_concat_virtual.h 277
The function returns a reference to the temporary variable 'before'. When you exit the function, this variable will be destroyed. Using a reference to a destroyed object will lead to unpredictable consequences.
It seems to me that it was not necessary to create a new iterator, but to save the link:
Iter > &before = me;
[PS Wrong. See comments.]
A similar trouble exists with the '-' operator: iter_concat_virtual.h 310
5. Sloppy computing
typedef size_t Size;
typedef double DoubleReal;
void updateMeanEstimate(const DoubleReal & x_t,
DoubleReal & mean_t, Size t)
{
DoubleReal tmp(mean_t);
tmp = mean_t + (1 / (t + 1)) * (x_t - mean_t);
mean_t = tmp;
}
PVS-Studio warning: V636 The '1 / (t + 1)' expression was implicitly casted from 'int' type to 'double' type. Consider utilizing an explicit type cast to avoid the loss of a fractional part. An example: double A = (double) (X) / Y ;. masstracedetection.c 129
The expression "(1 / (t + 1))" is always zero or one. This is due to the fact that this expression is integer. Perhaps it was thought to get a completely different result different. I do not know the logic of the program, but it seems to me that I had in mind the following:
tmp = mean_t + (1.0 / (t + 1)) * (x_t - mean_t);
I also did not like that instead of the M_PI constant, explicit values are used, and not very accurate ones. This is certainly not a mistake, but not very good. Example:
bool PosteriorErrorProbabilityModel::fit(
std::vector & search_engine_scores)
{
....
incorrectly_assigned_fit_param_.A =
1 / sqrt(2 * 3.14159 *
pow(incorrectly_assigned_fit_param_.sigma, 2));
....
}
PVS-Studio Warning: V624 The constant 3.14159 is being utilized. The resulting value could be inaccurate. Consider using the M_PI constant from
Similarly:
- posteriorerrorprobabilitymodel.c 101
- posteriorerrorprobabilitymodel.c 110
- posteriorerrorprobabilitymodel.c 155
- posteriorerrorprobabilitymodel.c 162
6. Going abroad the array
static const Int CHANNELS_FOURPLEX[4][1];
static const Int CHANNELS_EIGHTPLEX[8][1];
ExitCodes main_(int, const char **)
{
....
if (itraq_type == ItraqQuantifier::FOURPLEX)
{
for (Size i = 0; i < 4; ++i)
{
std::vector > one_label;
one_label.push_back(std::make_pair(
String("Channel ") +
String(ItraqConstants::CHANNELS_FOURPLEX[i][0]),
DoubleReal(ItraqConstants::CHANNELS_FOURPLEX[i][0])));
labels.push_back(one_label);
}
}
else //ItraqQuantifier::EIGHTPLEX
{
for (Size i = 0; i < 8; ++i)
{
std::vector > one_label;
one_label.push_back(std::make_pair(
String("Channel ") +
String(ItraqConstants::CHANNELS_FOURPLEX[i][0]),
DoubleReal(ItraqConstants::CHANNELS_FOURPLEX[i][0])));
labels.push_back(one_label);
}
}
....
}
PVS-Studio Warning: V557 Array overrun is possible. The value of 'i' index could reach 7. itraqanalyzer.c 232
In fact, this error could be attributed to Copy-Paste errors. But let there be “going out of the array border”. That sounds more intimidating. Anyway, this division is very arbitrary. The same error can be classified in different ways.
Here, in the 'else' branch, you had to work with the arrays of 'CHANNELS_EIGHTPLEX'. This is evidenced by the comment:
else //ItraqQuantifier::EIGHTPLEX
However, the copied code fragment was not completely changed. As a result, the array CHANNELS_FOURPLEX, which has a smaller size, is used there.
A similar error here (another Copy-Paste): tmtanalyzer.c 225
Consider another example.
DoubleReal masse_[255]; ///< mass table
EdwardsLippertIterator::EdwardsLippertIterator(const
EdwardsLippertIterator & source) :
PepIterator(source),
f_file_(source.f_file_),
actual_pep_(source.actual_pep_),
spec_(source.spec_),
tol_(source.tol_),
is_at_end_(source.is_at_end_),
f_iterator_(source.f_iterator_),
f_entry_(source.f_entry_),
b_(source.b_),
e_(source.e_),
m_(source.m_),
massMax_(source.massMax_)
{
for (Size i = 0; i < 256; i++)
{
masse_[i] = source.masse_[i];
}
}
PVS-Studio Warning: V557 Array overrun is possible. The value of 'i' index could reach 255. edwardslippertiterator.c 134
The copy constructor does not work properly with the masse_ array. It consists of 255 elements. And 256 elements are copied.
The correct cycle:
for (Size i = 0; i < 255; i++)
{
masse_[i] = source.masse_[i];
}
Better yet, do not use magic constants at all.
7. Deprecated operator call 'new'
svm_problem * LibSVMEncoder::encodeLibSVMProblem(....)
{
....
node_vectors = new svm_node *[problem->l];
if (node_vectors == NULL)
{
delete[] problem->y;
delete problem;
return NULL;
}
....
}
PVS-Studio Warning: V668 There is no sense in testing the 'node_vectors' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. libsvmencoder.c 177
Checking “if (node_vectors == NULL)” does not make sense. If memory allocation fails, an exception will be thrown. As a result, the program will not behave as the programmer planned. For example, a memory leak will occur.
Similar obsolete checks:
- file_page.h 728
- libsvmencoder.c 160
Conclusion
I think it will be useful to use not only Cppcheck, Cpplint, but PVS-Studio. Especially if you do it regularly. I invite the developers of the project to write to us at support@viva64.com . For a while, we will allocate a key for a more detailed OpenMS test.