GGribkov March 20, 2019 at 17:13

Top 10 errors in C ++ projects for 2018

It's already been three months since 2018 is over. For many, it flew almost imperceptibly, but for us, the developers of PVS-Studio, it turned out to be very saturated. We worked hard, fearlessly fought for the advancement of static analysis to the masses and looked for new errors in open source projects written in C, C ++, C # and Java. The ten most interesting of them we have collected for you in this article!

We searched for interesting places using the PVS-Studio static code analyzer . It can detect errors and potential vulnerabilities in code written in the languages mentioned above.

If you are interested in looking for errors yourself, you can always download and try our analyzer. We provide a free version of the analyzer for students and enthusiastic programmers, a free license for developers of open-source projects, as well as a trial version for all-all-all. Who knows, maybe by next year you can make your top 10? :)

Note:I suggest that the reader check himself and, before looking at the analyzer warning, try to identify the anomalies himself. How many mistakes can you find?

Tenth place

Source: And again into space: how the Stellarium unicorn visited.

This error was discovered while checking the Stellarium virtual planetarium.

The above code snippet, although small, is fraught with a rather tricky error:

Plane::Plane(Vec3f &v1, Vec3f &v2, Vec3f &v3)
  : distance(0.0f), sDistance(0.0f)
{
  Plane(v1, v2, v3, SPolygon::CCW);
}

Found it?

PVS-Studio Warning: V603 The object was created but it is not being used. If you wish to call constructor, 'this-> Plane :: Plane (....)' should be used. Plane.cpp 29

The author of the code wanted to initialize part of the fields of the object using another constructor embedded in the main one. True, instead, he only managed to create a temporary object that would be destroyed when leaving his area of visibility. Thus, several fields of the object will remain uninitialized.

Instead of a nested constructor call, you should use the delegating constructor introduced in C ++ 11. For example, you could do this:

Plane::Plane(Vec3f& v1, Vec3f& v2, Vec3f& v3)
  : Plane(v1, v2, v3, SPolygon::CCW)
{
  distance = 0.0f;
  sDistance = 0.0f;
}

Then all the required fields would be correctly initialized. Isn't it wonderful?

Ninth place

Source: Perl 5: how errors were hidden in macros.

In ninth place, there is a remarkable macro from the Perl 5 source code.

Collecting errors for writing an article, my colleague Svyatoslav came across a warning issued by the analyzer on using a macro. Here it is:

PP(pp_match)
{
  ....
  MgBYTEPOS_set(mg, TARG, truebase, RXp_OFFS(prog)[0].end);
  ....
}

To find out what was the matter, Svyatoslav dug deeper. He opened the definition of a macro, and saw that it contained several more nested macros, some of which also had nested macros. It was so hard to figure it out that I had to use a preprocessed file. But, alas, this did not help. In place of the previous line of code, Svyatoslav discovered this:

(((targ)->sv_flags & 0x00000400) && (!((targ)->sv_flags & 0x00200000) ||
S_sv_only_taint_gmagic(targ)) ? (mg)->mg_len = ((prog->offs)[0].end),
(mg)->mg_flags |= 0x40 : ((mg)->mg_len = (((targ)->sv_flags & 0x20000000)
&& !__builtin_expect(((((PL_curcop)->cop_hints + 0) & 0x00000008) ?
(_Bool)1 :(_Bool)0),(0))) ? (ssize_t)Perl_utf8_length( (U8 *)(truebase),
(U8 *)(truebase)+((prog->offs)[0].end)) : (ssize_t)((prog->offs)[0].end),
(mg)->mg_flags &= ~0x40));

Warning PVS-Studio: V502 Perhaps the '?:' Operator works in a different way than it was expected. The '?:' Operator has a lower priority than the '&&' operator. pp_hot.c 3036

I think it will be difficult to find such an error with my eyes. Honestly, we meditated on this code for a long time and came to the conclusion that in fact there is no mistake here. But in any case, this is a rather entertaining example of unreadable code.

Macros are said to be evil. Of course, there are times when they turn out to be indispensable, but if you can replace the macro with a function, you should definitely do it.

Nested macros are especially fraught. Not only because they are difficult to understand, but also because they can give unpredictable results. If the author of a macro accidentally makes a mistake in such a macro, it will be much more difficult to find it than in a function.

Eighth place

Source: Chromium: Other Errors

The following example was taken from a series of articles about analyzing the Chromium project. She covered herself in the WebRTC library.

std::vector
StereoDecoderFactory::GetSupportedFormats() const
{
  std::vector formats = ....;
  for (const auto& format : formats) {
    if (cricket::CodecNamesEq(....)) {
      ....
      formats.push_back(stereo_format);
    }
  }
  return formats;
}

PVS-Studio Warning: V789 CWE-672 Iterators for the 'formats' container, used in the range-based for loop, become invalid upon the call of the 'push_back' function. stereocodecfactory.cc 89 The

error is that the size of the formats vector changes inside the range-based-for loop. Range-based loops are based on iterators, so resizing the container inside such loops can lead to invalidation of these iterators.

This error will persist if you rewrite the loop using explicit iterators. Therefore, for clarity, you can bring this code:

for (auto format = begin(formats), __end = end(formats); 
     format != __end; ++format) {
  if (cricket::CodecNamesEq(....)) {
    ....
    formats.push_back(stereo_format);
  }
}

For example, when using the push_back method, a vector may be released - and then iterators will point to an invalid memory area.

To avoid such errors, you should adhere to the rule: never resize the container inside the loop, the conditions of which are tied to this container. This applies to range-based loops and loops using iterators. You can read about what operations can lead to invalidation of iterators in discussions on StackOverflow.

Seventh place

Source: Godot : Regarding the Use of Static Code Analyzers Regularly

The first example from the video game industry will be a snippet of code that we found in the Godot game engine. You may have to sweat to discover the error with your eyes, but I'm sure our sophisticated readers can handle it:

void AnimationNodeBlendSpace1D::add_blend_point(
  const Ref &p_node, float p_position, int p_at_index)
{
  ERR_FAIL_COND(blend_points_used >= MAX_BLEND_POINTS);
  ERR_FAIL_COND(p_node.is_null());
  ERR_FAIL_COND(p_at_index < -1 || p_at_index > blend_points_used);
  if (p_at_index == -1 || p_at_index == blend_points_used) {
    p_at_index = blend_points_used;
  } else {
    for (int i = blend_points_used - 1; i > p_at_index; i++) {
      blend_points[i] = blend_points[i - 1];
    }
  }
  ....
}

PVS-Studio Warning: V621 CWE-835 Consider inspecting the 'for' operator. It's possible that the loop will be executed incorrectly or won't be executed at all. animation_blend_space_1d.cpp 113

Consider the loop condition in more detail. The counter variable is initialized with the value blend_points_used - 1 . At the same time, based on the previous two checks (in ERR_FAIL_COND and in if ), it becomes clear that at the time the loop is executed, blend_points_used will always be greater than p_at_index . Thus, either the loop condition will always be true, or the loop will not be executed at all.

If blend_points_used - 1 == p_at_indexthen the loop is not executed.

In all other cases, the check i> p_at_index will always be true, since the counter i increases at each iteration of the loop.

It may seem that the cycle will run forever, but it is not.

First, there will be an integer overflow of the variable i , which is undefined behavior. Therefore, rely on this is not worth it.

If i were of type unsigned int , then after the counter reaches the maximum possible value, the i ++ operator would turn it into 0. This behavior is defined by the standard and is called "Unsigned wrapping." However, you should be aware that using such a mechanism is also not a good idea .

It was in the first place, but after all there is still a second! The fact is that it will not even reach the integer overflow. Where before the array goes abroad. This means that an attempt will be made to access the memory area outside the block allocated for the array. And this, too, is vague behavior. A classic example :)

To make it easier to avoid such errors, I can give only a couple of recommendations:

Write simpler, more intuitive code
Do a more thorough Code Review and write more tests for freshly written code
Use static analyzers;)

Sixth place

Source: Amazon Lumberyard: the cry of the soul

Another example from the gamedev industry, namely, from the source code for the Amazon Lumberyard AAA engine.

void TranslateVariableNameByOperandType(....)
{
  //  Igor: yet another Qualcomm's special case
  //  GLSL compiler thinks that -2147483648 is
  //  an integer overflow which is not
  if (*((int*)(&psOperand->afImmediates[0])) == 2147483648)
  {
    bformata(glsl, "-2147483647-1");
  }
  else
  {
    //  Igor: this is expected to fix
    //  paranoid compiler checks such as Qualcomm's
    if (*((unsigned int*)(&psOperand->afImmediates[0])) >= 2147483648)
    {
      bformata(glsl, "%d",
          *((int*)(&psOperand->afImmediates[0])));
    }
    else
    {
      bformata(glsl, "%d",
          *((int*)(&psOperand->afImmediates[0])));
    }
  }
  bcatcstr(glsl, ")");
  ....
}

PVS-Studio Warning: V523 The 'then' statement is equivalent to the 'else' statement. toglsloperand.c 700

Amazon Lumberyard is being developed as a cross-platform engine. Therefore, developers are trying to support as many compilers as possible. The programmer Igor ran into the Qualcomm compiler, as we are told by the comments.

It is not known whether Igor was able to complete his task and cope with the “paranoid” checks of the compiler, but he left behind a very strange code. It is strange that both then - and else- branches of the if statement contain absolutely identical code. Most likely, such an error was made as a result of a sloppy Copy-Paste.

I don’t even know what can be advised here. Therefore, I just wish the Amazon Lumberyard developers success in fixing bugs, and good luck to programmer Igor!

Fifth place

Source: Once again, the PVS-Studio analyzer turned out to be more attentive to humans

. An interesting story happened with the following example. My colleague Andrei Karpov was preparing an article about the next test of the Qt framework. In the course of writing noteworthy errors, he encountered a warning from the analyzer, which he considered false. Here is the relevant code snippet and warning:

QWindowsCursor::CursorState QWindowsCursor::cursorState()
{
  enum { cursorShowing = 0x1, cursorSuppressed = 0x2 };
  CURSORINFO cursorInfo;
  cursorInfo.cbSize = sizeof(CURSORINFO);
  if (GetCursorInfo(&cursorInfo)) {
    if (cursorInfo.flags & CursorShowing)   // <= V616
  ....
}

PVS-Studio Warning: V616 CWE-480 The 'CursorShowing' named constant with the value of 0 is used in the bitwise operation. qwindowscursor.cpp 669

That is, PVS-Studio swore at a place where, obviously, there was no error! It can not be constant to CursorShowing equal to 0 , because just above the pair of rows is initialized to 1 .

Since an unstable version of the analyzer was used for verification, Andrei doubted the correctness of the warning. He carefully examined this section of code several times, and still did not find an error. As a result, he wrote this false positive to the bugtracker so that other colleagues could correct the situation.

And only with a detailed analysis it became clear that PVS-Studio was again more attentive than a person. The value 0x1 is assigned to the named constant cursorShowing , and the bit constant operation “and” involves the named constant CursorShowing . These are completely different constants, because the first begins with a lowercase letter, and the second with a capital letter.

The code compiles successfully, because the QWindowsCursor class really contains a constant with that name. Here is her definition:

class QWindowsCursor : public QPlatformCursor
{
public:
  enum CursorState {
    CursorShowing,
    CursorHidden,
    CursorSuppressed
  };
  ....
}

If you do not explicitly assign a named enum constant, it will be initialized by default. Since CursorShowing is the first element of an enumeration, it will be set to 0 .

In order to prevent such mistakes, you should not give entities too similar names. You should especially carefully follow this rule if these entities are of the same type or can be implicitly cast to each other. Indeed, in such cases it will be practically impossible to detect an error by eye, and the incorrect code will be successfully compiled and live happily inside your project.

Fourth place

Source: We shoot in the foot, processing the input data.

We are approaching the top three finalists, and the error from the FreeSWITCH project is next in turn.

static const char *basic_gets(int *cnt)
{
  ....
  int c = getchar();
  if (c < 0) {
    if (fgets(command_buf, sizeof(command_buf) - 1, stdin) 
          != command_buf) {
      break;
    }
    command_buf[strlen(command_buf)-1] = '\0'; /* remove endline */
    break;
  }
  ....
}

PVS-Studio Warning: V1010 CWE-20 Unchecked tainted data is used in index: 'strlen (command_buf)'.

The analyzer warns that the expression strlen (command_buf) - 1 uses unverified data. And really: if command_buf turns out to be empty from the point of view of the C language string (containing a single character - '\ 0') then strlen (command_buf) will return 0 . In this case, command_buf [-1] will be called , which represents undefined behavior. Trouble!

The very juice of this error is not even why it happens, but howshe is happening. This error is one of those pleasant examples that you can "touch" on your own, reproduce. You can start FreeSwitch, perform some actions that will lead to the execution of the above code section, and pass the program an empty line for input.

As a result, with a flick of the wrist, a working program turns (no, not elegant shorts ) into a non-working one! Details on how to reproduce this error can be found in the source article at the link above, but for now I will give a clear result:

Remember that the input can be anything, and you should always check it. Then the analyzer will not swear, and the program will be more reliable.

Now it's time to deal with our winners: we are moving to the finals!

Third place

Source: NCBI Genome Workbench: Research at Risk

The top three winners are opened by a piece of code from the NCBI Genome Workbench project - a set of tools for studying and analyzing genetic data. Although it is not necessary to be a genetically modified superman to find a mistake here, quite a few are aware of this possibility.

/**
 * Crypt a given password using schema required for NTLMv1 authentication
 * @param passwd clear text domain password
 * @param challenge challenge data given by server
 * @param flags NTLM flags from server side
 * @param answer buffer where to store crypted password
 */
void
tds_answer_challenge(....)
{
  ....
  if (ntlm_v == 1) {
    ....
    /* with security is best be pedantic */
    memset(hash, 0, sizeof(hash));
    memset(passwd_buf, 0, sizeof(passwd_buf));
    ...
  } else {
    ....
  }
}

PVS-Studio warnings:

V597 The compiler could delete the 'memset' function call, which is used to flush 'hash' buffer. The memset_s () function should be used to erase the private data. challenge.c 365
V597 The compiler could delete the 'memset' function call, which is used to flush 'passwd_buf' buffer. The memset_s () function should be used to erase the private data. challenge.c 366

Managed to find a mistake? If so, then you - well done! .. well, or still a genetically modified superman.

The fact is that modern optimizing compilers can do a lot to make the assembled program work faster. In particular, compilers are able to track that the buffer passed to memset is not used anywhere else.

In this case, they can delete the “unnecessary” memset call , and have every right to do so. Then a buffer that stores important data can remain in memory to the delight of attackers.

Against this background, the literate commentary “with security is good to be pedantic” looks even funnier. Judging by the very small number of warnings issued for this project, the developers tried very hard to be careful and write safe code. However, as we can see, skipping this security flaw is very simple. According to Common Weakness Enumeration, a defect is classified as CWE-14 : Compiler Removal of Code to Clear Buffers.

To clear the memory safely, use the memset_s () function . It is not only safer than memset () , but also cannot be "ignored" by the compiler.

Second place

Source: How PVS-Studio turned out to be more attentive than three and a half programmers of the

Silver medalist of this top were sent to us by one of our clients. He was sure that the analyzer gave false warnings.

Eugene received the letter, scanned it briefly, and sent it to Svyatoslav. Svyatoslav thoughtfully looked at the section of code sent by the client, and thought, “Can the analyzer be so blatantly wrong?” Therefore, he went for a consultation with Andrei. He also checked the site and decided: indeed, the analyzer gives false positives.

What can you do, you need to fix it. And only when Svyatoslav began to make synthetic examples to formalize the task as a bugtracker, he realized what was happening.

Errors were indeed present in the code, but not one of the programmers could detect them. Honestly, the author of this article did not succeed either.

And this despite the fact that the analyzer clearly issued warnings for the wrong places!

Can you find such a crafty mistake? Check yourself for vigilance and attentiveness.

PVS-Studio warning:

V560 A part of conditional expression is always false: (ch> = 0x0FF21). decodew.cpp 525
V560 A part of conditional expression is always true: (ch <= 0x0FF3A). decodew.cpp 525
V560 A part of conditional expression is always false: (ch> = 0x0FF41). decodew.cpp 525
V560 A part of conditional expression is always true: (ch <= 0x0FF5A). decodew.cpp 525

If you succeed - you won’t hold my respect!

The error lies in the fact that the logical negation operator (!) Does not apply to the whole condition, but only to its first subexpression:

!((ch >= 0x0FF10) && (ch <= 0x0FF19))

If this condition is met, then the value of the variable ch lies in the interval [0x0FF10 ... 0x0FF19]. Thus, the four further comparisons no longer make sense: they will always be either true or false.

To avoid such errors, it is worth following a few rules. First, it’s very convenient and clear to align the code with a table. Secondly, do not overload expressions with parentheses. For example, this code can be rewritten like this:

const bool isLetterOrDigit =    (ch >= 0x0FF10 && ch <= 0x0FF19)  // 0..9
                             || (ch >= 0x0FF21 && ch <= 0x0FF3A)  // A..Z
                             || (ch >= 0x0FF41 && ch <= 0x0FF5A); // a..z 
if (!isLetterOrDigit)

Then, firstly, the number of brackets becomes much smaller, and secondly, the probability of “catching" a mistake made by the eyes increases.

And now - cherry: we move to the first place!

First place

Source: System in shock: interesting errors in the source codes of the legendary System Shock

So, the error from the legendary System Shock becomes the finalist of our today's top! This game, released in 1994, became the progenitor and inspirer of such iconic games as Dead Space, BioShock and Deus Ex.

But first, I have to admit something. What I will show you now does not contain any error. By and large, it is not even a piece of code, but I just could not resist not to share this with you!

The fact is that in the process of analyzing the source code of the game, my colleague Victoria found many interesting comments. Here and there, suddenly there were playful and ironic remarks, and even verses:


// I'll give you fish, I'll give you candy, 
// I'll give you, everything I have in my hand
// that kid from the wrong side came over my house again,
// decapitated all my dolls
// and if you bore me, you lose your soul to me 
// - "Gepetto", Belly, _Star_
//  And here, ladies and gentlemen, 
// is a celebration of C and C++ and their untamed passion...
//  ==================
TerrainData  terrain_info;
//  Now the actual stuff...
//  =======================
// this is all outrageously horrible, as we dont know what
// we really need to deal with here
// And if you thought the hack for papers was bad,
// wait until you see the one for datas... - X
// Returns whether or not in the humble opinion of the
// sound system, the sample should be politely obliterated 
// out of existence
// it's a wonderful world, with a lot of strange men
// who are standing around, and they all wearing towels

For our Russian-speaking readers, I made an approximate free translation:


// Я дам тебе рыбку, я дам тебе конфетку, 
// Я дам тебе все, что есть у меня в руках 
// этот мальчик с противоположной стороны 
// снова пришел ко мне домой
// обезглавил всех моих кукол
// и если ты мне надоешь, ты потеряешь душу для меня 
// - "Gepetto", Belly, _Star_
//  И вот, дамы и господа,
// перед нами триумф C и C++ и их неукрощенной страсти
//  ==================
TerrainData  terrain_info;
//  А теперь к делу...
//  =======================
// это всё невыразимо ужасающе, так как мы не знаем,
// с чем нам действительно нужно иметь здесь дело
// И если вы думали, что что хак для работы с бумагами был плох
// это вы еще не видели тот, что используется для работы с данными... - X
// Возвращает скромное мнение звуковой системы о том, 
// должен ли семпл быть любезно вычеркнут из существования
// это чудный мир, с кучей странных мужчин
// что повсюду стоят, и они в полотенцах

These comments were left by the developers of the game in the early nineties ... By the way, Doug Church - the chief designer of System Shock - was also writing code. Who knows, maybe any of these comments were written by him personally? I hope that about men in towels - this is not his work :)

Conclusion

In conclusion, I want to thank my colleagues for looking for new errors and writing articles about them. Thanks guys! Without you, this article would not have turned out so interesting.

I also want to talk a little about our achievements, because for a whole year we have been engaged in more than just searching for mistakes. We also developed and improved the analyzer, as a result of which it underwent significant changes.

For example, we added support for several new compilers and expanded the list of diagnostic rules. We also provided initial support for the MISRA C and MISRA C ++ standards . The most important and time-consuming innovation was the support of a new language. Yes, now we can analyze Java code ! And we also updated the icon:)

Also I want to thank our readers. Thank you for reading our articles and writing to us! Your feedback is very pleasant and important to us.

On this, our top 10 C ++ errors for the year 2018 came to an end. Which places did you like the most and why? Did you come across interesting examples in 2018? Tell us about it in the comments!

Until next time!

If you want to share this article with an English-speaking audience, then please use the link to the translation: George Gribkov. Top 10 bugs of C ++ projects found in 2018

Tags: