Chromium: sixth project check and 250 bugs

    This text begins a series of articles devoted to the next test of the Chromium project using the PVS-Studio static code analyzer. The articles will examine various error patterns and make recommendations to reduce the likelihood of their occurrence in the code. However, before you begin, you need to write a kind of introduction that will answer a number of questions in advance, and also provide the Chromium developers with all the bugs that they may begin to correct without waiting for the end of the series of articles.

    Andrey Karpov, PVS-Studio

    Background


    My name is Andrey Karpov, and I am an evangelist of static analysis in general and PVS-Studio static analysis tool in particular. However, the term “technical evangelist” is already becoming obsolete, and “developer advocate” is replacing it.

    I spend a lot of time writing material on improving the quality of code and improving the reliability of programs. Now I have a new reason to write several articles on this topic - checking the open Chromium project using the PVS-Studio analyzer. This is a large project, and bugs of various kinds live in any large project. This diversity allows you to immediately consider several interesting topics related to the causes of these bugs and how to prevent them.

    It is worth noting that this is not the first article on the Chromium project. My previous posts:


    As you can see, with interesting article titles I had a bad time, and my strength ran out. Therefore, further my colleagues picked up the baton:


    By the way, while I was studying a fresh report, I could not resist and wrote a short note about one mistake I liked very much. Since the article has already been published, I will provide a link here to it:


    Each time we checked the project, it contained a large number of errors. A new check was no exception. Moreover, as the PVS-Studio analyzer increasingly detects errors, I simply at first did not know what to do with all of them. Quickly reviewing the report, I wrote out about 250 errors and thought about it. Describe all 250 errors in one article? It will be some kind of horror: long, boring and uninteresting. Break up the description of several articles? It won’t get any better either, instead of one boring article several will appear.

    Then I decided to break the errors found by type and consider them separately. Moreover, I decided not just to describe the errors, but to try to suggest some methods of dealing with them in addition to static code analysis. After all, it is much better not to make a mistake at all than to find it later using static / dynamic code analysis / or somehow else. Even worse if the user finds errors. Therefore, if it is possible to improve the coding style so that the appearance of an error becomes less likely, then it is worth speculating on this topic. This is what I will do in a series of articles.

    Before I start looking at error patterns, I need the introduction you are reading. For example, I need to explain why I did not find the strength to carefully study the report, why I can not say about the percentage of false positives and where you can immediately see all the errors I noticed.

    Project Verification


    At the very end of 2017, my colleague Svyatoslav Razmyslov downloaded the source codes of the Chromium project, somehow conjured them and gave me the generated project for Visual Studio and the PVS-Studio report. Unfortunately, working with the solution in Visual Studio was not possible. Wednesday did not survive the solution, which includes 5021 projects.

    5021 projects in solution

    Everything is incredibly slow, and in general the environment after a while falls. Therefore, I studied the report using PVS-Studio Standalone. This, of course, is not as convenient as using the usual Visual Studio environment, but it is quite acceptable.

    You need to understand that the Chromium project is a big project. Not even said that. This is a BIG PROJECT.

    The Chromium project and the libraries used in it consist of 114 201 C and C ++ files. The number of lines of code is 30,263,757. Of these, comments make up 16%.

    Just that PVS-Studio can verify a project of this size is already an achievement :).

    What i found


    While the New Year holidays were going on, for three nights I looked through the report and wrote out about 250 code fragments, which, in my opinion, require attention and editing. I admit that I did not find the time and energy to review the report carefully. I looked through many warnings very quickly, and ignored some at all when I was bored with some kind of error. I will tell you more about this in the next chapter.

    It is important that I found a lot of mistakes that are enough to write several articles. By the time I finish publishing the last line, the information about the errors found in the project may be a little outdated. But it does not matter. My goal is to demonstrate the capabilities of the static code analysis methodology and share with readers some recommendations on the coding style.

    So that the developers of Chromium and libraries could correct the errors I noticed, without waiting for the end of the series of articles, I wrote them out in a separate file. This should still be done for the reason that perhaps not all error messages will be included in the articles.

    Link to the file with the description of the noticed defects: chromium.txt .

    Why didn’t I manage to look through the report carefully?


    I did not tune the analyzer to reduce the number of false positives. Therefore, false warnings prevented me from looking at the report, and often I missed messages of the same type without peering at them.

    Moreover, I skipped code fragments where it is not immediately clear whether there is an error or not. There are a lot of messages, but I'm alone. If I began to carefully peer into the code, I would only write articles in a few months.

    Let me show you with examples why some warnings are hard to understand, especially if this is unfamiliar code. And in Chromium I DON'T know the WHOLE code.

    So, the PVS-Studio analyzer issued a warning to one of the files of the V8 project:

    V547 CWE-570 Expression 'truncated' is always false. objects.cc 2867

    Error found, or is it a false positive? Try to figure out what is the matter here. I added the comment "// <=" to where the analyzer points.

    void String::StringShortPrint(StringStream* accumulator,
                                  bool show_details) {
      int len = length();
      if (len > kMaxShortPrintLength) {
        accumulator->Add("", len);
        return;
      }
      if (!LooksValid()) {
        accumulator->Add("");
        return;
      }
      StringCharacterStream stream(this);
      bool truncated = false;
      if (len > kMaxShortPrintLength) {
        len = kMaxShortPrintLength;
        truncated = true;
      }
      bool one_byte = true;
      for (int i = 0; i < len; i++) {
        uint16_t c = stream.GetNext();
        if (c < 32 || c >= 127) {
          one_byte = false;
        }
      }
      stream.Reset(this);
      if (one_byte) {
        if (show_details)
          accumulator->Add("Put(static_cast(stream.GetNext()));
        }
        if (show_details) accumulator->Put('>');
      } else {
        // Backslash indicates that the string contains control
        // characters and that backslashes are therefore escaped.
        if (show_details)
          accumulator->Add("Add("\\n");
          } else if (c == '\r') {
            accumulator->Add("\\r");
          } else if (c == '\\') {
            accumulator->Add("\\\\");
          } else if (c < 32 || c > 126) {
            accumulator->Add("\\x%02x", c);
          } else {
            accumulator->Put(static_cast(c));
          }
        }
        if (truncated) {                      // <=
          accumulator->Put('.');
          accumulator->Put('.');
          accumulator->Put('.');
        }
        if (show_details) accumulator->Put('>');
      }
      return;
    }

    Figured out? Difficult?

    Difficult! And that is precisely why I alone cannot study all the analyzer warnings.

    For those who are too lazy to understand, I will explain what the point is.

    So, the analyzer claims that the condition if (truncated) is always false. Let's shorten the function, leaving the gist:

    void F() {
      int len = length();
      if (len > kMaxShortPrintLength)
        return;
      bool truncated = false;
      if (len > kMaxShortPrintLength)
        truncated = true;
      if (truncated) {                      // <=
        accumulator->Put('.');
        accumulator->Put('.');
        accumulator->Put('.');
      }
    }

    The truncated flag must be true if the text is too long, that is, if the if condition is satisfied (len> kMaxShortPrintLength) . However, if the text is too long, then the function exits above.

    That is why truncated is always false and three points at the end will not be added.

    And even now, having found out the reason why the analyzer gives a warning, I don’t know how the code should be written. Or, really, you need to immediately exit the function, then the code that adds the points is just superfluous. Or points are needed, and you should remove the first check, which terminates the function ahead of schedule. It is very, very difficult to study errors in third-party code.

    The PVS-Studio analyzer generated a lot of warnings by V547. I only looked somewhere in their 10th part. Therefore, if you undertake to study carefully, much more errors will be found than I wrote out.

    Here is another example of why I was bored with these warnings.

    void ResourcePrefetcher::OnReadCompleted(net::URLRequest* request,
                                             int bytes_read) {
      DCHECK_NE(net::ERR_IO_PENDING, bytes_read);
      if (bytes_read <= 0) {
        FinishRequest(request);
        return;
      }
      if (bytes_read > 0)
        ReadFullResponse(request);
    }

    PVS-Studio warning: V547 CWE-571 Expression 'bytes_read> 0' is always true. resource_prefetcher.cc 308

    Unlike the previous case, everything is simple here. The analyzer is exactly right in stating that the second condition is always true.

    However, this is not an error, but redundant code. Should I edit such a code? Complex issue. This, by the way, is the reason why it is much better to immediately write code under the supervision of the analyzer, and not to heroically wade through warnings with one-time launches.

    If the analyzer were used regularly, then such redundant code would most likely not even get into the version control system. A programmer would see a warning and write more elegantly. For example, like this:

    void ResourcePrefetcher::OnReadCompleted(net::URLRequest* request,
                                             int bytes_read) {
      DCHECK_NE(net::ERR_IO_PENDING, bytes_read);
      if (bytes_read <= 0)
        FinishRequest(request);
      else
        ReadFullResponse(request);
    }

    The analyzer is silent here. At the same time, the code has become shorter, simpler and more understandable.

    In addition to the V547 , the analyzer generated a whole mountain of V560 messages . This warning indicates that it is always true or false that not all the condition, but some part of it.

    I was bored to study these messages too. This does not mean that the warnings of the V560 are bad. But real serious mistakes are rare. Basically, these warnings indicate poor quality redundant code.

    An example of a boring redundant check:

    template 
    std::unique_ptr>
    DeclarativeRule::Create(....) {
      ....
      bool bad_message = false;                                 // <=
      std::unique_ptr actions = ActionSet::Create(
          browser_context, extension, rule->actions, error,
          &bad_message);                                        // <=
      if (bad_message) {                                        // <=
        *error = "An action of a rule set had an invalid "
                 "structure that should have been caught "
                 "by the JSON validator.";
        return std::move(error_result);
      }
      if (!error->empty() || bad_message)                       // <=
        return std::move(error_result);
      ....
    }

    PVS-Studio Warning: V560 CWE-570 A part of conditional expression is always false: bad_message. declarative_rule.h 472

    Condition:

    if (!error->empty() || bad_message)

    can be simplified to:

    if (!error->empty())

    There is also an option to rewrite the code like this:

      if (bad_message) {
        *error = "An action of a rule set had an invalid "
                 "structure that should have been caught "
                 "by the JSON validator.";
      }
      if (!error->empty() || bad_message)
        return std::move(error_result);

    I hope I could explain why I did not study the report carefully. This is a lot of time consuming work.

    False positive rate


    I can’t say what is the percentage of false positives. Firstly, I couldn’t even view the log to the end and I don’t know the exact number of errors detected by PVS-Studio. Secondly, it makes no sense to talk about the percentage of false positives without first setting up the analyzer.

    If you configure the PVS-Studio analyzer, you can expect 10-15% of false positives. An example of such a configuration is described in the article “ PVS-Studio analyzer specifications using the EFL Core Libraries example, 10-15% of false positives ”.

    You can, of course, make such a setup for Chromium, but it is irrational to do this only to demonstrate some numbers in the article. This is a lot of work that we are ready to do, but for a fee. Google may well attract our team to configure the analyzer and, at the same time, to correct all errors found. Yes, consider this a hint .

    Customization is definitely possible and will give a good result. For example, about HALF of all false positives is associated with the use of the DCHECK macro in the code.

    Here's what this macro looks like:

    #define LAZY_STREAM(stream, condition)                            \
    !(condition) ? (void) 0 : ::logging::LogMessageVoidify() & (stream)
    #define DCHECK(condition)                                         \
     LAZY_STREAM(LOG_STREAM(DCHECK), !ANALYZER_ASSUME_TRUE(condition))\
       << "Check failed: " #condition ". "

    From the analyzer’s point of view, PVS-Studio is just some kind of conditional check and a set of actions, after which the rest of the code continues to execute.

    As a result, the analyzer generates false warnings, for example, for such a code:

    bool Value::Equals(const Value* other) const {
      DCHECK(other);
      return *this == *other;
    }

    PVS-Studio reports: V1004 CWE-476 The 'other' pointer was used unsafely after it was verified against nullptr. Check lines: 621, 622. values.cc 622

    From the point of view of the analyzer, the other pointer is checked for equality nullptr . But regardless of whether other is a null pointer or not, then dereferencing will occur. The analyzer considers such actions to be suspicious.

    The DCHECK macro is a type of assert macro. But if the analyzer knows about assert , then what DCHECK is - it does not understand. To better explain what is happening, I will write a pseudocode:

    bool Equals(T* ptr) const
    {
      if (!ptr)
        LogMessage();
      return *this == *ptr;
    }

    This is how PVS-Studio analyzer sees code. First, the pointer is checked for nullptr equality . If the pointer is null, the LogMessage function is called . However, the function is not marked as not returning control. This means that regardless of whether ptr is a null pointer or not, the function will continue to execute.

    Next, the pointer is dereferenced. But there was a check, where he was checked for equality to zero! Therefore, the pointer may be null and the analyzer signals a problem in the code. This is how the analyzer generates a lot of completely correct, but useless messages.

    By the way, such a macro implementation is confusing not only to PVS-Studio. Therefore, a special “backup” was made for the analyzer built into Visual Studio:

    #if defined(_PREFAST_) && defined(OS_WIN)
    // See comments on the previous use of __analysis_assume.
    #define DCHECK(condition)                    \
      __analysis_assume(!!(condition)),          \
          LAZY_STREAM(LOG_STREAM(DCHECK), false) \
              << "Check failed: " #condition ". "
    #define DPCHECK(condition)                    \
      __analysis_assume(!!(condition)),           \
          LAZY_STREAM(PLOG_STREAM(DCHECK), false) \
              << "Check failed: " #condition ". "
    #else  // !(defined(_PREFAST_) && defined(OS_WIN))

    If you make a similar backup for the PVS-Studio analyzer, the picture with false positives will change dramatically. According to my estimates, half of the false positives will immediately disappear. Yes, exactly half. The thing is that the DCHECK macro is used a huge number of times.

    Other publications


    This is where the introductory article ends, and here I will gradually lay out links to other articles. Thanks for attention.

    1. Beautiful Chromium and clumsy memset .
    2. The break and fallthrough statement .
    3. Chromium: memory leaks .
    4. Chromium: typos .
    5. Chromium: Use of Inaccurate Data .
    6. Why is it important to check what the malloc function returned .
    7. Chromium: other errors .



    If you want to share this article with an English-speaking audience, then please use the link to the translation: Andrey Karpov. Chromium: About the Sixth Check of the Project .

    Also popular now: