Analysis of utilities for static analysis of C ++ code

    Analysis of the following utilities:Everything you need can be found by following the links, and we will immediately get down to business.

    Test 1:

    int main()
    {
    	vector v;
    	v.reserve(2);
    	assert(v.capacity() == 2);
    	v[0];
    	v[0] = 1;
    	v[1] = 2;
    	cout << v[0] << endl;
    	v.reserve(100);
    	cout << v[0] << endl;
    	return 0;
    }
    

    Carefully study this code, how many problems can you identify?
    Of course all the necessary headers inkludirovany ( iostream, vector, cassert ).
    Before we discuss professional versions of this code, we will analyze well-known utilities for static source analysis. The following is the result of the above utilities.

    RATS : nothing
    Cppcheck : nothing
    Graudit : nothing
    g ++ (with the -Wall flag) : nothing

    Maybe at first glance everything is not so bad or is there nothing bad at all? Do not say this with Sutter (among other things, an example is taken from his book "New Complex Problems in C ++. 40 New Puzzle Examples with Solutions" ).
    On the first line in the body of the main () function, we declare a vector v consisting of integers. Then we reserve a place for 2 elements. Before moving on to the next line, take a look at the C ++ standard. What does she tell us about the vector :: reserve method ? (warning: free translation)
    "
    void reserve(size_type n);
    It informs about the planned change in size, possibly corresponding control of the memory allocation. After reserve (), capacity () is greater than or equal to the argument passed to reserve () in case of memory reallocation. Otherwise, it is equal to the previous value of capacity () . Redistribution of memory occurs if and only if the current value of capacity () is less than the value of the reserve () argument . It does not resize the container and in the worst case requires linear runtime. If the argument value is greater than max_size ()throws a long_error exception. In the case of redistribution, invalidates all links, pointers and iterators that refer to the elements of the container. "

    For now, let's move on. The assert line (v.capacity () == 2); bad at best, reserve () guarantees increase in the first place capacities and in this case assert is just an extra line of code (which “programmers” like to blindly believe the assert after each operation and consider it a professional tone). Secondly, as we learned from the standard itself, reserve () can increase the capacity more container than the specified ar ument, so Sutter recommends assert (v.capacity ()> = 2) . I recommend to remove it at all.
    Further worse. " Vector operator:: operator []may, but is not required to perform a range check. "The standard does not say anything about this, so the developer of the standard library has every right to add such a check and do without it.” - Sutter . The reason for everything is efficiency, the test takes some time, numerous such checks will require a significant part of the program’s time. But this does not mean that your effective program is better than Vasya's reliable program. Replace line Pobrobuy v [0]; on v.at (0); The program will fly with an exception. Which, in most cases, you need reliable code. vector:: atPerforms a range check on the index value. A strict recommendation to use this method in such, and not only, situations.

    Lines v [0] = 1; v [1] = 2; work quite well, you can be convinced by compiling an example. But you need to remember the difference between size / resize and reserve / capacity . resize changes the number of elements in the container to the specified argument value by adding or removing them from the end of the container. We learned about the reserve method from the standard, we add that it increases the size of the internal buffer so that it can accommodate the specified number of elements (at least). " We can safely use operator [](or the at method) only to change the elements that are actually contained in the container, i.e. really counted in size. "Is it clear? And static analysis utilities are also not clear. But the problem is significant. Sutter and many other pros write in their books about the subtle points in coding in your favorite C ++. At first glance, it might seem that the ability to solve problems and knowledge of syntax the language is more than enough for development, but you and I both know that we are fooling ourselves, how many of you inserted couts on each line of code to identify a problem? How many of you could not endure the debugging that the third day already performed? So, there are people who are not inherent in such things, including people with professional programming skills who are not too lazy to go further and study books by such gurus as Sutter and Myers (and many others).

    And yet, in example 1, we are waiting for the last logical error, what do you think cout << v [0] outputs ; after reserve (100) ? That's right, at best zero! Note that prior to the reservation, we gave v [0] a non-zero value. The “problem” is again in the backup function.
    So what's the matter in the general case? The fact that I need a utility that tells you what ifv.reserve (2); v [0] = 1; it is better to use v.resize (2); v [0] = 1; or v.reserve (2); v.push_back (1) ! And the utilities we examined were silently silent. I also need a utility that checks illegal boundaries, and in the case of v [0]; advise using v.at (0). And our reviewed utilities were again silently nice. Hence the moral, in delicate situations do not rely on utilities such as RATS, Cppcheck, and even more so on Graudit. Even a compiler with the highest level of warnings turned on should sometimes be looked at through the eyes of a paranoid encoder.
    Well, it’s not in vain that the utilities occupy a place on your screws, and it is not for nothing that the developers worked on it.

    Test 2:

    Let's go further, increase the dose of errors in the code. Consider the same code with the new prettyFormat function (example again from Satter). She gets the number and buffer, then inserts that number into the resulting buffer using sprintf.
    void prettyFormat(int i, char* buf)
    {
        sprintf(buf, "%4d", i);
    }
    int main()
    {
        vector v;
        v.reserve(2);
        //....
        char buf[5];
        prettyFormat(10, buf);
        return 0;
    }
    

    As you can see, prettyFormat was added , and the string char buf [10] . Although we will no longer need the code from the previous example, but let it stay, maybe one of the utilities will see something? Here is the result:

    RATS: on line char buf [5]; " High: fixed size local buffer. Extra care should be taken to ensure that character arrays that are allocated on the stack are used safely. They are prime targets for buffer overflow attacks. " It is advised to be careful when using character arrays, as they are helpless when buffer overflow attacks.
    Cppcheck : nothing
    Graudit : nothing
    g ++ (with -Wall option) : nothing

    A lot has been said about buffer overflows, so we’ll be brief and just criticize. I have the impression that the tools for static analysis are intended only to find possible cases of buffer overflows. However, several of them are silent even in such cases (as Cppcheck or Graudit). I don’t blame the compiler, he didn’t stop me from finding problems, but the utilities specially designed for such things in most cases are cute silent. In such situations, we need to get on the prevention tools and recommendations to use alternative to the sprintf (Oh, and for other possible functions). Alternative situations should be read in Satter (we are talking about std :: stringstream, std :: strstream, boost :: lexical_cast, however the latter is based on std :: stringstreamAnd finally of snprintf , pozvolyachschaya set the buffer size, thus providing protection against buffer overflow).

    Test 3:

    Add an uninitialized pointer to the prettyFormat function .
    int* prettyFormat(int i, char* buf)
    {
        sprintf(buf, "%4d", i);
        int* a;
        return a;
    }
    int main()
    { 
        ...
    

    In the core code, it has not changed compared to previous tests. This is not something that may look like a serious test, but it will give a lot of interesting information about the utilities used. In particular, it was said about Cppcheck that it does not duplicate compiler warnings, but as we will soon see, it is a bluff. And other utilities simply do not see the problem or really do not duplicate compiler warnings. Below is the result.

    RATS : same as in the previous test (we are talking about the string char buf [5];)
    Cppcheck : on the line int * a; throws (error) Uninitialized variable: a.
    Fine, but ...
    g ++ (with the -Wall option) : In function 'int * prettyFormat (int, char *)':
    warning: 'a' is used uninitialized in this function.
    Something familiar is not it? The same warning that Cppcheck finally gave.
    Graudit is generally silent.

    Test 4:

    Add humor, change the prettyFormat function like this:
    int* prettyFormat(int i, char* buf)
    {
        sprintf(buf, "%4d", i);
        int* a;
        fopen("filename", "r");
        char buf2[5];
        strcpy(buf2, buf);
        return a;
    }
    

    Fopen was added, which opens I do not know what, a local buf2 buffer was added , into which buf, which is received as an argument, is copied using strcpy. Let's see how much will give.

    RATS: For strings char buf2 [5]; char buf [10] the same warning about the care of using fixed-size character buffers.
    And finally: “High: strcpy
    Check to be sure that argument 2 passed to this function call will not copy
    more data than can be handled, resulting in a buffer overflow.”
    Check if the second argument will copy more data than buf2 can contain.
    Not bad, in many situations it would help. Let's go further.
    Cppcheck : (error) Unitialized variable: a.
    Something familiar, the same as in the previous example, but a pity.
    g ++ (with the -Wall option) : Same as in the previous test.

    RATS more or less coped with the task, the rest - to shoot! Why is everything so bad? Firstly, the guys who wrote these utilities themselves are not really C ++ gurus, most of all they like to just publish the program and that's it. Secondly, many subtle errors are difficult to detect, and they are mainly handled by commercial utilities, although I have not tried it myself. Thirdly, read books, magazines, in particular Habr and you will not need utilities like these.

    Thanks for attention!

    PS I wanted to publish the Code Revision blog, there is not enough karma. No, not that I want to increase karma in this way, just in case the crowd does not yell - "why not here and there ...".

    PPS Thank you so much to the Habrasociety - keep it up!

    Also popular now: