Andrey2008 March 20, 2019 at 22:17

The Harm of Macros in C++ Code

The C++ language offers vast possibilities for doing without macros. So let's try to use macros as little as possible!

Let me say right away that I am not a fanatic and I am not urging anyone to abandon macros for idealistic reasons. For example, when the same kind of code has to be generated by hand, I can see the benefit of macros and live with them. I am quite relaxed about the macros in old programs written with MFC. There is no point in fighting something like this:

BEGIN_MESSAGE_MAP(efcDialog, EFCDIALOG_PARENT )
  //{{AFX_MSG_MAP(efcDialog)
  ON_WM_CREATE()
  ON_WM_DESTROY()
  //}}AFX_MSG_MAP
END_MESSAGE_MAP()

Such macros exist, and that's fine. They really were created to simplify programming.

I am talking about other macros: the ones people use to avoid writing a full-fledged function or to shrink the size of a function. Let's look at several reasons to avoid such macros.

Note. This text was written as a guest post for the Simplify C++ blog. I decided to publish the Russian version of the article here as well. I am writing this note to head off the question from inattentive readers about why the article is not marked as a "translation" :). And here is the guest post in English: "Macro Evil in C++ Code".

First: macro code attracts bugs

I do not know how to explain this phenomenon from a philosophical point of view, but it is a fact. Moreover, macro-related bugs are often very hard to spot during code review.

I have described such cases in my articles many times. For example, the isspace function was replaced with this macro:

#define isspace(c) ((c)==' ' || (c) == '\t')

The programmer who used isspace believed he was calling the real function, which treats not only spaces and tabs as whitespace but also LF, CR, and so on. As a result, one of the conditions is always true and the code does not work as intended. This error from Midnight Commander is described here.
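To make the difference concrete, here is a minimal sketch. The macro name NARROW_ISSPACE and the helper classification_differs are hypothetical; the macro body mirrors the definition above but is renamed so that the example does not redefine the standard library's isspace.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical name: same body as the macro above, renamed so that it
// does not clash with the standard isspace.
#define NARROW_ISSPACE(c) ((c) == ' ' || (c) == '\t')

// Returns true when the macro and std::isspace disagree about a character.
inline bool classification_differs(char ch)
{
  const bool by_macro = NARROW_ISSPACE(ch);
  // The cast avoids undefined behavior for negative char values.
  const bool by_library = std::isspace(static_cast<unsigned char>(ch)) != 0;
  return by_macro != by_library;
}
```

In the default "C" locale the two agree on ' ' and '\t' but disagree on '\n' and '\r', which is exactly the kind of mismatch the programmer did not expect.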

Or how do you like this shorthand for the std::printf function?

#define sprintf std::printf

I think the reader can guess that this was a very unfortunate macro. It was found, by the way, in the StarEngine project. Read more about it here.

One could argue that programmers are to blame for these errors, not macros. This is true. Naturally, programmers are always to blame for errors :).

What matters is that macros provoke errors. It follows that macros must be used with extra care or not at all.

I could go on giving examples of macro-related defects for a long time, and this nice little note would turn into a weighty multi-page document. Of course, I will not do that, but I will show a couple more cases to make the point.

The ATL library provides macros such as A2W, T2W and so on for converting strings. However, few people know that these macros are very dangerous to use inside loops. The macro calls the alloca function, which allocates memory on the stack again and again on every iteration of the loop. The program may appear to work correctly. But as soon as it starts processing long strings or the number of loop iterations grows, the stack can run out at the most unexpected moment. You can read more about this in this mini-book (see the chapter "Do not call the alloca() function inside loops").

Macros such as A2W hide evil. They look like functions but in fact have side effects that are hard to notice.

I also cannot pass by attempts like this one to shorten code with macros:

void initialize_sanitizer_builtins (void)
{
  ....
  #define DEF_SANITIZER_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
  decl = add_builtin_function ("__builtin_" NAME, TYPE, ENUM, \
             BUILT_IN_NORMAL, NAME, NULL_TREE);  \
  set_call_expr_flags (decl, ATTRS);          \
  set_builtin_decl (ENUM, decl, true);
  #include "sanitizer.def"
  if ((flag_sanitize & SANITIZE_OBJECT_SIZE)
      && !builtin_decl_implicit_p (BUILT_IN_OBJECT_SIZE))
    DEF_SANITIZER_BUILTIN (BUILT_IN_OBJECT_SIZE, "object_size",
         BT_FN_SIZE_CONST_PTR_INT,
         ATTR_PURE_NOTHROW_LEAF_LIST)
  ....
}

Only the first line of the macro belongs to the if statement. The remaining lines are executed regardless of the condition. You could say this error comes from the C world, since I found it with the V640 diagnostic in the GCC compiler. GCC code is written primarily in C, and in that language it is hard to do without macros. However, you must admit that this is not such a case. Here it was entirely possible to write a real function.
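The same pattern can be reproduced in a few lines. This is only a sketch: Registry, REGISTER_UNSAFE, register_builtin, run_unsafe and run_safe are hypothetical names, and the two counters stand in for the side effects of the real GCC calls.

```cpp
#include <cassert>

// Hypothetical counters standing in for the side effects of the real calls.
struct Registry {
  int declared = 0;
  int flagged = 0;
};

// Unsafe: expands to two statements, so an unbraced `if` guards only the first.
#define REGISTER_UNSAFE(reg) \
  (reg).declared++;          \
  (reg).flagged++;

// A real function keeps both statements together under the condition.
inline void register_builtin(Registry &reg)
{
  reg.declared++;
  reg.flagged++;
}

inline Registry run_unsafe(bool enabled)
{
  Registry reg;
  if (enabled)
    REGISTER_UNSAFE(reg)  // flagged++ runs even when enabled is false
  return reg;
}

inline Registry run_safe(bool enabled)
{
  Registry reg;
  if (enabled)
    register_builtin(reg);
  return reg;
}
```

With enabled set to false, run_unsafe still increments flagged, while run_safe leaves both counters at zero. Wrapping the macro body in do { ... } while (0) would also fix it, but the function needs no such trick.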

Second: code reading becomes more complicated

If you have ever come across a project full of macros built out of other macros, you know what hell it is to make sense of it. If you have not, take my word for it: it is sad. As an example of code that is hard to read, I can cite the GCC compiler mentioned earlier.

According to legend, Apple invested in the LLVM project as an alternative to GCC because the GCC code was too complex, precisely because of these macros. I do not remember where I read this, so there will be no proof.

Third: writing macros is hard

It is easy to write a bad macro. I meet them everywhere, with the corresponding consequences. But writing a good, reliable macro is often harder than writing an equivalent function.

Writing a good macro is hard because, unlike a function, a macro cannot be treated as an independent entity. You have to consider it in the context of every possible use at once; otherwise it is very easy to run into a problem like this:

#define MIN(X, Y) (((X) < (Y)) ? (X) : (Y))
m = MIN(ArrayA[i++], ArrayB[j++]);

Of course, workarounds for such cases were invented long ago, and the macro can be implemented safely, for example with GCC's statement-expression and __typeof__ extensions:

#define MAX(a,b) \
   ({ __typeof__ (a) _a = (a); \
       __typeof__ (b) _b = (b); \
     _a > _b ? _a : _b; })

The only question is whether we need all this in C++. No: C++ has templates and other ways to build efficient code. So why do I keep running into macros like these in C++ programs?
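A small sketch of the difference, reusing the MIN macro from above. The names min_of, Result, with_macro and with_function are hypothetical; std::min from <algorithm> would serve just as well as min_of here.

```cpp
#include <cassert>

#define MIN(X, Y) (((X) < (Y)) ? (X) : (Y))

// Hypothetical helper: each argument is evaluated exactly once at the call site.
template <typename T>
constexpr const T &min_of(const T &a, const T &b)
{
  return (b < a) ? b : a;
}

struct Result {
  int m;  // the value that was selected
  int i;  // index into the first array after the call
  int j;  // index into the second array after the call
};

inline Result with_macro()
{
  int a[] = {1, 5, 9};
  int b[] = {7, 8, 9};
  int i = 0, j = 0;
  // Expands to ((a[i++]) < (b[j++])) ? (a[i++]) : (b[j++]):
  // the winning side is evaluated, and incremented, a second time.
  int m = MIN(a[i++], b[j++]);
  return {m, i, j};
}

inline Result with_function()
{
  int a[] = {1, 5, 9};
  int b[] = {7, 8, 9};
  int i = 0, j = 0;
  int m = min_of(a[i++], b[j++]);
  return {m, i, j};
}
```

The macro version compares 1 with 7 and then returns 5, because a[i++] is evaluated again; it also leaves i at 2. The function version returns 1 and advances each index once.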

Fourth: debugging is complicated

There is an opinion that debugging is for wimps :). That is interesting to discuss, of course, but from a practical point of view debugging is useful and helps find errors. Macros complicate the process and definitely slow down the search for errors.

Fifth: false positives of static analyzers

Because of the way they are built, many macros trigger numerous false positives in static code analyzers. I can safely say that most false positives when checking C and C++ code are related to macros.

The trouble with macros is that analyzers simply cannot tell correct but tricky code from erroneous code. The article about checking Chromium describes one such macro.

What to do?

Let's not use macros in C++ programs unless absolutely necessary!

C++ provides rich tools such as function templates, automatic type deduction (auto, decltype), and constexpr functions.

Almost always, you can write an ordinary function instead of a macro. Often this is not done out of plain laziness. This laziness is harmful, and we must fight it. A little extra time spent writing a full-fledged function will pay off with interest. The code will be easier to read and maintain. The probability of shooting yourself in the foot will decrease, and compilers and static analyzers will produce fewer false positives.

Some might argue that code with a function is less efficient. This is also just an excuse.

Compilers today are very good at inlining code, even if you have not written the inline keyword.

When it comes to evaluating expressions at compile time, macros are not needed either and are even harmful. For that purpose it is much better and safer to use constexpr.

Let me explain with an example. Here is a classic macro error that I borrowed from the FreeBSD kernel code.

#define ICB2400_VPOPT_WRITE_SIZE 20
#define  ICB2400_VPINFO_PORT_OFF(chan) \
  (ICB2400_VPINFO_OFF +                \
   sizeof (isp_icb_2400_vpinfo_t) +    \
  (chan * ICB2400_VPOPT_WRITE_SIZE))          // <=
static void
isp_fibre_init_2400(ispsoftc_t *isp)
{
  ....
  if (ISP_CAP_VP0(isp))
    off += ICB2400_VPINFO_PORT_OFF(chan);
  else
    off += ICB2400_VPINFO_PORT_OFF(chan - 1); // <=
  ....
}

The chan argument is used in the macro without being wrapped in parentheses. As a result, ICB2400_VPOPT_WRITE_SIZE multiplies only the 1, not the whole expression (chan - 1).

The error would not have appeared if an ordinary function had been written instead of the macro.
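The arithmetic is easy to check on a reduced version. In this sketch WRITE_SIZE, PORT_OFF_UNSAFE, PORT_OFF_FIXED and the two wrapper functions are hypothetical names; the base offset and the sizeof term are dropped so that only the multiplication remains.

```cpp
#include <cassert>

// Hypothetical simplified macros with the same shape as the FreeBSD one.
#define WRITE_SIZE 20
#define PORT_OFF_UNSAFE(chan) (chan * WRITE_SIZE)
#define PORT_OFF_FIXED(chan) ((chan) * WRITE_SIZE)

// Expands to (chan - 1 * 20): only the 1 is multiplied.
inline int unsafe_offset(int chan) { return PORT_OFF_UNSAFE(chan - 1); }

// Expands to ((chan - 1) * 20): the whole argument is multiplied.
inline int fixed_offset(int chan) { return PORT_OFF_FIXED(chan - 1); }
```

For chan equal to 3, the unparenthesized macro yields 3 - 20 = -17, while the intended result is 2 * 20 = 40.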

size_t ICB2400_VPINFO_PORT_OFF(size_t chan)
{
  return   ICB2400_VPINFO_OFF
         + sizeof(isp_icb_2400_vpinfo_t)
         + chan * ICB2400_VPOPT_WRITE_SIZE;
}

A modern C or C++ compiler will very likely inline the function on its own, and the code will be as efficient as with the macro.

At the same time, the code has become more readable and free of the error.

If the input value is known to always be a constant, you can add constexpr; whenever the result is then used in a constant expression, the calculation is guaranteed to happen at compile time. Imagine that this is C++ and that chan is always a constant. Then it is useful to declare the function ICB2400_VPINFO_PORT_OFF like this:

constexpr size_t ICB2400_VPINFO_PORT_OFF(size_t chan)
{
  return   ICB2400_VPINFO_OFF
         + sizeof(isp_icb_2400_vpinfo_t)
         + chan * ICB2400_VPOPT_WRITE_SIZE;
}
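Here is a self-contained sketch of what constexpr buys in practice. The real FreeBSD definitions are not shown in this article, so vpinfo_t, VPINFO_OFF, VPOPT_WRITE_SIZE, vpinfo_port_off and buffer_size below are hypothetical stand-ins with made-up values.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the FreeBSD definitions (values are made up).
struct vpinfo_t { unsigned char raw[8]; };
constexpr std::size_t VPINFO_OFF = 64;
constexpr std::size_t VPOPT_WRITE_SIZE = 20;

constexpr std::size_t vpinfo_port_off(std::size_t chan)
{
  return VPINFO_OFF + sizeof(vpinfo_t) + chan * VPOPT_WRITE_SIZE;
}

// Checked by the compiler: a wrong formula would break the build.
static_assert(vpinfo_port_off(0) == 72, "base offset plus structure size");
static_assert(vpinfo_port_off(3 - 1) == 112, "the whole argument is multiplied");

// The result can be used wherever a constant expression is required,
// for example as an array bound.
inline std::size_t buffer_size()
{
  unsigned char buffer[vpinfo_port_off(1)];
  return sizeof(buffer);
}
```

Note that an argument such as 3 - 1 is passed as a single value, so the precedence problem from the macro version cannot occur here.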

Profit!

I hope I managed to convince you. Good luck and fewer macros in the code!
