Compile-time validation in C/C++
C and C++ let you check constant expressions at compile time. This is a cheap way to avoid problems when the code is modified later.
I will look at how to apply this to:
- enumerations (enum),
- arrays (keeping them in sync with an enum),
- switch statements,
- classes containing heterogeneous data.
BOOST_STATIC_ASSERT and all-all-all
There are many ways to make the compiler fail on purpose at compile time. Of these, I like this implementation best:
#define ASSERT(cond) typedef int foo[(cond) ? 1 : -1]
If your program already uses Boost, there is nothing to invent: use BOOST_STATIC_ASSERT. C++11 also supports this directly in the language (static_assert).
With the tool sorted out, on to how to use it.
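For comparison, here is a minimal sketch that puts the typedef-based macro next to the C++11 static_assert. The checked conditions and the helper bits_in are illustrative assumptions, not code from a real project.

```cpp
#include <cassert>
#include <climits>

// The macro from the text: a false condition yields an array of size -1,
// which is ill-formed, so compilation stops at that line.
#define ASSERT(cond) typedef int foo[(cond) ? 1 : -1]

// At namespace scope both forms compile only if the condition holds.
ASSERT(CHAR_BIT == 8);
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

// Hypothetical helper, present only to show a check inside a function body.
inline unsigned bits_in(unsigned bytes) {
    ASSERT(sizeof(unsigned) >= 2);  // holds wherever bytes are 8 bits wide
    return bytes * CHAR_BIT;
}
```

The practical difference is in the diagnostics: static_assert prints the message you supply, while the typedef trick only reports an invalid array size. Some compilers may also warn about the unused typedef.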
Checking the number of elements in an enum
An enumeration is a set of related constants, typically used where program logic branches. There are usually several such branch points, and it is easy to miss one.
Example:
enum TEncryptMode {
    EM_None = 0,
    EM_AES128,
    EM_AES256,
    EM_ItemsCount
};
The last element is not an algorithm but an auxiliary constant whose value is one greater than the largest meaningful element.
Now, wherever constants from this set are used, just add a check:
ASSERT(EM_ItemsCount == 3);
If new constants are added later, the code at this point stops compiling. The author of the change then has to look at this section of the code and, if necessary, account for the new constant.
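As a rough illustration, the check might sit inside a function that has to know about every mode. The function is_encrypted is a hypothetical example, and static_assert is used in place of the ASSERT macro so the sketch needs nothing beyond C++11.

```cpp
#include <cassert>

enum TEncryptMode {
    EM_None = 0,
    EM_AES128,
    EM_AES256,
    EM_ItemsCount
};

// Hypothetical function: one of the places that must know every mode.
inline bool is_encrypted(TEncryptMode mode) {
    // Stops compiling as soon as a constant is added to TEncryptMode,
    // forcing whoever added it to revisit the condition below.
    static_assert(EM_ItemsCount == 3, "TEncryptMode changed: review is_encrypted");
    return mode == EM_AES128 || mode == EM_AES256;
}
```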
As a bonus, EM_ItemsCount makes it possible to add run-time checks of function parameters:
assert( 0 <= mode && mode < EM_ItemsCount );
Compare with the version without such a constant:
assert( 0 <= mode && mode <= EM_AES256 );
(add EM_AES512 after EM_AES256 and the check silently becomes wrong)
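The run-time range check can be wrapped in a small predicate so that every caller uses the same bound. The names is_valid_mode and mode_from_int are assumptions made for this sketch.

```cpp
#include <cassert>

enum TEncryptMode { EM_None = 0, EM_AES128, EM_AES256, EM_ItemsCount };

// Hypothetical predicate: true when the raw value names a real mode.
// The upper bound follows the enum automatically when constants are added.
inline bool is_valid_mode(int mode) {
    return 0 <= mode && mode < EM_ItemsCount;
}

// Hypothetical conversion from an untrusted integer (read from a file, say).
inline TEncryptMode mode_from_int(int raw) {
    assert(is_valid_mode(raw));
    return static_cast<TEncryptMode>(raw);
}
```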
Arrays and enum
This is a special case of the check from the previous section.
Suppose we have an array of parameters for the same encryption algorithms (the example is somewhat contrived, but similar cases do occur in practice):
static const ParamStruct params[] = {
    { EM_None, 0, ... },
    { EM_AES128, 128, ... },
    { EM_AES256, 256, ... },
    { -1, 0, ... }
};
This array must be kept in sync with TEncryptMode.
(I trust there is no need to explain why the array ends with a terminating element.)
We need an auxiliary macro that computes the length of an array:
#define lengthof(x) (sizeof(x) / sizeof((x)[0]))
Now we can write the check (preferably right after the definition of params):
ASSERT( lengthof(params) == EM_ItemsCount + 1 );
Update: in the comments, Habr user skor suggested a safer version of the lengthof macro, for which I thank him.
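One common way to make the length computation safer in C++11 is a function template that accepts only real arrays, so passing a pointer by mistake is a compile error instead of a wrong number. This is not necessarily the version proposed in the comments. The layout of ParamStruct and the names length_of and key_bits_for are assumptions for this sketch, since the original elides the structure's fields.

```cpp
#include <cassert>
#include <cstddef>

enum TEncryptMode { EM_None = 0, EM_AES128, EM_AES256, EM_ItemsCount };

// Hypothetical layout; the real structure has more fields.
struct ParamStruct {
    int mode;      // a TEncryptMode value, or -1 for the terminator
    int key_bits;
};

static const ParamStruct params[] = {
    { EM_None,   0   },
    { EM_AES128, 128 },
    { EM_AES256, 256 },
    { -1,        0   }   // terminator
};

// Accepts arrays only: a pointer argument does not match T (&)[N].
template <typename T, std::size_t N>
constexpr std::size_t length_of(const T (&)[N]) {
    return N;
}

static_assert(length_of(params) == static_cast<std::size_t>(EM_ItemsCount) + 1,
              "params is out of sync with TEncryptMode");

// Hypothetical lookup that walks the table up to the terminator.
inline int key_bits_for(TEncryptMode mode) {
    for (const ParamStruct* p = params; p->mode != -1; ++p) {
        if (p->mode == mode) return p->key_bits;
    }
    return -1;  // not found
}
```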
switch
After the examples above, everything here is obvious. Before switch (mode), add:
ASSERT(EM_ItemsCount == 3);
Slightly less obvious is a run-time check in the default branch:
ASSERT(EM_ItemsCount == 3);
switch( mode ) {
    case ...: ... break;
    ...
    default:
        assert( false );
}
This is one more line of defense against mistakes. If several values are handled the same way, it is better to list several case labels for one action and keep default free for the error check:
...
case EM_AES128:
case EM_AES256:
    ...
    break;
...
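Put together, the compile-time check, the shared case labels and the guarded default might look like the sketch below. The function cipher_family and the strings it returns are hypothetical.

```cpp
#include <cassert>
#include <cstring>

enum TEncryptMode { EM_None = 0, EM_AES128, EM_AES256, EM_ItemsCount };

// Hypothetical function that maps a mode to a display name.
inline const char* cipher_family(TEncryptMode mode) {
    static_assert(EM_ItemsCount == 3, "TEncryptMode changed: review this switch");
    switch (mode) {
    case EM_None:
        return "none";
    case EM_AES128:   // both AES variants share one action
    case EM_AES256:
        return "AES";
    default:
        assert(false);  // reached only for a value outside the known modes
        return "";
    }
}
```

Because default is reserved for the error case, a forgotten mode trips the assertion in a debug build instead of silently falling into some "other" branch.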
Classes with heterogeneous data
Let's set enums aside and look at a class like this:
class MyData {
    ...
private:
    int a;
    double b;
    ...
};
It may well be that, some time later, someone will want to add a member int c to it. By then the class has grown large and complex. How do you find every place where c must be handled?
I propose a semi-automatic solution: add a data-version constant to the class:
class MyData {
    static const int DataVersion = 0;
    ...
};
Now, in every method where it matters that all data members are handled, you can write:
ASSERT(DataVersion == 0);
When you add new data to the class, you have to increase the DataVersion constant by hand (this takes discipline, alas). In return, the compiler immediately points to the places that need to be reviewed. These checkpoints should include:
- constructors,
- the assignment operator (operator =),
- comparison operators (==, <, etc.),
- reading and writing data (including <<, >>),
- the destructor (if it is not trivial).
The remaining checkpoints depend on the class's internal logic (log output, for example).
The same DataVersion constant is also handy when saving data to disk (if there is interest, I can write about that separately).
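A minimal sketch of the idea, assuming a constructor and operator== as the two checkpoints and using static_assert rather than the ASSERT macro. The constructor signature and parameter names are made up for the example.

```cpp
#include <cassert>

class MyData {
public:
    // Increase by hand whenever a data member is added or removed.
    static const int DataVersion = 0;

    MyData(int a_value, double b_value) : a(a_value), b(b_value) {
        static_assert(DataVersion == 0, "MyData changed: initialize the new members");
    }

    bool operator==(const MyData& other) const {
        static_assert(DataVersion == 0, "MyData changed: compare the new members");
        return a == other.a && b == other.b;
    }

private:
    int a;
    double b;
};
```

After adding int c and setting DataVersion to 1, both functions stop compiling until each has been reviewed and its own check updated.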
Benefit
What is the result?
Pros:
- Automatic integrity checks at compile time (sometimes this saves hours or even days of debugging).
- Zero overhead at run time.
Cons:
- Additional code (albeit relatively little).
- It demands self-discipline: when a check fires, you have to actually review the code at that point rather than just bump the constant.
For me the pros outweigh the cons. How about you?