sand14 March 9, 2015 at 19:57

C#: how not to "shoot yourself in the foot"

Today we will look in more detail at how it became possible to “shoot yourself in the foot” in C#, and in .NET in general, when working with Boolean values, in which practical cases this can happen, and how to prevent it.

What lines does this console application print?
Building the application in Visual Studio Community 2013 and running it gives the following result:

Unsafe Mode
01: b1        : True
02: b2        : True
03:  b1 ==  b2: False
04: !b1 == !b2: True
05: b1 && b2  : True
06: b1 &  b2  : True
07: b1 ^  b2  : True
08: b1 && b3  : True
09: b1 &  b3  : False
10: b1 ^  b3  : True
Safe Mode
11: b1        : True
12: b2        : True
13:  b1 ==  b2: True
14: !b1 == !b2: True
15: b1 && b2  : True
16: b1 &  b2  : True
17: b1 ^  b2  : False
18: b1 && b3  : True
19: b1 &  b3  : True
20: b1 ^  b3  : False

Assuming that each of the logical variables b1, b2 and b3 holds either a “true” value or a value other than “false” (and therefore also “true”; these are Boolean variables, after all), several questions arise:

Why do the Unsafe Mode and Safe Mode blocks give different results at positions 03 and 13, 07 and 17, 09 and 19, 10 and 20, respectively?
(And why, then, do the results at the other corresponding positions of the two blocks coincide?)
Why are the results at positions 05 and 06 the same inside the Unsafe block, while those at 08 and 09 differ?
And why exactly do the results at 08 and 09 differ?

Let's try to figure it out:

Probably everyone knows that early programming languages had no dedicated logical (Boolean) data types.
Integer data types were used in their place.

Zero was treated as false (False), and any value other than zero as true (True).
Thus, the if statement could be applied to an integer operand.

A well-known mistake that is easy to make in C/C++ is confusing the assignment (=) and equality (==) operators.
The following code will always print the string “i == 1”:

int i = 0;
if (i = 1)
  printf("i == 1");
else
  printf("i == 0");

This happens because the assignment operator (=) was mistakenly used instead of the equality operator (==) in the condition “i = 1” of the if statement.
As a result, the value 1 is written to the variable i, the “=” operator yields the value 1, and this integer is used as the condition of the if statement, where it is interpreted as a logical (Boolean) value. The code in the first branch (printf("i == 1")) is therefore always executed.

Therefore, in C/C++ it is customary to write the comparison as follows:

int i = 0;
if (1 == i)
  printf("i == 1");
else
  printf("i == 0");

instead of the “intuitive” form:

int i = 0;
if (i == 1)
  printf("i == 1");
else
  printf("i == 0");

The reason is that “1 == i” cannot be mistyped as “1 = i” without being noticed: the compiler will not allow a new value (i) to be assigned to the constant (1).

Apparently, at some point the designers of programming languages decided to add support for “full-fledged” logical types:
Thus, the Boolean type appeared in Turbo/Borland Pascal and Delphi. Variables of this type can take the values False and True. Moreover, it was documented that the size of the type is 1 byte and that the ordinal (integer) values returned by the Ord function are 0 and 1 for False and True, respectively.

And what about other possible non-zero internal values? The behavior in this case was left unspecified, and the documentation and books explained that Boolean values should be tested like this:

var
  b: Boolean;
begin
  b := True;
  if b then
    WriteLn('b = True')
  else
    WriteLn('b = False');
end.

but not like this:

var
  b: Boolean;
begin
  b := True;
  if b = True then
    WriteLn('b = True')
  else
    WriteLn('b = False');
end.

The variable “b” could hold a non-zero value other than one, and then the result of the comparison “b = True” would be undefined: it could turn out to be false if the comparison was carried out as a comparison of two integers, skipping the “normalization” of the values (evidently for performance reasons).

On the other hand, this indirectly acknowledged that a logical variable can contain an internal code other than zero and one, and that a non-zero value is considered “true” even though it cannot always be handled correctly:

a Boolean variable is implemented as an integer, and an integer can be cast to a Boolean (not to mention the possibilities of address arithmetic);
this is also confirmed by the following statement: “Casting the variable to a Boolean type is unreliable”. That is, we can cast an integer to a Boolean, but the result is “unreliable”, which in practice means that the result of testing such a value is undefined.

Later, the Boolean types ByteBool, WordBool and LongBool, with sizes of 1, 2 and 4 bytes, were added to Delphi for compatibility when working with code written in C/C++, COM objects, and other third-party code.
For them it is defined that, unlike the Boolean type, any non-zero value is considered “true”.

In C++, a “native” type bool was added in the same way (variables of this type can take the values false and true), and its size is not fixed (it probably depends on the platform bitness, for performance or other reasons; the data type sizes for specific versions of the Microsoft compilers are listed in the Microsoft documentation).
There is also no explicit definition of the internal codes of false and true, although the code examples accompanying the definitions of false and true indirectly imply that false has the internal numeric code 0 and true has the internal numeric code 1.

We took this historical tour of the origins of Boolean types to see the pitfalls of working with such a seemingly simple data type. With that understanding, we can move on to the internal structure of the logical data type in C#, discuss why the test program produced the results it did, and look at how to work correctly with Boolean values in C# when interacting with unmanaged code.
The digression turned out to be rather long, so we will consider these issues next time.
