What is Strict Aliasing and why should we care? Part 2
- Translation
(OR Type Punning, Undefined Behavior and Alignment, Oh My!)
Friends, there is very little time left before a new group starts on the “C++ Developer” course. It's time to publish the translation of the second part of the material, which explains what type punning is.
What is type punning?
We have reached the point where we may wonder why we would want to alias at all. Usually the goal is type punning, and the commonly used methods for it violate strict aliasing rules.
Sometimes we want to get around the type system and interpret an object as a different type. Reinterpreting a segment of memory as another type is called type punning. Type punning is useful for tasks that need access to the underlying representation of an object in order to view, transport, or manipulate it. Typical areas where we come across type punning are compilers, serialization, networking code, and so on.
Traditionally, this was done by taking the address of the object, casting it to a pointer to the type we want to reinterpret it as, and then accessing the value; in other words, by aliasing. For example:
int x = 1;
// In C
float *fp = (float*)&x;  // Invalid aliasing
// In C++
float *fp = reinterpret_cast<float*>(&x);  // Invalid aliasing
printf("%f\n", *fp);
As we saw earlier, this is invalid aliasing, so it invokes undefined behavior. Traditionally, though, compilers did not take advantage of strict aliasing rules, this kind of code usually just worked, and developers unfortunately got used to doing things this way. A common alternative method for type punning is through unions, which is valid in C but undefined behavior in C++:
union u1
{
    int n;
    float f;
};
union u1 u;
u.f = 1.0f;
printf("%d\n", u.n);  // UB (undefined behavior) in C++: "n is not the active member"
This is not valid in C++, and some consider unions to be intended solely for implementing variant types, so they regard using unions for type punning as an abuse.
How do we type pun correctly?
The standard-blessed method for type punning in both C and C++ is memcpy. This may seem a little heavy-handed, but the optimizer should recognize the use of memcpy for type punning, optimize it away, and generate a register-to-register move. For example, if we know that int64_t is the same size as double:
static_assert( sizeof( double ) == sizeof( int64_t ) );  // C++17 does not require a message
We can use memcpy:
void func1( double d ) {
    std::int64_t n;
    std::memcpy( &n, &d, sizeof d );
    // …
}
At a sufficient optimization level, any decent modern compiler generates code identical to the previously mentioned reinterpret_cast method or the union method for type punning. Examining the generated code, we see that it uses just a register mov.
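As a minimal sketch of the memcpy approach described above, the following wraps the copy in two small helpers. The names double_to_bits, bits_to_double and round_trips are hypothetical and not part of the original article, and the concrete bit pattern checked in the demo assumes an IEEE-754 double.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::int64_t), "sizes must match");

// Hypothetical helper: returns the object representation of d as an integer.
std::int64_t double_to_bits(double d) {
    std::int64_t n;
    std::memcpy(&n, &d, sizeof d);  // copies bytes, so no aliasing is involved
    return n;
}

// Hypothetical helper: the inverse conversion.
double bits_to_double(std::int64_t n) {
    double d;
    std::memcpy(&d, &n, sizeof n);
    return d;
}

// The round trip preserves the value for any non-NaN input.
bool round_trips(double d) {
    return bits_to_double(double_to_bits(d)) == d;
}

// Small self-check that a caller could run from main().
void memcpy_pun_demo() {
    assert(round_trips(1.0));
    // Assumption: IEEE-754 binary64, where 1.0 is encoded as 0x3FF0000000000000.
    assert(double_to_bits(1.0) == 0x3FF0000000000000LL);
}
```

Because both helpers only copy bytes between objects of the same size, they stay within the rules regardless of how the compiler treats aliasing.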
Type punning arrays
But what if we want to type pun an array of unsigned char into a series of unsigned int values and then perform an operation on each value? We can use memcpy to copy each chunk of the unsigned char array into a temporary of type unsigned int. The optimizer will still see through the memcpy, optimize away both the temporary and the copy, and operate directly on the underlying data:
// Simple operation that just returns the value back
int foo( unsigned int x ) { return x; }
// Assume len is a multiple of sizeof(unsigned int)
int bar( unsigned char *p, size_t len ) {
    int result = 0;
    for( size_t index = 0; index < len; index += sizeof(unsigned int) ) {
        unsigned int ui = 0;
        std::memcpy( &ui, &p[index], sizeof(unsigned int) );
        result += foo( ui );
    }
    return result;
}
In this example, we take an unsigned char* p, assume it points to multiple chunks of sizeof(unsigned int) data, type pun each chunk as an unsigned int, call foo() on each punned chunk, sum the results into result, and return the final value. The assembly for the loop body shows that the optimizer turns the body into a direct access of the underlying unsigned char array as an unsigned int, adding it directly into eax:
add eax, dword ptr [rdi + rcx]
The same code, but using reinterpret_cast to type pun (this violates strict aliasing):
// Assume len is a multiple of sizeof(unsigned int)
int bar( unsigned char *p, size_t len ) {
    int result = 0;
    for( size_t index = 0; index < len; index += sizeof(unsigned int) ) {
        unsigned int ui = *reinterpret_cast<unsigned int*>(&p[index]);
        result += foo( ui );
    }
    return result;
}
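To see the memcpy-based version of bar() in action, here is a small usage sketch. The helpers to_bytes and sum_example are hypothetical additions for illustration. The test buffer is built with memcpy from real unsigned int values, so the result does not depend on endianness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Simple operation that just returns the value back (as in the text).
int foo(unsigned int x) { return static_cast<int>(x); }

// Sums the unsigned int values stored in a byte buffer, using memcpy to pun.
int bar(const unsigned char *p, std::size_t len) {
    assert(len % sizeof(unsigned int) == 0);  // precondition stated in the text
    int result = 0;
    for (std::size_t index = 0; index < len; index += sizeof(unsigned int)) {
        unsigned int ui = 0;
        std::memcpy(&ui, &p[index], sizeof(unsigned int));
        result += foo(ui);
    }
    return result;
}

// Hypothetical helper: serialize values into a byte buffer.
std::vector<unsigned char> to_bytes(const std::vector<unsigned int> &values) {
    std::vector<unsigned char> bytes(values.size() * sizeof(unsigned int));
    if (!values.empty())
        std::memcpy(bytes.data(), values.data(), bytes.size());
    return bytes;
}

// Hypothetical usage: punning the bytes back yields 1 + 2 + 3.
int sum_example() {
    std::vector<unsigned char> bytes = to_bytes({1u, 2u, 3u});
    return bar(bytes.data(), bytes.size());
}
```

The byte buffer returned by to_bytes has no particular alignment guarantee for unsigned int as far as bar() is concerned, which is exactly the situation where memcpy is the safe choice over a pointer cast.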
C++20 and bit_cast
In C++20 we have bit_cast, which provides a simple and safe way to type pun and can also be used in a constexpr context. The following is an example of how to use bit_cast to type pun an unsigned integer to a float:
std::cout << bit_cast<float>(0x447a0000) << "\n";  // assuming sizeof(float) == sizeof(unsigned int)
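For readers without a C++20 compiler, the idea can be approximated with a memcpy-based template. This is only a sketch: bit_cast_sketch is a hypothetical name, it is not the standard library's implementation, and unlike the real bit_cast it cannot be used in constexpr contexts. The value check assumes an IEEE-754 32-bit float.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical pre-C++20 stand-in for bit_cast, built on memcpy.
template <typename To, typename From>
To bit_cast_sketch(const From &src) {
    static_assert(sizeof(To) == sizeof(From), "To and From must have the same size");
    static_assert(std::is_trivially_copyable<From>::value, "From must be trivially copyable");
    static_assert(std::is_trivially_copyable<To>::value, "To must be trivially copyable");
    To dst;
    std::memcpy(&dst, &src, sizeof(To));
    return dst;
}

static_assert(sizeof(float) == sizeof(unsigned int),
              "example assumes float and unsigned int have the same size");

// Assumption: IEEE-754 binary32, where 0x447a0000 encodes 1000.0f.
float thousand() {
    return bit_cast_sketch<float>(0x447a0000u);
}

// Small self-check that a caller could run from main().
void bit_cast_demo() {
    assert(thousand() == 1000.0f);
    assert(bit_cast_sketch<unsigned int>(1000.0f) == 0x447a0000u);
}
```

The static_assert on the sizes mirrors the constraint mentioned in the text: source and destination types must be the same size.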
In the case where the To and From types do not have the same size, we need to use an intermediate struct. We will use a struct containing a character array of sizeof(unsigned int) elements (a 4-byte unsigned int is assumed) as the From type and unsigned int as the To type:
struct uint_chars {
    unsigned char arr[sizeof( unsigned int )] = {};  // Assuming sizeof( unsigned int ) == 4
};
// Assuming len is a multiple of 4
int bar( unsigned char *p, size_t len ) {
    int result = 0;
    for( size_t index = 0; index < len; index += sizeof(unsigned int) ) {
        uint_chars f;
        std::memcpy( f.arr, &p[index], sizeof(unsigned int) );
        unsigned int ui = bit_cast<unsigned int>(f);  // named ui so it does not shadow the outer result
        result += foo( ui );
    }
    return result;
}
Unfortunately, we need this intermediate type; this is a current limitation of bit_cast.
Alignment
In the previous examples, we saw that violating strict aliasing rules can lead to stores being optimized away. Violating strict aliasing rules can also lead to violations of alignment requirements. Both the C and C++ standards state that objects have alignment requirements, which restrict where objects can be allocated in memory and therefore accessed. C11 section 6.2.8 Alignment of objects says:
"Complete object types have alignment requirements which place restrictions on the addresses at which objects of that type may be allocated. An alignment is an implementation-defined integer value representing the number of bytes between successive addresses at which a given object can be allocated. An object type imposes an alignment requirement on every object of that type: stricter alignment can be requested using the _Alignas keyword."
The C++17 draft standard says in [basic.align] paragraph 1:
"Object types have alignment requirements (6.7.1, 6.7.2) which place restrictions on the addresses at which an object of that type may be allocated. An alignment is an implementation-defined integer value representing the number of bytes between successive addresses at which a given object can be allocated. An object type imposes an alignment requirement on every object of that type; stricter alignment can be requested using the alignment specifier (10.6.2)."
Both C99 and C11 are explicit that a conversion resulting in an unaligned pointer is undefined behavior. Section 6.3.2.3 Pointers says:
"A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined. ..."
Although C++ is not as explicit, I believe this sentence from [basic.align] paragraph 1 is sufficient:
"... An object type imposes an alignment requirement on every object of that type; ..."
An example
So let's assume:
- alignof(char) and alignof(int) are 1 and 4 respectively
- sizeof(int) is 4
Then type punning an array of char of size 4 as an int violates strict aliasing, and may also violate alignment requirements if the array has an alignment of 1 or 2 bytes:
char arr[4] = { 0x0F, 0x0, 0x0, 0x00 };  // Could be allocated on a 1 or 2 byte boundary
int x = *reinterpret_cast<int*>(arr);  // Undefined behavior: we have an unaligned pointer
This could lead to reduced performance or a bus error in some situations, whereas using alignas to force the array to the same alignment as int prevents the alignment requirement from being violated:
alignas(alignof(int)) char arr[4] = { 0x0F, 0x0, 0x0, 0x00 };
int x = *reinterpret_cast<int*>(arr);  // Alignment is satisfied, but this is still a strict aliasing violation
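The effect of alignas can be checked at run time with a small sketch like the one below. The helper is_aligned_for and the two example functions are hypothetical names. The check relies on the common, implementation-defined assumption that converting a pointer to std::uintptr_t yields the numeric address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if p is suitably aligned for an object of type T.
// Assumption: the pointer-to-integer conversion yields the numeric address.
template <typename T>
bool is_aligned_for(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// A char array declared with alignas(int) starts on an int boundary.
bool aligned_buffer_is_aligned() {
    alignas(int) char arr[sizeof(int)] = {};
    return is_aligned_for<int>(arr);
}

// One byte past an int boundary is misaligned for int unless alignof(int) is 1.
bool next_byte_is_misaligned() {
    alignas(int) char arr[2 * sizeof(int)] = {};
    return alignof(int) == 1 || !is_aligned_for<int>(arr + 1);
}

// Small self-check that a caller could run from main().
void alignment_demo() {
    assert(aligned_buffer_is_aligned());
    assert(next_byte_is_misaligned());
}
```

A check like this only addresses alignment. Reading the buffer through an int pointer would still be an aliasing violation, so memcpy remains the way to get the value out.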
Atomicity
Another unexpected penalty of unaligned accesses is that they break atomicity on some architectures. Atomic stores may not appear atomic to other threads on x86 if they are misaligned.
Catching strict aliasing violations
We do not have many good tools for catching strict aliasing violations in C++. The tools we have will catch some cases of strict aliasing violations and some cases of misaligned loads and stores.
gcc, using the flags -fstrict-aliasing and -Wstrict-aliasing, can catch some cases, although not without false positives and false negatives. For example, the following cases will generate a warning in gcc:
int a = 1;
short j;
float f = 1.f;  // Originally not initialized, but tis-kernel caught that it was accessed with an indeterminate value below
printf("%i\n", j = *(reinterpret_cast<short*>(&a)));
printf("%i\n", j = *(reinterpret_cast<int*>(&f)));
although it will not catch this additional case:
int *p;
p = &a;
printf("%i\n", j = *(reinterpret_cast<short*>(p)));
Although clang accepts these flags, it apparently does not actually implement the warning.
Another tool we have is ASan, which can catch misaligned loads and stores. Although these are not directly strict aliasing violations, they are a fairly common result of them. For example, the following cases will generate runtime errors when built with clang using -fsanitize=address:
int *x = new int[2];            // 8 bytes: [0,7].
int *u = (int*)((char*)x + 6);  // regardless of the alignment of x, this will not be an aligned address
*u = 1;                         // Access to range [6-9]
printf( "%d\n", *u );           // Access to range [6-9]
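For contrast, here is a sketch of how a value can be stored at and loaded from an odd byte offset without forming a misaligned int pointer. The helpers store_int, load_int and unaligned_round_trip are hypothetical. Unlike the snippet above, the buffer is sized so that the access stays inside the allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical helpers: store/load an int at an arbitrary byte address.
void store_int(unsigned char *dst, int value) {
    std::memcpy(dst, &value, sizeof value);
}

int load_int(const unsigned char *src) {
    int value;
    std::memcpy(&value, src, sizeof value);
    return value;
}

int unaligned_round_trip() {
    // Three ints' worth of zero-initialized bytes.
    const std::size_t size = 3 * sizeof(int);
    std::unique_ptr<unsigned char[]> buf(new unsigned char[size]());
    // An offset halfway into the second int (6 when sizeof(int) == 4).
    const std::size_t offset = sizeof(int) + sizeof(int) / 2;
    assert(offset + sizeof(int) <= size);  // the access stays in bounds
    store_int(buf.get() + offset, 1);
    return load_int(buf.get() + offset);
}
```

Since only unsigned char pointers and memcpy are involved, there is neither a misaligned int access nor an out-of-bounds one for a sanitizer to report.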
The last tool I recommend is specific to C++ and, strictly speaking, is not so much a tool as a coding practice: do not allow C-style casts. Both gcc and clang will produce a diagnostic for C-style casts using -Wold-style-cast. This forces any undefined type puns to use reinterpret_cast. In general, reinterpret_cast should be a flag for closer code review. It is also easier to search your code base for reinterpret_cast to perform an audit.
For C, we have all the tools already covered, and we also have tis-interpreter, a static analyzer that exhaustively analyzes a program for a large subset of the C language. Consider the C versions of the earlier example, where using -fstrict-aliasing misses one case:
int a = 1;
short j;
float f = 1.0;
printf("%i\n", j = *((short*)&a));
printf("%i\n", j = *((int*)&f));
int *p;
p = &a;
printf("%i\n", j = *((short*)p));
tis-interpreter is able to catch all three. The following example invokes tis-kernel as tis-interpreter (the output is edited for brevity):
./bin/tis-kernel -sa example1.c
...
example1.c:9:[sa] warning: The pointer (short *)(& a) has type short *. It violates strict aliasing
rules by accessing a cell with effective type int.
...
example1.c:10:[sa] warning: The pointer (int *)(& f) has type int *. It violates strict aliasing rules by
accessing a cell with effective type float.
Callstack: main
...
example1.c:15:[sa] warning: The pointer (short *)p has type short *. It violates strict aliasing rules by
accessing a cell with effective type int.
And finally there is TySan, which is currently under development. This sanitizer adds type checking information to a shadow memory segment and checks accesses to see whether they violate aliasing rules. The tool should potentially be able to catch all aliasing violations, but it may have a large runtime overhead.
Conclusion
We have learned about the aliasing rules in C and C++: the compiler expects us to follow these rules strictly, and we must accept the consequences if we do not. We have learned about some tools that can help us catch some misuses of aliasing. We have seen that a common use of aliasing is type punning, and we have also learned how to type pun correctly.
Optimizers are gradually getting better at type-based alias analysis and are already breaking some code that relies on strict aliasing violations. We can expect the optimizations to keep improving and to break even more code that used to just work.
We have standard-conforming methods available for type punning. In optimized builds, and sometimes even in debug builds, these methods should be cost-free abstractions. We have some tools for catching strict aliasing violations, but for C++ they will catch only a small fraction of the cases, while for C with tis-interpreter we should be able to catch most violations.
Thanks to those who commented on this article: JF Bastien, Christopher Di Bella, Pascal Cuoq, Matt P. Dziubinski, Patrice Roy and Ólafur Waage.
Of course, in the end, all errors are the author's.
So the translation of this rather large article has come to an end; the first part can be read here. As is our tradition, we invite you to the open day, which will be held on March 14 by Dmitry Shebordaev, head of the technology development department at Rambler & Co.