Secure crypto programming. Part 2 (final)
We continue our translation of Jean-Philippe Aumasson's set of rules for secure cryptographic programming.
Prevent the compiler from interfering with security-critical code
Problem
Some compilers optimize away operations that they consider useless.
For example, MS Visual C++ considered the memset call useless and removed it from the following fragment of the Tor anonymity network implementation:
int
crypto_pk_private_sign_digest(...)
{
char digest[DIGEST_LEN];
(...)
memset(digest, 0, sizeof(digest));
return r;
}
However, the purpose of this memset call is to wipe the confidential data from the digest buffer, so that a later read of uninitialized stack memory cannot recover the sensitive information.
Some compilers also remove conditional checks that they consider redundant, on the assumption that the code elsewhere would otherwise be wrong. For example, having seen the following code snippet,
call_fn(ptr); // always dereferences ptr
// many, many lines later
if (ptr == NULL) { error("ptr must not be NULL"); }
some compilers will decide that the condition ptr == NULL must always be false, since otherwise dereferencing ptr in call_fn() would be invalid, and will remove the check.
Solution
Analyze the compiled code and make sure that all the instructions are present. (This is infeasible for an application of ordinary size, but it should be done for security-critical pieces of code.)
Know which optimizations your compiler can perform and carefully assess the effect of each of them on secure programming practices. In particular, be careful with optimizations that remove code fragments or branches, including checks that guard against errors that "cannot happen" if the rest of the program is correct.
Where possible, consider disabling at compile time the optimizations that remove or weaken security-relevant checks.
To prevent an instruction from being removed by the optimizer, the function can be called through a pointer declared with the volatile keyword. libottery, for example, uses this to wrap memset:
void * (*volatile memset_volatile)(void *, int, size_t) = memset;
C11 introduced memset_s, calls to which may not be removed during optimization:
#define __STDC_WANT_LIB_EXT1__ 1
#include <string.h>
...
memset_s(secret, sizeof(secret), 0, sizeof(secret));
Avoid confusing secure and insecure programming interfaces
Problem
Many programming environments provide different implementations of the same software interface: outwardly the functionality is identical, but the security properties differ fundamentally.
This problem is typical of random number generators: OpenSSL has RAND_bytes() and RAND_pseudo_bytes(), the BSD C library has arc4random() and random(), and Java has SecureRandom and Random.
Another example: systems that provide constant-time comparison functions for byte strings often also ship variants that leak timing information.
Bad solutions
Sometimes a function is safe on some platforms and dangerous on others. In these cases, programmers use the function, believing that their code will only run on platforms where it is safe. This is a bad approach, because the code may later be ported to other platforms and silently become unsafe without anyone noticing.
On systems that allow platform-specific function overrides, some programmers redefine unsafe functions to be safe and then write code against the generally unsafe interface. This is a rather questionable approach: it produces code that looks unsafe; if the override ever fails to apply, the program becomes unsafe in a way that is hard to detect; and fragments of such code become unsafe when copied into other projects.
Solution
Whenever possible, do not provide unsafe variants alongside safe functions. For example, a userspace PRNG based on a strong stream cipher and seeded with true random data is fast enough for most applications. A data-independent (constant-time) replacement for memcmp is likewise fast enough to be used for all memory comparisons.
If you cannot remove unsafe functions, override them so that a compile-time error is produced, or use static analysis tools to detect and warn about their use. If you can override an unsafe function with a safe variant, then for greater security never call the unsafe API directly and make sure you can detect any use of it.
If you have to keep both variants (safe and unsafe), make sure the function names differ enough that it is hard to use the unsafe one by accident. For example, if you have a safe and an unsafe RNG, do not name the unsafe one "Random", "FastRandom", "MersenneTwister" or "LCGRand"; name it, say, "InsecureRandom". Design your programming interfaces so that using an unsafe function is always a little scary.
If your platform provides an unsafe version of a function under a name that does not signal its insecurity and you cannot remove it, wrap the call under a safe name, and then use static analysis of the code to find all uses of the unsafe name.
If a function is safe on some platforms and unsafe on others, do not call it directly: define and use a safe wrapper instead.
Avoid mixing security levels and cryptographic-primitive abstractions at the same API level
Problem
When it is unclear how much care different parts of a software interface require, a programmer can easily be mistaken about which functionality is safe to use.
Consider the following example (invented, but similar to ones found in real life) of an RSA software interface:
enum rsa_padding_t { no_padding, pkcs1v15_padding, oaep_sha1_padding, pss_padding };
int do_rsa(struct rsa_key *key, int encrypt, int public, enum rsa_padding_t padding_type, uint8_t *input, uint8_t *output);
Assuming the key parameter carries the key material, this function can be called in 16 different ways, many of them meaningless and some unsafe.
encrypt | public | padding | notes |
---|---|---|---|
0 | 0 | none | Unpadded decryption. Malleable. |
0 | 0 | pkcs1v15 | PKCS1 v1.5 decryption. Possibly vulnerable to Bleichenbacher's attack. |
0 | 0 | oaep | OAEP decryption. A good option. |
0 | 0 | pss | PSS "decryption". Odd enough that it probably indicates an unintentional error. |
0 | 1 | none | Unpadded signature. Forgeable. |
0 | 1 | pkcs1v15 | PKCS1 v1.5 signature. Acceptable for some applications, but a PSS signature is preferable. |
0 | 1 | oaep | OAEP "signature". Acceptable for some applications, but a PSS signature is preferable. |
0 | 1 | pss | PSS signature. A very good option. |
... | ... | ... | remaining options (encryption and signature verification). |
Note that only 4 of the 16 possible ways to call this function are safe, another 6 are unsafe, and the remaining 6 can cause problems in some situations. Such an API is suitable only for developers who understand the implications of the various padding methods used with RSA.
Now imagine adding programming interfaces for block ciphers in various modes, key generation, various message authentication codes and signatures. A programmer trying to build a correct authenticated-encryption function from such interfaces faces an enormous number of choices, and the proportion of safe options only shrinks.
Solution
- Provide high-level programming interfaces. For example, provide functions that implement authenticated encryption using only strong algorithms, applied safely. If you write a function that offers various combinations of symmetric and asymmetric algorithms and their modes of operation, make sure it does not permit unsafe algorithms or unsafe combinations of them.
- Whenever possible, avoid low-level APIs. Most users do not need raw (unpadded) RSA, a block cipher in ECB mode, or a DSA signature with a caller-supplied random value. Such functions can serve as building blocks for something robust (for example, applying OAEP padding before a raw RSA operation, encrypting counter blocks 1, 2, 3, ... in ECB mode to implement counter mode, or supplying an unpredictable byte sequence as the DSA nonce), but in practice they are misused more often than used correctly.
Some other primitives are necessary for implementing particular protocols but are unlikely to be appropriate for new ones. For example, you cannot implement browser-compatible TLS without CBC, PKCS1 v1.5 and RC4, yet none of these primitives is a good choice for a new design.
If you provide a cryptographic module for use by inexperienced programmers, it may be best to omit such functions entirely and expose only well-documented, high-level, safe operations.
- If you must serve both experienced and inexperienced users, clearly separate the high-level and low-level interfaces. The "secure encryption" function should not be the "insecure encryption" function with slightly different arguments. In languages that group functions and types into packages or headers, safe and unsafe crypto functions should not live in the same package or header. In languages with subtyping, there should be distinct types for secure implementations.
Use unsigned types to represent binary data
Problem
Some C-like languages distinguish between signed and unsigned integer types. In C in particular, whether char is signed is implementation-defined. This can lead to problematic code such as the following:
int decrypt_data(const char *key, char *bytes, size_t len);
void fn(...) {
//...
char *name;
char buf[257];
decrypt_data(key, buf, 257);
int name_len = buf[0];
name = malloc(name_len + 1);
memcpy(name, buf+1, name_len);
name[name_len] = 0;
//...
}
If char is unsigned, this code behaves as we expect. But if char is signed, buf[0] can take negative values, leading to very large arguments to malloc and memcpy and possible heap corruption when we set the final byte to 0. Worst of all, if buf[0] is 255, name_len becomes -1: we then allocate a buffer of 0 bytes and memcpy (size_t)-1 bytes into it, trashing the heap.
Solution
In languages that distinguish signed and unsigned byte types, implementations should use unsigned types (for example, uint8_t) to represent byte strings in their APIs.
Clear memory of sensitive data
Problem
On most operating systems, memory used by one process can be reused by another process without being cleared first, either because the first process terminated or because it returned the memory to the system. If that memory contains secret keys, they become available to the other process, increasing the risk of compromise. On multi-user systems this makes it possible to recover keys belonging to other users. Even within a single system, it can turn previously "relatively safe" vulnerabilities into leaks of sensitive data.
Solution
Clear every variable containing sensitive data before it goes out of scope or stops being used. When working with mmap(), remember that munmap() releases the memory immediately and you lose all control over it.
To clear memory or destroy objects that are about to leave scope, use platform-specific cleansing functions where available, such as SecureZeroMemory() on Win32 or OPENSSL_cleanse() in OpenSSL.
A more or less portable solution for C could be:
void burn( void *v, size_t n )
{
volatile unsigned char *p = ( volatile unsigned char * )v;
while( n-- ) *p++ = 0;
}
Use strong randomness
Problem
Many cryptographic systems require a source of randomness, and such systems can become insecure even with small deviations from randomness in that source. For example, the leak of even one DSA nonce leads to extremely fast recovery of the secret key. Insufficient randomness can also be hard to detect: the Debian bug in OpenSSL's random number generator went unnoticed for two years and led to the compromise of a large number of keys. The requirements that cryptographic applications place on random numbers are very strict; many pseudo-random number generators fail to satisfy them.
Bad solutions
For cryptographic applications:
- Do not rely on predictable sources of "randomness" such as timestamps, process identifiers, temperature sensors, etc.
- Do not rely on general-purpose pseudo-random functions such as rand(), srand() and random() from the C standard library, or Python's random module.
- Do not use the Mersenne Twister generator.
- Do not use services like www.random.org (the random data may become known to third parties, or may even be used by them).
- Do not write your own random number generator, even one based on a strong crypto primitive (unless you know exactly what you are doing).
- Do not reuse the same random bits in different places of an application in order to "economize" on randomness.
- Do not conclude that a generator is cryptographically strong merely because it passes the Diehard or NIST statistical tests.
- Do not assume that a cryptographically strong generator necessarily provides forward or backward secrecy (resistance to state compromise).
- Never use raw analog "randomness" directly as random data: analog sources are usually biased, so N bits read from such a source contain fewer than N bits of entropy.
Solution
Minimize the need for randomness through the choice and design of primitives (for example, Ed25519 derives its signature nonces deterministically). To generate random numbers, use sources provided by the operating system and guaranteed to satisfy cryptographic requirements, such as /dev/random. On resource-constrained platforms, consider analog noise sources combined with a sound conditioning (mixing) procedure.
Always check the values returned by your generator to make sure the requested bytes were actually produced and written where they should be.
Follow the recommendations of Nadia Heninger et al. in Section 7 of their article.
On Intel processors with the Ivy Bridge architecture (and later generations), the built-in RDRAND generator provides high entropy and speed.
Unix systems typically use /dev/random or /dev/urandom. The former blocks: it returns no data if it believes that not enough randomness has accumulated. This property limits its usability, so /dev/urandom is used more often. Using /dev/urandom is simple enough:
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
int main() {
int randint;
int bytes_read;
int fd = open("/dev/urandom", O_RDONLY);
if (fd != -1) {
bytes_read = read(fd, &randint, sizeof(randint));
if (bytes_read != sizeof(randint)) {
fprintf(stderr, "read() failed (%d bytes read)\n", bytes_read);
return -1;
}
}
else {
fprintf(stderr, "open() failed\n");
return -2;
}
printf("%08x\n", randint); /* assumes sizeof(int) <= 4 */
close(fd);
return 0;
}
However, this simple program may not be enough for safely generating randomness: it is safer to perform additional error checks, as in the getentropy_urandom function from LibreSSL:
static int
getentropy_urandom(void *buf, size_t len)
{
struct stat st;
size_t i;
int fd, cnt, flags;
int save_errno = errno;
start:
flags = O_RDONLY;
#ifdef O_NOFOLLOW
flags |= O_NOFOLLOW;
#endif
#ifdef O_CLOEXEC
flags |= O_CLOEXEC;
#endif
fd = open("/dev/urandom", flags, 0);
if (fd == -1) {
if (errno == EINTR)
goto start;
goto nodevrandom;
}
#ifndef O_CLOEXEC
fcntl(fd, F_SETFD, fcntl(fd, F_GETFD) | FD_CLOEXEC);
#endif
/* Lightly verify that the device node looks sane */
if (fstat(fd, &st) == -1 || !S_ISCHR(st.st_mode)) {
close(fd);
goto nodevrandom;
}
if (ioctl(fd, RNDGETENTCNT, &cnt) == -1) {
close(fd);
goto nodevrandom;
}
for (i = 0; i < len; ) {
size_t wanted = len - i;
ssize_t ret = read(fd, (char *)buf + i, wanted);
if (ret == -1) {
if (errno == EAGAIN || errno == EINTR)
continue;
close(fd);
goto nodevrandom;
}
i += ret;
}
close(fd);
if (gotdata(buf, len) == 0) {
errno = save_errno;
return 0; /* satisfied */
}
nodevrandom:
errno = EIO;
return -1;
}
On Windows, the Win32 API function CryptGenRandom generates pseudo-random bytes suitable for use in cryptography. Microsoft suggests the following usage:
#include <windows.h>
#include <wincrypt.h>
#include <stddef.h>
#pragma comment(lib, "advapi32.lib")
int randombytes(unsigned char *out, size_t outlen)
{
static HCRYPTPROV handle = 0; /* only freed when program ends */
if(!handle) {
if(!CryptAcquireContext(&handle, 0, 0, PROV_RSA_FULL,
CRYPT_VERIFYCONTEXT | CRYPT_SILENT)) {
return -1;
}
}
while(outlen > 0) {
const DWORD len = outlen > 1048576UL ? 1048576UL : outlen;
if(!CryptGenRandom(handle, len, out)) {
return -2;
}
out += len;
outlen -= len;
}
return 0;
}
If you target Windows XP or later, the CryptoAPI code above can be replaced with RtlGenRandom:
#include <windows.h>
#include <stdio.h>
#include <stdint.h>
#define RtlGenRandom SystemFunction036
#if defined(__cplusplus)
extern "C"
#endif
BOOLEAN NTAPI RtlGenRandom(PVOID RandomBuffer, ULONG RandomBufferLength);
#pragma comment(lib, "advapi32.lib")
int main()
{
uint8_t buffer[32] = { 0 };
if (FALSE == RtlGenRandom(buffer, sizeof buffer))
return -1;
for (size_t i = 0; i < sizeof buffer; ++i)
printf("%02X ", buffer[i]);
printf("\n");
return 0;
}