Risks and problems of password hashing

Original author: Miguel Ibarra Romero
Security has always been a controversial topic that provokes heated debate, largely because there are so many points of view and "ideal solutions" that suit some people and are completely unsuitable for others. I believe that breaking an application's security is only a matter of time: computing power keeps growing and systems keep getting more complex, so an application that is secure today will not stay secure tomorrow.

Translator's note: for a more complete picture, this text also includes a translation of Hashing Passwords with the PHP 5.5 Password Hashing API, which the author refers to in the article.

If you have not studied hashing algorithms, you probably think of them as one-way functions that convert variable-length data into fixed-length data. Let us analyze this definition:
  • One-way function: no efficient algorithm can recover the original data from the hash.
  • Converting variable-length data to fixed-length data: the input can be arbitrarily ("infinitely") long, but the output cannot. It follows that two or more inputs can share the same hash. The shorter the hash, the higher the chance of a collision.
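
The second property is easy to observe directly. The following is a small sketch (the sample inputs are arbitrary) showing that PHP's hash() returns a digest of the same length regardless of how long the input is:

```php
<?php
// Inputs of very different lengths, all hashed with SHA-256.
$inputs = ['a', 'password', str_repeat('x', 100000)];

foreach ($inputs as $input) {
    $digest = hash('sha256', $input); // hex-encoded digest
    printf("%6d bytes in -> %d hex characters out\n", strlen($input), strlen($digest));
}

// The same input always yields the same digest.
var_dump(hash('sha256', 'password') === hash('sha256', 'password')); // bool(true)
```

Every line of output reports 64 hex characters, which is the 256-bit digest size of SHA-256.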

MD5 and SHA-1 no longer offer adequate collision resistance (see the birthday paradox). It is therefore recommended to use algorithms that produce longer hashes (SHA-256, SHA-512, Whirlpool, etc.), which makes the probability of a collision negligible. Such algorithms are also called "pseudo-random functions": their output is indistinguishable from the output of a true random number generator (TRNG).

The disadvantages of simple hashing


The fact that no efficient algorithm can invert a hash and recover the original data does not mean you cannot be hacked. With a little searching you can find databases of hashes for common words and short phrases. In addition, simple passwords can be cracked quickly by brute force or with a dictionary attack.

Here is a small demonstration of how the sqlmap tool, working through an SQL injection, cracks passwords by brute-forcing their MD5 hashes.



Attackers can take an even easier route and simply google specific hashes in online databases.

You also need to understand that identical passwords produce identical hashes, so cracking one hash gives access to every account that uses the same password. For example, suppose we have several thousand users; some of them are bound to use the password 123456 (unless the site enforces a password complexity policy). The MD5 hash of this password is e10adc3949ba59abbe56e057f20f883e. If you obtain this hash and search the database for it, you will find every user with this password.
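
A minimal sketch of that lookup, using a made-up in-memory user table in place of a real database (the user names and the second password are purely illustrative):

```php
<?php
// Hypothetical user table: username => stored, unsalted MD5 hash.
$users = [
    'alice' => md5('123456'),
    'bob'   => md5('correct horse'),
    'carol' => md5('123456'),
];

// A hash the attacker has already cracked (MD5 of "123456").
$crackedHash = 'e10adc3949ba59abbe56e057f20f883e';

// Every account with the same hash shares the cracked password.
$exposed = array_keys($users, $crackedHash, true);
print_r($exposed); // lists alice and carol
```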

Why salted hashes are not safe


To make attacks of this type harder, a so-called salt is used. This is a standard tool, but with modern computing power it is no longer enough, especially if the salt is short.

In general, a function using salt can be represented as follows:

f(password, salt) = hash(password + salt)

To make a brute-force attack harder, the salt should be at least 64 characters long. The problem is that the salt must be stored in the database in plain text so that the user can be authenticated later:

if (hash([entered password] + [salt]) == [hash]) then the user is authenticated

Because the salt is unique for each user, it solves the problem of identical simple hashes: all hashes are now different. Googling a hash, or brute-forcing it without knowing the salt, no longer works either. But if an attacker gets hold of the salts or the database through SQL injection, they can still mount a brute-force or dictionary attack successfully, especially if users choose common passwords (like 123456).

Even so, cracking one password no longer automatically reveals every user with the same password, because ALL the hashes are different.
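
The scheme f(password, salt) = hash(password + salt) could be sketched as follows. The helper names makeSalt(), saltedHash() and checkPassword() are hypothetical, SHA-256 is an arbitrary choice, and the sketch assumes PHP 7+ for random_bytes() (hash_equals() needs PHP 5.6+):

```php
<?php
// Hypothetical helpers for illustration; none of these names come from the article.
function makeSalt($bytes = 32)
{
    return bin2hex(random_bytes($bytes)); // 32 random bytes -> 64 hex characters
}

function saltedHash($password, $salt)
{
    return hash('sha256', $password . $salt);
}

function checkPassword($entered, $salt, $storedHash)
{
    // hash_equals() compares the two strings in constant time.
    return hash_equals($storedHash, saltedHash($entered, $salt));
}

// Two users who picked the same password get different salts...
$saltA = makeSalt();
$saltB = makeSalt();
$hashA = saltedHash('123456', $saltA);
$hashB = saltedHash('123456', $saltB);

// ...and therefore different stored hashes.
var_dump($hashA === $hashB);
var_dump(checkPassword('123456', $saltA, $hashA)); // bool(true)
```

Note that the salt and the hash sit side by side in storage, which is exactly why a leaked database still allows a per-user dictionary attack.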

A moment of randomness


To generate a good salt, we need a good random number generator. Forget about the rand() function right away.

There is a wonderful article devoted to this issue. In short: a computer does not generate random data by itself, because it is a deterministic machine. That is, any algorithm given the same input several times will produce the same output every time.

When a computer is asked for a random number, it usually takes data from several sources (for example, environment variables: date, time, number of bytes written/read, and so on) and then performs calculations on them to obtain "random" data. That is why such data is called pseudo-random. So if we somehow recreate the initial state that existed when the pseudo-random function was run, we can generate the same number.
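
This determinism can be reproduced in a few lines. In the sketch below the seed value 42 and the helper name drawSequence() are arbitrary; the point is only that re-creating the initial state re-creates the "random" output:

```php
<?php
// Seed the Mersenne Twister with a known value and draw a few numbers.
function drawSequence($seed, $count = 5)
{
    mt_srand($seed);
    $numbers = [];
    for ($i = 0; $i < $count; $i++) {
        $numbers[] = mt_rand();
    }
    return $numbers;
}

$first  = drawSequence(42);
$second = drawSequence(42);

var_dump($first === $second); // bool(true): same seed, same "random" numbers
```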

If the pseudo-random generator is also implemented poorly, patterns can be found in its output and used to predict what it will generate. Take a look at this picture, produced with the PHP function rand():

image

Now compare it with data produced by a true random number generator:



Unfortunately, neither rand() nor mt_rand() can be considered suitable for security-sensitive purposes.

If you need random data, use the openssl_random_pseudo_bytes() function, available since PHP 5.3.0. It even sets a crypto_strong flag that tells you whether the result is cryptographically strong.

Usage example:
<?php
function getRandomBytes($byteLength)
{
    /*
     * Check whether openssl_random_pseudo_bytes is available
     */
    if (function_exists('openssl_random_pseudo_bytes')) {
        $randomBytes = openssl_random_pseudo_bytes($byteLength, $cryptoStrong);
        if ($cryptoStrong) {
            return $randomBytes;
        }
    }
    /*
     * If openssl_random_pseudo_bytes is unavailable or its result is too
     * weak, fall back to a less secure generator
     */
    $hash = '';
    $randomBytes = '';
    /*
     * On Linux/UNIX systems /dev/urandom is an excellent source of entropy;
     * use it to obtain the initial value of $hash
     */
    if (file_exists('/dev/urandom')) {
        $fp = fopen('/dev/urandom', 'rb');
        if ($fp) {
            if (function_exists('stream_set_read_buffer')) {
                stream_set_read_buffer($fp, 0);
            }
            $hash = fread($fp, $byteLength);
            fclose($fp);
        }
    }
    /*
     * Use the less secure mt_rand() function, but never rand()!
     */
    for ($i = 0; $i < $byteLength; $i++) {
        $hash = hash('sha256', $hash . mt_rand());
        // Pick two adjacent hex digits of the 64-character digest as one byte
        $char = mt_rand(0, 62);
        $randomBytes .= chr(hexdec($hash[$char] . $hash[$char + 1]));
    }
    return $randomBytes;
}

Password stretching


You can also implement password stretching, which makes brute-force attacks even harder. Stretching is an iterative (or recursive) algorithm that hashes its own output over and over again, tens of thousands of times or even more.



The number of iterations should be chosen so that the whole computation takes at least one second. The longer hashing takes, the more time an attacker has to spend on cracking.

To crack a stretched password, an attacker needs to:
  1. know the exact number of iterations, since any deviation produces a different hash;
  2. wait at least a second between attempts.

This makes an attack very unlikely... but not impossible. To get around the one-second delay, the attacker has to use a more powerful computer than the one the hashing algorithm was tuned for. Consequently, cracking may require additional expense.
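
The idea of stretching can be illustrated with a deliberately naive loop. This is only a sketch: the helper name stretchedHash(), the default of 20000 rounds and the way the salt is mixed in are all assumptions, and a standard construction should be preferred in real code:

```php
<?php
// Naive stretching, for illustration only.
function stretchedHash($password, $salt, $rounds = 20000)
{
    $hash = hash('sha256', $salt . $password, true);
    for ($i = 1; $i < $rounds; $i++) {
        // Feed the previous digest back in together with the salt and password.
        $hash = hash('sha256', $hash . $salt . $password, true);
    }
    return bin2hex($hash);
}

// Measure how long one hash takes on this machine.
$start = microtime(true);
$hash  = stretchedHash('123456', 'per-user-salt');
printf("%s computed in %.3f s\n", $hash, microtime(true) - $start);
```

Changing the round count by even one produces a completely different result, and raising it increases the cost of every single guess.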

To stretch a password you can use standard algorithms such as PBKDF2, which is a key derivation function:
<?php
/*
 * The number of iterations can be increased to keep up with future growth
 * in CPU/GPU performance. Use a different salt for each password
 * (it can be stored together with the generated hash). This function
 * is slow on purpose! More information:
 * - http://ru.wikipedia.org/wiki/PBKDF2
 * - http://www.ietf.org/rfc/rfc2898.txt
 */
function pbkdf2($password, $salt, $rounds = 15000, $keyLength = 32,
        $hashAlgorithm = 'sha256', $start = 0)
{
    // Number of key blocks needed to cover $start + $keyLength bytes
    $hashLength = strlen(hash($hashAlgorithm, '', true));
    $keyBlocks = (int) ceil(($start + $keyLength) / $hashLength);
    // Derived key
    $derivedKey = '';
    // Create key
    for ($block = 1; $block <= $keyBlocks; $block++) {
        // Initial hash for this block
        $iteratedBlock = $hash = hash_hmac($hashAlgorithm,
                $salt . pack('N', $block), $password, true);
        // Perform block iterations
        for ($i = 1; $i < $rounds; $i++) {
            // XOR each iteration
            $iteratedBlock ^= ($hash = hash_hmac($hashAlgorithm, $hash,
                    $password, true));
        }
        // Append iterated block
        $derivedKey .= $iteratedBlock;
    }
    // Return derived key of correct length
    return base64_encode(substr($derivedKey, $start, $keyLength));
}
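
As an alternative to a hand-written implementation, PHP 5.5 and later ship a built-in hash_pbkdf2() function. The sketch below first reproduces a published RFC 6070 test vector and then shows a more realistic call; the 16-byte salt and the 15000 iterations are illustrative values, and random_bytes() assumes PHP 7+:

```php
<?php
// With raw output disabled (the default), the length argument counts hex
// characters, so 40 yields a 20-byte key.
$key = hash_pbkdf2('sha1', 'password', 'salt', 1, 40);
echo $key, "\n"; // 0c60c80f961f0e71f3a9b524af6012062fe037a6 (RFC 6070, test vector 1)

// A more realistic call: SHA-256, random per-user salt, many iterations,
// 32 raw bytes of output.
$salt    = random_bytes(16);
$derived = hash_pbkdf2('sha256', 'password', $salt, 15000, 32, true);
echo strlen($derived), "\n"; // 32
```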

There are also more time-consuming and memory-hungry algorithms, for example bcrypt (discussed below) or scrypt:
<?php
// bcrypt is built into the crypt() function
$hash = crypt($password, '$2a$' . $cost . '$' . $salt);

  • $cost - the cost (work) factor;
  • $salt - a random string. It can be generated, for example, with the getRandomBytes() function described above, encoded into the 22-character ./0-9A-Za-z alphabet that bcrypt expects.

The cost factor depends entirely on the machine that performs the hashing. You can start with a value of 09 and increase it gradually until the operation takes about one second. Starting with PHP 5.5 you can use the password_hash() function, which is discussed later.
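
One way to find a suitable cost on a given machine is to benchmark it. The sketch below uses password_hash() (PHP 5.5+) rather than crypt() for brevity; the helper name findCost() and the 0.25-second default budget are assumptions chosen to keep the demo fast, and the budget can be raised towards the one second suggested above:

```php
<?php
// Return the lowest bcrypt cost whose hashing time reaches the time budget.
function findCost($budget = 0.25, $cost = 9)
{
    do {
        $cost++;
        $start = microtime(true);
        password_hash('benchmark-password', PASSWORD_BCRYPT, ['cost' => $cost]);
        $elapsed = microtime(true) - $start;
    } while ($elapsed < $budget && $cost < 31); // 31 is the maximum bcrypt cost

    return $cost;
}

echo 'Suggested cost: ', findCost(), "\n";
```

The result differs from machine to machine, so it is worth re-running the benchmark whenever the hardware changes.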

PHP does not currently support scrypt natively, but you can use the implementation from Domblack.

The use of encryption technology


Many people confuse hashing with encryption. As mentioned above, a hash is the result of a pseudo-random function, whereas encryption implements a pseudo-random permutation: the input is split into blocks and processed so that the result is indistinguishable from the output of a true random number generator. In this case, however, the transformation can be reversed to recover the original data. The transformation is performed with a cryptographic key, without which it cannot be reversed.

There is another important difference between encryption and hashing: the output space is not limited in size and grows with the input in a 1:1 ratio. Therefore, there is no risk of collisions.

Great care must be taken to use encryption correctly. Do not assume that simply encrypting important data with some algorithm is enough to protect it; there are many ways to steal data. The main rule: never roll your own cryptography, and use existing, well-tested implementations.

Some time ago Adobe suffered a massive leak of its user database because of improperly implemented encryption. Let's see what happened.

image

Suppose that the following data is stored in plain text in a table:



Someone at Adobe decided to encrypt the passwords, but made two big mistakes:
  1. used the same cryptographic key for every password;
  2. left the passwordHint fields unencrypted.

Suppose that after encryption the table looks like this:

image

We do not know which key was used. But if you analyze the data, you will notice that rows 2 and 7 share the same password, as do rows 3 and 6.

Now it is time to turn to the password hints. The hint in row 6 is "I'm one!", which tells us nothing. But thanks to row 3 we can guess that the password is queen. Rows 2 and 7 taken separately do not reveal the password, but analyzed together they suggest that it is halloween.
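
The effect of a single shared key can be reproduced with any deterministic cipher mode. The sketch below uses AES-256 in ECB mode through the OpenSSL extension purely for illustration (the article does not say which cipher was used); the key value and the helper name encryptPassword() are made up:

```php
<?php
// One shared key for every row, as in the scenario above (the key value is made up).
$sharedKey = hash('sha256', 'one key for everybody', true);

function encryptPassword($password, $key)
{
    // ECB mode takes no IV and is deterministic: same input, same output.
    return base64_encode(
        openssl_encrypt($password, 'aes-256-ecb', $key, OPENSSL_RAW_DATA)
    );
}

$row2 = encryptPassword('halloween', $sharedKey);
$row7 = encryptPassword('halloween', $sharedKey);
$row3 = encryptPassword('queen', $sharedKey);

var_dump($row2 === $row7); // bool(true): identical passwords are visible at a glance
var_dump($row2 === $row3); // bool(false)
```

Without ever learning the key, anyone reading the table can group users by password and then combine their hints.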

To reduce the risk of a data leak, it is better to switch to hashing. And if you must encrypt passwords, consider tweakable encryption.



Suppose we have thousands of users and want to encrypt all their passwords. As shown above, a single shared key is best avoided. But we cannot create a unique key for every user either, because storing the keys would become a problem in itself. In this case it is enough to use one common key for everyone and add a "tweak" that is unique to each user. The combination of the key and the tweak then serves as a unique key for each user.

The simplest tweak is the primary key, which is unique for every row in the table. It is not recommended in practice and is shown here only as an example:

f(key, primaryKey) = key + primaryKey

Here the key and the primary key are simply concatenated. To be secure, however, a hash algorithm or a key derivation function should be applied to them. Instead of the primary key you can also use a nonce (the analogue of a salt) for each record.
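
A minimal sketch of this idea, assuming HMAC-SHA256 as the function that combines the shared key with the per-record tweak and AES-256-ECB (via the OpenSSL extension) as the cipher. The helper names and the key value are hypothetical, and this is an illustration of the principle rather than a vetted scheme:

```php
<?php
// Derive a per-record key from the shared key and a per-record tweak.
function recordKey($sharedKey, $tweak)
{
    return hash_hmac('sha256', (string) $tweak, $sharedKey, true); // 32 raw bytes
}

function encryptForRecord($password, $sharedKey, $tweak)
{
    $key = recordKey($sharedKey, $tweak);
    return base64_encode(openssl_encrypt($password, 'aes-256-ecb', $key, OPENSSL_RAW_DATA));
}

function decryptForRecord($ciphertext, $sharedKey, $tweak)
{
    $key = recordKey($sharedKey, $tweak);
    return openssl_decrypt(base64_decode($ciphertext), 'aes-256-ecb', $key, OPENSSL_RAW_DATA);
}

$sharedKey = hash('sha256', 'one key for everybody', true);

// The same password stored in rows 2 and 7, with the primary key as the tweak.
$row2 = encryptForRecord('halloween', $sharedKey, 2);
$row7 = encryptForRecord('halloween', $sharedKey, 7);

var_dump($row2 === $row7);                        // bool(false): equal passwords no longer match
var_dump(decryptForRecord($row7, $sharedKey, 7)); // string(9) "halloween"
```

Only one secret has to be stored, yet each row is effectively encrypted under its own key.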

If we apply tweakable encryption to our table, it will look like this:

image

Of course, something still has to be done about the password hints, but this is already a reasonable result.

Please note that encryption is not an ideal solution for storing passwords. Because of code injection threats, it is best to avoid this method. The most reliable way to store passwords is the bcrypt algorithm. But remember that even the best and most proven solutions have vulnerabilities.

PHP 5.5


Today, bcrypt is considered the best way to hash passwords. Yet many developers still prefer older and weaker algorithms such as MD5 and SHA-1, and some do not even use a salt. PHP 5.5 introduced a new hashing API that not only encourages the use of bcrypt but also makes it much easier to work with. Let's look at the basics of this new API.

It consists of four simple functions:
  • password_hash() - hashes a password;
  • password_verify() - checks a password against a hash;
  • password_needs_rehash() - tells whether a hash needs to be regenerated;
  • password_get_info() - returns the name of the hashing algorithm and the options used during hashing.

password_hash()


Although the crypt() function provides a high level of security, many consider it too complicated, which is why programmers often make mistakes with it. Instead, some developers generate hashes with a combination of weak algorithms and weak salts:
<?php
$hash = md5($password . $salt); // works, but the level of security is low

The password_hash() function makes the developer's life easier and makes the code more secure. To hash a password, just pass it to the function, which returns a hash that can be stored in the database:
<?php
$hash = password_hash($password, PASSWORD_DEFAULT);

And that's it! The first argument is the password as a string; the second selects the hashing algorithm. The default is bcrypt, but a stronger algorithm that produces longer strings may become the default in the future. If you use PASSWORD_DEFAULT in your project, make sure the column that stores the hashes is at least 60 characters wide; it is best to set it to 255 right away. You can also pass PASSWORD_BCRYPT as the second argument, in which case the hash will always be 60 characters long.

Note that you do not need to supply a salt or a cost parameter; the new API does everything for you. Since the salt is part of the hash, you do not have to store it separately. If you still need to set your own salt (or cost), you can do so with the third argument (the salt option was deprecated in PHP 7.0 and is ignored since PHP 8.0):
<?php
$options = [
    'salt' => custom_function_for_salt(), // write your own salt-generating code
    'cost' => 12 // the default cost is 10
];
$hash = password_hash($password, PASSWORD_DEFAULT, $options);

All this lets you use the latest security features. If PHP gains a stronger hashing algorithm in the future, your code will use it automatically.

password_verify()


Now consider the function that checks a password against a hash. The password is entered by the user, and the hash comes from the database. Both are passed as arguments to password_verify(). If the hash matches the password, the function returns true.
<?php
if (password_verify($password, $hash)) {
    // Success!
} else {
    // Invalid credentials
}

Remember that the salt is part of the hash, so it is not passed separately here.

password_needs_rehash()


If you want to raise the level of security by using a stronger salt or a higher cost parameter, or if the default hashing algorithm changes, you will probably want to rehash all existing passwords. This function checks which algorithm and options were used to create a given hash:
<?php
if (password_needs_rehash($hash, PASSWORD_DEFAULT, ['cost' => 12])) {
    // The password needs rehashing because the hash was not created with the
    // current default algorithm or its cost parameter was not 12
    $hash = password_hash($password, PASSWORD_DEFAULT, ['cost' => 12]);
    // Don't forget to store the new hash!
}

Keep in mind that this has to be done when the user logs in, because that is the only time you have access to the plain-text password.
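
Put together, a login routine might look like the sketch below. The in-memory $userTable array stands in for a real database, the function name login() is hypothetical, and PASSWORD_BCRYPT with explicit cost values is used so that the outcome does not depend on the PHP version's defaults:

```php
<?php
// Hypothetical storage: in a real application these would be database calls.
// The existing hash was created with a deliberately low cost of 4.
$userTable = ['alice' => password_hash('s3cret', PASSWORD_BCRYPT, ['cost' => 4])];

function login($username, $password, array &$userTable, array $options = ['cost' => 10])
{
    if (!isset($userTable[$username])) {
        return false;
    }
    $hash = $userTable[$username];

    if (!password_verify($password, $hash)) {
        return false;
    }

    // The plain-text password is available only here, so upgrade the hash now.
    if (password_needs_rehash($hash, PASSWORD_BCRYPT, $options)) {
        $userTable[$username] = password_hash($password, PASSWORD_BCRYPT, $options);
    }
    return true;
}

var_dump(login('alice', 's3cret', $userTable));                      // bool(true)
var_dump(password_get_info($userTable['alice'])['options']['cost']); // int(10)
```

After the first successful login the stored hash has been upgraded from cost 4 to cost 10 without the user noticing anything.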

password_get_info()


This function takes a hash and returns an associative array of three elements:
  • algo - a constant that identifies the algorithm;
  • algoName - the name of the algorithm used;
  • options - the values of the options used during hashing.
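
For a bcrypt hash, the returned array can be inspected like this (a short sketch; the password and the cost of 11 are arbitrary example values):

```php
<?php
$hash = password_hash('s3cret', PASSWORD_BCRYPT, ['cost' => 11]);
$info = password_get_info($hash);

echo $info['algoName'], "\n";        // bcrypt
echo $info['options']['cost'], "\n"; // 11
// $info['algo'] identifies the algorithm; its type differs between PHP versions
// (an integer constant before PHP 7.4, a string such as "2y" from 7.4 on).
```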

Earlier versions of PHP


As you can see, the new API is far easier to work with than the awkward crypt() function. If you use an earlier version of PHP, I recommend taking a look at the password_compat library. It emulates this API and disables itself automatically when you upgrade to PHP 5.5.

Conclusion


Unfortunately, there is still no ideal solution for protecting data, and there is always a risk that your security will be broken. However, the contest between projectile and armor goes on. For example, our defensive arsenal has recently been extended with so-called sponge functions.
