The idea of ​​generating and remembering strong passwords

Published on August 17, 2014

The idea of ​​generating and remembering strong passwords

    I think everyone knows about the importance of using complex, non-matching, different, periodically changed passwords, as well as about the problems with remembering them. In principle, there is a relatively good solution to this problem - programs that store the password database in encrypted form. I want to share an alternative solution that has some advantages over such password memos, in particular, it does not require access to a file with an encrypted password database. The main idea is to remember one very strong master password, and generate passwords for individual accounts from it using cryptographic functions. To whom details are interesting - I ask under a cat.

    Password Requirements

    First, you need to understand what requirements a good password should meet:
    1. Must be hard to pick up. It should be long enough and contain characters of different types (large / small letters, numbers, special characters), so that it would not be possible to select it by exhaustive search. Must be unselected by dictionary, dictionary with combinations and mutations.
    2. It must be unique for each account, so that compromising the password from one account will not lead to unauthorized access to other accounts.
    3. Must change periodically.

    At first, I thought what could be done like this: remember one random sequence of characters and for each site dock a string specific to that site for it. For example, take every third character of the site address. Password for Habr (input to id . Tm t m. R u) will look like this: “Q9y: y>”, where “Q9y: y> 1W” is the string used on all sites, and “.tr” is an additive specific to In fact, there are two secret components: the generation algorithm itself and a random string. To get a password, you need to know both. The password is quite resistant to busting, you can periodically change the random drain - requirements (1) and (3) are met. The problem is that both components can be relatively easy to guess, knowing the passwords from a pair of sites.

    General algorithm

    The next step is the use of irreversible cryptographic functions. The secret here will be only a random line - the master password.
    The initial data for generating the password will be:
    • The master password is the secret line of long-term use, it must be complex enough to withstand brute force on any cluster ;-)
    • Site - site address / computer name - where to log in
    • Password version number for this site - needed if you need to change the password for this particular site.
    • Login - it can come in handy if there are several accounts on the same site / computer (so that passwords from different accounts will not match). Also, the login works like a salt to protect against enumeration according to rainbow tables.
    About tables
    Имеется ввиду атака, когда нарушитель каким-либо образом (фишинг, соц. инженерия, уязвимость сайта,...) может получать пароли от какого-либо популярного сайта. В таком случае для получения мастер-паролей тех пользователей, которые пользуются таким генератором паролей, но не заполняют поле логин, он может сгенерировать радужные таблицы и перебирать по ним. Конечно, такая атака имеет смысл только при широком распространении использования этого генератора паролей.

    This data is fed to the input of the hash function, the output is considered as a long integer, encoded in a 95 number system, we get a pseudo-random string 25 characters long for a 160-bit hash function. It’s better to fold the leading character, because it can take far from all values ​​(more precisely, only 5 values). We get 24 pseudo-random simovol.
    More precisely about the distribution of characters
    На самом деле распределение вероятностей по возможным значениям символов там будет не совсем равномерным за счёт того, что 2^160 не является степенью 95. Получится, что часть символов будет появляться с бОльшей вероятностью, чем другие. Но эта неравномерность будет очень быстро (экспоненциально) убывать от старших разрядов к младшим. Так (в предположении что хэш-функция идеальная и все значения её выхода равновероятны) на 24ой (старшей) позиции символы будут появляться с вероятностями 1.158011% и 1.051510%. На 23-ей позиции — с вероятностями 1.053724%, 1.051753% и 1.051510%. На 22-ой позиции — с вероятностями 1.052652%, 1.052639% и 1.052629%. Дальше разница будет ещё меньше. Для паролей такие отклонения от равномерного распределения значения практически не имеют.

    Concrete implementation

    Here lies a simple console program in Python3, which thus generates passwords.
    You can specify one of the options:
    -m - generate many passwords from one master password (in order not to enter the master password several times);
    -r - generate a random password (the source of randomness is the system generators of random sequences and a line entered from the keyboard);
    without parameters - generate one password and exit.
    Immediately after entering the master password, the program displays a 2-byte hash from the master password. This is necessary to control the correctness of the input. If the hash seems unfamiliar, then the master password is entered incorrectly.
    This program uses 2 different hash functions (ripemd160 and sha512) and a slightly more confused order of feeding it with the original data than just string concatenation. Using two functions — a move for paranoid ones — allows (as far as I see) the algorithm’s cryptographic stability even when a very serious vulnerability is detected in one of them (it’s clear that this is almost unbelievable, but the overhead for such “paranoia” is also minimal).
    Exact algorithm
    SitePwd = ripemd160(ver2str(version) + sha512_hex(ver2str(version) + master_password + site + login) + site)
    Все текстовые строки кодируются UTF-8; ' + ' означает конкатенацию строк.
    Функция ver2str(n) возвращает строку, состоящую из записанных подряд n+1 последовательных чисел от n до 0, например:
    ver2str(0) = '0'
    ver2str(4) = '43210'
    ver2str(12) = '1211109876543210'

    SitePwd дальше преобразуется в длинное целое число, которое записывается в 95-ричной системе счисления. В качестве 95-ричных цифр используются символы из массива chars (всего 95 символов):
    chars = "`1234567890-=\~!@#$%^&*()_+|qwertyuiop[]QWERTYUIOP{}asdfghjkl;'ASDFGHJKL:"zxcvbnm,./ZXCVBNM<>? "
    (Без внешних кавычек, последний символ — пробел)
    24 младших разряда полученной записи, выведенные в порядке слева на право от младших к старшим, и являются сгенерированным паролем.

    Practically problems using such passwords

    Some services have restrictions on the character set that can be included in the password. If forbidden characters are present in the generated password, you have to skip them. Accordingly, with subsequent use of the service, you have to remember which characters were skipped.

    Comparison with password memos

    Advantages of such an algorithm compared to password memos:
    • No need to have a password database on hand. It took me to get a password on another computer - I just downloaded the program from the github, drove my master password - and the entire password database with me.
    • There is no file that an attacker could steal and try to decrypt. The truth is, in a good memo with a good master password, decrypting a password database is also almost unbelievable.
    • Even if you gain full access to your computer (with reading all files and installing keyloggers), an attacker will still have to guess your account data (especially those that are not accessed from this computer)
    • It guarantees the use of different and random passwords. And you do not need to invent a password every time.

    Pros of "memorization":
    • If the master password is compromised, the cracker still needs to obtain an encrypted file to gain access to the accounts. In the case of a password generator, he will only have to find out the details of accounts that use the generated passwords. True, IMHO, if an attacker was able to get a master password, then getting an encrypted password database for him, too, is likely to be easy.
    • You can use existing passwords.
    • They allow you to remember a password of any structure from any character set, there is no problem with password restrictions on specific sites.

    What's next

    I also want to add here the database of accounts. So that it was necessary to enter only a master password and, optionally, a password to decrypt the database; and the site name, login, version number, password length and list of skipped characters could be taken from the database.

    Similar programs

    In the comments they suggested that there is still:
    MasterPassword - an article made by the
    hawkite FanKiLL - the same in JavaScript from a team from Stanford University. There are plugins for browsers. After hashing, the password is supplemented in such a way as to pass the standard complexity test.

    Only registered users can participate in the survey. Please come in.

    How do you solve the problem of remembering passwords?