RSA, is it that simple?

    Prelude


    Good day, dear readers.
    Most likely, most of you know what the RSA asymmetric encryption algorithm is all about. In fact, there are so many articles devoted to this issue throughout Runet and on this resource in particular, that it is almost impossible to say something new about it.
    Well, there, by golly, you can still come up with it, and so everything is clear a long time ago. The recipe is simple:
    Two primes P and Q.
    Multiply to obtain the number N.
    Choose an arbitrary E.
    Find D = E -1 (mod ( P-1 ) ( Q-1 )).
    For encryption, the message M is raised to the power of E modulo N. To decrypt the cryptotext C to the power of D along the same modulo N. All crypto primitives are ready. We take and use, right? In fact, not so. The thing is that this is really nothing more than a crypto primitive and in the real world everything is a little more complicated.

    Bit of theory


    In order to understand why it is highly discouraged to use RSA in its simplest form, we first note the following requirement put forward to asymmetric cryptosystems.
    Requirement 1:
    A modern asymmetric cryptosystem can (but is not a fact) be considered resistant if an attacker, having two plaintexts M 1 and M 2 , as well as one ciphertext C b, cannot determine with a probability greater than 0.5 which of the two plaintexts matches ciphertext C b .
    Let's see if RSA meets this requirement. So, suppose an attacker Malory listens on the correspondence of Alice and Bob. On one wonderful day for himself, he sees that Bob openly asked Alice a very important question, the knowledge of the answer to which will enrich, or at least immensely amuse, Mallory's curiosity. Alice monotonously answers Bob this personal question. She encrypts her answer with Bob's public key and sends the ciphertext. Then, Mallory intercepts the ciphertext and suspects that either Yes or No is encrypted in it. All he now needs to do in order to find out Alice’s answer is to encrypt the word “Yes” with Bob’s public key and if the received crypto text matches the intercepted one, it means that Alice answered “Yes”, otherwise the attacker will understand that the answer was "no."

    As you can see from this, albeit a somewhat far-fetched, but still not so utopian example, RSA is not as reliable as is commonly believed. Yes, of course, we can say that Alice herself is a fool and no one asked her to answer such a serious question for her in monosyllables. So what now prohibit the use of monosyllabic answers in cryptography? Of course not. Everything is not so bad, it is enough for the algorithm to add some random information to the text that could not be predicted and insidious Mallory will be powerless. Indeed, in fact, he will not be able to predict that the answer “Yes” after processing by the algorithm will turn, for example, into “Da4FE6DA54”, and therefore he will not be able to encrypt this message and he will have nothing to compare with the intercepted cryptotext.

    Thus, we can already say that RSA in all its manifestations, whether PGP or SSL, does not encrypt only the data sent to the input of the encryption function. The algorithm first adds blocks containing a random set of bits to this data. And only after that the result is encrypted. Those. instead of the familiar
    C = M E (mod N ),
    we get closer to reality
    C = (M || Rand) E (mod N ),
    where Rand is a random number.
    This technique is called supplementation schemes. Currently, using RSA without add-on schemes is not so much a bad form as it is a direct violation of standards .

    But that is far from all. It is believed that even if the cryptosystem satisfies the requirement formulated above, this does not yet prove its suitability for practical purposes. We formulate one more requirement for the stability of an asymmetric algorithm.

    Requirement 2:
    Let the attacker have access to the decrypting black box. Those. Any cryptographic text at the request of an attacker can be decrypted. Next, the attacker creates two plaintexts M 1 and M 2 . One of these texts is encrypted and the resulting cryptotext C b is returned to the attacker. The task of the attacker to guess with a probability greater than 0.5 which of the messages M 1 and M 2 corresponds to the cryptotext C b. At the same time, he may ask to decrypt any message, except for C b (otherwise the game does not make sense). It is said that the cryptosystem is stable if the attacker, even in such excellent conditions for himself, cannot indicate to which source text C b corresponds with a probability of more than 0.5.

    In light of the above, let's see how things are with RSA.
    So, the attacker has two messages M 1 and M 2 . And also cryptotext
    C b = M 1 E (mod N ).
    He needs to indicate which of the two texts specifically corresponds to C b. To do this, he can do the following. Knowing the public key E, he can create the message
    C '= 2 E * C b (mod N ).
    He then asks the decrypting “black box” to decrypt the message C '. And then simple arithmetic to help him. We have:
    M '= C' D (mod N ) = 2 ED * M 1 ED (mod N ) = 2 * M 1 (mod N ).
    Those. computing M '/ 2, the attacker will see M 1 . And this means that he will understand that in our example, the message M 1 was encryptedand therefore we once again became convinced of the unacceptability of using RSA in its naive form in practice.

    Eliminate this trouble and help supplement circuit. Only now they are demanding not only that additional information be absolutely random and unpredictable. But it is also necessary for additional blocks to help determine whether the ciphertext was obtained as a result of the encryption function or if it was modeled by an attacker. Moreover, if it is discovered that the ciphertext is modeled instead of decrypted data, the attacker will be given a message about the data mismatching with the real cryptotext.

    It would seem that the implementation of such a supplement scheme is a difficult task, but cryptography already has a ready-made tool for monitoring data integrity. Of course, these are hash functions. All modern add-on schemes are implemented on the idea of ​​using a hash function as a verification of decrypted data for authenticity.

    RSA uses two different add-on schemes when signing and encrypting data. The scheme implemented for signing documents is called the RSA-PSS (probabilistic signature scheme) or probabilistic signature scheme. The scheme used for encryption is RSA-OAEP (Optimal asymmetric encryption padding) or optimized asymmetric encryption padding , using OAEP as an example and consider how message encryption in RSA actually occurs.

    RSA-OAEP


    So, to encrypt absolutely any message in RSA-OAEP, the following is done:
    1. Two hash functions G (x) and H (x) are selected so that the total length of the results of the hash functions does not exceed the RSA key length.
    2. A random string of bits l is generated. The length of the string should be equal to the length of the result of the hash function H (x).
    3. Message M is blocked into k-bits. Then, to each received block m, (nk) zeros are added. Where n is the length of the hash function G (x).
    4. The following set of bits is determined: {m || 0 (nk) ⊕G (l)} || {l⊕H (m || 0 (nk) ⊕G (l))}
    5. The resulting bits are represented as an integer M 1
    6. Cryptotext is obtained by the formula: C = M 1 E (mod N )

    The decryption process is as follows:
    1. Find M 1 by the formula M 1 = C D (mod N )
    2. In the resulting set of bits, the left part is cut off. In the sense: the left part is the n left bits of the number M 1 where n is the length of the hash function G (x). Let us denote these bits by T. And note that T = {m || 0 (nk) ⊕ G (l)}. All other bits are the right part.
    3. We find H (T) = H (m || 0 (nk) ⊕ G (l))
    4. Knowing H (T) we get l, because we know l⊕H (T) is the right side of the block.
    5. After calculating l, we find m from T⊕G (l) since T = {m || 0 (nk) ⊕G (l)}
    6. If m ends with (nk) zeros, then the message is encrypted correctly. If not, this means that the ciphertext is incorrect, and therefore it is most likely falsified by an attacker.

    Conclusion


    Those. RSA is not only exponentiation modulo a large number. It is also the addition of redundant data that allows you to implement additional protection for your information. You may ask: why is all this necessary? Can such a situation really happen when an attacker gains access to a decryption algorithm? On a completely different occasion, it was somehow said: if any trouble can occur, it will certainly happen. Only now, using add-on schemes, this will no longer be considered a nuisance.

    upd: transferred cryptography to the blog.
    upd2:
    Literature and references:
    1. N. Ferguson, B. Schneier “Practical cryptography”
    2. N. Smart “Cryptography”
    3. Specification RSA-OAEP (pdf)

    Also popular now: