Electronic digital signatures for dummies: what they are and how not to choke on them. Part 1

    More and more often, in circles that work with documents, one hears the words “electronic document” and, almost inseparable from them, “electronic digital signature”, or simply digital signature.

    This series of articles is intended to reveal the “secret knowledge” of what it is, when and how it can and should be used, and what its pros and cons are.

    Naturally, the articles are written not for cryptography specialists but for those who will use this cryptography, or who are just starting to study it with the aim of becoming specialists. I have therefore tried to make the whole process as easy to understand as possible, using analogies and examples.


    Why do we need to sign anything? Naturally, to certify that we have read the content and agree (or sometimes, on the contrary, disagree) with it. An electronic signature additionally protects our content from substitution.

    So, naturally, we start with what an electronic digital signature is.
    In the most primitive case, it is the result of a hash function. Wikipedia will explain what that is better than I can; for our purposes the main points are that, with a high degree of probability, its result does not repeat for different source data, and that the result is not only much shorter than the original data but also cannot be used to restore the original information. The result of the function is called a hash, and applying the function to data is called hashing. Roughly speaking, you can think of a hash function as archiving that produces a very small sequence of bytes, except that the original data cannot be restored from such an “archive”.
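
    The two properties just described can be seen in a few lines of Python. This is only an illustrative sketch: the choice of SHA-256, the helper name sha256_hex and the sample messages are assumptions of this example, not something the article prescribes.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

short = sha256_hex(b"Pay Bob 100")
changed = sha256_hex(b"Pay Bob 900")      # a single character differs
large = sha256_hex(b"x" * 1_000_000)      # a megabyte of input

# The hash has the same length no matter how large the input is.
print(len(short), len(large))   # 64 64
# A tiny change in the input produces a different hash.
print(short == changed)         # False
```

    Note that nothing in the 64 characters lets you get the megabyte of input back; the “archive” is one-way.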

    So, we read the file into memory and hash what we read. Do we already have an electronic digital signature? Almost. With a great stretch our result can be called a signature, but it is not a full-fledged one, because:

    1. We do not know who made this signature.

    2. We do not know when the signature was made.

    3. The signature itself is not protected from substitution in any way.

    4. And, of course, there are many hash functions; which one was used to create this particular hash?

    Therefore, calling a hash a “signature” is not really appropriate; from here on we will simply call it a hash.

    You send your file to another person, for example by mail, confident that he will receive and read exactly what you sent. He, in turn, must also hash the data and compare his result with yours. If they match, all is well. Does this mean the data is protected? No.
    Anyone can hash anything at any time, and you will never prove that he hashed something other than what you sent. That is, if the data is intercepted along the way by an attacker, or the person you send it to is not a very good person, the data can easily be replaced and rehashed. Your recipient will never find out that he did not receive what you sent; or, if the recipient is that same bad person, you will never find out that he replaced your information to use it for his own bad purposes.
    Therefore, the proper place for a pure hash function is transporting data within a program or between programs that can communicate with each other. In fact, checksums are calculated using hash functions. These mechanisms protect against accidental data corruption, but not against deliberate substitution.
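
    The difference between accidental and deliberate changes can be sketched as follows. The messages, the helper name digest and the use of SHA-256 are assumptions made for illustration only.

```python
import hashlib

def digest(data: bytes) -> str:
    """Hash the data with SHA-256 and return the hex string."""
    return hashlib.sha256(data).hexdigest()

# Sender: the data and its hash travel together.
sent_data = b"Transfer 100 to Alice"
sent_hash = digest(sent_data)

# Accidental corruption: the data changes in transit, the hash does not,
# so the recipient's own hash no longer matches the one that was sent.
corrupted = b"Transfer 100 to Alicf"
accident_detected = digest(corrupted) != sent_hash

# Deliberate substitution: the attacker replaces the data AND rehashes it,
# so the recipient receives a perfectly consistent pair.
forged_data = b"Transfer 900 to Mallory"
forged_hash = digest(forged_data)
forgery_detected = digest(forged_data) != forged_hash

print(accident_detected)  # True
print(forgery_detected)   # False
```

    The recipient's check passes for the forged pair, which is exactly why a bare hash works as a checksum but not as a signature.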

    But let's move on. We want to protect our hash from substitution, so that not just anyone who comes along can claim that theirs is the correct result. What is the most obvious way to do this (apart from administrative measures)? Right, encrypt it. And encryption can also be used to verify the identity of whoever hashed the data! This is relatively simple to do, because asymmetric encryption exists. Yes, it is slow and heavy, but we only need to encrypt a small sequence of bytes. The advantage is obvious: to verify our signature, one needs our public key, from which the identity of whoever encrypted (and therefore created) the hash can easily be established.
    The essence of this encryption is as follows: you have a private key, which you keep to yourself, and a public key. You can show and hand out the public key to everyone, but never the private key. For signing, encryption is performed with the private key and decryption with the public key.
    By analogy, you have an excellent lock and two keys to it. One key opens the lock (the public key), the other closes it (the private key). You take a box, put something in it and close it with your lock. Since you want the recipient to be able to open the box, you hand out the opening key freely and without worry. But you do not want anyone else to close the box again with your lock, because it is your personal lock and everyone knows it is yours. So you always keep the closing key with you, so that nobody can put something vile in the box and later claim that you put it there and closed it with your lock.
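
    To make the “closing key” and “opening key” concrete, here is a toy sketch using textbook RSA in pure Python. Everything in it is an assumption of this example: the two fixed primes, the names sign, verify and hash_as_int, and the choice of RSA itself. It has no padding and a far too small key, so it only illustrates the idea and must not be used to protect anything real.

```python
import hashlib

# Toy key pair. P and Q are two well-known Mersenne primes; real keys use
# randomly generated primes that are far larger.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                            # modulus, shared by both keys
E = 65537                            # public exponent (the "opening" key)
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (the "closing" key), Python 3.8+

def hash_as_int(data: bytes) -> int:
    """Hash the data and reduce the result to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data: bytes) -> int:
    """'Encrypt' the hash with the private key (D, N)."""
    return pow(hash_as_int(data), D, N)

def verify(data: bytes, signature: int) -> bool:
    """'Decrypt' the signature with the public key (E, N) and compare hashes."""
    return pow(signature, E, N) == hash_as_int(data)

document = b"I owe Bob 100"
signature = sign(document)

print(verify(document, signature))           # True
print(verify(b"I owe Bob 900", signature))   # False
```

    Anyone who knows E and N can run verify, but only the holder of D can produce a signature that passes it. An attacker who replaces the document can still rehash it, but cannot re-encrypt the new hash.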

    All would be well, but a problem arises immediately, and in fact more than one.

    1. We need to transfer our public key somehow, and the receiving side must be able to understand it.

    2. We need to associate this public key with us somehow, so that nobody else can claim it as their own.

    3. Not only must the key be associated with us; it must also be clear which encrypted hash is to be decrypted with which key. And what if there is not one hash but, say, a hundred? Maintaining a separate registry for this is a very difficult task.

    All this leads us to the conclusion that both our public key and our hash must be stored in formats that are standardized, distributed as widely as possible and only then used, so that the sender and recipient have no “translation difficulties”.
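
    As a rough idea of what such a format has to carry, here is a purely hypothetical record that answers the questions raised so far: who signed, when, with which hash function, and which key verifies the result. The class SignatureRecord, its field names, the key_fingerprint helper and the JSON encoding are all inventions of this sketch; they are not taken from any real standard.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class SignatureRecord:
    signer: str          # who made the signature
    signed_at: str       # when it was made (ISO 8601, UTC)
    hash_algorithm: str  # which hash function produced the hash
    key_id: str          # which public key decrypts the signature
    signature: str       # the encrypted hash, hex-encoded

def key_fingerprint(public_key: bytes) -> str:
    """Short identifier of a public key: the first 16 hex digits of its SHA-256."""
    return hashlib.sha256(public_key).hexdigest()[:16]

def make_record(signer: str, public_key: bytes, signature: bytes) -> SignatureRecord:
    """Bundle a signature with the metadata a verifier needs."""
    return SignatureRecord(
        signer=signer,
        signed_at=datetime.now(timezone.utc).isoformat(),
        hash_algorithm="sha256",
        key_id=key_fingerprint(public_key),
        signature=signature.hex(),
    )

# Placeholder bytes stand in for a real public key and a real signature.
record = make_record("alice@example.com", b"<alice public key bytes>", b"\x01\x02\x03")
serialized = json.dumps(asdict(record), indent=2)
restored = SignatureRecord(**json.loads(serialized))
print(serialized)
```

    With a key_id stored next to every signature, the recipient no longer needs a separate registry to work out which key goes with which of a hundred hashes. Tying that key to a person is a separate problem that this record does not solve.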

    As is usual with people, they could not agree on a single solution, and two large camps formed: the OpenPGP format and the S/MIME + X.509 format. But more on that in the next article.

    Part 2
