The Vigenère cipher and how to break it
I should say up front that this topic is interesting mainly from the standpoint of the history of cryptography: the cipher described here is of little use for protecting information in the modern world. Nevertheless, the algorithms discussed here can come in handy at specialized olympiads.
In the 16th century, Blaise de Vigenère proposed a rather interesting encryption method. The key is a phrase. This phrase, repeated as many times as needed, is written above the text to be encrypted. Each letter of the ciphertext is obtained by shifting the corresponding plaintext letter by an amount given by the key letter above it (the first letter of the alphabet gives no shift, the second a shift of one position, the third a shift of two, and so on).
For example, let us encrypt the Russian word "СЕКРЕТ" ("secret") with the key phrase "АБВ", using the 32-letter Russian alphabet without Ё. The letter С is not shifted, the first Е is shifted by one position and becomes Ж, and К is shifted by two positions and becomes М. Continuing in the same way, we end up with "СЖМРЖФ".
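The shifting rule can be sketched in a few lines of Python. This is only an illustration, assuming the 32-letter Russian alphabet without Ё and input that contains no spaces or punctuation; the names `ALPHABET` and `vigenere_encrypt` are made up for this example.

```python
# Assumption: the 32-letter Russian alphabet without "Ё".
ALPHABET = "АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"


def vigenere_encrypt(plaintext: str, key: str, alphabet: str = ALPHABET) -> str:
    """Shift each plaintext letter by the alphabet index of its key letter."""
    n = len(alphabet)
    out = []
    for i, ch in enumerate(plaintext):
        # The key is repeated cyclically; its first letter means shift 0.
        shift = alphabet.index(key[i % len(key)])
        out.append(alphabet[(alphabet.index(ch) + shift) % n])
    return "".join(out)


print(vigenere_encrypt("СЕКРЕТ", "АБВ"))  # СЖМРЖФ
```

The shift wraps around the end of the alphabet, so a letter near the end of the alphabet can turn into one near the beginning.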
For three centuries this cipher was considered practically unbreakable. The first attempts to break it were made in the 19th century, and all of them relied on determining the length of the key phrase. Once the length is known, the ciphertext can be split into groups of letters, each encrypted with the same shift. In our example (the word СЕКРЕТ encrypted with the key АБВ), the letters С and Р are encrypted with a shift of zero, the two letters Е with a shift of one, and К and Т with a shift of two. If the text is long enough, frequency analysis can be applied to each group separately to recover the original message. So the key to breaking this cipher is finding the length of the key phrase.
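A minimal sketch of the splitting step, assuming the key length is already known. The function names are illustrative and not taken from any library; the ciphertext "СЖМРЖФ" is the Cyrillic form of the short example, encrypted with a three-letter key.

```python
from collections import Counter


def split_by_key_position(ciphertext: str, key_length: int) -> list:
    """Group i collects every letter that was encrypted with the i-th key letter."""
    return [ciphertext[i::key_length] for i in range(key_length)]


def most_common_letters(group: str, top: int = 3) -> list:
    """Return the `top` most frequent letters of one group with their counts."""
    return Counter(group).most_common(top)


groups = split_by_key_position("СЖМРЖФ", 3)
print(groups)  # ['СР', 'ЖЖ', 'МФ']
for group in groups:
    print(group, most_common_letters(group, top=1))
```

Each group is an ordinary single-shift (Caesar) ciphertext, so on a long text its most frequent letters hint at the shift. On a message this short the counts are of course meaningless.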
We will now look at two methods for finding this length. The first was proposed by Friedrich Kasiski and is based on searching for repeated bigrams. If the same bigram occurs in the plaintext at two places whose distance is a multiple of the key length, both occurrences are encrypted with the same key letters and therefore produce identical bigrams in the ciphertext as well. Having found such a distance and listed all of its divisors, we obtain a set of candidate key lengths.
Let us try to decipher the following text using Kasiski's method: "ОАИТАБНПХЮПМЪАЭМАЗЧАФРЮЯЦМАТВУШКГЮНШИЪДЦЩЕВТЫЖЧЪТЫЖПЫТЕЭНХЕАПНХДРСЕЗЬУНЯЗ". The ciphertext contains several repeated bigrams, among them МА (positions 16 and 26), ТЫ (positions 44 and 49) and НХ (positions 57 and 62). The bigram МА repeats at a distance of 10 positions, the bigrams ТЫ and НХ at a distance of 5. Most likely, then, the key length is 5. The method requires some luck, because "accidental" repeats can also appear in the ciphertext. They are much less likely than "regular" ones, but in short texts they can seriously complicate the analysis.
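The bigram search can be sketched as follows. This is a rough illustration rather than a definitive implementation of Kasiski's method: it assumes the ciphertext in its Cyrillic form (the quoted positions only line up when every letter is a single character), and the helper names are invented for the example. Each divisor of a repeat distance gets one "vote" as a candidate key length.

```python
from collections import Counter, defaultdict

# Assumption: the example ciphertext in its Cyrillic form, 73 letters.
CIPHERTEXT = ("ОАИТАБНПХЮПМЪАЭМАЗЧАФРЮЯЦМАТВУШКГЮНШИЪДЦ"
              "ЩЕВТЫЖЧЪТЫЖПЫТЕЭНХЕАПНХДРСЕЗЬУНЯЗ")


def repeated_ngrams(text: str, n: int = 2) -> dict:
    """Map each n-gram that occurs more than once to its 1-based positions."""
    positions = defaultdict(list)
    for i in range(len(text) - n + 1):
        positions[text[i:i + n]].append(i + 1)
    return {gram: pos for gram, pos in positions.items() if len(pos) > 1}


def divisors(distance: int) -> list:
    """All divisors of `distance` except 1 (a key of length 1 is a plain shift)."""
    return [k for k in range(2, distance + 1) if distance % k == 0]


def candidate_key_lengths(text: str, n: int = 2) -> Counter:
    """Count how often each divisor of a repeat distance occurs."""
    votes = Counter()
    for pos in repeated_ngrams(text, n).values():
        for first, second in zip(pos, pos[1:]):
            votes.update(divisors(second - first))
    return votes


repeats = repeated_ngrams(CIPHERTEXT)
print(repeats["МА"], repeats["ТЫ"], repeats["НХ"])  # [16, 26] [44, 49] [57, 62]
print(candidate_key_lengths(CIPHERTEXT).most_common(1))
```

On this ciphertext the search should also report the bigram ЫЖ, which overlaps the repeated ТЫ and sits at the same distance of 5.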
Before decrypting the text, let us look at another method for determining the key length, proposed by Friedman. The idea is to shift the ciphertext cyclically. Each shifted copy is written under the original ciphertext, and the number of positions where the upper and lower lines contain the same letter is counted. From these counts we compute the so-called index of coincidence: the ratio of the number of matches to the total length of the message. For Russian texts the index of coincidence is approximately 6%, whereas for random text it is 1/32, i.e. approximately 3%. Friedman's method is based on this fact. The text is written with a shift of 1, 2, 3, etc. positions, and the index of coincidence is computed for each shift. Cyclically shifting our message, we get:
Shift  Matches  Index
2      0        0.000
3      5        0.068
4      2        0.027
5      8        0.110 (!)
6      1        0.014
7      1        0.014
8      2        0.027
At a shift of 5 the index rises sharply, so the key length is most likely 5. It is easy to see why. When the shift equals the key length (or a multiple of it), every pair of compared letters was encrypted with the same key letter, so two ciphertext letters match exactly when the underlying plaintext letters match, and the index is the same as for ordinary text. For any other shift, the compared letters were encrypted with different key letters, and we are effectively comparing random text.
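The counts in the table can be reproduced with a short script. This is a sketch under the assumption that the ciphertext is taken in its Cyrillic form (73 letters) and that the shift is cyclic, as described in the text; the function name `coincidences` is invented for the example.

```python
# Assumption: the example ciphertext in its Cyrillic form, 73 letters.
CIPHERTEXT = ("ОАИТАБНПХЮПМЪАЭМАЗЧАФРЮЯЦМАТВУШКГЮНШИЪДЦ"
              "ЩЕВТЫЖЧЪТЫЖПЫТЕЭНХЕАПНХДРСЕЗЬУНЯЗ")


def coincidences(text: str, shift: int) -> int:
    """Count positions where the text equals its own cyclic shift."""
    n = len(text)
    return sum(text[i] == text[(i + shift) % n] for i in range(n))


print("Shift  Matches  Index")
for shift in range(2, 9):
    matches = coincidences(CIPHERTEXT, shift)
    index = matches / len(CIPHERTEXT)
    print(f"{shift:<6} {matches:<8} {index:.3f}")
```

With only 73 letters a single extra match moves the index by more than one percentage point, so the peak is convincing here mostly because it stands well above its neighbours.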
Having determined the key length, we can use a table of letter frequencies to discover that the ciphertext hides a famous children's poem:
Our Tanya is crying loudly:
She dropped her ball into the river.
Hush, Tanechka, don't cry,
The ball won't drown in the river!
The surname of the poem's author, Agniya Barto, was used as the key (БАРТО).
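Once the key is known, decryption simply reverses the shifts. The sketch below assumes the Cyrillic form of the ciphertext, the 32-letter alphabet without Ё, and the key БАРТО; the function name `vigenere_decrypt` is invented for the example.

```python
# Assumption: the 32-letter Russian alphabet without "Ё".
ALPHABET = "АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"

# Assumption: the example ciphertext in its Cyrillic form, 73 letters.
CIPHERTEXT = ("ОАИТАБНПХЮПМЪАЭМАЗЧАФРЮЯЦМАТВУШКГЮНШИЪДЦ"
              "ЩЕВТЫЖЧЪТЫЖПЫТЕЭНХЕАПНХДРСЕЗЬУНЯЗ")


def vigenere_decrypt(ciphertext: str, key: str, alphabet: str = ALPHABET) -> str:
    """Shift each ciphertext letter back by the index of its key letter."""
    n = len(alphabet)
    out = []
    for i, ch in enumerate(ciphertext):
        shift = alphabet.index(key[i % len(key)])
        out.append(alphabet[(alphabet.index(ch) - shift) % n])
    return "".join(out)


plaintext = vigenere_decrypt(CIPHERTEXT, "БАРТО")
print(plaintext[:20])  # НАШАТАНЯГРОМКОПЛАЧЕТ
```

The result is the poem in Russian with spaces and punctuation removed, which is how the message was prepared before encryption.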
I hope you found something new in this article, and I hope you will use this knowledge only for good.