02 Decode Vigenère

In the 16th century, the French diplomat and cryptographer Blaise de Vigenère developed a polyalphabetic cipher that was considered unbreakable for a long time. In contrast to a monoalphabetic cipher, it does not use a single key alphabet, which can be worked out quickly with the help of frequency analysis, but a separate one for each letter of the text to be encrypted. To do this, the alphabet is shifted by a different amount each time (just as Caesar shifts by 3 and ROT13 by 13 letters). The sequence of shifts is given by the keyword, with which the cryptotext can subsequently be decrypted again.

The best way to illustrate this procedure is with the Vigenère square, which lists all 26 shifted alphabets (the unshifted one plus the 25 possible shifts):

The first letter of the plaintext “Scheibenwelt” (German for “Discworld”; R.I.P. Sir Terry Pratchett!) is an S (upper yellow row in the image). It is encrypted with the first letter of the key “Terry”, i.e. the T (yellow, left column in the image), which shifts the alphabet by 19 letters (A=0), so the S ends up on the L. The second letter is a C and becomes a G with the key letter E, and so on.
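If you prefer code to a square, the same shifting can be sketched in a few lines of Python. This is only a minimal illustration, assuming the key consists of the letters A to Z only; the function name `vigenere` is made up for this example. It reproduces the two letter pairs from above (S with T gives L, C with E gives G).

```python
from itertools import cycle

A = ord("A")

def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter of text by the matching key letter (A=0, B=1, ...)."""
    sign = -1 if decrypt else 1
    # Assumption: the key contains only the letters A-Z; it is repeated as needed.
    key_shifts = cycle(ord(k) - A for k in key.upper())
    out = []
    for ch in text.upper():
        if not "A" <= ch <= "Z":
            out.append(ch)  # keep spaces and punctuation; the key does not advance
            continue
        shift = next(key_shifts)
        out.append(chr((ord(ch) - A + sign * shift) % 26 + A))
    return "".join(out)

if __name__ == "__main__":
    print(vigenere("SC", "TERRY"))                # first two letters of the example
    print(vigenere("LG", "TERRY", decrypt=True))  # and back again
```

Decryption is the same walk through the key, only with the shifts subtracted instead of added.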

Frequency analysis now comes to nothing, since a given letter is no longer always replaced by the same one. But Vigenère encryption is still not secure.

The shorter the key and the longer the plaintext, the easier it is to break the code. Since normal text in any language contains certain recurring letter sequences (bigrams and trigrams), the probability increases with the length of the plaintext that such a sequence has been encrypted more than once with the same key letters, so that the cryptotext repeats at those points as well. Once a repeated bigram or trigram has been discovered in the cryptotext, the key length can be narrowed down: it is a divisor of the distance between the repeated sequences. Knowing the length of the keyword, you can then split the cryptotext into that many pieces, each encrypted with a single key letter, and attack them one at a time. This approach was christened the Kasiski test after one of its discoverers (Kasiski online tool).
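The search for repetitions is tedious by hand but easy to sketch in Python. The following is a rough illustration, not a finished cracker: the function names, the trigram default and the upper limit of 20 for the key length are assumptions made for this example. It collects the distances between repeated letter sequences and then counts which candidate key lengths divide them.

```python
from collections import Counter, defaultdict

def kasiski_distances(ciphertext: str, n: int = 3) -> list[int]:
    """Distances between consecutive occurrences of each repeated n-gram."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - n + 1):
        positions[ciphertext[i:i + n]].append(i)
    distances = []
    for pos in positions.values():
        distances.extend(b - a for a, b in zip(pos, pos[1:]))
    return distances

def key_length_candidates(distances: list[int], max_len: int = 20) -> list[tuple[int, int]]:
    """For each candidate length, count the distances it divides (best first)."""
    votes = Counter()
    for d in distances:
        for length in range(2, max_len + 1):
            if d % length == 0:
                votes[length] += 1
    return votes.most_common()

if __name__ == "__main__":
    # Toy cryptotext: "ABC" repeats after 9 letters, so 3 and 9 are candidates.
    print(key_length_candidates(kasiski_distances("ABCDEFGHIABC")))
```

On real cryptotexts some repetitions are pure coincidence, so the length with the most votes is a strong hint rather than a proof.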

Even more theoretical is the Friedman test, whose algorithm uses the probability that two randomly chosen letters are the same to estimate the order of magnitude of the key length.
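For the curious, the calculation behind it can be sketched as follows. The probability that two randomly picked letters of a text match is called the index of coincidence. The two constants are assumptions for this example: roughly 0.0667 is a commonly quoted value for English plaintext (German is somewhat higher), and 1/26 is the value for completely random letters. The function names are made up as well.

```python
from collections import Counter
from typing import Optional

def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn at random from text are the same."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))

def friedman_key_length(text: str,
                        kappa_plain: float = 0.0667,    # assumed value for English
                        kappa_random: float = 1 / 26) -> Optional[float]:
    """Friedman's estimate of the key length; None if no sensible estimate exists."""
    n = sum(1 for c in text.upper() if "A" <= c <= "Z")
    ic = index_of_coincidence(text)
    denominator = (n - 1) * ic - kappa_random * n + kappa_plain
    if denominator <= 0:
        return None  # text too short or too flat for the formula
    return (kappa_plain - kappa_random) * n / denominator
```

The result is a decimal number and only a rough guide, so it is best combined with the Kasiski test rather than trusted on its own.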

The correlation function also works with probabilities. If you count the letters in the cryptotext and compare them with the letter frequencies of normal language, a sufficiently long text reveals the shift: the correlation function has its maximum at the shift where the two distributions match best. With sufficiently long texts, this also gives a good indication of the key length.
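For a single shift (that is, for all letters that were encrypted with the same key letter), the idea can be sketched like this. The frequency table holds approximate percentages for English and is an assumption of this example; for German caches you would swap in German frequencies. The function names are invented for illustration.

```python
from collections import Counter

# Approximate English letter frequencies in percent, A..Z (assumed reference values).
ENGLISH_FREQ = [8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15,
                0.77, 4.03, 2.41, 6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06,
                2.76, 0.98, 2.36, 0.15, 1.97, 0.07]

def correlation(ciphertext: str, shift: int) -> float:
    """How well the letter counts match English if the text was shifted by shift."""
    counts = Counter(c for c in ciphertext.upper() if "A" <= c <= "Z")
    return sum(n * ENGLISH_FREQ[(ord(c) - ord("A") - shift) % 26]
               for c, n in counts.items())

def best_shift(ciphertext: str) -> int:
    """The shift (0-25) at which the correlation is largest."""
    return max(range(26), key=lambda s: correlation(ciphertext, s))
```

Applied to every piece of a cryptotext that was split according to the key length, each best shift corresponds to one key letter (A=0). With short texts the maximum can easily land on the wrong shift.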

Done with all Vigenère geocaching puzzles? Well, not quite yet; there are more neat approaches to the solution. If the keyword is a real word from the dictionary, n-gram analysis can be applied, which is implemented at Cryptool-online, among others. Here, one works with probable bigrams or trigrams at the beginning of a word, from which conclusions about the key can be drawn by checking which candidates produce probable letter combinations in the plaintext.

Similar but less theoretical are approaches where you know, or think you know, parts of the key or the plaintext. It doesn’t matter which of the two you have: in each case you place the expected word or part of a word (north, east, cache, search, fifty-two, the GC code or the owner’s name) over the cryptotext and shift forwards and backwards in the alphabet by the respective letters. If you want to and are able to, you can certainly program this quickly in Excel. But thanks to the internet, that is not necessary at all. f00l.de, nik kaanan and many others have already made this work easier for us with their scripts. If you don’t want to go to that much trouble, you can try smurfoncrack first, or crypt-online, where there are also frequency and n-gram analyses as well as an autocorrelation tool. A simple crack tool that sometimes works is offered by geocachingtoolbox.
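If you would rather script it than build it in Excel, sliding an expected word over the cryptotext might look roughly like this. It is a minimal sketch that assumes both strings contain only the capital letters A to Z; the function name `key_fragments` is invented for this example.

```python
def key_fragments(ciphertext: str, crib: str) -> list[tuple[int, str]]:
    """Slide the expected word (crib) over the cryptotext.

    For every position, return the key letters that would turn the crib
    into the cryptotext found there (cipher letter minus plain letter, A=0).
    """
    base = ord("A")
    results = []
    for offset in range(len(ciphertext) - len(crib) + 1):
        window = ciphertext[offset:offset + len(crib)]
        fragment = "".join(chr((ord(c) - ord(p)) % 26 + base)
                           for c, p in zip(window, crib))
        results.append((offset, fragment))
    return results

if __name__ == "__main__":
    # "NORTH" encrypted with the key "TERRY" gives "GSIKF".
    for offset, fragment in key_fragments("XXGSIKF", "NORTH"):
        print(offset, fragment)
```

You then scan the output for a fragment that looks like (part of) a real word. Because the subtraction is symmetric, feeding in a suspected piece of the key instead of the plaintext yields plaintext fragments in the same way.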

And of course the GC Wizard should not be missing. There’s also a Vigenère cracker in the “General Code Solvers” section 😉

By the way, Mr Vigenère also developed an improvement of this method, which never became as well known, although it is much more secure against analyses like the ones described here.

Autokey encryption also works with a keyword and otherwise like the Vigenère cipher described here, but the plaintext is appended to the end of the keyword for encryption and decryption. Thus, the key is as long as the cryptotext, which makes it much more difficult to crack.
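How the plaintext becomes part of the key is easiest to see in code. The following sketch assumes capital letters A to Z only, and the function names are chosen for this example. Note that when decrypting, the key grows letter by letter as the plaintext is recovered.

```python
A = ord("A")

def autokey_encrypt(plaintext: str, keyword: str) -> str:
    """Vigenère with the key stream keyword + plaintext."""
    keystream = keyword + plaintext
    return "".join(chr((ord(p) - A + ord(k) - A) % 26 + A)
                   for p, k in zip(plaintext, keystream))

def autokey_decrypt(ciphertext: str, keyword: str) -> str:
    """Recover the plaintext; each decrypted letter is appended to the key stream."""
    keystream = list(keyword)
    plaintext = []
    for i, c in enumerate(ciphertext):
        p = chr((ord(c) - ord(keystream[i])) % 26 + A)
        plaintext.append(p)
        keystream.append(p)  # the plaintext extends the key
    return "".join(plaintext)

if __name__ == "__main__":
    secret = autokey_encrypt("ATTACKATDAWN", "QUEENLY")
    print(secret, autokey_decrypt(secret, "QUEENLY"))
```

Since the key never repeats, the Kasiski and Friedman tests have no key length to find.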