01 Manually decrypt ciphertexts

Simple monoalphabetic ciphers, in which each letter of the alphabet is replaced by a different letter, character or string of digits, can be deciphered fairly quickly using a piece of paper, a table of letter frequencies for the language in question, and a little patience.

As a general rule, the longer the ciphertext, the easier it is to figure it out using letter frequencies, word frequencies, word endings and linguistic skill.

The first step is to count the characters of the ciphertext and sort them by frequency. If you find about 20 to 27 different characters (depending on the length of the text), you are almost certainly looking at a single cipher alphabet (26 letters) plus spaces and possibly one or two punctuation marks (full stop, comma and, for geocachers, perhaps a degree sign). If there are about 55 characters, upper and lower case may both have been used, possibly along with German umlauts. About 10 more characters on top of that suggest that the text also contains digits.
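The counting step can also be left to a few lines of Python. This is only a minimal sketch: the function name char_frequencies is my own choice, and the input is the short sample ciphertext used in this article.

```python
from collections import Counter

def char_frequencies(text):
    """Return (character, count, share) tuples, most frequent first."""
    counts = Counter(text)
    total = sum(counts.values())
    return [(ch, n, n / total) for ch, n in counts.most_common()]

ciphertext = "tuxjlaktlfckokotyfckojxkobokxlaktl"
table = char_frequencies(ciphertext)

# The number of different characters hints at the alphabet that was used.
print(len(table), "different characters")
for ch, n, share in table[:5]:
    print(f"{ch!r}: {n:2d}  {share:.1%}")
```

The number of different characters is simply the length of the table, and the first rows are the candidates for the space and the “E”.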

On the CrypTool page, as in the Wikipedia article, you will find tables with the letter frequencies of German and English. There is also the “frequency mountain”, which is very helpful for recognising rotation ciphers visually, that is, ciphers in which the alphabet is simply shifted by x positions. Caesar’s own cipher used a shift of 3 (an “A” becomes a “D”, a “B” becomes an “E”, …). Today a shift of 13 (ROT13) is very common, which has the charm that a further shift of 13 brings you back to the original text, so encryption and decryption work in exactly the same way. It should be emphasised, however, that a rotation cipher offers no real security and should never be used to conceal information that really needs to stay secret. It is more of a gimmick that keeps text from being immediately readable.
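To make the ROT13 property tangible, here is a small sketch of a rotation cipher in Python. The helper name rot is hypothetical, and the sketch assumes plain a-z letters only (no umlauts).

```python
import string

ALPHABET = string.ascii_lowercase

def rot(text, shift):
    """Shift every letter a-z / A-Z by `shift` positions; leave everything else alone."""
    k = shift % 26
    shifted = ALPHABET[k:] + ALPHABET[:k]
    table = str.maketrans(ALPHABET + ALPHABET.upper(), shifted + shifted.upper())
    return text.translate(table)

once = rot("Geocache", 13)
twice = rot(once, 13)
print(once)   # no longer readable at a glance
print(twice)  # Geocache, because 13 + 13 = 26 is a full turn of the alphabet
```

For any other shift x, decryption needs the opposite shift, for example rot(text, 26 - x).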

Does the character distribution show peaks similar to those of the letter distribution in the tables? One or two characters should occur very frequently; the space (if it has been encoded at all) is usually the most frequent character. Spaces may also have been omitted, in which case the words are harder to make out. Close behind the space comes the letter “E”, which accounts for about 17% of the letters in average German texts.

Even in my very small sample, the relative frequency of the “E”s is about right.

The ciphertext tuxjlaktlfckokotyfckojxkobokxlaktl has only 12 different letters, but that is because it is so short. The most common letter is “K”, at about 20%. If we assume that it stands for E, we may already have deciphered one fifth of the ciphertext and, above all, gained a starting point for linguistic skill and typical letter combinations or word endings.

In addition to the letter frequency tables, there are also tables of the most frequent word endings. The leaders here are: “en, em, es, el and er, st, ing, sam, bar, lich, ung, heit, keit”.

Also interesting are the most frequent bigrams, i.e. pairs of letters occurring together: “en, er, ch, ck (where c almost never occurs on its own), te, de, nd, ei, ie, in, es”. The same goes for trigrams (groups of three consecutive letters): “ein, ich, nde, die, und, der, che, end, gen, sch”.
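Bigrams and trigrams can be counted in the ciphertext in the same way as single letters. The following is a minimal sketch; ngram_counts is a name I made up, and the windows overlap, so a 34-letter text yields 33 bigrams.

```python
from collections import Counter

def ngram_counts(text, n):
    """Count every run of n consecutive characters (overlapping windows)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

ciphertext = "tuxjlaktlfckokotyfckojxkobokxlaktl"
bigrams = ngram_counts(ciphertext, 2)
trigrams = ngram_counts(ciphertext, 3)

print(bigrams.most_common(3))
print(trigrams.most_common(3))
```

The most frequent ciphertext pairs are then candidates for the frequent plaintext pairs from the list above.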

It is also worth looking at the most frequently used words in German. This ranking is headed by “der, die, und, in, den, von, zu, das, mit, sich, des, auf, für, ist and im”. For geocachers, the ranking probably shifts a little, with the German words for “north, east, degree, cache, coordinates, can, search, stash” and the written-out digits “eins, zwei, drei, vier, fünf/fuenf, sechs, sieben, acht, neun and null” moving further up the list.

By the way, there are actually no one-letter words in German – which makes German texts very different from English texts.

For manual decryption of simple ciphertexts, I use a text editor (the free Notepad++). Any word processor works just as well, but the font must be monospaced, i.e. every letter must have the same width (for example, Courier New). This way, the ciphertext and the decryption attempt can be placed directly below each other.

Take the example just given: tuxjlaktlfckokotyfckojxkobokxlaktl

Assuming that the text has been encoded without spaces (since there is only one very common letter) and that the most common letter is in fact the “E”, I write these findings below the ciphertext.
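What the editor view looks like can be imitated with a tiny sketch. The function apply_partial_key and the dash used for unknown letters are my own choices; the key holds only the one guess made so far.

```python
def apply_partial_key(ciphertext, key, unknown="-"):
    """Replace ciphertext letters that are already guessed, mark the rest as unknown."""
    return "".join(key.get(ch, unknown) for ch in ciphertext)

ciphertext = "tuxjlaktlfckokotyfckojxkobokxlaktl"
key = {"k": "E"}  # guess: the most frequent letter stands for E

print(ciphertext)
print(apply_partial_key(ciphertext, key))
# tuxjlaktlfckokotyfckojxkobokxlaktl
# ------E----E-E-----E---E---E---E--
```

Every further guess is just another entry in the dictionary, and the two lines stay aligned as long as the output uses a fixed-width font.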

Unfortunately, this is not yet readable. But perhaps what has been the downfall of many historical ciphertexts will work here: perhaps one can guess how the text begins, or which words it contains. Even in the Second World War, many ciphers that were actually almost secure were broken because typical greetings, recurring phrases and easy-to-guess words were used.

In our case, a typical geocaching encoding, let’s assume that the text is a coordinate. Coordinates usually start with North (German “Nord”) or N. The “T”, the first letter of the ciphertext, makes up 12% of the ciphertext, which fits well with the usual statistical frequency of about 10% for the “N”. Let’s try it out:

Well, it’s still not really readable, so let’s move on. But how? One could guess more letters now. The second most frequent letter in the ciphertext is “O”, the third is “L”. According to the letter frequency tables, E, N, I, S, R and A are the most frequent letters, so O and L are likely to be among them. Since “E” and “N” have presumably already been found, only I, S, R and A remain as candidates.

You could also guess words. Does the ciphertext really begin with North? If so, the “U” in the ciphertext would be an “O”, and even more helpful would be the “X”, which occurs three times in the ciphertext and would correspond to an “R”.

The text ends with “en” and one further, still unknown letter. What could be a plausible ending here? Is it a written-out German digit? Which one ends with “en” plus another letter? “fünf” or “fuenf” would be a suitable candidate. If the encryptor chose the “ue” spelling to avoid encrypting German umlauts, then the fifth letter from the end should match the last letter, if it really is “fuenf”. Bingo! The last five letters are “laktl”, and they almost certainly mean “fuenf”.
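Guessing a word and checking where it could fit can be sketched in code as well. The function crib_positions below is hypothetical: it slides the guessed word along the ciphertext and keeps every position where a one-to-one letter substitution is still possible, optionally taking already guessed letters (ciphertext to plaintext) into account.

```python
def crib_positions(ciphertext, crib, known=None):
    """Start indices where `crib` fits without contradicting a one-to-one
    substitution or the letters that are already known."""
    known = known or {}
    hits = []
    for start in range(len(ciphertext) - len(crib) + 1):
        forward = dict(known)                        # ciphertext -> plaintext
        backward = {p: c for c, p in known.items()}  # plaintext -> ciphertext
        for c, p in zip(ciphertext[start:], crib):
            # setdefault stores a new pair or returns the one already fixed
            if forward.setdefault(c, p) != p or backward.setdefault(p, c) != c:
                break
        else:
            hits.append(start)
    return hits

ciphertext = "tuxjlaktlfckokotyfckojxkobokxlaktl"
print(crib_positions(ciphertext, "fuenf"))
print(crib_positions(ciphertext, "fuenf", known={"k": "e", "t": "n"}))
```

Without any known letters several positions share the letter pattern of “fuenf”; with the guesses for “E” and “N” only the positions of “laktl” remain.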

But we could also take the frequency mountain approach. If the ciphertext is only a ROT shift (the alphabet shifted by x positions), this should perhaps stand out here even with so few letters.

And indeed, the large bars seem to repeat at similar spacing at the top and bottom.

The E, I and N in the normal alphabet could correspond to the K, O and T bars in the ciphertext.
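A rough text version of such a bar chart can be produced with a few lines of Python. This is only a sketch with a made-up function name, frequency_mountain, and it draws one “#” per occurrence instead of proper bars.

```python
from collections import Counter
import string

def frequency_mountain(text):
    """One line per letter a-z, with a bar of '#' as long as the letter's count."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    return [f"{letter} {'#' * counts[letter]}" for letter in string.ascii_lowercase]

ciphertext = "tuxjlaktlfckokotyfckojxkobokxlaktl"
lines = frequency_mountain(ciphertext)
print("\n".join(lines))
```

Printed next to the same chart for a typical plaintext, the distance between the tallest bars gives a first idea of the shift.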

This “frequency mountain” would be much more meaningful if the ciphertext were longer. But even the short snippet may be enough, and we see a shift of 6 letters. This is also what CrypTool-Online would suggest if we clicked on the corresponding “ROT check” button.

And actually we could have tried this a few steps earlier, since the letter “E” (in ciphertext “K”) and “N” (in ciphertext “T”) have already been guessed. Both are shifted by 6 letters, so you can at least hope that all letters have been shifted by 6 letters.

But whichever way we choose, with a little practice it takes only minutes to get from the ciphertext tuxjlaktlfckokotyfckojxkobokxlaktl to the original text nordfuenfzweieinszweidreivierfuenf (German for “north five two one two three four five”).
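The check of whether all guessed letters are shifted by the same amount, followed by the actual decryption, might look like this. It is a sketch under the assumption that the text contains only lowercase a-z; shift_between and unshift are names chosen for this example.

```python
def shift_between(plain, cipher):
    """Number of positions from a plaintext letter to its ciphertext letter."""
    return (ord(cipher) - ord(plain)) % 26

def unshift(ciphertext, shift):
    """Undo a rotation by `shift` positions (lowercase a-z only)."""
    return "".join(chr((ord(ch) - ord("a") - shift) % 26 + ord("a")) for ch in ciphertext)

ciphertext = "tuxjlaktlfckokotyfckojxkobokxlaktl"
guesses = {"e": "k", "n": "t"}  # plaintext -> ciphertext, as guessed above

shifts = {shift_between(p, c) for p, c in guesses.items()}
if len(shifts) == 1:  # all guesses agree, so try a plain rotation
    (shift,) = shifts
    print(shift, unshift(ciphertext, shift))
    # 6 nordfuenfzweieinszweidreivierfuenf
```

If the guesses gave different shifts, the alphabet would have been scrambled rather than rotated, and the letter-by-letter approach would be needed.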

Even if the alphabet has been completely scrambled (instead of shifted by x positions), decoding only takes a little longer. The simple approach of using frequency analysis to identify the space and the “E” always helps. In longer texts, look for full stops and commas, which never occur at the beginning of a word and are always followed by a space. Next, try to decode the short words (der, die, das, und, in, im, …) and look for identical passages of text that stand for the same words or the same parts of words.

Runs of identical ciphertext characters are also a nice starting point, since in German only certain letters appear doubled, and these are often enclosed by the same or similar letters. Double consonants are always surrounded by vowels, double vowels by consonants. And of course, you should always look for typical greetings (“Dear Cacher, …”) and goodbyes (“Have a good search”).

And of course the GC Wizard deserves a mention. Its “General Code Solvers” section also contains a tool for cracking monoalphabetic substitution ciphers using frequency analysis. 😉

If frequency analysis does not work, i.e. no character of the ciphertext stands out, then it is probably not a monoalphabetic cipher but possibly a repeated letter shift in which the key alphabet changes every x characters. This can also be decrypted by hand with some effort, but it is considerably more laborious. The important thing here is to use identical ciphertext passages to find out after how many letters the alphabet changes, and how often.
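Searching for identical passages and measuring their distances is tedious by hand but easy to sketch in code. Both function names are my own, and the ciphertext is a made-up example that I built with three alternating shifts, not a text from a real cache. The idea is usually known as the Kasiski examination: the number of alphabets is likely to divide the distances found.

```python
from collections import defaultdict

def repeated_passages(ciphertext, length=3):
    """Map every passage of `length` characters that occurs more than once
    to the list of its start positions."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    return {p: idx for p, idx in positions.items() if len(idx) > 1}

def distances(positions):
    """Gaps between consecutive occurrences of the same passage."""
    return [b - a for a, b in zip(positions, positions[1:])]

sample = "ekhtkhcgqvpgekhbeku"  # hypothetical ciphertext, three alternating shifts
for passage, idx in repeated_passages(sample).items():
    print(passage, idx, distances(idx))
```

Here the repeated passage is 12 characters apart, which is compatible with the three alphabets used to build the sample. A longer text would give several distances, and their common divisors narrow the candidates down.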

A link to a more advanced approach, decoding ciphertext with a spreadsheet by converting and comparing ASCII values, can be found on the Mathebord.