Wednesday, 14 October 2015

Quantum Cryptography


Prerequisite: physics

Modular functions seems impregnable enough but cryptographer just had to take it one step further, harnessing the random unpredictability of light polarization.

Photons can travel as transverse waves. What a polaroid does is filter out waves that oscillate in other directions. Some polaroids allow many oscillation directions while in this diagram below, the polaroid filters out all light waves except ones that oscillate vertically.. but not really.


Diagonal waves have a 50% chance of passing the filter and becoming vertical at the other side, according to Schrödinger's concept of superposition. Another key feature is that one cannot directly observe the polarization of photons due to Heisenberg's uncertainty principle, but one knows for sure the polarization of photons that come after a polaroid.

Let the madness begin.

As a setup for an example of quantum cryptography:

Message is translated into binary.
For the sake of simplicity, assume four possible wave polarizations.
Vertical ( | ) and horizontal ( - ) waves can pass through rectilinear polaroids (+).
Diagonal waves ( / and \ ) can pass through diagonal polaroids (x).
( | ) and ( / ) represents 1.
( - ) and ( \ ) represents 0.

Alice wants to send a key to Bob, and comes up with a pre-key 10011011.
She sets a scheme of randomly alternating polaroids +x++xx+x, then sends polarized photons ( | \ - | / \ | / ) through specific |, -, /, and \ polaroids to Bob, accordingly with her scheme.

A summary of Alice's transmission:

pre-key:         1 0 0 1 1 0 1 1
scheme:         + x + + x x + x
polarization:   | \  -  |   / \   |  /

(The only way to detect the polarization of a photon is by trial and error, where there is only one trial. Eve cannot possibly guess which polaroid to use for an upcoming photon, and using the wrong one will either block out or repolarize a photon, neither of which are desired. She cannot even deduce whether she used the correct polaroid or not, since a photon entering an incorrect polaroid has a 50% chance of getting through. Tampering with the photons with polaroids can also reveal Eve's act of eavesdropping).

Bob on the other end tries to receive the photons with a random polaroid scheme of his own.
Alice and Bob then identify where they used the same scheme, which is also where they both know the correct polarizations, and the resulting fragmented sequence can be used to generate their key (1110 in this example):

pre-key:                     1 0 0 1 1 0 1 1
Alice's scheme:         + x + + x x + x
polarization:               | \  -  |   / \   |  /
Bob's scheme:           + + x + x x x +
filtered scheme:        +        + x x
key:                           1        1 1 0

Eve can overhear what schemes they filtered out, but she cannot know what Bob correctly observed, which is essentially the key. The only way for Eve to know the key is to use the exact same scheme as Bob, which is highly improbable when the key is extremely long.

The incorporation of photons started with Charles Bennett's idea of quantum foolproof money. Last time I checked, a quantum key was successfully exchanged over one kilometer.

Oh dear mortals, what next?

Monday, 12 October 2015

Modular Functions in Encryption

Followup of Code Decryption

Prerequisite: algebra

The toughest of ciphers in Code Decipherment are not impregnable, because algorithms can be thought of as a function. No matter how complex the procedure, the systematic nature gives it away and like algebra, functions can be worked backwards.

Another problem is with key distribution. For the recipient to be able to decrypt a message, the recipient must have the key as well. This is particularly significant to digital communications where the key cannot simply be handed over in person (which defeats the purpose of digital communication!). To send the key digitally requires another key to secure the first key, and another for the second.. like an infinite Matryoshka doll. No way.

The thing to do then, is to 1) find an irreversible function and 2) find a way to decrypt a message without exchanging keys.

The breakthrough idea was formed separately first by James Ellis, Clifford Cocks, and Malcolm Williamson, then Whitfield Diffie, Martin Hellman, Ralph Merkle, Ronald Rivest, Adi Shamir, and Leonard Adleman. The concept can be described with the following analogy:

Alice is trying to send a message to Bob while Eve is eavesdropping.
Alice puts her message in a box.
Alice puts lock A on the box using key A, which she keeps secret.
Alice sends the box to Bob.
Eve cannot open the lock A on the box.
Bob puts another lock B on the box using another key B, which he keeps secret.
Bob sends the box back to Alice.
Eve cannot open lock A and B on the box.
Alice unlocks lock A using key A
Alice sends the box back to Bob.
Eve cannot open lock B on the box.
Bob unlocks lock B using key B.
Bob opens the box and reads Alice's message!

This solves the key distribution problem since no keys are exchanged, but it requires the function to be commutative as well as irreversible. So what kind of function fits the description?

The modular function~

There are two ways to understand it. One is to think of it as a clock. 12 (mod 7) = 5 would be 12 jumps around a 7 hour clock, landing at 5. Another way is to take 12 divided by 7, then state the remainder 5.


The following example applies modular functions not as an algorithm of the message itself, but as the algorithm to exchanging keys.. without exchanging keys:

Alice and Bob agree on a the function 3^x (mod 7) = y.
Eve overhears this function.
Alice chooses 6 as her key A, solves 3^(6) (mod 7) = 1, and reports the answer "1" to Bob.
Bob chooses 10 as his key B, solves 3^(10) (mod 7) = 4, and reports the answer "4" to Alice.
Eve overhears the exchange "1" and "4" but cannot reverse them with the function or do anything.
Alice solves B^A (mod 7), which (4)^(6) (mod 7) = 1.
Bob solves A^B (mod 7), which (1)^(10) (mod 7) = 1.
And so "1" is the agreed key for their message!
(I should have chosen better numbers where in reality the numbers are veeeeeeery large and do not coincide)

One may reverse the function through rigorous trial and error, but it is simply too exhausting when the key can reach astronomical lengths. The way to encrypt messages directly is as follows:

Alice choose two veeeeery laaaaarge prime numbers p and q, which she keeps secret.
Alice multiplies the two numbers to get N, and picks another number n. These two numbers she announces as a public key.
Bob wants to send Alice the letter "B", which he has to digitize first with ASCII binary digits or something.
Bob uses Alice's public keys to encrypt his letter "B" with the formula B^n (mod N) = C, solves for C, and sends it to Alice.
Eve overhears the message C but cannot do anything with it, not even with Alice's public keys since modular functions are not reversible.
Alice solves for a private decryption key d with the formula nd = 1 (mod (p-1)(q-1)) using Euclid's Algorithm (whatever that is).
Alice then deciphers Bob's message with the formula C^d (mod N) = B to get B.

It only gets crazier when you cross cryptography with physics. Just you wait. Be prepared to encounter some photons in a followup post..

Code Decryption

Prerequisite: algebra

Been reading The Code Book: The Science of Secrecy from Ancient Egypt to Quantum Cryptography by Simon Singh. It is a very fascinating read between history, cryptography, and linguistics. In this post I compile some deciphering techniques. Some are simple while others are pure genius. But first, some terminology.

plaintext: original message (notated in lowercase)
ciphertext: enciphered message (notated in capitals)
algorithm: the method of enciphering a message
key: the premise of an enciphering method


A person needs to know both the algorithm and the key in order to decipher a ciphertext. But in many cases, the algorithm is obvious and the key can be traced from it.

Alphabetic Substitution

This is the simplest of algorithms where the alphabet is scrambled up to make a cipheralphabet. A good knowledge of English (or whatever the plaintext is written in) is enough to crack the ciphertext.

If a lone alphabet appears commonly throughout a ciphertext, one can deduce that is either "a" or "i". Similarly, a recurrence of a three letter cipherword is probably "and" or "the". If there is no vowel in a four letter cluster, one of the letters is probably a "y". The letter after the "q" must be a "u". If the spaces are eliminated, one can still guess common suffixes for a start. Lingual rules provide many handholds to decipherment.

Know your spelling rules, substitute what you can, and play a little hangman until you get the whole plaintext. That was how I cracked the Gnommish Alphabet in the Artemis Fowl series back in seventh grade.

Caesar Shift

Actually I lied. The Caesar Shift is even simpler. It shifts the alphabet several places, then uses it as the cipheralphabet. The number of shifts is agreed with the recipient beforehand.


This encipherment was used for extremely short messages, such as one phrase. There are not enough clues to reason with, but this is still a weak cipher considering that one only needs to test twenty six cipheralphabets at most to reach the plaintext. If that sounds like a lot of work to you, read on. You will much rather confront a Caesar Shift.

Vigenère Cipher

"The Indecipherable Cipher" utilizes the Vigenère Square, which is essentially all possible Caesar Shifts lined systematically to make a square:


What happens is that the sender and receiver agree on a keyword, such as "BLUE". To encipher a message, the first letter would be enciphered with the Caesar Shift starting with "B", the second letter with the shift starting with "L", the third with the shift starting with "U", the fourth with "E", and the fifth with "B" again. So the message "pig is hungry" enciphered with the keyword "BLUE" will be "QTAMTSORHCS".

B L U E B L U E B L U
p  i  g  i  s  h u n g  r  y
Q T A MT S O RH C S

This enciphering technique is a polyalphabetic cipher, which alternates between more than one cipheralphabet. This makes it harder to pick out letters by frequency as opposed to a monoalphabetic cipher, where you can almost guess correctly that the most common cipherletter probably represents the plainletter "e".

Charles Babbage figured that frequency analysis plays a big role concerning the nature of Caesar Shifts in the Vigenère Cipher. Arabs first came up with frequency analysis, the association of cipherletters with plainletters by occurrence. What happens is you get an graph describing the frequency distribution of alphabets in a language..


..then compare it to the frequency distribution of cipherletters in your ciphertext (similarly can be done with Zipf's Law for whole words). Match corresponding frequencies of letters and cipherletters, substitute, do some tweaking, and you should have the plaintext. This technique is not particularly significant for general Alphabetic Substitution since logical reasoning is enough to crack the cipher, but it gives a handhold in Vigenère decipherment.

Homophonic Cipher

The previous ciphers were especially vulnerable to letter frequencies. To make up for that, the homophonic cipher uses numbers as the cipheralphabet, and adds more cipherletters to even out the frequencies. Each cipherletter should appear just as often as another.


It takes much more thought to crack this cipher, but it is still possible. One can consider spelling rules, estimate the amount of extra cipherletters for each plainletter, and take both into account. There is much more trial and error, but it is not such a horror compared to the next cipher..

The Enigma

This is where encryption escalates quickly. Why read my words when you can see for yourself? This video on the Enigma Machine tells what you need to know.

Hooooo, what is this monster? From left to right, the machine components are lamp letters, keyboard, plugboard, first scrambler, second scrambler, third scrambler, and reflector. If you trace this diagram carefully, hitting the "C" key gives the output "F".


The Germans with their Enigma Machines changed their agreed scrambler setting everyday in order to securely encipher the scrambler setting of their actual messages. So a person receiving a message would set their Enigma Machine to the agreed day setting, decipher the new scrambler setting, set to the new setting, and then proceed to decipher the actual message. The plugboard setting stays the same.

The Machine is the algorithm and the scrambler setting is the key. The key is six letters long, where the first three are the starting letters of the scramblers and the last three is a repetition. The key in plaintext "pigpig" may be enciphered as "GXWLDN".

To obtain the key, Marian Rejewski cleverly mapped out the chains of letter relations. He analyzed numerous keys of one day setting and paired up the first and fourth letters of each six letter key, since they are repetitions of each other. A relation for A, B, C, D, E, and F can be:

A B C D E F
D A F E B C

Then he organized this relation into chains. In this example there are two separate chains:

four links: A --> D --> E --> B --> A
two links: C --> F --> C

The significance of this organization is that the plugboard cannot interfere with the amount of chains or links, so that it is useful for cracking the scrambler setting. Instead of finding one key among ten thousand million million keys, he had only 105,456 possible chain-link characteristics to consider. And then there was the manual labour of recording the number of chains and links for each scrambler setting, but they did it. It took a year.

When the Germans found out about their flaw they stopped repeating their keys. To find the key, Alan Turing used Rejewski's idea on cribs. A crib is a ciphertext in which you know its plaintext as well, which in their case had to be guessed. A common crib that the Germans provided was "weather" (or "wetter" in German) in their weather reports.

Then he had to figure some plugboard settings as well. The trial and error went something like this:

He had some data based on a crib.
Given that ciphertext "A" is plaintext "b",
assume that "A" is connected to "S" on plugboard.

A --> plugboard --> S --> scramblers --> ? --> plugboard --> b
A --> plugboard --> S --> scramblers --> F --> plugboard --> b

So he joined "A" to "S" on the plugboard, then saw which letter lighted up. If "b" lighted up, it shows that "b" is not connected to any letter on the plugboard. If "F" lighted up, which he knew should be "b", he then assumed that "F" is connected to "b".

All that is good, but what happens when there is a contradiction? Say, three letters seem to be plugged to each other. Yes, those would be incorrect deductions from an incorrect assumption. To turn mistakes into an advantage, Turing realized that all these incorrect deductions are definitely incorrect, and do not need to be tested further. I am still trying to get my head around this one.

And he threw these testings into a bombe.


For more historical context alongside the logic, you ought to read The Code Book. It is pleasant for leisure as well as for study. If this article makes you insecure about your internet privacy, I have another post coming up that will calm your nerves. Wait for it~