Basic Code Breaking

by Joseph B. Zekany

In 28:2, b3ard wrote a good introduction to the RSA algorithm ("Simple RSA Encryption or Human-Calculable Encryption").

The algorithm is named after its creators Rivest, Shamir, Adleman, and is the gold standard for public- and private-key encryption.  It's used by companies like Verisign, who use it to generate certificates of authority for businesses like Amazon.com.  Verisign verifies that you, the customer, are in fact on the real Amazon.com website, then they make sure your purchase information is encrypted and sent from your computer to Amazon.com in the most secure way possible.

b3ard is correct in saying "that learning cryptography [is] tedious and time consuming."  However, I would also say it can be fun and rewarding.

His article did a good job of explaining the mechanics of the RSA algorithm, and how to generate the extremely small or weak key pairs used for his public- and private-key encryption.  What I will cover in this article are some of the weaknesses b3ard made reference to in his paper.  I hope to expand his work by giving the readers a better understanding of basic code breaking.

0x01: Discovery

The first thing I found when I started playing around with the numbers b3ard gave us was a weakness in the generated public and private key pairs.

He did say they were small and weak, however the weakness I thought he was talking about was the fact that the keys were small.  The problem I found was one born out of the fact that I didn't have access to a computer.  All I had was a calculator.  This meant I had to do the "mind-working elementary long division and multiplication."

Let's start by setting up the RSA key pair the way we were shown:
p = 5
q = 7

N = p * q = 35

r = (p - 1) * (q - 1) = (5 - 1) * (7 - 1) =  (4) * (6) = 24

k = (r + 1), ((r + 1) + r), (((r + 1) + r) +r), ..., etc.
This gave us a list of candidate numbers to factor out, thereby obtaining our public and private key pair.

The list of candidate numbers were: 25, 49, 73, 97, 121, 145

In the example we were given:
k = 145
e = 5
d = k/e = 145/5 = 29
So far, so good.

Now, in the next step, we had to substitute our letters for numbers, so we could encrypt our message. This gave us the following list to work with:
Alphabet substitution: a = 01, b = 02, c = 03, etc.
problem = 16, 18, 15, 02, 12, 05, 13

p = 16
r = 18
o = 15
b = 02
l = 12
e = 05
m = 13
He told us to remember that in practice you should always use your counterpart's public key to encrypt our message and not your own.

Here's an example to make things clear.

Bob generates his public-and private keys: e and d

If he wants Alice to send him a message, he must give Alice the public key he generated. In this case he gives Alice: e

Alice can now encrypt the message she want to send Bob, like so:
(message)^e mod N = (16)⁵ mod 35 = 11
She does this operation for every character she wants to send. Bob would get the following numbers: 11, 23, 15, 32, 17, 10, 13

To decrypt the message, Bob now would have to use his private key d, like so:
(cipher)²⁹ mod 35 = 16
This is where the fact that I didn't have a computer comes into play.  You see, my calculator couldn't handle the large numbers and popped an error.

I thought there had to be a better way to crunch these large numbers.  Then I remembered doing a problem from the MIT OpenCourseWare class 6.001: "Structure and Interpretation of Computer Programs."

And it hit me.

The problem required me to decrypt a string of characters, but I didn't have the key.  Back then, I remembered studying hashing algorithms, and that they were a one way operation.  Meaning you could never really decrypt, or reverse, a hash because the hashing algorithm only goes one way.

This has been covered before in these pages, and readers are encouraged to reference back issues.  Anyhow, I sent the encrypted string back through the hashing algorithm, and I had the plain text.

Something about visiting the NSA crypto museum, if I remember correctly.  So that's what I did with the cipher string.  (11, 23, 15, 32, 17, 10, 13).  And guess what?
(cipher)⁵ mod 35 = (11)⁵ mod 35 = 16
Is that right? Let's try that again:
(23)⁵ mod 35 = 18
e = 5 is the public key we generated. This is where I had to find someone with access to a computer. I had them punch in the equation:
(23)²⁹ mod 35 = 18
Now that's a bad thing.

Both e = 5 (our public key) and d = 29 (our private key) decrypt the cipher string!

This means anybody could decrypt our cipher message with our publicly available public key. This is when I decided to generate my own set of numbers to see if I could recreate the issue. I used:
p = 16
r = 18
o = 15
b = 02
l = 12
e = 05
m = 13
as my message string.

The next step was to pick my small prime numbers and generate my key pairs.
p = 5
q = 11

N = p * q = 55

r = (p - 1) * (q - 1) = 40
r = (5 - 1) * (11 - 1) = 4 * 10 = 40

k = ((r + 1) + r) = 81
k = ((40 + 1) + 40) = (41 + 40) =  81

d = k/e = 27
d = 81/e = 27
So my public key is e = 3 and my private key is d = 27. Okay, let's try this again. Alice encrypts her message with Bob's public key:
(message)³ mod 55 = 26
Once Bob has the cipher, he need to decrypt it with his private key:
(cipher)²⁷ mod 55 = (26)²⁷ mod 55 = 16
But what about the issue with the public key decrypting the cipher? Eve can now try using Bob's public key to decrypt Alice's message to Bob:
(26)³ mod 55 = 31
Okay, it looks like the issue is fixed, but there is another problem here.

0x02: Frequency Analysis

I just saw a mathematician demonstrate this same RSA technique.

In his presentation, he used a soldier on the battlefield needing to send a message to another soldier.

Let's say our soldier converts his message. The plain text number string would be: 11, 05, 05, 16, 20, 08, 05, 18, 09, 04, 07, 05

To keep this example as simple as possible, I'm going to use the key pair I just generated, so e = 3, d = 27, and N = 55.

Now let's use our formula to encrypt our plain text string:
(message)³ mod 55 = cipher
This gives us the following cipher string:
11 15 15 26 25 17 15 02 14 09 13 15
Now, at first glance, it looks like we can't decrypt this cipher without Bob's private key.

However, if you take a second look, you'll see there is a pattern to the cipher string.  You see the number 15 repeated four times.

This tells us we are dealing with a substitution cipher, and, whatever the number 15 represents, it's the same character throughout the string.  This is where the context of the cipher and a few simple rules can help us break this code.

In the English language, the most often used letters are: E, T, A, O, N, R, I

Common three letter groups are THE, AND, YOU, so if you saw a group of numbers repeated over and over, say like the group 25, 17, 15, you might take a guess that: T = 25, H = 17, E = 15

Now remember, trial-and-error is the order of the day.  One thing that can help break this code is the context of the cipher.

In this case, a soldier on a battlefield.  What would be important on the battlefield?  Maybe holding the high ground.  In Afghanistan, that would be a good guess.

So what does our cipher look like so far?
11 E E 26 T H E 02 14 09 13 E
Not bad.

We have six characters solved and six unsolved characters.

Now, looking at the cipher string, and taking the context of the cipher into consideration, you might guess: K = 11, P = 26

Okay, now we are getting somewhere:
K E E P T H E 02 14 09 13 E
What's a five letter word for hilltop?  It ends with E.  KEEPTHERIDGE.

With this method, we were able to break this cipher.  We didn't need math.  All we needed was a little reasoning and logic.

Codes have been broken like this for a long time.  b3ard did say this was a weak encryption.

This is fine for an inside joke at the water cooler but not for a soldier sending an important message.  So how can we break up this character pattern?  One way would be to combine the characters into groups.

For example, if we group k = 11, e = 05, we get 115.

Remember, our character groups must be smaller then our modulus.  Group (N).  I've generated a new key pair:
N = 703
r = 648
k = 1945
e = 5
d = 389
Our modulus is now greater than the largest group.  Our cipher is now: 210, 338, 341, 370, 18, 75

The pattern is now broken up and the cipher is much harder to break.

The commercial application of RSA algorithm works with large blocks of data, and uses large prime numbers to create the public and private key pairs.  The difficulty of factoring the products of two large prime numbers is the core mathematical fact underlying the RSA algorithm.

Not to be outdone, Rivest has devised a new problem: "the MIT puzzle."

This should keep college supercomputing centers busy for a while.  The problem is simple to state, and readers who are interested in breaking the code can do a search for it.

I hope this helps the soldiers in Afghanistan.  I would like to think I did something that matters.

Return to $2600 Index