IA Algorithms Enrichment 2017
Activities > Dec 9 > Project ideas

Project ideas

If you're having a hard time deciding what to make for your final project, no worries! Here are some ideas for programming projects that you can use, or you can adapt them into your own project.

Making and breaking a Caesar cipher

Relevant topics: arrays, functions, data types

The Caesar cipher

A Caesar cipher is one of the oldest (and also most insecure) methods to encrypt messages, so named because it was used by Caesar and the Romans. A cipher converts a plaintext message into ciphertext that is ideally unreadable to anyone who doesn't know some shared secret, or key.

The Caesar cipher is fairly straight forward. Suppose we have the plaintext ATTACKATDAWN that we want to encrypt. We then choose a letter of the alphabet, and shift all of the letters in the message by that letter's offset from A. For example, if we choose the letter 'C', we shift all of the letters forward by 2. Our encryption looks something like this:

Plaintext:  ATTACKATDAWN
Key:       +CCCCCCCCCCCC
            ------------
Ciphertext: CVVCEMCVFCYP

Every letter was shifted forward by 2, producing the nonsensical ciphertext CVVCEMCVFCYP. In the event where a key would cause the shift to move past Z, we simply wrap around to A and continue from there. For example:

Plaintext:  ZZYZXCALIFORNIA
Key:       +NNNNNNNNNNNNNNN
            ---------------
Ciphertext: MMLMKPNYVSBEAVN

To recover the plaintext, we simply do the reverse operation as above: using the key C, we subtract the key's offset from A from the ciphertext:

Ciphertext: CVVCEMCVFCYP
Key:       -CCCCCCCCCCCC
            ------------
Plaintext:  ATTACKATDAWN

As before, if a decryption would cause us to move a letter before A, we wrap around to Z and continue.

Ciphertext: MMLMKPNYVSBEAVN
Key:       -NNNNNNNNNNNNNNN
            ---------------
Plaintext:  ZZYZXCALIFORNIA
Conceptual questions
  • What might be an issue with using A as the key?
  • How many possible keys are there? Why might this be a problem?
  • Does the ciphertext reveal anything about the plaintext upon close inspection? Look at what plaintext letters map to in the ciphertext. How might this be exploited? We'll use this weakness in Task 2 to crack the Caesar cipher.

Task 1: Encrypt and decrypt text using a Caesar cipher

First, implement a Caesar cipher in C++. Your code should be capable of both encryption and decryption. To simplify things, assume that all plaintext and ciphertext input will contain capital letters (so no lowercase letters, spaces, punctuation, or numerals).

Hint 1: Doing math on chars

As has been mentioned previously, a char is really just another number to a computer. Due to this, we can do math on letters. For example, we can use C++ to compute the numeric value of the key in the example above:

//the (int) forces cout to output an int instead of a char.
//This would output a non-printing character without it.
cout << (int)('C' - 'A') << endl;

2

We can also add regular ints to chars and get a different letter, based on the offset:

cout << ('A' + 2) << endl;

C

Hint 2: Using a string like an array

Conceptually, a string is just a sequence of chars, and internally, this is exactly how string stores its data. As such, we can use some of the same concepts we saw with arrays and vectors on strings:

string foo = "bar";

//Index into a string
foo[2] = 'z';

//Loop over a string using a range-for loop
for (char c : foo)
	cout << c;

cout << endl;

//Loop over a string using a range-for loop, allowing changes to carry through
for (char &c : foo)
	c += 1; //increment each character by 1

cout << foo << endl;

baz
cb{

An important thing to note here is that A does not have a value of 0 (it actually has a value of 65). You will have to account for this when you write your program.

Hint 3: Modular arithmetic

There's another math operator in C++ that we haven't talked about: the modulo operator, or %. This is a binary operator, much like +, -, *, and /, and it computes the remainder of the division of the left side by the right side. It is often used to determine if a number is even:

int myNum = 3;

if (myNum % 2 == 1)
	cout << "myNum is odd." << endl;
else
	cout << "myNum is even." << endl;

myNum is odd.

However, it is also useful to make number sequences wrap around. For example, if we wanted the result of a computation to produce a number between 0 and 4, we can take the result of the computation modulo 5 to achieve this:

int a, b, c;

a = 4;
b = 3;
c = (a + b) % 5;

cout << c << endl << endl;

for (int i = 0; i < 12; ++i)
	cout << (i % 5); << endl;

2

0
1
2
3
4
0
1
2
3
4
0
1
2

Letter frequency analysis

As we saw in the introduction for this problem, this cipher has some flaws with it. It has a very small key space, and any letters that are the same in the plaintext will all map to the same letter in the ciphertext. We can use this fact to determine both the key and the plaintext given only the cipher text.

Letters in English are not used equally; it is a well-known fact that the letter 'E' is the most commonly used letter in the English language. Because all the Caesar cipher does is shift the letters in the text, we know that all of the Es will have been mapped to the same letter (say, 'J'). As a result, all of the Js in the ciphertext (if E does indeed map to J) should have an appearance frequency roughly equal to that of E in unencrypted English.

To perform the analysis, we can tally up each of the characters in the ciphertext and divide each tally by the total number of characters in the message (that way we have the relative frequencies of each letter). Then, we find the letter that is used most often, or has the closest frequency to E. We then compute the difference between the two letters, modulo 26. In this example, we would compute ('J' - 'E') % 26 = 'F' = 5. We then try to decrypt the message using this key. If the message is readable, we have found the correct key, if it is not, then we must try this process again, but using the second-most used letter in the ciphertexrt and matching it with the English letter that has the closest frequency.

Here is a table of English letter frequencies (source):

Letter Frequency Letter Frequency
A 0.0812 N 0.0695
B 0.0149 O 0.0768
C 0.0271 P 0.0182
D 0.0432 Q 0.0011
E 0.1202 R 0.0602
F 0.0230 S 0.0628
G 0.0203 T 0.0910
H 0.0592 U 0.0288
I 0.0731 V 0.0111
J 0.0010 W 0.0209
K 0.0069 X 0.0017
L 0.0398 Y 0.0211
M 0.0261 Z 0.0007

Task 2: Cracking the Caesar cipher

Disclaimer: It is a federal crime to snoop on or attack an encrypted communications channel without permission. While essentially nothing is encrypted with a Caesar cipher these days (due to it being incredibly easy to crack), you should always exercise caution when performing cryptanalysis on a cipher and be sure that you are acting ethically.

Expand your program to perform a letter frequency analysis on a given ciphertext, outputting the key and corresponding decryption. Since your first try may not produce the correct answer, make sure your program tries multiple different possible keys before exiting. You may find having a decryption function (from task 1) helpful for this task.

Use your program to find the key and plaintext for the following ciphertext:

IFWHZZLZHYLKLCPJLZAOHAHSSVDZVTLWLVWSLAVKHZOMYVTWVPUAHAVWVPUAICLYFMHZADOPSLVAOLYWLVWSLKHZOMYVTWVPUAIAVWVPUAHCLYFMHZAWLVWSLSPCPUNHAWVPUAJILPUNHWVPUAKPYLJASFPUILADLLUHYLVMALUNPCLUAVDVUKLYDOHAZZVNYLHAHIVBAWVPUAHAOHAZVTHUFWLVWSLMYVTWVPUAIHYLZVRLLUAVNLAAOLYLHUKDOHAZZVNYLHAHIVBAWVPUAIAOHAZVTHUFWLVWSLMYVTWVPUAHHYLZVRLLUAVNLAAOLYLAOLFVMALUDPZOAOHAWLVWSLDVBSKQBZAVUJLHUKMVYHSSDVYRVBADOLYLAOLOLJRAOLFDHUALKAVILKVBNSHZHKHTZAOLOPAJOOPRLYZNBPKLAVAOLNHSHEF

Hint: map

For this problem, you may find it useful to associate a char with a double (or vice versa); however, arrays and vectors only allow you to associate ints with another data type. Standard template library to the rescue!

C++ provides a class called map that allows you to create arbitrary mappings from one data type to another. You use it like this:

#include <iostream>
#include <map>
using namespace std;

int main () {
	map<char, double> freqs;
	
	freqs['C'] = 0.0271;
	
	cout << freqs['C'] << endl;
	return 0;
}

0.0271

First, note how the angle brackets have two data types -- char and double. In this context, this means that the map maps chars to doubles, so the index type will be a char and the value type will be double. You can use the indexing operator ([]) to index into a map just like you would an array.

Note how we didn't have to create an index for C above -- we just immediately indexed into our freqs map and assigned a value. Because map entries are not guaranteed to be adjacent values, map is implemented to immediately add an index if it does not already exist.

Here is a map of the English letter frequencies that you can use in your code. Do not worry about the syntax used here; just know that it will return the appropriate frequency if you index into the map with any capital letter.

map<char, double> engFreqs = {{'A', 0.0812}, {'B', 0.0149}, {'C', 0.0271},
								{'D', 0.0432}, {'E', 0.1202}, {'F', 0.023},
								{'G', 0.0203}, {'H', 0.0592}, {'I', 0.0731},
								{'J', 0.001}, {'K', 0.0069}, {'L', 0.0398},
								{'M', 0.0261}, {'N', 0.0696}, {'O', 0.0768},
								{'P', 0.0182}, {'Q', 0.0011}, {'R', 0.0602},
								{'S', 0.0628}, {'T', 0.091}, {'U', 0.0288},
								{'V', 0.0111}, {'W', 0.0209}, {'X', 0.0017},
								{'Y', 0.0211}, {'Z', 0.0007}};

The full documentation for map can be found here. (Member functions can be found towards the bottom of the page).

Follow-up questions

  • What could you change to make the cipher resist this kind of attack? Can you think of any ways in which your modified cipher might be susceptible to attack?
  • Think of another way we could have cracked this cipher. What about this approach might not work so well with modern cipher suites (which typically have key spaces of about 22048 to 24096)?

War (the card game)

Relevant topics: Abstract data types, functions

War is a simple card game played with two players and a standard poker card deck. At the beginning of the game, the deck is shuffled, and each player receives one half of the cards, which they place in a stack face-down in front of them.

Each player then takes the top card off of their deck and lays it face-up between the two players. The player that has the higher-ranked card wins the round, and places the two cards at the bottom of their deck. This process repeats until one player has all of the cards.

Cards ranked are ordered as follows (with the top being the most powerful card, the bottom being the least powerful card):

  • King
  • Queen
  • Jack
  • 10
  • 9
  • 8
  • 7
  • 6
  • 5
  • 4
  • 3
  • 2
  • A

Task

Implement a simulator for War in C++. You will want to make classes for the cards and the players. To get started, you will want to initialize the whole deck by placing cards into a vector. The deck has 4 suits (Hearts, Diamonds, Clubs, Spades), each with all of the different ranks listed above.

Once you have the full deck set up, shuffle it by doing the following:

#include <algorithm>
vector<Card> fullDeck;

//Initialization here

random_shuffle(fullDeck.begin(), fullDeck.end());

//fullDeck is now shuffled.

Once you have shuffled the deck, split it into two halves. I have provided two (fully-implemented) Deck classes that will ease the implementation of the game; you can use this for the two players' decks. You can add cards to a Deck like so:

Deck playerDeck;

playerDeck.push(fullDeck[0]);

Once you have built the players' decks, start playing the game. You can get the top card from a Deck like so:

Card &card = playerDeck.pop();

Make sure that you stop the loop between rounds so that the user can see the result of each rounds. You can make the program wait until the user presses Enter by using the following code:

cin.ignore();

You can view the starter code here.