In a simple substitution cipher, one character is substituted for another. Here is a simple example:
| A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
| R | Z | B | U | Q | K | F | C | P | Y | E | V | L | S | N | G | W | O | X | D | J | I | A | H | T | M |
To encode some text, simply find each character in the text in the first line, and replace it by the character below it. For example, using the example above, if you encode the word ``BIRDBRAIN'', you get ``ZPOUZORPS''. To decode, reverse the process--for the first character in ``ZPOUZORPS'', find ``Z'' in the lower line, look above it to get ``B''--the first letter of ``BIRDBRAIN'', et cetera.
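The table lookup described above can be sketched in a few lines of Python. This is only an illustration: the names `PLAIN`, `CIPHER`, `encode`, and `decode` are chosen for this example and are not part of the original text, and `CIPHER` is simply the bottom row of the table typed out as a string.

```python
import string

# Top row of the table: the ordinary alphabet.
PLAIN = string.ascii_uppercase
# Bottom row of the table: the substitute for each letter (copied from the example).
CIPHER = "RZBUQKFCPYEVLSNGWOXDJIAHTM"

# Translation tables: top row -> bottom row, and bottom row -> top row.
ENCODE_TABLE = str.maketrans(PLAIN, CIPHER)
DECODE_TABLE = str.maketrans(CIPHER, PLAIN)


def encode(text):
    """Replace each letter by the letter below it in the table."""
    return text.upper().translate(ENCODE_TABLE)


def decode(text):
    """Find each letter in the bottom row and take the letter above it."""
    return text.upper().translate(DECODE_TABLE)


print(encode("BIRDBRAIN"))  # ZPOUZORPS
print(decode("ZPOUZORPS"))  # BIRDBRAIN
```

Characters that are not letters (spaces, punctuation) pass through unchanged, which matches how the longer passage later in this section is written.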
If you have a lot to decode, it is easier to invert the table above to get the table below. With the inverted table, decoding is much easier, since the letters of the encoded word are now in alphabetical order in the top line.
| A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
| W | C | H | T | K | G | P | X | V | U | F | M | Z | O | R | I | E | A | N | Y | D | L | Q | S | J | B |
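Inverting the table is mechanical, so it can be checked with a short script. The following is a minimal sketch; `invert_key` and the variable names are made up for this example, and the encoding row is the one from the first table.

```python
import string

ALPHABET = string.ascii_uppercase
# Bottom row of the encoding table: ENCODE_ROW[i] is the substitute for ALPHABET[i].
ENCODE_ROW = "RZBUQKFCPYEVLSNGWOXDJIAHTM"


def invert_key(encode_row):
    """Return the bottom row of the decoding table for the given encoding row."""
    # Map each cipher letter back to the plain letter that produced it.
    inverse = {cipher: plain for plain, cipher in zip(ALPHABET, encode_row)}
    # Read the inverse mapping off in alphabetical order of the cipher letters.
    return "".join(inverse[letter] for letter in ALPHABET)


DECODE_ROW = invert_key(ENCODE_ROW)
print(DECODE_ROW)  # WCHTKGPXVUFMZORIEANYDLQSJB
```

Inverting twice gives back the original row, which is a handy sanity check when building such tables by hand.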
These simple substitution ciphers are fairly easy to ``crack''--the problem is that in English (or any language), certain letters are far more likely to appear. In English, for example, the letter ``E'' is far more likely to appear than the letter ``Z''. In fact, here is a list of the letters used in English arranged approximately in order of usage (``E'' is the most used letter; ``Z'' is least). The approximate percentages for the first few letters in the list below are: E: 12.7%, T: 9.1%, A: 8.2%, O: 7.5%, and the percentages for the last few are: J: 0.2%, Q: 0.1%, Z: 0.1%.
| E | T | A | O | I | N | S | H | R | D | L | U | C | M | W | F | G | Y | P | B | V | K | X | J | Q | Z |
Following is a short passage encoded with a simple substitution cipher:
UJEJVZR QFEYGE, SV SO OFSU, JWIG FEESTGU FV VZG UJJE JC FW FQFEVLGWV SW PZSIZ F NASVVGESWN QFEVR PFO VFYSWN QAFIG. FV QEGISOGAR VZG OFLG LJLGWV, F XGFKVSCKA XKV TFIKJKO OZJPNSEA FEESTGU FV VZG UJJE. CJE F LJLGWV, VZGEG PFO ZGOSVFVSJW JW XJVZ OSUGO, FWU VZGW VZG OZJPNSEA OVGQQGU XFIY VJ LFYG PFR, OFRSWN, "FNG XGCJEG XGFKVR." "WJV FV FAA!" OFSU UJEJVZR QFEYGE, OFSASWN VZEJKNZ. "QGFEAO XGCJEG OPSWG!"
To try to crack this cipher, begin by counting the number of occurrences of each letter. Doing so gives the following counts:
| A | B | C | D | E | F | G | H | I | J | K | L | M |
| 10 | 0 | 5 | 0 | 24 | 35 | 34 | 0 | 6 | 24 | 7 | 7 | 0 |
| N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
| 9 | 18 | 7 | 9 | 7 | 22 | 3 | 11 | 31 | 16 | 7 | 5 | 16 |
Since the text sample was relatively small, we can't be certain that the most common letter (``F'' in the sample above) stands for the letter ``E'', but it's a pretty good bet that you'll find ``E'' among the letters ``F'', ``G'', and ``V''.
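The counting step is easy to automate. Here is a minimal sketch using Python's standard `collections.Counter`; the function name `letter_counts` is an assumption for this example, and `CIPHERTEXT` is the encoded passage from this section typed out as a string.

```python
from collections import Counter

# The encoded passage from the text (punctuation and spaces are ignored when counting).
CIPHERTEXT = (
    'UJEJVZR QFEYGE, SV SO OFSU, JWIG FEESTGU FV VZG UJJE JC FW '
    'FQFEVLGWV SW PZSIZ F NASVVGESWN QFEVR PFO VFYSWN QAFIG. FV '
    'QEGISOGAR VZG OFLG LJLGWV, F XGFKVSCKA XKV TFIKJKO OZJPNSEA '
    'FEESTGU FV VZG UJJE. CJE F LJLGWV, VZGEG PFO ZGOSVFVSJW JW XJVZ '
    'OSUGO, FWU VZGW VZG OZJPNSEA OVGQQGU XFIY VJ LFYG PFR, OFRSWN, '
    '"FNG XGCJEG XGFKVR." "WJV FV FAA!" OFSU UJEJVZR QFEYGE, OFSASWN '
    'VZEJKNZ. "QGFEAO XGCJEG OPSWG!"'
)


def letter_counts(text):
    """Count how often each letter occurs, ignoring everything that is not a letter."""
    return Counter(ch for ch in text.upper() if ch.isalpha())


counts = letter_counts(CIPHERTEXT)
# The three most frequent cipher letters are the best candidates for E, T, and A.
print(counts.most_common(3))  # [('F', 35), ('G', 34), ('V', 31)]
```

A `Counter` returns 0 for letters that never occur, so `counts["B"]` is 0, in agreement with the table.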
The structure of English gives plenty of other clues as well. For example, the one-letter word ``F'' appears three times in the text, and the only common one-letter words in English are ``I'' and ``A'', so ``F'' must stand for one of them. Since ``F'' is very common in the sample above, it is more likely to stand for ``A'', since ``A'' is much more common in English than ``I''. So the first guess you might make is that ``F'' stands for ``A''. The word ``FV'' appears in the text four times, and ``V'' is also very common. ``AT'' is a word in English, so perhaps ``V'' stands for ``T'' in the cipher. These are just guesses, but they are not bad guesses.
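Making those substitutions gives us the following (each line of guesses sits above the cipher line it decodes, and a dash marks a letter not yet guessed):
----T-- -A----, -T -- -A--, ---- A------ AT T--
UJEJVZR QFEYGE, SV SO OFSU, JWIG FEESTGU FV VZG
---- -- A- A-A-T---T -- ----- A ---TT----- -A-T-
UJJE JC FW FQFEVLGWV SW PZSIZ F NASVVGESWN QFEVR
-A- TA---- --A--. AT --------- T-- -A-- -----T,
PFO VFYSWN QAFIG. FV QEGISOGAR VZG OFLG LJLGWV,
A --A-T---- --T -A----- -------- A------ AT T--
F XGFKVSCKA XKV TFIKJKO OZJPNSEA FEESTGU FV VZG
----.
UJJE.
--- A -----T, T---- -A- ----TAT--- -- --T- -----,
CJE F LJLGWV, VZGEG PFO ZGOSVFVSJW JW XJVZ OSUGO,
A-- T--- T-- -------- -T----- -A-- T- -A-- -A-,
FWU VZGW VZG OZJPNSEA OVGQQGU XFIY VJ LFYG PFR,
-A----, "A-- ------ --A-T-."
OFRSWN, "FNG XGCJEG XGFKVR."
"--T AT A--!" -A-- ----T-- -A----, -A-----
"WJV FV FAA!" OFSU UJEJVZR QFEYGE, OFSASWN
T------. "--A--- ------ -----!"
VZEJKNZ. "QGFEAO XGCJEG OPSWG!"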
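Filling in the guesses by hand is tedious and error-prone, so it helps to let a program do it. The sketch below is one possible way, not a fixed method: `apply_guesses` is a name invented for this example, and using a dash for letters that have no guess yet is an assumption made for readability.

```python
def apply_guesses(ciphertext, guesses, blank="-"):
    """Show the plain letter for every guessed cipher letter and a blank for the rest.

    `guesses` maps cipher letters to the plain letters we currently believe in.
    Spaces and punctuation are copied unchanged so the words keep their shape.
    """
    shown = []
    for ch in ciphertext:
        if ch.isalpha():
            shown.append(guesses.get(ch, blank))
        else:
            shown.append(ch)
    return "".join(shown)


line = "UJEJVZR QFEYGE, SV SO OFSU, JWIG FEESTGU FV VZG"

# First round of guesses: F stands for A, V stands for T.
print(apply_guesses(line, {"F": "A", "V": "T"}))
# ----T-- -A----, -T -- -A--, ---- A------ AT T--
```

Because the guesses live in an ordinary dictionary, trying a new guess or withdrawing a bad one is a one-line change, and the whole passage can be redisplayed at once.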
The text above offers many more clues. In the first line is the word ``SV'', where ``V'' may stand for ``T''. The only common two-letter words in English ending in ``T'' are ``AT'' and ``IT'', and we've already guessed that ``F'' stands for ``A'', so ``S'' is probably ``I''. Also, since we think we know which letters stand for ``T'' and ``A'', the other extremely common letter, ``G'', probably stands for ``E''. Finally, in the next-to-last line is the word ``FAA'': a three-letter word beginning with ``A'' and ending in a doubled letter. In English, the doubled cipher letter ``A'' is almost certainly ``L'', ``D'', or ``S'', and ``at all'' makes much more sense than ``at add'' or ``at ass'', so the cipher letter ``A'' is probably ``L'':
----T-- -A--E-, IT I- -AI-, ---E A--I-E- AT T-E
UJEJVZR QFEYGE, SV SO OFSU, JWIG FEESTGU FV VZG
---- -- A- A-A-T-E-T I- --I-- A -LITTE-I-- -A-T-
UJJE JC FW FQFEVLGWV SW PZSIZ F NASVVGESWN QFEVR
-A- TA-I-- -LA-E. AT --E-I-EL- T-E -A-E ---E-T,
PFO VFYSWN QAFIG. FV QEGISOGAR VZG OFLG LJLGWV,
A -EA-TI--L --T -A----- -----I-L A--I-E- AT T-E
F XGFKVSCKA XKV TFIKJKO OZJPNSEA FEESTGU FV VZG
----.
UJJE.
--- A ---E-T, T-E-E -A- -E-ITATI-- -- --T- -I-E-,
CJE F LJLGWV, VZGEG PFO ZGOSVFVSJW JW XJVZ OSUGO,
A-- T-E- T-E -----I-L -TE--E- -A-- T- -A-E -A-,
FWU VZGW VZG OZJPNSEA OVGQQGU XFIY VJ LFYG PFR,
-A-I--, "A-E -E---E -EA-T-."
OFRSWN, "FNG XGCJEG XGFKVR."
"--T AT ALL!" -AI- ----T-- -A--E-, -AILI--
"WJV FV FAA!" OFSU UJEJVZR QFEYGE, OFSASWN
T------. "-EA-L- -E---E --I-E!"
VZEJKNZ. "QGFEAO XGCJEG OPSWG!"
From here, it's easy to make progress. In the next-to-last line, ``WJV FV FAA'' is almost certainly ``NOT AT ALL'', so ``W'' is ``N'' and ``J'' is ``O''. Similarly, the word ``VZG'' is almost certainly ``THE'', so ``Z'' is ``H''. Thus we obtain:
-O-OTH- -A--E-, IT I- -AI-, ON-E A--I-E- AT THE
UJEJVZR QFEYGE, SV SO OFSU, JWIG FEESTGU FV VZG
-OO- O- AN A-A-T-ENT IN -HI-H A -LITTE-IN- -A-T-
UJJE JC FW FQFEVLGWV SW PZSIZ F NASVVGESWN QFEVR
-A- TA-IN- -LA-E. AT --E-I-EL- THE -A-E -O-ENT,
PFO VFYSWN QAFIG. FV QEGISOGAR VZG OFLG LJLGWV,
A -EA-TI--L --T -A--O-- -HO--I-L A--I-E- AT THE
F XGFKVSCKA XKV TFIKJKO OZJPNSEA FEESTGU FV VZG
-OO-.
UJJE.
-O- A -O-ENT, THE-E -A- HE-ITATION ON -OTH -I-E-,
CJE F LJLGWV, VZGEG PFO ZGOSVFVSJW JW XJVZ OSUGO,
AN- THEN THE -HO--I-L -TE--E- -A-- TO -A-E -A-,
FWU VZGW VZG OZJPNSEA OVGQQGU XFIY VJ LFYG PFR,
-A-IN-, "A-E -E-O-E -EA-T-."
OFRSWN, "FNG XGCJEG XGFKVR."
"NOT AT ALL!" -AI- -O-OTH- -A--E-, -AILIN-
"WJV FV FAA!" OFSU UJEJVZR QFEYGE, OFSASWN
TH-O--H. "-EA-L- -E-O-E --INE!"
VZEJKNZ. "QGFEAO XGCJEG OPSWG!"
From what we have above, ``FWU'' is clearly ``AND'', so ``U'' codes for ``D''. ``ZGOSVFVSJW'' is ``HESITATION'', so ``O'' codes for ``S''. ``VZGEG'' is either ``THESE'' or ``THERE'', but ``S'' is already accounted for by ``O'', so ``E'' codes for ``R'':
DOROTH- -AR-ER, IT IS SAID, ON-E ARRI-ED AT THE
UJEJVZR QFEYGE, SV SO OFSU, JWIG FEESTGU FV VZG
DOOR O- AN A-ART-ENT IN -HI-H A -LITTERIN- -ART-
UJJE JC FW FQFEVLGWV SW PZSIZ F NASVVGESWN QFEVR
-AS TA-IN- -LA-E. AT -RE-ISEL- THE SA-E -O-ENT,
PFO VFYSWN QAFIG. FV QEGISOGAR VZG OFLG LJLGWV,
A -EA-TI--L --T -A--O-S SHO--IRL ARRI-ED AT THE
F XGFKVSCKA XKV TFIKJKO OZJPNSEA FEESTGU FV VZG
DOOR.
UJJE.
-OR A -O-ENT, THERE -AS HESITATION ON -OTH SIDES,
CJE F LJLGWV, VZGEG PFO ZGOSVFVSJW JW XJVZ OSUGO,
AND THEN THE SHO--IRL STE--ED -A-- TO -A-E -A-,
FWU VZGW VZG OZJPNSEA OVGQQGU XFIY VJ LFYG PFR,
SA-IN-, "A-E -E-ORE -EA-T-."
OFRSWN, "FNG XGCJEG XGFKVR."
"NOT AT ALL!" SAID DOROTH- -AR-ER, SAILIN-
"WJV FV FAA!" OFSU UJEJVZR QFEYGE, OFSASWN
THRO--H. "-EARLS -E-ORE S-INE!"
VZEJKNZ. "QGFEAO XGCJEG OPSWG!"
From here, it's easy. Fill in the obvious letters for a couple of passes to obtain the final decryption:
DOROTHY PARKER, IT IS SAID, ONCE ARRIVED AT THE
UJEJVZR QFEYGE, SV SO OFSU, JWIG FEESTGU FV VZG
DOOR OF AN APARTMENT IN WHICH A GLITTERING PARTY
UJJE JC FW FQFEVLGWV SW PZSIZ F NASVVGESWN QFEVR
WAS TAKING PLACE. AT PRECISELY THE SAME MOMENT,
PFO VFYSWN QAFIG. FV QEGISOGAR VZG OFLG LJLGWV,
A BEAUTIFUL BUT VACUOUS SHOWGIRL ARRIVED AT THE
F XGFKVSCKA XKV TFIKJKO OZJPNSEA FEESTGU FV VZG
DOOR.
UJJE.
FOR A MOMENT, THERE WAS HESITATION ON BOTH SIDES,
CJE F LJLGWV, VZGEG PFO ZGOSVFVSJW JW XJVZ OSUGO,
AND THEN THE SHOWGIRL STEPPED BACK TO MAKE WAY,
FWU VZGW VZG OZJPNSEA OVGQQGU XFIY VJ LFYG PFR,
SAYING, "AGE BEFORE BEAUTY."
OFRSWN, "FNG XGCJEG XGFKVR."
"NOT AT ALL!" SAID DOROTHY PARKER, SAILING
"WJV FV FAA!" OFSU UJEJVZR QFEYGE, OFSASWN
THROUGH. "PEARLS BEFORE SWINE!"
VZEJKNZ. "QGFEAO XGCJEG OPSWG!"
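Once a passage is solved, the key can be read off by pairing each cipher letter with the plain letter above it. The sketch below does this for the first line only; `recover_key` is a hypothetical helper written for this example, and it also checks that no cipher letter is ever asked to stand for two different plain letters, which is a useful test of a finished solution.

```python
def recover_key(cipher_line, plain_line):
    """Pair up cipher and plain letters position by position and return the mapping."""
    if len(cipher_line) != len(plain_line):
        raise ValueError("the two lines must have the same length")
    key = {}
    for cipher_ch, plain_ch in zip(cipher_line, plain_line):
        if not cipher_ch.isalpha():
            continue  # spaces and punctuation carry no key information
        # A simple substitution cipher must decode each letter the same way every time.
        if key.setdefault(cipher_ch, plain_ch) != plain_ch:
            raise ValueError(f"{cipher_ch} decodes inconsistently")
    return key


cipher_line = "UJEJVZR QFEYGE, SV SO OFSU, JWIG FEESTGU FV VZG"
plain_line = "DOROTHY PARKER, IT IS SAID, ONCE ARRIVED AT THE"

key = recover_key(cipher_line, plain_line)
print(key["F"], key["V"], key["G"])  # A T E
```

Running the same helper over every line of the solution would give the full key for the letters that occur in the passage. The cipher letters B, D, H, and M never appear in it (their counts are 0), so their plain equivalents cannot be recovered from this sample alone.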