MASTER SOFTWARE CORPORATION
59 DeGarmo Hills Road
Wappingers Falls, NY 12590




QUALITY SOFTWARE SINCE 1958
Last updated 11/14/07

A QUICK OVERVIEW OF CRYPTOGRAPHY



1. INTRODUCTION




     There have been many cryptographic methods invented since writing
itself was developed.  This overview is designed to give prospective users
of a cryptographic system a broad idea of where their system fits into the
whole spectrum of cryptographic methods and algorithms.  It also can give
them an idea of how strong a particular method might be in relation to
other methods, and a general concept of how the strength of a cryptographic
system can be judged.

     We have tried to write this overview using a minimum of math, so that
non-technical readers can understand most or all of the material.  None of
the subjects is covered to any depth, in the hope that the entire survey
can be read in less than 1 hour.

     This treatment deals only with the different kinds of algorithms or
methods used in cryptography.  There are many other considerations that go
into choosing a cryptographic algorithm.  Most older methods are intended
only for sending hand-written messages.  Among modern methods, some are
best-suited for protecting files on a computer, others are suited for
encrypting messages, while others are specialized for exchanging
cryptographic keys.

     These issues are considered elsewhere on this website.  See
www.mastersoftware.biz/crypt-ch.htm for example.



TABLE OF CONTENTS


1. INTRODUCTION

2. CRYPTOGRAPHIC METHODS
  2.1. Secret Method Cryptography
  2.2. Inherent Key Cryptography
  2.3. Hidden Key Cryptography
    2.3.1. No key required

3. SECRET KEY CRYPTOGRAPHY
  3.1. Substitution Ciphers
    3.1.1. Data Encryption Standard (DES)
    3.1.2. Advanced Encryption Standard (AES)
    3.1.3. GK-Crypt
    3.1.4. Linearity
  3.2. Transposition Ciphers
  3.3. Stream Ciphers
    3.3.1. Random numbers
    3.3.2. One-time pad
    3.3.3. OP-Crypt

4. PUBLIC KEY CRYPTOGRAPHY

5. PRIVATE KEY CRYPTOGRAPHY




2. CRYPTOGRAPHIC METHODS



     Most cryptographic methods, from ancient to modern, fall into a few
broad classes:

     Secret method
     Inherent key
     Hidden key
     Secret key
     Public key
     Private key

     In secret method cryptography the strength of the cipher comes from
keeping the method itself secret.  Most of the encryption methods of
ancient times, such as the atbash cipher mentioned in "The DaVinci Code,"
were secret method ciphers.

     Inherent key methods rely on a secret key that is built right into the
device, or right into the software.  The scytale or skytale used by the
pharaohs of ancient Egypt was an inherent key device; the diameter of the
cylinder was the key.  Some of the machine ciphers used during World War II
were of this type.  The machine would be constructed using a moving chain
with lugs, or a turning wheel with pins that drove the mechanism.  The
chain or wheel could have billions of possible settings.  When the
government bought the machine, technicians would set the lugs or pins to
some fixed arrangement before issuing the machines to the field.  All of
the machines in use would be fixed at the same setting.

     Hidden key methods depend on hiding the key for their strength.  The
most common method is to hide the key within the message itself.  For
example, it might be agreed that the fifth block of the message would
contain the key.  A more sophisticated variant is to combine the
information hidden in the message with some other information, such as the
date when the message was sent, to form the key.  In modern computer file
encryption programs, the key is sometimes placed in a key file which may be
encrypted with some inherent key method.  A slightly stronger version uses
a control file which contains the name of the key file, so the user can
give the key file any desired name, for example an innocent-sounding name
like chicken_recipe.

     Secret key methods get their strength from a secret key that is shared
by the sender and the receiver of the message.  Users may try to keep the
method secret, but they don't rely on that.  The strength rests in the key,
which must be carefully chosen.  Secret key methods are sometimes called
symmetric because the sender and the receiver use the same key.

     Public key methods rely on mathematical functions which are called
one-way functions because they are easy to compute, but very difficult to
reverse.  One example is multiplying two large prime numbers.  (Prime
numbers are numbers that cannot be formed by multiplying two other numbers
that are both greater than 1.  So 2, 3, 5, 7, 11 and 13 are prime, but
4=2x2, 6=2x3 and 9=3x3 are not prime.)  It is easy for a computer to
multiply two 100-digit primes to get a 200-digit product, but very hard to
find those 100-digit primes if you are given the 200-digit product.  Each
user of the public key system will have a public key, known to everyone,
that people can use to send messages, and a private key that is used to
read the messages.  Public key methods are sometimes called asymmetric
methods because the sender and receiver use different keys.

     Private key cryptography is the newest form.  In private key methods,
the sender and the receiver each have a separate private key, known only to
themselves, and not known to each other.  The sender encrypts the message
with the sender's private key.  Then certain mathematical functions are
used to change the message so that the sender's private key is removed and
replaced by the receiver's private key.  This allows the receiver to
decrypt and read the message.

     You can imagine the message as a jewel in a strongbox with two locks.
First the sender's lock is placed on the box, then the receiver's lock is
added.  When the sender's lock is removed the receiver can remove his own
lock to obtain the jewel.  The NK-Crypt and N2-Crypt encryption packages
use mathematical functions that can accomplish this securely.

     The following sections will discuss these classes of encryption
methods in more detail.



2.1. Secret Method Cryptography


     In Secret Method cryptography, the method, and often the fact that
cryptography is being used, are kept secret.  This is its only source of
security.  One example is a cipher designed for young children where A is
represented by a picture of an apple or an ape, B is represented by a bird
or a boat, C by a carrot or a cat, and so forth.  A message is a series of
such pictures.  The method is useful for teaching children the sounds of
the letters, but offers no secrecy.

     Secret method cryptography is primarily of historic interest.  In
ancient times, the message might be worked into the design on the
messenger's clothing, or written on the messenger's shaved scalp.  In the
weeks it took to deliver the message, the hair would grow back and conceal
the message.  A modern analogue would be the hidden patterns on the backs
of marked playing cards.

     Secret method cryptography, no matter how clever, is not considered
safe because anyone who learns the method or obtains a copy of the device,
can read all of the messages, both past and future.

     Unfortunately, some of the computer encryption methods in use today
are still of this type.  For example, one method is to exclusive-or
(combine) the contents of each computer file with its file name, scrambled
in a fixed manner, over and over.  For such weak secret key methods, simply
obtaining a single message and its translation would allow an opponent to
break the system.  Once broken, a secret method is broken forever, since
there are no keys the users can change to prevent reading future messages.

     These methods are still occasionally used either because the maker is
not aware of the danger, or because secret method encryption is both easy
to implement and easy to use.  If an encryption product says that no keys
are needed, and no secret files are used, then it is a good bet that it is
a secret method product.



2.2. Inherent Key Cryptography


     Inherent Key methods rely on a secret key that is built right into the
device, or right into the software.  The turning grille cipher, which was
used from the 17th through the early 20th centuries was an inherent key
device.  It used a thin square metal sheet with holes cut in it at various
spots.  The sender would place the grille over a sheet of paper, and write
one letter of the message through each hole.  When all of the holes had
been used, the grille would be turned and letters would be written through
the holes in their new positions.  The receiver would read the message the
same way.

     The positions of the holes constitute the secret key.  Since the holes
cannot be moved, the key is inherent in the device itself.

     Inherent key cryptography is nearly as weak as secret method
cryptography.  It is not considered safe because anyone who obtains a copy
of the same software, or the same device, can read all of your messages,
both past and future.  Thus the vendor, any government agency, and even any
other purchaser could read your files.

     Some inherent key software products are distributed with a different
key built into each copy of the program, just as combination locks are sold
with different settings.  This is better, but anyone who copies the
software from your computer, or from a backup disk, or from your online
backup service, can read all of your files or messages.

     Since inherent key cryptographic products are generally produced by
people with little knowledge of cryptography, the built-in keys and the
encryption method are also likely to be weak.

     Once again, if the product says that no keys or passwords are
required, be very wary.  It is not likely to be a strong product.



2.3. Hidden Key Cryptography


     Hidden Key methods depend on hiding the cryptographic key for their
strength.  It is like people hiding the keys to their home under the
doormat, on the lintel, or inside a fake stone in a flowerpot.

     Hidden key cryptographic software is now very popular with the
less-experienced suppliers of privacy products.  It offers the user the
convenience of not needing to enter a password or keyword each time the
software is used, but this convenience comes with the price of reduced
security.

     These products typically keep the key in a file that is itself
encrypted with an inherent key that is built into the program.  Anyone who
gets access to your computer, or to your backup disks, can copy that key
file, and then can read your files or your messages.

     There is nothing inherently wrong with the idea of storing file keys
or message keys in a key file, sometimes called a master file.  Many of the
best encryption products also do this.  But the master file must itself be
protected with a strong key, sometimes called a master key, otherwise the
file keys are vulnerable.  The master key must not be an inherent key (see
the preceding section) or a hidden key, but must be a secret key kept by
the user.

     Some products will take the hidden key idea several steps.  They will
have separate key files, which are encrypted with key file superkeys kept
in a superkey file, which is encrypted with a master key stored in a hidden
file whose name is stored in a control file which is encrypted with an
inherent key.  This is still not safe!!  Someone who gets access to your
computer, or to some other computer on your local network, can use the
encryption software on your computer to read your files.  They don't need
any keys or special knowledge to do this, because the encryption software
already has that ability built in.

     As long as the top-level key is kept on your computer, whether it is
in a key file or built into the encryption software, someone can find that
key and use it to read your files.


2.3.1. No key required

     The same blanket warning applies equally to all cryptographic packages
that can be run without entering a key or password.  These products may use
secret method cryptography, inherent key cryptography, or hidden key
cryptography.  Use caution.  If the product says that no keys or passwords
are required, then you are taking a serious risk.  If the product allows
only a short password, then you are taking a serious risk.  If the product
allows long passwords, but you choose a short simple password, then you are
taking a serious risk.

     The risk is present even when a very strong encryption algorithm is
being used.  For example, a disk encryption product may use 256-bit AES
encryption, and automatically produce the 256-bit key using the latest
random-number generating techniques, but if the user can access the
contents of the disk without a password, or by using at most a simple
password, then the product is not safe.



3. SECRET KEY CRYPTOGRAPHY



     Nearly all of the cryptographic methods developed over the past 400
years are secret key methods.  Nearly all of the cryptographic methods in
use today by govermments, and especially by the military, are secret key
methods.

     Secret key methods generally fall into two classes, block ciphers and
stream ciphers.  Block ciphers are further divided into substitution
ciphers and transposition ciphers.

     Substitution ciphers take blocks of one or more characters at at time
and replace them with other blocks according to some fixed rule.  The most
common example is the simple substitution cipher, often found on newspaper
puzzle pages, where each letter of the alphabet is consistently replaced by
a different letter of the alphabet.  A slightly more complex example might
use a table of 2-letter or 3-letter combinations, and use that table to
substitute for 2 or 3 letters of the message at a time.

     Transposition ciphers change the order of the characters in a message.
A simple example is where the letters of the message are written into a
rectangular grid going left to right across the rows.  Then the letters are
taken out of the grid going top to bottom down the columns.  As a general
rule, ciphers that use only transposition, no matter how complex, are not
considered secure.  The transposition must be combined with other
encryption steps to be secure.

     Stream ciphers generate a stream of keys, such as letters or numbers.
Each key in the stream is used to encrypt one character of the message.  In
classical methods the key stream was simply some key word or key phrase
repeated over and over for the length of the message.  The most common
modern method is to use a random number generator to produce the stream of
keys.  As a general rule, the random number generators used in most
statistical applications are not strong enough to produce a secure
encryption.



3.1. Substitution Ciphers


     Substitution ciphers generally take one character or block of
characters at a time and replace it with another.  The simplest form of
substitution cipher uses a table in which all of the blocks of a given size
appear in scrambled order.  You replace a given block with the
corresponding block from the table.  For example, using blocks of 3
uppercase letters (trigrams), you might have

     AAA  WCY
     AAB  GNR
     AAC  TDW
     AAD  ZOF
       etc.

Various methods are used to make this type of substitution depend upon the
key.  For example the key could be used to scramble the order of the blocks
in either or both of the lists.

     The table method becomes impractical for blocks of more than 3
characters, especially on computers, where there are 256 different
characters rather than 26.  A table of 4-character blocks would contain
256x256x256x256 entries, or about 4.3 billion entries, each containing 4
characters.

     For larger blocks of characters the substitution is usually
accomplished by doing a series of operations on single characters or pairs
of characters within the block.  A simple example using blocks of N
characters would use a table of pairs.  First the pair 1,2 is substituted,
then the pair 2,3, and so forth.  When the end of the block is reached, it
would wrap around using pair N,1.  This method starts to be secure when at
least 5 complete cycles are taken, particularly if a different pair
substitution table is used for each round.  Here is an example of one round

     COMPUTER      CO becomes XM
     XMMPUTER      MM becomes KD
     XKDPUTER      DP becomes RH
     XKRHUTER      HU becomes FL
     XKRFLTER      LT becomes BC
     XKRFBCER      CE becomes WT
     XKRFBWTR      TR becomes SV
     XKRFBWSV      VX becomes LD
     DKRFBWSL


3.1.1. Data Encryption Standard (DES)

     Some modern block substitution ciphers are the Data Encryption
Standard (DES), the Advanced Encryption Standard (AES), and the newer
GK-Crypt algorithm.  The Data Encryption Standard treats each block of 8
bytes, or 64 bits, as 16 units of 4-bits each.  (4-bit units are sometimes
called nibbles or nybbles.)  The left half of the block is used to generate
eight 4-bit quantities.  These 32 bits are permuted (shuffled) in a fixed
manner and then exclusive-ored (combined) with the right half.  Then the
left and right halves are swapped, and this operation is repeated for 16
rounds.  (There are also some bit permutations at the beginning and end,
but they add no strength.  They are there to make the encryption hard to
simulate in software.)

     The DES was state-of-the-art for the mid-1970's, but its small 56-bit
key size made it obsolete by about 1990.  The trend has been away from
bitwise hardware-oriented encryption.


3.1.2. Advanced Encryption Standard (AES)

     The Advanced Encryption Standard (AES) is also a block cipher, using a
block size of 16 bytes, or 128 bits, and a key size of 128 bits in current
implementations.  The key may be increased to 192 bits or 256 bits in the
future, when 128 becomes inadequate.  The first step in AES is to expand
the key to 128 bytes, to make 10 round keys of 16 bytes each.  This is done
by performing repeated substitution and exclusive-or operations on the key
bytes.

     The 10 round keys are then used to perform 10 rounds of encryption.
Each round consists of 3 steps.  First the message is combined with the key
using an exclusive-or operation.  In the second step the 16-byte message is
considered as 4 words of 4 bytes each.  These 4 words are shifted left by
1, 2, 3 and 4 byte positions, respectively.  In the third step each column
of the 4x4 block of bytes is mixed using a linear transformation, that is,
each column vector is multiplied by certain matrix of values.

     This last step is called a Hill cipher for Lester Hill who invented it
around 1920.  It was used by both the US and Russian military in the period
between the two World Wars.  Methods for breaking the Hill cipher were
developed by Jack Levine at UNC Raleigh.

     Some features of the AES are (1) it has a fixed block size, (2) it
uses the same key for every block, (3) it uses the same permutation in
every round, (4) the permutations are simple shifts applied within a single
4-byte word.


3.1.3. GK-Crypt

     The new GK-Crypt encryption algorithm is also a block cipher, using 9
rounds of substitution and permutation.  Version 1 of GK-Crypt uses
16-byte, or 128-bit blocks, and 80-byte, or 640-bit keys.  Version 2 of
GK-Crypt uses variable-length blocks of 16 to 32 bytes, or 128 to 256 bits,
and 640-bit keys.  The first step in GK-Crypt is to expand the key from 80
bytes to 360 bytes for version 1, or to 690 bytes for version 2.

     Like the DES and the AES, each round of GK-Crypt begins by combining
the message with the key for that round.  Then each byte of the message is
combined with another message byte.  In version 1 of GK-Crypt, this is done
in a fixed way, which is the same for every round.  In version 2 of
GK-Crypt, the choices of which bytes to combine depend on the key for each
round.  Finally, the message block is permuted (shuffled).  Unlike both DES
and AES, this is done in a key-dependent way, and is different for each
round.

     After each block has been encrypted, a new key is generated for the
next block.  This process also uses a sequence of substitutions and
key-dependent permutations.  A separate portion of the key is set aside for
generating the next key, and this portion is also changed for each new
block.  This means that not only is the key for the next block different,
but the transformation to generate the next key is also different.

     GK-Crypt is much stronger than AES, or even quadruple AES, because (1)
it has a variable block size, (2) it uses a different key for every block,
(3) it uses a different permutation in every round, (4) it permutes the
entire block in a complex way.


3.1.4. Linearity

     Linearity is an important concept in the design of all substitution
ciphers.  It can be illustrated by the familiar process of flipping a coin.
Suppose that you flip a fair coin 100 times, and you get 56 heads.  You
immediately know that you got 44 tails, because the number of heads plus
the number of tails always add up to the total number of flips.  Addition
is a linear function.

     The exclusive-or operation which is widely used in computer
cryptography is also a linear function.  When a message byte and a key byte
are exclusive-ored, the bits of each byte are lined up.  Everywhere that
there is a 1 in the key byte, the corresponding bit of the message is
inverted.  That is, if the message bit was a 0 it becomes a 1, and if it
was a 1 it becomes a 0.  Everywhere there is a 0 in the key, the
corresponding bit of the message stays unchanged.

     Message  0 0 0 0 1 1 1 1
     Key      0 1 1 0 1 0 0 1
              ---------------
     Result   0 1 1 0 0 1 1 0

     For cryptography, linear functions are the weakest kind.  Systems of
linear equations are easy to solve.  The methods have been known for
hundreds of years.  (Notice that the matrix multiplication step of the AES
is a purely linear operation.)  Therefore, if an encryption method can be
described by linear equations, and a few characters of the message are
known, it is easy to solve the linear equations and determine the key.
That let's you find the rest of the message.

     When one bit of the key in a linear system changes, a fixed set of
bits in the message will all change simultaneously.  This is a weakness
which is easy to exploit.  For example, if the same message has been
encrypted with 2 different keys, an attacker can look at which bits are the
same in each message, and which bits are different, and determine a good
deal about the keys.

     The ideal in cryptography is to have a perfectly non-linear system.
In such a system, when one bit of the key changes, or any combination of
key bits change, each bit of the cipher message will have exactly a 50-50
chance of changing.

     A perfect non-linear substitution is not possible, but every
well-designed encryption system tries to make its substitutions as
non-linear as possible.  The 8 S-boxes (substitution functions) in DES are
highly non-linear.  The substitution table in AES has been mathematically
designed to be highly non-linear.  However, this mathematical regularity
means that the substitution can be described by a mathematical function,
which is itself a weakness.

     GK-Crypt strikes a balance between strong non-linearity and
mathematical regularity.  GK-Crypt uses 3 substitution tables.  One is
highly structured, similar to the AES table, and the other two are somewhat
random.  All 3 tables were constructed to be strongly non-linear.  They
have been chosen so that they are also as independent as possible from one
another.  That is, there are no strong linear relationships between the 3
tables.  Such relationships might also weaken the encryption.



3.2. Transposition Ciphers


     The best-known transposition ciphers are the route ciphers.  In a
typical route cipher, the letters of the message are written into a
rectangular grid one at a time, going across the rows, something like this

     T H I S I S A
     R O U T E C I
     P H E R

Then the letters will be read out of the grid in a different order, for
example, reading down the columns

     TRPHOHIUESTRIESCAI

or diagonally

     TRHPOIHUSETIRESCAI

or reading alternately up and down the columns

     PRTHOHEUISTREISCIA.

     The Union used these types of route ciphers in the US Civil War, and
it is believed that they were never broken by the Confederacy.  Today,
these simple route transpositions make fairly elementary hobbyist ciphers,
easily solved by paper and pencil methods.

     In the late 19th century the first transposition cipher utilizing a
key was developed.  It is called a columnar transposition, or more
precisely, a complete columnar transposition if all of the columns are the
same length, and an incomplete columnar transposition otherwise.  The
letters of the message are written into a rectangular grid, as before, but
this time the columns are numbered in some order.  This numbering of the
columns is the key.

     3 7 1 6 4 8 2 5
     ---------------
     T H I S I S A C
     O L U M N A R T
     R A N S P O S I
     T I O N

The letters are read out of the grid by reading down the columns, but the
columns are taken out in the order specified by the key.  In this example,
the column numbered 1 is read out first, IUNO, then the column numbered 2,
ARS, and so forth.

     IUNOARSTORTINPCTISMSNHLAISAO

This is a major improvement over the route transposition, but this type of
cipher can still be solved by skilled hobbyists using paper and pencil
methods.

     Double columnar transposition was used by the US, Germany, and other
nations from about 1900 to 1945.  The message was scrambled by a columnar
transposition, and that message was scrambled again by a second columnar
transposition, usually using the same key for both steps.  The French broke
the German version, but it is believed that the Germans did not break the
American version.  Today advanced hobbyists routinely solve such messages
using paper and pencil.

     In general, a transposition cipher by itself, no matter how
sophisticated, is not secure.  However, in the 1890's a French
cryptographer named Etienne Bazeries showed that even a simple
transposition cipher combined with a simple substitution cipher could
provide a decent degree of security.  This fact is at the heart of most
modern block ciphers, including the AES and GK-Crypt.



3.3. Stream Ciphers


     The basic idea in a stream cipher is that each character in a message
is enciphered using one character of the key.  The first stream ciphers,
which were called polyalphabetic ciphers, were developed in the 16th to
18th centuries.  They used a short key, such as a single word, which was
repeated as many times as needed for the length of the message.  Typically,
the key word would be written across the top of the page during encryption,
and the message would be written in columns beneath it, leaving blank lines
for writing in the encrypted message.

     Most of these classical stream ciphers used a substitution table, or
tableau.  For English this is normally a 26x26 grid of letters.  Each of
the 26 rows contains the 26 letters of the alphabet in some order.  For the
simpler ciphers, the letters might be the normal alphabet shifted by one
position, left or right for each successive row, or written backwards and
shifted.  For the more sophisticated versions, each row might be scrambled.
To encrypt one letter of the message, the letter would be found on the top
row of the grid, and the key letter would be found along the left column of
the grid.  The letter lying in the same column as the message letter, on
the row corresponding to the key letter would be the enciphered letter.

     Decryption would be the reverse process.  The key letter would be
found along the left edge, and the encrypted letter would be found in the
corresponding row.  Then the letter at the top of that column would be the
letter from the original message.

     Slightly more complex variations of these classical ciphers would have
the message letter written on an index row above the top row of the grid,
and the key letters written in an index column to the left of the grid.
This provided for more keying possibilities because different index rows
and columns could be used with any grid.

     Most modern stream ciphers are simply elaborations on these
principles.  The message is combined character-by-character with the key
stream to produce the encrypted message.  In some of the oldest and weakest
systems, the key stream was produced by taking the keyword and cycling the
bits around.  That is, the leftmost bit of a character would be moved to
the righthand end, and the rest of the bits would be moved left one
position.  This would form the next character of the key stream.

     One of the simplest techniques was to use the file name, followed by
the file name with each character cycled 1 bit position, then 2 bit
positions, etc., until, after 8 shifts, it was back to the original file
name.  This produced a repeating key that was 8 times as long as the file
name.  This method is scarcely any stronger than the 16th century method of
repeating the keyword over and over.


3.3.1. Random numbers

     Far more common is to use a random number generator to produce the key
stream.  The two most common forms of random number generators, more
properly called pseudorandom number generators, are linear congruential and
linear feedback generators.  These will be described briefly, since a more
detailed treatment would require deeper math.

     There is one basic mathematical concept that needs to be introduced in
order to understand linear congruential generators.  This is the idea of
modular division.  If n and p are positive integers, then the residue of n
modulo p means the remainder when n is divided by p.  For example, if n is
42 and p is 10, then the residue of n modulo p is 2.  This is denoted 42 =
2 (mod 10).

     The basic idea in a linear congruential generator is that you start
with some number, called the seed, then you multiply this number by a fixed
multiplier, add a fixed constant, and then take the residue of this
quantity modulo a fixed divisor.  The result is the next random, or
pseudorandom, number.  This number becomes the next seed, and the process
can be repeated to get a sequence of pseudorandom numbers.

     The process can be denoted as

     mr + c (mod d).

Here m is the multiplier, r is the last random number, c is the constant,
and d is the divisor.  The operation mod indicates the modular division or
remainder operation.

     There are two types linear congruential generators that are most
commonly used in practice.  The most common type uses a prime number as the
modular divisor.  This prime is usually made as large as one computer word
can hold.  When the divisor is a prime the constant c is usually 0.

     As a simple example, let d be 7, let m be 2, and let c be 0.  Start
with a seed value of 1.  The next term will be 2x1(mod 7), or 2(mod 7),
which is just 2.  The third term will be 2x2(mod 7), which is 4.  The
fourth term is 2x4(mod 7), or 8(mod 7), which is 1.  Since the first and
fourth terms are the same, the sequence will repeat.  We say that the
period of the sequence is 3, since it contains just 3 numbers 1, 2, 4 that
repeat endlessly.  When 7 is the divisor 2 is not a good choice for the
multiplier, since the period is only 3.  If 3 were chosen as the
multiplier, then the sequence would be 1, 3, 2, 6, 4, 5 with a period of 6.
So 3 is a better choice.

     The second common type of linear congruential generator uses a power
of 2 as the divisor.  In this case c must be an odd number, and the
multiplier m must be one of the numbers 5, 9, 13, etc.  This type of
generator is faster than the prime generators because the modular division
can be done very fast with a power of 2.

     The values of m, c and d must be carefully chosen so that the
generator will have good random properties, and will have a long period,
that is, will generate many numbers before repeating.  There are many
statistical tests that can be used to test the random properties of such a
generator.  Random number generators of this type generally have good
distribution properties, but do poorly on what is called the runs-up test.
They tend to produce sequences of increasing numbers, such as 2, 19, 31,
177, 2161, 327648, 471255, etc., that are too long for real randomness.

     More importantly, linear congruential generators are not
cryptographically strong.  Each term in the sequence is determined from
just the one term before, and that term fits within a single computer word,
usually no more than 32 bits.  It is easy for an attacker to try all 232
values until the correct seed is guessed.

     The data which is used to calculate the terms in the sequence is
called the internal state of the generator.  With a 32-bit computer 1 bit
is used for the sign of the number and the other 31 bits

     In order to achieve today's standard of 128-bit security, it is
necessary to add together the outputs of at least 5 linear congruential
generators.  These 5 generators may all use the same value of p, but they
must use different values of m.  Naturally, using 5 generators takes 5
times as long to compute each random number.

     Another type of random number generator is the linear feedback random
number generator.  This type of generator uses a larger seed of k computer
words, x1, x2, x3, ..., xk.  Each new term in the sequence is generated by

     xn = xn-a $ xn-b $ xn-c $ ... $ xn-k

Here $ represents some combining operation, usually addition or
exclusive-or.  The terms xn-a, xn-b, etc., represent one or more of the
last k terms of the sequence.

     Sometimes a more general linear function is used, such as

     xn = c1xn-1 + c2xn-2 + ... + ckxn-k

where c1, c2, etc. are integer coefficients (multipliers).  Usually most of
these coefficients will be 0 for the sake of speed.

     The problem with all of the linear feedback generators is that the
random numbers can be expressed by linear equations, and linear equations
are easy to solve.  Given a small amount of known, or guessed text, an
attacker can solve the equations to determine the seeds.  Once the seeds
are known, the entire random key stream is known, so the entire message can
be read.

     The bottom line is that neither type of random number generator is
safe.  Linear congruential generators have small seeds which can be
exhaustively tried.  Linear feedback generators have large seeds, but lead
to simple equations that are easily solved.


3.3.2. One-time pad

     The original One-Time Pad was an encryption method where each
character of the message was combined with a separate character from the
keystream.  It is called "one-time pad" because secret agents were actually
sent into the field with pads of special water-soluble paper on which
hundreds of completely random characters were written.  These characters
were used as keys which could be used for encrypting a message by the
tableau method described above.  After each message was sent, the agent
would flush away that sheet.  The person receiving the message would use an
identical sheet, perhaps a carbon copy, to decipher the message.

     The key to the success of this method is that the keys have to be
completely random.  During World War II the British actually had a woman
draw lettered balls out of a Bingo cage, while the Germans used a system
with a swinging pendulum.

     With true random keys, any character of the message can become any
other character in the encrypted message with equal likelihood.  This can
be formally proven to be perfectly secure.  That is why so many encryption
products claim to be like the one-time pad, regardless of whether they are
similar or not.  To achieve this absolute security, each character of the
random key must be completely independent of any other character, and no
part of the random key stream can ever be used again.  This cannot be done
using pseudorandom numbers.

     Most products on the market today that claim to use the One-Time Pad
use a pseudorandom number generator as described in the preceding section.
Such systems are insecure because pseudorandom numbers eventually repeat,
because they can be determined by exhaustive trial, and/or they are
generated by simple linear relationships that can be solved easily.


3.3.3. OP-Crypt

     The OP-Crypt package makes several improvements over other products
which claim to use the one-time pad.  Data is not encrypted one character
at a time.  It is encrypted in blocks using a block cipher stronger than
AES (the Advanced Ecryption Standard).  If an opponent knew some of the
message, they could not determine the key that was used to produce the
encrypted message.  Therefore they could not determine any part of the
keystream.

     The key stream is not generated by a linear mathematical function,
which can be guessed or solved.  The key stream is generated from true
random data derived from random physical processes such as gamma rays and
solar flares.  Each user is furnished with more than 10,000,000 characters
of such true random data.  The key for each message is obtained by
encrypting randomly chosen portions of the already-random key data with a
different encryption key for each block.

     There are actually 4 keys for each block of the message, two
key-encrypting keys which encrypt two separate portions of the random data
to obtain two message-encrypting keys.  Each message block is encrypted
twice, making it impossible for an attacker to determine the two message
keys.  Of course the user need not be aware of any of this.  All of the
keys are generated internally by the software, and the user never sees
them.



4. PUBLIC KEY CRYPTOGRAPHY



     Public Key cryptography is a relatively new idea, dating back only to
the 1970's.  The basic idea is that each party has two functions, E to
encrypt a message, and D to decrypt the message.  The function E is known
to everyone.  It is publicly available, either by asking the person with
whom you wish to communicate, or by some kind of key-distributing network.

     The leading method for public key cryptography uses two large prime
numbers, say two 150-digit primes.  Each user of the system secretly
chooses two such primes, and forms their 300-digit product.  This product
is made public along with a large number E.  Whenever a person wants to
send a message to the receiver, they treat the message as a series of
300-digit numbers.  They then encrypt each block by raising the number to
the E power modulo that 300-digit product.  (See the discussion of modular
division in the section on random numbers.)

     The receiver, who knows the two primes, can easily calculate the
secret decrypting number D.  The receiver raises each encrypted block of
the message to the D power modulo the 300-digit product, and gets back the
original message.  A third party who wanted to read the encrypted message
would need to factor the 300-digit product to determine the two primes in
order to calculate D.  Factoring large numbers can be very difficult and
can take massive amounts of computing time.

     There are two drawbacks to this form of public key cryptography.
First, it is very slow.  Raising a large number to a large power modulo a
large number takes a lot of time, even with the most sophisticated methods.
This is why some public key software packages include data compression
functions.

     Second, improvements in factoring algorithms may make this form of
encryption insecure.  Some years ago mathematicians learned how to use
elliptic curves to factor large numbers.  Suddenly all of the public key
messages that had been sent using 50-digit or 100-digit primes were no
longer secure.  People who had collected and recorded such messages could
now read them, and learn the secrets they contained.  People who were using
public key cryptography had to switch to 150-digit or 200-digit primes if
they wanted security.



5. PRIVATE KEY CRYPTOGRAPHY



     The newest form of encryption is Private Key Cryptography.  In this
form, each party has a private key which nobody else knows.  The private
key is never shared with other parties, so there is no need for any form of
key exchange or key distribution.

     The great advantage of private key cryptography is that any two
parties anywhere who have the private key software can securely exchange
messages without having to make any prior arrangements.  No third party who
intercepts the messages can read them, even if the third party also has the
same private key software.

     The method is to treat each message as a series of numbers.  The
sender performs a mathematical encryption operation on each block of the
message using the sender's encryption key.  The receiver then performs a
second mathematical encryption operation on each block of the message using
the receiver's encryption key.  These mathematical operations are chosen so
that they commute, that is you would get the same result if you did the two
encryptions in the opposite order.

     This makes it possible for the sender to remove the first encryption
by decrypting the message using the sender's decryption key.  That leaves
the message encrypted only with the receiver's key.  So now the receiver
can decrypt the message using the receiver's own decryption key.

     The disadvantage, of course, is that the message must be transmitted 3
times so that the sender and receiver can perform their respective
encryption and decryption steps.  For many applications, however, the
advantage of not needing to exchange encryption keys outweighs this
disadvantage.

Back to the MASTER SOFTWARE CORPORATION homepage

© Copyright 2005-2007 Frank Rubin