Previous, Math 511 Home, Contents, Next

The Hamming Code

We introduced the Hamming code during the last lecture. Four data bits b1b2b3b4 were encoded into a seven-bit word using the matrix (all computations done mod 2)
where we give the 4´7 matrix the name G. We then saw how single errors could be corrected by solving a logic problem. Unfortunately, computers prefer computational algorithms to methods such as "solving a logic problem." Fortunately, we can use matrix algebra to find a nice algorithm for fixing single errors in the Hamming code.

The first thing we will want is to find a quick way of checking if a seven-bit word is a valid codeword. In order for (b1b2b3b4p123p124p134) to be a valid codeword, the four-bit sets b1b2b3p123, b1b2b4p124, and b1b3b4p134 must all have even parity. Since our computations are all done mod 2, this is the same as asking b1 + b2 + b3 + p123 = 0, b1 + b2 + b4 + p124 = 0, and b1 + b3 + b4 + p134 = 0. We can write this as a matrix equation

where H is the 3´7 matrix and all computations are done mod 2. We can use matrix computations to check that any valid codeword will give use a (0 0 0) vector after multiplying by H. Recall that a valid codeword must be of the form (b1b2b3b4)G. Then we compute ((b1b2b3b4)G)H = (b1b2b3b4)(GH) = (b1b2b3b4)0 = (0 0 0) where you can check that GH is the matrix of all zeros, which we've denoted 0. For any seven-bit word w, we call wH the syndrome of the word. If w is a valid codeword, then the syndrome is (0 0 0). If the syndrome is not (0 0 0), then an error has been made. We will next consider how to find which single bit could have caused the error from the syndrome.

Let’s look at what happens to the syndrome if we have an error in our seven-bit codeword. We first compute our valid codeword (b1b2b3b4)G. Suppose now an error is made in the third bit. This gives us our new seven-bit word (b1b2b3b4)G + (0 0 1 0 0 0 0). If we now check this invalid codeword using the H matrix we find

((b1b2b3b4)G + (0 0 1 0 0 0 0))H = (b1b2b3b4)GH + (0 0 1 0 0 0 0)H
= (0 0 0) + (1 0 1)
= (1 0 1).
So an error in the third bit means the syndrome will be (0 0 1 0 0 0 0)H, which is the third row of the H matrix. Similarly, an error in the jth bit will make the syndrome the jth row of H. So to check a seven-bit word w encoded using the Hamming (7,4) code, we compute the syndrome wH. If wH is (0 0 0), then the word is valid and we can just read the data bits from the first four places. If the syndrome is something else, then we check which row of H is the same as the syndrome and switch that bit in the word to get a valid codeword, from which we then read off the data bits. You can check that every possible non-zero three-bit syndrome is a row of H, so this method always finds a single error that can be fixed to give a valid codeword.

The Hamming code illustrates the general approach to error-correcting codes. You choose an algebraic structure (in this case Z2) and carry out several calculations with the number you want to send. You then send both the number and the results of the computations. At the other end, the person (or computer) who gets your number can do a little algebra to check that everything came through all right, or to put it right if it didn't. The question becomes what is the best choice of algebraic structure and computation. You want it to be easy to implement the computations, without requiring too much work to check if a word is a valid codeword or to find the corrected codeword if the received word is invalid. At the same time, a good error-correcting code should be efficient, correcting as many errors as possible while requiring as few extra numbers be transmitted as possible. The Hamming code is easy to implement, but is not very efficient. While theoretical considerations showed more efficient codes must be possible, it took nearly a decade to develop better error-correcting codes than the Hamming code and so the Hamming code was actually used in computer systems through the 60's. Nowadays, there are several families of more efficient codes that are used in practice. In order to develop one of these more efficient codes, the Reed-Solomon codes that are used on CDs, we will study some new algebraic structures, polynomial domains and finite fields.


Please report any problems with this page to bennett@math.ksu.edu
©1998 Andrew G. Bennett