Skip to main content

Section 6.2 Coding Theory

Motivating Questions.

In this section, we will explore the following questions.
  1. How can modular arithmetic be applied to the transmission of information?
  2. How does the UPC check digit scheme work? What errors can it detect? What errors can it correct?

Activity 6.2.1.

Get into groups of (ideally) 7-10 (larger is better!) and sit in a line. Then play the game of telephone: one person, sitting at the end, whispers a short message to the person sitting next to them. They then turn and whisper it to the person sitting next to them, and so on until the last person receives the message. The last person shares the message they received, and it is compared to the original version.
Part of the fun of the activity is that the message is often garbled somewhere along the line; what could be done to mitigate this? Brainstorm a couple of ideas and be ready to share them with the class.
In Section 6.1, we learned about modular arithmetic, which works by considering the remainders obtained upon division by a fixed number \(m\text{.}\) In this section, we’ll consider a relatively recent application of modular arithmetic to coding theory, which is the mathematical study of transmitting information.
The basic problem of coding theory is as follows: in order to send information from one entity to another, it needs to be encoded in some form by the sender (e.g., written, recorded, etc) and transmitted across a channel (e.g., mailed, emailed, uploaded/downloaded, etc) to the receiver. However, as in the classic game of telephone, errors can creep into the process —written messages can be smudged, physical defects or packet loss can corrupt digital messages, and so on, resulting in information that either cannot be read at all, or can be misread. How can we be reasonably sure that common errors can be detected, and, perhaps, corrected?

Activity 6.2.2.

Alice wants to send a bit of information to her friend Bob; for simplicity’s sake, let’s assume she seeks to send a 0 or 1. To improve the chances that the message is received correctly, she decides to send it three times in a row. So, if Alice desires to send Bob a 1, she’ll actually send 111; if she desires to send a 0, she’ll send 000.
  1. As described above, errors can creep into the transmitted messsage; perhaps Alice sends Bob 111 and that is what he receives, but what if he receives 101? Or 100? How should Bob interpet those two (different) messages?
  2. What strengths and weaknesses do you see in Alice’s system? How could you improve the chances that Bob interprets Alice’s message as she intends?
  3. Perhaps you decided that Alice’s system is still too prone to errors, so you decide she should send the intended digit seven times instead of 3. In what way(s) is this revised system better? Worse?
Lots of information is encoded using numbers, which makes the reliable transmission of numbers an important problem to solve. A fundamental example is alluded to in Activity 6.2.2, wherein we consider the transmission of a binary digit, 0 or 1 —this is the language of digital computers. The actual codes that are used to transmit digital signals over the internet rely on lots of sophisticated mathematics and so are beyond the scope of our work here.
However, certain codes are within the grasp of those familiar with modular arithmetic. We turn to one such example now: the UPC check digit scheme.
Universal Product Codes (UPCs) are found on most items available for sale (though books have their own identifiers called International Standard Book Numbers, or ISBNs)UPCs are often encoded with barcodes so as to be machine-readable. We will concern ourselves only with the digits, not the barcodes.
Let’s consider the UPC from a copy of the game “The Resistance”:
\begin{equation*} 7-22301-92617-8. \end{equation*}
The first digit (7) indicates a product type. This is typically 0, 1, or 6-9 (other digits are reserved for coupons, loyalty cards, etc). The first group of five digits (22301) is a manufacturer number, while the second group of five digits (92617) describes the product. The last digit (8) is known as the check digit; it is chosen so that
\begin{equation*} (3\cdot 7 + 2 + 3\cdot 2 + 3 + 3\cdot 0 + 1 + 3\cdot 9 + 2 + 3\cdot 6 + 1 + 3\cdot 7 + 8)\equiv 0\mod 10. \end{equation*}
More generally, if the digits of a UPC are \(d_1 - d_2 d_3 d_4 d_5 d_6 - d_7 d_8 d_9 d_{10} d_{11} - d_{12}\text{,}\) the check digit \(d_{12}\) is chosen so that
\begin{equation*} (3\cdot d_1 + d_2 + 3\cdot d_3 + d_4 + 3\cdot d_5 + d_6 + 3\cdot d_7 + d_8 + 3\cdot d_9 + d_{10} + 3\cdot d_{11} + d_{12})\equiv 0\mod 10. \end{equation*}
The check digit is called thus as it is an extra bit of information which helps validate everything that came before it. Recall that our primary question is: how can we send information reliably over a channel? That is, how can we send information in such a way that we can (a) if the information received is the same as what we sent, and (b) if not, (ideally) correct the information?

Activity 6.2.3.

Consider the following questions.
  1. Calculate the check digit \(d\) for the UPC \(1-03792 -19302-d\text{.}\)
  2. Is \(8-62069-00678-9\) a valid UPC? Explain. If it is invalid, change it to be a valid UPC.
We saw in Activity 6.2.3 that the check digit scheme can determine that a given proposed UPC is invalid. However, we have no clear understanding of what went wrong. Were two (or more) digits transposed? Did we simply record a digit incorrectly? In Exploration 6.2.4, we’ll see what sorts of errors can be detected.

Exploration 6.2.4.

As best you can, answer the following questions.
  1. Consider the following UPC with missing digit \(d\text{:}\)
    \begin{equation*} 7-2d920-16431-8. \end{equation*}
    Can you determine the value of \(d\text{?}\)
  2. A certain box of chalk has UPC \(6-15867-28380-2\text{.}\) Choose two adjacent digits (e.g., 6 and 1) and switch their places. Is the resulting 12-digit number still a valid UPC? How can you tell?
  3. A certain product has UPC \(0-55005-00550-5\text{.}\) Choose two adjacent digits and switch their places. Is the resulting 12-digit number still a valid UPC? How can you tell?
In this section, we learned about the main problem of coding theory: how can we reliably transmit information across a noisy channel (as in the game of telephone)? We saw that number theory has been used to encode “extra” information into Universal Product Codes. This extra information allows us to recover missing digits and detect if a given 12-digit code is valid or not. If all 12 digits are present and the code is invalid, the UPC check digit scheme cannot generally allow us to identify what went wrong in transmission.

Exercises Exercises

1.

The following is presented as a possible UPC; is it valid? Explain.
\begin{equation*} 8 - 05500 - 28542 -5 \end{equation*}

2.

The following is presented as a possible UPC; is it valid? Explain.
\begin{equation*} 0 - 47495 - 11254 - 2 \end{equation*}

3.

Identify the missing digit in the following partial UPC.
\begin{equation*} 0 - 3x915 - 90093 - 8 \end{equation*}

4.

Sam is learning to write his letters, but sometimes confuses the 6 and the 9. He records the following UPC; is it valid? If so, justify. If not, can you correct it so that it is?
\begin{equation*} 1 - 64252 - 72124 - 7 \end{equation*}