Ways of scrambling a message: Permutation polynomials over the integers mod n through the lens of monoalphabetic substitution ciphers

When Julius Caesar was on his military campaign in Gaul, he sent coded messages by courier back to Rome using a simple cipher. The so-called Caesar cipher is a letter shift by three places. In other words, every instance of the letter A gets replaced with the letter D, B gets replaced with E, C gets replaced with F, and so on. For the end of the alphabet, the cipher uses a wraparound rule, so that for example X gets replaced with A. Several centuries after Caesar, a tool was devised to assist with the enciphering and deciphering of messages, called a cipher disk (or sometimes a “code wheel”).

A cipher disk consists of two circular pieces, each printed with the alphabet. The inner piece can turn so that letters can be lined up.

More sophisticated ciphers were invented hundreds of years ago, but Caesar-type ciphers have continued to be used for educational and recreational purposes. Cipher disks are the same technology as the “secret decoder rings” and similar toys historically used by children to receive and send secret messages, including in A Christmas Story the secret message “Be sure to drink your Ovaltine!”

The math

The Caesar cipher is a type of monoalphabetic substitution cipher. Substitution because we replace letters with other letters, and monoalphabetic because the replacement letters and the original letters all come from the same alphabet. There are many different ways of creating monoalphabetic substitution ciphers. It doesn’t matter what our actual alphabet is, although it will eventually matter how many letters there are and the alphabet must be finite. For mathematical convenience, we’ll use numerals instead of letters. We will also start at 0 instead of 1 for convenience as well. Remember, however, that our symbols could be anything at all. We are choosing to set it up this way in order to make it easier to understand.

We can describe the Caesar cipher as y = x + 3 (mod 26). Here x is the index or place number of the original (plaintext) letter, y is the new (ciphertext) letter’s index, and “mod” refers to the modulo operator. Modulo gives the remainder after dividing one number by another, in this case the remainder after dividing by 26. In other words, the expression x + 3 (mod 26) says to take a number x, add 3 to it, divide it by 26, and take the remainder. The (mod 26) is what creates the alphabet wraparound rule.

Let’s try enciphering a short message, “PAX,” using the equation above. First, we need to replace the letter symbols with their corresponding number place in the alphabet. Since P is the 16th letter of the alphabet, it becomes a 15. Remember we are starting from 0, so A is 0. X becomes 23. Now we need to shift each letter.

15 + 3 = 18, and 18 ÷ 26 has a remainder of 18.

0 + 3 = 3, and 3 ÷ 26 has a remainder of 3.

23 + 3 = 26, and 26 ÷ 26 has a remainder of 0.

Finally, we translate back into letters. Letter number 18 is S, 3 is D, and 0 is A. Our ciphertext is “SDA.”

The existence of this equation which defines the Caesar cipher raises the question, what other equations could we use? Since that’s a very broad question, let’s arbitrarily narrow our focus to the familiar and interesting case of polynomial equations. Recall that a polynomial (in one variable) is an expression of the form

a_mx^m + a_m_-1x^m^-1 + … + a₂x² + a₁x + a₀

For example, 3x⁴ – 2x² + 1 is a polynomial.

Now, we want to consider what criteria make a polynomial appropriate for a cipher. For one, it must provide an unambiguous rule for finding each replacement letter (the ciphertext). Second, the rule for creating the ciphertext must be reversible so that the message can be deciphered. As it turns out, all polynomials meet the first criterion, so only the second one is interesting.

Let’s observe a non-example: the polynomial x² (mod 26). We can try enciphering the message “XD.” X is number 23 and D is number 3.

23² = 529, and 529 ÷ 26 has a remainder of 9.

3² = 9, and 9 ÷ 26 has a remainder of 9.

Letter number 9 is J, so our ciphertext is “JJ.” Consider what happens when someone tries to decipher this message: naturally, the thing to try would be square root.

√9 = 3, and 3 ÷ 26 has a remainder of 3.

The deciphered message would be “DD,” which is not correct. We might call this a collision, in that the letters X and D collided by both being sent to the letter J. What we’re looking for, then, is polynomials that do not cause letter collisions.

One slightly complicating factor is that for any such polynomial, there are infinitely many other polynomials that create the same cipher. In other words, different polynomials can produce the exact same ciphertext for a message. This is a consequence of our wraparound rule (the modulo business). However, we can mostly avoid this by picking a canonical representative from each class of polynomials like this. We will use the unique polynomial having lowest degree, fewest terms, and smallest coefficients. For a very simple example:

The following polynomials are equivalent (mod 26): 2x, 2x+26, 28x, 28x+26, etc. The reader is invited to prove that these are equivalent as an exercise. The unique “smallest” polynomial among this equivalence class of polynomials is 2x. For a more difficult challenge, try proving that the “smallest” polynomial is always unique.

Let’s take a moment to think about this from a different perspective. We can ask the question, “how many monoalphabetic substitution ciphers exist?” It is clear that the quantity must be finite due to our alphabet having a finite number of letters (which we stipulated earlier). Since such a cipher is replacing one letter by another, we can think of the action of the cipher as rearranging the alphabet. Under that perspective, each cipher is associated with a particular arrangement (permutation) of the alphabet. The idea is essentially that the cipher can be identified by what it does to the string ABCDEFGHIJKLMNOPQRSTUVWXYZ.

Now, it’s just a question of how many possible permutations of the alphabet there are, which is:

26! = 403,291,461,126,605,635,584,000,000

That’s almost half an octillion, or about a hundred times the number of stars estimated to exist in the observable universe. How might we approach describing this vast collection of ciphers?

Permutations lend themselves naturally to the mathematics of group theory. In short, that collection of 26! ciphers forms a group. Formally, a group is a set of elements along with an operation that follows certain rules. The operation is frequently compared to addition. Our set is the set of ciphers, but we need a way of “adding” or otherwise combining two ciphers together. One approach is to perform the two ciphers one after the other, so that the ciphertext of the first cipher is treated as the plaintext of the second cipher. We call this operation composition. The result of doing two ciphers one after the other is the same as a cipher that goes directly from the original plaintext to the final ciphertext. This is a key property of groups called closure. Whenever we combine two elements of our set using our operation, we must get another element of the set. Another property of groups is identity, meaning there is one element in the set that does nothing when combined with other elements. This corresponds to the cipher that consists of leaving the plaintext as-is, which we will count as a genuine cipher. Ciphers can be deciphered by applying another cipher, called having an inverse. Finally, there is associativity, just as with normal addition: a+(b+c) = (a+b)+c but for our group operation of composition.

Group = set & operation
Set	Operation
The set of all 26! permutations of the Roman alphabet, thought of as monoalphabetic substitution ciphers	(Composition) Must satisfy: Closure Identity Inverses Associativity

The group that we are considering here is the group that acts on a set of 26 items by rearranging (permuting) them. This group has a name, it is the symmetric group of degree 26 or S₂₆ for short. Likewise for any alphabet with n letters, the group of monoalphabetic substitution ciphers on that alphabet correspond to the group S_n. Now the question is, do all of these 26! ciphers have corresponding permutation polynomials (PPs)? What properties do these polynomials have?

With 26! being such a humongous number, it’s helpful to consider simpler cases first. We’ll start with a one letter alphabet and proceed from there.

Number of Letters	Total Permutations	Permutation Polynomials
1	1	x
2	2	x x + 1
3	6	x x + 1 x + 2 2x 2x + 1 2x + 2
4	24	x x + 1 x + 2 x + 3 3x 3x + 1 3x + 2 3x + 3

For small alphabets like this, we can find all the PPs with brute force, by just checking every possibility with a computer. This strategy quickly becomes unfeasible, however, due to exponential complexity.

Notice that the number of PPs is the same as the number of group elements for alphabets with 1, 2, or 3 letters. For an alphabet consisting of 4 letters, there are 4! = 24 ciphers total and only 8 PPs. This informs us that there exist monoalphabetic substitution ciphers that cannot be represented as polynomials, and furthermore the existence of such ciphers depends on the size of the alphabet. As it turns out, the significance of the number 4 is that it is the first composite (non-prime) number.

Proposition. Let A be a set of symbols. All monoalphabetic substitution ciphers over A can be represented by permutation polynomials if and only if A has a prime number of elements.

The culprit is once again the wraparound rule. Recall that when doing computations with these polynomials, we are using modular arithmetic, where after any operation we take the remainder after dividing by some specific n. When considering our ciphers, we have to deal in integers (…, -2, -1, 0, 1, 2, …) since a letter can only have an integer index within its alphabet. We’ll start by looking at just the addition operation.

The set of integers with the modular addition operation forms a group (called the cyclic group of order n, or C_n). Let’s get a little more precise about what we mean by that. We begin the construction by defining the congruence (mod n) relation on the integers.

We will say that two integers a and b are congruent (mod n) if and only if there exists an integer c such that a = b + cn.

We will write a ≡ b (mod n).

The reader may prove for themselves that this is an equivalence relation, satisfying the reflexive, symmetric, and transitive properties. Our next step is to define an equivalence class [a] consisting of all integers congruent to a:

For any integer a, let [a] be the set of all b such that b is an integer and a ≡ b (mod n).

These sets are called congruence classes or residue classes (“residue” referring to remainder). We define the set of congruence classes as follows.

Let ℤ_n be the set of all [a] such that a is an integer and [a] is a congruence class mod n.

ℤ_n is also written ℤ∕nℤ. Note that ℤ is the symbol used to represent the integers after the German word for number, Zahl.

Let’s take a moment to look at this set ℤ_n more closely. Formally, it is a set of sets, grouping integers based on having the same remainder when divided by n. Let’s look at ℤ₅ for example. It contains the following elements:

{0, 5, 10, 15, …}
{1, 6, 11,16, …}
{2, 7, 12, 17, …}
{3, 8, 13, 18, …}
{4, 9, 14, 19, …}

Notice that while there are infinitely many integers, this set has only 5 elements. We can refer to each element by its smallest member, so that we write:

ℤ₅ = {[0], [1], [2], [3], [4]}

Since these sets are not themselves integers and can’t be added in the normal way, the next thing we want to do is create an operation we will call “+” by extending the definition of addition from the integers.

Define an operation + on ℤ_n as follows: [a] + [b] = [a + b].

This operation is closed by definition and inherits the properties of identity, inverses, and associativity from normal addition. This makes ℤ_n with + a group. As mentioned above, the group has a name which is the cyclic group C_n. We can think of this group as being like “clock math.”

However, there is another operation going on in our polynomials: multiplication. We’ll have to define a new operation of “*” for ℤ_n.

Define an operation * on ℤ_n as follows: [a] * [b] = [a * b].

Multiplication also has closure, identity, and associativity for the same reasons as addition. However, multiplicative inverses of integers are not themselves integers, with the exception of -1 and 1. That could cause problems, but let’s see what happens in the mod n world.

For ordinary numbers, the multiplicative inverse of 4 is ¹/₄ because 4 * ¹/₄ = 1.

Consider ℤ₉. [4] * [7] = [4 * 7] = [28] = [1] so the multiplicative inverse of [4] in ℤ₉ is [7]. In other words, 4 * 7 ≡ 1 (mod 9).

However, ℤ₉ only has 9 elements, and none of them is the multiplicative inverse of [3]. We can verify:

[3] * [0] = [0]

[3] * [1] = [3]

[3] * [2] = [6]

[3] * [3] = [0]

[3] * [4] = [3]

[3] * [5] = [6]

[3] * [6] = [0]

[3] * [7] = [3]

[3] * [8] = [6]

So what is the difference between 4 and 3 that makes 2 have an inverse in ℤ₉ but not 3? The answer is that 4 is coprime with 9 while 3 is not. Coprime means that two numbers have no prime factors in common. 3 and 9 share the prime factor 3. This relationship creates a lockstep pattern between multiples of 3 and multiples of 9. In order for [3] to have an inverse, there would need to be a multiple of 3 that is 1 unit greater than some multiple of 9, but all multiples of 3 are either 0, 3, or 6 units away from the nearest smaller multiple of 9. Consider the following sequences:

a_n = [4n] = [0], [4], [8], [3], [7], [2], [6], [1], [5]

b_n = [3n] = [0], [3], [6], [0], [3], [6], [0], [3], [6]

Multiples of [4] don’t have an obvious pattern, and every element of ℤ₉ is represented. This characterizes an element with a multiplicative inverse, which is any number coprime to the modulus number.

Visual comparison of how multiples of 4 and multiples of 3 each relate to multiples of 9

The key to the proposition above (all ciphers in a prime-numbered alphabet can be represented as polynomials) is that prime numbers are coprime with every number other than themselves. Thus, a prime modulus creates a situation in which every number(‘s congruence class) other than the modulus itself has a multiplicative inverse. The modulus itself is always in the set [0], and this is one way of thinking about why division by zero is impossible.

What we have now is something more complex than a group, ℤ_n with both addition and multiplication. The addition part forms a group, and the multiplication part forms something like a group except at least one of its elements lacks a multiplicative inverse. When n is prime, this structure is called a field. Otherwise, it is a ring.

Definition. A commutative ring with unity is a set R along with two binary operations + and × satisfying the following axioms:

+ is associative: (a + b) + c = a + (b + c) for all a, b, c in R.
+ is commutative: a + b = b + a for all a, b, c in R.
There is an element 0 that is an additive identity: a + 0 = a for all a in R.
Every a in R has an additive inverse: for all a in R there exists b in R such that a + b = 0.
× is associative: (a × b) × c = a × (b × c) for all a, b, c in R.
× is commutative: a × b = b × a for all a, b in R.
There is an element 1 that is a multiplicative identity: a × 1 = a for all a in R.
× is distributive over +: a × (b + c) = a × b + a × c and (a + b) × c = a × c + b × c for all a, b, c in R.

A field is a commutative ring with unity in which 0 ≠ 1 and every nonzero element has a multiplicative inverse.

For example, the integers form a ring, and the rational numbers (fractions) form a field.

This distinction explains why an alphabet with a composite number of letters is missing some of the possible permutations in its list of PPs. PPs are sensitive to whether their coefficients lie in a ring or a field. Fields have nice properties that rings lack as a result of missing multiplicative inverses. When considering ciphers, we may choose to use an alphabet with a prime number of letters in order to take advantage of this (to use PPs). For example, we could use the 26 letters of the Roman alphabet plus the symbols space, period, and question mark for a total of 29 letters (a prime number).

When the number of letters is not prime, such as the 4-letter case, the set of PPs still forms a group with composition, despite not being the full symmetric group S₄. An example of the group operation (composition, denoted “∘”) follows. Note that the coefficients are actually elements of ℤ₄ = { [0], [1], [2], [3] }, but we omit the brackets for the sake of readability here.

a = 3x + 3

b = x + 2

a ∘ b = 3(x + 2) + 3 = 3x + 6 + 3 = 3x + 9 = 3x + 5

The reader may perform similar calculations to verify that the group has closure.

So what is the structure of this group? It has 8 elements, and there are only 5 “truly” unique groups of order 8. We can identify which group this is by calculating the order of each element of the group. The order of a group element a is the smallest positive integer k such that a^k is equal to the identity element, which in this case is x. By a^k in this context we mean a ∘ a ∘ … ∘ a ∘ a where the polynomials a is being composed with itself (substituted into itself) k times.

Element	Order Calculation	Order
x	x = x	1
x + 1	(((x + 1) + 1) + 1) + 1 = x + 4 = x	4
x + 2	(x + 2) + 2 = x + 4 = x	2
x + 3	(((x + 3) + 3) + 3) + 3 = x + 12 = x	4
3x	3(3x) = 9x = x	2
3x + 1	3(3x + 1) + 1 = 9x + 4 = x	2
3x + 2	3(3x + 2) + 2 = 9x + 8 = x	2
3x + 3	3(3x + 3) + 3 = 9x + 12 = x	2

The unique group having one element of order 1, 5 elements of order 2, and 2 elements of order 4 is the dihedral group of order 8, or D₈. In general, the dihedral group D_2n is the group of geometric symmetries of a regular n-sided polygon. In this case, D₈ is the group of symmetries of a square. In other words, this is the set of plane transformations of a square that leave the square looking the same, along with the group operation of composition of transformations (doing one after another). This group’ is’s elements can be generated from two transformations: rotation by 90° and reflection across the diagonal. These transformations correspond to the polynomials x + 1 and 3x. These three different ways of interpreting the group actions– as geometric transformations, as polynomials, and as ciphers –are illustrated below.

Using the element orders to figure out the group like this is another brute force method, so it quickly becomes unfeasible as we consider larger and larger examples. It would be preferable to work out logically what these groups must be. I will tell you now that I’m not aware of any comprehensive description. We know that the group depends on the ring (or field) that the PPs’ coefficients are in. Let’s take a closer look at the non-field rings.

Recall that the fundamental theorem of arithmetic states that any integer greater than 1 can be written uniquely as a product of prime power factors, n = p₁^k₁p₂^k₂…p_m^k_m. For example, 300 = 2²3¹5². As it turns out, the ring ℤ_n is made up of the rings ℤ_pi^ki in a similar way. By the Chinese remainder theorem, a foundational theorem of number theory,

where × represents the direct product. The direct product of two rings is defined as follows: Let R and S be two rings. R×S is the set of ordered pairs (r, s) where r is in R and s is in S. Addition and multiplication are defined by

(a, b) + (c, d) = (a + c, b + d)

(a, b) * (c, d) = (ac, bd)

For a, c in R and b, d in S.

Let’s consider the PPs over the prime power ring ℤ_p^_k. It turns out we can construct the group of PPs from the PP group over the field ℤ_p through something rather complicated called a wreath product.

ℤ₄ = ℤ₂ ≀ ℤ₂

My conjecture is that, in general, the PP group over ℤ_n is formed from the PP groups over the ℤ_ps, that is, symmetric groups of prime factor order, but at this time I have neither proved it nor found an existing proof of it. That being said, the orders of all PP groups is known and is recorded in the Online Encyclopedia of Integer Sequences as sequence A329812. I was able to calculate this by adapting an equation from a paper by G. Keller and F. R. Olson published in 1968. Order is not enough to identify a group though, since for any given order there are generally many distinct groups.

So, what’s the upshot of all this? Is any of this information useful? Probably not, though possibly for something in number theory. Permutation polynomials themselves can be useful, however, for example if we want to scramble and unscramble data. We know that substitution ciphers are not secure for protecting information, but they are good at scrambling information. That’s sometimes useful, for example when doing error-correcting for data transmissions.

Transmitting data very frequently introduces a small amount of error, i.e. lost or corrupted data. This would be a massive problem without technology that enables correction of errors. A very basic form of error correction is with redundancy. If we just send the same identical message repeatedly, we can identify and fix errors in each individual version of the message.

Simplified illustration of how redundancy enables error correction – if the majority of messages agree on a letter, then it is probably correct

An important fact of data transmission is that, in the real world, errors tend to occur in chunks. This is unfortunate as it is typically easier to correct many small errors than to reconstruct an entire portion of the data. To address this, many forms of error-correcting technology scramble the data before transmission and the recipient unscrambles the data once they receive it. This way, when an error takes out a chunk of data during transmission, it will no longer be a single error chunk when read by the recipient. Instead, it will look like many small errors. The amount of error is the same, but it is now easier to correct.

Top: scrambled message as it appears during transmission with several bits in a row corrupted. Bottom: unscrambled message as it appears to the recipient, with the same number of corrupted bits but now distributed throughout the message.

In real-world error correction, the permutation (or “cipher”) is generally performed using a look-up table that both the sender and recipient are in possession of, rather than being computed when needed on the basis of something like a polynomial. This is because there is a tradeoff between computation time and storage space, and in data transmission speed is much more important than conserving a little bit of space on the disk. That being said, I have seen some research into the idea that PPs could be useful in creating these tables. Specifically, it seems that polynomials with certain properties could be more or less effective for scrambling as part of error correction. Note that truly random scrambling, if it could be used, would be very undesirable since it would often keep chunks together. It is already well-known, in other words, that some kinds of permutations work better than others.

thoughts on mind

Leave a comment

The math

Share this:

Related

Leave a comment