Random Permutations (Part 6)

December 1, 2019

Random Permutations (Part 6)

John Baez

Now I'll tackle a really fundamental question about random permutations — one whose answer will quickly deliver the solutions to many others!

Puzzle 7. What is the expected number of cycles of length \(k\) in a random permutation of an \(n\)-element set?

The answer is beautiful: it's \(1/k\). More precisely, this holds whenever the question is reasonable, meaning \(1 \le k \le n\).

Note that this answer passes a simple sanity check. If on average there's \(1\) cycle of length \(1\), \(1/2\) cycles of length \(2\), and so on, the average number of elements in our \(n\)-element set must be

\[ 1 \cdot 1 + \frac{1}{2} \cdot 2 + \cdots + \frac{1}{n} \cdot n = n \]

Whew!

Before we show the expected number of cycles of length \(k\) in a random permutation of an \(n\)-element set is \(1/k\), let's see how knowing this lets us answer some other questions.

Since the expected number of cycles of length \(k\) is \(1/k\), and each of these cycle contains \(k\) elements, the expected number of elements lying in cycles of length \(k\) is

$$ k \cdot \frac{1}{k} = 1 $$

It's cool how this is independent of \(n\), as long as \(k \le n\).

This was Puzzle 8 of Part 3. Since the expected number of elements lying in cycles of length \(k\) is 1, and there are \(n\) elements, the probability that a given element lies in a cycle of length \(k\) is \[ \frac{1}{n} \]

It's cool how this is independent of \(k\), as long as \(1 \le k \le n\).

Since a fixed point is a cycle of length \(1\), the answer is

\[ 1 \]

We already saw this in Part 1. There we got it the hard way, by computing the probability that a permutation has \(k\) fixed points. Now we've seen it a different way. But I hinted that there's a much easier way. Here it is: since the probability that a random permutation of an \(n\)-element set fixes any given point is \(1/n\), the expected number of fixed points is \(n \cdot \frac{1}{n} = 1\).

Since \(k\) is more than \(n/2\), there can be at most one cycle of length \(k\). Thus, the probability that such a cycle exists equals the expected number of such cycles! So, the probability must be

\[ 1/k \]

We already saw this in Part 4. Now we're seeing a nice way to generalize this fact to \(k \le n/2\), not by asking whether a cycle of length \(k\) exists, but by asking the expected number of such cycles.

This was Puzzle 5 in Part 3. Since the expected number of cycles of length \(k\) is \(1/k\), we can sum over \(k = 1, \dots, n\) and get the answer:

\[ \frac{1}{1} + \cdots + \frac{1}{n} \]

Asymptotically this is \(\ln n\). Or, even better, it's \(\ln n + \gamma\) plus a correction that goes to zero as \(n \to \infty\).

Okay, I hope you're convinced by now that computing the expected number of cycles of length \(k\) in a random permutation of an \(n\)-element set is the key to understanding many things. Let's do it!

Flajolet and Sedgewick do this using bivariate generating functions in Example III.9 of their book Analytic Combinatorics. I explained the basic method last time so now I'll just use it.

We start by making up a functor

\[ H_k \colon S \times \mathbb{N} \to \textrm{FinSet} \]

that assigns to the pair \((n,i)\) the set \(H_k(n,i)\) of all permutations of \(n\) having exactly \(i\) cycles of length \(k\). Note that \(n\) here is being treated as a finite set, while \(i\) and \(k\) are just numbers.

The generating function of \(H_k\) is defined by

\[ \displaystyle{ |H_k|(z,u) = \sum_{n \ge 0} \sum_{i \ge 0} |H_k(n,i)| \, \frac{z^n}{n!} u^i } \]

Suppose we could compute this. How could we use this to answer our puzzle?

We want the expected number of cycles of length \(k\) in a permutation of an \(n\)-element set. Since the probability that a random permutation has \(i\) cycles of length \(k\) is

\[\frac{|H_k(n,i)|}{n!} \]

this expected number is

\[ \sum_{i \ge 0} i \frac{|H_k(n,i)|}{n!} \]

So this is the sum we need to compute.

But notice,

\[ \begin{array}{ccl} \displaystyle{ \left. \frac{\partial}{\partial u} |H_k|(z,u) \,\right|_{u=1} } &=& \displaystyle{ \left. \frac{\partial}{\partial u} \sum_{n \ge 0} \sum_{i \ge 0} |H_k(n,i)| \, \frac{z^n}{n!} u^i \; \right|_{u=1} } \\ \\ &=& \displaystyle{ \sum_{n \ge 0} \sum_{i \ge 0} i |H_k(n,i)| \, \frac{z^n}{n!} } \\ \\ \end{array} \]

So, the expected number we want is the coefficient of \(z^n\) in the power series for

\[ \left. \frac{\partial}{\partial u} |H_k|(z,u) \,\right|_{u=1} \]

Thus, to answer our puzzle, we should figure out \(|H_k|(z,u)\), differentiate it with respect to \(u\), set \(u = 1\), and read off the coefficient of \(z^n\).

For this, we'll need a formula for \(H_k\):

\[ H_k \cong \textrm{Exp} \circ (\textrm{Cyc}_{\ne k} + U \cdot \textrm{Cyc}_k) \]

Let's see what this means! As soon as you understand what it means, you'll know it's true. That happens a lot in category theory.

Here \(\textrm{Cyc}_k \colon S \to \textrm{FinSet}\) is the species 'being a cyclically ordered \(k\)-element set'. We've already met the species

\[ \textrm{Cyc} \cong \sum_{i \ge 0} \textrm{Cyc}_i \]

which is 'being a cyclically ordered finite set', and now we'll define a species

\[ \textrm{Cyc}_{\ne k} \cong \sum_{i \ne k} \textrm{Cyc}_i \]

which is 'being a cyclically ordered finite set of cardinality \(\ne k\)'. All these species can be seen as functors from \(S \times \mathbb{N}\) to \(\textrm{FinSet}\) that don't involve the second variable.

On the other hand, we have a functor

\[ U \colon \mathbb{N} \to \textrm{FinSet} \]

called 'being the number 1'. This is defined by setting \(U(1) = 1\) and \(U(i) = \emptyset\) for all \(i \ne 1\). This can be seen as a functor from \(S \times \mathbb{N}\) to \(\textrm{FinSet}\) that doesn't involve the first variable.

It's a bit hard for me to describe the functor

\[ \textrm{Cyc}_{\ne k} + U \cdot \textrm{Cyc}_k\]

in plain English, mainly because I haven't developed a vocabulary for functors \(S \times \mathbb{N} \to \textrm{FinSet}\) the way I have for species! A species describes a type of structure you can put on a finite set; one of these functors describes a type of structure you can put on a finite set and a natural number. This one is 'either your finite set is cyclically ordered and not of cardinality \(k\) and your number is \(0\), or it's cyclically ordered and of cardinality \(k\) and your number is \(1\)'.

As a consequence

\[ \textrm{Exp} \circ (\textrm{Cyc}_{\ne k} + U \cdot \textrm{Cyc}_k) \]

describes the following sort of structure on a finite set and a natural number. This structure is a way to chop the finite set into cyclically ordered pieces and simultaneously write the number as sum of \(0\)'s and \(1\)'s, but using a \(1\) for each piece of cardinality \(k\), and a \(0\) for every other piece. This is why we get

\[ \textrm{Exp} \circ (\textrm{Cyc}_{\ne k} + U \cdot \textrm{Cyc}_k) \cong H_k \]

where \(H_k(n,i)\) is the set of all permutations of the set \(n\) having exactly \(i\) cycles of length \(k\).

That's my best attempt at a purely verbal explanation of this isomorphism. Now let's exploit it: it gives an equation between generating functions

\[ |H_k| = \exp \circ (|\textrm{Cyc}_{\ne k}| + |U| \cdot |\textrm{Cyc}_k|) \]

which will let us compute \(|H_k|\).

To proceed, notice that

\[ |\textrm{Cyc}_k| = \frac{(k-1)!}{k!} z^k = \frac{z^k}{k} \]

since there are \((k-1)!\) ways to cyclically order a \(k\)-element set. Similarly,

\[ \begin{array}{ccl} |\textrm{Cyc}_{\ne k}| &=& \displaystyle{ \sum_{i \ne k} \frac{z^i}{i} } \\ \\ &=& \displaystyle{ \ln\left(\frac{1}{1-z}\right) - \frac{z^k}{k} } \end{array} \]

The whole point of \(U\) is that

\[ |U| = u \]

So, we get

\[ | \textrm{Cyc}_{\ne k} + U \cdot \textrm{Cyc}_k | = \ln\left(\frac{1}{1-z}\right) + (u-1) \frac{z^k}{k} \]

In other words, this generating function is like that of \(\textrm{Cyc}\), the species of cyclic orderings:

\[ |\textrm{Cyc}| = \ln \left( \frac{1}{1-z} \right) \]

except that we've yanked out the \(z^k/k\) term and stuck it back in multiplied by \(u\). That's why it pays special attention to cycles of length \(k\).

From this we get

\[ \begin{array}{ccl} |H_k| &=& \exp \circ (|\textrm{Cyc}_{\ne k}| + |U| \cdot |\textrm{Cyc}_k|) \\ \\ &=& \displaystyle{ \exp \left( \ln \left( \frac{1}{1-z} \right) + (u-1) \frac{z^k}{k} \right) } \\ \\ &=& \displaystyle{ \frac{e^{(u-1)z^k/k}}{1-z} } \end{array} \]

Now to answer our puzzle we differentiate this with respect to \(u\), set \(u = 1\), and read off the coefficient of \(z^n\). That'll be the expected number of cycles of length \(k\) in a random permutation of an \(n\)-element set!

Here we go:

\[ \begin{array}{ccl} \displaystyle{ \left. \frac{\partial}{\partial u} |H_k|(z,u) \,\right|_{u=1} } &=& \displaystyle{ \left. \frac{\partial}{\partial u} \frac{e^{(u-1)z^k/k}}{1-z} \right|_{u=1} } \\ \\ &=& \displaystyle{ \frac{z^k/k}{1-z} } \\ \\ &=& \displaystyle{ \frac{z^k}{k} + \frac{z^{k+1}}{k} + \frac{z^{k+1}}{k} + \cdots } \end{array} \]

As long as \(n \ge k\), the coefficient of \(z^n\) in this power series is \(1/k\). But of course, we need \(k \ge 1\) for this whole calculation to make sense.

So whenever \(1 \le k \le n\), the expected number of cycles of length \(k\) in a random permutation of an \(n\)-element set is \(1/k\).

This is a little jewel of mathematics. It's equivalent to saying the probability that any given element lies on a cycle of length \(k\) is \(1/n\). I asked around on Twitter for a shorter proof, and Zeno Rogue gave one.

Theorem. Given a random permutation of an \(n\)-element set, any given point lies on a cycle of length \(k\) with probability \(1/n\), where \(1 \le k \le n\).

Proof. Number the points \(1\) through \(n\) and compute the probability that \(1\) lies on a cycle of length \(k\). To do this, encode each permutation as follows: list all its cycles, sorted by the smallest number in the cycle, each cycle ending with the smallest number in that cycle. E.g. \([627][8][1354]\) is encoded as \(35417628\). This encoding gives a bijection \(S_n \to S_n\). \(1\) is in cycle of length \(k\) iff the \(k\)th number in its code is \(1\). This happens with probability \(1/n\).   ■

The advantage of this is that it's so short and convincing! The disadvantage is that it seems to require a stroke of brilliance, while the approach using species is 'just a calculation' once you know the methods.

Here's another proof, offered by Sridhar Ramesh. It has the advantage of not being at all tricky while still being short. It has the disadvantage of not making it instantly evident that the probability is independent of \(k\).

Proof. Choose a particular point in an \(n\)-element set. How many permutations of this set have this point in a cycle of length precisely \(k\)? To choose such a permutation we have to choose the other \(k - 1\) elements in the cycle and put an ordering on them. Then we have to permute the remaining \(n - k\) elements. There are

\[ \binom{n - 1}{k - 1} \times (k - 1)! \times (n - k)! = (n-1)!\]

ways of doing this. Dividing by the total number of permutations of our \(n\)-element set we obtain the probability \(1/n\).   ■


You can also read comments on the n-Category Café, and make your own comments or ask questions there!


© 2019 John Baez
baez@math.removethis.ucr.andthis.edu
home