Random Permutations (Part 11)

December 24, 2019

Random Permutations (Part 11)

John Baez

I think I'm closing in on a good understanding of random permutations using species and groupoid cardinality. Today I want to use this approach to state and prove a categorified version of the Cycle Length Lemma, which is Lemma 2.1 here:

Kevin Ford, Anatomy of Integers and Random Permutations — Course Lecture Notes.

This is a grand generalization of the result I proved in Part 9.

Let me state the Cycle Length Lemma before I categorify it and prove it. I'll say it a bit differently than Ford does.

In my understanding of random permutations, a key concept is the function

\[ C_k \colon S_n \to \mathbb{N} \]

which counts, for any permutation of \(n\) things, how many cycles of length \(k\) it has.

Another key concept is the \(p\)th falling power of some number \(x\):

\[ x^{\underline{p}} = x(x-1)(x-2) \, \cdots \, (x-p+1) \]

If \(x\) is a natural number, and you have \(x\) things, then \(x^{\underline{p}}\) is the number of ordered \(p\)-tuples of distinct things.

Putting these together we get

\[ C_k^{\underline{p}} \colon S_n \to \mathbb{N} \]

This counts, for any permutation of \(n\) things, how many ordered \(p\)-tuples of distinct cycles of length \(k\) it has. And note that we can replace the word 'distinct' here by 'disjoint', without changing the meaning, since distinct cycles can't overlap.

There's a really simple formula for the average of this lovable quantity \(C_k^{\underline{p}}\). I proved it in Part 9, and again using groupoid cardinality last time.

The Cycle Length Lemma goes further: it tells us not only the average value of \(C_k^{\underline{p}}\), but also the average of a product of quantities like this. To act like probability theorists let's write the average using an \(E\), which stands for 'expected value'.

Cycle Length Lemma. Suppose \(p_1 , \dots, p_n \in \mathbb{N}\). Then
\[ E\left( \prod_{k=1}^n C_k^{\underline{p}_k} \right) = \left\{ \begin{array}{ccc} \displaystyle{ \prod_{k=1}^n \frac{1}{k^{p_k}} } & & \textrm{if} \; p_1 + 2p_2 + \cdots + n p_n \le n \\ \\ 0 & & \textrm{otherwise} \end{array} \right. \]

Typing this in, I remember how obscure this looks at first! After having thought about it for several weeks, it seems beautiful to me now.

I guess to love it you need to prove it, or at least think about it. For any permutation of an \(n\)-element set,

\[ \prod_{k=1}^n C_k^{\underline{p}_k} \]

counts the number of ways to choose an ordered \(p_1\)-tuple of disjoint cycles of length \(1\), an ordered \(p_2\)-tuple of disjoint cycles of length \(2\), and so on. All these cycles are disjoint, so there's not enough room to fit them all into your \(n\)-element set if

\[ p_1 + 2 p_2 + 3 p_3 + \cdots + n p_n \gt n \]

Thus, when this inequality holds we get

\[ E\left( \prod_{k=1}^n C_k^{\underline{p}_k} \right) = 0 \]

The interesting case is the other case, where

\[ p_1 + 2 p_2 + 3 p_3 + \cdots + n p_n \le n \]

Now all our cycles fit inside our set, so we don't get zero. What we get is very simple: a product of factors of \(1/k\), one for each cycle of length \(k\) that we're choosing:

\[ E\left( \prod_{k=1}^n C_k^{\underline{p}_k} \right) = \prod_{k=1}^n \frac{1}{k^{p_k}} \]

Ultimately these factors arise for a simple reason: the probability that a random permutation of \(k\) elements has a single cycle is \(1/k\).

This formula implies a certain lack of 'correlation' between the functions \(C_k^{\underline{p}_k}\). Namely:

\[ E\left( \prod_{k=1}^n C_k^{\underline{p}_k} \right) = \prod_{k=1}^n E( C_k^{\underline{p}_k}) \]

This might be surprising at first. You might think that having a bunch of large cycles in a permutation would reduce the probability of there being other large cycles. But this doesn't happen... until your large cycles actually prohibit the existence of other large cycles, which happens when

\[ p_1 + 2 p_2 + 3 p_3 + \cdots + n p_n \gt n \]

Okay, so let's prove the Cycle Length Lemma. We'll assume

\[ p_1 + 2 p_2 + 3 p_3 + \cdots + n p_n \le n \]

because the other case is easy. To compute

\[ E\left( \prod_{k=1}^n C_k^{\underline{p}_k} \right) \]

we just need to:

go through all permutations \(\sigma \in S_n\),
for each one count all ways to choose an ordered \(p_1\)-tuple of disjoint cycles of length \(1\), an ordered \(p_2\)-tuple of disjoint cycles of length \(2\), and so on,
add up all these counts and divide by \(n!\).

But the resulting number is a groupoid cardinality! There's a groupoid \(A\) where an object consists of

a set \(X\) with \(n\) elements equipped with
a permutation \(\sigma \colon X \to X\) and
a choice, for \(\sigma\), of an ordered \(p_1\)-tuple of disjoint cycles of length \(1\), an ordered \(p_2\)-tuple of disjoint cycles of length \(2\), and so on.

The morphisms in \(A\) are structure-preserving bijections. The argument in Part 10 easily generalizes to show

\[ E\left( \prod_{k=1}^n C_k^{\underline{p}_k} \right) = |A| \]

There's really no calculation here, just a bit of thought.

Next, note that \(A\) is equivalent to another groupoid \(A'\), where an object consists of:

an ordered \(p_1\)-tuple of cyclically ordered sets of cardinality 1, an ordered \(p_2\)-tuple of cyclically ordered sets of cardinality 2, and so on... up to a \(p_n\)-tuple of cyclically ordered sets of cardinality \(n\), and
a set \(Y\) of cardinality \(n - p_1 - 2p_2 - \cdots - n p_n\) equipped with a permutation \(\sigma \colon Y \to Y\).

A morphism in \(A'\) is a bunch of structure-preserving bijections.

So, we have

\[ |A| = |A'| \]

But \(A'\), in turn, is equivalent to a product of groupoids:

\[ A' \simeq \mathrm{Perm}(n - p_1 - 2p_2 - \cdots - n p_n) \; \times \; \prod_{k = 1}^n B(\mathbb{Z}/k)^{p_k} \]

Here \(B(\mathbb{Z}/k)\) is the one-object groupoid corresponding to the group \(\mathbb{Z}/k\). This is equivalent to the groupoid of cyclically ordered sets of cardinality \(k\); that's why it shows up. \(\mathrm{Perm}(m)\) is the groupoid of \(m\)-element sets equipped with a permutation. \(\mathrm{Perm}(n - p_1 - 2p_2 - \cdots - n p_n)\) shows up here because of the set I called \(Y\) above.

Last time I computed the cardinalities of these groupoids:

\[ |B(\mathbb{Z}/k)| = \frac{1}{k} \]

while

\[ |\mathrm{Perm}(m)| = 1 \]

So, we get

\[ \begin{array}{ccl} \displaystyle{ E\left( \prod_{k=1}^n C_k^{\underline{p}_k} \right) } &=& \displaystyle{ \left| \mathrm{Perm}(n - p_1 - 2p_2 - \cdots - n p_n) \; \times \; \prod_{k=1}^n B(\mathbb{Z}/k)^{p_k} \right| } \\ \\ &=& \displaystyle{ \prod_{k=1}^n \frac{1}{k^{p_k}} } \end{array} \]

Voilà! We've proved the Cycle Length Lemma.

But even better, we've categorified it. The lemma is an equation between numbers, but we've exhibited these numbers as cardinalities of groupoids, and shown these groupoids are equivalent. So, the Cycle Length Lemma is really just a corollary of a stronger yet more obvious fact!

Categorified Cycle Length Lemma. Suppose \(p_1 , \dots, p_n \in \mathbb{N}\). Let \(A\) be the groupoid of \(n\)-element sets equipped with a permutation that is equipped with a choice of an ordered \(p_1\)-tuple of disjoint cycles of length \(1\), an ordered \(p_2\)-tuple of disjoint cycles of length \(2\), and so on. Then
\[ A \simeq \mathrm{Perm}(n - p_1 - 2p_2 - \cdots - n p_n) \; \times \; \prod_{k = 1}^n B(\mathbb{Z}/k)^{p_k} \]

By the way, let's define \(\mathrm{Perm}(n - p_1 - 2p_2 - \cdots - n p_n)\) to be the empty groupoid when \(p_1 + 2p_2 + \cdots + n p_n \gt n\). Then the lemma holds even in that case.

The case where \(p_1 + 2p_2 + \cdots + n p_n = n\) is especially pretty, since then our chosen cycles completely fill up our \(n\)-element set and we have \[ A \simeq \prod_{k = 1}^n B(\mathbb{Z}/k)^{p_k} \]

Taking the groupoid cardinality of both sides, we get a special case of the Cycle Length Lemma that was discovered by Cauchy in his research on permutations — maybe in one of his January 1815 papers. It's sometimes called Cauchy's Formula, though this guy had a lot of formulae.

You can also read comments on the n-Category Café, and make your own comments or ask questions there!