Random Permutations (Part 2)

## Random Permutations (Part 2)

#### John Baez

Here's a somewhat harder pair of problems on random permutations, with beautiful answers.

Puzzle 3. In the limit as $$n \to \infty$$, what is the probability that the shortest cycle in a randomly chosen permutation of an $$n$$-element set has length $$\gt k$$?

We mainly need to count the permutations of an $$n$$-element set whose shortest cycle has length $$\gt k$$. So, we define a functor

$\textrm{Perm}_{\gt k} \colon S \to \textrm{FinSet}$

from the groupoid of finite sets and bijections, $$S$$, to the category of finite sets, $$\textrm{FinSet}$$, sending any finite set $$X$$ to the finite set

$\textrm{Perm}_{\gt k}(X) = \{ \textrm{permutations of } X \textrm{ whose shortest cycle has length } \gt k \}$

A functor from $$S$$ to $$\textrm{FinSet}$$ is called a finite species, and this is what my course was about. We think of a finite species $$G : S \to \textrm{FinSet}$$ as a map sending any finite set $$X$$ to the set of '$$G$$-structures' we can put on $$X$$. Any finite species $$G$$ has a generating function

$|G|(x) = \sum_{n \ge 0} \frac{|G(n)|}{n!} x^n$

where in $$G(n)$$ we are using $$n$$ as our notation for some $$n$$-element set. Our basic strategy will be to compute the generating function of the species $$\textrm{Perm}_{\gt k}$$ and use that to count permutations of an $$n$$-element set whose shortest cycle has length $$k$$. But in fact, to save work, we will just work out the asymptotics as $$n \to \infty$$.

We can compose species: a $$G \circ H$$ structure on $$X$$ consists of a way of writing $$X$$ as a union of disjoint parts, putting a $$G$$-structure on the set of parts, and putting an $$H$$-structure on each part. We have

$|G \circ H| = |G| \circ |H|$

where at right we're composing generating functions.

So, if we can write $$\textrm{Perm}_{\gt k}$$ as a composite of two species whose generating functions we know, we can work out $$|\textrm{Perm}_{\gt k}|$$.

In fact $$\textrm{Perm}_{\gt k}$$ is the composite of two species called $$\textrm{Exp}$$ and $$\textrm{Cyc}_{\gt k}$$. The species $$\textrm{Exp}$$ is called 'being a finite set': we can put an $$\textrm{Exp}$$-structure on any finite set in exactly one way. The species $$\textrm{Cyc}_{\gt k}$$ is called 'being a cyclically ordered set of cardinality $$\gt k$$': we can put a $$\textrm{Cyc}_{\gt k}$$-structure on a finite set only if its cardinality is $$\gt k$$, and then a $$\textrm{Cyc}_{\gt k}$$-structure on it is just a cyclic ordering.

We have

$\textrm{Perm}_{\gt k} \cong \textrm{Exp} \circ \textrm{Cyc}_{\gt k}$

because a permutation with cycles of length $$\gt k$$ on a finite set $$X$$ is the same as a way of chopping $$X$$ into subsets of cardinality $$\gt k$$ and cyclically ordering each one. Thus, we have

$|\textrm{Perm}_{\gt k}| = |\textrm{Exp}| \circ |\textrm{Cyc}_{\gt k}|$

and we just need to compute the two generating functions on the right-hand side.

The generating function of $$\textrm{Exp}$$ is easy to compute:

$|\textrm{Exp}|(x) = \sum_{n \ge 0} \frac{1}{n!} x^n = \exp(x)$

and this explains the name $$\textrm{Exp}$$. The generating function of $$\textrm{Cyc}_{\gt k}$$ is

$|\textrm{Cyc}_{\gt k}|(x) = \sum_{n \gt k} \frac{(n-1)!}{n!} x^n = \sum_{n \gt k} \frac{x^n}{n}$

since there are $$(n-1)!$$ cyclic orderings on an $$n$$-element set. We have

$\sum_{n \ge 1} \frac{x^n}{n} = \ln\left(\frac{1}{1-x}\right)$

(just integrate the geometric series) so we get

$|\textrm{Cyc}_{\gt k}|(x) = \ln\left(\frac{1}{1-x}\right) - \left(x + \frac{x^2}{2} + \cdots + \frac{x^k}{k}\right)$

Since

$|\textrm{Perm}_{\gt k}| = |\textrm{Exp}| \circ |\textrm{Cyc}_{\gt k}|$

we get

$|\textrm{Perm}_{\gt k}|(x) = e^{ \ln\left(\frac{1}{1-x}\right) - \left(x + \frac{x^2}{2} + \cdots + \frac{x^k}{k}\right) }$

or simply

$|\textrm{Perm}_{\gt k}|(x) = \frac{e^{-\left(x + \frac{x^2}{2} + \cdots + \frac{x^k}{k}\right)}}{1-x}$

This is really a kind of formula for $$|\textrm{Perm}_{\gt k}(n)|$$, the number of permutations of an $$n$$-element set with shortest cycle of length $$\gt k$$. After all, by the definition of generating function, this equation says

$\sum_{n \ge 0} \frac{|\textrm{Perm}_{\gt k}(n)|}{n!} x^n = \frac{e^{-\left(x + \frac{x^2}{2} + \cdots + \frac{x^k}{k}\right)}}{1-x}$

Even better, the coefficient

$\frac{|\textrm{Perm}_{\gt k}(n)|}{n!}$

is the probability we're after: the probability that the shortest cycle in a randomly chosen permutation of an $$n$$-element set has length $$\gt k$$.

Unfortunately, working out these coefficients is hard.

Fortunately, we just want to know the limit of these coefficients as $$n \to \infty$$. And for this we can use a little result from complex analysis. I'll prove it later:

Lemma. Suppose

$f(z) = \sum_{n \ge 0} c_n z^n$

is a meromorphic function defined on a disc of radius $$\gt 1$$ centered at the origin, which is analytic on this disc except for a simple pole at $$z = 1$$. Then $$\lim_{n \to \infty} c_n$$ equals the residue of $$-f$$ at $$z = 1$$.

If you want to understand this lemma, I urge that you consider an example: the function $$1/(1-z)$$. The coefficients $$c_n$$ of its power series are all $$1$$, and it has a pole at $$z = 1$$ with residue $$-1$$. This is why we need to take the residue of $$-f$$ in the lemma. In fact, proving the lemma consists of reducing the general problem to this special case.

Given this lemma, the quantity we're trying to compute:

$\lim_{n \to \infty} \frac{|\textrm{Perm}_{\gt k}(n)|}{n!}$

is the residue of the function

$\frac{e^{-\left(z + \frac{z^2}{2} + \cdots + \frac{z^k}{k}\right)}}{z-1}$

at the point $$z = 1$$. This function is just minus the generating function of $$\textrm{Perm}_{\gt k}$$, but I've changed $$x$$ to $$z$$ to emphasize that now we're thinking of it as a function on the complex plane. It's actually analytic everywhere except $$z = 1$$, and since $$1/(z-1)$$ has a simple pole of residue $$1$$ at this point, the residue of

$\frac{e^{-\left(z + \frac{z^2}{2} + \cdots + \frac{z^k}{k}\right)}}{z-1}$

at $$z = 1$$ is just the numerator evaluated at $$z = 1$$, namely

$e^{-\left(1 + \frac{1}{2} + \cdots + \frac{1}{k}\right)}$

So this number is the probability that the shortest cycle in a randomly chosen permutation of an $$n$$-element set has length $$\gt k$$, in the $$n \to \infty$$ limit.

This is beautiful, I think. And it leads immediately to another beautiful fact.

Puzzle 4. Give a simpler asymptotic formula for the above probability as $$k \to \infty$$.

Here we use the fact that

$1 + \frac{1}{2} + \cdots + \frac{1}{k} = \ln k + \gamma + o(1)$

where $$\gamma \approx 0.57721$$ is Euler's constant. So, as $$k \to \infty$$, the probability we've just computed is

$e^{-\left(1 + \frac{1}{2} + \cdots + \frac{1}{k}\right)} \sim e^{-\left(\ln k + \gamma \right)} = \frac{1}{e^\gamma \, k}$

When Euler discovered his constant $$\gamma$$ in his 1776 paper "On a memorable number naturally occurring in the summation of the harmonic series", he voiced his suspicion that $$e^\gamma$$ would be interesting... and here is some evidence that it is!

By the way, I learned about these questions here:

They were first answered in 1952:

To wrap up, let me prove that lemma. It's just a special case of a much more general useful result on the asymptotic behavior of the coefficients of a power series:

• Philippe Flajolet and Robert Sedgewick, Analytic Combinatorics, Cambridge University Press, Cambridge, 2009.

But once you've seen the proof of this special case, it's easy to generalize.

Lemma. Suppose

$f(z) = \sum_{n \ge 0} c_n z^n$

is a meromorphic function defined on a disc of radius $$\gt 1$$ centered at the origin, which is analytic on this disc except for a simple pole at $$z = 1$$. Then $$\lim_{n \to \infty} c_n$$ equals the residue of $$-f$$ at $$z = 1$$.

Proof. The idea is to 'subtract off the singularity'. We can write

$f(z) = g(z) + \frac{a}{1-z}$

where $$g$$ is analytic on a disc of radius $$\gt 1$$ and $$a$$ is the residue of $$-f$$ at $$z = 1$$. If we write

$g(z) = \sum_{n \ge 0} d_n z^n$

then we have

$c_n = d_n + a$

for all $$n$$. But the Cauchy–Hadamard theorem says that

$\limsup_{n \to \infty} |d_n|^{1/n} = 1/R$

where $$R$$ is the radius of convergence of $$\sum_{n \ge 0} d_n z^n$$, and since $$R \gt 1$$ we have $$d_n \to 0$$ as $$n \to \infty$$. Thus

$c_n \to a$

as was to be shown.   ■