If we let this story lead us where it wants to go, we'll meet
all sorts of famous and fascinating creatures.
However, the charm of the tale is how many of these ideas are unified
and made simpler thanks to a big, simple idea: groupoidification.
So, what's groupoidification? It's a method of exposing the combinatorial
underpinnings of linear algebra - the hard bones of set theory underlying
the flexibility of the continuum.
Linear algebra is all about vector spaces and linear maps. One of the
lessons that gets drummed into you when you study this subject is that
it's good to avoid picking bases for your vector spaces until you need
them. It's good to keep the freedom to do coordinate transformations...
and not just keep it in reserve, but keep it manifest!
As Hermann Weyl wrote, "The introduction of a coordinate system
to geometry is an act of violence".
This is a deep truth, which hits many physicists when they study special
and general relativity. However, as Niels Bohr quipped, a deep truth is one
whose opposite is also a deep truth. There are some situations where
a vector space comes equipped with a god-given basis. Then it's foolish
not to pay attention to this fact!
The most obvious example is when our vector space has been defined
to consist of formal linear combinations of the elements of some set.
Then this set is our basis.
This often happens when we use linear algebra to study combinatorics.
But if sets give vector spaces, what gives linear operators? Your
first guess might be functions. And indeed, functions between sets
do give linear operators between their vector spaces. For example,
suppose we have a function
f: {livecat, deadcat} → {livecat, deadcat}
which "makes sure the cat is dead":
f(livecat) = deadcat
f(deadcat) = deadcat
Then, we can extend f to a linear operator defined on formal
linear combinations of cats:
F(a livecat + b deadcat) = a deadcat + b deadcat
Written as a matrix in the {livecat, deadcat} basis, this looks like:

(0 0)
(1 1)
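If you like to compute, here's a minimal sketch of this recipe in Python
(the names are mine, just for illustration):

    def matrix_of_function(f, X, Y):
        # One column per element of X, one row per element of Y;
        # column x has a single 1, in row f(x).
        return [[1 if f(x) == y else 0 for x in X] for y in Y]

    cats = ["livecat", "deadcat"]
    f = {"livecat": "deadcat", "deadcat": "deadcat"}
    print(matrix_of_function(f.get, cats, cats))   # [[0, 0], [1, 1]]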
So, we get some linear operators from functions... but not all!
We only get operators whose matrices have exactly a single 1 in
each column, the rest of the entries being 0. That's because a
function f: X → Y sends each element of X to a single element of Y.
This is very limiting. We can do better if we get operators from
relations between sets. In a relation between sets X and Y,
an element of X can be related to any number of elements of Y, and
vice versa. For example, let the relation
R: {1,2,3,4} —/→ {1,2,3,4}
be "is a divisor of". Then 1 is a divisor of everybody, 2 is a
divisor of itself and 4, 3 is only a divisor of itself, and 4 is
only a divisor of itself. We can encode this in a matrix, with a
column for each potential divisor x and a row for each number y,
and a 1 wherever x divides y:

(1 0 0 0)
(1 1 0 0)
(1 0 1 0)
(1 1 0 1)
We can get any matrix of 0's and 1's this way. Relations are really
just matrices of truth values - here we're choosing to think of the
truth values "false" and "true" as the numbers 0 and 1. Unfortunately,
we're still far from getting all matrices of numbers!
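Here's a quick Python illustration of the difference (a sketch of my
own): composing the divisor relation with itself using "and" and "or"
still gives a matrix of truth values, while multiplying the same 0/1
matrices with × and + gives entries that count things.

    X = [1, 2, 3, 4]
    # Entry in row y, column x is 1 when x divides y:
    R = [[1 if y % x == 0 else 0 for x in X] for y in X]

    def compose_relations(A, B):
        # (A o B)[i][j] = OR over k of (A[i][k] AND B[k][j])
        n = len(A)
        return [[int(any(A[i][k] and B[k][j] for k in range(n)))
                 for j in range(n)] for i in range(n)]

    def multiply_matrices(A, B):
        n = len(A)
        return [[sum(A[i][k] * B[k][j] for k in range(n))
                 for j in range(n)] for i in range(n)]

    print(compose_relations(R, R))   # still just 0's and 1's
    print(multiply_matrices(R, R))   # entries count chains x|z|y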
We can do better if we get matrices from spans of sets. A span of
sets, written S: X —/→ Y, is just a set S equipped with functions to
X and Y. We can draw it like this:

        S
       / \
      F   G
     v     v
    X       Y
Spans of sets are like relations, but where you can be related to
someone more than once!
For example, X could be the set of Frenchmen and Y could be the set of
Englishwomen. S could be the set of Russians. As you know, every
Russian has exactly one favorite Frenchman and one favorite
Englishwoman. So, F could be the function "your favorite
Frenchman", and G could be "your favorite Englishwoman".
Then, given a Frenchman x and an Englishwoman y, they're related by
the Russian s whenever s has x as their favorite Frenchman and y as
their favorite Englishwoman:
F(s) = x and G(s) = y.
Some pairs (x,y) will be related by no Russians, others will be related
by one, and others will be related by more than one! I bet the pair
(x,y) = (Gérard Depardieu, Emma Thompson)
is related by at least 57 Russians.
This idea lets us turn spans of sets into matrices of natural numbers.
Given a span of finite sets:

        S
       / \
      F   G
     v     v
    X       Y

we get a matrix of natural numbers, with a column for each element x
of X and a row for each element y of Y. The entry in row y and column
x counts the witnesses: it's the number of elements s in S with

F(s) = x and G(s) = y.
We can get any finite-sized matrix of natural numbers this way.
Even better, there's a way to "compose" spans that nicely
matches the usual way of multiplying matrices. You can figure this
out yourself if you solve this puzzle: how can we compose two spans,
say S: X —/→ Y and T: Y —/→ Z, so that the matrix of the composite
span is the product of their matrices?
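Here's a minimal Python sketch, with one natural answer to the puzzle
built in: to compose spans, take all pairs of witnesses that match in
the middle. The matrix of a span counts witnesses, and composing spans
then matches multiplying matrices (the toy spans are my own):

    def matrix_of_span(S, F, G, X, Y):
        # Entry in row y, column x: the number of witnesses s
        # with F(s) = x and G(s) = y.
        return [[sum(1 for s in S if F(s) == x and G(s) == y)
                 for x in X] for y in Y]

    def compose_spans(S, F, G, T, F2, G2):
        # Pairs (s, t) whose middle feet agree: G(s) = F2(t).
        U = [(s, t) for s in S for t in T if G(s) == F2(t)]
        return U, lambda u: F(u[0]), lambda u: G2(u[1])

    # A toy check that composing spans matches matrix multiplication:
    X = Y = Z = [0, 1]
    fst, snd = (lambda s: s[0]), (lambda s: s[1])
    S = [(x, y) for x in X for y in Y]             # a span from X to Y
    T = [(y, z) for y in Y for z in Z if y <= z]   # a span from Y to Z
    U, P, Q = compose_spans(S, fst, snd, T, fst, snd)
    A = matrix_of_span(S, fst, snd, X, Y)
    B = matrix_of_span(T, fst, snd, Y, Z)
    C = matrix_of_span(U, P, Q, X, Z)
    BA = [[sum(B[i][k] * A[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    print(C == BA)   # True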
To go further, we need to consider spans, not of sets, but of groupoids!
I'll say more about this later - I suspect you're getting tired.
But for now, briefly: a groupoid is a category with inverses. Any
group gives an example, but groupoids are more general - they're the
modern way of thinking about symmetry.
There's a way to define the cardinality of a finite groupoid: you sum
up, over isomorphism classes of objects, the reciprocals of the sizes
of their automorphism groups. For details, try:
12) John Baez and James Dolan, From finite sets to Feynman diagrams,
in Mathematics Unlimited - 2001 and Beyond, vol. 1, eds. Bjorn Engquist
and Wilfried Schmid, Springer, Berlin, 2001, pp. 29-50. Also available
as math.QA/0004133.
And, this can equal any nonnegative rational number! This lets us
generalize what we've done from finite sets to finite groupoids, and
get rational numbers into the game.
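Using the definition above, computing groupoid cardinalities is nearly
a one-liner. Here's a minimal Python sketch, where a finite groupoid is
described by listing the size of the automorphism group of one object
from each isomorphism class:

    from fractions import Fraction

    def groupoid_cardinality(aut_sizes):
        # Sum, over isomorphism classes, of 1/(number of automorphisms)
        return sum(Fraction(1, n) for n in aut_sizes)

    print(groupoid_cardinality([1, 1, 1]))  # a 3-element set: 3
    print(groupoid_cardinality([6]))        # a group of order 6, as a one-object groupoid: 1/6
    print(groupoid_cardinality([2, 3]))     # 1/2 + 1/3 = 5/6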
A span of groupoids is a diagram

        S
       / \
      F   G
     v     v
    X       Y

where now X, Y and S are groupoids, and F and G are functors.
There's also a way of composing spans of groupoids, which corresponds
to multiplying matrices. However, there's a trick involved in getting
this to work - I'll have to explain this later. For details, try:
13) Jeffrey Morton, Categorified algebra and quantum mechanics,
Theory and Applications of Categories 16 (2006), 785-854. Available
at
http://www.emis.de/journals/TAC/volumes/16/29/16-29abs.html;
also available as math.QA/0601458.
14) Simon Byrne, On Groupoids and Stuff, honors thesis, Macquarie
University, 2005, available at
http://www.maths.mq.edu.au/~street/ByrneHons.pdf and
http://math.ucr.edu/home/baez/qg-spring2004/ByrneHons.pdf
Anyway: the idea of "groupoidification" is that in many cases where
mathematicians think they're playing around with linear operators
between vector spaces, they're actually playing around with spans of
groupoids!
This is especially true in math related to simple Lie groups, their
Lie algebras, quantum groups and the like. While people usually study
these gadgets using linear algebra, there's a lot of combinatorics
involved - and where combinatorics and symmetry show up, one invariably
finds groupoids.
As the name suggests, groupoidification is akin to categorification.
But, it's a bit different. In categorification, we try to boost up
mathematical ideas this way:

sets → categories
functions → functors
equations between functions → isomorphisms between functors

In groupoidification, we try this:

vector spaces → groupoids
linear operators → spans of groupoids
Actually, it's "decategorification" and
"degroupoidification" that are systematic processes. These
processes lose information, so there's no systematic way to reverse
them. But, as I explained in "week99", it's still fun to try! If we succeed,
we discover an extra layer of structure beneath the math we thought we
understood... and this usually makes that math clearer and
less technical, because we're not seeing it through a blurry,
information-losing lens.
I don't want this to be accessible only to experts, since a bunch of
it is so wonderfully elementary. So, I'm going to proceed rather slowly.
This may make the experts impatient, so near the end I'll zip ahead and
sketch out a bit of the big picture.
Last time I introduced spans of sets. A span of sets is just a set S
equipped with functions to X and Y:

        S
       / \
      F   G
     v     v
    X       Y
Have you heard how computer scientists use the term
"witness"? They say the number 17 is a "witness"
to the fact that the number 221 isn't prime, since 17 evenly divides
221.
That's the idea here. Given a span S as above, we can say an element
x of X and an element y of Y are "related" if there's an
element s of S with
F(s) = x and G(s) = y
The element s is a "witness" to the relation.
Last time,
I gave an example where a Frenchman x and an Englishwoman y
were related if they were both the favorites of some Russian s.
Note: there's more information in the span than the relation it
determines. The relation either holds or fails to hold. The span
does more: it provides a set of "witnesses". The relation holds
if this set of witnesses is nonempty, and fails to hold if it's empty.
At least, that's how mathematicians think. When
I got married last month, I discovered the state of California demands
two witnesses attend the ceremony and sign the application for a
marriage license. Here the relation is "being married", and the
witnesses attest to that relation - but for the state, one witness
is not enough to prove that the relation holds! They're using a more
cautious form of logic.
To get the really interesting math to show up, we need to look at
other examples of "witnessed relations" - not involving Russians
or marriages, but geometry and symmetry.
For example, suppose we're doing 3-dimensional geometry. There's a
relation "the point x and the line y lie on a plane",
but it's pretty dull, since it's always true. More interesting is the
witnessed relation "the point x and the line y lie on the plane
z". The reason is that sometimes there will be just one plane
containing a point and a line, but when the point lies on the line,
there will be lots.
To think of this "witnessed relation" as a span of sets, we need to
pick the right set S of witnesses.
Can we take S to be the set of planes? No! Then there would be no way
to define the functions F and G, because the same plane contains lots of
different points and lines. So, we should take S to be the set of
triples (x,y,z) where x is a point, y is a line, and z is a plane
containing x and y. Then we can take
F(x,y,z) = x
and
G(x,y,z) = y
A "witness" to the fact that x and y lie on a plane is not just a
plane containing them, but the entire triple.
(If you're really paying attention, you'll have noticed that we need to
play the same trick in the example of witnesses to a marriage.)
Spans like this play a big role in "incidence geometry". There are
lots of flavors of incidence geometry, with "projective geometry"
being the most famous. But, a common feature is that we always have
various kinds of "figures" - like points, lines, planes, and so on.
And, we have various kinds of "incidence relations" involving these
figures. But to really understand incidence geometry, we need to
go beyond relations and use spans of sets.
Actually, we need to go beyond spans of sets and use spans of
groupoids! The reason is that incidence geometries usually have
interesting symmetries, and a groupoid is like a "set with
symmetries". For example, consider lines in 3-dimensional space.
These form a set, but there are also symmetries of 3-dimensional space
mapping one line to another. To take these into account we need a
richer structure: a groupoid!
Here's the formal definition: a groupoid consists of a set of
"objects", and for any objects x and y, a set of
"morphisms"
f: x → y
which we think of as symmetries taking x to y. We can compose a
morphism f: x → y and a morphism g: y → z to get a morphism
fg: x → z. We think of fg as the result of doing first f and
then g. So, we demand the associative law
(fg)h = f(gh)
whenever either side is well-defined. We also demand that every object
x has an identity morphism
1_x: x → x
We think of this as the symmetry that doesn't do anything to x.
So, given any morphism f: x → y, we demand that
f 1_y = f = 1_x f
So far this is the definition of a "category". What makes
it a "groupoid" is that every morphism f: x → y has an
"inverse"
f^-1: y → x

with the property that

f f^-1 = 1_x

and

f^-1 f = 1_y
In other words, we can "undo" any symmetry.
So, in our spans from incidence geometry:

        S
       / \
      F   G
     v     v
    X       Y

X, Y and S are really groupoids, and F and G are really functors.
What's a functor? Given groupoids A and B, clearly a functor
F: A → B
should send any object x in A to an object F(x) in B. But also, it
should send any morphism in A:
f: x → y
to a morphism in B:
F(f): F(x) → F(y)
And, it should preserve all the structure that a groupoid has,
namely composition:
F(fg) = F(f) F(g)
and identities:
F(1_x) = 1_F(x)
It then automatically preserves inverses too:

F(f^-1) = F(f)^-1

since applying F to f f^-1 = 1_x gives F(f) F(f^-1) = 1_F(x), and
similarly for f^-1 f = 1_y.
Given this, what's the meaning of a span of groupoids? You could say
it's an "invariant" witnessed relation - that is, a relation
with witnesses that's preserved by the symmetries at hand.
These are the very essence of incidence geometry. For example, if we
have a point and a line lying on a plane, we can rotate the whole
picture and get a new point and a new line lying on a new plane.
Indeed, a "symmetry" in incidence geometry is precisely
something that preserves all such "incidence relations".
For those of you not comfy with groupoids, let's see how this actually
works. Suppose we have a span of groupoids:

        S
       / \
      F   G
     v     v
    X       Y

and suppose the object s of S is a witness to the fact that x and y
are related:

F(s) = x and G(s) = y
Also suppose we have a symmetry sending s to some other object of S:
f: s → s'
This gives morphisms
F(f): F(s) → F(s')
in X and
G(f): G(s) → G(s')
in Y. And if we define
F(s') = x' and G(s') = y'
we see that s' is a witness to the fact that x' and y' are related.
Let me summarize the Tale so far.
From all this, you should begin to vaguely see that starting from any sort
of incidence geometry, we should be able to get a bunch of matrices.
Facts about incidence geometry will give facts about linear algebra!
"Groupoidification" is an attempt to reverse-engineer this process.
We will discover that lots of famous facts about linear algebra are
secretly facts about incidence geometry!
To prepare for what's to come, the maniacally diligent reader might
like to review "week178",
"week180", "week181", "week186" and "week187", where I explained how any Dynkin
diagram gives rise to a flavor of incidence geometry. For example,
the simplest-looking Dynkin diagrams, the A_n series, like
this for n = 3:

o---o---o

give rise to projective geometries.
(There will also be other ways to get spans of groupoids, that don't
quite fit into what's customarily called "incidence
geometry", but still fit very nicely into our Tale. For example,
Dynkin diagrams become "quivers" when we give each edge a
direction, and the "groupoid of representations of a quiver"
gives rise to linear-algebraic structures related to a quantum group.
In fact, I already mentioned this in item E of "week230". Eventually this will let us
groupoidify the whole theory of quantum groups! But, I don't want to
rush into that, since it makes more sense when put in the right
context.)
I'm telling a long story about symmetry, geometry,
and algebra. Some of this tale is new work done by James Dolan, Todd
Trimble and myself. But a lot of it is old work by famous people which
deserves a modern explanation.
A great example is Felix Klein's "Erlangen program" - a plan for
reducing many sorts of geometry to group theory. Many people tip
their hat to the Erlangen program, but few seem to deeply understand
it, and even fewer seem to have read what Klein actually wrote about it!
The problem goes back a long ways. In 1871, while at Göttingen,
Klein worked on non-Euclidean geometry, and showed that hyperbolic
geometry was consistent if and only if Euclidean geometry was.
In the process, he must have thought hard about the role of symmetry
groups in geometry. When he was appointed professor at Erlangen
in 1872, he wrote a lecture outlining his "Erlanger Programm" for
reducing geometry to group theory.
But, he didn't actually give this lecture as his inaugural speech!
He spoke about something else.
So, nobody ever heard him announce the Erlangen program. And, until
recently, the lecture he wrote was a bit hard to find. Luckily, now
you can get it online:
1) Felix Klein, Vergleichende Betrachtungen ueber neuere geometrische
Forschungen, Verlag von Andreas Deichert, Erlangen, 1872. Also
available at the University of Michigan Historical Mathematics Collection,
http://www.hti.umich.edu/cgi/t/text/text-idx?c=umhistmath;idno=ABN7632
Even better, Johan Ernst Mebius has recently prepared an HTML version,
with links to the above version:
2) Johan Ernst Mebius, Felix Klein's Erlanger Programm,
http://www.xs4all.nl/~jemebius/ErlangerProgramm.htm
But what if you have the misfortune of only reading English, not
German? Until now the only translation was quite hard to obtain:
3) Felix Klein, A comparative review of recent researches in geometry,
trans. M. W. Haskell, Bull. New York Math. Soc. 2 (1892-1893), 215-249.
In case you're wondering, the "New York Mathematical
Society" no longer exists! It was founded in 1888, but in 1894
it went national and became the American Mathematical Society.
Luckily, after Thomas Love pointed out the existence of this old
translation, Chris Hillman was able to get ahold of it and scan it in!
Then Robin Houston created a PDF file of the whole thing, and Lukas-Fabian
Moser created a DjVu file. Then Nitin C. Rughoonauth took the marvelous
step of putting it into LaTeX! So now, you can read Klein's paper in English
here:
4) The Erlangen program, http://math.ucr.edu/home/baez/erlangen/
English-speakers can read more about the Erlangen program here:
5) Felix Klein, Elementary Mathematics from an Advanced Standpoint:
Geometry, part 3: Systematic discussion of geometry and its foundations,
Dover, New York, 1939.
Luckily Dover keeps its books in print!
For more on the Erlangen program, try these:
6) Garrett Birkhoff and M. K. Bennett, Felix Klein and his
"Erlanger Programm", in History and Philosophy of Modern
Mathematics, eds. W. Aspray and P. Kitcher, Minnesota
Stud. Philos. Sci. XI, University of Minnesota Press, Minneapolis,
1988, pp. 145-176.
7) Hans A. Kastrup, The contributions of Emmy Noether, Felix Klein and
Sophus Lie to the modern concept of symmetries in physical systems,
in Symmetries in Physics (1600-1980), ed. M. G. Doncel, World Scientific,
Singapore, 1987, pp. 113-163.
8) I. M. Yaglom, Felix Klein and Sophus Lie: Evolution of the
Idea of Symmetry in the Nineteenth Century, trans. S. Sossinsky,
Birkhauser, Boston, 1988.
For more about Klein, try "week213" and this little biography:
9) MacTutor History of Mathematics Archive, Felix Klein,
http://www-history.mcs.st-andrews.ac.uk/Biographies/Klein.html
But what does the Erlangen program actually amount to, in the language
of modern mathematics? This will take a while to explain, so the best
thing is to dive right in.
In the last episode of the Tale of Groupoidification I tried to
explain two slogans: a groupoid is like a "set with symmetries", and
a span of groupoids is like an "invariant witnessed relation".
They're a bit vague; they're mainly designed to give you enough intuition
to follow the next phase of the Tale, which is all about how spans of
groupoids give linear operators.
But before the next phase, I need to say a bit about how groupoids
and spans of groupoids fit into Klein's Erlangen program.
Groupoids are a modern way to think about symmetries. A more
traditional approach would use a group acting as symmetries of
some set. And the most traditional approach of all, going back to
Galois and Klein, uses a group acting transitively on a set.
So, let me explain the traditional approach, and then relate it
to the modern one.
I hope you know what it means for a group G to "act" on a set X.
It means that for any element x of X and any guy g in G, we get a
new element gx in X. We demand that
1x = x
and
g(hx) = (gh)x.
More precisely, this is a "left action" of G on X, since we write
the group elements to the left of x. We can also define right actions,
and someday we may need those too.
We say an action of a group G on a set X is "transitive" if given
any two elements of X, there's some guy in G mapping the first
element to the second. In this case, we have an isomorphism of sets
X ≅ G/H
for some subgroup H of G.
For example, suppose we're studying a kind of geometry where the symmetry
group is G. Then X could be the set of figures of some sort: points, or
lines, or something fancier. If G acts transitively on X, then all
figures of this sort "look alike": you can get from any one to any other
using a symmetry. This is often the case in geometry... but not always.
Suppose G acts transitively on X. Pick any figure x of type X and let
H be its "stabilizer": the subgroup consisting of all guys in G that map
x to itself. Then we get a one-to-one and onto map
f: X → G/H
sending each figure gx in X to the equivalence class [g] in G/H.
If you haven't seen this fact before, you should definitely prove it -
it's one of the big ways people use symmetry!
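If you'd rather check this fact than prove it, here's a minimal Python
sketch with an example of my choosing: G = S_3 acting on X = {0,1,2}
by permuting it.

    from itertools import permutations

    G = list(permutations(range(3)))       # S_3, acting by g(x) = g[x]
    X = range(3)
    x = 0
    orbit = {g[x] for g in G}              # all of X: the action is transitive
    H = [g for g in G if g[x] == x]        # the stabilizer of x
    print(len(orbit) * len(H) == len(G))   # True: |X| = |G|/|H|
    # Each figure y corresponds to a coset: the set of g with gx = y.
    fibers = {y: [g for g in G if g[x] == y] for y in orbit}
    print(all(len(f) == len(H) for f in fibers.values()))   # True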
Here's one kind of thing people do with this fact. The 3d rotation
group G = SO(3) acts on the sphere X = S^2, and the stabilizer of
the north pole is the 2d rotation group H = SO(2), so the sphere is
isomorphic to G/H = SO(3)/SO(2). The same sort of result holds in any
dimension, and we can use it to derive facts about spheres from facts
about rotation groups, and vice versa.
A grander use of this fact is to set up a correspondence between
sets on which G acts transitively and subgroups of G. This is
one of the principles lurking behind Galois theory.
Galois applied this principle to number theory - see "week201" for details. But, it really has
nothing particular to do with number theory! In his Erlangen program,
Klein applied it to geometry.
Klein's goal was to systematize a bunch of different kinds of non-Euclidean
geometry. Each kind of geometry he was interested in had a different
group of symmetries. For example:
The details here don't matter much yet; the point is that there are lots
of interesting kinds of geometry, with interesting symmetry groups!
Klein realized that in any kind of geometry like this, a "type of
figure" corresponds to a set on which G acts transitively. Here
a "figure" could be a point, a line, a plane, or something
much fancier. Regardless of the details, the set of all figures of
the same type can be written as G/H, and G acts transitively on this set.
The really cool part is that we can use Klein's idea to
define a geometry for any group G. To do this, we just say
that every subgroup H of G gives rise to a type of figure. So, we
work out all the subgroups of G. Then, we work out all the incidence
relations - relations like "a point lies on a line". To do
this, we take two sets of figures, say
X = G/H
and
Y = G/K
and find all the invariant relations between them: that is, subsets of
X × Y preserved by all the symmetries. I'll say more about how to
do this next time - we can use something called "double
cosets". In nice cases, like when G is a simple Lie group and H
and K are so-called "parabolic" subgroups, these let us express
all the invariant relations in terms of finitely many
"atomic" ones! So, we can really carry out Klein's program
of thoroughly understanding geometry starting from groups - at least
in nice cases.
In short, group actions - especially transitive ones - are a traditional
and very powerful way of using symmetry to tackle lots of problems.
So, to bridge the gap between the traditional and the new, I should
explain how group actions give groupoids. I'll show you that actions
of a group G are secretly the same as groupoids equipped with certain
extra stuff!
It's not very hard to get a groupoid from a group action. Say we have
a group G acting on a set X. Then the objects of our groupoid are
just elements of X, and a morphism
g: x → y
is just a group element g with
gx = y.
Composing morphisms works the obvious way - it's basically just
multiplication in the group G.
Some people call this groupoid an "action groupoid". I often call
it the "weak quotient" X//G, since it's like the ordinary quotient
X/G, but instead of declaring that x and y are equal when we have
a group element g sending x to y, we instead declare they're
isomorphic via a specified isomorphism g: x → y.
But for now, let's call X//G the "action groupoid".
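Here's a minimal Python sketch of the action groupoid, again for S_3
acting on {0,1,2} (the encoding is mine): a morphism x → y is a triple
(x, g, y) with gx = y, and composition is group multiplication, written
in the same "first f, then g" order as above.

    from itertools import permutations

    G = list(permutations(range(3)))     # S_3 acting on X = {0,1,2}
    X = range(3)

    # All morphisms of X//G: triples (x, g, y) with gx = y
    morphisms = [(x, g, g[x]) for x in X for g in G]

    def compose(m1, m2):
        # first m1: x -> y, then m2: y -> z
        (x, g1, y), (y2, g2, z) = m1, m2
        assert y == y2, "not composable"
        g = tuple(g2[g1[i]] for i in range(3))   # first g1, then g2
        return (x, g, z)

    # The functor p: X//G -> G just forgets x and y: p(x, g, y) = g.
    # It's faithful: a morphism out of x is determined by g.
    m1 = (0, (1, 0, 2), 1)   # the transposition (0 1), as a morphism 0 -> 1
    m2 = (1, (1, 0, 2), 0)   # the same group element, as a morphism 1 -> 0
    print(compose(m1, m2))   # (0, (0, 1, 2), 0): the identity on 0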
So, group actions give action groupoids. But, these groupoids come
with extra stuff!
First of all, the action groupoid X//G always comes equipped with a
functor

p: X//G → G

where we think of the group G as a groupoid with one object. This
functor sends every object of X//G to the one object of G, and every
morphism g: x → y of X//G to the group element g.
Second of all, this functor p is always "faithful": given
two morphisms from x to y, if p maps them to the same morphism, then
they're equal.
And that's all! Any groupoid with a faithful functor to G is
equivalent to the action groupoid X//G for some action of G on some
set X. This takes a bit of proving... let's not do it now.
So: in my slogan above,
the "certain extra stuff" was precisely a faithful functor to G.
What if we have a transitive group action? Then something nice
happens.
First of all, saying that G acts transitively on X is the same as
saying there's a morphism between any two objects of X//G. In
other words, all objects of X//G are isomorphic. Or in other words,
there's just one isomorphism class of objects.
Just as a groupoid with one object is a group, a groupoid with one
isomorphism class of objects is equivalent to a group. Here
I'm using the usual notion of "equivalence" of categories, as
explained back in "week76".
So, G acts transitively on X precisely when X//G is equivalent
to a group!
And what group? Well, what could it possibly be? It's just the
stabilizer of some element of X! So, in the case of a transitive
group action, our functor

p: X//G → G

is, up to equivalence, just the inclusion of a stabilizer subgroup
H in G.
So, we see how Klein's old idea of geometrical figures as
subgroups of G is being generalized. We can start with any groupoid Y
of "figures" and "symmetries between figures", and
play with that. It becomes an action groupoid if we equip it with a
faithful functor to some group G:

p: Y → G
So, when we blend our ideas on spans of groupoids with Klein's ideas,
we'll want to use spans of groupoids "over G" - that is,
commutative diamonds of groupoids and functors, like this:

        S
       / \
      v   v
     X     Y
      \   /
       v v
        G
I'll say one last thing before quitting. It's a bit more technical,
but I feel an urge to see it in print.
People often talk about "the" stabilizer group of a transitive action
of some group G on some set X. This is a bit dangerous, since every
element of X has its own stabilizer, and they're not necessarily all equal!
However, they're all conjugate: if the stabilizer of x is H,
then the stabilizer of gx is gHg^-1.
So, when I speak above of "the" stabilizer H, remember that H is only
well-defined up to conjugation.
Of course, I'll try to write later issues in a way that doesn't force
you to have understood all these nuances!
Though Newton seems to have believed in some form of "absolute
space", the idea that motion is relative predates Einstein by a
long time. In 1632, in his Dialogue Concerning the Two Chief World Systems,
Galileo wrote:
When you have observed all these things carefully (though doubtless
when the ship is standing still everything must happen in this way),
have the ship proceed with any speed you like, so long as the motion
is uniform and not fluctuating this way and that. You will discover
not the least change in all the effects named, nor could you tell
from any of them whether the ship was moving or standing still.
In modern language, Galileo was saying that the laws of physics are
invariant under "boosts" such as

(t, x, y, z) ↦ (t, x + vt, y, z)
By the time Maxwell came up with his equations describing light,
the idea of relativity of motion was well established. In 1876,
he wrote:
As Einstein wrote in 1905:
Nowadays the group generated by such boosts, together with translations
in time and space and spatial rotations, is called the "Galilei group".
It's 10-dimensional: 1 dimension of time translations, 3 of space
translations, 3 of rotations, and 3 of boosts. With special relativity,
the relevant group became the "Poincare group", with Lorentz boosts
replacing Galilei boosts.
It's still 10-dimensional, not any bigger. But, it acts differently
as transformations of the spacetime coordinates (t,x,y,z).
Another thing that changed was our appreciation of the importance
of symmetry! Before the 20th century, group theory was not in the
toolkit of most theoretical physicists. Now it is.
Okay. Now suppose you're the only thing in the universe, floating in
empty space, not rotating. To make your stay in this thought experiment
a pleasant one, I'll give you a space suit. And for simplicity, suppose
special relativity holds true exactly, with no gravitational fields
to warp the geometry of spacetime.
Would the universe be any different if you were moving at constant
velocity? Or translated 2 feet to the left or right? Or turned
around? Or if it were one day later?
No! Not in any observable way, at least! It would seem exactly
the same.
So in this situation, it doesn't really make much sense to say
"where you are", or "which way you're facing", or
"what time it is". There are no "invariant
propositions" to make about your location or motion. In other
words, there's nothing to say whose truth value remains unchanged
after you apply a symmetry.
Well, almost nothing to say! The logicians in the crowd will
note that you can say "T": the tautologously true statement.
You can also say "F": the tautologously false statement.
But, these aren't terribly interesting.
Next, suppose you have a friend also floating through space. Now
there are more interesting invariant propositions. There's nothing
much invariant to say about just you, and nothing to say about just your
friend, but there are invariant relations. For example, you
can measure your friend's speed relative to you, or your distance of
closest approach.
Mathematicians study invariant relations using a tool called "double
cosets". I want to explain these today, since we'll need them soon
in the Tale of Groupoidification.
"Double cosets" sound technical, but that's just to keep timid
people from understanding the subject. A double coset is secretly
just an "atomic" invariant relation: one that can't be
expressed as "P or Q" where P and Q are themselves invariant
relations - unless precisely one of P or Q is tautologously false.
So, atomic invariant relations are like prime numbers: they can't
be broken down into simpler bits. And, as we'll see, every invariant
relation can be built out of atomic ones!
Here's an example in the case we're considering: "your friend's speed
relative to you is s, and your distance of closest approach is d".
This is clearly an invariant relation. It's atomic if we idealize
the situation and assume you and your friend are points - so we
can't ask which way you're facing, whether you're waving at each other,
etc.
To see why it's atomic, note that we can always find a frame of
reference where you're at rest and your friend is moving by at speed s,
passing you at distance d.
If you and your friend are points, the situation is completely
described (up to symmetries) by the relative speed and distance
of closest approach. So, the invariant relation quoted above
can't be written as "P or Q" for other invariant relations.
The same analysis shows that in this example, every atomic invariant
relation is of this form, for some relative speed s ≥ 0 and some
distance of closest approach d ≥ 0.
(Quiz: why don't we need to let s be negative if your friend is moving
to the left?)
From this example, it's clear there are often infinitely many
double cosets. But there are some wonderful examples with just
finitely many double cosets - and these are what I'll focus
on in our Tale.
Here's the simplest one. Suppose we're doing projective plane
geometry. This is a bit like Euclidean plane geometry, but there are
more symmetries: every transformation that preserves lines is allowed.
So, in addition to translations and rotations, we also have other
symmetries.
For example, imagine taking a blackboard with some points and lines
on it, and photographing it from an angle: points map to points and
lines map to lines, but lengths and angles get distorted.
We get even more symmetries if we use a clever trick. Suppose we're
standing on the blackboard, and it extends infinitely like an endless
plain. Points on the horizon aren't really points on the blackboard.
They're called "points at infinity". But, it's nice to
include them as part of the so-called "projective plane".
They make things simpler: now every pair of lines intersects in a
unique point, just as every pair of points lies on a unique line.
You've probably seen how parallel railroad tracks seem to meet at the
horizon - that's what I'm talking about here. And, by including these
extra points at infinity, we get extra symmetries that map points at
infinity to ordinary points, and vice versa.
I gave a more formal introduction to projective geometry in "week106",
"week145", and "week178". If you read these, you'll know that
points in the projective plane correspond to lines through the origin
in a 3d space. And, you'll know a bit about the group of symmetries
in projective geometry: it's the group G = PGL(3), consisting of 3×3
invertible matrices, modulo scalars.
(I actually said SL(3), but I
was being sloppy - this is another group with the same Lie algebra.)
For some great examples of double cosets, let F be the space of
"flags". A "flag" is a very general concept, but
in projective plane geometry a flag is just a point x on a line y.
A flag (x',y') can then be related to a given flag (x,y) in various
ways, depending on how the points and lines sit with respect to each
other.
There's a lot more to say about this, and we'll need to delve into
it much deeper soon... but not yet. For now, I just want to mention
that all this stuff generalizes from G = PGL(3) to any other simple
Lie group! And, the picture above is an example of a general concept,
called an "apartment". Apartments are a great way to visualize
atomic invariant relations between flags.
This "apartment" business is part of a wonderful theory due
to Jacques Tits, called the theory of "buildings". The
space of all flags is a building; a building has lots of apartments
in it. Buildings have a reputation for being scary, because in his
final polished treatment, Tits started with a few rather unintuitive
axioms and derived everything from these. But, they're actually lots
of fun if you draw enough pictures!
Next, let me explain why people call atomic invariant relations
"double cosets".
First of all, what's a relation between two sets X and Y? We can
think of it as a subset S of X × Y: we say a pair (x,y) is in S
if the relation holds.
Next, suppose some group G acts on both X and Y. What's an
"invariant" relation? It's a subset S of X × Y such
that whenever (x,y) is in S, so is (gx,gy). In other words, the
relation is preserved by the symmetries.
Now let's take these simple ideas and make them sound more complicated,
to prove we're mathematicians. Some of you may want to take a little
nap right around now - I'm just trying to make contact with the usual
way experts talk about this stuff.
First, let's use an equivalent but more technical way to think of an
invariant relation: it's a subset of the quotient space G\(X × Y).
Note: often I'd call this quotient space (X × Y)/G. But now I'm
writing it with the G on the left side, since we had a left action
of G on X and Y, hence on X × Y - and in a minute we're gonna need
all the sides we can get!
Second, recall from the previous
episode that if G acts transitively on both
X and Y, we have isomorphisms
X ≅ G/H
and
Y ≅ G/K
for certain subgroups H and K of G. Note: here we're really modding
out by the right action of H or K on G.
Combining these facts, we see that when G acts transitively on both
X and Y, an invariant relation is just a subset of
G\(X × Y) ≅ G\(G/H × G/K)
Finally, if you lock yourself in a cellar and think about this for a
few minutes (or months), you'll realize that this weird-looking set is
isomorphic to
H\G/K
This notation may freak you out at first - I know it scared me!
The point is that we can take G, mod out by the right action of K
to get G/K, and then mod out by the left action of H on G/K, obtaining
H\(G/K).
Or we can take G, mod out by the left action of H to get H\G, and then
mod out by the right action of K on H\G, obtaining
(H\G)/K.
And, these two things are isomorphic! So, we relax and write
H\G/K
A point in here is called a "double coset": it's an equivalence class
consisting of all guys in G of the form
hgk
for some fixed g, where h ranges over H and k ranges over K.
Since subsets of H\G/K are invariant relations, we can think of a
point in H\G/K as an "atomic" invariant relation. Every invariant
relation is the union - the logical "or" - of a bunch of these.
So, just as any hunk of ordinary matter can be broken down into atoms,
every invariant statement you can make about an entity of type X and
an entity of type Y can be broken down into "atomic" invariant
relations - also known as double cosets!
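Here's a minimal Python sketch for an example of my choosing: G = S_3,
with H and K the subgroups generated by the transpositions (0 1) and
(1 2). It computes the double cosets HgK, checks they partition G, and
checks they match the G-orbits on G/H × G/K:

    from itertools import permutations

    G = list(permutations(range(3)))
    mul = lambda a, b: tuple(a[b[i]] for i in range(3))  # (a*b)(i) = a(b(i))

    H = [(0, 1, 2), (1, 0, 2)]   # generated by the transposition (0 1)
    K = [(0, 1, 2), (0, 2, 1)]   # generated by the transposition (1 2)

    # The double coset of g is the set HgK = {hgk}:
    double_cosets = {frozenset(mul(mul(h, g), k) for h in H for k in K)
                     for g in G}
    print(len(double_cosets))                            # 2
    print(sum(len(c) for c in double_cosets) == len(G))  # True: they partition G

    # They match the G-orbits on G/H x G/K:
    coset = lambda g, S: frozenset(mul(g, s) for s in S)
    pairs = {(coset(a, H), coset(b, K)) for a in G for b in G}
    act = lambda g, p: (frozenset(mul(g, x) for x in p[0]),
                        frozenset(mul(g, x) for x in p[1]))
    orbits = {frozenset(act(g, p) for g in G) for p in pairs}
    print(len(orbits))                                   # 2 again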
So, double cosets are cool. But, it's good to fit them into the "spans
of groupoids" perspective. When we do this, we'll see that a double
coset is secretly a special sort of span of groupoids over G.
This relies on the simpler slogan I mentioned last time: actions of G
are secretly the same as groupoids equipped with a faithful functor
to G.
Let's see how it goes. Suppose we have two sets on which G acts
transitively, say X and Y. Pick a figure x of type X, and a figure
y of type Y. Let H be the stabilizer of x, and let K be the
stabilizer of y. Then we get isomorphisms
X ≅ G/H
and
Y ≅ G/K
The subgroup H ∩ K stabilizes both x and y, and
Z = G/(H ∩ K)
is another set on which G acts transitively. How can we think of this
set? It's the set of all pairs of figures, one of type X and one of
type Y, which are obtained by taking the pair (x,y) and applying
an element of G. So, it's a subset of X × Y that's invariant under
the action of G. In other words, it's an invariant relation between
X and Y!
Furthermore, it's the smallest invariant subset of X × Y that contains
the pair (x,y). So, it's an atomic invariant relation - or in other
words, a double coset!
Now, let's see how to get a span of groupoids out of this. We have
a commutative diamond of group inclusions:

       H ∩ K
       /   \
      v     v
     H       K
      \     /
       v   v
         G

Thinking of each group as a one-object groupoid, the top half of this
diamond is a span of groupoids.
But, this span of groupoids is a span "over G", meaning it's
part of a commutative diamond with G at the bottom - the diamond
we just drew.
To tersely summarize what we've learned, let's use the fact that a
groupoid is equivalent to a group precisely when it's "connected":
that is, all its objects are isomorphic. Furthermore, a functor between
connected groupoids is equivalent to an inclusion of groups precisely
when it's "faithful": one-to-one on each homset. So, when I said that:
what I really meant was:
This sounds complicated, but it's mainly because we're trying to toss
in extra conditions to make our concepts match the old-fashioned "double
coset" notion. Here's a simpler, more general fact:
where a "G-set" is a set on which G acts. This is the
natural partner of the slogan I explained in the
last episode, though not in
this language: a groupoid equipped with a faithful functor to G is
the same as a G-set.
Things get even simpler if we drop the "faithfulness" assumption, and
simply work with groupoids over G, and spans of these. This takes
us out of the traditional realm of group actions on sets, and into the
21st century! And that's where we want to go.
Indeed, for the last couple episodes I've just been trying to lay out the
historical context for the Tale of Groupoidification, so experts can
see how the stuff to come relates to stuff that's already known. In
some ways things will get simpler when I stop doing this and march
ahead. But, I'll often be tempted to talk about group actions on
sets, and double cosets, and other traditional gadgets... so I feel
obliged to set the stage.
There is a lot more we could say about this. But, our most urgent
goal is to see how spans of groupoids act like twice categorified
matrices - matrices whose entries are not just numbers, and not
just sets, but groupoids! This will expose the secret
combinatorial underpinnings of a lot of fancy linear algebra. Once
we've categorified linear algebra this way, we'll be in a great
position to tackle fashionable topics like categorified quantum groups,
invariants of higher-dimensional knots, and the like.
But, we should restrain ourselves from charging ahead too fast!
Everything will hang together better if we lay the groundwork properly.
For this, it pays to re-examine the history of mathematics a bit.
If we're trying to understand linear algebra using groupoids, it
pays to ask: how did people connect linear algebra and group theory
in the first place?
This book is very helpful:
9) Charles W. Curtis, Pioneers of Representation Theory: Frobenius,
Burnside, Schur and Brauer, History of Mathematics vol. 15, AMS,
Providence, Rhode Island, 1999.
Back in 1897, a mathematician named William Burnside wrote the first
book in English on finite groups. It was called Theory of Groups of
Finite Order.
In the preface, Burnside explained why he studied finite groups by
letting them act as permutations of sets, but not as linear
transformations of vector spaces:
In short, he didn't see the point of representing groups on vector
spaces - at least as a tool in the "pure" theory of finite
groups, as opposed to their applications.
However, within months after this book was published, he discovered
the work of Georg Frobenius, who used linear algebra very effectively to
study groups!
So, Burnside started using linear algebra in his own work on
finite groups, and by the time he wrote the second edition of his
book in 1911, he'd changed his tune completely:
Why is linear algebra so useful in the study of finite groups?
I don't really know the full answer to this question. But, it quickly
leads us into the fascinating theory of "Hecke operators", which will
play a big role in the Tale of Groupoidification. So, let's pursue it
a bit.
Suppose we have two guys, William and Georg, who are studying a finite
group G.
William says, "I'm studying how G acts on sets."
Georg replies, "I'm studying how it acts on complex vector spaces,
as linear transformations. Mere sets are feeble entities. I can do
anything you can do - but I have the tools of linear algebra to help me!"
William says, "But, you're paying a moral price. You're getting the
complex numbers - a complicated infinite thing - involved in what
should be a completely finitary and combinatorial subject: the study
of a finite group. Is this really necessary?"
Georg replies, "I don't know, but it's sure nice. For example,
suppose I have G acting on a vector space V. Then I can always
break down V into a direct sum of subspaces preserved by G, which
can't themselves be broken down any further. In technical terms:
every representation of G is a direct sum of irreducible representations.
And, this decomposition is unique! It's very nice: it's a lot like
how every positive integer has a unique prime factorization."
William says, "Yes, it's always fun when we can break things down
into 'atoms' that can't be further decomposed. It's very satisfying
to our reductionist instincts. But, I can do even better than you!"
Georg raises an eyebrow. "Oh?"
"Yeah," William says. "Suppose I have our group G acting on a set S.
Then I can always break down S into a disjoint union of subsets
preserved by G, which can't themselves be broken down any further.
In technical terms: every action of G is a disjoint union of transitive
actions. And, this decomposition is unique!"
Embarrassed, Georg replies, "Oh, right - we heard that back in
Episode 3. I wasn't paying enough attention. But how is what you're
doing better than what I'm doing? It sounds just the same."
William hesitates. "Well, first of all, a rather minor point, which I can't
resist mentioning ... when you said your decomposition of representations
into irreducibles was unique, you were exaggerating a little. It's just
unique up to isomorphism, and not a canonical isomorphism either.
For example, if you have an irreducible representation of G on V,
there are lots of different ways to slice the direct sum V + V
into two copies of the representation V. It's a sort of floppy
business. On the other hand, when I have a transitive action of G on S,
there's exactly one way to chop the disjoint union S + S into two
copies of the G-set S. I just look at the two orbits."
Georg says, "Hmm. This is actually a rather delicate point.
There's not really a canonical isomorphism in your case either,
since S may be isomorphic to itself in lots of ways, even as a G-set.
There's something in what you say, but it's a bit hard to make
precise, and it's certainly not enough to worry me."
"Okay, but here's my more serious point. Given that
I can break apart any set on which G acts into 'atoms', my job is
to classify those atoms: the transitive G-sets. And there's a very
nice classification! Any subgroup H of G gives a transitive G-set,
namely G/H, and all transitive G-sets look like this. More precisely:
isomorphism classes of transitive G-sets correspond to conjugacy
classes of subgroups of G.
Even better, this has a nice meaning in terms of Klein geometry.
Any type of figure in Klein geometry gives a transitive G-set,
with H being the stabilizer of a chosen figure of that type.
You, on the other hand, need to classify irreducible representations
of G. And this is not so conceptual. What do these irreducible
representations mean in terms of the group G?"
Georg replies, "Well, there's one irreducible representation for
each conjugacy class in G..."
At this, William pounces. "You mean the number of isomorphism
classes of irreducible representations of G equals the number
of conjugacy classes in G! But as you know full well, there's no
god-given correspondence. You can't just take a conjugacy class in
G and cook up an irreducible representation, or irrep. So, you've
just made my point. You've shown how mysterious these irreps actually
are!"
Georg replies, "Actually in some cases there is a nice way to
read off irreps from conjugacy classes. For example, you can
do it for the symmetric groups S_n.
But, I admit you can't in general... or at least, I don't know how."
William laughs, "So, I win!"
"Not at all!" replies Georg. "First, there are lots of theorems
about groups that I can prove using representations, which you can't
prove using actions on sets. For example, nobody knows how to prove
that every group
with an odd number of elements is solvable without using the tools of linear algebra."
William nods. "I admit that linear algebra is very practical.
But just give us time! I proved back in 1904 that every group of size
p^a q^b is solvable if p and q are prime. To do
it, I broke down and used linear algebra. But then, in 1972, Helmut
Bender found a proof that avoids linear algebra."
Georg said, "Okay, struggle on then. So far, without using linear
algebra, nobody can even prove my famous theorem on 'Frobenius groups'.
The details don't matter here: the point is, this is a result on
group actions, which seems to need linear algebra for its proof.
But if practicality won't sway you, maybe this conceptual argument
will. My atoms are more fine-grained than yours!"
"What do you mean?" asks William.
"You can decompose any action of G into 'atoms', namely transitive
G-sets. Similarly, I can decompose any representation of G into
one of my 'atoms', namely irreps. But, there's an obvious way to
turn G-sets into representations of G, and if we do this to one of
your atoms, we don't get one of my atoms! We can usually break it down
further! So, my atoms are smaller than yours."
"How does this work, exactly?"
"It's easy," says Georg, getting a bit cocky. "Say you
have a group G acting on a set S. Then I can form the vector space
C[S] whose elements are formal linear combinations of elements of S.
In other words, it's the vector space having S as basis. If we're
feeling sloppy we can think of guys in C[S] as functions on S that
vanish except at finitely many points. It's better to think of them
as measures on S. But anyway: since G acts on S, it acts linearly on
C[S]!
So, any G-set gives a representation of G. But, even when G acts
transitively on S, its representation on C[S] is hardly ever irreducible."
William glowers. "Oh yeah?"
"Yeah. Suppose for example that S is finite. Then the constant functions
on S form a 1-dimensional subspace of C[S] that's invariant under the
action of G. So, at the very least, we can break C[S] into two pieces."
"Well," replies William defensively, "That's pretty obvious. But it's
also not such a big deal. So you can break up any of my transitive
G-sets
into two of your irreps, one being the 'trivial' irrep. So what???"
"It wouldn't be a big deal if that's all that ever
happened," says Georg. "In fact, we can break C[S] into
precisely two irreps whenever the action of G on S is 'doubly transitive'
- meaning we can send any pair of distinct elements of S to any
other using some element of G. But, there lots of transitive actions
aren't doubly transitive! And usually, one your atoms breaks down
into a bunch of my atoms. In fact I'd like to show you how this
works, in detail, for the symmetric groups."
"Maybe next time," says William. "But, I see your
point. Your atoms are more atomic than my atoms."
Georg seems to have won the argument. But, William wouldn't have
conceded the point quite so fast, if he'd thought about invariant
relations!
The point is this. Suppose we have two G-sets, say X and Y. Any
G-set map from X to Y gives an intertwining operator from C[X] to C[Y].
But, even after taking linear combinations, there are typically
plenty of intertwining operators that don't arise this way. It's
these extra intertwining operators that let us chop representations
into smaller atoms.
But where do these extra intertwining operators come from? They come
from invariant relations between X and Y!
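Here's a minimal Python sketch with an example of my choosing: G = S_3
acting on X = Y = {0,1,2}, and the invariant relation "x ≠ y". Summing
over related elements gives an operator on C[X] that commutes with all
the permutation matrices - but it's not the matrix of any map of
G-sets, since its columns have two 1's instead of one:

    from itertools import permutations

    G = list(permutations(range(3)))
    X = range(3)
    S = [(x, y) for x in X for y in X if x != y]   # an invariant relation

    # The operator of S: (Tf)(y) = sum of f(x) over all (x, y) in S
    T = [[1 if (j, i) in S else 0 for j in X] for i in X]

    perm_matrix = lambda g: [[1 if g[j] == i else 0 for j in X] for i in X]
    matmul = lambda A, B: [[sum(A[i][k] * B[k][j] for k in X)
                            for j in X] for i in X]

    # T is an intertwining operator: it commutes with the G-action.
    print(all(matmul(T, perm_matrix(g)) == matmul(perm_matrix(g), T)
              for g in G))   # True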
And, what are these extra intertwining operators called? In some
famous special cases, like in the study of modular forms, they're
called "Hecke operators". In some other famous special cases, like
in the study of symmetric groups, they form algebras called "Hecke
algebras".
A lot of people don't even know that
Hecke operators and Hecke algebras are two faces of the same idea:
getting intertwining operators from invariant relations.
But, we'll see this is true, once we look at some examples.
I think I'll save those for future episodes. But if you've followed
the Tale so far, you can probably stand a few extra hints of where
we're going. Recall from Episode 4 that invariant relations between
G-sets are just spans of groupoids equipped with some extra stuff.
So, invariant relations between G-sets are just a warmup for the more
general, and simpler, theory of spans of groupoids. I said back in
Episode 2 that spans of
groupoids give linear operators. What I'm trying to say now is that
these linear operators are a massive generalization - but also a
simplification - of what people call "Hecke operators".
Finally, for students trying to learn a little basic category theory, I'd
like to cast the argument between William and Georg into the language of
categories, just to help you practice your vocabulary.
A G-set is the same as a functor
A: G → Set
where we think of G as a 1-object category. There's a category of
G-sets, namely
hom(G,Set)
This has functors A: G → Set as objects, and natural transformations
between these as morphisms. Usually the objects are called "G-sets",
and the morphisms are called "maps of G-sets".
We can also play this whole game with the category of vector spaces
replacing the category of sets. A representation of G is the same as
a functor
A: G → Vect
As before, there's a category of such things, namely
hom(G,Vect)
This has functors A: G → Vect as objects, and natural transformations
between these as morphisms. Now the objects are called "representations
of G" and the morphisms are called "intertwining operators".
We could let any groupoid take the place of the group G. We could also
let any other category take the place of Set or Vect.
Part of what William and Georg were debating was: how similar are
hom(G,Set) and hom(G,Vect)? How are they related?
First of all, there's a functor
F: Set → Vect
sending each set S to the vector space C[S] with that set as basis. So,
given an action of G on a set:
A: G → Set
we can compose it with F and get a representation of G:
FA: G → Vect
This kind of representation is called a "permutation representation".
And, this trick gives a functor from G-sets to representations of G:

hom(G,Set) → hom(G,Vect)

But, not every representation of G is isomorphic to a permutation
representation! In other words, this functor
is not "essentially surjective".
Moreover, not every intertwining operator between permutation
representations comes from a map between their underlying G-sets!
In other words, the functor
hom(G,Set) → hom(G,Vect)
is not "full".
But, given two different maps from one G-set to another, they give
different intertwining operators. So, at least our functor is "faithful".
Maps of G-sets are a special case of invariant relations. So, to get a
category that more closely resembles hom(G,Vect), while remaining purely
combinatorial, we can replace hom(G,Set) by the category with G-sets as
objects and invariant binary relations as morphisms. This is the basic
idea of "Hecke operators".
Or, even better, we can try a weak 2-category, with G-sets as objects,
spans of G-sets as morphisms, and maps of spans as 2-morphisms.
This is where groupoidification comes into its own.
I want to work my way to the concept of "Hecke operator"
through a series of examples. The examples I'll use are a bit
trickier than the concept I'm really interested in, since the
examples involve integrals, where the Hecke operators I ultimately
want to discuss involve sums. But, the examples are nice if
you like to visualize stuff...
In these examples we'll always have a relation between two sets
X and Y. We'll use this to get an operator that turns functions
on X into functions on Y - a "Hecke operator".
Suppose you're trying to do a CAT scan. You want to obtain
a 3d image of someone's innards. Unfortunately, all you can do is
take lots of 2d X-ray photos of them. How can you assemble all
this information into the picture you want?
Who better to help you out than a guy named after a
radioactive gas: Radon!
In 1917, the Viennese mathematician Johann Radon tackled a related
problem one dimension down. You could call it a "CAT scan for
flatlanders".
Suppose you want to obtain a complete image of the insides of a
2-dimensional person, but all you can do is shine beams of X-rays
through them and see how much each beam is attenuated.
So, mathematically: you have a real-valued function on the plane -
roughly speaking, the density of your flatlander. You're trying
to recover this function from its integrals along all possible
lines. Someone hands you this function on the space of lines,
and you're trying to figure out the original function on the space
of points.
(Points lying on lines! If you've been following the Tale of
Groupoidification, you'll know this "incidence relation" is
connected to Klein's approach to geometry, and ultimately to
spans of groupoids. But pretend you don't notice, yet.)
Now, it's premature to worry about this tricky "inverse problem"
before we ponder what it's the inverse of: the "Radon transform".
This takes our original function on the space of points and
gives a function on the space of lines.
Let's call the Radon transform T. It takes a function f on the
space of points and gives a function Tf on the space of lines,
as follows. Given a line y, (Tf)(y) is the integral of f(x) over
the set of all points x lying on y.
What Radon did is figure out a nice formula for the inverse of
this transform. But that's not what I'm mainly interested in now.
It's the Radon transform itself that's a kind of Hecke operator!
Next, look at another example.
This is an obvious generalization to higher dimensions of what
I just described. Before we had a space
X = {points in the plane}
and a space
Y = {lines in the plane}
and an incidence relation
S = {(x,y): x is a point lying on the line y}
If we go to n dimensions, we can replace all this with
X = {points in R^n}
Y = {lines in R^n}
S = {(x,y): x is a point lying on the line y}
Again, the X-ray transform takes a function f on the space of
points and gives a function Tf on the space of lines. Given a
line y, (Tf)(y) is the integral of f(x) over the set of all x
with (x,y) in S.
Next, yet another example!
Radon actually considered a different generalization of the
2d Radon transform, using hyperplanes instead of lines. Using
hyperplanes is nicer, because it gives a very simple relationship
between the Radon transform and the Fourier transform. But never
mind - that's not the point here! The point is how similar
everything is. Now we take:
X = {points in R^n}
Y = {hyperplanes in R^n}
S = {(x,y): x is a point lying on the hyperplane y}
And again, the Radon transform takes a function f on X
and gives a function Tf on Y. Given y in Y, (Tf)(y) is the
integral of f(x) over the set of all x with (x,y) in S.
We're always doing the same thing here. Now I'll describe
the general pattern a bit more abstractly.
We've always got three spaces, and maps that look like this:

        S
       / \
      P   Q
     v     v
    X       Y

In the examples so far, S is a set of pairs (x,y), and P and Q are
just the projections:

P(x,y) = x
Q(x,y) = y

But, they don't need to be.
Now, how do we get a linear operator in this situation?
Easy! We start with a real-valued function on our space X:
f: X → R
Then we take f and "pull it back along P" to get a function on S.
"Pulling back along P" is just impressive jargon for composing
with P:
f o P: S → R
Next, we take f o P and "push it forwards along Q" to get a
function on Y. The result is our final answer, some function
Tf: Y → R
"Pushing forwards along Q" is just impressive jargon for
integrating: Tf(y) is the integral over all s in S with Q(s) = y.
For this we need a suitable measure, and we need the integral
to converge.
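When X, Y and S are finite sets, the whole recipe fits in a few lines
of Python. Here's a minimal sketch, with a toy incidence span of my
choosing - the 3 points and 3 lines of a triangle:

    def hecke_operator(S, P, Q, Y):
        # Pull back along P, then push forward (sum) along Q:
        # (Tf)(y) = sum of f(P(s)) over all s in S with Q(s) = y
        return lambda f: {y: sum(f(P(s)) for s in S if Q(s) == y)
                          for y in Y}

    X = ["p1", "p2", "p3"]       # points
    Y = ["l12", "l13", "l23"]    # lines
    S = [(p, l) for p in X for l in Y if p[1] in l[1:]]   # incident pairs
    T = hecke_operator(S, lambda s: s[0], lambda s: s[1], Y)

    f = {"p1": 1.0, "p2": 2.0, "p3": 3.0}
    print(T(f.get))   # {'l12': 3.0, 'l13': 4.0, 'l23': 5.0}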
This is the basic idea: we define an operator T by pulling
back and then pushing forward functions along a "span", meaning
a diagram shaped like a bridge:

        S
       / \
      P   Q
     v     v
    X       Y

Now suppose some group G of symmetries acts on everything in sight.
This group G acts on all 3 spaces: X, Y, and S. This makes the
space of functions on X into a representation of G! And, ditto
for the spaces of functions on Y and S.
Furthermore, the maps P and Q are "equivariant", meaning
P(gs) = gP(s)
and
Q(gs) = gQ(s)
This makes "pulling back along P" into an intertwining operator
between representations of G. "Pushing forwards along Q" will
also be an intertwining operator if the measure we use is
G-invariant in a suitable sense. In this case, our transform T
becomes an intertwining operator between group representations!
Let's call an intertwining operator constructed this way a "Hecke
operator".
If you're a nitpicky person, e.g. a mathematician, you may wonder what
I mean by "a suitable sense". Well, each "fiber"
Q-1(y) of the map
Q: S → Y
needs a measure on it, so we can take a function on S and
integrate it over these fibers to get a function on Y. We need
this family of measures to be invariant under the action of G,
for pushing forwards along Q to be an intertwining operator.
This stuff about invariant families of measures is mildly
annoying, and so is the analysis involved in making precise
which class of functions on X we can pull back to S and then
push forward to Y - we need to make sure the integrals converge,
and so on. When I really get rolling on this Hecke operator
business, I'll often focus on cases where X, Y, and S are
finite sets... and then these issues go away!
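In fact, when everything is finite it's easy to check the
equivariance directly. Here's a toy example in Python: X = Y = Z/4,
S relates each point to its two neighbors, and G = Z/4 acts by
rotation, so T should commute with the action of G on functions.
All the particular choices here are made up for illustration:

n = 4
X = range(n)
S = [(x, (x + 1) % n) for x in X] + [(x, (x - 1) % n) for x in X]

def T(f):
    # sum f over the fiber of the right leg above each y
    return [sum(f[x] for (x, y0) in S if y0 == y) for y in X]

def act(g, f):
    # (g.f)(x) = f(g^{-1} x), with g in Z/4 acting by rotation
    return [f[(x - g) % n] for x in X]

f = [1.0, 2.0, 0.0, 5.0]
for g in X:
    assert T(act(g, f)) == act(g, T(f))   # T is an intertwining operator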
Hmm. I'm getting tired, but I can't quit until I say one more
thing. If you try to read about Hecke operators, you won't
see anything about the examples I just mentioned! You're most
likely to see examples where X and Y are spaces of lattices in
the complex plane. This is the classic example, which we're
trying to generalize. But, this example is more sophisticated
than the ones I've mentioned, in that the "functions" on X and
Y become "sections of vector bundles" over X and Y. The same
sort of twist happens when we go from the Radon transform to the
more general "Penrose transform".
Anyway, next time I'll talk about some really easy examples,
where X, Y, and S are finite sets and G is a finite group.
These give certain algebras of Hecke operators, called "Hecke
algebras".
In the meantime, see if you can find any reference in the
literature which admits that "Hecke algebras" are related
to "Hecke operators". It ain't easy!
It's a great example of a mathematical cover-up - one we're gonna
bust wide open.
Episode 7

When I left off, I was about to discuss an example: Hecke operators
in the special case of symmetric groups. But, one reader expressed
unease with what I'd done so far, saying it was too informal and
hand-wavy to understand.

So, this time I'll fill in some details about
"degroupoidification" - the process that sends groupoids to
vector spaces and spans of groupoids to linear operators.
How does this work? For starters, each groupoid X gives a vector
space [X] whose basis consists of isomorphism classes of objects
of X.
Given an object x in X, let's write its isomorphism class
as [x]. So: x in X gives [x] in [X].
Next, each span of groupoids

       S
      / \
     /   \
    /     \
   v       v
  X         Y

gives a linear operator
[S]: [X] → [Y]
Note: this operator [S] depends on the whole span, not just the
groupoid S sitting on top. So, I'm abusing notation here.
More importantly: how do we get this operator [S]? The recipe is
simple, but I think you'll profit much more by seeing where the
recipe comes from.
To figure out how it should work, we insist that degroupoidification
be something like a functor. In other words, it should get along
well with composition:
[TS] = [T] [S]
and identities:
[1_X] = 1_[X]
(Warning: today, just to confuse you, I'll write composition in
the old-fashioned backwards way, where doing S and then T is
denoted TS.)
How do we compose spans of groupoids? We do it using a "weak
pullback". In other words, given a composable pair of spans:

        S           T
       / \         / \
     f/   \g     h/   \i
     /     \     /     \
    v       v   v       v
   X         Y           Z

we form the weak pullback in the middle, like this:

            TS
           /  \
         j/    \k
         /      \
        v        v
        S         T
       / \       / \
     f/   \g   h/   \i
     /     \   /     \
    v       v v       v
   X         Y         Z

Then, we compose the arrows along the sides to get a big span
from X to Z:

          TS
         /  \
        /    \
       /      \
    fj/        \ik
     /          \
    v            v
   X              Z

Never heard of "weak pullbacks"? Okay: I'll tell you what an
object in the weak pullback TS is. It's an object t in T and an
object s in S, together with an isomorphism between their images
in Y. If we were doing the ordinary pullback, we'd demand that
these images be equal. But that would be evil! Since t and s are
living in groupoids, we should only demand that their images be
isomorphic in a specified way.
(Exercise: figure out the morphisms in the weak pullback. Figure
out and prove the universal property of the weak pullback.)
So, how should we take a span of groupoids

       S
      / \
     /   \
    /     \
   v       v
  X         Y

and turn it into a linear operator
[S]: [X] → [Y] ?
We just need to know what this operator does to a bunch of
vectors in [X]. How do we describe vectors in [X]?
I already said how to get a basis vector [x] in [X] from any object
x in X. But, that's not enough for what we're doing now, since a
linear operator doesn't usually send basis vectors to basis
vectors. So, we need to generalize this idea.
An object x in X is the same as a functor from 1 to X:

  1
  |
  |p
  |
  v
  X

where 1 is the groupoid with one object and one morphism. So,
let's generalize this and figure out how any functor from any
finite groupoid V to X:

  V
  |
  |p
  |
  v
  X

picks out a vector in [X]. Again, by abuse of notation we'll
call this vector [V], even though it also depends on p.
First suppose V is a finite set, thought of as a groupoid with
only identity morphisms. Then to define [V], we just go through
all the points of V, figure out what p maps them to - some bunch
of objects x in X - and add up the corresponding basis vectors
[x] in [X].
I hope you see how pathetically simple this idea is! It's especially
familiar when V and X are both sets. Here's what it looks like then:

      ------------------
      |    o            |
  V   |    o  o         |
      |    o  o  o     o|
      | o  o  o  o     o|
      ------------------
             |
             |p
             v
      ------------------
  X   | o  o  o  o  o  o|
      ------------------

I've drawn the elements of V and X as little circles, and shown how
each element in X has a bunch of elements of V sitting over it. When
we degroupoidify this to get a vector in the vector space [X], we get:

[V] = (1, 4, 3, 2, 0, 2)

This vector is just a list of numbers saying how many points of V
are sitting over each point of X!

Now we just need to generalize a bit further, to cover the case
where V is a groupoid:

  V
  |
  |p
  |
  v
  X

Sitting over each object x in X we have its "essential preimage",
which is a groupoid. To get the vector [V], we add up basis
vectors [x] in [X], one for each isomorphism class of objects
in X, multiplied by the "cardinalities" of their essential preimages.
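By the way, if you'd like to see the set case above in runnable
form, here's a Python sketch - the function p is made up to match
the picture:

X = range(6)
V = range(12)
p = {0: 0,
     1: 1, 2: 1, 3: 1, 4: 1,
     5: 2, 6: 2, 7: 2,
     8: 3, 9: 3,
     10: 5, 11: 5}

# the vector [V] just counts the preimages of each point of X
vector = [sum(1 for v in V if p[v] == x) for x in X]
print(vector)   # [1, 4, 3, 2, 0, 2]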
Now you probably have two questions:
A) Given a functor p: V → X between groupoids and an object
x in X, what's the "essential preimage" of x?
and
B) what's the "cardinality" of a groupoid?
Here are the answers:
A) An object in the essential preimage of x is an object
v in V equipped with an isomorphism from p(v) to x.
(Exercise: define the morphisms in the essential preimage.
Figure out and prove the universal property of the essential
preimage. Hint: the essential preimage is a special case of a
weak pullback!)
B) To compute the cardinality of a groupoid, we pick one object
from each isomorphism class, count its automorphisms, take the
reciprocal of this number, and add these numbers up.
(Exercise: check that the cardinality of the groupoid of finite sets
is e = 2.718281828... If you get stuck, read "week147".)
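Here's a quick numerical check of this exercise in Python: the
groupoid of finite sets has one isomorphism class for each n, and
the n-element set has n! automorphisms, so the cardinality is a sum
of reciprocals of factorials:

import math

# cardinality of the groupoid of finite sets, truncated at n = 19:
# one isomorphism class per n, each contributing 1/n!
cardinality = sum(1 / math.factorial(n) for n in range(20))
print(cardinality)   # 2.718281828...
print(math.e)        # matches to many decimal places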
Okay. Now in principle you know how any groupoid over X, say

  V
  |
  |p
  |
  v
  X

determines a vector [V] in [X]. You have to work some examples
to get a feel for it, but I want to get to the punchline. We're
unpeeling an onion here, and we're almost down to the core, where
you see there's nothing inside and wonder why you were crying so much.

So, let's finally figure out how a span of groupoids

       S
      / \
     /   \
    /     \
   v       v
  X         Y

gives a linear operator
[S]: [X] → [Y]
It's enough to know what this operator does to vectors
coming from groupoids over X:

  V
  |
  |p
  |
  v
  X

And, the trick is to notice that such a diagram is the same as
a silly span like this:

       V
      / \
     /   \
    /     \
   v       v
  1         X

1 is the groupoid with one object and one morphism, so there's
only one choice of the left leg here!

So here's what we do. To apply the operator [S] coming from
the span

       S
      / \
     /   \
    /     \
   v       v
  X         Y

to the vector [V] corresponding to the silly span

       V
      / \
     /   \
    /     \
   v       v
  1         X

we just compose these spans, and get a silly span

       SV
      /  \
     /    \
    /      \
   v        v
  1          Y

which picks out a vector [SV] in [Y]. Then, we define
[S] [V] = [SV]
Slick, eh? Of course you need to check that [S] is well-defined.
Given that, it's trivial to prove that [-] gets along with
composition of spans:
[TS] = [T] [S]
At least, it's trivial once you know that composition of spans is
associative up to equivalence, and equivalent spans give the same
operator! But your friendly neighborhood category theorist can check
such facts in a jiffy, so let's just take them for granted. Then
the proof goes like this. We have:

[TS] [V] = [(TS)V]     by definition
         = [T(SV)]     by those facts I just mentioned
         = [T] [SV]    by definition
         = [T] [S] [V] by definition

Since this is true for all [V], we conclude
[TS] = [T] [S]
Voilà!
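For the skeptical, here's the baby case in Python: spans of finite
sets, where the weak pullback is the ordinary pullback
{(s,t) : g(s) = h(t)} and the operator of a span is a matrix counting
witnesses. The particular spans below are made up just to have
something to compute:

def matrix(S, left, right, X, Y):
    # entry (y, x) counts the elements s of S with left(s) = x
    # and right(s) = y
    return [[sum(1 for s in S if left(s) == x and right(s) == y)
             for x in X] for y in Y]

X, Y, Z = [0, 1], [0, 1, 2], [0, 1]
S = [(x, y) for x in X for y in Y if y <= x + 1]        # a span X -/-> Y
T = [(y, z) for y in Y for z in Z if (y + z) % 2 == 0]  # a span Y -/-> Z
TS = [(s, t) for s in S for t in T if s[1] == t[0]]     # pullback over Y

A = matrix(S, lambda s: s[0], lambda s: s[1], X, Y)
B = matrix(T, lambda t: t[0], lambda t: t[1], Y, Z)
C = matrix(TS, lambda p: p[0][0], lambda p: p[1][1], X, Z)

# the matrix of the composite span is the product of the matrices:
for i in range(len(Z)):
    for j in range(len(X)):
        assert C[i][j] == sum(B[i][k] * A[k][j] for k in range(len(Y)))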
By the way, if "week147"
doesn't satisfy your hunger for information on groupoid cardinality,
try this:
19) John Baez and James Dolan, From finite sets to Feynman diagrams, in
Mathematics Unlimited - 2001 and Beyond, vol. 1, eds. Bjorn Engquist
and Wilfried Schmid, Springer, Berlin, 2001, pp. 29-50. Also
available as math.QA/0004133.
For more on turning spans of groupoids into linear operators, and
composing spans via weak pullback, try these:
20) Jeffrey Morton, Categorified algebra and quantum mechanics,
TAC 16 (2006), 785-854. Available at
http://www.emis.de/journals/TAC/volumes/16/29/16-29abs.html;
also available as math.QA/0601458.
21) Simon Byrne, On Groupoids and Stuff, honors thesis, Macquarie
University, 2005, available at
http://www.maths.mq.edu.au/~street/ByrneHons.pdf and
http://math.ucr.edu/home/baez/qg-spring2004/ByrneHons.pdf
For a more leisurely exposition, with a big emphasis on applications
to combinatorics and the quantum mechanics of the harmonic oscillator,
try:
22) John Baez and Derek Wise, Quantization and Categorification,
lecture notes available at:
http://math.ucr.edu/home/baez/qg-fall2003/
http://math.ucr.edu/home/baez/qg-winter2004/
http://math.ucr.edu/home/baez/qg-spring2004/
Finally, a technical note. Why did I say the degroupoidification
process was "something like" a functor? It's because spans of
groupoids don't want to be a category!
Already spans of sets don't naturally form a category. They form a
weak 2-category! Since pullbacks are only defined up to canonical
isomorphism, composition of spans of sets is only associative
up to isomorphism... but luckily, this "associator" isomorphism
satisfies the "pentagon identity" and all that jazz, so we get a
weak 2-category, or bicategory.
Similarly, spans of groupoids form a weak 3-category. Weak pullbacks
are only defined up to canonical equivalence, so composition of spans
of groupoids is only associative up to equivalence... but luckily this
"associator" equivalence satisfies the pentagon identity up to an
isomorphism, and this "pentagonator" isomorphism satisfies a
coherence law of its own, governed by the 3d Stasheff polytope.
So, we're fairly high in the ladder of n-categories. But, if we
want a mere category, we can take groupoids and equivalence classes
of spans. Then, degroupoidification gives a functor
[-]: [finite groupoids, spans] → [vector spaces, linear maps]
That's the fact whose proof I tried to sketch here.
While I'm talking about annoying technicalities, note we need
some sort of finiteness assumption on our spans of groupoids
to be sure all the necessary sums converge. If we go all-out
and restrict to spans where all groupoids involved are finite,
we'll be very safe. The cardinality of a finite groupoid is a
nonnegative rational number, so we can take our vector spaces to
be defined over the rational numbers.
But, it's also fun to consider "tame" groupoids, as defined in that
paper I wrote with Jim Dolan. These have cardinalities that can
be irrational numbers, like e. So, in this case we should use
vector spaces over the real numbers - or complex numbers, but that's
overkill.
Finding a class of groupoids or other entities whose cardinalities
are complex would be very nice, to push the whole groupoidification
program further into the complex world. In the above paper by
Jeff Morton, he uses sets over U(1), but that's probably not the
last word.
The Tale of Groupoidification – Episode 1
0 0
1 1
(The relation to quantum mechanics here is just a vague hint of
themes to come. I've deliberately picked an example where the linear
operator is not unitary.)
1 0 0 0
1 1 0 0
1 0 1 0
1 1 0 1
where 1 means "is a divisor of" and 0 means "is not a
divisor of".
       S
      / \
     /   \
   F/     \G
   /       \
  v         v
 X           Y
This is my wretched ASCII attempt to draw two arrows coming down from
the set S to the sets X and Y. It's supposed to look like a bridge -
hence the term "span".
       S
      / \
     /   \
   F/     \G
   /       \
  v         v
 X           Y
we get an X × Y matrix whose (x,y) entry is the number of Russians -
I mean elements s of S - such that F(s) = x and G(s) = y.
Let X be the set of people on Earth. Let T be the X × X matrix
corresponding to the relation "is the father of". Why does
the matrix T^2 correspond to the relation "is the paternal
grandfather of"? Let S correspond to the relation "is a friend
of". Why doesn't the matrix S^2 correspond to the relation
"is a friend of a friend of"? What span does this matrix
correspond to?
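If the exercise has you stuck, here's a numerical hint in Python:
squaring a 0/1 "friendship" matrix gives entries bigger than 1,
counting the witnesses - the mutual friends - so S^2 is the matrix
of a span, not of a mere relation. (The little friendship matrix
here is made up, of course.)

S = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]    # three people, each friends with the other two
n = len(S)

S2 = [[sum(S[i][k] * S[k][j] for k in range(n)) for j in range(n)]
      for i in range(n)]
print(S2)   # [[2, 1, 1], [1, 2, 1], [1, 1, 2]]

Each diagonal entry is 2: you're a friend of a friend of yourself
in two different ways, once through each of your friends.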
       S
      / \
     /   \
   F/     \G
   /       \
  v         v
 X           Y
where X, Y, S are groupoids and F, G are functors. If all the groupoids
are finite, we can turn this span into a finite-sized matrix of nonnegative
rational numbers, by copying what we did for spans of finite sets.
sets → categories
functions → functors
vector spaces → groupoids
linear operators → spans of groupoids
Episode 2
       S
      / \
     /   \
   F/     \G
   /       \
  v         v
 X           Y
Simple! But the important thing is to understand this thing as a
"witnessed relation".
       S
      / \
     /   \
   F/     \G
   /       \
  v         v
 X           Y
we can take X to be the set of points and Y to be the set of lines.
       S
      / \
     /   \
   F/     \G
   /       \
  v         v
 X           Y
X, Y and S will be groupoids, while F and G will be maps between
groupoids: that is, "functors"!
       S
      / \
     /   \
   F/     \G
   /       \
  v         v
 X           Y
and the object s is a witness to the fact that x and y are related:
  o----------o----------o
points     lines     planes
give rise to n-dimensional projective geometry. I may have to review
this stuff, but first I'll probably say a bit about the theory of
group representations and Hecke algebras.
Episode 3
THE SAME AS A GROUP ACTION
X//G
|
|p
|
v
G
sending any object of X//G to the one object of G, and any morphism
g: x → y to the corresponding element of G. Remember, a group is
a groupoid with one object: this is the 21st century!
THE SAME AS A GROUP ACTION
X//G
|
|p
|
v
G
is secretly equivalent to the inclusion
H
|
|i
|
v
G
of the stabilizer group of this element.
Y
|
|
|
v
G
Then the action is transitive if all the objects of Y are isomorphic.
In that case, our functor is equivalent to an inclusion
H
|
|
|
v
G
and we're back down to Klein's approach to geometry. But, it's actually
good to generalize what Klein did, and think about arbitrary "groupoids
over G" - that is, groupoids equipped with functors to G.
       S
      / \
     /   \
    /     \
   /       \
  v         v
 X           Y
  \         /
   \       /
    \     /
     \   /
      v v
       G
There's much more to say about this, but not today!
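Here's a quick finite check, in Python, of the orbit-stabilizer
relationship behind all this, with G the group of permutations of
3 points standing in for Klein's symmetry groups:

from itertools import permutations

G = list(permutations(range(3)))   # the 6 permutations of {0,1,2}
x = 0
H = [g for g in G if g[x] == x]    # the stabilizer of the point x
orbit = {g[x] for g in G}          # the action is transitive

assert orbit == {0, 1, 2}
assert len(G) == len(H) * len(orbit)   # |G| = |H| |G/H|

So a transitive G-set really is a copy of G/H, just as Klein's
approach to geometry suggests.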
X//G
|
|p
|
v
G
is equivalent to
H
|
|i
|
v
G
I could equally well have said it's equivalent to
H
|
|i'
|
v
G
where the inclusion i' is the inclusion i conjugated by g. If you
know some category theory, you'll see that i and i' are naturally
isomorphic: a natural isomorphism between functors between groups
is just a "conjugation". Picking the specific inclusion i requires
picking a specific element x of X.
Episode 4
I want to explain double cosets as spans
of groupoids... but it's best if I start with some special relativity.
As Galileo famously put it:

Shut yourself up with some friend in the main cabin below decks on
some large ship, and have with you there some flies, butterflies,
and other small flying animals. Have a large bowl of water with some
fish in it; hang up a bottle that empties drop by drop into a wide
vessel beneath it. With the ship standing still, observe carefully how
the little animals fly with equal speed to all sides of the cabin. The
fish swim indifferently in all directions; the drops fall into the
vessel beneath; and, in throwing something to your friend, you need
throw it no more strongly in one direction than another, the distances
being equal; jumping with your feet together, you pass equal spaces
in every direction.
As a result, the coordinate transformation we use in Newtonian
mechanics to switch from one reference frame to another moving at
a constant velocity relative to the first is called a "Galilei
transformation". For example, if the second frame moves at velocity
v in the x direction relative to the first:

t' = t
x' = x - vt
y' = y
z' = z

As Maxwell wrote, long before special relativity:
Our whole progress up to this point may be described as a gradual
development of the doctrine of relativity of all physical phenomena.
Position we must evidently acknowledge to be relative, for we cannot
describe the position of a body in any terms which do not express
relation. The ordinary language about motion and rest does not so
completely exclude the notion of their being measured absolutely, but
the reason of this is, that in our ordinary language we tacitly assume
that the earth is at rest.... There are no landmarks in space; one
portion of space is exactly like every other portion, so that we
cannot tell where we are. We are, as it were, on an unruffled sea,
without stars, compass, soundings, wind or tide, and we cannot tell in
what direction we are going. We have no log which we can cast out to
take a dead reckoning by; we may compute our rate of motion with
respect to the neighboring bodies, but we do not know how these bodies
may be moving in space.
So, the big deal about special relativity is not that motion is
relative. It's that this is possible while keeping the speed of
light the same for everyone - as Maxwell's equations insist, and as
we indeed see! This is what forced people to replace Galilei
transformations by "Lorentz transformations", which have the
new feature that two coordinate systems moving relative to each
other will disagree not just on where things are, but when they
are.
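Here's a small numerical contrast in Python, in units where c = 1.
A Galilei boost leaves the time coordinate alone; a Lorentz boost
mixes t and x, but preserves the quantity t^2 - x^2:

import math

def galilei(t, x, v):
    return t, x - v * t

def lorentz(t, x, v):
    gamma = 1 / math.sqrt(1 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)

t, x, v = 3.0, 2.0, 0.6
t2, x2 = lorentz(t, x, v)

print(galilei(t, x, v))   # (3.0, 0.2): the time coordinate is unchanged
print((t2, x2))           # roughly (2.25, 0.25): here t changes too
assert abs((t2**2 - x2**2) - (t**2 - x**2)) < 1e-12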
As Einstein put it in his 1905 paper on special relativity:

Examples of this sort, together with the unsuccessful attempts to
discover any motion of the earth relative to the "light medium",
suggest that the phenomena of electrodynamics as well as mechanics
possess no properties corresponding to the idea of absolute rest.
They suggest rather that, as has already been shown to the first
order of small quantities, the same laws of electrodynamics and
optics will be valid for all frames of reference for which the
equations of mechanics are valid. We will elevate this conjecture
(whose content will be called the "principle of relativity") to
the status of a postulate, and also introduce another postulate,
which is only apparently irreconcilable with it, namely, that
light is always propagated in empty space with a definite velocity
c which is independent of the state of motion of the emitting
body. These two postulates suffice for attaining a simple and
consistent theory of the electrodynamics of moving bodies based on
Maxwell's theory for stationary bodies.
So, what really changed with the advent of special relativity?
First, our understanding of precisely which transformations count
as symmetries of spacetime. These transformations form a group.
Before special relativity, it seemed the relevant group was a
10-dimensional gadget consisting of:

translations in space and time (4 dimensions),
rotations (3 dimensions), and
Galilei transformations to moving frames (3 dimensions).
"My friend's speed relative to me is 50 meters/second, and our
distance of closest approach is 10 meters."
-----FRIEND---->
YOU
"My friend's speed relative to me is s, and our distance of
closest approach is d."
for some nonnegative numbers s and d.
    \                 /
-----x---------------x------
      \             /
       \           /
        \         /
         \       /
          \     /
           x
          / \
         /   \
        /     \
We can translate it and rotate it. But, we can also view it from
an angle: that's another symmetry in projective geometry! This
hints at how projective geometry arose from the study of perspective
in painting.
-----------------x----------------
y
An amazing fact is that there are precisely 6 atomic invariant relations
between a pair of flags. You can see them all in this picture:
    \                 /
-----x---------------x'-----   y
      \             /
       \           /
        \         /
         \       /
          \     /
           x"
          /  \
         /    \
       y'/      \y"
There are six flags here, and each exemplifies a different
atomic invariant relation to our favorite flag, say (x,y).
"The point of (x',y') lies on the line of (x,y), and no more."
By "no more" I mean that no further incidence relations hold.
THE SAME AS A DOUBLE COSET.
THE SAME AS A GROUP ACTION.
      H∩K
      / \
     /   \
    /     \
   v       v
  H         K
   \       /
    \     /
     \   /
      v v
       G
This gives a commutative diamond of spaces on which G acts
transitively:
    G/(H∩K)
      / \
     /   \
    /     \
   v       v
 G/H       G/K
   \       /
    \     /
     \   /
      v v
     G/G
We already have names for three of these spaces - and G/G is just
a single point, say *:
       Z
      / \
     /   \
    /     \
   v       v
  X         Y
   \       /
    \     /
     \   /
      v v
       *
Now, in Episode 3
I explained how you could form the "action groupoid"
X//G given a group G acting on a space X. If I were maniacally
consistent, I would write it as G\\X, since G is acting on the left.
But, I'm not. So, the above commutative diamond gives a commutative
diamond of groupoids:
      Z//G
      /  \
     /    \
    /      \
   v        v
 X//G      Y//G
   \        /
    \      /
     \    /
      v  v
      *//G
The groupoid on the bottom has one object, and one morphism for each
element of G. So, it's just G! So we have this:
      Z//G
      /  \
     /    \
    /      \
   v        v
 X//G      Y//G
   \        /
    \      /
     \    /
      v  v
       G
So - voilà! - our double coset indeed gives a span of groupoids
      Z//G
      /  \
     /    \
    /      \
   v        v
 X//G      Y//G
X//G is the groupoid of figures just like x (up to symmetry), Y//G
is the groupoid of figures just like y, and Z//G is the groupoid of
pairs of figures satisfying the same atomic invariant relation
as the pair (x,y). For example, point-line pairs, where the point
lies on the line! For us, a pair of figures is just a more complicated
sort of figure.
      Z//G
      /  \
     /    \
    /      \
   v        v
 X//G      Y//G
   \        /
    \      /
     \    /
      v  v
       G
If you remember everything in Episode 3 - and I bet you don't - you'll
notice that this commutative diamond is equivalent to the diamond we
started with:
      H∩K
      / \
     /   \
    /     \
   v       v
  H         K
   \       /
    \     /
     \   /
      v v
       G
We've just gone around in a loop! But that's okay, because we've
learned something en route.
IS THE SAME AS A DOUBLE COSET.
       C
      / \
     /   \
    /     \
   v       v
  A         B
   \       /
    \     /
     \   /
      v v
       G
where A,B,C are connected groupoids and the arrows are faithful
functors.
IS THE SAME AS A SPAN OF G-SETS.
IS THE SAME AS A G-SET.
Episode 5
In Episode 4 we reached the
point of seeing how spans of groupoids over a fixed group G subsume
the theory of G-sets and invariant relations between these - which are
traditionally studied using "double cosets".
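Before going on, here's a tiny Python sketch of that double coset
picture, with G the permutations of 3 points and H, K the stabilizers
of two points - all choices made up just for illustration. The double
cosets H\G/K match the G-orbits on (G/H) × (G/K), which are the
atomic invariant relations:

from itertools import permutations

def compose(g, h):                  # (g h)(i) = g(h(i))
    return tuple(g[h[i]] for i in range(len(g)))

G = list(permutations(range(3)))
H = [g for g in G if g[0] == 0]     # stabilizer of the point 0
K = [g for g in G if g[2] == 2]     # stabilizer of the point 2

double_cosets = {frozenset(compose(compose(h, g), k)
                           for h in H for k in K)
                 for g in G}

# here G/H and G/K are both just the 3 points, with G acting diagonally
orbits = {frozenset((g[a], g[b]) for g in G)
          for a in range(3) for b in range(3)}

print(len(double_cosets), len(orbits))   # 2 2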
As Burnside wrote in the preface to the first edition of his
"Theory of Groups of Finite Order":

Cayley's dictum that "a group is defined by means of the laws of
combination of its symbols" would imply that, in dealing with the
theory of groups, no more concrete mode of representation should be
used than is absolutely necessary. It may then be asked why, in
a book that professes to leave all applications to one side, a
considerable space is devoted to substitution groups [permutation
groups], but other particular modes of representation, such as
groups of linear transformations, are not even referred to. My
answer to this question is that while, in the present state of our
knowledge, many results in the pure theory are arrived at most
readily by dealing with properties of substitution groups, it
would be difficult to find a result that could most directly be
obtained by the consideration of groups of linear transformations.
But in the preface to the second edition, he changed his tune:

Very considerable advances in the theory of groups of finite
order have been made since the appearance of the first edition
of this book. In particular the theory of groups of linear
substitutions has been the subject of numerous and important
investigations by several writers; and the reason given in the
original preface for omitting any account of it no longer holds
good. In fact it is now more true to say that for further
advances in the abstract theory one must look largely to the
representation of a group as a group of linear transformations.
It's interesting to see exactly how representing finite groups on
vector spaces lets us understand them better. By now almost everyone
agrees this is true. But how much of the detailed machinery of linear
algebra is really needed? How much could we do purely combinatorially,
using just spans of groupoids?
For example, there's a functor

hom(G,Set) → hom(G,Vect)
A |-> FA

sending each G-set A to FA, the vector space with basis A. If this
functor were an equivalence of categories, it would have to be
essentially surjective, full and faithful. But, not every
representation of G is isomorphic to a permutation representation!
In other words, this functor is not essentially surjective.
Further Episodes
This is just the beginning of the Tale of Groupoidification.
For further episodes, you can see lecture notes, blog entries
and videos of the seminar taught by Jim Dolan and myself.
© 2009 John Baez
baez@math.removethis.ucr.andthis.edu