For a more formal introduction to the subject, try these:
If we let this story lead us where it wants to go, we'll meet all sorts of famous and fascinating creatures, such as:
However, the charm of the tale is how many of these ideas are unified and made simpler thanks to a big, simple idea: groupoidification.
So, what's groupoidification? It's a method of exposing the combinatorial underpinnings of linear algebra - the hard bones of set theory underlying the flexibility of the continuum.
Linear algebra is all about vector spaces and linear maps. One of the lessons that gets drummed into you when you study this subject is that it's good to avoid picking bases for your vector spaces until you need them. It's good to keep the freedom to do coordinate transformations... and not just keep it in reserve, but keep it manifest!
As Hermann Weyl wrote, "The introduction of a coordinate system to geometry is an act of violence".
This is a deep truth, which hits many physicists when they study special and general relativity. However, as Niels Bohr quipped, a deep truth is one whose opposite is also a deep truth. There are some situations where a vector space comes equipped with a god-given basis. Then it's foolish not to pay attention to this fact!
The most obvious example is when our vector space has been defined to consist of formal linear combinations of the elements of some set. Then this set is our basis.
This often happens when we use linear algebra to study combinatorics.
But if sets give vector spaces, what gives linear operators? Your first guess might be functions. And indeed, functions between sets do give linear operators between their vector spaces. For example, suppose we have a function
f: {livecat, deadcat} → {livecat, deadcat}
which "makes sure the cat is dead":
f(livecat) = deadcat
f(deadcat) = deadcat
Then, we can extend f to a linear operator defined on formal linear combinations of cats:
F(a livecat + b deadcat) = a deadcat + b deadcat
Written as a matrix in the {livecat, deadcat} basis, this looks like
0 0
1 1

(The relation to quantum mechanics here is just a vague hint of themes to come. I've deliberately picked an example where the linear operator is not unitary.)
So, we get some linear operators from functions... but not all! We only get operators whose matrices have exactly a single 1 in each column, the rest of the entries being 0. That's because a function f: X → Y sends each element of X to a single element of Y.
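To make this concrete, here's a small Python sketch (the function name is mine, purely illustrative) that turns a function between finite sets into its 0/1 matrix. Note how each column ends up with exactly one 1:

```python
# Sketch: turning a function f: X -> Y into a 0/1 matrix.  Entry [y][x]
# is 1 exactly when f(x) = y, so each column has exactly one 1.

def function_to_matrix(f, X, Y):
    return [[1 if f(x) == y else 0 for x in X] for y in Y]

# The "make sure the cat is dead" function from the text:
cats = ["livecat", "deadcat"]
M = function_to_matrix(lambda c: "deadcat", cats, cats)
print(M)  # [[0, 0], [1, 1]]
```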
This is very limiting. We can do better if we get operators from relations between sets. In a relation between sets X and Y, an element of X can be related to any number of elements of Y, and vice versa. For example, let the relation
R: {1,2,3,4} —/→ {1,2,3,4}
be "is a divisor of". Then 1 is a divisor of everybody, 2 is a divisor of itself and 4, 3 is only a divisor of itself, and 4 is only a divisor of itself. We can encode this in a matrix:
1 0 0 0
1 1 0 0
1 0 1 0
1 1 0 1

where 1 means "is a divisor of" and 0 means "is not a divisor of".
We can get any matrix of 0's and 1's this way. Relations are really just matrices of truth values; here we're simply reading those truth values as the numbers 0 and 1. Unfortunately we're still far from getting all matrices of numbers!
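Here's a quick sketch of the divisor example as a matrix, with rows indexed by x and columns by y, so entry (x,y) is 1 when y divides x, matching the matrix above:

```python
# Sketch: the "is a divisor of" relation on {1,2,3,4} as a 0/1 matrix.
# Rows are indexed by x, columns by y; entry is 1 when y divides x.

X = [1, 2, 3, 4]
M = [[1 if x % y == 0 else 0 for y in X] for x in X]
for row in M:
    print(row)
# [1, 0, 0, 0]
# [1, 1, 0, 0]
# [1, 0, 1, 0]
# [1, 1, 0, 1]
```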
We can do better if we get matrices from spans of sets. A span of sets, written S: X —/→ Y, is just a set S equipped with functions to X and Y. We can draw it like this:
      S
     / \
    /   \
  F/     \G
  /       \
 v         v
 X         Y

This is my wretched ASCII attempt to draw two arrows coming down from the set S to the sets X and Y. It's supposed to look like a bridge - hence the term "span".
Spans of sets are like relations, but where you can be related to someone more than once!
For example, X could be the set of Frenchmen and Y could be the set of Englishwomen. S could be the set of Russians. As you know, every Russian has exactly one favorite Frenchman and one favorite Englishwoman. So, F could be the function "your favorite Frenchman", and G could be "your favorite Englishwoman".
Then, given a Frenchman x and an Englishwoman y, they're related by the Russian s whenever s has x as their favorite Frenchman and y as their favorite Englishwoman:
F(s) = x and G(s) = y.
Some pairs (x,y) will be related by no Russians, others will be related by one, and others will be related by more than one! I bet the pair
(x,y) = (Gérard Depardieu, Emma Thompson)
is related by at least 57 Russians.
This idea lets us turn spans of sets into matrices of natural numbers. Given a span of finite sets:
      S
     / \
    /   \
  F/     \G
  /       \
 v         v
 X         Y

we get an X × Y matrix whose (x,y) entry is the number of Russians - I mean elements s of S - such that
F(s) = x and G(s) = y.
We can get any finite-sized matrix of natural numbers this way.
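Here's a sketch of that recipe in Python (the helper name is mine): the matrix entry at (x,y) just counts the witnesses s with F(s) = x and G(s) = y.

```python
# Sketch: a span of finite sets is a set S with functions F: S -> X and
# G: S -> Y; the matrix entry at (x, y) counts the witnesses s in S with
# F(s) = x and G(s) = y.

def span_to_matrix(S, F, G, X, Y):
    return [[sum(1 for s in S if F(s) == x and G(s) == y) for y in Y]
            for x in X]

# Toy example: S is all pairs (a, b), F and G the two projections,
# so every pair (x, y) has exactly one witness:
X, Y = [0, 1], [0, 1, 2]
S = [(a, b) for a in X for b in Y]
M = span_to_matrix(S, lambda s: s[0], lambda s: s[1], X, Y)
print(M)  # [[1, 1, 1], [1, 1, 1]]
```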
Even better, there's a way to "compose" spans that nicely matches the usual way of multiplying matrices. You can figure this out yourself if you solve this puzzle:
Let X be the set of people on Earth. Let T be the X × X matrix corresponding to the relation "is the father of". Why does the matrix T^{2} correspond to the relation "is the paternal grandfather of"? Let S correspond to the relation "is a friend of". Why doesn't the matrix S^{2} correspond to the relation "is a friend of a friend of"? What span does this matrix correspond to?
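If you want to check the span-composition story numerically, here's a hedged sketch assuming the usual pullback construction: the composite of spans S: X —/→ Y and T: Y —/→ Z has underlying set {(s,t) : G(s) = F(t)}, and its matrix works out to be the matrix product.

```python
# Sketch, assuming the pullback construction for composing spans: the
# composite of S: X -/-> Y and T: Y -/-> Z has underlying set
# {(s, t) : G_S(s) = F_T(t)}, and its matrix is the matrix product.

def span_to_matrix(S, F, G, rows, cols):
    return [[sum(1 for s in S if F(s) == x and G(s) == y) for y in cols]
            for x in rows]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

X = Y = Z = [0, 1]
fst, snd = (lambda p: p[0]), (lambda p: p[1])
S = [(0, 0), (0, 1), (1, 1)]                  # a span X -/-> Y
T = [(0, 1), (1, 0), (1, 1)]                  # a span Y -/-> Z

ST = [(s, t) for s in S for t in T if snd(s) == fst(t)]   # the pullback
M = span_to_matrix(ST, lambda p: p[0][0], lambda p: p[1][1], X, Z)

A = span_to_matrix(S, fst, snd, X, Y)
B = span_to_matrix(T, fst, snd, Y, Z)
assert M == mat_mul(A, B)
print(M)  # [[1, 2], [1, 1]]
```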
To go further, we need to consider spans, not of sets, but of groupoids!
I'll say more about this later - I suspect you're getting tired. But for now, briefly: a groupoid is a category with inverses. Any group gives an example, but groupoids are more general - they're the modern way of thinking about symmetry.
There's a way to define the cardinality of a finite groupoid:
12) John Baez and James Dolan, From finite sets to Feynman diagrams, in Mathematics Unlimited - 2001 and Beyond, vol. 1, eds. Bjorn Engquist and Wilfried Schmid, Springer, Berlin, 2001, pp. 29-50. Also available as math.QA/0004133.
And, this can equal any nonnegative rational number! This lets us generalize what we've done from finite sets to finite groupoids, and get rational numbers into the game.
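For the record, the definition from the cited paper: the cardinality of a finite groupoid is the sum, over isomorphism classes of objects, of 1/|Aut(x)|. Here's a tiny sketch using exact rational arithmetic:

```python
from fractions import Fraction

# Sketch, following the Baez-Dolan definition: the cardinality of a
# finite groupoid is the sum, over isomorphism classes of objects,
# of 1/|Aut(x)|.

def groupoid_cardinality(aut_sizes):
    """aut_sizes: one entry per isomorphism class, giving |Aut(x)|."""
    return sum(Fraction(1, n) for n in aut_sizes)

# A group G, viewed as a one-object groupoid, has cardinality 1/|G|:
print(groupoid_cardinality([6]))        # 1/6
# Three classes with automorphism groups of sizes 1, 2 and 3:
print(groupoid_cardinality([1, 2, 3]))  # 11/6
```

So we really can hit any nonnegative rational number this way.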
A span of groupoids is a diagram
      S
     / \
    /   \
  F/     \G
  /       \
 v         v
 X         Y

where X, Y, S are groupoids and F, G are functors. If all the groupoids are finite, we can turn this span into a finite-sized matrix of nonnegative rational numbers, by copying what we did for spans of finite sets.
There's also a way of composing spans of groupoids, which corresponds to multiplying matrices. However, there's a trick involved in getting this to work - I'll have to explain this later. For details, try:
13) Jeffrey Morton, Categorified algebra and quantum mechanics, Theory and Application of Categories 16 (2006), 785-854. Available at http://www.emis.de/journals/TAC/volumes/16/29/16-29abs.html; also available as math.QA/0601458.
14) Simon Byrne, On Groupoids and Stuff, honors thesis, Macquarie University, 2005, available at http://www.maths.mq.edu.au/~street/ByrneHons.pdf and http://math.ucr.edu/home/baez/qg-spring2004/ByrneHons.pdf
Anyway: the idea of "groupoidification" is that in many cases where mathematicians think they're playing around with linear operators between vector spaces, they're actually playing around with spans of groupoids!
This is especially true in math related to simple Lie groups, their Lie algebras, quantum groups and the like. While people usually study these gadgets using linear algebra, there's a lot of combinatorics involved - and where combinatorics and symmetry show up, one invariably finds groupoids.
As the name suggests, groupoidification is akin to categorification. But, it's a bit different. In categorification, we try to boost up mathematical ideas this way:
sets → categories
functions → functors
In groupoidification, we try this:
vector spaces → groupoids
linear operators → spans of groupoids
Actually, it's "decategorification" and "degroupoidification" that are systematic processes. These processes lose information, so there's no systematic way to reverse them. But, as I explained in "week99", it's still fun to try! If we succeed, we discover an extra layer of structure beneath the math we thought we understood... and this usually makes that math clearer and less technical, because we're not seeing it through a blurry, information-losing lens.
I don't want this to be accessible only to experts, since a bunch of it is so wonderfully elementary. So, I'm going to proceed rather slowly. This may make the experts impatient, so near the end I'll zip ahead and sketch out a bit of the big picture.
Last time I introduced spans of sets. A span of sets is just a set S equipped with functions to X and Y:
      S
     / \
    /   \
  F/     \G
  /       \
 v         v
 X         Y

Simple! But the important thing is to understand this thing as a "witnessed relation".
Have you heard how computer scientists use the term "witness"? They say the number 17 is a "witness" to the fact that the number 221 isn't prime, since 17 evenly divides 221.
That's the idea here. Given a span S as above, we can say an element x of X and an element y of Y are "related" if there's an element s of S with
F(s) = x and G(s) = y
The element s is a "witness" to the relation.
Last time, I gave an example where a Frenchman x and an Englishwoman y were related if they were both the favorites of some Russian s.
Note: there's more information in the span than the relation it determines. The relation either holds or fails to hold. The span does more: it provides a set of "witnesses". The relation holds if this set of witnesses is nonempty, and fails to hold if it's empty.
At least, that's how mathematicians think. When I got married last month, I discovered the state of California demands two witnesses attend the ceremony and sign the application for a marriage license. Here the relation is "being married", and the witnesses attest to that relation - but for the state, one witness is not enough to prove that the relation holds! They're using a more cautious form of logic.
To get the really interesting math to show up, we need to look at other examples of "witnessed relations" - not involving Russians or marriages, but geometry and symmetry.
For example, suppose we're doing 3-dimensional geometry. There's a relation "the point x and the line y lie on a plane", but it's pretty dull, since it's always true. More interesting is the witnessed relation "the point x and the line y lie on the plane z". The reason is that sometimes there will be just one plane containing a point and a line, but when the point lies on the line, there will be lots.
To think of this "witnessed relation" as a span
      S
     / \
    /   \
  F/     \G
  /       \
 v         v
 X         Y

we can take X to be the set of points and Y to be the set of lines.
Can we take S to be the set of planes? No! Then there would be no way to define the functions F and G, because the same plane contains lots of different points and lines. So, we should take S to be the set of triples (x,y,z) where x is a point, y is a line, and z is a plane containing x and y. Then we can take
F(x,y,z) = x
and
G(x,y,z) = y
A "witness" to the fact that x and y lie on a plane is not just a plane containing them, but the entire triple.
(If you're really paying attention, you'll have noticed that we need to play the same trick in the example of witnesses to a marriage.)
Spans like this play a big role in "incidence geometry". There are lots of flavors of incidence geometry, with "projective geometry" being the most famous. But, a common feature is that we always have various kinds of "figures" - like points, lines, planes, and so on. And, we have various kinds of "incidence relations" involving these figures. But to really understand incidence geometry, we need to go beyond relations and use spans of sets.
Actually, we need to go beyond spans of sets and use spans of groupoids! The reason is that incidence geometries usually have interesting symmetries, and a groupoid is like a "set with symmetries". For example, consider lines in 3-dimensional space. These form a set, but there are also symmetries of 3-dimensional space mapping one line to another. To take these into account we need a richer structure: a groupoid!
Here's the formal definition: a groupoid consists of a set of "objects", and for any objects x and y, a set of "morphisms"
f: x → y
which we think of as symmetries taking x to y. We can compose a morphism f: x → y and a morphism g: y → z to get a morphism fg: x → z. We think of fg as the result of doing first f and then g. So, we demand the associative law
(fg)h = f(gh)
whenever either side is well-defined. We also demand that every object x has an identity morphism
1_{x}: x → x
We think of this as the symmetry that doesn't do anything to x. So, given any morphism f: x → y, we demand that
f 1_{y} = f = 1_{x} f
So far this is the definition of a "category". What makes it a "groupoid" is that every morphism f: x → y has an "inverse"
f^{ -1}: y → x
with the property that
f f^{ -1} = 1_{x}
and
f^{ -1} f = 1_{y}
In other words, we can "undo" any symmetry.
So, in our spans from incidence geometry:
      S
     / \
    /   \
  F/     \G
  /       \
 v         v
 X         Y

X, Y and S will be groupoids, while F and G will be maps between groupoids: that is, "functors"!
What's a functor? Given groupoids A and B, clearly a functor
F: A → B
should send any object x in A to an object F(x) in B. But also, it should send any morphism in A:
f: x → y
to a morphism in B:
F(f): F(x) → F(y)
And, it should preserve all the structure that a groupoid has, namely composition:
F(fg) = F(f) F(g)
and identities:
F(1_{x}) = 1_{F(x)}
It then automatically preserves inverses too:
F(f^{ -1}) = F(f)^{ -1}
Given this, what's the meaning of a span of groupoids? You could say it's an "invariant" witnessed relation - that is, a relation with witnesses that's preserved by the symmetries at hand. These are the very essence of incidence geometry. For example, if we have a point and a line lying on a plane, we can rotate the whole picture and get a new point and a new line lying on a new plane. Indeed, a "symmetry" in incidence geometry is precisely something that preserves all such "incidence relations".
For those of you not comfy with groupoids, let's see how this actually works. Suppose we have a span of groupoids:
      S
     / \
    /   \
  F/     \G
  /       \
 v         v
 X         Y

and the object s is a witness to the fact that x and y are related:
F(s) = x and G(s) = y
Also suppose we have a symmetry sending s to some other object of S:
f: s → s'
This gives morphisms
F(f): F(s) → F(s')
in X and
G(f): G(s) → G(s')
in Y. And if we define
F(s') = x' and G(s') = y'
we see that s' is a witness to the fact that x' and y' are related.
Let me summarize the Tale so far:
From all this, you should begin to vaguely see that starting from any sort of incidence geometry, we should be able to get a bunch of matrices. Facts about incidence geometry will give facts about linear algebra!
"Groupoidification" is an attempt to reverse-engineer this process. We will discover that lots of famous facts about linear algebra are secretly facts about incidence geometry!
To prepare for what's to come, the maniacally diligent reader might like to review "week178", "week180", "week181", "week186" and "week187", where I explained how any Dynkin diagram gives rise to a flavor of incidence geometry. For example, the simplest-looking Dynkin diagrams, the A_{n} series, like this for n = 3:
  o------o------o
points lines planes

give rise to n-dimensional projective geometry. I may have to review this stuff, but first I'll probably say a bit about the theory of group representations and Hecke algebras.
(There will also be other ways to get spans of groupoids, that don't quite fit into what's customarily called "incidence geometry", but still fit very nicely into our Tale. For example, Dynkin diagrams become "quivers" when we give each edge a direction, and the "groupoid of representations of a quiver" gives rise to linear-algebraic structures related to a quantum group. In fact, I already mentioned this in item E of "week230". Eventually this will let us groupoidify the whole theory of quantum groups! But, I don't want to rush into that, since it makes more sense when put in the right context.)
I'm telling a long story about symmetry, geometry, and algebra. Some of this tale is new work done by James Dolan, Todd Trimble and myself. But a lot of it is old work by famous people which deserves a modern explanation.
A great example is Felix Klein's "Erlangen program" - a plan for reducing many sorts of geometry to group theory. Many people tip their hat to the Erlangen program, but few seem to deeply understand it, and even fewer seem to have read what Klein actually wrote about it!
The problem goes back a long ways. In 1871, while at Göttingen, Klein worked on non-Euclidean geometry, and showed that hyperbolic geometry was consistent if and only if Euclidean geometry was. In the process, he must have thought hard about the role of symmetry groups in geometry. When he was appointed professor at Erlangen in 1872, he wrote a lecture outlining his "Erlanger Programm" for reducing geometry to group theory.
But, he didn't actually give this lecture as his inaugural speech! He spoke about something else.
So, nobody ever heard him announce the Erlangen program. And, until recently, the lecture he wrote was a bit hard to find. Luckily, now you can get it online:
1) Felix Klein, Vergleichende Betrachtungen ueber neuere geometrische Forschungen, Verlag von Andreas Deichert, Erlangen, 1872. Also available at the University of Michigan Historical Mathematics Collection, http://www.hti.umich.edu/cgi/t/text/text-idx?c=umhistmath;idno=ABN7632
Even better, Johan Ernst Mebius has recently prepared an HTML version, with links to the above version:
2) Johan Ernst Mebius, Felix Klein's Erlanger Programm, http://www.xs4all.nl/~jemebius/ErlangerProgramm.htm
But what if you have the misfortune of only reading English, not German? Until now the only translation was quite hard to obtain:
3) Felix Klein, A comparative review of recent researches in geometry, trans. M. W. Haskell, Bull. New York Math. Soc. 2, (1892-1893), 215-249.
In case you're wondering, the "New York Mathematical Society" no longer exists! It was founded in 1888, but in 1894 it went national and became the American Mathematical Society.
Luckily, after Thomas Love pointed out the existence of this old translation, Chris Hillman was able to get ahold of it and scan it in! Then Robin Houston created a PDF file of the whole thing, and Lukas-Fabian Moser created a DjVu file. Then Nitin C. Rughoonauth took the marvelous step of putting it into LaTeX! So now, you can read Klein's paper in English here:
4) The Erlangen program, http://math.ucr.edu/home/baez/erlangen/
English-speakers can read more about the Erlangen program here:
5) Felix Klein, Elementary Mathematics from an Advanced Standpoint: Geometry, part 3: Systematic discussion of geometry and its foundations, Dover, New York, 1939.
Luckily Dover keeps its books in print!
For more on the Erlangen program, try these:
6) Garrett Birkhoff and M. K. Bennett, Felix Klein and his "Erlanger Programm", in History and Philosophy of Modern Mathematics, eds. W. Aspray and P. Kitcher, Minnesota Stud. Philos. Sci. XI, University of Minnesota Press, Minneapolis, 1988, pp. 145-176.
7) Hans A. Kastrup, The contributions of Emmy Noether, Felix Klein and Sophus Lie to the modern concept of symmetries in physical systems, in Symmetries in Physics (1600-1980), ed. M. G. Doncel, World Scientific, Singapore, 1987, pp. 113-163.
8) I. M. Yaglom, Felix Klein and Sophus Lie: Evolution of the Idea of Symmetry in the Nineteenth Century, trans. S. Sossinsky, Birkhauser, Boston, 1988.
For more about Klein, try "week213" and this little biography:
9) MacTutor History of Mathematics Archive, Felix Klein, http://www-history.mcs.st-andrews.ac.uk/Biographies/Klein.html
But what does the Erlangen program actually amount to, in the language of modern mathematics? This will take a while to explain, so the best thing is to dive right in.
In the last episode of the Tale of Groupoidification I tried to explain two slogans:
They're a bit vague; they're mainly designed to give you enough intuition to follow the next phase of the Tale, which is all about how:
But before the next phase, I need to say a bit about how groupoids and spans of groupoids fit into Klein's Erlangen program.
Groupoids are a modern way to think about symmetries. A more traditional approach would use a group acting as symmetries of some set. And the most traditional approach of all, going back to Galois and Klein, uses a group acting transitively on a set.
So, let me explain the traditional approach, and then relate it to the modern one.
I hope you know what it means for a group G to "act" on a set X. It means that for any element x of X and any guy g in G, we get a new element gx in X. We demand that
1x = x
and
g(hx) = (gh)x.
More precisely, this is a "left action" of G on X, since we write the group elements to the left of x. We can also define right actions, and someday we may need those too.
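Here's a minimal sketch checking the two axioms for a toy example of my own choosing - Z/4 acting on the four vertices of a square by rotation:

```python
# Toy sketch: checking the left-action axioms 1x = x and g(hx) = (gh)x
# for Z/4 acting on the four vertices of a square by rotation.

n = 4
G = range(n)                      # Z/4, group law: addition mod 4
X = range(n)                      # the vertices

act = lambda g, x: (x + g) % n
mul = lambda g, h: (g + h) % n

assert all(act(0, x) == x for x in X)                # 1x = x
assert all(act(g, act(h, x)) == act(mul(g, h), x)    # g(hx) = (gh)x
           for g in G for h in G for x in X)
print("both axioms hold")
```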
We say an action of a group G on a set X is "transitive" if given any two elements of X, there's some guy in G mapping the first element to the second. In this case, we have an isomorphism of sets
X = G/H
for some subgroup H of G.
For example, suppose we're studying a kind of geometry where the symmetry group is G. Then X could be the set of figures of some sort: points, or lines, or something fancier. If G acts transitively on X, then all figures of this sort "look alike": you can get from any one to any other using a symmetry. This is often the case in geometry... but not always.
Suppose G acts transitively on X. Pick any figure x of type X and let H be its "stabilizer": the subgroup consisting of all guys in G that map x to itself. Then we get a one-to-one and onto map
f: X → G/H
sending each figure gx in X to the equivalence class [g] in G/H.
If you haven't seen this fact before, you should definitely prove it - it's one of the big ways people use symmetry!
Here's one kind of thing people do with this fact. The 3d rotation group G = SO(3) acts on the sphere X = S^{2}, and the stabilizer of the north pole is the 2d rotation group H = SO(2), so the sphere is isomorphic to G/H = SO(3)/SO(2). The same sort of result holds in any dimension, and we can use it to derive facts about spheres from facts about rotation groups, and vice versa.
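If you'd rather see X = G/H on a finite example than on spheres, here's a sketch with G = S_3 acting on {0,1,2} and H the stabilizer of the point 0 - the cosets gH match up exactly with the points:

```python
from itertools import permutations

# Sketch of X = G/H on a finite example: G = S_3 acting on X = {0,1,2},
# H the stabilizer of 0.  The cosets gH are in bijection with X.

G = list(permutations(range(3)))          # permutations as tuples
H = [g for g in G if g[0] == 0]           # stabilizer of the point 0

def compose(g, h):                        # (g o h)(i) = g(h(i))
    return tuple(g[h[i]] for i in range(3))

cosets = {frozenset(compose(g, h) for h in H) for g in G}

assert len(H) == 2
assert len(cosets) == 3                   # |G/H| = |X|: one coset per point
print(len(G), len(H), len(cosets))        # 6 2 3
```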
A grander use of this fact is to set up a correspondence between sets on which G acts transitively and subgroups of G. This is one of the principles lurking behind Galois theory.
Galois applied this principle to number theory - see "week201" for details. But, it really has nothing particular to do with number theory! In his Erlangen program, Klein applied it to geometry.
Klein's goal was to systematize a bunch of different kinds of non-Euclidean geometry. Each kind of geometry he was interested in had a different group of symmetries. For example:
The details here don't matter much yet; the point is that there are lots of interesting kinds of geometry, with interesting symmetry groups!
Klein realized that in any kind of geometry like this, a "type of figure" corresponds to a set on which G acts transitively. Here a "figure" could be a point, a line, a plane, or something much fancier. Regardless of the details, the set of all figures of the same type can be written as G/H, and G acts transitively on this set.
The really cool part is that we can use Klein's idea to define a geometry for any group G. To do this, we just say that every subgroup H of G gives rise to a type of figure. So, we work out all the subgroups of G. Then, we work out all the incidence relations - relations like "a point lies on a line". To do this, we take two sets of figures, say
X = G/H
and
Y = G/K
and find all the invariant relations between them: that is, subsets of X × Y preserved by all the symmetries. I'll say more about how to do this next time - we can use something called "double cosets". In nice cases, like when G is a simple Lie group and H and K are so-called "parabolic" subgroups, these let us express all the invariant relations in terms of finitely many "atomic" ones! So, we can really carry out Klein's program of thoroughly understanding geometry starting from groups - at least in nice cases.
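Here's a hedged little sketch of the double-coset idea (the details come later in the Tale), for G = S_3 with H and K stabilizers of two points: counting the double cosets HgK counts the "atomic" invariant relations between G/H and G/K.

```python
from itertools import permutations

# Hedged sketch: for G = S_3 with subgroups H and K, the G-invariant
# relations between G/H and G/K correspond to the double cosets HgK.
# Here we just count them.

G = list(permutations(range(3)))
compose = lambda g, h: tuple(g[h[i]] for i in range(3))

H = [g for g in G if g[0] == 0]           # stabilizer of the point 0
K = [g for g in G if g[1] == 1]           # stabilizer of the point 1

double_cosets = {frozenset(compose(compose(h, g), k)
                           for h in H for k in K)
                 for g in G}

# Two orbits of pairs - "x equals y" and "x differs from y" - so two
# atomic invariant relations:
print(len(double_cosets))  # 2
```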
In short, group actions - especially transitive ones - are a traditional and very powerful way of using symmetry to tackle lots of problems.
So, to bridge the gap between the traditional and the new, I should explain how group actions give groupoids. I'll show you that:
It's not very hard to get a groupoid from a group action. Say we have a group G acting on a set X. Then the objects of our groupoid are just elements of X, and a morphism
g: x → y
is just a group element g with
gx = y.
Composing morphisms works the obvious way - it's basically just multiplication in the group G.
Some people call this groupoid an "action groupoid". I often call it the "weak quotient" X//G, since it's like the ordinary quotient X/G, but instead of declaring that x and y are equal when we have a group element g sending x to y, we instead declare they're isomorphic via a specified isomorphism g: x → y.
But for now, let's call X//G the "action groupoid".
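A minimal sketch of an action groupoid, for Z/3 acting on {0,1,2} by rotation (a toy example of mine): morphisms are triples (g, source, target), and composition is just multiplication in G.

```python
# Toy sketch of an action groupoid X//G: take Z/3 acting on X = {0,1,2}
# by rotation.  Objects are elements of X; a morphism g: x -> y is a
# group element g with gx = y.

n = 3
G, X = range(n), range(n)
act = lambda g, x: (x + g) % n

morphisms = [(g, x, act(g, x)) for g in G for x in X]  # (g, source, target)

# Composing g: x -> y with g': y -> z gives the morphism g'g: x -> z,
# and the result is again in the groupoid:
for (g, x, y) in morphisms:
    for (g2, y2, z) in morphisms:
        if y2 == y:
            assert ((g2 + g) % n, x, z) in morphisms
print(len(morphisms))  # |G| * |X| = 9 morphisms
```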
So, group actions give action groupoids. But, these groupoids come with extra stuff!
First of all, the action groupoid X//G always comes equipped with a functor
X//G
  |
  |p
  |
  v
  G

sending any object of X//G to the one object of G, and any morphism g: x → y to the corresponding element of G. Remember, a group is a groupoid with one object: this is the 21st century!
Second of all, this functor p is always "faithful": given two morphisms from x to y, if p maps them to the same morphism, then they were equal.
And that's all! Any groupoid with a faithful functor to G is equivalent to the action groupoid X//G for some action of G on some set X. This takes a bit of proving... let's not do it now.
So: in my slogan
the "certain extra stuff" was precisely a faithful functor to G.
What if we have a transitive group action? Then something nice happens.
First of all, saying that G acts transitively on X is the same as saying there's a morphism between any two objects of X//G. In other words, all objects of X//G are isomorphic. Or in other words, there's just one isomorphism class of objects.
Just as a groupoid with one object is a group, a groupoid with one isomorphism class of objects is equivalent to a group. Here I'm using the usual notion of "equivalence" of categories, as explained back in "week76".
So, G acts transitively on X precisely when X//G is equivalent to a group!
And what group? Well, what could it possibly be? It's just the stabilizer of some element of X! So, in the case of a transitive group action, our functor
X//G
  |
  |p
  |
  v
  G

is secretly equivalent to the inclusion

  H
  |
  |i
  |
  v
  G

of the stabilizer group of this element.
So, we see how Klein's old idea of geometrical figures as subgroups of G is being generalized. We can start with any groupoid Y of "figures" and "symmetries between figures", and play with that. It becomes an action groupoid if we equip it with a faithful functor to some group G:
  Y
  |
  |
  |
  v
  G

Then the action is transitive if all the objects of Y are isomorphic. In that case, our functor is equivalent to an inclusion
  H
  |
  |
  |
  v
  G

and we're back down to Klein's approach to geometry. But, it's actually good to generalize what Klein did, and think about arbitrary "groupoids over G" - that is, groupoids equipped with functors to G.
So, when we blend our ideas on spans of groupoids with Klein's ideas, we'll want to use spans of groupoids "over G" - that is, commutative diamonds of groupoids and functors, like this:
      S
     / \
    /   \
   /     \
  /       \
 v         v
 X         Y
  \       /
   \     /
    \   /
     v v
      G

There's much more to say about this, but not today!
I'll say one last thing before quitting. It's a bit more technical, but I feel an urge to see it in print.
People often talk about "the" stabilizer group of a transitive action of some group G on some set X. This is a bit dangerous, since every element of X has its own stabilizer, and they're not necessarily all equal!
However, they're all conjugate: if the stabilizer of x is H, then the stabilizer of gx is gHg^{-1}.
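You can check this conjugacy claim on a finite example. Here's a sketch with S_3 acting on {0,1,2}, verifying that g Stab(x) g^{-1} = Stab(gx):

```python
from itertools import permutations

# Sketch: stabilizers of different points are conjugate.  For S_3 acting
# on {0,1,2}: if H stabilizes x, then g H g^{-1} stabilizes gx.

G = list(permutations(range(3)))
compose = lambda a, b: tuple(a[b[i]] for i in range(3))
inverse = lambda g: tuple(g.index(i) for i in range(3))
stab = lambda x: {g for g in G if g[x] == x}

x = 0
for g in G:
    conjugated = {compose(compose(g, h), inverse(g)) for h in stab(x)}
    assert conjugated == stab(g[x])
print("g Stab(x) g^-1 = Stab(gx) for every g")
```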
So, when I say above that
X//G
  |
  |p
  |
  v
  G

is equivalent to

  H
  |
  |i
  |
  v
  G

I could equally well have said it's equivalent to

  H
  |
  |i'
  |
  v
  G

where the inclusion i' is the inclusion i conjugated by g. If you know some category theory, you'll see that i and i' are naturally isomorphic: a natural isomorphism between functors between groups is just a "conjugation". Picking the specific inclusion i requires picking a specific element x of X.
Of course, I'll try to write later issues in a way that doesn't force you to have understood all these nuances!
Though Newton seems to have believed in some form of "absolute space", the idea that motion is relative predates Einstein by a long time. In 1632, in his Dialogue Concerning the Two Chief World Systems, Galileo wrote:
Shut yourself up with some friend in the main cabin below decks on some large ship, and have with you there some flies, butterflies, and other small flying animals. Have a large bowl of water with some fish in it; hang up a bottle that empties drop by drop into a wide vessel beneath it. With the ship standing still, observe carefully how the little animals fly with equal speed to all sides of the cabin. The fish swim indifferently in all directions; the drops fall into the vessel beneath; and, in throwing something to your friend, you need throw it no more strongly in one direction than another, the distances being equal; jumping with your feet together, you pass equal spaces in every direction. When you have observed all these things carefully (though doubtless when the ship is standing still everything must happen in this way), have the ship proceed with any speed you like, so long as the motion is uniform and not fluctuating this way and that. You will discover not the least change in all the effects named, nor could you tell from any of them whether the ship was moving or standing still.

As a result, the coordinate transformation we use in Newtonian mechanics to switch from one reference frame to another moving at a constant velocity relative to the first is called a "Galilei transformation". For example:

(t, x, y, z) |→ (t, x + vt, y, z)
By the time Maxwell came up with his equations describing light, the idea of relativity of motion was well established. In 1876, he wrote:
Our whole progress up to this point may be described as a gradual development of the doctrine of relativity of all physical phenomena. Position we must evidently acknowledge to be relative, for we cannot describe the position of a body in any terms which do not express relation. The ordinary language about motion and rest does not so completely exclude the notion of their being measured absolutely, but the reason of this is, that in our ordinary language we tacitly assume that the earth is at rest.... There are no landmarks in space; one portion of space is exactly like every other portion, so that we cannot tell where we are. We are, as it were, on an unruffled sea, without stars, compass, sounding, wind or tide, and we cannot tell in what direction we are going. We have no log which we can cast out to take a dead reckoning by; we may compute our rate of motion with respect to the neighboring bodies, but we do not know how these bodies may be moving in space.

So, the big deal about special relativity is not that motion is relative. It's that this is possible while keeping the speed of light the same for everyone - as Maxwell's equations insist, and as we indeed see! This is what forced people to replace Galilei transformations by "Lorentz transformations", which have the new feature that two coordinate systems moving relative to each other will disagree not just on where things are, but when they are.
As Einstein wrote in 1905:
Examples of this sort, together with the unsuccessful attempts to discover any motion of the earth relative to the "light medium", suggest that the phenomena of electrodynamics as well as mechanics possess no properties corresponding to the idea of absolute rest. They suggest rather that, as has already been shown to the first order of small quantities, the same laws of electrodynamics and optics will be valid for all frames of reference for which the equations of mechanics are valid. We will elevate this conjecture (whose content will be called the "principle of relativity") to the status of a postulate, and also introduce another postulate, which is only apparently irreconcilable with it, namely, that light is always propagated in empty space with a definite velocity c which is independent of the state of motion of the emitting body. These two postulates suffice for attaining a simple and consistent theory of the electrodynamics of moving bodies based on Maxwell's theory for stationary bodies.

So, what really changed with the advent of special relativity? First, our understanding of precisely which transformations count as symmetries of spacetime. These transformations form a group. Before special relativity, it seemed the relevant group was a 10-dimensional gadget consisting of:
Nowadays this is called the "Galilei group":
With special relativity, the relevant group became the "Poincare group":
It's still 10-dimensional, not any bigger. But, it acts differently as transformations of the spacetime coordinates (t,x,y,z).
Another thing that changed was our appreciation of the importance of symmetry! Before the 20th century, group theory was not in the toolkit of most theoretical physicists. Now it is.
Okay. Now suppose you're the only thing in the universe, floating in empty space, not rotating. To make your stay in this thought experiment a pleasant one, I'll give you a space suit. And for simplicity, suppose special relativity holds true exactly, with no gravitational fields to warp the geometry of spacetime.
Would the universe be any different if you were moving at constant velocity? Or translated 2 feet to the left or right? Or turned around? Or if it were one day later?
No! Not in any observable way, at least! It would seem exactly the same.
So in this situation, it doesn't really make much sense to say "where you are", or "which way you're facing", or "what time it is". There are no "invariant propositions" to make about your location or motion. In other words, there's nothing to say whose truth value remains unchanged after you apply a symmetry.
Well, almost nothing to say! The logicians in the crowd will note that you can say "T": the tautologously true statement. You can also say "F": the tautologously false statement. But, these aren't terribly interesting.
Next, suppose you have a friend also floating through space. Now there are more interesting invariant propositions. There's nothing much invariant to say about just you, and nothing to say about just your friend, but there are invariant relations. For example, you can measure your friend's speed relative to you, or your distance of closest approach.
Mathematicians study invariant relations using a tool called "double cosets". I want to explain these today, since we'll need them soon in the Tale of Groupoidification.
"Double cosets" sound technical, but that's just to keep timid people from understanding the subject. A double coset is secretly just an "atomic" invariant relation: one that can't be expressed as "P or Q" where P and Q are themselves invariant relations - unless precisely one of P or Q is tautologously false.
So, atomic invariant relations are like prime numbers: they can't be broken down into simpler bits. And, as we'll see, every invariant relation can be built out of atomic ones!
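The claim that every invariant relation is a union of atomic ones can be checked by brute force in a finite toy model. Here's a small Python sketch (my own illustration, not from the Tale), taking G to be the symmetric group S3 acting on a 3-element set, computing the G-orbits on X × Y, and verifying that the invariant relations are exactly the unions of orbits:

```python
from itertools import permutations, combinations

# G = S3 acting on X = Y = {0, 1, 2}; a relation is a subset of X x Y.
G = list(permutations(range(3)))
pairs = [(x, y) for x in range(3) for y in range(3)]

# Compute the G-orbits on X x Y: these are the atomic invariant relations.
orbits, seen = [], set()
for p in pairs:
    if p not in seen:
        orbit = frozenset((g[p[0]], g[p[1]]) for g in G)
        orbits.append(orbit)
        seen |= orbit

# Check which subsets of X x Y are invariant relations.
def invariant(rel):
    return all((g[x], g[y]) in rel for (x, y) in rel for g in G)

all_relations = [frozenset(s) for r in range(10)
                 for s in combinations(pairs, r)]
invariant_rels = {rel for rel in all_relations if invariant(rel)}

# With 2 atoms (the diagonal and its complement) there are 2^2 = 4
# invariant relations: {}, diagonal, off-diagonal, everything.
print(len(orbits), len(invariant_rels))   # 2 4
```

Since S3 acts doubly transitively here, there are just 2 atoms, so 2^2 = 4 invariant relations in total: each one is a union of atoms, as promised.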
Here's an example in the case we're considering:
"My friend's speed relative to me is 50 meters/second, and our distance of closest approach is 10 meters."
This is clearly an invariant relation. It's atomic if we idealize the situation and assume you and your friend are points - so we can't ask which way you're facing, whether you're waving at each other, etc.
To see why it's atomic, note that we can always find a frame of reference where you're at rest and your friend is moving by like this:
-----FRIEND----> YOU
If you and your friend are points, the situation is completely described (up to symmetries) by the relative speed and distance of closest approach. So, the invariant relation quoted above can't be written as "P or Q" for other invariant relations.
The same analysis shows that in this example, every atomic invariant relation is of this form:
"My friend's speed relative to me is s, and our distance of closest approach is d."

for some nonnegative numbers s and d.
(Quiz: why don't we need to let s be negative if your friend is moving to the left?)
From this example, it's clear there are often infinitely many double cosets. But there are some wonderful examples with just finitely many double cosets - and these are what I'll focus on in our Tale.
Here's the simplest one. Suppose we're doing projective plane geometry. This is a bit like Euclidean plane geometry, but there are more symmetries: every transformation that preserves lines is allowed. So, in addition to translations and rotations, we also have other symmetries.
For example, imagine taking a blackboard with some points and lines on it:
        \                       /
---------x---------------------x---------
          \                   /
           \                 /
            \               /
             \             /
              \           /
               \         /
                \       /
                 \     /
                  \   /
                   \ /
                    x
                   / \
                  /   \
                 /     \

We can translate it and rotate it. But, we can also view it from an angle: that's another symmetry in projective geometry! This hints at how projective geometry arose from the study of perspective in painting.
We get even more symmetries if we use a clever trick. Suppose we're standing on the blackboard, and it extends infinitely like an endless plain. Points on the horizon aren't really points on the blackboard. They're called "points at infinity". But, it's nice to include them as part of the so-called "projective plane". They make things simpler: now every pair of lines intersects in a unique point, just as every pair of points lies on a unique line. You've probably seen how parallel railroad tracks seem to meet at the horizon - that's what I'm talking about here. And, by including these extra points at infinity, we get extra symmetries that map points at infinity to ordinary points, and vice versa.
I gave a more formal introduction to projective geometry in "week106", "week145", and "week178". If you read these, you'll know that points in the projective plane correspond to lines through the origin in a 3d space. And, you'll know a bit about the group of symmetries in projective geometry: it's the group G = PGL(3), consisting of 3×3 invertible matrices, modulo scalars.
(I actually said SL(3), but I was being sloppy - this is another group with the same Lie algebra.)
For some great examples of double cosets, let F be the space of "flags". A "flag" is a very general concept, but in projective plane geometry a flag is just a point x on a line y:
-----------------x----------------   y

An amazing fact is that there are precisely 6 atomic invariant relations between a pair of flags. You can see them all in this picture:
        \                   /
---------x-------------x'---------      y
          \           /
           \         /
            \       /
             \     /
              \   /
               \ /
                x"
               /  \
              /    \
           y'/      \y"

There are six flags here, and each exemplifies a different atomic invariant relation to our favorite flag, say (x,y).
For example, the flag (x',y') has the following relation to (x,y):
"The point of (x',y') lies on the line of (x,y), and no more."

By "no more" I mean that no further incidence relations hold.
There's a lot more to say about this, and we'll need to delve into it much deeper soon... but not yet. For now, I just want to mention that all this stuff generalizes from G = PGL(3) to any other simple Lie group! And, the picture above is an example of a general concept, called an "apartment". Apartments are a great way to visualize atomic invariant relations between flags.
This "apartment" business is part of a wonderful theory due to Jacques Tits, called the theory of "buildings". The space of all flags is a building; a building has lots of apartments in it. Buildings have a reputation for being scary, because in his final polished treatment, Tits started with a few rather unintuitive axioms and derived everything from these. But, they're actually lots of fun if you draw enough pictures!
Next, let me explain why people call atomic invariant relations "double cosets".
First of all, what's a relation between two sets X and Y? We can think of it as a subset S of X × Y: we say a pair (x,y) is in S if the relation holds.
Next, suppose some group G acts on both X and Y. What's an "invariant" relation? It's a subset S of X × Y such that whenever (x,y) is in S, so is (gx,gy). In other words, the relation is preserved by the symmetries.
Now let's take these simple ideas and make them sound more complicated, to prove we're mathematicians. Some of you may want to take a little nap right around now - I'm just trying to make contact with the usual way experts talk about this stuff.
First, let's use an equivalent but more technical way to think of an invariant relation: it's a subset of the quotient space G\(X × Y).
Note: often I'd call this quotient space (X × Y)/G. But now I'm writing it with the G on the left side, since we had a left action of G on X and Y, hence on X × Y - and in a minute we're gonna need all the sides we can get!
Second, recall from the previous episode that if G acts transitively on both X and Y, we have isomorphisms
X ≅ G/H
and
Y ≅ G/K
for certain subgroups H and K of G. Note: here we're really modding out by the right action of H or K on G.
Combining these facts, we see that when G acts transitively on both X and Y, an invariant relation is just a subset of
G\(X × Y) ≅ G\(G/H × G/K)
Finally, if you lock yourself in a cellar and think about this for a few minutes (or months), you'll realize that this weird-looking set is isomorphic to
H\G/K
This notation may freak you out at first - I know it scared me! The point is that we can take G, mod out by the right action of K to get G/K, and then mod out by the left action of H on G/K, obtaining
H\(G/K).
Or we can take G, mod out by the left action of H to get H\G, and then mod out by the right action of K on H\G, obtaining
(H\G)/K.
And, these two things are isomorphic! So, we relax and write
H\G/K
A point in here is called a "double coset": it's an equivalence class consisting of all guys in G of the form
hgk
for some fixed g, where h ranges over H and k ranges over K.
Since subsets of H\G/K are invariant relations, we can think of a point in H\G/K as an "atomic" invariant relation. Every invariant relation is the union - the logical "or" - of a bunch of these.
So, just as any hunk of ordinary matter can be broken down into atoms, every invariant statement you can make about an entity of type X and an entity of type Y can be broken down into "atomic" invariant relations - also known as double cosets!
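If you want to see double cosets with your own eyes, a tiny computation suffices. Here's a Python sketch (my own example, not from the Tale), taking G = S3 and H = K = the 2-element subgroup generated by a transposition, and listing the double cosets HgK:

```python
from itertools import permutations

# Work in G = S3, with elements written as tuples: g maps i to g[i].
G = list(permutations(range(3)))

def mul(p, q):
    # Composition of permutations: (p*q)(i) = p(q(i)).
    return tuple(p[q[i]] for i in range(3))

# H = K = the subgroup generated by the transposition (0 1).
H = K = [(0, 1, 2), (1, 0, 2)]

# The double coset HgK is the set {h*g*k : h in H, k in K}.
double_cosets = {frozenset(mul(mul(h, g), k) for h in H for k in K)
                 for g in G}

# The double cosets partition G - here into one of size 2 and one of size 4.
print(sorted(len(c) for c in double_cosets))   # [2, 4]
```

Note that, unlike ordinary cosets, double cosets can have different sizes: here {e, (01)} has 2 elements while the other double coset has 4.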
So, double cosets are cool. But, it's good to fit them into the "spans of groupoids" perspective. When we do this, we'll see:
This relies on the simpler slogan I mentioned last time:
Let's see how it goes. Suppose we have two sets on which G acts transitively, say X and Y. Pick a figure x of type X, and a figure y of type Y. Let H be the stabilizer of x, and let K be the stabilizer of y. Then we get isomorphisms
X ≅ G/H
and
Y ≅ G/K
The subgroup H ∩ K stabilizes both x and y, and
Z = G/(H ∩ K)
is another set on which G acts transitively. How can we think of this set? It's the set of all pairs of figures, one of type X and one of type Y, which are obtained by taking the pair (x,y) and applying an element of G. So, it's a subset of X × Y that's invariant under the action of G. In other words, it's an invariant relation between X and Y!
Furthermore, it's the smallest invariant subset of X × Y that contains the pair (x,y). So, it's an atomic invariant relation - or in other words, a double coset!
Now, let's see how to get a span of groupoids out of this. We have a commutative diamond of group inclusions:
        H∩K
        / \
       /   \
      v     v
     H       K
      \     /
       \   /
        v v
         G

This gives a commutative diamond of spaces on which G acts transitively:
      G/(H∩K)
        / \
       /   \
      v     v
    G/H     G/K
      \     /
       \   /
        v v
        G/G

We already have names for three of these spaces - and G/G is just a single point, say *:
         Z
        / \
       /   \
      v     v
     X       Y
      \     /
       \   /
        v v
         *

Now, in Episode 3 I explained how you could form the "action groupoid" X//G given a group G acting on a space X. If I were maniacally consistent, I would write it as G\\X, since G is acting on the left. But, I'm not. So, the above commutative diamond gives a commutative diamond of groupoids:
        Z//G
        / \
       /   \
      v     v
   X//G     Y//G
      \     /
       \   /
        v v
       *//G

The groupoid on the bottom has one object, and one morphism for each element of G. So, it's just G! So we have this:
        Z//G
        / \
       /   \
      v     v
   X//G     Y//G
      \     /
       \   /
        v v
         G

So - voila! - our double coset indeed gives a span of groupoids
        Z//G
        / \
       /   \
      v     v
   X//G     Y//G

X//G is the groupoid of figures just like x (up to symmetry), Y//G is the groupoid of figures just like y, and Z//G is the groupoid of pairs of figures satisfying the same atomic invariant relation as the pair (x,y). For example, point-line pairs, where the point lies on the line! For us, a pair of figures is just a more complicated sort of figure.
But, this span of groupoids is a span "over G", meaning it's part of a commutative diamond with G at the bottom:
        Z//G
        / \
       /   \
      v     v
   X//G     Y//G
      \     /
       \   /
        v v
         G

If you remember everything in "Episode 3" - and I bet you don't - you'll notice that this commutative diamond is equivalent to the diamond we started with:
        H∩K
        / \
       /   \
      v     v
     H       K
      \     /
       \   /
        v v
         G

We've just gone around in a loop! But that's okay, because we've learned something en route.
To tersely summarize what we've learned, let's use the fact that a groupoid is equivalent to a group precisely when it's "connected": that is, all its objects are isomorphic. Furthermore, a functor between connected groupoids is equivalent to an inclusion of groups precisely when it's "faithful": one-to-one on each homset. So, when I said that:
what I really meant was:
         C
        / \
       /   \
      v     v
     A       B
      \     /
       \   /
        v v
         G

where A,B,C are connected groupoids and the arrows are faithful functors.
This sounds complicated, but it's mainly because we're trying to toss in extra conditions to make our concepts match the old-fashioned "double coset" notion. Here's a simpler, more general fact:
where a "G-set" is a set on which G acts. This is the natural partner of the slogan I explained in the last episode, though not in this language:
Things get even simpler if we drop the "faithfulness" assumption, and simply work with groupoids over G, and spans of these. This takes us out of the traditional realm of group actions on sets, and into the 21st century! And that's where we want to go.
Indeed, for the last couple episodes I've just been trying to lay out the historical context for the Tale of Groupoidification, so experts can see how the stuff to come relates to stuff that's already known. In some ways things will get simpler when I stop doing this and march ahead. But, I'll often be tempted to talk about group actions on sets, and double cosets, and other traditional gadgets... so I feel obliged to set the stage.
There is a lot more we could say about this. But, our most urgent goal is to see how spans of groupoids act like twice categorified matrices - matrices whose entries are not just numbers, and not just sets, but groupoids! This will expose the secret combinatorial underpinnings of a lot of fancy linear algebra. Once we've categorified linear algebra this way, we'll be in a great position to tackle fashionable topics like categorified quantum groups, invariants of higher-dimensional knots, and the like.
But, we should restrain ourselves from charging ahead too fast! Everything will hang together better if we lay the groundwork properly. For this, it pays to re-examine the history of mathematics a bit. If we're trying to understand linear algebra using groupoids, it pays to ask: how did people connect linear algebra and group theory in the first place?
This book is very helpful:
9) Charles W. Curtis, Pioneers of Representation Theory: Frobenius, Burnside, Schur and Brauer, History of Mathematics vol. 15, AMS, Providence, Rhode Island, 1999.
Back in 1897, a mathematician named William Burnside wrote the first book in English on finite groups. It was called Theory of Groups of Finite Order.
In the preface, Burnside explained why he studied finite groups by letting them act as permutations of sets, but not as linear transformations of vector spaces:
Cayley's dictum that "a group is defined by means of the laws of combination of its symbols" would imply that, in dealing with the theory of groups, no more concrete mode of representation should be used than is absolutely necessary. It may then be asked why, in a book that professes to leave all applications to one side, a considerable space is devoted to substitution groups [permutation groups], but other particular modes of representation, such as groups of linear transformations, are not even referred to. My answer to this question is that while, in the present state of our knowledge, many results in the pure theory are arrived at most readily by dealing with properties of substitution groups, it would be difficult to find a result that could most directly be obtained by the consideration of groups of linear transformations.
In short, he didn't see the point of representing groups on vector spaces - at least as a tool in the "pure" theory of finite groups, as opposed to their applications.
However, within months after this book was published, he discovered the work of Georg Frobenius, who used linear algebra very effectively to study groups!
So, Burnside started using linear algebra in his own work on finite groups, and by the time he wrote the second edition of his book in 1911, he'd changed his tune completely:
Very considerable advances in the theory of groups of finite order have been made since the appearance of the first edition of this book. In particular the theory of groups of linear substitutions has been the subject of numerous and important investigations by several writers; and the reason given in the original preface for omitting any account of it no longer holds good. In fact it is now more true to say that for further advances in the abstract theory one must look largely to the representation of a group as a group of linear transformations.

It's interesting to see exactly how representing finite groups on vector spaces lets us understand them better. By now almost everyone agrees this is true. But how much of the detailed machinery of linear algebra is really needed? How much could we do purely combinatorially, using just spans of groupoids?
I don't really know the full answer to this question. But, it quickly leads us into the fascinating theory of "Hecke operators", which will play a big role in the Tale of Groupoidification. So, let's pursue it a bit.
Suppose we have two guys, William and Georg, who are studying a finite group G.
William says, "I'm studying how G acts on sets."
Georg replies, "I'm studying how it acts on complex vector spaces, as linear transformations. Mere sets are feeble entities. I can do anything you can do - but I have the tools of linear algebra to help me!"
William says, "But, you're paying a moral price. You're getting the complex numbers - a complicated infinite thing - involved in what should be a completely finitary and combinatorial subject: the study of a finite group. Is this really necessary?"
Georg replies, "I don't know, but it's sure nice. For example, suppose I have G acting on a vector space V. Then I can always break down V into a direct sum of subspaces preserved by G, which can't themselves be broken down any further. In technical terms: every representation of G is a direct sum of irreducible representations. And, this decomposition is unique! It's very nice: it's a lot like how every positive integer has a unique prime factorization."
William says, "Yes, it's always fun when we can break things down into 'atoms' that can't be further decomposed. It's very satisfying to our reductionist instincts. But, I can do even better than you!"
Georg raises an eyebrow. "Oh?"
"Yeah," William says. "Suppose I have our group G acting on a set S. Then I can always break down S into a disjoint union of subsets preserved by G, which can't themselves be broken down any further. In technical terms: every action of G is a disjoint union of transitive actions. And, this decomposition is unique!"
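William's decomposition really is easy to carry out. Here's a quick Python sketch (an illustration I'm adding, not part of the dialogue) that splits a G-set into its transitive pieces, taking S3 acting on the 8 subsets of a 3-element set:

```python
from itertools import permutations, combinations

# G = S3 acting on the set S of all subsets of {0, 1, 2}.
G = list(permutations(range(3)))
S = [frozenset(c) for r in range(4) for c in combinations(range(3), r)]

def act(g, s):
    # g acts on a subset elementwise.
    return frozenset(g[i] for i in s)

# Decompose S into orbits: the unique splitting into transitive pieces.
orbits, seen = [], set()
for s in S:
    if s not in seen:
        orbit = {act(g, s) for g in G}
        orbits.append(orbit)
        seen |= orbit

# Orbit sizes: the empty set, the full set, the singletons, the 2-element
# subsets - four transitive pieces in all.
print(sorted(len(o) for o in orbits))   # [1, 1, 3, 3]
```

The decomposition is canonical: each element of S lies in exactly one orbit, so there's no choice to make - which is exactly the point William presses next.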
Embarrassed, Georg replies, "Oh, right - we heard that back in Episode 3. I wasn't paying enough attention. But how is what you're doing better than what I'm doing? It sounds just the same."
William hesitates. "Well, first of all, a rather minor point, which I can't resist mentioning ... when you said your decomposition of representations into irreducibles was unique, you were exaggerating a little. It's just unique up to isomorphism, and not a canonical isomorphism either.
For example, if you have an irreducible representation of G on V, there are lots of different ways to slice the direct sum V + V into two copies of the representation V. It's a sort of floppy business. On the other hand, when I have a transitive action of G on S, there's exactly one way to chop the disjoint union S + S into two copies of the G-set S. I just look at the two orbits."
Georg says, "Hmm. This is actually a rather delicate point. There's not really a canonical isomorphism in your case either, since S may be isomorphic to itself in lots of ways, even as a G-set. There's something in what you say, but it's a bit hard to make precise, and it's certainly not enough to worry me."
"Okay, but here's my more serious point. Given that I can break apart any set on which G acts into 'atoms', my job is to classify those atoms: the transitive G-sets. And there's a very nice classification! Any subgroup H of G gives a transitive G-set, namely G/H, and all transitive G-sets look like this. More precisely: isomorphism classes of transitive G-sets correspond to conjugacy classes of subgroups of G.
Even better, this has a nice meaning in terms of Klein geometry. Any type of figure in Klein geometry gives a transitive G-set, with H being the stabilizer of a chosen figure of that type.
You, on the other hand, need to classify irreducible representations of G. And this is not so conceptual. What do these irreducible representations mean in terms of the group G?"
Georg replies, "Well, there's one irreducible representation for each conjugacy class in G..."
At this, William pounces. "You mean the number of isomorphism classes of irreducible representations of G equals the number of conjugacy classes in G! But as you know full well, there's no god-given correspondence. You can't just take a conjugacy class in G and cook up an irreducible representation, or irrep. So, you've just made my point. You've shown how mysterious these irreps actually are!"
Georg replies, "Actually in some cases there is a nice way to read off irreps from conjugacy classes. For example, you can do it for the symmetric groups S_{n}. But, I admit you can't in general... or at least, I don't know how."
William laughs, "So, I win!"
"Not at all!" replies Georg. "First, there are lots of theorems about groups that I can prove using representations, which you can't prove using actions on sets. For example, nobody knows how to prove that every group with an odd number of elements is solvable without using the tools of linear algebra."
William nods. "I admit that linear algebra is very practical. But just give us time! I proved back in 1904 that every group of size p^{a}q^{b} is solvable if p and q are prime. To do it, I broke down and used linear algebra. But then, in 1972, Helmut Bender found a proof that avoids linear algebra."
Georg said, "Okay, struggle on then. So far, without using linear algebra, nobody can even prove my famous theorem on 'Frobenius groups'. The details don't matter here: the point is, this is a result on group actions, which seems to need linear algebra for its proof.
But if practicality won't sway you, maybe this conceptual argument will. My atoms are more fine-grained than yours!"
"What do you mean?" asks William.
"You can decompose any action of G into 'atoms', namely transitive G-sets. Similarly, I can decompose any representation of G into one of my 'atoms', namely irreps. But, there's an obvious way to turn G-sets into representations of G, and if we do this to one of your atoms, we don't get one of my atoms! We can usually break it down further! So, my atoms are smaller than yours."
"How does this work, exactly?"
"It's easy," says Georg, getting a bit cocky. "Say you have a group G acting on a set S. Then I can form the vector space C[S] whose elements are formal linear combinations of elements of S. In other words, it's the vector space having S as basis. If we're feeling sloppy we can think of guys in C[S] as functions on S that vanish except at finitely many points. It's better to think of them as measures on S. But anyway: since G acts on S, it acts linearly on C[S]!
So, any G-set gives a representation of G. But, even when G acts transitively on S, its representation on C[S] is hardly ever irreducible."
William glowers. "Oh yeah?"
"Yeah. Suppose for example that S is finite. Then the constant functions on S form a 1-dimensional subspace of C[S] that's invariant under the action of G. So, at the very least, we can break C[S] into two pieces."
"Well," replies William defensively, "That's pretty obvious. But it's also not such a big deal. So you can break up any of my transitive G-sets into two of your irreps, one being the 'trivial' irrep. So what???"
"It wouldn't be a big deal if that's all that ever happened," says Georg. "In fact, we can break C[S] into precisely two irreps whenever the action of G on S is 'doubly transitive' - meaning we can send any pair of distinct elements of S to any other using some element of G. But, lots of transitive actions aren't doubly transitive! And usually, one of your atoms breaks down into a bunch of my atoms. In fact I'd like to show you how this works, in detail, for the symmetric groups."
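Georg's claim about doubly transitive actions can be checked numerically using a standard fact from character theory: if χ(g) counts the fixed points of g acting on S, the average of χ(g)² over G equals the number of G-orbits on S × S, which is also the sum of the squared multiplicities of the irreps inside C[S]. Here's a small Python check (my addition) for S3 acting on 3 points, where the answer 2 means C[S] is the trivial irrep plus exactly one other irrep:

```python
from itertools import permutations

# G = S3 acting on S = {0, 1, 2}: a doubly transitive action.
G = list(permutations(range(3)))

# Permutation character: chi(g) = number of fixed points of g.
def chi(g):
    return sum(1 for i in range(3) if g[i] == i)

# <chi, chi> = (1/|G|) * sum of chi(g)^2.  This equals the number of
# G-orbits on S x S, and also the sum of squared multiplicities of the
# irreducible constituents of C[S].  The value 2 means: trivial irrep
# plus exactly one other irrep, each appearing once.
inner = sum(chi(g) ** 2 for g in G) / len(G)
print(inner)   # 2.0
```

For an action that's transitive but not doubly transitive, the same computation gives a number bigger than 2, so C[S] has more than two irreducible pieces - which is Georg's point.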
"Maybe next time," says William. "But, I see your point. Your atoms are more atomic than my atoms."
Georg seems to have won the argument. But, William wouldn't have conceded the point quite so fast, if he'd thought about invariant relations!
The point is this. Suppose we have two G-sets, say X and Y. Any G-set map from X to Y gives an intertwining operator from C[X] to C[Y]. But, even after taking linear combinations, there are typically plenty of intertwining operators that don't arise this way. It's these extra intertwining operators that let us chop representations into smaller atoms.
But where do these extra intertwining operators come from? They come from invariant relations between X and Y!
And, what are these extra intertwining operators called? In some famous special cases, like the study of modular forms, they're called "Hecke operators". In some other famous special cases, like the study of symmetric groups, they form algebras called "Hecke algebras".
A lot of people don't even know that Hecke operators and Hecke algebras are two faces of the same idea: getting intertwining operators from invariant relations. But, we'll see this is true, once we look at some examples.
I think I'll save those for future episodes. But if you've followed the Tale so far, you can probably stand a few extra hints of where we're going. Recall from Episode 4 that invariant relations between G-sets are just spans of groupoids equipped with some extra stuff. So, invariant relations between G-sets are just a warmup for the more general, and simpler, theory of spans of groupoids. I said back in Episode 2 that spans of groupoids give linear operators. What I'm trying to say now is that these linear operators are a massive generalization - but also a simplification - of what people call "Hecke operators".
Finally, for students trying to learn a little basic category theory, I'd like to cast the argument between William and Georg into the language of categories, just to help you practice your vocabulary.
A G-set is the same as a functor
A: G → Set
where we think of G as a 1-object category. There's a category of G-sets, namely
hom(G,Set)
This has functors A: G → Set as objects, and natural transformations between these as morphisms. Usually the objects are called "G-sets", and the morphisms are called "maps of G-sets".
We can also play this whole game with the category of vector spaces replacing the category of sets. A representation of G is the same as a functor
A: G → Vect
As before, there's a category of such things, namely
hom(G,Vect)
This has functors A: G → Vect as objects, and natural transformations between these as morphisms. Now the objects are called "representations of G" and the morphisms are called "intertwining operators".
We could let any groupoid take the place of the group G. We could also let any other category take the place of Set or Vect.
Part of what William and Georg were debating was: how similar are hom(G,Set) and hom(G,Vect)? How are they related?
First of all, there's a functor
F: Set → Vect
sending each set S to the vector space C[S] with that set as basis. So, given an action of G on a set:
A: G → Set
we can compose it with F and get a representation of G:
FA: G → Vect
This kind of representation is called a "permutation representation". And, this trick gives a functor from G-sets to representations of G:
hom(G,Set) → hom(G,Vect)
         A |-> FA

If this functor were an equivalence of categories, it would have to be essentially surjective, full and faithful. But, not every representation of G is isomorphic to a permutation representation! In other words, the functor
hom(G,Set) → hom(G,Vect)
is not "essentially surjective".
Moreover, not every intertwining operator between permutation representations comes from a map between their underlying G-sets! In other words, the functor
hom(G,Set) → hom(G,Vect)
is not "full".
But, given two different maps from one G-set to another, they give different intertwining operators. So, at least our functor is "faithful".
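Here's a concrete witness to non-fullness, sketched in Python (my example): for G = S3 acting on S = {0,1,2}, the all-ones matrix J intertwines the permutation representation with itself, but the only map of G-sets S → S is the identity, whose matrix is certainly not J.

```python
from itertools import permutations, product

# G = S3 acting on S = {0, 1, 2}; matrices as nested lists.
n = 3
G = list(permutations(range(n)))

def perm_matrix(g):
    # Matrix of the linear map sending basis vector e_j to e_{g(j)}.
    return [[1 if g[j] == i else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# The all-ones matrix J commutes with every permutation matrix, so it is
# an intertwining operator from C[S] to itself.
J = [[1] * n for _ in range(n)]
intertwines = all(matmul(perm_matrix(g), J) == matmul(J, perm_matrix(g))
                  for g in G)

# But the only G-set map f: S -> S (i.e. f(g.s) = g.f(s) for all g) is
# the identity - so J doesn't come from any map of G-sets.
gset_maps = [f for f in product(range(n), repeat=n)
             if all(f[g[s]] == g[f[s]] for g in G for s in range(n))]

print(intertwines, gset_maps)   # True [(0, 1, 2)]
```

The extra intertwiner J is exactly the kind of operator the upcoming "Hecke operator" story accounts for: it comes from the invariant relation "x ≠ y or x = y" - the all of S × S - rather than from a function.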
Maps of G-sets are a special case of invariant relations. So, to get a category that more closely resembles hom(G,Vect), while remaining purely combinatorial, we can replace hom(G,Set) by the category with G-sets as objects and invariant binary relations as morphisms. This is the basic idea of "Hecke operators".
Or, even better, we can try a weak 2-category, with
This is where groupoidification comes into its own.
I want to work my way to the concept of "Hecke operator" through a series of examples. The examples I'll use are a bit trickier than the concept I'm really interested in, since the examples involve integrals, where the Hecke operators I ultimately want to discuss involve sums. But, the examples are nice if you like to visualize stuff...
In these examples we'll always have a relation between two sets X and Y. We'll use this to get an operator that turns functions on X into functions on Y - a "Hecke operator".
Suppose you're trying to do a CAT scan. You want to obtain a 3d image of someone's innards. Unfortunately, all you do is take lots of 2d X-ray photos of them. How can you assemble all this information into the picture you want?
Who better to help you out than a guy named after a radioactive gas: Radon!
In 1917, the Viennese mathematician Johann Radon tackled a related problem one dimension down. You could call it a "CAT scan for flatlanders".
Suppose you want to obtain a complete image of the insides of a 2-dimensional person, but all you can do is shine beams of X-rays through them and see how much each beam is attenuated.
So, mathematically: you have a real-valued function on the plane - roughly speaking, the density of your flatlander. You're trying to recover this function from its integrals along all possible lines. Someone hands you this function on the space of lines, and you're trying to figure out the original function on the space of points.
(Points lying on lines! If you've been following the Tale of Groupoidification, you'll know this "incidence relation" is connected to Klein's approach to geometry, and ultimately to spans of groupoids. But pretend you don't notice, yet.)
Now, it's premature to worry about this tricky "inverse problem" before we ponder what it's the inverse of: the "Radon transform". This takes our original function on the space of points and gives a function on the space of lines.
Let's call the Radon transform T. It takes a function f on the space of points and gives a function Tf on the space of lines, as follows. Given a line y, (Tf)(y) is the integral of f(x) over the set of all points x lying on y.
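To make this concrete without worrying about measures or convergence, here's a crudely discretized caricature in Python (entirely my own toy, not Radon's actual setup): "points" are cells of a 3×3 grid, and we keep only a handful of "lines" - the rows, columns, and two main diagonals - with the integral replaced by a sum.

```python
# A crude, discretized caricature of the 2d Radon transform.
n = 3
points = [(i, j) for i in range(n) for j in range(n)]

# Only a few "lines": rows, columns, and the two main diagonals.
lines = (
    [[(i, j) for j in range(n)] for i in range(n)]      # rows
    + [[(i, j) for i in range(n)] for j in range(n)]    # columns
    + [[(i, i) for i in range(n)],                      # main diagonal
       [(i, n - 1 - i) for i in range(n)]]              # anti-diagonal
)

# f: the "density" of our flatlander - a single unit blob at the center.
f = {p: 0.0 for p in points}
f[(1, 1)] = 1.0

# The Radon transform: (Tf)(line) = sum of f over the points on the line.
Tf = [sum(f[p] for p in line) for line in lines]
print(Tf)   # [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0]
```

The output reads off which lines pass through the blob: the middle row, middle column, and both diagonals. Inverting this - recovering f from Tf - is the tricky part Radon solved in the continuous case.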
What Radon did is figure out a nice formula for the inverse of this transform. But that's not what I'm mainly interested in now. It's the Radon transform itself that's a kind of Hecke operator!
Next, let's look at another example.
This is an obvious generalization to higher dimensions of what I just described. Before we had a space
X = {points in the plane}
and a space
Y = {lines in the plane}
and an incidence relation
S = {(x,y): x is a point lying on the line y}
If we go to n dimensions, we can replace all this with
X = {points in R^{n}}
Y = {lines in R^{n}}
S = {(x,y): x is a point lying on the line y}
Again, the X-ray transform takes a function f on the space of points and gives a function Tf on the space of lines. Given a line y, (Tf)(y) is the integral of f(x) over the set of all x with (x,y) in S.
Next, yet another example!
Radon actually considered a different generalization of the 2d Radon transform, using hyperplanes instead of lines. Using hyperplanes is nicer, because it gives a very simple relationship between the Radon transform and the Fourier transform. But never mind - that's not the point here! The point is how similar everything is. Now we take:
X = {points in R^{n}}
Y = {hyperplanes in R^{n}}
S = {(x,y): x is a point lying on the hyperplane y}
And again, the Radon transform takes a function f on X and gives a function Tf on Y. Given y in Y, (Tf)(y) is the integral of f(x) over the set of all x with (x,y) in S.
We're always doing the same thing here. Now I'll describe the general pattern a bit more abstractly.
We've always got three spaces, and maps that look like this:
        S
       / \
      /   \
    P/     \Q
    /       \
   v         v
  X           Y

In our examples so far these maps are given by
P(x,y) = x
Q(x,y) = y
But, they don't need to be.
Now, how do we get a linear operator in this situation?
Easy! We start with a real-valued function on our space X:
f: X → R
Then we take f and "pull it back along P" to get a function on S. "Pulling back along P" is just impressive jargon for composing with P:
f o P: S → R
Next, we take f o P and "push it forwards along Q" to get a function on Y. The result is our final answer, some function
Tf: Y → R
"Pushing forwards along Q" is just impressive jargon for integrating: Tf(y) is the integral of (f o P)(s) over all s in S with Q(s) = y. For this we need a suitable measure, and we need the integral to converge.
This is the basic idea: we define an operator T by pulling back and then pushing forward functions along a "span", meaning a diagram shaped like a bridge:
        S
       / \
      /   \
    P/     \Q
    /       \
   v         v
  X           Y

But, the reason this operator counts as a "Hecke operator" is that it gets along with a symmetry group G that's acting on everything in sight. In the examples so far, this is the group of Euclidean symmetries of R^{n}. But, it could be anything.
This group G acts on all 3 spaces: X, Y, and S. This makes the space of functions on X into a representation of G! And, ditto for the spaces of functions on Y and S.
Furthermore, the maps P and Q are "equivariant", meaning
P(gs) = gP(s)
and
Q(gs) = gQ(s)
This makes "pulling back along P" into an intertwining operator between representations of G. "Pushing forwards along Q" will also be an intertwining operator if the measure we use is G-invariant in a suitable sense. In this case, our transform T becomes an intertwining operator between group representations! Let's call an intertwining operator constructed this way a "Hecke operator".
If you're a nitpicky person, e.g. a mathematician, you may wonder what I mean by "a suitable sense". Well, each "fiber" Q^{-1}(y) of the map
Q: S → Y
needs a measure on it, so we can take a function on S and integrate it over these fibers to get a function on Y. We need this family of measures to be invariant under the action of G, for pushing forwards along Q to be an intertwining operator.
This stuff about invariant families of measures is mildly annoying, and so is the analysis involved in making precise which class of functions on X we can pull back to S and then push forward to Y - we need to make sure the integrals converge, and so on. When I really get rolling on this Hecke operator business, I'll often focus on cases where X, Y, and S are finite sets... and then these issues go away!
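In the finite case the intertwining property is also easy to check by direct computation. Here's a toy check (my own example, not from the text): X = Y = Z/3, the relation S relates each x to x and x+1, and G = Z/3 acts by translation. With counting measure on the fibers, T commutes with the G-action.

```python
# Verify that the pull-push operator T intertwines the Z/3 action:
# T(g.f) = g.(Tf) for every g, where (g.f)(x) = f(x - g).

N = 3
S = [(x, y) for x in range(N) for y in (x % N, (x + 1) % N)]

def T(f):
    """Tf(y) = sum of f(x) over all (x, y) in S."""
    Tf = [0] * N
    for (x, y) in S:
        Tf[y] += f[x]
    return Tf

def act(g, f):
    """The translation action of Z/3 on functions."""
    return [f[(x - g) % N] for x in range(N)]

f = [1, 2, 5]
for g in range(N):
    assert T(act(g, f)) == act(g, T(f)), "T fails to intertwine"
print("T commutes with the Z/3 action")
```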
Hmm. I'm getting tired, but I can't quit until I say one more thing. If you try to read about Hecke operators, you won't see anything about the examples I just mentioned! You're most likely to see examples where X and Y are spaces of lattices in the complex plane. This is the classic example, which we're trying to generalize. But, this example is more sophisticated than the ones I've mentioned, in that the "functions" on X and Y become "sections of vector bundles" over X and Y. The same sort of twist happens when we go from the Radon transform to the more general "Penrose transform".
Anyway, next time I'll talk about some really easy examples, where X, Y, and S are finite sets and G is a finite group. These give certain algebras of Hecke operators, called "Hecke algebras".
In the meantime, see if you can find any reference in the literature which admits that "Hecke algebras" are related to "Hecke operators". It ain't easy!
It's a great example of a mathematical cover-up - one we're gonna bust wide open.
So, this time I'll fill in some details about "degroupoidification" - the process that sends groupoids to vector spaces and spans of groupoids to linear operators.
How does this work? For starters, each groupoid X gives a vector space [X] whose basis consists of isomorphism classes of objects of X.
Given an object x in X, let's write its isomorphism class as [x]. So: x in X gives [x] in [X].
Next, each span of groupoids
        S
       / \
      /   \
     /     \
    v       v
   X         Y

gives a linear operator
[S]: [X] → [Y]
Note: this operator [S] depends on the whole span, not just the groupoid S sitting on top. So, I'm abusing notation here.
More importantly: how do we get this operator [S]? The recipe is simple, but I think you'll profit much more by seeing where the recipe comes from.
To figure out how it should work, we insist that degroupoidification be something like a functor. In other words, it should get along well with composition:
[TS] = [T] [S]
and identities:
[1_{X}] = 1_{[X]}
(Warning: today, just to confuse you, I'll write composition in the old-fashioned backwards way, where doing S and then T is denoted TS.)
How do we compose spans of groupoids? We do it using a "weak pullback". In other words, given a composable pair of spans:
          S             T
         / \           / \
        /   \         /   \
      f/     \g     h/     \i
      /       \     /       \
     v         v   v         v
    X           Y             Z

we form the weak pullback in the middle, like this:
               TS
              /  \
            j/    \k
            /      \
           v        v
          S          T
         / \        / \
       f/   \g    h/   \i
       /     \    /     \
      v       v  v       v
     X         Y          Z

Then, we compose the arrows along the sides to get a big span from X to Z:
               TS
              /  \
             /    \
            /      \
         fj/        \ik
           /          \
          /            \
         v              v
        X                Z

Never heard of "weak pullbacks"? Okay: I'll tell you what an object in the weak pullback TS is. It's an object t in T and an object s in S, together with an isomorphism between their images in Y. If we were doing the ordinary pullback, we'd demand that these images be equal. But that would be evil! Since t and s are living in groupoids, we should only demand that their images be isomorphic in a specified way.
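Here's a sketch of this "isomorphic, not equal" idea in code, in a deliberately simplified setting of my own devising: take Y to be a groupoid where two objects are isomorphic by a unique isomorphism exactly when they share an "iso class" label. Then an object of the weak pullback is a pair (s, t) whose images in Y are merely isomorphic.

```python
# Objects of a (simplified) weak pullback of g: S -> Y and h: T -> Y,
# where isomorphism in Y is encoded by a class-labelling function.

def weak_pullback_objects(S, g, T, h, iso_class):
    """Pairs (s, t) whose images in Y are isomorphic, i.e. share an
    iso class -- not necessarily equal."""
    return [(s, t) for s in S for t in T
            if iso_class(g(s)) == iso_class(h(t))]

# Y has objects y1 ~ y2 (isomorphic) and y3 (in its own class):
iso_class = {'y1': 0, 'y2': 0, 'y3': 1}.get
S, g = ['s1', 's2'], {'s1': 'y1', 's2': 'y3'}.get
T, h = ['t1'], {'t1': 'y2'}.get
print(weak_pullback_objects(S, g, T, h, iso_class))
# ('s1', 't1') appears even though g(s1) = y1 and h(t1) = y2 are only
# isomorphic, not equal -- the ordinary pullback would be empty here.
```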
(Exercise: figure out the morphisms in the weak pullback. Figure out and prove the universal property of the weak pullback.)
So, how should we take a span of groupoids
        S
       / \
      /   \
     /     \
    v       v
   X         Y

and turn it into a linear operator
[S]: [X] → [Y] ?
We just need to know what this operator does to a bunch of vectors in [X]. How do we describe vectors in [X]?
I already said how to get a basis vector [x] in [X] from any object x in X. But, that's not enough for what we're doing now, since a linear operator doesn't usually send basis vectors to basis vectors. So, we need to generalize this idea.
An object x in X is the same as a functor from 1 to X:
   1
   |
   |p
   |
   v
   X

where 1 is the groupoid with one object and one morphism. So, let's generalize this and figure out how any functor from any finite groupoid V to X:
   V
   |
   |p
   |
   v
   X

picks out a vector in [X]. Again, by abuse of notation we'll call this vector [V], even though it also depends on p.
First suppose V is a finite set, thought of as a groupoid with only identity morphisms. Then to define [V], we just go through all the points of V, figure out what p maps them to - some bunch of objects x in X - and add up the corresponding basis vectors [x] in [X].
I hope you see how pathetically simple this idea is! It's especially familiar when V and X are both sets. Here's what it looks like then:
     ----------------------
     |     o              |
     |     o  o           |
   V |     o  o  o     o  |
     |  o  o  o  o     o  |
     ----------------------
               |
               |p
               |
               v
     ----------------------
   X |  o  o  o  o  o  o  |
     ----------------------

I've drawn the elements of V and X as little circles, and shown how each element in X has a bunch of elements of V sitting over it. When we degroupoidify this to get a vector in the vector space [X], we get:
[V] = (1, 4, 3, 2, 0, 2)

This vector is just a list of numbers saying how many points of V are sitting over each point of X!
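This fiber-counting step is one line of real content. Here's a sketch for a map of finite sets p: V → X with fibers of sizes 1, 4, 3, 2, 0, 2, matching the vector above:

```python
# Degroupoidify a map of finite sets p: V -> X: the vector [V] lists,
# for each point of X, how many points of V sit over it.

def degroupoidify(V, p, X):
    counts = {x: 0 for x in X}
    for v in V:
        counts[p(v)] += 1
    return [counts[x] for x in X]

# Build a 12-element set V over a 6-element set X with fibers of
# sizes 1, 4, 3, 2, 0, 2:
X = list(range(6))
fibers = [1, 4, 3, 2, 0, 2]
V = [(x, i) for x in X for i in range(fibers[x])]
print(degroupoidify(V, lambda v: v[0], X))  # [1, 4, 3, 2, 0, 2]
```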
Now we just need to generalize a bit further, to cover the case where V is a groupoid:
   V
   |
   |p
   |
   v
   X

Sitting over each object x in X we have its "essential preimage", which is a groupoid. To get the vector [V], we add up basis vectors [x] in [X], one for each isomorphism class of objects in X, multiplied by the "cardinalities" of their essential preimages.
Now you probably have two questions:
A) Given a functor p: V → X between groupoids and an object x in X, what's the "essential preimage" of x?
and
B) What's the "cardinality" of a groupoid?
Here are the answers:
A) An object in the essential preimage of x is an object v in V equipped with an isomorphism from p(v) to x.
(Exercise: define the morphisms in the essential preimage. Figure out and prove the universal property of the essential preimage. Hint: the essential preimage is a special case of a weak pullback!)
B) To compute the cardinality of a groupoid, we pick one object from each isomorphism class, count its automorphisms, take the reciprocal of this number, and add these numbers up.
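For the groupoid of finite sets, the n-element sets form one isomorphism class with automorphism group S_n, so this recipe gives the sum of 1/n! over all n. Here's a sketch that computes the partial sums:

```python
# Groupoid cardinality of the groupoid of finite sets: one term
# 1/|Aut| = 1/n! per isomorphism class.  The partial sums converge
# to e = 2.718281828...

from fractions import Fraction
from math import factorial, e

def groupoid_of_finite_sets_cardinality(max_size):
    """Exact partial sum of 1/n! for n = 0, ..., max_size."""
    return sum(Fraction(1, factorial(n)) for n in range(max_size + 1))

approx = float(groupoid_of_finite_sets_cardinality(15))
print(approx)
assert abs(approx - e) < 1e-9
```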
(Exercise: check that the cardinality of the groupoid of finite sets is e = 2.718281828... If you get stuck, read "week147".)
Okay. Now in principle you know how any groupoid over X, say
   V
   |
   |
   |
   v
   X

determines a vector [V] in [X]. You have to work some examples to get a feel for it, but I want to get to the punchline. We're unpeeling an onion here, and we're almost down to the core, where you see there's nothing inside and wonder why you were crying so much.
So, let's finally figure out how a span of groupoids
        S
       / \
      /   \
     /     \
    v       v
   X         Y

gives a linear operator
[S]: [X] → [Y]
It's enough to know what this operator does to vectors coming from groupoids over X:
   V
   |
   |
   |
   v
   X

And, the trick is to notice that such a diagram is the same as a silly span like this:
        V
       / \
      /   \
     /     \
    v       v
   1         X

1 is the groupoid with one object and one morphism, so there's only one choice of the left leg here!
So here's what we do. To apply the operator [S] coming from the span
        S
       / \
      /   \
     /     \
    v       v
   X         Y

to the vector [V] corresponding to the silly span
        V
       / \
      /   \
     /     \
    v       v
   1         X

we just compose these spans, and get a silly span
        SV
       /  \
      /    \
     /      \
    v        v
   1          Y

which picks out a vector [SV] in [Y]. Then, we define
[S] [V] = [SV]
Slick, eh? Of course you need to check that [S] is well-defined.
Given that, it's trivial to prove that [-] gets along with composition of spans:
[TS] = [T] [S]
At least, it's trivial once you know that composition of spans is associative up to equivalence, and equivalent spans give the same operator! But your friendly neighborhood category theorist can check such facts in a jiffy, so let's just take them for granted. Then the proof goes like this. We have:
[TS] [V] = [(TS)V]      by definition
         = [T(SV)]      by those facts I just mentioned
         = [T] [SV]     by definition
         = [T] [S] [V]  by definition

Since this is true for all [V], we conclude
[TS] = [T] [S]
Voilà!
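For spans of finite sets, where the weak pullback reduces to the ordinary pullback, you can check [TS] = [T][S] by brute force. Here's a sketch (my own toy example): the matrix of a span has (y, x) entry counting the elements of S mapping to x on the left and y on the right, and composing spans by pullback multiplies these matrices.

```python
# Verify [TS] = [T][S] for spans of finite sets.

def span_matrix(S, P, Q, X, Y):
    """Matrix of the span X <-P- S -Q-> Y: rows indexed by Y,
    columns by X, entries counting elements of S over each pair."""
    return [[sum(1 for s in S if P(s) == x and Q(s) == y) for x in X]
            for y in Y]

def compose(S, g, T, h):
    """Ordinary pullback of g: S -> Y and h: T -> Y, as pairs."""
    return [(s, t) for s in S for t in T if g(s) == h(t)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

X, Y, Z = [0, 1], ['a', 'b'], ['u']
S = [(0, 'a'), (1, 'a'), (1, 'b')]                     # span X to Y
T = [('a', 'u', 1), ('b', 'u', 2), ('b', 'u', 3)]      # span Y to Z
TS = compose(S, lambda s: s[1], T, lambda t: t[0])
MS = span_matrix(S, lambda s: s[0], lambda s: s[1], X, Y)
MT = span_matrix(T, lambda t: t[0], lambda t: t[1], Y, Z)
MTS = span_matrix(TS, lambda p: p[0][0], lambda p: p[1][1], X, Z)
assert MTS == matmul(MT, MS)   # [TS] = [T][S]
print(MTS)  # [[1, 3]]
```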
By the way, if "week147" doesn't satisfy your hunger for information on groupoid cardinality, try this:
19) John Baez and James Dolan, From finite sets to Feynman diagrams, in Mathematics Unlimited - 2001 and Beyond, vol. 1, eds. Bjorn Engquist and Wilfried Schmid, Springer, Berlin, 2001, pp. 29-50. Also available as math.QA/0004133.
For more on turning spans of groupoids into linear operators, and composing spans via weak pullback, try these:
20) Jeffrey Morton, Categorified algebra and quantum mechanics, TAC 16 (2006), 785-854. Available at http://www.emis.de/journals/TAC/volumes/16/29/16-29abs.html; also available as math.QA/0601458.
21) Simon Byrne, On Groupoids and Stuff, honors thesis, Macquarie University, 2005, available at http://www.maths.mq.edu.au/~street/ByrneHons.pdf and http://math.ucr.edu/home/baez/qg-spring2004/ByrneHons.pdf
For a more leisurely exposition, with a big emphasis on applications to combinatorics and the quantum mechanics of the harmonic oscillator, try:
22) John Baez and Derek Wise, Quantization and Categorification,
lecture notes available at:
http://math.ucr.edu/home/baez/qg-fall2003/
http://math.ucr.edu/home/baez/qg-winter2004/
http://math.ucr.edu/home/baez/qg-spring2004/
Finally, a technical note. Why did I say the degroupoidification process was "something like" a functor? It's because spans of groupoids don't want to be a category!
Already spans of sets don't naturally form a category. They form a weak 2-category! Since pullbacks are only defined up to canonical isomorphism, composition of spans of sets is only associative up to isomorphism... but luckily, this "associator" isomorphism satisfies the "pentagon identity" and all that jazz, so we get a weak 2-category, or bicategory.
Similarly, spans of groupoids form a weak 3-category. Weak pullbacks are only defined up to canonical equivalence, so composition of spans of groupoids is only associative up to equivalence... but luckily this "associator" equivalence satisfies the pentagon identity up to an isomorphism, and this "pentagonator" isomorphism satisfies a coherence law of its own, governed by the 3d Stasheff polytope.
So, we're fairly high in the ladder of n-categories. But, if we want a mere category, we can take groupoids and equivalence classes of spans. Then, degroupoidification gives a functor
[-]: [finite groupoids, spans] → [vector spaces, linear maps]
That's the fact whose proof I tried to sketch here.
While I'm talking about annoying technicalities, note we need some sort of finiteness assumption on our spans of groupoids to be sure all the necessary sums converge. If we go all-out and restrict to spans where all groupoids involved are finite, we'll be very safe. The cardinality of a finite groupoid is a nonnegative rational number, so we can take our vector spaces to be defined over the rational numbers.
But, it's also fun to consider "tame" groupoids, as defined in that paper I wrote with Jim Dolan. These have cardinalities that can be irrational numbers, like e. So, in this case we should use vector spaces over the real numbers - or complex numbers, but that's overkill.
It would be very nice to find a class of groupoids - or other entities - whose cardinalities are complex, to push the whole groupoidification program further into the complex world. In the above paper by Jeff Morton, he uses sets over U(1), but that's probably not the last word.