\chapter*{Introduction}
One of the things that makes mathematics fun is its relation to
physics. It's not surprising that one can build beautiful
self-consistent mathematical structures and prove theorems about
them. What is surprising and mysterious is that some of these
structures are well suited to describing aspects of the world we live
in. We call these aspects the `laws of physics'. Why does the
universe have mathematical laws? Nobody really knows. Lots of people
have thought about this question, but they didn't get very far.
Perhaps it is too soon to answer this question. After all, we don't
even fully know what the laws of physics \emph{are} yet. So we should
probably start by figuring out \emph{what} they are, and then think
more about \emph{why} they exist.
From this, we are inevitably led to quantum gravity. After all,
one of the big problems in figuring out the laws of physics is that
right now there are two sets of laws, general relativity and quantum
theory, which do not seem to get along well. Quantum gravity is an
attempt at reconciling them.
To really understand the latest ideas about quantum gravity one must
first know general relativity and quantum theory. So this course
really should have an introduction explaining these subjects before we
go into quantum gravity. Unfortunately, this introduction would need
to be very long! To get around this problem, we will take two
complementary approaches. In Track 1, we will do things that do not
assume any knowledge of general relativity or quantum field theory.
In Track 2, we will assume the reader is already rather familiar with
both these subjects. Eventually the two tracks will merge.
In general relativity there is a thing called space-time, and we can
think of it as made of slices which we call ``space'' evolving into
one another as time passes.
$$
\xy
(-7,0)*\xycircle(3,2){};
(7,0)*\xycircle(3,2){};
(0,-14)*\xycircle(3,2){};
(-10,0)*{};(-3,-14)*{};
**\crv{(-2,-10)&(-11,-4)};
(10,0)*{};(3,-14)*{};
**\crv{(2,-10)&(11,-4)};
(-4,0)*{};(4,0)*{};
**\crv{(0,-8)};
\endxy
\qquad
\xymatrix{
S\ar[d]^T\\
S'\\}
$$
Space-time and space are smooth manifolds in this theory, so the
fundamental mathematics one uses in general relativity is differential
geometry.
In quantum mechanics, on the other hand, one uses completely different
mathematics, namely Hilbert spaces (roughly speaking, vector spaces
with an inner product). A unit vector $\psi\in H$ in the Hilbert space
is taken to describe a ``state'' that the world can be in. There are
also linear operators
$$
\xymatrix{
\psi\rlap{$\in H$}\ar[d]^T\\
T(\psi)\rlap{$\in H'$}\\}
$$
which describe how things can change (a note on terminology: the terms
``linear map'', ``linear operator'' and ``linear function'' will be
used interchangeably throughout this seminar).
Quantum mechanics is therefore mainly based on algebra, which looks
nothing like the geometry of smooth manifolds on which general
relativity is based, and so quantum gravity is like trying to mix oil
and water. Just about the only thing these theories have in common is
the way in which both talk about states that undergo some
transformation. This analogy is best displayed diagrammatically---just
look at the above diagrams---and with this motivation we can plunge
right into track~1.
\chapter{Diagrammatic Methods for Linear Algebra (I)}
We are going to study a diagrammatic notation for doing linear
algebra. The amazing thing about it is that, if one takes the diagrams
we will be using really literally, one starts to see how space-time
might be built of just these kinds of diagrams and nothing else.
The basic objects in our theory will be vector spaces (which we will
usually take to be finite-dimensional and complex). Let us now exhibit
how different operations of linear algebra can be represented
diagrammatically:
\section{Linear Maps}
A linear map is a function
$$
f\colon V\rightarrow V'\qquad\hbox{such that}\quad f(\alpha v+\beta w)=\alpha f(v)+\beta f(w)\qquad(\alpha,\beta\in\C).
$$
One well-known way to represent linear maps is with matrices; here we
will instead introduce diagrams for that purpose.
$$
\xy
(0,0)*++{f}*\cir{}="f";
(0,10)**\dir{-} ?(.5)*\dir{<}+(3,0)*{\scriptstyle V};
"f";(0,-10)**\dir{-} ?(.75)*\dir{>}+(3,0)*{\scriptstyle V'};
\endxy
$$
A linear map is represented by the name of the map surrounded by a
``blob'' with arrows sticking out at the top and bottom of the
blob. Arrows are labeled by the name of the vector space they
represent. The downward direction represents the passage of a
``metaphorical time'', in other words, from top to bottom one
draws the domain, the function and the codomain.
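As a concrete aside (not part of the notes themselves): after choosing bases, a linear map is just a matrix acting by multiplication, and its defining property can be checked numerically. The matrix and vectors below are made up purely for illustration.

```python
import numpy as np

# A linear map f : C^2 -> C^3, represented (after a choice of bases) by a
# 3x2 complex matrix; the entries are arbitrary.
f = np.array([[1, 2],
              [0, 1j],
              [3, 0]], dtype=complex)

v = np.array([1, 2], dtype=complex)
w = np.array([0, 1j], dtype=complex)
alpha, beta = 2 + 1j, -3

# Linearity: f(alpha v + beta w) = alpha f(v) + beta f(w).
assert np.allclose(f @ (alpha * v + beta * w),
                   alpha * (f @ v) + beta * (f @ w))
```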
\section{Composition of Maps}
Given linear maps $f\colon V\rightarrow V'$ and $g\colon V'\rightarrow
V''$ we can compose them to obtain $gf\colon V\rightarrow V''$ and we
draw the composition by sticking the diagrams for $f$ and $g$ one on
top of the other.
$$
\xy
(0,5)*++{f}*\cir{}="f";
(0,15)**\dir{-} ?(.5)*\dir{<}+(3,0)*{\scriptstyle V};
"f";(0,-5)*++{g}*\cir{}="g";
**\dir{-} ?(.4)*\dir{<}+(3,0)*{\scriptstyle V'};
"f";"g";(0,-15)**\dir{-} ?(.75)*\dir{>}+(3,0)*{\scriptstyle V''};
\endxy
=
\xy
(0,15)*{}="v";
(0,0)*++{gf}*\cir{}="gf";
**\dir{-} ?(.5)*\dir{<}+(3,0)*{\scriptstyle V};
"gf";(0,-15)*{}="w";
**\dir{-} ?(.4)*\dir{<}+(3,0)*{\scriptstyle V''};
\endxy
$$
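In matrix terms, stacking one blob on top of another is just matrix multiplication; a quick NumPy sanity check (with random illustrative matrices) is:

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal((3, 2))   # f : V -> V',  dim V = 2, dim V' = 3
g = rng.standard_normal((4, 3))   # g : V' -> V'', dim V'' = 4
v = rng.standard_normal(2)

# Stacking the blobs = multiplying the matrices: (gf)(v) = g(f(v)).
gf = g @ f
assert np.allclose(gf @ v, g @ (f @ v))

# The identity map is the unit for composition: f id_V = f = id_{V'} f.
assert np.allclose(f @ np.eye(2), f)
assert np.allclose(np.eye(3) @ f, f)
```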
If you have a set, it always comes with an identity function ``at no
extra cost''. Similarly, every vector space is equipped with an
identity linear map
$$
\matrix{\id_V\colon &V&\rightarrow &V\cr
&v&\mapsto &v\cr}
$$
which we draw as just an arrow labeled by $V$.
$$
\xy
(0,10)*{};(0,-10)**\dir{-} ?(.6)*\dir{>}+(2,0)*{\scriptstyle V};
\endxy
=
\xy
(0,0)*++{\id}*\cir{}="f";
(0,10)**\dir{-} ?(.5)*\dir{<}+(2,0)*{\scriptstyle V};
"f";(0,-10)**\dir{-} ?(.6)*\dir{>}+(2,0)*{\scriptstyle V};
\endxy
$$
This notation works well: the identity map is the identity for
composition of maps, and correspondingly, attaching an arrow to
another arrow does not change the diagram. Note that, if we have
$f\colon V\rightarrow V'$, then
$f\id_V=f=\id_{V'}f$. Diagrammatically,
$$
\xy
(0,0)*++{f}*\cir{}="f";
(0,12)*++{\id}*\cir{}="1";
(0,24)**\dir{-} ?(.5)*\dir{<}+(2,0)*{\scriptstyle V};
"f";(0,-24)**\dir{-} ?(.5)*\dir{>}+(2,0)*{\scriptstyle V'};
"1";"f";**\dir{-} ?(.5)*\dir{<}+(2,0)*{\scriptstyle V};
\endxy
\quad =\quad
\xy
(0,0)*++{f}*\cir{}="f";
(0,24)**\dir{-} ?(.5)*\dir{<}+(2,0)*{\scriptstyle V};
"f";(0,-24)**\dir{-} ?(.5)*\dir{>}+(2,0)*{\scriptstyle V'};
\endxy
\quad =\quad
\xy
(0,-12)*++{\id}*\cir{}="1";
(0,-24)**\dir{-} ?(.5)*\dir{>}+(2,0)*{\scriptstyle V'};
(0,0)*++{f}*\cir{}="f";
(0,24)**\dir{-} ?(.5)*\dir{<}+(2,0)*{\scriptstyle V};
"1";"f";**\dir{-} ?(.5)*\dir{>}+(2,0)*{\scriptstyle V};
\endxy
$$
\section{Tensor Products (a crash course)}
If $V,W$ are (finite-dimensional) vector spaces, $V\otimes W$ is a
vector space which can be defined thus: pick bases $\{e_i\}\subset V$
and $\{f_j\}\subset W$ and let $V\otimes W$ be such that $\{e_i\otimes
f_j\}$ is a formal basis for $V\otimes W$. We have that $\dim(V\otimes
W)=\dim(V)\dim(W)$.
We now define a tensor product of vectors in $V$ and $W$ as a bilinear
map $\otimes\colon V\times W\rightarrow V\otimes W$, so that if
$v=v^ie_i\in V$ and $w=w^jf_j\in W$, their tensor product is $v\otimes
w=v^iw^j(e_i\otimes f_j)$. Here we use for the first time Einstein's
summation convention, which is that when an index appears twice, once
as a subscript and once as a superscript, a summation over the range of
the index is understood implicitly; thus, we have
$$
v\otimes w=v^i w^j(e_i\otimes f_j):=\sum_{i=1}^n\sum_{j=1}^m v^i w^j(e_i\otimes f_j).
$$
This notation is arguably Einstein's most important contribution to
human thought.
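Einstein's convention has a direct computational counterpart in NumPy's \texttt{einsum}, where the repeated-index summation is spelled out by a subscript string. A small illustrative check (the component values are arbitrary):

```python
import numpy as np

v = np.array([1.0, 2.0])          # components v^i in the basis {e_i}
w = np.array([3.0, 4.0, 5.0])     # components w^j in the basis {f_j}

# v (x) w has components v^i w^j, stored here as a 2x3 array of the
# 6 = dim(V) * dim(W) coefficients in the basis {e_i (x) f_j}.
vw = np.einsum('i,j->ij', v, w)
assert vw.shape == (2, 3)
assert np.allclose(vw, np.outer(v, w))
```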
Given linear maps $S\colon V\rightarrow V'$ and $T\colon W\rightarrow
W'$, we can construct another linear map
$$
\matrix{S\otimes T\colon &V\otimes W &\rightarrow &V'\otimes W'\cr
&e_i\otimes f_j&\mapsto &S(e_i)\otimes
T(f_j)}
$$
We can draw this as follows:
$$
\xy
(0,0)*+{S}*\cir{}="f";
(0,10)**\dir{-} ?(.5)*\dir{<}+(-3,0)*{\scriptstyle V};
"f";(0,-10)**\dir{-} ?(.6)*\dir{>}+(-3,0)*{\scriptstyle V'};
\endxy
\xy
(0,0)*+{T}*\cir{}="f";
(0,10)**\dir{-} ?(.5)*\dir{<}+(3,0)*{\scriptstyle W};
"f";(0,-10)**\dir{-} ?(.6)*\dir{>}+(3,0)*{\scriptstyle W'};
\endxy
=\quad
\xy
(0,0)*+{S\otimes T}*\frm<4pt>{-}="f";
(0,10)**\dir{-} ?(.5)*\dir{<}+(5,0)*{\scriptstyle V\otimes W};
"f";(0,-10)**\dir{-} ?(.6)*\dir{>}+(5,0)*{\scriptstyle V'\otimes W'};
\endxy
$$
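In terms of matrices, $S\otimes T$ is the Kronecker product, and the fact that it acts factorwise on $v\otimes w$ can be checked numerically (random matrices, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.standard_normal((3, 2))   # S : V -> V'
T = rng.standard_normal((4, 2))   # T : W -> W'
v = rng.standard_normal(2)
w = rng.standard_normal(2)

# In the bases {e_i (x) f_j}, S (x) T is the Kronecker product of the
# matrices, and (S (x) T)(v (x) w) = S(v) (x) T(w).
ST = np.kron(S, T)
assert np.allclose(ST @ np.kron(v, w), np.kron(S @ v, T @ w))
```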
Now consider the following diagram:
$$
\xy
(0,5)*+{S}*\cir{}="f";
(0,15)**\dir{-} ?(.5)*\dir{<}+(-3,0)*{\scriptstyle V};
"f";(0,-15)**\dir{-} ?(.6)*\dir{>}+(-3,0)*{\scriptstyle V'};
\endxy
\xy
(0,-5)*+{T}*\cir{}="f";
(0,15)**\dir{-} ?(.5)*\dir{<}+(3,0)*{\scriptstyle W};
"f";(0,-15)**\dir{-} ?(.6)*\dir{>}+(3,0)*{\scriptstyle W'};
\endxy
=
({\id}_{V'}\otimes T)(S\otimes{\id}_W)
$$
(in the future we will feel free to equate diagrams and formulae). The
following ``identity'' suggests itself:
$$
\xy
(0,5)*+{S}*\cir{}="f";
(0,15)**\dir{-} ?(.5)*\dir{<}+(-3,0)*{\scriptstyle V};
"f";(0,-15)**\dir{-} ?(.6)*\dir{>}+(-3,0)*{\scriptstyle V'};
\endxy
\xy
(0,-5)*+{T}*\cir{}="f";
(0,15)**\dir{-} ?(.5)*\dir{<}+(3,0)*{\scriptstyle W};
"f";(0,-15)**\dir{-} ?(.6)*\dir{>}+(3,0)*{\scriptstyle W'};
\endxy
=
\xy
(0,0)*+{S}*\cir{}="f";
(0,15)**\dir{-} ?(.5)*\dir{<}+(-3,0)*{\scriptstyle V};
"f";(0,-15)**\dir{-} ?(.6)*\dir{>}+(-3,0)*{\scriptstyle V'};
\endxy
\xy
(0,-0)*+{T}*\cir{}="f";
(0,15)**\dir{-} ?(.5)*\dir{<}+(3,0)*{\scriptstyle W};
"f";(0,-15)**\dir{-} ?(.6)*\dir{>}+(3,0)*{\scriptstyle W'};
\endxy
=
\xy
(0,-5)*+{S}*\cir{}="f";
(0,15)**\dir{-} ?(.5)*\dir{<}+(-3,0)*{\scriptstyle V};
"f";(0,-15)**\dir{-} ?(.6)*\dir{>}+(-3,0)*{\scriptstyle V'};
\endxy
\xy
(0,5)*+{T}*\cir{}="f";
(0,15)**\dir{-} ?(.5)*\dir{<}+(3,0)*{\scriptstyle W};
"f";(0,-15)**\dir{-} ?(.6)*\dir{>}+(3,0)*{\scriptstyle W'};
\endxy
$$
We will call this operation shifting. It is a special case of the
principle that deforming the diagram (there is quite a bit of topology
lurking here) does not change the answer. To explain what ``deforming
the diagram'' means, picture the diagram drawn on a framed surface
with the endpoints of free lines glued to the frame, and allow any
smooth one-to-one deformation of the surface (and hence of the
diagram).
\begin{exercise}
Prove algebraically that shifting works.
\end{exercise}
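A numerical sanity check (not a proof, and with random illustrative matrices) of the shifting identity:

```python
import numpy as np

rng = np.random.default_rng(2)
S = rng.standard_normal((3, 2))   # S : V -> V',  dim V = 2, dim V' = 3
T = rng.standard_normal((4, 2))   # T : W -> W',  dim W = 2, dim W' = 4

ST      = np.kron(S, T)                                  # S and T at the same height
S_first = np.kron(np.eye(3), T) @ np.kron(S, np.eye(2))  # S above T
T_first = np.kron(S, np.eye(4)) @ np.kron(np.eye(2), T)  # T above S

# Sliding S and T past each other does not change the map.
assert np.allclose(ST, S_first)
assert np.allclose(ST, T_first)
```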
Finally, consider the following example: given linear maps $f\colon
V_1\otimes V_2\otimes V_3\rightarrow V_4\otimes V_5$ and $g\colon
V_5\otimes V_6\rightarrow V_7$, we can combine them in a unique way,
which we draw as follows:
$$
\xy
(-5,5)*++{f}*\cir{}="f";
(-15,15)**\dir{-} ?(.5)*\dir{<}+(1,1)*{\scriptstyle V_1};
"f";(-5,15)**\dir{-} ?(.5)*\dir{<}+(2,0)*{\scriptstyle V_2};
"f";(5,15)**\dir{-} ?(.5)*\dir{<}+(1,-1)*{\scriptstyle V_3};
"f";(5,-5)*++{g}*\cir{}="g";**\dir{-} ?(.5)*\dir{<}+(1,1)*{\scriptstyle V_5};
"g";(15,15)**\dir{-} ?(.5)*\dir{<}+(2,0)*{\scriptstyle V_6};
"f";(-5,-15)**\dir{-} ?(.5)*\dir{>}+(2,-1)*{\scriptstyle V_4};
"g";(5,-15)**\dir{-} ?(.5)*\dir{>}+(2,0)*{\scriptstyle V_7};
\endxy
=
\xy
(0,0)*++{h}*\cir{}="h";
(-15,15)**\dir{-} ?(.5)*\dir{<}+(1,1)*{\scriptstyle V_1};
"h";(-5,15)**\dir{-} ?(.5)*\dir{<}+(2,0)*{\scriptstyle V_2};
"h";(5,15)**\dir{-} ?(.5)*\dir{<}+(2,-1)*{\scriptstyle V_3};
"h";(15,15)**\dir{-} ?(.5)*\dir{<}+(1,-1)*{\scriptstyle V_6};
"h";(-5,-15)**\dir{-} ?(.5)*\dir{>}+(2,-1)*{\scriptstyle V_4};
"h";(5,-15)**\dir{-} ?(.5)*\dir{>}+(2,0)*{\scriptstyle V_7};
\endxy
$$
and we obtain a new map $h\colon V_1\otimes V_2\otimes V_3\otimes
V_6\rightarrow V_4\otimes V_7$ by means of a weird combination of
tensoring and composition for which there is essentially no good
coordinate-free notation other than the diagram!\footnote{In abstract
index notation, the same map would be written as
$h^{ij}_{klmn}=f^{ip}_{klm}g^j_{pn}$.}
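The ``weird combination of tensoring and composition'' is, however, exactly what \texttt{einsum} computes if we store $f$ and $g$ as coefficient arrays. A sketch, with all spaces taken two-dimensional for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
# f : V1 (x) V2 (x) V3 -> V4 (x) V5 and g : V5 (x) V6 -> V7, stored as
# coefficient arrays f^{ip}_{klm} and g^{j}_{pn} (upper indices first).
f = rng.standard_normal((2, 2, 2, 2, 2))   # indices i, p, k, l, m
g = rng.standard_normal((2, 2, 2))         # indices j, p, n

# h^{ij}_{klmn} = f^{ip}_{klm} g^{j}_{pn}: contract the V5 index p.
h = np.einsum('ipklm,jpn->ijklmn', f, g)
assert h.shape == (2, 2, 2, 2, 2, 2)
```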
\section{Duality (I)}
Given a vector space $V$ over $\C$ one has the dual vector space
defined as
$$
V^*=\{\hbox{linear maps}\quad f\colon
V\rightarrow\C\}.
$$
Moreover, given a linear map $T\colon V\rightarrow W$ one can define
its adjoint, which is the linear map $T^*\colon W^*\rightarrow V^*$
defined by $(T^*g)v=g(Tv)$ for all $g\in W^*$ and $v\in V$:
$$
\xymatrix{
V\ar[r]^T\ar[rd]_{T^*g}&W\ar[d]^g\\
&\C\\}
$$
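In dual bases the matrix of $T^*$ is simply the transpose of the matrix of $T$, and the defining equation $(T^*g)v=g(Tv)$ becomes an identity of numbers. A quick check with random illustrative data:

```python
import numpy as np

rng = np.random.default_rng(4)
T = rng.standard_normal((3, 2))   # T : V -> W, dim V = 2, dim W = 3
g = rng.standard_normal(3)        # g in W*, a row of coefficients
v = rng.standard_normal(2)

# In dual bases the adjoint T* : W* -> V* is the transpose matrix,
# and (T* g)(v) = g(T v) holds on the nose.
assert np.allclose((T.T @ g) @ v, g @ (T @ v))
```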
There is a nice way to draw adjoints, which is to rotate the diagram
by $180^\circ$:
$$
\xy
(0,0)*++{f}*\cir{}="f";
(0,10)**\dir{-} ?(.5)*\dir{<}+(3,0)*{\scriptstyle V};
"f";(0,-10)**\dir{-} ?(.75)*\dir{>}+(3,0)*{\scriptstyle W};
\endxy
\stackrel{*}{\rightarrow}
\xy
(0,0)*+{f^*}*\cir{}="f";
(0,-10)**\dir{-} ?(.5)*\dir{<}+(-3,0)*{\scriptstyle V};
"f";(0,10)**\dir{-} ?(.75)*\dir{>}+(-3,0)*{\scriptstyle W};
\endxy
=
\xy
(0,0)*+{f^*}*\cir{}="f";
(0,-10)**\dir{-} ?(.75)*\dir{>}+(-3,0)*{\scriptstyle V^*};
"f";(0,10)**\dir{-} ?(.5)*\dir{<}+(-3,0)*{\scriptstyle W^*};
\endxy
$$
Remember that ``time'' always flows downwards so, when we introduce
duals, arrows pointing downstream represent vector spaces and arrows
pointing upstream represent their duals. Note that we do not need to
write the asterisk on the label to denote the dual space because the
direction of the arrow does this for us automatically. We do need to
write the asterisk on the name of the operator because the direction
of the arrows may not be sufficient to tell $T$ apart from
$T^*$. Consider an operator $T\colon V\rightarrow V^*$. Then the
adjoint is $T^*\colon V^{**}\cong V\rightarrow V^*$ and, because an operator need
not be self-adjoint, not writing the asterisk would lead to an
ambiguous diagram
$$
\xy
(0,0)*++{T}*\cir{}="f";
(0,10)**\dir{-} ?(.5)*\dir{<}+(3,0)*{\scriptstyle V};
"f";(0,-10)**\dir{-} ?(.6)*\dir{<}+(-3,0)*{\scriptstyle V};
\endxy
\stackrel{*}{\rightarrow}
\xy
(0,0)*++{T}*\cir{}="f";
(0,10)**\dir{-} ?(.5)*\dir{<}+(3,0)*{\scriptstyle V};
"f";(0,-10)**\dir{-} ?(.6)*\dir{<}+(-3,0)*{\scriptstyle V};
\endxy
$$
However, at this point we might decide that ``blobs'' should not be
drawn as circles but as some other shape that is not symmetric, in
which case we could drop all asterisks without ambiguities.
The idea of representing adjoints by drawing the diagrams ``backwards in
time'' arose in particle physics, where taking the adjoint is
equivalent to exchanging particles and antiparticles. Richard Feynman
popularized the idea of antiparticles as ``particles going backwards
in time'', and represented them by reversing the arrows on what we now
call Feynman diagrams.
\begin{exercise}Consider the following ambiguous diagram:
$$
\xy
(0,5)*++{S}*\cir{}="f";
(0,15)**\dir{-} ?(.5)*\dir{<}+(3,0)*{\scriptstyle U};
"f";(0,-5)*++{T}*\cir{}="g";
**\dir{-} ?(.4)*\dir{<}+(3,0)*{\scriptstyle V};
"f";"g";(0,-15)**\dir{-} ?(.75)*\dir{>}+(3,0)*{\scriptstyle W};
\endxy
\rightarrow
\xy
(0,-5)*+{S^*}*\cir{}="f";
(0,-15)**\dir{-} ?(.5)*\dir{<}+(-3,0)*{\scriptstyle U};
"f";(0,5)*+{T^*}*\cir{}="g";
**\dir{-} ?(.4)*\dir{<}+(-3,0)*{\scriptstyle V};
"f";"g";(0,15)**\dir{-} ?(.75)*\dir{>}+(-3,0)*{\scriptstyle W};
\endxy
$$
Check that ``rotate-then-compose'' is the same operation as
``compose-then-rotate'', therefore showing that the diagrammatic
notation is unambiguous. (Hint: translate the diagram into symbols in
two ways which will be the left- and right-hand sides of an
identity; then prove the identity.)
\end{exercise}
\chapter{Lagrangians for Field Theories (I)}
\section{Framework and
Notations}
These are the ``stars of the show'':
\begin{itemize}
\item A Lie group denoted by $G$, which physicists call the ``gauge
group'' and is not to be confused with the ``group of gauge
transformations''. For simplicity, we will assume that the group is a
group of matrices like $\SO(n)$, $\SU(n)$, $\SP(n)$, so it will be a
submanifold of the linear space $\End(V)$ for some $V$. The whole
theory can be carried through without assuming that $G$ is a group of
matrices.
\item The Lie algebra of $G$, denoted by $\frak g$. The names of the Lie
algebras are obtained from the group names by transmogrifying them
into lower-case gothic script, for example $\so(n)$, $\su(n)$, $\Sp(n)$,
etc. A Lie algebra is a vector space, but we will assume that it is a
space of $n\times n$ matrices so they can be multiplied, although
strictly speaking\footnote{Strictly speaking, when the Lie algebra is not
an algebra of matrices, one defines a ``universal enveloping algebra''
with an associative product such that the original Lie bracket equals
the commutator of the enveloping algebra.} one is not allowed to do
that.
\item The trace of an $n\times n$ matrix, denoted $\tr$. This actually
represents two functions: $\tr\colon G\rightarrow\C$ and
$\tr\colon\frak g\rightarrow\C$. These operations can be defined
without reference to matrices, but they are still denoted $\tr$ for
convenience.
\item An $n$-dimensional (smooth, paracompact, Hausdorff) manifold
representing ``space-time'' and denoted by $M$. We will require that
$M$ be oriented (to be able to integrate $n$-forms) and (usually)
compact. $M$ will usually be boundaryless, but sometimes we will
consider manifolds with a boundary.
\item A principal $G$-bundle over $M$, denoted $\pi\colon P\rightarrow
M$. Since {\sl Gauge Fields, Knots and Gravity} does not cover
principal bundles---the principal flaw of that book---we will give a
definition of these sometime. (For now, we'll assume you either know
it or can fake it.)
\item Fields (functions) on $M$, especially
\begin{itemize}
\item A connection $A$ on the principal bundle $P$, which
physicists call ``gauge field'' or ``vector potential''. Locally
(or on a trivial $G$-bundle), $A$ is a $\frak g$-valued $1$-form,
so in a coordinate patch we can write $A=A_\mu\dd x^\mu$. The
connection is associated to an exterior covariant derivative
$$
\dd_A=\dd+A=\cases{\dd+A\wedge & acting on the fundamental
representation of $G$\cr
\dd+[A,~] & acting on the adjoint representation of $G$.\cr}
$$
\item Gauge transformations, which are locally $G$-valued functions
on $M$. The action of a gauge transformation $g$ on the connection
is required to satisfy $\dd_{A'}g=g\dd_A$, which implies
$$
A\mapsto A'=gAg^{-1}+g(\dd g^{-1})=gAg^{-1}-(\dd g)g^{-1}.
$$
\item The curvature $F$ of the connection $A$, which physicists
call ``field strength''. Locally, $F$ is a $\frak g$-valued
$2$-form, and it is a function of $A$:
$$
F=\dd_A^2=\cases{\dd A+A\wedge A& on the fundamental representation\cr
\dd A+{1\over 2}[A,A]& on the adjoint representation\cr}
$$
When we apply a gauge transformation to $A$, the curvature changes as
$F\mapsto F'=gFg^{-1}$. We therefore say that $F$ is
$Ad(P)$-valued.
\end{itemize}
\end{itemize}
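To fill in the step behind the transformation law of $A$ quoted above: applying both sides of $\dd_{A'}g=g\,\dd_A$ to a field $\psi$ in the fundamental representation gives

```latex
\dd_{A'}(g\psi)=g\,\dd_A\psi
\;\Longrightarrow\;
(\dd g)\psi+g\,\dd\psi+A'g\psi=g\,\dd\psi+gA\psi
\;\Longrightarrow\;
A'g=gA-\dd g
\;\Longrightarrow\;
A'=gAg^{-1}-(\dd g)g^{-1},
```

and differentiating $gg^{-1}=1$ gives $g\,\dd g^{-1}=-(\dd g)g^{-1}$, which yields the other form of the law.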
\begin{exercise}
Show that this is the case. [Hints: $\dd(AB)=(\dd A)B+A(\dd B)$ for
matrix-valued functions; when $\dd$ jumps over an $n$-form it picks up a
factor of $(-1)^n$; and $gg^{-1}=1$. Alternatively, show that
$F=\dd_A^2$ and use $\dd_{A'}=g\dd_A g^{-1}$.]
\end{exercise}
Now, to do field theory we need to concoct Lagrangians from these
ingredients (the connection and other fields at hand). Mathematically,
a Lagrangian $\cal L$ is just a scalar-valued $n$-form which is a
function of the fields. It has to be an $n$-form so that it can be
integrated over the manifold $M$ to get a number.
If $\cal L$ is a Lagrangian and $M$ is a manifold, the integral
$$
S=\int_M\cal L
$$
is called ``the action of the field configuration''. For ``nice''
theories, the action should be invariant under gauge
transformations. One way to ensure this is to require that the Lagrangian
itself be invariant under gauge transformations.
\section{Gauge-invariant Lagrangians for Gauge Theories (I)}
So far the only field we have is the connection $A$, so let us see if
we can build any gauge-invariant $n$-forms from it.
The most simple-minded $n$-form we can obtain from $A$ is simply $A$
itself which, as a $\frak g$-valued $1$-form, would work on
$1$-dimensional manifolds. To obtain a scalar $1$-form, we take the
trace. Unfortunately, ${\cal L}=\tr A$ is not gauge-invariant, as
$$
\tr A'=\tr(gAg^{-1}+g\,\dd g^{-1})
=\tr(gAg^{-1})+\tr(g\,\dd g^{-1})
=\tr(A)+\tr(g\,\dd g^{-1})
=\tr(A)-\dd\log\det g,
$$
and the last term vanishes only if $\det g$ is constant, which will
only be the case if $G$ is a special group, in which case $\frak g$ is
an algebra of traceless matrices and $\tr A$ is zero in the first
place. We have used the cyclic property of the trace,
$\tr(AB)=\tr(BA)$, which will prove very useful in the following.
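The identity $\tr(g\,\dd g^{-1})=-\dd\log\det g$ (a form of Jacobi's formula) can also be checked numerically by differentiating along a curve of matrices; the matrices below are random illustrations and the derivatives are central finite differences:

```python
import numpy as np

rng = np.random.default_rng(5)
g0 = np.eye(3) + 0.1 * rng.standard_normal((3, 3))   # a curve g(t) at t = 0
dg = rng.standard_normal((3, 3))                      # its velocity g'(0)

eps = 1e-6
g_plus, g_minus = g0 + eps * dg, g0 - eps * dg

# d(g^{-1}), approximated by central differences.
dginv = (np.linalg.inv(g_plus) - np.linalg.inv(g_minus)) / (2 * eps)

# tr(g dg^{-1}) = -d log det g.
lhs = np.trace(g0 @ dginv)
rhs = -(np.log(np.linalg.det(g_plus)) - np.log(np.linalg.det(g_minus))) / (2 * eps)
assert np.isclose(lhs, rhs, atol=1e-5)
```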
\subsection{The First Chern Theory}
In two dimensions we can do ${\cal L}=\tr F$, which turns out to be
gauge-invariant, again by the cyclic property of the trace:
$$
\tr F'=\tr(gFg^{-1})=\tr(F).
$$
Two-dimensional field theories are interesting, among other reasons,
because in string theory the fundamental objects are the
two-dimensional world sheets of one-dimensional strings, and all
dynamical variables, including the coordinates of space-time, are
fields defined on this world sheet.
Mathematicians call the integral
$$
S=\int_M\tr F
$$
the first Chern class, so a natural name for a theory with this action
would be ``the first Chern theory''. It turns out, for example, that
when $G=\SO(2)$ the first Chern theory is two-dimensional general
relativity!
The $n$th Chern class is
$$
\int_M\tr(\underbrace{F\wedge\cdots\wedge F}_{n~\rm times}),
$$
which, when taken as an action, gives rise to a perfectly sensible
gauge-invariant theory on $2n$-dimensional manifolds which we may well
call the $n$th Chern theory.
The special case of the second Chern theory is interesting because it
works in four dimensions and that's what we think our space-time is!
This is not general relativity, though, for any choice of the gauge
group $G$. Sometimes the second Chern theory with $G = \SO(3,1)$ is
called ``topological gravity''. It is similar to general relativity but
much simpler---a bit like GR's baby brother.
In fact, as we shall see, there is a way to obtain general relativity
from a Lagrangian of the form $e\wedge e\wedge F$, where $e$ is an
additional $1$-form (variously called ``cotetrad'', ``Vierbein'' or
``soldering form'') independent of the connection $A$.