February 11, 1996

General Relativity Tutorial - Long Course Outline

John Baez

This is a longer version of the course outline. If you click on some of the capitalized concepts, you will see more information on them.

  1. A TANGENT VECTOR or simply VECTOR at the point p of spacetime may be visualized as an infinitesimal arrow with tail at the point p. The tangent vectors at p form a vector space called the TANGENT SPACE; in other words, we can add them and multiply them by real numbers.

    Suppose we work in a local COORDINATE SYSTEM with coordinates (x^0,x^1,x^2,x^3). (Since we are working in 4d spacetime there are 4 coordinates; we may think of x^0 as the time coordinate t and the other 3 as x, y, and z, but we don't need to think of them this way, since we're using an utterly arbitrary coordinate system.) Then we can describe a tangent vector v by listing its components (v^0,v^1,v^2,v^3) in this coordinate system. For short we write these components as v^a, where the superscript a, like all of our superscripts and subscripts, goes from 0 to 3.

  2. A COTANGENT VECTOR or simply COVECTOR at the point p is a function f that eats a tangent vector v and spits out a real number f(v) in a linear way. Cotangent vectors can be viewed as ordered stacks of parallel planes in the tangent space at p. They don't "point" like tangent vectors do; instead, they "copoint".

    Working in local coordinates, we define the components of a covector f to be the numbers (f_0,f_1,f_2,f_3) you get when you evaluate f on the basis vectors:

    f_0 = f(1,0,0,0)

    f_1 = f(0,1,0,0)

    f_2 = f(0,0,1,0)

    f_3 = f(0,0,0,1)
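
    The recipe above can be sketched in a few lines of Python. The particular covector f below (coefficients 2, -1, 0, 3) and the helper names `basis_vector` and `components` are made up for the illustration; any linear function of v would do.

```python
# A covector is a linear function on tangent vectors. Here a tangent vector
# is a 4-tuple of components in some fixed local coordinate system.

def f(v):
    """A hypothetical covector, chosen only for illustration."""
    return 2.0 * v[0] - 1.0 * v[1] + 0.0 * v[2] + 3.0 * v[3]

def basis_vector(a, dim=4):
    """The basis vector pointing in the x^a direction."""
    return tuple(1.0 if i == a else 0.0 for i in range(dim))

def components(covector, dim=4):
    """The components f_a, obtained by evaluating the covector on each basis vector."""
    return [covector(basis_vector(a, dim)) for a in range(dim)]

f_components = components(f)

# Because f is linear, its value on any vector is f(v) = f_a v^a.
v = (1.0, 2.0, 3.0, 4.0)
via_components = sum(f_components[a] * v[a] for a in range(4))
```

    Here `via_components` agrees with `f(v)`, which is the sense in which the four numbers f_a carry all the information in f.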

  3. A TENSOR of "rank (0,k)" at a point p of spacetime is a function that takes as input a list of k tangent vectors at the point p and returns as output a number. The output must depend linearly on each input.

    A TENSOR of "rank (1,k)" at a point p of spacetime is a function that takes as input a list of k tangent vectors at the point p and returns as output a tangent vector at the point p. The output must depend linearly on each input.

    More generally, a TENSOR of "rank (j,k)" at a point p of spacetime is a function that takes as input a list of j cotangent vectors and k tangent vectors and returns as output a number. The output must depend linearly on each input. Note that this definition is compatible with the previous ones! This is obvious for the rank (0,k) tensors, but for the rank (1,k) ones we need to check that a function that eats k vectors and spits out a vector v can be reinterpreted as a function that eats k vectors and one covector f and spits out a number. We just let the covector f eat the vector v and spit out f(v)!

    Similarly, note that a vector can be reinterpreted as a tensor of rank (1,0), and a covector can be reinterpreted as a tensor of rank (0,1).

    In local coordinates we write the components of a tensor T of rank (j,k) as a monstrous array T^{ab....c}_{de....f} with j superscripts and k subscripts. Again, all superscripts and subscripts range from 0 to 3; each number T^{ab....c}_{de....f} is simply the number the tensor spits out when fed a suitable wad of basis vectors and covectors. I will describe this in more detail in the following example:

  4. The METRIC is the star of general relativity. It describes everything about the geometry of spacetime, since it lets us measure angles and distances. Einstein's equation describes how the flow of energy and momentum through spacetime affects the metric. What it affects is something about the metric called the "curvature". The biggest job in learning general relativity is learning to understand curvature!

    Mathematically, the metric g is a tensor of rank (0,2). It eats two tangent vectors v,w and spits out a number g(v,w), which we think of as the "dot product" or "inner product" of the vectors v and w. This lets us compute the length of any tangent vector, or the angle between two tangent vectors. Since we are talking about spacetime, the metric need not satisfy g(v,v) > 0 for all nonzero v. A vector v is SPACELIKE if g(v,v) > 0, TIMELIKE if g(v,v) < 0, and LIGHTLIKE if g(v,v) = 0.

    The inner product g(v,w) of two tangent vectors is given by

    g(v,w) = g_{ab} v^a w^b

    for some matrix of numbers g_{ab}, where we sum over the repeated indices a,b (this being the so-called EINSTEIN SUMMATION CONVENTION). Another way to think of it is that our coordinates give us a basis of tangent vectors at p, and g_{ab} is the inner product of the basis vector pointing in the x^a direction and the basis vector pointing in the x^b direction.
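
    As a rough sketch of the formula g(v,w) = g_{ab} v^a w^b, here is the double sum written out in Python. The choice of the flat Minkowski metric and the function names `inner` and `classify` are assumptions made for the example; a general metric would have different components at each point.

```python
# Components g_{ab} of the flat Minkowski metric, signature (-,+,+,+).
# Assumption: this specific metric is used only as an example.
MINKOWSKI = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

def inner(g, v, w):
    """g(v,w) = g_{ab} v^a w^b, summing over the repeated indices a and b."""
    n = len(v)
    return sum(g[a][b] * v[a] * w[b] for a in range(n) for b in range(n))

def classify(g, v):
    """Label a nonzero vector v by the sign of g(v,v), as in the text."""
    s = inner(g, v, v)
    if s > 0:
        return "spacelike"
    if s < 0:
        return "timelike"
    return "lightlike"
```

    For instance, `classify(MINKOWSKI, (1.0, 1.0, 0.0, 0.0))` gives "lightlike", since the time and space contributions cancel.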

  5. PARALLEL TRANSPORT or parallel translation is an operation which, given a curve from p to q and a tangent vector v at p, spits out a tangent vector v' at q. We think of this as the result of dragging v from p to q while at each step of the way not rotating or stretching it. There's an important theorem saying that if we have a metric g, there is a unique way to do parallel translation which is:

    1. Linear: the output v' depends linearly on v.

    2. Compatible with the metric: if we parallel translate two vectors v and w from p to q, and get two vectors v' and w', then g(v',w') = g(v,w). This means that parallel translation preserves lengths and angles. This is what we mean by "no stretching".

    3. Torsion-free: this is a way of making precise the notion of "no rotating". We can define the TORSION tensor, with components t_{ab}, as follows. Take a little vector of size epsilon pointing in the a direction, and a little vector of size epsilon pointing in the b direction. Parallel translate the vector pointing in the a direction by an amount epsilon in the b direction. Similarly, parallel translate the vector pointing in the b direction by an amount epsilon in the a direction. (Draw the resulting two vectors.) If the tips touch, up to terms of epsilon^3, there's no torsion! Otherwise take the difference of the tips and divide by epsilon^2. Taking the limit as epsilon -> 0 we get the torsion t_{ab}. We say that parallel translation is "torsion-free" if t_{ab} = 0.

  6. A GEODESIC is a curve whose tangent vector is parallel transported along itself. I.e., to follow a geodesic is to follow one's nose while never turning one's nose... to follow a completely unaccelerated path. A particle in free fall follows a geodesic in spacetime. In this sense, in general relativity gravity is not a force!

  7. The CONNECTION is a mathematical gadget that describes "parallel translation along an infinitesimal curve in a given direction". In local coordinates the connection may be described using the components of the CHRISTOFFEL SYMBOL Gamma_{ab}^c. There is an explicit formula for these components in terms of components g_{ab} of the metric, which may be derived from the assumptions 1-3 above. However, this formula is very frightening.
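
    For readers who want to peek anyway, here is a hedged numerical sketch. It uses the standard textbook formula Gamma_{ab}^c = (1/2) g^{cd} (d_a g_{bd} + d_b g_{ad} - d_d g_{ab}), which is not spelled out in this outline, and tries it on the unit 2-sphere rather than on spacetime. The function names, the finite-difference step h, and the restriction to diagonal metrics are all assumptions of the example.

```python
import math

def sphere_metric(x):
    """g_{ab} for the unit 2-sphere in coordinates x = (theta, phi)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def metric_derivative(metric, x, d, h=1e-5):
    """Central-difference estimate of the partial derivative of g_{ab} along x^d."""
    n = len(x)
    x_plus = list(x)
    x_minus = list(x)
    x_plus[d] += h
    x_minus[d] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    return [[(g_plus[a][b] - g_minus[a][b]) / (2 * h) for b in range(n)]
            for a in range(n)]

def christoffel(metric, x):
    """Return gamma[c][a][b], the component Gamma_{ab}^c at the point x.

    Assumes the metric is diagonal at x, so the inverse metric is 1 / g_{cc}.
    """
    n = len(x)
    g = metric(x)
    dg = [metric_derivative(metric, x, d) for d in range(n)]  # dg[d][a][b]
    return [[[0.5 / g[c][c] * (dg[a][b][c] + dg[b][a][c] - dg[c][a][b])
              for b in range(n)] for a in range(n)] for c in range(n)]

theta = 1.0
gamma = christoffel(sphere_metric, (theta, 0.3))
```

    On the sphere the only nonzero components should be Gamma_{phi phi}^theta = -sin(theta) cos(theta) and Gamma_{theta phi}^phi = Gamma_{phi theta}^phi = cot(theta), and the numbers in `gamma` match these up to the finite-difference error.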

  8. The RIEMANN CURVATURE TENSOR is a tensor of rank (1,3) at each point of spacetime. Thus it takes three tangent vectors, say u, v, and w as inputs, and outputs one tangent vector, say R(u,v,w). The Riemann tensor is defined like this:

    Take the vector w, and parallel transport it around a wee parallelogram whose two edges point in the directions epsilon u and epsilon v, where epsilon is a small number. The vector w comes back a bit changed by its journey; it is now a new vector w'. We then have

    w' - w = -epsilon^2 R(u,v,w) + terms of order epsilon^3

    Thus the Riemann tensor keeps track of how much parallel transport around a wee parallelogram changes the vector w. When we say spacetime is curved, we mean that parallel transport around a loop can change a vector. As it turns out, all the information about the curvature of spacetime is contained in the Riemann tensor!
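
    To see a vector come back changed, here is a small numerical experiment, offered as a sketch under several assumptions: it uses the unit 2-sphere instead of spacetime, a finite loop (a circle of constant theta) instead of a wee parallelogram, and the standard sphere values Gamma_{phi phi}^theta = -sin(theta) cos(theta) and Gamma_{phi theta}^phi = cot(theta) for the connection. The function names and the step count are made up for the example.

```python
import math

def transport_around_latitude(theta0, v, steps=2000):
    """Parallel transport v = (v^theta, v^phi) once around the circle theta = theta0.

    Integrates dv^a/dphi = -Gamma_{phi c}^a v^c with classical Runge-Kutta steps.
    """
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(u):
        # Right-hand side of the transport equation on the unit sphere.
        return (s * c * u[1], -(c / s) * u[0])

    def shifted(u, k, scale):
        return (u[0] + scale * k[0], u[1] + scale * k[1])

    h = 2 * math.pi / steps
    for _ in range(steps):
        k1 = rhs(v)
        k2 = rhs(shifted(v, k1, h / 2))
        k3 = rhs(shifted(v, k2, h / 2))
        k4 = rhs(shifted(v, k3, h))
        v = (v[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             v[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return v

def length_squared(theta0, v):
    """g(v,v) on the unit sphere: (v^theta)^2 + sin^2(theta) (v^phi)^2."""
    return v[0] ** 2 + (math.sin(theta0) * v[1]) ** 2

theta0 = math.pi / 3   # the circle where cos(theta0) = 1/2
w = (1.0, 0.0)         # start out pointing in the theta direction
w_back = transport_around_latitude(theta0, w)
```

    With this choice of theta0 the vector returns pointing very nearly the opposite way, close to (-1, 0), while its length is unchanged. On a flat plane the same trip would bring it back exactly as it started.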

    In addition to this simple coordinate-free definition of the Riemann tensor, we may describe its components R^a_{bcd} using coordinates. Namely, the vector R(u,v,w) has components

    R(u,v,w)^a = R^a_{bcd} u^b v^c w^d

    where we sum over the indices b,c,d. Another way to think of this is that if we feed the Riemann tensor 3 basis vectors in the x^b, x^c, x^d directions, respectively, it spits out a vector whose component in the x^a direction is R^a_{bcd}.

    There is an explicit formula for the components R^a_{bcd} in terms of the Christoffel symbols. Together with the aforementioned formula for the Christoffel symbols in terms of the metric, this lets us compute the Riemann tensor of any metric! Thus to do computations in general relativity, these formulas are quite important. However, they are not for the faint of heart, so I will only describe them to readers who have passed certain tests of courage and valor.

    See the adventures of Oz and the Wizard for an example of one such daring reader!

  9. The RICCI TENSOR. The matrix g_{ab} is invertible and we write its inverse as g^{ab}. We use this to cook up some tensors starting from the Riemann curvature tensor and leading to the Einstein tensor, which appears on the left side of Einstein's marvelous equation for general relativity.

    Okay, starting from the Riemann tensor, which has components R^a_{bcd}, we now define the Ricci tensor to have components

    R_{bd} = R^c_{bcd}

    where as usual we sum over the repeated index c.
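
    Contracting is just a sum over a repeated index, so it is easy to sketch in Python. The names `ricci_from_riemann` and `sample_riemann` are invented for the example, and the sample array is only a convenient test input with four indices; it is not claimed to be the curvature of any particular spacetime in this course's sign conventions.

```python
def ricci_from_riemann(riemann):
    """R_{bd} = R^c_{bcd}, where riemann[a][b][c][d] holds R^a_{bcd}."""
    n = len(riemann)
    return [[sum(riemann[c][b][c][d] for c in range(n)) for d in range(n)]
            for b in range(n)]

def sample_riemann(g):
    """A made-up test array: delta^a_c g_{bd} - delta^a_d g_{bc}."""
    n = len(g)

    def delta(i, j):
        return 1.0 if i == j else 0.0

    return [[[[delta(a, c) * g[b][d] - delta(a, d) * g[b][c]
               for d in range(n)] for c in range(n)] for b in range(n)]
            for a in range(n)]

# Flat Minkowski components, used here only to have some g_{ab} to plug in.
g = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
ricci = ricci_from_riemann(sample_riemann(g))
```

    For this test array the sum over c can be done by hand: it gives 4 g_{bd} - g_{bd} = 3 g_{bd}, which is what the code returns.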

    The physical significance of the Ricci tensor is best explained by an example. So, suppose an astronaut taking a space walk accidentally spills a can of ground coffee.

    Consider one coffee ground. Say that at a given moment it's at the point P of spacetime, and its velocity vector is the tangent vector v. Note: since we are doing relativity, its velocity is defined to be the tangent vector to its path in *spacetime*, so if we used coordinates v would have 4 components, not 3.

    The path the coffee ground traces out in spacetime is called its "worldline". Let's draw a little bit of its worldline near P:

       |
      v^
       |
       P
       |
       |

    The vector v is an arrow with tail P, pointing straight up. I've tried to draw it in, using crappy ASCII graphics.

    Now imagine a bunch of coffee grounds right near our original one, that are initially at rest relative to it --- or "comoving". What does this mean? Well, it means that for any tangent vector w at P which is orthogonal to v, if we follow a geodesic along w for a certain while, we find ourselves at a point Q where there's another coffee ground. Let me draw the worldline of this other coffee ground.

       |         |
      v^         ^v'
       |    w    |
       P---->----Q
       |         |
       |         |

    I've drawn w so you can see how it is orthogonal to the worldline of our first coffee ground. The horizontal path is a geodesic from P to Q, which has tangent vector w at P. I have also drawn the worldline of the coffee ground which goes through the point Q of spacetime, and I've also drawn the velocity vector v' of this other coffee ground.

    What does it mean to say the coffee grounds are initially comoving? It means simply that if we take v and parallel translate it over to Q along the horizontal path, we get v'.

    This may seem like a lot of work to say that two coffee grounds are moving in the same direction at the same speed, but when spacetime is curved we gotta be very careful. Note that everything I've done is based on parallel translation! (I defined geodesics using parallel translation.)

    Now consider, not just two coffee grounds, but a whole swarm of comoving coffee grounds near P. If spacetime were flat, these coffee grounds would *stay* comoving as time passed. But if there is a gravitational field around (and there is, even in space), spacetime is not flat. So what happens?

    Well, basically the coffee grounds will tend to be deflected, relative to one another. It's not hard to figure out exactly how much they will be deflected. We just use the definition of the Riemann curvature! We get an equation called the GEODESIC DEVIATION EQUATION.

    But let me not do that just yet. Instead, let me just say what this has to do with the Ricci tensor.

    Imagine a bunch of coffee grounds near the coffee ground that went through the point P. Consider, for example, all the coffee grounds that were within a given distance at time zero (in the local rest frame of the coffee ground that went through P). And suppose that at time zero all the coffee grounds are comoving. A little round ball of coffee grounds in free fall through outer space! As time passes this ball will change shape and size depending on how the paths of the coffee grounds are deflected by the spacetime curvature. Since everything in the universe is linear to first order, we can imagine the ball shrinking or expanding, and also getting deformed to an ellipsoid. There is a lot of information about spacetime curvature encoded in the rate at which this ball changes shape and size. But let's only keep track of the rate of change of its volume! This rate is basically the Ricci tensor.

    More precisely, the second time derivative of the volume of this little ball is approximately

    -R_{ab} v^a v^b

    times the original volume of the ball. This approximation becomes better and better in the limit as the ball gets smaller and smaller. The first time derivative of the volume is zero, since the coffee grounds started out comoving.

    In 4-dimensional spacetime, the Riemann tensor has 20 independent components. 10 of these are captured by the Ricci tensor, while the remaining 10 are captured by the WEYL TENSOR.

  10. The RICCI SCALAR. Starting from the Ricci tensor, we define

    R^a_d = g^{ab} R_{bd}.

    As always, we follow the Einstein summation convention and sum over repeated indices when one is up and the other is down. This process, which turned one subscript on the Ricci tensor into a superscript, is called RAISING AN INDEX. Similarly we can LOWER AN INDEX, turning any superscript into a subscript, using g_{ab}.

    Then we define the Ricci scalar by

    R = R^a_a.

    This process, whereby we get rid of a superscript and a subscript in a tensor by summing over them a la Einstein, is called CONTRACTING.
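
    Raising an index and contracting can be sketched as follows. The Ricci components below are made-up numbers, the metric is taken to be the flat Minkowski one purely for convenience, and the function names are assumptions of the example.

```python
def raise_first_index(g_inv, ricci):
    """R^a_d = g^{ab} R_{bd}, summing over the repeated index b."""
    n = len(ricci)
    return [[sum(g_inv[a][b] * ricci[b][d] for b in range(n)) for d in range(n)]
            for a in range(n)]

def contract(mixed):
    """R = R^a_a: sum the entries whose upper and lower indices agree."""
    return sum(mixed[a][a] for a in range(len(mixed)))

# Inverse metric g^{ab}. For Minkowski it has the same entries as g_{ab}.
G_INV = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# A hypothetical symmetric Ricci tensor R_{bd}, for illustration only.
RICCI = [
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 0.0],
    [0.0, 0.0, 0.0, 5.0],
]

mixed = raise_first_index(G_INV, RICCI)   # R^a_d
scalar = contract(mixed)                  # R
```

    Note that raising the index flips the sign of the whole time row, so the scalar here is -2 + 3 + 4 + 5 = 10 rather than the naive sum of the diagonal of R_{bd}.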

  11. The EINSTEIN TENSOR. Finally, we define the Einstein tensor by

    G_{ab} = R_{ab} - (1/2)R g_{ab}.

    You should not feel you understand why I am defining it this way!! Don't worry! That will take quite a bit longer to explain; the point is that with this definition, local conservation of energy and momentum will be an automatic consequence of Einstein's equation. To understand this, we need to know Einstein's equation, so we need to know about:

  12. The STRESS-ENERGY TENSOR. The stress-energy is what appears on the right side of Einstein's equation. It is a tensor of rank (0,2), and it is defined as follows: given any two tangent vectors u and v at a point p, the number T(u,v) says how much momentum in the u direction is flowing through the point p in the v direction. Writing it out in terms of components in any coordinates, we have

    T(u,v) = T_{ab} u^a v^b

    In coordinates where x^0 is the time direction t while x^1, x^2, x^3 are the space directions (x,y,z), and the metric looks like the usual Minkowski metric (at the point in question) we have the following physical interpretation of the components T_{ab}:

    The top row of the 4x4 matrix T_{ab} keeps track of the density of energy --- that's T_{00} --- and the density of momentum in the x, y, and z directions --- those are T_{01}, T_{02}, and T_{03} respectively. This should make sense if you remember that "density" is the same as "flow in the time direction" and "energy" is the same as "momentum in the time direction". The other components of the stress-energy tensor keep track of the flow of energy and momentum in various spatial directions.

  13. EINSTEIN'S EQUATION: This is what general relativity is all about. It says that

    G = T

    or if you like coordinates and more standard units,

    G_{ab} = (8 pi k/c^4) T_{ab}

    where k is Newton's gravitational constant and c is the speed of light. So it says how the flow of energy and momentum through a given point of spacetime affects the curvature of spacetime there.

    But what does it mean? To see this, let's do some "index gymnastics". Stand with your feet slightly apart and hands loosely at your sides. Now, assume the Einstein equation!

    G_{ab} = T_{ab}

    Substitute the definition of Einstein tensor!

    R_{ab} - (1/2)R g_{ab} = T_{ab}

    Raise an index!

    R^a_b - (1/2)R g^a_b = T^a_b

    Contract!

    R^a_a - (1/2)R g^a_a = T^a_a

    Remember the definition of Ricci scalar, and note that g^a_a = 4 in 4d!

    R - 2R = T^a_a

    Solve for R!

    R = - T^a_a

    Okay. That's already a bit interesting. It says that when Einstein's equation is true, the Ricci scalar R is minus the sum of the diagonal terms of T^a_b. What are those terms, anyway? Well, they involve energy density and pressure. But let's wait a bit on that... let's put this formula for R back into Einstein's equation:

    R_{ab} + (1/2) T^c_c g_{ab} = T_{ab}

    Solve for R_{ab}!

    R_{ab} = T_{ab} - (1/2) T^c_c g_{ab}.
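
    If you distrust index gymnastics, the result can be spot-checked numerically. The sketch below assumes the units in which Einstein's equation reads G = T, takes the metric to be Minkowski at the point in question, and uses a made-up stress-energy tensor; the function names are likewise invented for the example.

```python
def trace(g_inv, t):
    """T^c_c = g^{ca} T_{ac}, summing over both repeated indices."""
    n = len(t)
    return sum(g_inv[c][a] * t[a][c] for a in range(n) for c in range(n))

def trace_reverse(g, g_inv, t):
    """Return the array with components T_{ab} - (1/2) T^c_c g_{ab}."""
    n = len(t)
    tr = trace(g_inv, t)
    return [[t[a][b] - 0.5 * tr * g[a][b] for b in range(n)] for a in range(n)]

G_LOWER = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
G_UPPER = G_LOWER  # the Minkowski metric is numerically its own inverse

# A hypothetical stress-energy tensor T_{ab}, for illustration only.
T = [
    [5.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 0.0],
    [0.0, 0.0, 0.0, 4.0],
]

# R_{ab} = T_{ab} - (1/2) T^c_c g_{ab}
ricci = trace_reverse(G_LOWER, G_UPPER, T)
# G_{ab} = R_{ab} - (1/2) R g_{ab} has the same algebraic form
einstein = trace_reverse(G_LOWER, G_UPPER, ricci)
```

    Here `einstein` comes out equal to `T`, and the trace of `ricci` is minus the trace of `T`, as the derivation says it should be in 4 dimensions.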

    This equation is equivalent to Einstein's equation. What does it mean? Well, first of all, it's nice because we have a simple geometrical way of understanding the Ricci tensor R_{ab} in terms of convergence of geodesics. Remember, if v is the velocity vector of the particle in the middle of a little ball of initially comoving test particles in free fall, and the ball starts out having volume V, the second time derivative of the volume of the ball is

    -R_{ab} v^a v^b

    times V. If we know the above quantity for all velocities v (or just all timelike velocities, which are the physically achievable ones), we can reconstruct the Ricci tensor R_{ab}. But we might as well work in the local rest frame of the particle in the middle of the little ball, and use coordinates that make things look just like Minkowski spacetime right near that point. Then

    g_{ab} = -1  0  0  0
              0  1  0  0
              0  0  1  0
              0  0  0  1

    and

    v^a = (1, 0, 0, 0)

    So then --- here's a good little computation for you budding tensor jocks --- we get

    R_{ab} v^a v^b = R_{00}

    So in this coordinate system we can say the 2nd time derivative of the volume of the little ball of test particles is just -R_{00} times V.

    On the other hand, check out the right side of the equation:

    R_{ab} = T_{ab} - (1/2) T^c_c g_{ab}

    Take a = b = 0 and get

    R_{00} = T_{00} + (1/2) T^c_c

    Note: demanding this to be true at every point of spacetime, in every local rest frame, is the same as demanding that the whole Einstein equation be true! So we just need to figure out what it MEANS!

    What's T_{00}? It's just the energy density at the center of our little ball. How about T^c_c? Well, remember this is just g^{ca} T_{ac}, where we sum over a and c. So --- have a go at it, tensor jocks and jockettes! --- it equals -T_{00} + T_{11} + T_{22} + T_{33}. So we get

    R_{00} = (1/2) [T_{00} + T_{11} + T_{22} + T_{33}]

    What about T_{11}, T_{22}, and T_{33}? In general these are the flow of x-momentum in the x direction, and so on. In a typical fluid at rest, these are all equal to the pressure.
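
    For a fluid at rest the formula for R_{00} therefore collapses to a one-liner. The sketch below assumes the units in which Einstein's equation reads G = T and a fluid whose three pressures are equal; the function name is made up for the example.

```python
def relative_volume_acceleration(energy_density, pressure):
    """Second time derivative of the ball's volume, divided by the volume, at time zero.

    This is -R_{00} = -(1/2) [T_{00} + T_{11} + T_{22} + T_{33}] with
    T_{00} = energy_density and T_{11} = T_{22} = T_{33} = pressure.
    """
    return -0.5 * (energy_density + 3.0 * pressure)

# Pressureless matter ("dust") with energy density 2 in these units.
dust = relative_volume_acceleration(2.0, 0.0)
```

    A positive energy density with zero pressure gives a negative value, so the ball starts to shrink, and positive pressure makes it shrink faster.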

    So the "simple geometrical essence of Einstein's equation" is this:

    Take any small ball of initially comoving test particles in free fall. Work in the local rest frame of this ball. As time passes the ball changes volume; calculate the second time derivative of its volume at time zero and divide by the original volume. The negative of this equals 1/2 of the following sum: the energy density at the center of the ball, plus the flow of x-momentum in the x direction there, plus the flow of y-momentum in the y direction, plus the flow of z-momentum in the z direction.

    Or, if you want a less precise but more catchy version:

    Take any small ball of initially comoving test particles in free fall. As time passes, the rate at which the ball begins to shrink in volume is proportional to the energy density at the center of the ball plus the flow of x-momentum in the x direction there plus the flow of y-momentum in the y direction plus the flow of z-momentum in the z direction.

    Note: all of general relativity can in principle be recovered from the above paragraph! Also note that the minus sign in that paragraph is good, since it says if you have POSITIVE energy density, the ball of test particles SHRINKS. I.e., gravity is attractive.