\documentstyle{article}
\parskip=1ex
\parindent=0ex
\pagestyle{plain}
\begin{document}
\bibliographystyle{plain}
\newcommand{\qi}{{\bf i}}
\newcommand{\qj}{{\bf j}}
\newcommand{\qk}{{\bf k}}
\newcommand{\rx}{\mbox{$\bf \hat{x}$}}
\newcommand{\ry}{\mbox{$\bf \hat{y}$}}
\newcommand{\rz}{\mbox{$\bf \hat{z}$}}
\newcommand{\id}{{\bf 1}}
\newcommand{\zero}{{\bf 0}}
\newcommand{\tr}{{\rm tr}}
\newcommand{\reals}{{\bf R}}
\newcommand{\cmplx}{{\bf C}}
\newcommand{\up}{|{\rm up}\rangle}
\newcommand{\down}{|{\rm down}\rangle}
\newcommand{\figrule}{\rule{5.6in}{.02in}}
\title{Spin}
\date{}
\author{Michael Weiss}
\maketitle
\paragraph{The Facts}
I will begin by stating the bare facts briefly, at the cost of some
precision.
Any quantum-mechanical system possesses angular momentum, sometimes called
{\it spin}\footnote{Historically, the term {\it spin} had (and sometimes
has) a narrower meaning than {\it angular momentum}. Picture an electron
in an atom. Sacrificing some accuracy on the altar of visual imagery,
think of the electron as orbiting about the nucleus, and spinning on its
axis. The orbital motion gives rise to so-called orbital angular momentum;
the spinning gives rise to so-called intrinsic or spin angular momentum.}.
Angular momentum is quantized; that is, it is always an integer multiple of
$\hbar/2$. (Here, $\hbar=h/2\pi$, where $h$ of course is Planck's
constant.) Traditionally, $j$ stands for angular momentum\footnote{And $l$
traditionally stands for orbital angular momentum, and $s$ stands for spin
angular momentum.}. If we use so-called natural units of measurement (and
we will), then $\hbar=1$ and possible values of $j$ are $0, \frac{1}{2}, 1,
\frac{3}{2}, \ldots$.
Suppose we measure the component of angular momentum along some axis, say the
$z$-axis. Let $j_z$ stand for this component.\footnote{People sometimes
use $m_z$ instead of $j_z$ to denote this.} Then $j_z$ is also
quantized, and possible values for $j_z$ are
\[
-j,-j+1,\ldots,j-1,j
\]
A more precise definition of $j$ is the maximum possible value for $j_z$.
For example, a system with angular momentum one-half (called a
``spin-$\frac{1}{2}$ system'' for short) will have $j_z=\pm\frac{1}{2}$; a
system with angular momentum~1 (spin~1) will have $j_z=-1,0$, or $+1$. The
same result holds for the component of angular momentum measured along {\it
any} axis.
The total magnitude of angular momentum is given by
$\sqrt{j_x^2+j_y^2+j_z^2}$; if this is measured somehow\footnote{Not so
trivial, because the three component measurements are mutually
incompatible; it is theoretically impossible to measure $j_x$ and $j_y$
simultaneously (for example). Nonetheless, the combination
$j_x^2+j_y^2+j_z^2$ {\it is} measurable.}, the result will be
$\sqrt{j(j+1)}$. In classical mechanics, we would get $j$ and not
$\sqrt{j(j+1)}$ (as we will see in a moment).
Suppose we combine two systems, one with angular momentum $j$,
the other with angular momentum $j'$. The resulting composite system will
have angular momentum equal to one of the values
\[
|j-j'|,|j-j'|+1,\ldots,j+j'
\]
For example, combining two spin-$\frac{1}{2}$ systems can result in a
spin-0 system or a spin-1 system.
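These rules are easy to tabulate. Here is a small Python sketch (the function names are mine) that lists the allowed $j_z$ values for a given $j$ and the possible total angular momenta of a combined system, using exact half-integer arithmetic:

```python
from fractions import Fraction

def jz_values(j):
    """Allowed z-components for angular momentum j: -j, -j+1, ..., +j."""
    j = Fraction(j)
    return [-j + k for k in range(int(2 * j) + 1)]

def combined_spins(j1, j2):
    """Possible total spins |j1 - j2|, |j1 - j2| + 1, ..., j1 + j2."""
    j1, j2 = Fraction(j1), Fraction(j2)
    lo = abs(j1 - j2)
    return [lo + k for k in range(int(j1 + j2 - lo) + 1)]

half = Fraction(1, 2)
# Two spin-1/2 systems combine to spin 0 or spin 1, as stated above:
assert combined_spins(half, half) == [0, 1]
# A spin-1/2 system has j_z = -1/2 or +1/2:
assert jz_values(half) == [-half, half]
```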
This addition rule for angular momenta lies behind a wealth of physical
phenomena. A simple example: suppose an atom absorbs a photon. A photon
always has spin~1. Suppose the atom starts out in a state with $j=1$.
After absorbing the photon, the atom must make a transition to a state with
$j=0,1$, or~2. The same conclusion holds when an atom emits a photon.
This three-way choice ultimately manifests itself as a triplet of spectral
lines.
Now let's be a little more precise. In classical mechanics, angular
momentum is a vector, say {\bf j}, with components $j_x,j_y,j_z$ in some
coordinate system, and magnitude $j=|{\bf j}|=\sqrt{j_x^2+j_y^2+j_z^2}$.
Any component, say $j_z$, can range in value from $-j$ to $+j$.
In quantum mechanics, we have a (complex) Hilbert space of state-vectors
(say $H$) for any system. A state-vector $v\in H$ specifies a state of the
system, and $v$ and $w$ specify the same state if and only if $v=cw$ for
some non-zero complex number $c$. Let $Q$ be some classical, real-valued
variable that you can measure (like energy or momentum). The ``quantum
version'' of $Q$ is a Hermitian operator on $H$. The eigenvalues of this
operator are the possible values you can get from measuring $Q$. The
quantum system has a definite value for $Q$ if and only if the system is in
an eigenstate of the operator. If not, the act of measurement will serve
to cast the system into such an eigenstate, with probabilities that can be
computed by the rules of quantum mechanics.
Applying this prescription to angular momentum, we see that $j_x, j_y, j_z$
all must be Hermitian operators. It turns out that the Hilbert space
of quantum states decomposes, in the most general case, into a direct sum:
\[
H = H_0 \oplus H_{1/2} \oplus \ldots \oplus H_j \oplus \ldots
\]
where each summand in turn decomposes:
\[
H_j = \bigoplus_{m=-j}^j K_{jm}
\]
and each element of $K_{jm}$ is an eigenvector of $j_z$ with eigenvalue
$m$. (Here, $m$ ranges in steps of~1 from $-j$ to $+j$.) In particular
cases, some of these summands may be missing. For example, if the quantum
system has definite angular momentum $j$, then $H=H_j$.
Each $K_{jm}$ is of course invariant under $j_z$, i.e., we have an
invariant direct-sum decomposition of $H$ for the operator $j_z$. The
$K_{jm}$ are {\it not} invariant under $j_x$ or $j_y$, but it turns out
that the $H_j$ {\it are} invariant under all three operators $j_x$, $j_y$,
and $j_z$. Put another way, the invariant direct-sum decompositions for
$j_x$ and $j_y$ have the same $H_j$'s but different $K_{jm}$'s.
Finally, the operator $j_x^2 + j_y^2 + j_z^2$ is itself a Hermitian
operator; each $H_j$ is an eigenspace, with eigenvalue $j(j+1)$.
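For the spin-$\frac{1}{2}$ case ($H_{1/2}$, dimension~2) this is easy to verify concretely: taking $j_x$, $j_y$, $j_z$ to be one-half times the Pauli matrices, the sum of squares comes out to $\frac{1}{2}(\frac{1}{2}+1)=\frac{3}{4}$ times the identity. A short Python sketch (plain nested lists, no libraries; the helper names are mine):

```python
def matmul(a, b):
    """Product of 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(*ms):
    return [[sum(m[i][j] for m in ms) for j in range(2)] for i in range(2)]

def matsub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

# Spin-1/2 angular momentum operators: one-half times the Pauli matrices.
jx = [[0, 0.5], [0.5, 0]]
jy = [[0, -0.5j], [0.5j, 0]]
jz = [[0.5, 0], [0, -0.5]]

# j_x^2 + j_y^2 + j_z^2 should be j(j+1) = (1/2)(3/2) = 3/4 times the identity:
j_squared = matadd(matmul(jx, jx), matmul(jy, jy), matmul(jz, jz))

# The three operators do not commute: [j_x, j_y] = i j_z.
comm = matsub(matmul(jx, jy), matmul(jy, jx))
```

The same check with the three $3\times 3$ spin-1 matrices gives $1(1+1)=2$ times the identity.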
This excursion into Hilbert spaces should make mathematically-minded folk
more comfortable with the catalog of ``spin facts''. The space $H_j$ is
just the space of states that have ``spin $j$''; the subspace $K_{jm}$ is
the space with component $m$ along the $z$-axis. I haven't discussed the
combination rules; this translates into statements about the direct-sum
decomposition of tensor products. Nor have I explained the curious
appearance of the term $j(j+1)$; this is bound up with the
non-commutativity of $j_x$, $j_y$, $j_z$.
Historically, the rules for spin came from playing with experimental data.
The rules worked but remained mysterious. The birth of quantum mechanics
came later, at the hands of Heisenberg, Schr\"o\-dinger, and Dirac. Spin
fell into place soon after that.
Quantum mechanics is mysterious, of course, as numerous philosophical
treatises attest. But during the crucial period from Bohr's first great
work (in~1913) to Heisenberg's discovery of matrix mechanics (in~1925),
certain technical facts sowed confusion above and beyond the general
quantum ``spookiness''. I will single out one of these for special
attention: the spin of the electron. In~1925, Goudsmit and Uhlenbeck
proposed (correctly) that the electron has spin one-half. Only integer
spin quantum systems have classical counterparts, as we will see. The
fractional spin of the electron lurked amid the general confusion during
the heyday of the ``old quantum theory'' (1913--1925), bedevilling
physicists. (Goudsmit and Uhlenbeck actually made their proposal in the
language of the old quantum theory, not the newly minted quantum mechanics.
Two years later Pauli showed how to incorporate spin into quantum mechanics.)
In these notes, I will roam through the history of the old quantum theory,
zeroing in on the adumbrations of modern quantum spin. (I will intersperse
remarks on what the pioneers were ``really doing''.) I will not lay out
the whole array of modern mathematical toys that makes Everything Clear.
The reader whose appetite has been whetted must turn to standard textbooks
for that. But I will try to paint an impressionistic picture of the luminous
synthesis, with a few broad brush strokes and one or two detailed
pointillistic patches.
\paragraph{Entr'acte}
In 1900, Planck's paper on black-body radiation introduced his constant
$h$; in~1905, Einstein's paper on the photo-electric effect introduced
light-quanta (later called photons). I hasten past these developments with
just a smattering of comments. Both phenomena dealt with the interaction
of light and matter. Late nineteenth century physicists felt they
understood light, thanks to Maxwell. They knew less about matter, and knew
they knew less, but had no reason to suspect that Newton's laws didn't hold
``all the way down''. But the interaction between light and matter lay
at the research frontier.
Einstein proposed that light of frequency $\nu$ was composed of quanta,
each with energy $h\nu$. This flies in the face of Maxwell's
identification of light with electromagnetic waves. Moreover, Einstein's
relation weds a particle concept (the energy of one photon) to a wave
concept (the frequency). (Einstein regarded his hypothesis as
provisional--- a guide to a more complete theory.)
\paragraph{The Old Quantum Theory}
Bohr's 1913 model of the hydrogen atom really starts the story. Nineteenth
and early twentieth century experimentalists heaped up data on the spectra
of atoms. Besides the plain old spectra of hydrogen, helium, sodium, etc.,
these physicists studied ionized atoms, atoms in electric fields (the Stark
effect), atoms in magnetic fields (the Zeeman effect), atoms in crossed
electric and magnetic fields; they discovered the so-called
``fine-structure'' of spectra, where one spectral line under higher
resolution splits into two or more spectral lines; and so on, and so on,
for thousands of journal pages.
Bohr's model solidified regularities in this data already partly noted by
earlier workers (notably Balmer and Ritz). Bohr's transition picture (as I
will call it) states that an atom has a discrete set of energy levels.
When the atom emits a photon, it loses energy, changing from (say) energy
level $E_i$ to level $E_j$, so the photon has energy $E_i-E_j$ and hence
frequency
\[
\nu=(E_i-E_j)/h
\]
(This is known as the Bohr frequency condition.) So to explain the
spectrum of an atom, we must identify all its energy levels. Of course,
this picture depends fundamentally on the Einstein relation. It is utterly
incompatible with any classical picture in which an orbiting electron
gradually changes from one orbit to another, emitting radiation as it
decelerates. Indeed, classically there should be no stable orbits at
all!--- as noted in all histories of quantum theory.
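To make the frequency condition concrete: combining Bohr's levels $E_n=-R/n^2$ (with the sign convention that bound states have negative energy) and $\nu=(E_i-E_j)/h$ recovers the visible lines of hydrogen. A small Python sketch, with the Rydberg energy $R\approx 13.6\,$eV and the constants inserted by hand for illustration:

```python
R = 13.6057          # Rydberg energy, eV (approximate measured value)
H = 4.135667e-15     # Planck's constant, eV*s
C = 2.99792458e8     # speed of light, m/s

def energy(n):
    """Bohr energy level: E_n = -R/n^2."""
    return -R / n**2

def transition_wavelength_nm(n_hi, n_lo):
    """Photon wavelength for a jump n_hi -> n_lo, via nu = (E_i - E_j)/h."""
    delta_e = energy(n_hi) - energy(n_lo)   # eV, positive for emission
    nu = delta_e / H                        # Bohr frequency condition
    return C / nu * 1e9                     # wavelength in nm

# The n=3 -> n=2 jump gives the red Balmer line (H-alpha), near 656 nm:
print(round(transition_wavelength_nm(3, 2)))
```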
The transition picture survives in modern quantum mechanics, although
naturally much refined. The electromagnetic field is a quantum system; so
is the atom; the two are coupled to form one combined quantum system. The
interaction between the atom's charged particles and the electromagnetic
field shows up as a term in the Hamiltonian (the energy operator, which
governs the time evolution of the system). If this so-called
coupling term is neglected, then the mathematics predicts that the atom
should have discrete stable energy levels (i.e., if the atom is in an
eigenstate for a particular energy level, then it will stay in
that eigenstate forever).
Just as the energy of the atom is quantized, so too is the energy of the
electromagnetic field at a particular frequency. This falls out from
computations with the quantum version of Maxwell's equations. We
interpret this result as saying that the electromagnetic field consists of
quanta (or photons). Einstein's relation $E=h\nu$ can be deduced, rather
than postulated.
Finally, if the coupling term between the field and the charged particles
is {\it not} neglected, transitions become possible between formerly stable
eigenstates. Both the atom and the field change their state together.
Bohr's frequency condition can be derived, placing a satisfying capstone on
the theoretical development.
Returning to history's tangled tale\ldots Bohr's 1913 paper did more than
introduce the transition picture. He also derived the energy levels for
the hydrogen atom, by a remarkably simple argument. He assumed that the
orbital angular momentum of the atom's solitary electron was $n\hbar$, with
$n$ a positive integer. He assumed circular orbits, and applied Newton's
laws, and the formula for the energy-levels of the atom popped right out.
And it was right! (Apart from second-order
corrections\footnote{Specifically, Bohr's energy levels agree exactly with
those derived from Schr\"o\-dinger's equation, when we neglect the spin of
the electron, the spin of the proton, and relativistic effects. (Spin
itself is sometimes considered a relativistic effect.)}.)
Once Bohr opened the gates, other quantum soldiers flocked in to
consolidate the victory. Sommerfeld (in 1916) extended Bohr's model to
allow for elliptical orbits. He included relativistic corrections, and
calculated the effects of magnetic fields. Sommerfeld assembled a cadre of
students, in Munich. Bohr had his group at Copenhagen, and Born started
one at G\"ottingen. Heisenberg and Pauli began their careers as students
in Sommerfeld's seminar on spectra, and studied at all three centers.
Almost all work during the period 1913--1925 shared a common approach, now
called the {\it old quantum theory}. One began with a classical treatment
of a dynamical problem--- for example, elliptical orbits. One slapped {\it
quantum conditions} on top. Quantum conditions stated that certain
classical quantities had to be integer multiples of $\hbar$--- like the
orbital angular momentum of the electron, in Bohr's theory of the hydrogen
atom. (Sommerfeld added other quantum conditions.) Quantum conditions
plus classical physics yielded a discrete set of energy levels, which in
turn yielded spectra via Bohr's frequency condition.
These forays met initially with great success, but ultimately with crushing
defeat. Generally speaking, the Bohr-Sommerfeld approach handled two-body
problems reasonably well, but collapsed when faced with three-or-more-body
problems. The hydrogen atom is a two-body problem: an electron orbits a
proton. Bohr himself extended the theory to ``hydrogenic'' atoms--- atoms
that consist of a core of tightly-bound electrons around the nucleus, and a
single loosely-bound electron orbiting further out. Hydrogenic atoms are
``approximately'' two-body systems: core-plus-electron. As soon as the
theoreticians turned to three-body problems, theory and experiment parted
company. Helium (two electrons and a nucleus) and the singly-ionized
hydrogen molecule (one electron and two nuclei) spelled doom for the old
quantum theory.
\paragraph{The Hydrogen Atom, Then and Now}
Let us look at a simple model of the hydrogen atom: we neglect the spin of the
proton and the electron, and relativistic effects. What remains is a point
charge in an inverse-square force field--- the classical Kepler problem.
With these simplifications, the states of the hydrogen atom can be
specified by giving three integer labels $(n,l,m)$, with:
\begin{eqnarray*}
n & = & 1,2,\ldots\\
l & = & 0,1,\ldots, n-1\\
m & = & -l,-(l-1),\ldots,(l-1), l\\
\end{eqnarray*}
The labels $(n,l,m)$ are called {\it quantum numbers}, and made their debut
in Sommerfeld's elliptical orbit model. They correspond to certain
classical features of the orbit:
\begin{eqnarray*}
n & \leftrightarrow & \rm \sqrt{semimajor\ axis}\\
l & \leftrightarrow & \rm orbital\ angular\ momentum\\
m & \leftrightarrow & \rm orientation\ of\ the\ orbit
\end{eqnarray*}
Sommerfeld's quantum conditions stated that these three classical
quantities were restricted to integer values (and in fact the collection of
values given above).\footnote{Sommerfeld gave a single more general quantum
condition which implied the quantum conditions for $n$, $l$, and $m$ as
special cases. So Sommerfeld's approach was not as ad hoc as this summary
makes it look.} I should really say ``integer multiples of $\hbar$'', but
from now on I will adopt so-called natural units of measurement, which are
chosen so that $\hbar=1$.
I should make one refinement to this description of the ``old quantum''
viewpoint. The quantum conditions apply only to so-called {\it stationary}
orbits. Bohr offered no details for what happened during a transition from
one stationary orbit to another (a ``quantum jump''), nor could he explain
why these orbits were stationary. Initially, physicists probably regarded
these questions as topics for future research.
In the post-1925 reformulation, we have to solve a particular instance of
Schr\"o\-dinger's equation, subject to certain boundary conditions. The
space of all solutions is a Hilbert space, with a basis $\{v_{nlm}\}$.
The vector $v_{nlm}$ is simultaneously an eigenvector of three operators:
\begin{tabbing}
xxxxxxxx\=xxxxx\=\kill
\>$E$,\>the energy\\
\>$L$,\>the magnitude of the orbital angular momentum\\
\>$L_z$,\>the $z$-component of the orbital angular momentum\\
\end{tabbing}
So we can say that if the hydrogen atom is in state $v_{nlm}$, then it
has a definite energy, and its angular momentum has a definite magnitude
and $z$-component. In fact, the eigenvalues for $v_{nlm}$ are:
\begin{eqnarray*}
E v_{nlm} & = & \frac{R}{n^2}v_{nlm}\\
L v_{nlm} & = & \sqrt{l(l+1)}v_{nlm}\\
L_z v_{nlm} & = & m v_{nlm}\\
\end{eqnarray*}
where $R$ is a physical constant known as Rydberg's constant.
The modern equivalent of the old notion of ``stationary orbit'' is
``eigenstate of the energy operator''. Such eigenstates do not change with
time, and they possess a definite value for the energy. (These facts are
closely related.) Schr\"o\-dinger's equation, in fact, amounts to
$Ev=\lambda v$, where $E$ is the energy operator.
\begin{figure}[t]
\setlength{\unitlength}{0.0125in}%
\begin{picture}(410,180)(15,505)
\put( 15,565){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$n=1$}}}
\put( 15,615){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$n=2$}}}
\put( 15,645){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$n=3$}}}
\put( 75,505){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$l=0$}}}
\put(160,505){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$l=1$}}}
\put(312,505){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$l=2$}}}
\put( 70,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!0$}}}
\put(118,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!\!-\!1$}}}
\put(158,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!0$}}}
\put(191,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!1$}}}
\put(237,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!\!-\!2$}}}
\put(274,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!\!-\!1$}}}
\put(312,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!0$}}}
\put(345,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!1$}}}
\put(380,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!2$}}}
\thicklines
\put( 70,615){\line( 1, 0){ 25}}
\put(120,615){\line( 1, 0){ 25}}
\put(155,615){\line( 1, 0){ 25}}
\put(190,615){\line( 1, 0){ 25}}
\put( 70,645){\line( 1, 0){ 25}}
\put(120,645){\line( 1, 0){ 25}}
\put(155,645){\line( 1, 0){ 25}}
\put(190,645){\line( 1, 0){ 25}}
\put(240,645){\line( 1, 0){ 25}}
\put(275,645){\line( 1, 0){ 25}}
\put(310,645){\line( 1, 0){ 25}}
\put(345,645){\line( 1, 0){ 25}}
\put(380,645){\line( 1, 0){ 25}}
\put( 70,565){\line( 1, 0){ 25}}
\put( 50,685){\line( 0,-1){140}}
\put( 50,545){\line( 1, 0){375}}
\put(124,614){\line(-1,-1){ 48}}
\put(155,613){\line(-5,-3){ 80}}
\put(202,614){\line(-5,-2){120}}
\end{picture}
\caption{Term Scheme for Hydrogen}
\label{fig-1}
\figrule
\end{figure}
Figure~\ref{fig-1} gives a pictorial representation of the basis
$\{v_{nlm}\}$ in a form known as a {\it term scheme}. The horizontal lines
stand for basis vectors (or equivalently, stationary quantum states);
height gives energy, and transitions are indicated by slanted lines. (Only
three transitions are pictured, to avoid clutter.)
From this simple diagram, many treasures flow. The next few paragraphs
give a taste.
\paragraph{Degeneracy}
We see that all energy levels, except for $n=1$ (the
so-called ground state), contain several states. Physicists say that the
energy levels are {\it degenerate}; in particular, the degree of degeneracy
of level $n$ is $1+3+5+\ldots + (2n-1) = n^2$. Mathematically, the $n$-th
eigenspace of the energy operator has dimension $n^2$.
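The count can be checked by brute force; a small Python sketch enumerating the $(l,m)$ pairs at level $n$:

```python
def states(n):
    """Quantum numbers (l, m) sharing energy level n: l < n, |m| <= l."""
    return [(l, m) for l in range(n) for m in range(-l, l + 1)]

for n in (1, 2, 3, 4):
    d = len(states(n))
    assert d == n * n      # 1 + 3 + 5 + ... + (2n-1) = n^2
    print(n, d)            # prints 1 1, 2 4, 3 9, 4 16
```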
Moralists may abhor degeneracy, but physicists love it, for it adds spice
and complexity. (Of course, the same may hold for moral degeneracy!)
Multi-dimensional eigenspaces furnish room for interesting transformation
groups. Physical applications of group representation theory really took
off after the discovery of quantum mechanics.
\paragraph{Perturbations and Fine Structure}
Bohr's formula for the energy levels
is quite simple:
\[
E_{nlm} = \frac{R}{n^2}
\]
This simplicity stems from the simplicity of the formula he used for the energy:
\[
E = \frac{p^2}{2m_e} + \frac{{\rm constant}}{r}
\]
where $p^2$ is the square of the momentum, $r$ is the radial distance from
the proton to the electron, and $m_e$ is the mass of the electron. The
first term represents the kinetic energy, the second the potential energy
due to the inverse square force (Coulomb potential). The quantum theorists
came along later and said that $E$, $p^2$, and $r$ are all really
operators, and the energy levels are really eigenvalues, but the source of
the simplicity remains the same. ``Simplicity'' is too vague a word:
``symmetry'' serves better, as we will see.
For example, the formula for $E$ is spatially symmetric under rotations
about the proton (because $p^2$ and $r$ are unchanged by such rotations).
Thus the formula for $E_{nlm}$ cannot depend on $m$, which specifies the
orientation of the orbit. If we impose a magnetic field along the
$z$-axis, say, then $E$ acquires a new term (of the form $BL_z$, where $B$
is the magnetic field strength, and $L_z$ is the $z$-component of orbital
angular momentum). The new formula for $E_{nlm}$ now depends on $m$, since
$L_z$ is not symmetric under arbitrary rotations. The magnetic field is
referred to as a perturbation.
\begin{figure}[t]
\setlength{\unitlength}{0.0125in}%
\begin{picture}(410,180)(15,505)
\put( 15,565){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$n=1$}}}
\put( 15,615){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$n=2$}}}
\put( 15,645){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$n=3$}}}
\put( 75,505){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$l=0$}}}
\put(160,505){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$l=1$}}}
\put(312,505){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$l=2$}}}
\put( 70,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!0$}}}
\put(118,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!\!-\!1$}}}
\put(158,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!0$}}}
\put(191,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!1$}}}
\put(237,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!\!-\!2$}}}
\put(274,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!\!-\!1$}}}
\put(312,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!0$}}}
\put(345,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!1$}}}
\put(380,525){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{$m\!=\!2$}}}
\thicklines
\put( 70,615){\line( 1, 0){ 25}}
\put(155,615){\line( 1, 0){ 25}}
\put( 70,645){\line( 1, 0){ 25}}
\put(155,645){\line( 1, 0){ 25}}
\put(310,645){\line( 1, 0){ 25}}
\put( 70,565){\line( 1, 0){ 25}}
\put( 50,685){\line( 0,-1){140}}
\put( 50,545){\line( 1, 0){375}}
\put(120,620){\line( 1, 0){ 25}}
\put(190,610){\line( 1, 0){ 25}}
\put(120,650){\line( 1, 0){ 25}}
\put(190,640){\line( 1, 0){ 25}}
\put(240,655){\line( 1, 0){ 25}}
\put(275,650){\line( 1, 0){ 25}}
\put(345,640){\line( 1, 0){ 25}}
\put(380,635){\line( 1, 0){ 25}}
\put(133,619){\line(-1,-1){ 52}}
\put(165,614){\line(-5,-3){ 80}}
\put(194,609){\line(-5,-2){110}}
\end{picture}
\caption{Term Scheme for Hydrogen in a (Strong) Magnetic Field}
\label{fig-2}
\figrule
\end{figure}
We can see the effects of the magnetic field in changes to the term scheme:
some states move up in energy, some move down. Reduced symmetry has led
to reduced degeneracy. Transitions that formerly had the same $\Delta E$
now have slightly different $\Delta E$'s (see figure~\ref{fig-2}). This
means that a single spectral line with no magnetic field will split into
multiple spectral lines when the field is turned on. This is the famous
{\it Zeeman effect}.
Another example: in~1916, Sommerfeld replaced Bohr's simple formula for $E$
with its relativistic equivalent. This led to a new formula for
$E_{nlm}$, depending on both $n$ and $l$ (but not on $m$, of course).
These relativistic corrections (or perturbations) account for the so-called
{\it fine structure} of the hydrogen spectrum, known already to
spectroscopists long before~1916.
A historical footnote: one modern author has written:
\begin{quotation}
When Dirac developed relativistic quantum mechanics, the relativistic
Coulomb problem proved to be {\it exactly solvable} \ldots But the
resulting formula for the energy levels was truly a surprise: {\it The new
answer was precisely the old Sommerfeld formula!}
How could this possibly be? Clearly Sommerfeld's methods were
heuristic (Bohr quantization rules), out-dated by {\it two} revolutions
(Heisenberg-Schr\"o\-dinger nonrelativistic quantum mechanics and \linebreak[4]
Dirac's relativistic quantum mechanics) and his methods obviously had
no place at all for the electron spin \ldots So Sommerfeld's correct
answer could only be a lucky accident, a sort of cosmic joke at the
expense of serious minded physicists.
\end{quotation}
The author calls this the ``Sommerfeld puzzle'', and resolves it, but I
will discuss it no further.
\paragraph{The Periodic Table}
From~1920 to~1923, Bohr applied his ideas to explain
features of the periodic table; Bohr's work in this vein was corrected and
extended by Pauli in~1924 (with an assist from Stoner).
Suppose we treat the states in the term scheme for hydrogen as slots to
hold electrons. Let us hypothesize that each slot can hold at most two
electrons. We start with hydrogen, and keep adding electrons, adding new
electrons always to the lowest slot that has a vacancy. (Bohr christened
this the {\it Aufbauprinzip}, the building-up principle.) The energy
levels of the slots may shift somewhat as a result of interactions between
the electrons. This simple model (extremely naive from the modern
viewpoint) serves to explain a remarkable number of chemical regularities.
For example, energy level $n$ can hold $2n^2$ electrons. We therefore get
completely filled energy levels for 2 electrons, 2+8 electrons, 2+8+18
electrons, etc.--- which correspond exactly to the noble gases.
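Under the stated assumptions (two electrons per $(n,l,m)$ slot, $n^2$ slots at level $n$), the bookkeeping is a one-liner; a quick Python sketch:

```python
def capacity(n):
    """Electrons held by level n in the naive model: 2 slots * n^2 states."""
    return 2 * n * n

caps = [capacity(n) for n in (1, 2, 3, 4)]
print(caps)                        # [2, 8, 18, 32]

# Filled-level totals 2, 2+8, 2+8+18 quoted in the text:
totals = [sum(caps[:k]) for k in (1, 2, 3)]
print(totals)                      # [2, 10, 28]
```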
Bohr was able to explain some properties of the rare earth elements using
his version of this model\footnote{Bohr did not make use of the quantum
number $m$; instead, he had $n$ slots at level $n$ ($l=0,1,\ldots,n-1$),
each capable of holding $2n$ electrons. Note that this is actually
incorrect.}; he even predicted correctly that element~72 (not yet
discovered) would {\it not} be a rare earth, but instead would resemble
zirconium (contrary to what some chemists expected). Students of Bohr then
discovered element~72 in zirconium ore samples, and element~72 was named
hafnium in honor of the Latin name for Copenhagen.
Why two electrons per slot? Pauli proposed adding a new quantum number to
the triple $(n,l,m)$; nowadays we use $s$ for this number, and recognize
that it stands for the spin of the electron. So $s$ is restricted to the
values $\pm \frac{1}{2}$, and each state (or slot) in the term scheme above
is really two states. In other words, our Hilbert space must be enlarged
to a space with basis $\{v_{nlms}\}$. Pauli also proposed his famous {\it
exclusion principle}: a state can hold at most one electron.
A historical note: Bohr made no detailed orbit calculations for
multi-electron atoms. He did use some general intuitive principles
stemming from a classical picture, e.g., an electron in a state of large $n$
will be far away from the nucleus, and electrons with smaller $n$ will
partly ``screen'' the charge of the nucleus from the far-off electron.
Nevertheless, the Bohr-Sommerfeld hydrogen model had a comforting,
semi-classical feel to it: we compute the classical orbits, then decree
that only some are permitted. For multi-electron atoms, Bohr dropped this
connection. The term scheme becomes almost a combinatorial device.
Physicists (such as Born and Heisenberg) who did perform detailed orbit
calculations found utter disagreement between theory and experiment--- even
for helium, a mere two electrons. No surprise, since the classical picture
leaves out spin, the exclusion principle, and other purely quantum
interactions between the electrons. Bohr intuited just how far to push the
classical picture; he picked those approximation schemes that survive
(reinterpreted) in quantum mechanics.
\paragraph{The Zeeman Effect}
Place an atom in a magnetic field, and (as Zeeman discovered) its spectral
lines will split. Pais's book {\it Inward Bound} gives fascinating details
of the experimental history.
For some atoms, spectral lines split in three under a magnetic field. This
is known as the {\it normal Zeeman effect}. For other atoms, the spectrum
displays a more complex pattern of splittings, known as the {\it
anomalous Zeeman effect}. When the magnetic field becomes strong enough,
however, some lines merge back together, and the anomalous Zeeman splitting
coalesces smoothly into a normal splitting--- this is called the {\it
Paschen-Back effect}. Young Heisenberg wrestled with the anomalous Zeeman
effect, as a student in Sommerfeld's seminar on atomic spectra (see
Cassidy's biography {\it Uncertainty}).
The full explanation of all these phenomena presented quite a puzzle to the
quantum pioneers. Even today, fitting all the pieces together can be
confusing. In the next few sections I will try to assemble this puzzle.
I treat the Zeeman effect in such gory detail for two reasons. First,
historical interest: the Zeeman effect gave the first hints of the
complexities of quantum spin. Many of the strange ``spin facts'' I stated
earlier were first discovered empirically, from poring over spectra and
noticing patterns.
Second, ``gestalt'': the very complexity of the Zeeman effect means that any
explanation of it must bring together several aspects of quantum mechanics.
We can get started by comparing figures~\ref{fig-1} and~\ref{fig-2}. We
see that spectral lines will split if energy levels move up or down (i.e.,
if the energy levels of quantum states change). To progress further, one
needs a formula for the energy levels of the states of an atom--- or
rather, a formula for the change in energy level due to the magnetic field.
Say we write $E = E^0 + E^{\rm mag}$, where $E^0$ is the energy with no
magnetic field, and $E^{\rm mag}$ is the energy due to the magnetic field.
(I will add quantum-number subscripts (like we had with $E_{nlm}$) in a
moment.) Here is a formula for $E^{\rm mag}$ that accounts for the normal
Zeeman effect:
\begin{equation}
E^{\rm mag}_{LM_LSM_S} = {\rm constant}\,B(M_L+2M_S)
\label{eq-1}
\end{equation}
where $L$, $M_L$, $S$, and $M_S$ are quantum numbers that I will discuss
below, and $B$ is the strength of the magnetic field. Here is a formula
that accounts for the anomalous Zeeman effect:
\begin{equation}
E^{\rm mag}_{JLSM_J} =
{\rm constant}\,BM_J\left(1+\frac{J(J+1)-L(L+1)+S(S+1)}{2J(J+1)}\right)
\label{eq-2}
\end{equation}
where $J$ and $M_J$ are another pair of quantum numbers to be discussed
below. The complicated fraction built out of $J$, $L$, and $S$ on the
right hand side is known as the Land\'e $g$-factor.
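Formulas~\ref{eq-1} and~\ref{eq-2} are easy to evaluate directly; here is a Python sketch (overall constant set to~1, function names my own) that computes the Land\'e factor and the two energy shifts in exact arithmetic:

```python
from fractions import Fraction

def lande_g(j, l, s):
    """The Lande g-factor appearing in the anomalous Zeeman formula."""
    j, l, s = Fraction(j), Fraction(l), Fraction(s)
    return 1 + (j*(j+1) - l*(l+1) + s*(s+1)) / (2*j*(j+1))

def e_mag_normal(m_l, m_s, b=1):
    """Normal Zeeman shift, formula (1), with the constant set to 1."""
    return b * (m_l + 2 * m_s)

def e_mag_anomalous(j, l, s, m_j, b=1):
    """Anomalous Zeeman shift, formula (2), with the constant set to 1."""
    return b * Fraction(m_j) * lande_g(j, l, s)

half = Fraction(1, 2)
# Standard checks: a lone spin (L=0, J=S=1/2) gives g=2;
# an L=1, S=1/2 level with J=3/2 gives g=4/3.
print(lande_g(half, 0, half))            # 2
print(lande_g(Fraction(3, 2), 1, half))  # 4/3
```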
So there are four pieces to the puzzle:
\begin{enumerate}
\item What is the meaning of the six quantum numbers $L$, $S$, $J$, $M_L$,
$M_S$, and $M_J$?
\item How do the ``normal'' and ``anomalous'' formulas (formulas~\ref{eq-1}
and~\ref{eq-2}) account for the details of the Zeeman splitting?
\item Where do the normal and anomalous Zeeman formulas come from?
\item How does the anomalous Zeeman formula turn into
the normal Zeeman formula in the Paschen-Back limit of strong magnetic
fields?
\end{enumerate}
The next four sections tackle each of these questions in turn.
\paragraph{Angular Momentum Quantum Numbers}
{\it What is the meaning of the six quantum numbers $L$, $S$, $J$, $M_L$,
$M_S$, and $M_J$?}
The ``old quantum'', Bohrish-Sommerfeldian notion of quantum numbers ran
like this:
\begin{quote}
Certain classical quantities are constrained, for a stationary quantum
state, to take on only integer values. These values are called the quantum
numbers of the state.
\end{quote}
Generalizing this just a bit, we allow any discrete set of real
numbers in place of the integers (e.g., the set $\{\sqrt{j(j+1)}\}$ as $j$
ranges over the integers; or more significantly, as $j$ ranges over the set
of integers and half-integers).
Dressing this up in mathematical language, we have a space (actually a
differential manifold) $\Sigma$ that represents the set of all classical
states. Classical quantities are continuous real-valued functions on
$\Sigma$. Bohr and Sommerfeld assumed that nature really permits only a
subset $\Sigma_0$ to occur as stationary states. They assumed further,
for certain classical $f:\Sigma \rightarrow \reals$, that the image
$f(\Sigma_0)$ was a discrete subset of \reals.
The ``new quantum'' notion of quantum numbers (\`a la Heisenberg,
Schr\"o\-dinger, and Dirac) runs like this:
\begin{quote}
Classical quantities correspond to Hermitian operators on a complex Hilbert
space. Quantum states are specified by elements of the space. If a
Hermitian operator $Q$ has a discrete spectrum, and $v$ is an eigenvector
of $Q$ with eigenvalue $q$, then we say that $q$ is a quantum number of the
state specified by $v$.
\end{quote}
I will start with the ``old quantum'' explanation of our six quantum
numbers. (The ``new quantum'' version will have to wait till we discuss
the Paschen-Back effect.) Consider first the hydrogen atom. The electron
possesses orbital angular momentum, given by a vector\footnote{We mean an
ordinary 3-space vector, not a vector in a (complex) Hilbert space.} {\bf
l}. Since the electron is spinning, it has also spin angular momentum,
given by a vector {\bf s}. The total angular momentum of the
system\footnote{ignoring any motion of the proton, a good approximation} is
then given by the vector ${\bf j} = {\bf l} + {\bf s}$.
For a multi-electron atom, we let {\bf L} be the sum of the {\bf l}'s
over all the electrons; {\bf S} is likewise the sum of all the {\bf s}'s;
and ${\bf J} = {\bf L} + {\bf S}$.
We can now say what $J$ is: it's the quantum number associated with the
magnitude of {\bf J} (likewise for $L$ and $S$). If the term
``associated'' seems too vague, here is the exact relationship: the
magnitude of {\bf J} is $\sqrt{J(J+1)}$.
We pick an arbitrary direction in space and call it the $z$-axis. $M_J$ is
the quantum number for the component of {\bf J} along the $z$-axis. In
other words, if ${\bf J} = (J_x,J_y,J_z)$ in some coordinate system, then
$M_J$ is the quantum number for $J_z$. Likewise for $M_L$ and $M_S$. Note
that $M_J=M_L+M_S$, but we have no such simple relation for $J$, $L$, and
$S$; the magnitude of {\bf J} depends of course on the angle between {\bf
L} and {\bf S} as well as their magnitudes.
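To make the bookkeeping concrete, here is a small Python sketch of these relations (the function names are my own; this is illustration, not standard notation):

```python
from fractions import Fraction as F
from math import sqrt

def magnitude(J):
    """Length of an angular momentum vector with quantum number J:
    sqrt(J(J+1)), in natural units (hbar = 1)."""
    return sqrt(J * (J + 1))

def m_values(J):
    """The possible z-components: M = -J, -J+1, ..., J-1, J."""
    M, out = -F(J), []
    while M <= J:
        out.append(M)
        M += 1
    return out

# A J = 3/2 state has |J| = sqrt(15)/2 and four possible M_J values:
print(magnitude(F(3, 2)))                   # about 1.936
print([str(M) for M in m_values(F(3, 2))])  # ['-3/2', '-1/2', '1/2', '3/2']
# M_J is simply the sum of the z-components: M_L = 1, M_S = -1/2
# gives M_J = 1/2.  No such simple relation holds for J itself.
```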
With our six angular momentum quantum numbers, we have enough ammunition to
explain the Zeeman splitting (as we will see later). You may be wondering
what happened to the quantum number $n$. Our six quantum numbers are not a
complete set--- that is, they are insufficient to fully specify the quantum
state of the atom (or label the set of basis vectors, in modern
terminology). But they are sufficient to explain the Zeeman effect.
Sommerfeld (and Bohr) knew nothing of spin, of course--- that had to wait
for Goudsmit and Uhlenbeck. Even so, Sommerfeld had concluded that he
needed to augment his arsenal of quantum numbers, just from poring over
spectra. Bohr and Sommerfeld supposed that in multi-electron atoms, the
``inner core'' of electrons interacted in some complex way with the outer
electrons. That justified the introduction of more quantum numbers. (For
historical accuracy, I note that Sommerfeld's set of quantum numbers
differed from the ``modern'' set ($J$, $S$, etc.). The ``modern'' set is
equivalent to Sommerfeld's, but conceptually clearer.)
Sommerfeld turned the problem of the anomalous Zeeman effect over to young
Heisenberg, a student in his seminar. Heisenberg, struggling with it,
introduced half-integer quantum numbers into the old quantum theory for the
first time. Sommerfeld was shocked, and urged Heisenberg not to
publish\footnote{Alfred Land\'e independently came up with the same
half-integer trick. He did publish, and so attached his name to the
Land\'e $g$-factor.}. ``If we know one thing about quantum numbers, we know
they are integers!'' Bohr, too, expressed displeasure. Pauli warned that
today we allow half-integers, tomorrow it's quarter-integers, then eighths,
sixteenths, and before you know it the quantum conditions have eroded away.
\paragraph{Selection Rules}
{\it How do the normal and anomalous Zeeman formulas account for the details of
the Zeeman splitting?}
I omitted many transition lines from figures~\ref{fig-1} and~\ref{fig-2}
to avoid cluttering the diagram. Nature seems to share this taste for
simplicity. Of the multitude of transition lines one might draw, most are
forbidden by {\it selection rules}.
For figures~\ref{fig-1} and~\ref{fig-2}, the appropriate selection rules
assert that $l$ changes by one unit, and $m$ changes by at most one unit:
$\Delta l = \pm 1$, $\Delta m = 0, \pm 1$. Conservation of angular
momentum accounts for these rules, as hinted earlier. But before getting
into that, let us explore the role of selection rules in the Zeeman effect.
As it happens, the normal and anomalous Zeeman effects call for different
selection rules. Four selection rules apply to the normal Zeeman effect:
\begin{itemize}
\item $\Delta L = 0, \pm 1$.
\item $\Delta M_L = 0, \pm 1$.
\item $\Delta S = 0$.
\item $\Delta M_S = 0$.
\end{itemize}
Plugging this into the formula $E^{\rm mag} = {\rm constant}\,B(M_L +
2M_S)$, we get:
\begin{equation}
\Delta E^{\rm mag} = {\rm constant}\, B\Delta M_L
\label{eq-3}
\end{equation}
Spin has dropped out completely! $\Delta M_L$ has only three possible
values; thus the magnetic perturbation $\Delta E^{\rm mag}$ splits each
spectral line into three.
For example, consider transitions from $L=2$ states to $L=1$ states.
Without the magnetic field, each $L=2$ state belongs to a degenerate
quintuplet of states (distinguished by the five possible values of
$M_L$)--- where ``degenerate'', you may recall, simply means that all
states in the quintuplet have the same energy. Likewise, each $L=1$ state
belongs to a degenerate triplet. This degeneracy stems from the fact that
the energy does not depend on $M_L$, when there is no field.
Had we no selection rules at all, we would have~15 possible transitions
from the $L=2$ quintuplet to the $L=1$ triplet. The selection rule cuts
this down to~9 (we can draw three lines to each state in the triplet).
Since the quintuplet and the triplet are each degenerate, $\Delta E$ is the
same for all~9 quintuplet-triplet transitions. We see a single spectral
line from~9 transitions.
Now turn on the magnetic field. The degeneracy is lifted--- the energy
levels in the quintuplet separate slightly, as do those in the triplet (see
figure~\ref{fig-2}). Group the~9 quintuplet-triplet transitions into
three groups of three, according to the value of $\Delta M_L$. Each group
has the same $\Delta E^{\rm mag}$, and so~9 transitions give rise to
three spectral lines.
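The counting in the last few paragraphs is mechanical enough to check by brute force. Here is a short Python sketch (variable names mine) of the quintuplet-to-triplet bookkeeping:

```python
# M_L values for the upper (L=2) and lower (L=1) multiplets.
upper = range(-2, 3)   # quintuplet: M_L = -2, ..., 2
lower = range(-1, 2)   # triplet:    M_L = -1, 0, 1

all_transitions = [(u, l) for u in upper for l in lower]
allowed = [(u, l) for (u, l) in all_transitions if abs(u - l) <= 1]

# Group the allowed transitions by Delta M_L; each group shares one
# energy shift, Delta E = constant * B * Delta M_L.
shifts = {u - l for (u, l) in allowed}

print(len(all_transitions))  # 15 conceivable transitions
print(len(allowed))          # 9 allowed by the selection rule
print(len(shifts))           # 3 distinct shifts -> 3 spectral lines
```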
For the anomalous Zeeman effect, different selection rules apply:
\begin{itemize}
\item $\Delta J = 0,\pm 1$.
\item $\Delta M_J = 0, \pm 1$.
\end{itemize}
$M_L$ and $M_S$ fade from the picture. We will search for them again when
we come to the Paschen-Back effect. ($L$ and $S$ are still with us, though.)
We want to plug this into formula~\ref{eq-2} for the anomalous Zeeman
$E^{\rm mag}$. We don't need the full complexity of formula~\ref{eq-2};
the following is enough:
\begin{equation}
E^{\rm mag}_{JLSM_J} = f(J,L,S)M_J
\label{eq-4}
\end{equation}
The function $f(J,L,S)$ contains the complicated fraction we had before,
plus constants (like the magnetic field strength $B$). Consider a
transition from a $(J,L,S,M_J)$ state to a $(J',L',S',M_J')$ state.
Plugging in the selection rules, we get:
\begin{equation}
\Delta E^{\rm mag} = f(J,L,S)M_J - f(J',L',S')M_J'
\end{equation}
As before, we first turn the field off, then on. Without the field, the
energy of a state does not depend on $M_J$, and so each state belongs to a
degenerate multiplet. We label the multiplet with the quantum numbers $(J,L,S)$.
Without the field, all transitions from one multiplet to another have the same
$\Delta E$--- shining out one single spectral line.
Turn the field on, and this overloaded spectral line splits according to
the different values of $\Delta E^{\rm mag}$. In contrast to normal Zeeman
splitting, transitions with the same $\Delta M_J$ generally don't have the
same $\Delta E^{\rm mag}$. Count the spectral lines, and you've counted
the transitions (apart from accidental coincidences, which are rare).
\begin{figure}[t]
\setlength{\unitlength}{0.0125in}%
\begin{picture}(357,231)(10,493)
\thicklines
\put(264,702){\line( 1, 0){ 15}}
\put(288,702){\line( 1, 0){ 15}}
\put(314,702){\line( 1, 0){ 15}}
\put(340,702){\line( 1, 0){ 15}}
\put( 80,724){\line( 0,-1){189}}
\put( 80,535){\line( 1, 0){287}}
\put(125,553){\line( 1, 0){ 15}}
\put(153,553){\line( 1, 0){ 15}}
\put(186,628){\line( 1, 0){ 15}}
\put(213,628){\line( 1, 0){ 15}}
\put(204,619){\line(-1,-1){ 57}}
\put(150,562){\line( 5, 4){155}}
\put( 85,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm M}}}
\put(220,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm 1/2}}}
\put(124,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm -1/2}}}
\put(155,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm 1/2}}}
\put(189,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm -1/2}}}
\put(345,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm 3/2}}}
\put(320,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm 1/2}}}
\put(289,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm -1/2}}}
\put(257,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm -3/2}}}
\put( 97,493){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm J}}}
\put( 10,551){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm L=0 J=1/2}}}
\put( 10,628){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm L=1 J=1/2}}}
\put( 10,701){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm L=1 J=3/2}}}
\put(147,589){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm D}}}
\put(269,638){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm D}}}
\put(157,581){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm 1}}}
\put(280,630){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm 2}}}
\end{picture}
\caption{Sodium D-lines}
\label{fig-3}
\figrule
\end{figure}
\begin{figure}[t]
\setlength{\unitlength}{0.0125in}%
\begin{picture}(358,312)(10,493)
\thicklines
\put( 80,805){\line( 0,-1){270}}
\put( 80,535){\line( 1, 0){288}}
\put(153,553){\line( 1, 0){ 15}}
\put(125,560){\line( 1, 0){ 15}}
\put(186,646){\line( 1, 0){ 15}}
\put(213,614){\line( 1, 0){ 15}}
\put(256,785){\line( 1, 0){ 15}}
\put(284,750){\line( 1, 0){ 15}}
\put(308,717){\line( 1, 0){ 15}}
\put(335,688){\line( 1, 0){ 15}}
\put(189,643){\line(-3,-4){ 63}}
\put(189,641){\line(-1,-3){ 29}}
\put(218,612){\line(-1,-1){ 57}}
\put(217,611){\line(-5,-3){ 85}}
\put(261,784){\line(-2,-3){ 28}}
\put(290,748){\line(-1,-1){ 26}}
\put(291,748){\line(-3,-5){ 18}}
\put(314,715){\line(-6,-5){ 24}}
\put(316,716){\line(-3,-4){ 18}}
\put(343,687){\line(-5,-4){ 30}}
\multiput(233,741)(-9.98766,-14.98150){2}{\line(-2,-3){ 8.012}}
\multiput(263,720)(-24.78624,-24.78624){2}{\line(-1,-1){ 10.214}}
\multiput(273,717)(-16.56839,-27.61399){2}{\line(-3,-5){ 7.432}}
\multiput(291,694)(-23.72078,-18.97663){2}{\line(-5,-4){ 11.279}}
\multiput(299,691)(-18.33333,-24.44444){2}{\line(-3,-4){ 8.667}}
\multiput(313,662)(-18.72078,-14.97663){2}{\line(-5,-4){ 11.279}}
\put( 85,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm M}}}
\put(220,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm 1/2}}}
\put(124,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm -1/2}}}
\put(155,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm 1/2}}}
\put(189,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm -1/2}}}
\put(345,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm 3/2}}}
\put(320,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm 1/2}}}
\put(289,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm -1/2}}}
\put(257,500){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm -3/2}}}
\put( 97,493){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm J}}}
\put( 10,551){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm L=0 J=1/2}}}
\put( 10,618){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm L=1 J=1/2}}}
\put( 10,728){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm L=1 J=3/2}}}
\end{picture}
\caption{Zeeman Effect for Sodium}
\label{fig-4}
\figrule
\end{figure}
For example, consider sodium, a historically important case. Sodium has
two bright yellow spectral lines, known as D-lines (D$_1$ and D$_2$, close
together).\footnote{These lines are responsible for the color of sodium vapor
lamps used on many highways.} In a weak magnetic field, the D$_1$ line
splits in four, and the D$_2$ line splits in six.
``Scarcely credible!'' wrote Lorentz of this discovery. Each D-line splits
into an {\it even} number of lines. It would seem that some multiplet must
contain an even number of states. But there are $2J+1$ different $M_J$'s
for each value of $J$--- an odd number, if $J$ is an integer! Heisenberg
and Land\'e unflinchingly ascribed half-integer values to $J$.
Here are the details. The D-lines come from three multiplets: a quartet
and two doublets (see figure~\ref{fig-3}). The pair of quantum numbers
$(L,J)$ specify a multiplet, as can be seen from the figure--- $S$ is
$\frac{1}{2}$ for all eight states. The D-lines are indicated by the
slanted lines. The D$_1$-line actually encompasses all transitions
from the higher to the lower doublet; the D$_2$-line encompasses all
transitions from the quartet to the lower doublet.
Applying the selection rule for $\Delta M_J$, we find that all four
conceivable D$_1$ transitions are permitted; six out of the eight
conceivable D$_2$ transitions are permitted. Turn on the field, and these
all separate. (See figure~\ref{fig-4}. To avoid cluttering the figure, the
D$_2$-transitions are only partly drawn.) Hence the observed splitting.
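This counting, too, can be verified by brute force. The sketch below (an illustration; the function names are mine) applies the $\Delta M_J$ selection rule and the Land\'e factors to the two D-line transitions:

```python
from fractions import Fraction as F

def lande_g(J, L, S):
    # Lande g-factor, as in the anomalous Zeeman formula.
    J, L, S = F(J), F(L), F(S)
    return 1 + (J*(J+1) - L*(L+1) + S*(S+1)) / (2*J*(J+1))

def m_values(J):
    J = F(J)
    return [-J + k for k in range(int(2*J) + 1)]

def zeeman_lines(Ju, Lu, Jl, Ll, S=F(1, 2)):
    """Count distinct shifts g_u*M_u - g_l*M_l over the transitions
    allowed by the selection rule Delta M_J = 0, +-1."""
    gu, gl = lande_g(Ju, Lu, S), lande_g(Jl, Ll, S)
    shifts = {gu*Mu - gl*Ml
              for Mu in m_values(Ju) for Ml in m_values(Jl)
              if abs(Mu - Ml) <= 1}
    return len(shifts)

# D1: the (L=1, J=1/2) doublet down to the (L=0, J=1/2) doublet
print(zeeman_lines(F(1, 2), 1, F(1, 2), 0))   # 4
# D2: the (L=1, J=3/2) quartet down to the (L=0, J=1/2) doublet
print(zeeman_lines(F(3, 2), 1, F(1, 2), 0))   # 6
```

Four and six lines, as observed.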
I note finally that sodium is a ``hydrogenic'' atom. It has~11 electrons,
one more than the noble gas neon. Ten electrons form a relatively inert
``core''--- they make up two complete ``shells'', which contribute
collectively zero spin and zero orbital angular momentum (all electron
spins are paired in the core). The remaining electron is called a {\it
valence} electron, and is far more loosely bound. A good approximation
treats sodium as a system with a positively charged, immobile core, and a
single orbiting electron--- rather like hydrogen. The eight states we just
discussed are (to a high degree of approximation) eight different states of
the valence electron (which is why they all have $S=\frac{1}{2}$).
Where do the selection rules come from? Here is one way to view a
transition:
\begin{center}
atom$_{\rm initial}$ $\rightarrow$ atom$_{\rm final}$ + photon
\end{center}
where ``atom$_{\rm initial}$'' and ``atom$_{\rm final}$'' refer of course
to the initial and final states of the atom. $J_{\rm photon}=1$, so our
rules for adding angular momenta say that
\[
|J_{\rm final}-1| \leq J_{\rm initial} \leq J_{\rm final}+1
\]
in steps of~1, or in other words, $\Delta J=0,\pm 1$ for the atom.
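In code, the triangle rule for adding angular momenta looks like this (an illustrative sketch, with names of my own choosing):

```python
from fractions import Fraction as F

def allowed_total_j(ja, jb):
    """Triangle rule for combining two angular momenta:
    |ja - jb| <= j <= ja + jb, in steps of 1."""
    ja, jb = F(ja), F(jb)
    lo = abs(ja - jb)
    return [lo + k for k in range(int(ja + jb - lo) + 1)]

# Combining the final atomic J with the photon's J = 1 gives the
# possible initial J's -- in other words, Delta J = 0, +-1:
print([str(j) for j in allowed_total_j(F(1, 2), 1)])   # ['1/2', '3/2']
print([str(j) for j in allowed_total_j(1, 1)])         # ['0', '1', '2']
# Combining 0 with the photon's 1 gives only 1, so a J=0 -> J=0
# transition is excluded automatically:
print([str(j) for j in allowed_total_j(0, 1)])         # ['1']
```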
We must supplement the catalog of spin facts to get the rule for $M_J$.
Our first ``spin fact'' told us that if we have a composite system:
\[
C = A\oplus B
\]
then $|J(A)-J(B)|\leq J(C) \leq J(A)+J(B)$, in steps of~1. We now add a
new requirement:
\[
|M_J(A)-M_J(B)|\leq M_J(C) \leq M_J(A)+M_J(B)
\]
in steps of~1.
You may be wondering how to interpret these equations. What does it mean
to say $C = A\oplus B$? Modern quantum mechanics gives a clear answer:
tensor product. (So $\otimes$ would be a better notation than $\oplus$.)
But a muddled view hews closer to history. The old quantum theory did
offer derivations of the selection rules, via a strange brew of Maxwell's
equations and other ingredients. On the other hand, the Paschen-Back
effect presented a severe conceptual stumbling block. How does one set of
selection rules fade away, and the other take over? Heisenberg discarded
the derivations when it suited his purpose, a matter of some controversy at
the time (see Cassidy's biography). So I leave a coherent discussion to
the textbooks, and continue with cruder arguments.
To get the selection rules for $\Delta S$ and $\Delta M_S$, we shift gears,
and treat light as an electromagnetic wave, not photons. Also, we consider
the reverse process, where an atomic state absorbs energy from incident
light. An electron is so small (point-like, so far as we know today) that
it feels at any moment a uniform electric field, and a uniform magnetic
field. Classical physics says that magnetic effects are much smaller than
electric effects, so we will ignore the former. (In modern parlance, we
are ``deriving'' the selection rule for electric dipole transitions, and
ignoring magnetic dipole transitions.)
The uniform electric field can shove the electron this way or that, but
cannot flip it over. So {\bf S} is unaffected, and hence $\Delta S=\Delta
M_S = 0$. On the other hand, if the wavelength of light is short enough,
the electric field will be shoving the electron one way in one part of its
orbit, and another way for another part, and so may change its orbital
angular momentum {\bf L}.
${\bf J}={\bf L}+{\bf S}$. We've just seen that {\bf S} doesn't change, so
{\bf L} and {\bf J} must change the same way. So the selection rules for
$L$ and $M_L$ must be the same as those for $J$ and $M_J$.
\paragraph{Magnetic Moments}
{\it Where do the normal and anomalous Zeeman formulas come from?}
First, a bit of classical physics. Imagine an ``infinitesimally small''
bar magnet--- this is called a {\it magnetic dipole}. A vector (say
{\boldmath $\mu$}) represents the strength and direction of the dipole;
{\boldmath $\mu$} is known as the {\it magnetic moment}.
Suppose we place the dipole in a magnetic field {\bf B}. The field will
exert a torque on the dipole, trying to flip it into alignment (so {\bf B}
and {\boldmath $\mu$} are parallel). Twisting the dipole out of alignment
takes energy. In other words, the dipole has a potential energy that
depends on its orientation with respect to the field. Classical physics
says that this ``magnetic'' energy is given by \(-{\bf B}\cdot
\mbox{\boldmath $\mu$}\).
A tiny current loop acts like a dipole, with a magnetic moment
perpendicular to the plane of the loop. The current loop possesses angular
momentum (after all, the current consists of moving charges, which have
mass). According to classical physics, the magnetic moment is proportional
to the angular momentum: {\boldmath $\mu$} $=$ constant times {\bf J}. The
constant depends on the charge/mass ratio of the moving charges making up
the current.
We now apply all this to an electron in an atom in a magnetic field.
Thanks to its orbital angular momentum {\bf l}, it has a ``magnetic''
energy proportional to ${\bf B}\cdot{\bf l}$; likewise, the spin
contributes energy proportional to ${\bf B}\cdot{\bf s}$. Adding up the
{\bf l}'s and {\bf s}'s for all the electrons in the atom, we might expect
a formula like this for the total ``magnetic'' energy:
\[
E^{\rm mag} = {\rm constant}\, {\bf B}\cdot ({\bf L}+{\bf S}) =
{\rm constant}\, {\bf B}\cdot {\bf J}
\]
Instead, the correct formula is:
\begin{equation}
E^{\rm mag} = {\rm constant}\, {\bf B}\cdot ({\bf L}+2{\bf S})
\label{eq-5}
\end{equation}
Let us accept this formula, noting but not discussing the mysterious
factor~2. If we pick the $z$-axis in the direction of {\bf B}, then the
dot product simply pulls out the $z$-component of {\bf L} and {\bf S},
times the magnitude of {\bf B}. The quantum numbers $M_L$ and $M_S$ give
the values of these components. So we get the ``normal Zeeman'' formula:
\[
E^{\rm mag}_{LM_LSM_S} = {\rm constant}\,B(M_L+2M_S)
\]
or equivalently:
\begin{equation}
E^{\rm mag}_{LM_LSM_S} = {\rm constant}\,B(M_J+M_S)
\label{eq-6}
\end{equation}
since $M_J=M_L+M_S$.
Without the enigmatic~2, we'd simply have $M_J$ on the right hand side of
formula~\ref{eq-6}. Without $M_S$ on the right hand side, the normal and
anomalous Zeeman formulas would be the same--- as we will see.
How does the electron maintain its orientation in the magnetic field--- why
doesn't it snap into alignment? Or in ``old quantum'' terms, how can we
have stationary states with {\bf S} (or {\bf L}) out of alignment with the
field? This was a legitimate question for the old quantum theory, and it
had a good answer. The spinning electron acts like a gyroscope. The field
exerts a torque on the electron, and like any gyroscope, the electron
precesses. (The same argument holds for {\bf L}.)
We can get the ``anomalous Zeeman'' formula from formula~\ref{eq-6} with a
little vector algebra, and a crucial assumption:
\begin{quote}
Assume that in any stationary quantum state, {\bf S} and {\bf L} each
precess around {\bf J}, so that the projection of {\bf S} on {\bf J}
doesn't change, but the component of {\bf S} perpendicular to {\bf J}
rotates uniformly (ditto for {\bf L}).
\end{quote}
This is called {\it spin-orbit coupling}. Only without an external
magnetic field is it strictly true. It is approximately true if the
external field is weak enough. It breaks down in a strong magnetic field,
a phenomenon known as spin-orbit decoupling.
Where does spin-orbit coupling come from? Naively expressed, from a torque
trying to make {\bf L} and {\bf S} point opposite to each other (i.e.,
anti-parallel). Goudsmit and Uhlenbeck imagined it this way. Suppose
there is only a single electron. If we ride along an electron's orbit
right next to it, but not spinning ourselves, we will appear to see the
nucleus orbiting {\it us} (shades of Ptolemy!) Effectively we have a
positive current circulating around us; this will generate a magnetic field
proportional to {\bf L}. The electron has a magnetic moment proportional
to {\bf S}, and hence we get an energy term proportional to ${\bf
L}\cdot{\bf S}$. (Because of the positive charge on the nucleus, the signs
come out so that spin-orbit coupling tries to make {\bf L} and {\bf S}
anti-parallel.)
Life is more complicated in multi-electron atoms. ${\bf L}\cdot{\bf S}$ is
only an approximation, applicable when the electron-electron forces
dominate the spin-orbit forces. In this case, one can approximate the
totality of electrons as a single electron ``smear'', with total orbital
angular momentum {\bf L}, and total spin {\bf S}. For ``light'' atoms,
this works pretty well. For ``heavy'' atoms, a different approximation
works better.
A classical model has emerged. The vectors {\bf L} and {\bf S} feel a
torque trying to make them anti-parallel; when the magnetic field {\bf B}
is turned on, {\bf L} and {\bf S} each feel a torque trying to align them
with {\bf B}. We have, in other words, ``perturbation terms'' occurring
somewhere in the total energy formula:
\begin{equation}
c_1{\bf L}\cdot{\bf S} + c_2{\bf B}\cdot({\bf J}+{\bf S})
\label{eq-7}
\end{equation}
where $c_1$ and $c_2$ are constants, and {\bf S} occurs on the right hand
side only because of the mysterious factor~2 in formula~\ref{eq-5}.
Classical mechanics would next try to compute the ``trajectories'' of the
vectors {\bf L} and {\bf S} (i.e., how they vary with time). If ${\bf
B}=0$, then {\bf J} is constant (conservation of angular momentum), and
{\bf L} and {\bf S} precess about {\bf J}. If the field is weak, then (to
a good approximation) {\bf J} precesses about {\bf B}, while {\bf L} and
{\bf S} still precess about {\bf J}. {\bf J} is no longer constant, since
it is subject to an external torque. If the field is strong, then {\bf L}
and {\bf S} precess about {\bf B} (again to a good approximation), because
the magnetic torque swamps the spin-orbit torque.
The ``old quantum'' prescription says to find the energy levels of these
stationary states.
As it happens, the energy of a stationary state depends primarily on terms
other than those in formula~\ref{eq-7}. We have terms analogous to the
$R/n^2$ of hydrogen; then terms depending on the magnitude of {\bf L} and
{\bf S} (i.e., terms depending on $L$ and $S$); only after that does
formula~\ref{eq-7} come into play.
The spin-orbit coupling contributes a perturbation that depends on the
angle between {\bf L} and {\bf S} (and hence indirectly on the magnitude of
{\bf J}). So when ${\bf B}=0$, we expect to find bunches of energy levels
with the same $L$ and $S$ grouped closely together, separated slightly by
an energy difference depending on $J$. This is an aspect of the famous
{\it fine structure} of the spectrum.\footnote{The sodium D-lines provide
an example. The quartet and the higher doublet both have $L=1$ and
$S=\frac{1}{2}$; their $J$'s are $\frac{3}{2}$ and $\frac{1}{2}$.} If
$S=0$, then the fine structure disappears. (Low-lying energy levels of
atoms with even numbers of electrons usually have $S=0$.)
Without a field, we have complete rotational symmetry in the energy
formula. The spin-orbit perturbation depends only on the {\it magnitude}
of {\bf J}, not on its direction (on $J$, not on $M_J$). With a field, the
symmetry is broken and the degeneracy lifted.
We have perturbations depending on ${\bf B}\cdot{\bf L}$ and ${\bf
B}\cdot{\bf S}$. With a strong field (or if $S=0$), we confidently replace
these dot products with $BM_L$ and $BM_S$, for the dot products don't
change and must be quantized (i.e., they are stationary). We have seen
already how this leads to the normal Zeeman effect. Note also another
regularity: spectra without fine structure have $S=0$, and hence display
normal Zeeman splitting.
With a weak field, we express the perturbation in terms of ${\bf
B}\cdot{\bf J}$ and ${\bf B}\cdot{\bf S}$. The first term we replace with
$BM_J$. Were this all, we would again see a normal Zeeman effect, because
$\Delta M_J$ and $\Delta M_L$ satisfy the same selection rule.
Turn to the ${\bf B}\cdot{\bf S}$ perturbation. Because {\bf S} precesses
about {\bf J}, the {\it average} value of ${\bf B}\cdot{\bf S}$ is the same
as
\[
{\bf B}\cdot{\bf S}_\parallel
\]
where ${\bf S}_\parallel$ is the projection of {\bf S} on {\bf J}. The dot
product ${\bf B}\cdot{\bf S}_\parallel$ doesn't change, so it must be
quantized. ${\bf S}_\parallel$ is some fraction of {\bf J}, so this dot
product must be some fraction of $M_J$. What fraction? If we knew the
lengths of {\bf L} and {\bf S} and the angle between them, we'd clearly
have enough information to specify the geometry completely. Instead of the
angle, we could also use the length of {\bf J}. So we expect the fraction
to depend on $L$, $S$, and $J$. And that, as we've seen, accounts for the
anomalous Zeeman effect.
The rest is vector algebra. You may skip to the next section if you
like--- no new concepts appear.
First derive a formula for ${\bf L}\cdot{\bf S}$, in terms of
quantum numbers. We have:
\[
{\bf J}\cdot{\bf J} = ({\bf L}+{\bf S})\cdot({\bf L}+{\bf S}) =
{\bf L}\cdot {\bf L} + {\bf S}\cdot {\bf S} + 2{\bf L}\cdot {\bf S}
\]
Rearrange this equation and replace ${\bf J}\cdot{\bf J}$ with $J(J+1)$
(and likewise for {\bf L} and {\bf S}), to get:
\begin{equation}
{\bf L}\cdot{\bf S}\rightarrow \frac{J(J+1)-L(L+1)-S(S+1)}{2}
\label{eq-8}
\end{equation}
where I've used an arrow instead of an equal sign to indicate
``quantization''--- classical quantities on the left, quantum numbers on
the right. From the modern perspective, the left hand side is an operator,
and the right hand side is one of its eigenvalues. (Formula~\ref{eq-8}
looks tantalizingly close to the Land\'e g-factor, but we're not quite
there yet.)
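As a quick check of this formula, here is a sketch evaluating the quantized ${\bf L}\cdot{\bf S}$ for the sodium fine-structure partners mentioned earlier (an illustration; the function name is mine):

```python
from fractions import Fraction as F

def l_dot_s(J, L, S):
    """Quantized value of L.S: (J(J+1) - L(L+1) - S(S+1)) / 2."""
    J, L, S = F(J), F(L), F(S)
    return (J*(J+1) - L*(L+1) - S*(S+1)) / 2

# The two sodium L=1, S=1/2 levels (fine-structure partners):
print(l_dot_s(F(3, 2), 1, F(1, 2)))   # 1/2
print(l_dot_s(F(1, 2), 1, F(1, 2)))   # -1
```

The two values differ, which is just the $J$-dependent fine-structure splitting.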
Next derive a formula for ${\bf S}_\parallel$, the projection of {\bf S} on
{\bf J}:
\begin{eqnarray*}
{\bf S}_\parallel &=& \frac{{\bf J}\cdot{\bf S}}{{\bf J}\cdot{\bf J}}
{\bf J}\\
&=& \left(\frac{{\bf L}\cdot{\bf S}}{{\bf J}\cdot{\bf J}} +
\frac{{\bf S}\cdot{\bf S}}{{\bf J}\cdot{\bf J}}
\right) {\bf J}
\end{eqnarray*}
``Quantizing'' the term in parentheses:
\[
\frac{J(J+1)-L(L+1)-S(S+1)}{2J(J+1)}+\frac{S(S+1)}{J(J+1)}=\frac{J(J+1)-L(L+1)+S(S+1)}{2J(J+1)}
\]
In other words, ${\bf S}_\parallel$ is this fraction times {\bf J}.
Now recall formula~\ref{eq-6}, which we derived previously:
\[
E^{\rm mag} = {\rm constant}\,B(M_J+M_S)
\]
Replace the term $BM_S$ with the ``average'' value of ${\bf B}\cdot{\bf
S}$, which is ${\bf B}\cdot{\bf S}_\parallel$, which is the above fraction
times $BM_J$:
\[
E^{\rm mag} = {\rm constant}\,B\left(M_J+\frac{J(J+1)-L(L+1)+S(S+1)}{2J(J+1)}\,M_J\right)
\]
And this reduces to our anomalous Zeeman formula, formula~\ref{eq-2}.
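If you distrust the ``quantizing'' step, the identity behind it can be checked mechanically with exact rational arithmetic. A sketch (illustrative only; the function names are mine):

```python
from fractions import Fraction as F

def fraction_sum(J, L, S):
    """Left-hand side: the quantized L.S/J.J term plus the S.S/J.J term."""
    a = (J*(J+1) - L*(L+1) - S*(S+1)) / (2*J*(J+1))
    b = S*(S+1) / (J*(J+1))
    return a + b

def fraction_combined(J, L, S):
    """Right-hand side: (J(J+1) - L(L+1) + S(S+1)) / (2 J(J+1))."""
    return (J*(J+1) - L*(L+1) + S*(S+1)) / (2*J*(J+1))

half = F(1, 2)
samples = [(F(3, 2), F(1), half), (F(1, 2), F(1), half),
           (F(2), F(2), F(1)), (F(5, 2), F(2), half)]
for (J, L, S) in samples:
    assert fraction_sum(J, L, S) == fraction_combined(J, L, S)
print("identity verified for all samples")
```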
\paragraph{The Paschen-Back Limit}
{\it How does the anomalous Zeeman formula turn into the normal Zeeman
formula in the Paschen-Back limit of strong magnetic fields?}
The previous section puts classical physics in a rather positive light. A
thoroughly classical picture appears to account for fine structure, normal
Zeeman splitting, anomalous Zeeman splitting, and the passage from one to
the other as the field strength increases.
On closer examination, we find cracks in the picture. Several
non-classical ingredients played a crucial role:
\begin{enumerate}
\item The Bohr-Sommerfeld ``quantization'' prescription.
\item The selection rules.
\item The spin one-half of the electron.
\item The ``mysterious factor~2''.
\end{enumerate}
The old quantum theory had problems with all of these:
\begin{enumerate}
\item The quantization rules apply only to ``stationary'' states, but just
what constitutes a stationary state? A stationary state is not
absolutely stable, since it can make transitions.
\item I've noted already the muddle over selection rules as we pass from
the anomalous to the normal Zeeman effect.
\item Heisenberg and Land\'e had half-integer spin forced upon them by
counting spectral lines. Now the quantum number $L$ assumes only
integer values; Sommerfeld had a derivation of sorts for this fact
(which I have not given). Why should this derivation work for $L$ but
not for $S$? Neither Heisenberg nor anyone else had an answer
at the time.
\item This is related to the previous point, but is a distinct issue. The
ratio of magnetic moment to angular momentum depends on what kind of
momentum we're dealing with: orbital or spin.
\end{enumerate}
The question of ``stationary'' states comes to the fore when we look at the
Paschen-Back effect. Modern progress in chaos theory has clarified the
subject further. For weak fields, we have one kind of approximately
stationary state ({\bf L} and {\bf S} precess about {\bf J}); for strong
fields, a different kind ({\bf L} and {\bf S} precess about {\bf B}). For
intermediate strength fields, the motion becomes chaotic (classically
speaking). The Bohr-Sommerfeld quantization procedure throws up its hands
when faced with chaotic motion. Yet quantization does not cease when the
motion becomes chaotic, as we can tell by gazing at spectral lines.
Quantum mechanics broke the logjam. The stimulus for the fundamental shift
in viewpoint did not come from the Zeeman effect, it is true. But the new
approach unlocked this and many other mysteries within a year or two.
The next few sections sketch the resolution of these problems. I will do a
pretty good job for (1) and (2), an overbrief treatment of (3), but I will
barely touch on (4), regretfully leaving the real explanation to the
standard textbooks.
\paragraph{Quantization}
Schr\"o\-dinger titled his first great paper on quantum mechanics
``Quantization as an Eigenvalue Problem''. I have alluded to the meaning
of this several times already. A state specified by $v$ has quantum number
$q$ for operator $Q$ if $v$ satisfies the eigenvalue equation $Qv=qv$.
Borrowing the imagery of the old quantum theory, we say that $Q$ {\it has
the value} $q$ in the state specified by $v$.
Other jargon says that $Q$ is ``sharp'' or ``definite'' in the state $v$.
According to the quantum theory of measurement, if we prepare an ensemble
of systems, all in the state $v$, and measure $Q$ in each one, we will
always get the value $q$. But if $v$ is not an eigenvector of $Q$, then we
will get various eigenvalues of $Q$ with different probabilities. There is
statistical spread, and the value of $Q$ for $v$ is not ``sharp''.
What about ``stationary''? Here we must bring in another concept I've
mentioned before: the energy operator governs the time evolution of a
quantum system. A state $v$ is absolutely stationary (does not change at
all) if and only if it is an eigenstate of the energy operator $E$. To be
a touch more precise: suppose $Ev=\omega v$. Then $v$ evolves like so:
\[
v(t) = e^{i\omega t}v(0)
\]
Remember that $v$ and $cv$ specify the same state for any non-zero complex
$c$. So the state doesn't change. (I am also assuming that the energy
operator does not change with time. A changing $E$ calls for a different
equation for $v(t)$.)
Bohr's states are ``almost'' stationary. Perturbation theory deals
with this ambiguity. Express the energy operator as a sum of two parts:
\[
E = H+V
\]
where $V$, the perturbation, is ``small'' relative to $H$. Eigenstates of
$H$ will not usually be eigenstates of $E$, but they will change ``slowly''
because $V$ is ``small''\footnote{In particular cases, ``small'' and
``slowly'' can be given more definite meaning. For example, we may be able
to write $E$ as a power series in something; $V$ might contain the second
and higher order terms.}. Start the system in an eigenstate of $H$, $v(0)$
say. Wait awhile. The system is now in a superposition of eigenstates of
$H$--- say, for simplicity, $v(1)=c_1v(0)+c_2u$. Here $u$ is another
eigenvector of $H$, and $c_1$ and $c_2$ are complex numbers. If we measure
$H$, we have a certain probability of finding the system in the state $u$.
In other words, the system has made a transition from one state to another,
under the influence of a perturbation.
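For readers who enjoy checking such claims, here is a small numerical
sketch (the two-level system, the energy gap, and the coupling strength
0.1 are illustrative choices of mine, not anything dictated by the
discussion above; the sign of the phase in the evolution operator is a
convention and does not affect the probabilities):

```python
import numpy as np
from scipy.linalg import expm

# Two-level toy system: H has eigenstates v(0) = (1,0) and u = (0,1).
H = np.diag([0.0, 1.0])
# A small perturbation V couples the two eigenstates of H.
V = 0.1 * np.array([[0.0, 1.0], [1.0, 0.0]])
E = H + V                              # the full energy operator

v0 = np.array([1.0, 0.0])              # start in an eigenstate of H
# Evolve with exp(-iEt) and watch the u-component grow from zero:
probs = [abs((expm(-1j * E * t) @ v0)[1]) ** 2 for t in (0.0, 0.5, 1.0)]
# probs[0] is exactly 0; the later entries are small but non-zero --
# the "stationary" state slowly leaks into u under the perturbation.
```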
Bohr's transition picture is a special case of this. The perturbation $V$
is the electromagnetic term, representing the interaction between the
electromagnetic field and the atom. In other words, $V$ is due to the
ability of the atom to make transitions by absorbing or emitting a photon.
Everything else is stuffed into $H$: the attraction of the nucleus,
spin-orbit coupling, the constant magnetic field of the Zeeman effect.
(Classical electromagnetism gives an unambiguous way to separate this
constant magnetic field from the travelling field of light.) Bohr's
stationary states are the eigenstates of $H$.
What has become of Bohr's notion of quantum number? For Bohr and
Sommerfeld, stationary states had quantum numbers. Translated into Hilbert
space language, this asserts that eigenvectors of $H$ are eigenvectors of all
other operators of physical importance. Alas, life is not so easy. If
operators $Q$ and $H$ commute, one can generally pick a basis of
eigenvectors of both $Q$ and $H$. (Trivial exercise: prove the converse.)
Some phenomena do submit to such an approach. The Paschen-Back effect
does not--- as we will soon see.
\paragraph{Selection Rules Redux}
We start in a stationary state $v(0)$. We wait a bit. Under the influence
of the electromagnetic perturbation $V$, $v$ evolves to a state which is a
superposition of $v(0)$ and other stationary states $u_j$. What can we say
about the possible $u_j$?
At first you might guess that one could not say anything without a detailed
study of $V$. Say $\{u_j\}$ is a basis of eigenvectors of $H$.
Expand $v(t)$ out in this basis:
\[
v(t) = \sum_j c_j(t) u_j
\]
It turns out that for small positive $t$, $c_j(t)\approx 0$ unless $Vv(0)$
contains a non-zero $u_j$ component. Say we set $v(0)=u_i$, and expand out
$Vu_i$ in the $\{u_j\}$ basis--- $Vu_i=\sum_j a_{ij}u_j$. We must have
$a_{ij}\neq 0$ to have a significant chance of the transition $u_i
\rightarrow u_j$.
Let's rephrase this without so many indices. $V$ has a matrix
representation in the basis of eigenvectors of $H$. ``Immediate''
transitions between ``stationary states'' (i.e., eigenvectors of $H$) come
from non-zero off-diagonal entries in the $V$ matrix.
Over longer periods of time, we can have transitions through intermediate
states: $u_i \rightarrow u_j \rightarrow u_k$. If you work through the
math, you will find out that you are computing $V^2$, $V^3$, etc., for
these indirect transitions; the power series for $\exp(iV)$ makes an
appearance in the final result.
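The indirect-transition bookkeeping can be seen in two lines of
arithmetic. In this sketch (the particular 3-by-3 matrix is an arbitrary
illustration of mine), the direct matrix element $V_{02}$ vanishes, yet
the route $u_0 \rightarrow u_1 \rightarrow u_2$ shows up as a non-zero
entry of $V^2$:

```python
import numpy as np

# Three "stationary states" 0, 1, 2.  V couples 0<->1 and 1<->2,
# but the direct 0<->2 matrix element is zero.
V = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])

direct = V[0, 2]                   # the forbidden "immediate" transition
via_intermediate = (V @ V)[0, 2]   # 0 -> 1 -> 2 appears in V squared
```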
The moral of this tale is that {\it zero entries in the $V$ matrix correspond
to forbidden transitions}. A selection rule translates into an assertion
about the form of the $V$ matrix. And so, it appears, we need a detailed
study of $V$ to determine the selection rules.
Our crude ``derivation'' of the selection rules for $S$ indeed depended on
the ``mechanism'' of light. But for $J$, we appealed to a general physical
principle, conservation of angular momentum. Can one not translate this
argument into quantum mechanics?
One can. The thread runs thus: $V$ must be invariant under all spatial
rotations, for electromagnetism does not single out any preferred direction
in space. So the group of spatial rotations, $SO(3)$, must play a special
role. At this point group representation theory takes over, and out pops
the selection rules for $J$ and $M_J$. (You do need some additional
assumptions I won't spell out.)
Let's take a last look at the Paschen-Back effect, using all we've learned.
Where do the $J,M_J$ selection rules leave off and the $L,M_L$ and $S,M_S$
selection rules take over, as we increase the magnetic field?
Both sets of selection rules hold throughout! The trick is picking the
right basis. To understand this, we must consider again the combined
influence of the spin-orbit and magnetic perturbations.
The operator $H$ looks like this:
\[
H = H^0 + c_1 {\bf L}\cdot{\bf S} + c_2 {\bf B}\cdot({\bf L}+2{\bf S})
\]
or equally well:
\[
H = H^0 + c_1 {\bf L}\cdot{\bf S} + c_2 {\bf B}\cdot({\bf J}+{\bf S})
\]
$H^0$ contains terms representing the attraction of the nucleus, and
perhaps other refinements for complex atoms. The next two terms are the
spin-orbit coupling and the effect of the magnetic field.
{\bf B} is just a conventional 3-space vector, but {\bf L}, {\bf S}, and
{\bf J} are all ``operator vectors''. That is, for any coordinate system,
we have ${\bf L}=(L_x,L_y,L_z)$, where $L_x$, $L_y$, and $L_z$ are all
operators. Ditto for {\bf S} and {\bf J}.
Pick the $z$-axis in the direction of the magnetic field. Several
operators now demand a role:
\begin{eqnarray*}
{\bf J}\cdot{\bf J} &=& J_xJ_x+J_yJ_y+J_zJ_z\\
{\bf L}\cdot{\bf L} &=& L_xL_x+L_yL_y+L_zL_z\\
{\bf S}\cdot{\bf S} &=& S_xS_x+S_yS_y+S_zS_z\\
{\bf L}\cdot{\bf S} &=& L_xS_x+L_yS_y+L_zS_z\\
J_z,&L_z,&S_z\\
\end{eqnarray*}
some with eigenvalues we've come to know and love:
\begin{eqnarray*}
{\bf J}\cdot{\bf J}& \rightarrow & J(J+1)\\
{\bf L}\cdot{\bf L}& \rightarrow & L(L+1)\\
{\bf S}\cdot{\bf S}& \rightarrow & S(S+1)\\
J_z &\rightarrow& M_J\\
L_z &\rightarrow& M_L\\
S_z &\rightarrow& M_S\\
\end{eqnarray*}
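These operators and eigenvalues can be checked numerically. In the sketch
below, the helper \texttt{spin\_matrices} is my own construction from the
standard ladder-operator matrix elements; I build {\bf L} for $L=1$ and
{\bf S} for $S=\frac{1}{2}$, form {\bf J}, and diagonalize
${\bf J}\cdot{\bf J}$:

```python
import numpy as np

def spin_matrices(j):
    """(Jx, Jy, Jz) for a single spin j, with hbar = 1."""
    d = int(round(2 * j)) + 1
    m = j - np.arange(d)                     # m = j, j-1, ..., -j
    Jz = np.diag(m).astype(complex)
    Jp = np.zeros((d, d), dtype=complex)     # raising operator J+
    for k in range(1, d):
        Jp[k - 1, k] = np.sqrt(j * (j + 1) - m[k] * (m[k] + 1))
    return (Jp + Jp.conj().T) / 2, (Jp - Jp.conj().T) / (2 * 1j), Jz

# L = 1 orbital part tensored with an S = 1/2 spin part
L = spin_matrices(1.0)
S = spin_matrices(0.5)
I3, I2 = np.eye(3), np.eye(2)
J = [np.kron(Lk, I2) + np.kron(I3, Sk) for Lk, Sk in zip(L, S)]

JJ = sum(Jk @ Jk for Jk in J)        # J.J = Jx Jx + Jy Jy + Jz Jz
eigs = np.sort(np.linalg.eigvalsh(JJ))
# Eigenvalues J(J+1) for J = 1/2 (twice) and J = 3/2 (four times):
# 0.75, 0.75, 3.75, 3.75, 3.75, 3.75
```

The six eigenvalues split into $J(J+1)=\frac{3}{4}$ (twice) and
$J(J+1)=\frac{15}{4}$ (four times), i.e., $J=\frac{1}{2}$ and
$J=\frac{3}{2}$, just as the addition rules predict.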
Alas, these operators do not all commute. It turns out that ${\bf
L}\cdot{\bf L}$ and ${\bf S}\cdot{\bf S}$ commute with each other and all
the rest, and hence with $H$. For this reason, $L$ and $S$ are ``good''
quantum numbers: we can pick stationary states (in the Bohr sense) that
have sharp values for $L$ and $S$.
Finding additional ``good'' quantum numbers proves more frustrating. ${\bf
J}\cdot{\bf J}$ commutes with $J_z$ and with ${\bf L}\cdot{\bf S}$, but not
with $L_z$ or $S_z$ (as it happens). Drop the ${\bf B}\cdot{\bf S}$ term
from $H$, and we have $J$ and $M_J$ as good quantum numbers. Alternatively,
drop the spin-orbit coupling term, and, as luck would have it, $M_L$ and
$M_S$ become good quantum numbers.
So we have a choice of bases. With no field, we can find stationary states
with sharp values of $(L,S,J,M_J)$. With no spin-orbit coupling, we can
demand sharp values for $(L,S,M_L,M_S)$.
Start with the $(L,S,J,M_J)$ basis, and turn on a weak field. The states
are now only approximately stationary (even in Bohr's sense). The stronger
the field, the less accurate the approximation. But the matrix elements
for $V$, in this basis, strictly obey the $J$ and $M_J$ selection rules, no
matter how strong the field.
For a very strong field, the $(L,S,M_L,M_S)$ basis consists of
approximately stationary states (i.e., near-eigenvectors of $H$). The
$M_L$ and $M_S$ selection rules hold strictly with respect to this basis,
no matter how weak the field. (The $L$ and $S$ selection rules hold for
either basis.)
Just to hose away the last traces of the muddle: how can we reconcile
$\Delta M_S=0$ with the formulas $M_S=g(J,L,S)M_J$ and $\Delta M_J=0,\pm
1$? (We needed the latter two formulas for the anomalous Zeeman effect.)
Answer: Suppose an atom makes a transition from $|M_J=\uparrow\rangle$ to
$|M_J=\downarrow\rangle$ (using the Dirac $|\rangle$ notation for
state-vectors, plus the ``colloquial'' abbreviations
$\uparrow=\frac{1}{2}$, $\downarrow=-\frac{1}{2}$). The atom begins and
ends in an eigenstate of $J_z$. Each eigenstate is a ``blend'' of
eigenstates of $S_z$, say:
\begin{eqnarray*}
|M_J=\uparrow\rangle &=& a\,|M_S=\uparrow\rangle +
b\,|M_S=\downarrow\rangle\\
|M_J=\downarrow\rangle &=& c\,|M_S=\uparrow\rangle +
d\,|M_S=\downarrow\rangle\\
\end{eqnarray*}
The selection rule for $M_S$ holds, in the sense that
$|M_S=\uparrow\rangle$ makes a transition only to itself; likewise for
$|M_S=\downarrow\rangle$. But the coefficients $a$, $b$, $c$, and $d$
change. The $M_S$ in the equation $M_S=g(J,L,S)M_J$ is an ``average''
$M_S$, and depends on these coefficients. (More precisely, it is the
expectation value of $S_z$ for a state which is an eigenstate of $J_z$, not
of $S_z$.) The $M_S$ of the selection rule is the quantum number of an
eigenstate of $S_z$.
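Here is a numerical check of this ``average'' $M_S$, for the illustrative
case $L=1$, $S=\frac{1}{2}$ (the matrix construction is my own, using the
standard ladder-operator elements): the expectation value of $S_z$ in the
$J=\frac{1}{2}$, $M_J=+\frac{1}{2}$ eigenstate comes out $-\frac{1}{6}$,
not $\pm\frac{1}{2}$.

```python
import numpy as np

def spin_matrices(j):
    d = int(round(2 * j)) + 1
    m = j - np.arange(d)
    Jp = np.zeros((d, d), dtype=complex)
    for k in range(1, d):
        Jp[k - 1, k] = np.sqrt(j * (j + 1) - m[k] * (m[k] + 1))
    return (Jp + Jp.T) / 2, (Jp - Jp.T) / (2 * 1j), np.diag(m).astype(complex)

L, S = spin_matrices(1.0), spin_matrices(0.5)
I3, I2 = np.eye(3), np.eye(2)
Sz = np.kron(I3, S[2])               # S_z acting on the combined space
J = [np.kron(Lk, I2) + np.kron(I3, Sk) for Lk, Sk in zip(L, S)]
JJ = sum(Jk @ Jk for Jk in J)
Jz = J[2]

# JJ and Jz commute, so eigenvectors of JJ + Jz/10 are simultaneous
# eigenvectors of both.  Pick the J = 1/2, M_J = +1/2 state
# (eigenvalue 0.75 + 0.05 of the combination).
w, vecs = np.linalg.eigh(JJ + Jz / 10)
v = vecs[:, np.argmin(abs(w - 0.80))]

avg_Sz = (v.conj() @ Sz @ v).real    # the "average" M_S
```

The result is $-\frac{1}{6}$: a perfectly good ``average'' $M_S$, but not
an eigenvalue of $S_z$.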
We have already seen this resolution foreshadowed in the classical
treatment, when we obtained the ``average'' $z$-component of {\bf S} by
taking the $z$-component of ${\bf S}_\parallel$. But classical mechanics
lacks the notion of ``blended'' states, and so is ill-equipped to pass
smoothly from the weak field to the strong field regime.
So much for the Zeeman effect. Let us punctuate the tale with an anecdote.
A friend ran into Heisenberg on the streets of Copenhagen, around~1920;
Heisenberg had a grim expression. ``Cheer up, Werner, things can't be that
bad!'' Replied Heisenberg, ``How can one be cheerful when one is thinking
about the anomalous Zeeman effect?''
\paragraph{Spin One-half, and the Mysterious Factor 2}
Lie group and Lie algebra representation theory fits flawlessly into the
structure of quantum mechanics, and the most satisfying explanation of the
mysteries of spin one-half lies in this connection. Careful exposition
of this material belongs to standard textbooks. But I must at least
justify my remark that half-integer spin is fundamentally non-classical.
Let us start with a quantum mechanical system which has $n$ states. The
Hilbert space for this system is then isomorphic to $\cmplx^n$. Suppose
the system ``inhabits'' ordinary physical Euclidean 3-space, that is, we
can picture the physical processes of the system as taking place in
3-space. Vague though this statement is, something precise will come out
of it. Any classical physical system (e.g., a spinning ball) ``inhabits''
3-space in this (as yet) fuzzy sense.
Rotate the system in 3-space. Or if you prefer, rotate the coordinate axes
used to describe the system. Either way, we have a transformation from one
state of the system to another. (Physicists like to distinguish ``active''
from ``passive'' transformations, along the lines of preference just
mentioned, but we won't need to be so exact.) We will assume this
transformation can be represented by a linear operator on the Hilbert space
of the system.
In fact, no generality is lost by assuming the linear operator is
unitary--- one can prove this. Even more: since our Hilbert space is
isomorphic to $\cmplx^n$, the operator has a matrix representation, and it
has been proved that one can arrange for the determinant of the matrix
to be~1. In short, we have a unitary unimodular representation of the Lie
group $SO(3)$ on $\cmplx^n$:
\[
\sigma: SO(3) \rightarrow SU(n)
\]
This is the precise statement that emerges from our vague notion of
``inhabiting 3-space''.
Group representation theory now grinds away. It tells us first, that $\sigma$
is a direct sum of irreducible representations, and next, that an
irreducible representation exists if and only if the dimension of the
target space is odd (in which case the representation is essentially unique).
So we now have:
\[
SO(3) \rightarrow SU(H_1) \oplus\ldots\oplus SU(H_r)
\]
where $\oplus_i H_i$ is a direct sum decomposition of the Hilbert space of
our system, and $SU(H_i)$ is the group of unitary unimodular
transformations on $H_i$; also, each $H_i$ has odd dimension.
Any state in subspace $H_i$ can be rotated into any other state in subspace
$H_i$. But rotation never mixes different subspaces together: a state in
subspace $H_i$ remains in that subspace. So it seems natural to concentrate
on the irreducible representations.
Associated with the Lie group $SO(3)$ is the Lie algebra $so(3)$, generated
by three Lie algebra elements; we can picture these generators as
``infinitesimal'' rotations about the $x$, $y$, and $z$ axes (as did Lie),
or as angular velocities about the coordinate axes.
Suppose $\sigma:SO(3) \rightarrow SU(H)$ is an irreducible representation,
and suppose $H$ has dimension $2j+1$. Then $\sigma$ induces a map from
$so(3)$ to the Lie algebra $su(H)$, which just happens to consist of all
the anti-Hermitian traceless operators on $H$. The three generators of
$so(3)$ map to three operators I will label $iJ_x$, $iJ_y$, and $iJ_z$---
that factor of $i$ makes $J_x$, $J_y$, and $J_z$ Hermitian.
The generators of $so(3)$ ``look like'' angular velocities, so $J_x$,
$J_y$, and $J_z$ are good candidates for the angular momentum operators.
Ultimately this comes down to a matter of definition. Let us make this
identification without further ado.
Finally, the operator $J_xJ_x+J_yJ_y+J_zJ_z$ commutes with the image of
$\sigma$ in $SU(H)$ (as can be shown by direct calculation, or more
cleverly), and so by Schur's lemma, is a constant times the identity
matrix. And now the punchline: it turns out that this operator is
$j(j+1)I$, where $j$ (you may recall) is defined by the relation $\dim H =
2j+1$.
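For $j=\frac{1}{2}$ one can verify all of this by hand with the Pauli
matrices (a standard representation; the quick check below is mine):

```python
import numpy as np

# Spin one-half generators: J_k = sigma_k / 2 (Pauli matrices, hbar = 1)
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2

# [Jx, Jy] = i Jz, the so(3) commutation relation
comm = sx @ sy - sy @ sx
# The operator Jx Jx + Jy Jy + Jz Jz is a multiple of the identity
# (Schur), and the multiple is j(j+1) = (1/2)(3/2) = 3/4.
casimir = sx @ sx + sy @ sy + sz @ sz
```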
In one sense we should be well satisfied. Our original direct sum
decomposition of the $n$-dimensional Hilbert space of the system simply
decomposed that space into states with a definite magnitude for the angular
momentum--- that is, with definite quantum number $j$. On the other hand,
group representation theory has told us that $j$ must be an integer and
$\dim H$ is odd. And we want $j=\frac{1}{2}$ for an electron.
But now something curious appears. $SO(3)$ has a double-cover, the Lie
group $SU(2)$. Their Lie algebras are of course the same. Perhaps we
pick up some additional representations by starting with $SU(2)$?
Indeed we do. There is a unique irreducible representation $SU(2)
\rightarrow SU(k)$ for every positive integer $k$; for reasons that should
be obvious by now, we set $2j+1=k$, and use $j$ in place of $k$ in all the
formulas. The $SU(2)$ representation factors through the $SO(3)$
representation precisely when $j$ is an integer: $SU(2) \rightarrow SO(3)
\rightarrow SU(2j+1)$.
This is as far as I will pursue the mathematics. Our Hilbert spaces are
still rather bloodless. Schr\"o\-dinger conjured up lovely images of
wavefunctions spreading through 3-space, which is to say he picked a
representation for the Hilbert space which ``inhabits 3-space'' in our
sense. Naturally the Schr\"o\-dinger wavefunctions will not serve to
represent a spin one-half particle like an electron. Pauli solved this
puzzle, scant months after the invention of quantum mechanics. Relativity
led to further paradoxes. To Dirac belongs the glory of resolving these.
But thereby hangs another tangled tale of history\ldots
I will ramble a bit further though on classical imagery versus quantum
reality, just with regard to this question of spin. The groups $SU(2)$ and
$SO(3)$ have the same Lie algebra, and as we've seen, the Lie algebra
begets the angular momentum operators. This above all is why classical
pictures can carry us so far, even for half-integer spin. Push it far
enough though, and any classical picture will finally break down, for
$SU(2)$ and $SO(3)$ are different groups.
In some sense, ``the mysterious factor~2'' stems from this difference.
Because $SU(2)$ is the double cover of $SO(3)$, factors of~2
appear in the formula for the covering projection. These factors
ultimately find their way into the formula for the ratio of the electron's
magnetic moment to its spin angular momentum. The orbital motion of an
electron however ``inhabits 3-space'', and so no factors of~2 appear.
\paragraph{Helium}
Having dwelt lovingly upon the Zeeman effect, I will pass more fleetingly
over some of the other applications of spin. I wish only to give an
inkling of the pervasiveness of the concept.
Close study of the spectrum of helium revealed a striking feature. There
are two systems of energy levels, which do not interact with each other---
transitions between the systems do not exist. The bottom level of the
first system is well below the bottom level of the second system. I omit a
detailed catalog of other regularities.
Born, Heisenberg, Pauli, and others applied more and more recondite
techniques from classical mechanics to the problem of the helium spectrum
(specifically celestial mechanics, including Poincar\'e's work). All in
vain. As I've already noted, helium poses one of the simplest three-body
problems in quantum mechanics, and the old quantum theory just couldn't
handle it.
Heisenberg solved the problem after the invention of quantum mechanics.
The two electrons each have spin-$\frac{1}{2}$, so the total spin $S$ is
either~0 or~1. States with $S=0$ are called {\it singlet states}, states
with $S=1$ are {\it triplet} states (recall again the formula $2S+1$ for
the number of possible values for $M_S$). The selection rule $\Delta S=0$
accounts for the division of energy levels into two subsystems. If $S=1$,
then the electrons have the same spin, and cannot both occupy the bottom
level in a term scheme like figure~\ref{fig-1}. If $S=0$, they can. Thus
the bottom singlet state is significantly lower than the bottom triplet
state. The bottom triplet state is sometimes called {\it metastable}, for
processes other than photon absorption or emission can (slowly) change a
triplet state into a singlet state. (The $\Delta S=0$ selection rule
applies strictly only to electric dipole radiation transitions.)
Heisenberg noted that the hydrogen molecule H$_2$ should also have a
similar ``double spectrum'', corresponding to $S=0$ and $S=1$ states.
(Here we ignore the spins of the protons and count only the spins of the
two electrons.) In a sense, H$_2$ has two ``allotropic forms'', dubbed
parahydrogen and orthohydrogen. These two forms were subsequently
discovered by experimentalists. The Nobel committee cited this work when
awarding Heisenberg the Nobel prize.
\paragraph{Phosphorescence}
\begin{figure}[t]
\setlength{\unitlength}{0.0125in}%
\begin{picture}(160,270)(5,557)
\thicklines
\put( 25,585){\vector( 0, 1){175}}
\put(160,769){\vector( 0,-1){ 84}}
\put( 81,782){\vector( 1, 0){ 66}}
\put(140,650){\vector(-4,-3){ 80}}
\put(145,657){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm state}}}
\put( 30,700){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\it photon}}}
\put( 30,682){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\it absorption}}}
\put(100,605){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\it phosphorescence}}}
\put( 5,780){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm excited singlet}}}
\put( 10,767){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm state}}}
\put(140,670){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm metastable triplet}}}
\put( 10,570){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm ground singlet}}}
\put( 20,557){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm state}}}
\put(165,725){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\it vibrational}}}
\put(165,712){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\it relaxation}}}
\put(150,782){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm excited triplet}}}
\put(152,771){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\rm state}}}
\put(101,815){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\it spin-}}}
\put(100,804){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\it orbit}}}
\put( 98,791){\makebox(0,0)[lb]{\raisebox{0pt}[0pt][0pt]{\it coupling}}}
\end{picture}
\caption{Phosphorescence}
\figrule
\end{figure}
First, let's distinguish phosphorescence from fluorescence. A fluorescent
paint glows under a UV lamp, but stops glowing as soon as the lamp is
turned off. A phosphorescent paint keeps glowing for a while.
Phosphorescent substances have the ability to store up light and release it
gradually. The notion of a metastable state explains this. If the
molecules of the substance can get from the ground state to a
metastable state, and if the metastable state can slowly decay back to the
ground state via photon emission, then we have phosphorescence.
Typically, the metastable state is a triplet state, and the ground state is
a singlet state. Ground state molecules absorb photons and go to excited
singlet states (see figure~\thefigure). Most of them immediately hop right
back to the ground state, emitting a photon, but non-radiative processes
take a few to a less energetic triplet state. Once these molecules get to
the lowest triplet state, they are stuck there, at least for a while. Some
low probability process accomplishes the triplet-singlet conversion, and
the molecules slowly leak out light.
What are the singlet-triplet and triplet-singlet processes? I mention one
possibility of several. If the molecule contains two unpaired electrons,
and each is subject to spin-orbit coupling--- but with different
strengths--- then ${\bf s}_1$ and ${\bf s}_2$ will precess at different
rates, and the magnitude of the combined vector ${\bf s}_1+{\bf s}_2$ will
flip-flop between $S=0$ and $S=1$.
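A classical cartoon of this flip-flop (the tilt angle and the two
precession rates below are arbitrary illustrative choices of mine):

```python
import numpy as np

# Two classical unit spin vectors, tilted from the z-axis and
# precessing about it at different rates.
w1, w2 = 1.0, 1.3                      # illustrative precession rates

def spin(w, t, tilt=np.pi / 3):
    return np.array([np.sin(tilt) * np.cos(w * t),
                     np.sin(tilt) * np.sin(w * t),
                     np.cos(tilt)])

ts = np.linspace(0.0, 50.0, 2000)
mags = [np.linalg.norm(spin(w1, t) + spin(w2, t)) for t in ts]
# |s1 + s2| oscillates as the relative phase drifts: near 2 when the
# vectors line up ("triplet-like"), smaller when they oppose.
spread = max(mags) - min(mags)
```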
Complicating matters a little, the energy of a molecule consists of many
parts. For example, the vibration and rotation of the ``nuclear
framework'' stores energy. Suppose the excited singlet state is close in
energy to a triplet state. After the molecule makes the singlet-triplet
transition, it may shed energy into the environment through the vibrational
modes ({\it vibrational relaxation}). Finally it arrives at the bottom
triplet state. Eventually it decays back to the ground state via a
triplet-singlet transition.
Both the singlet-triplet and triplet-singlet transitions violate the
$\Delta S=0$ selection rule. This rule applies absolutely only for pure
electric dipole transitions, so glow-in-the-dark T-shirts don't
violate the laws of the universe. Still, transitions that violate it are
apt to be slow.
One last fact completes the explanation. A result of perturbation theory
says that the probability of a transition between state $u_i$ and state
$u_j$ is determined by two factors: the matrix element connecting the two
states (the ``strength'' of the coupling), and the difference in energy
levels. It is, in general, easier to make a transition when the energies
are close together. Thus the singlet-triplet transition goes faster than
the triplet-singlet transition.
\paragraph{Loose Ends}
Why $j(j+1)$, instead of $j^2$? Well, group representation theory tells us
so!--- as already mentioned in the section titled ``Spin One-half''. If
you consult the computations in a textbook, you will find it all hinges on
the commutators of the Lie algebra. You may also find it amusing to
compute the ``average value of $m^2$'' ($m$ being the projection on the
$z$-axis, as usual) like so:
\[
\frac{(-j)^2 + (-j+1)^2 +\ldots+(j-1)^2+j^2}{2j+1}
\]
and verify that $j(j+1)/3$ results. Now argue that for any vector {\bf
v}, $|{\bf v}|^2 = v_x^2 + v_y^2 + v_z^2$, so the average value of $m^2$
should be one-third the squared magnitude of the angular momentum vector.
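The arithmetic is easy to verify for a few values of $j$ (a quick check,
nothing more):

```python
# Average of m^2 over m = -j, -j+1, ..., j, compared with j(j+1)/3.
def avg_m2(j):
    ms = [-j + k for k in range(int(2 * j) + 1)]
    return sum(m * m for m in ms) / len(ms)

checks = [(j, avg_m2(j), j * (j + 1) / 3) for j in (0.5, 1, 1.5, 2, 7)]
# Each computed average agrees with j(j+1)/3.
```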
How do we reconcile the verdict of group representation theory with the
classical value $j^2$? I suggested, loosely, that ``$SO(3)$ is classical,
$SU(2)$ is quantum''. The eigenvalue $j(j+1)$ applies to representations
of $SO(3)$ as well as $SU(2)$.
The culprit here is the finite dimensionality of the representation. To
call $j$ the ``classical value'' for the angular momentum is misleading.
Let us return to ``unnatural units'', where $\hbar$ is small. The
``classical'' magnitude for angular momentum is $j\hbar$, with large $j$.
We get the ``classical limit'' by simultaneously letting $j \rightarrow
\infty$ and $\hbar \rightarrow 0$, while keeping the product constant.
In the classical limit, the quantum expression $\sqrt{j(j+1)}\hbar$
simplifies to $j\hbar$.
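A quick numerical look at this limit (with the product $j\hbar$ pinned
at~1, so that $\hbar = 1/j$):

```python
import math

# sqrt(j(j+1)) * hbar with j * hbar = 1 fixed: the ratio to the
# classical value j * hbar = 1 is sqrt(1 + 1/j), which sinks to 1.
ratios = [math.sqrt(j * (j + 1)) / j for j in (1, 10, 1000)]
```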
Bohr elevated this and similar limiting relations into a guiding principle
in the old quantum theory. He termed it the Correspondence Principle.
Sommerfeld called it a magic wand that only worked in Copenhagen, a
back-handed compliment.
The orbital angular momentum number $l$ can grow as large as one wishes.
The intrinsic spin, $s$, cannot--- $s=\frac{1}{2}$ for an electron, $s=1$
for a photon. Even for a large atom like uranium, $s$ would be at most a
few hundred, assuming the spin of all the electrons, protons, and neutrons
combined constructively. (But such a large value of $s$ would imply an
enormous energy which would blow the nucleus apart.) In practice, the
classical limit makes no sense for $s$, another sense in which intrinsic
spin is fundamentally non-classical.
The addition rules for angular momentum come from the following
considerations: suppose we have two representations $\sigma_1: SU(2)
\rightarrow SU(H_1)$ and $\sigma_2: SU(2) \rightarrow SU(H_2)$. $H_1$ and
$H_2$ are the Hilbert spaces for two separate physical systems. The
Hilbert space of the combined system is the tensor product $H_1\otimes
H_2$, and the representation $\sigma_1\otimes\sigma_2:SU(2)\rightarrow
SU(H_1\otimes H_2)$ begets the angular momentum operator for the combined
system. Even if $\sigma_1$ and $\sigma_2$ are irreducible representations,
$\sigma_1\otimes\sigma_2$ generally won't be, but will decompose into a
direct sum of irreducible representations:
\[
\sigma_1\otimes\sigma_2: SU(2) \rightarrow SU(K_1)\oplus\ldots\oplus
SU(K_r)
\]
The addition rules now all fall out from theorems of group representation theory.
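One consequence is easy to check without any representation theory:
dimensions must match on the two sides of the decomposition, i.e.,
$(2j_1+1)(2j_2+1) = \sum_{j=|j_1-j_2|}^{j_1+j_2} (2j+1)$. A quick sketch:

```python
# Dimension count for the decomposition of a tensor product of
# irreducible representations (the addition rule for j1 and j2).
def dims_match(j1, j2):
    lhs = (2 * j1 + 1) * (2 * j2 + 1)
    j, rhs = abs(j1 - j2), 0.0
    while j <= j1 + j2:
        rhs += 2 * j + 1
        j += 1
    return lhs == rhs

pairs = [(0.5, 0.5), (1, 0.5), (1, 1), (1.5, 2)]
results = [dims_match(a, b) for a, b in pairs]
```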
Enough said.
\paragraph{Postscript: the Sommerfeld Puzzle}
When Dirac developed relativistic quantum mechanics, the relativistic
Coulomb problem proved to be {\it exactly solvable} (in the approximation
of a heavy spinless nucleus and no radiative corrections). But the
resulting formula for the energy levels was truly a surprise: {\it the new
answer was precisely the old Sommerfeld formula!}
How could this possibly be? Sommerfeld's methods were heuristic (Bohr
quantization rules), out-dated by {\it two} revolutions
(Heisenberg-Schr\"o\-dinger nonrelativistic quantum mechanics and Dirac's
relativistic quantum mechanics), and they had no place at all for the
electron spin, let alone the four components of the Dirac electron. So
Sommerfeld's correct answer could only be a lucky accident, a sort of
cosmic joke at the expense of serious-minded physicists. See
L.C.~Biedenharn, ``The `Sommerfeld Puzzle' Revisited and Resolved'',
{\it Foundations of Physics}, vol.~13, no.~1 (1983), pp.~13--34.
\end{document}