Game Theory (Part 7)

John Baez

We need to learn a little probability theory to go further in our work on game theory.

We'll start with some finite set \( X\) of 'outcomes'. The idea is that these are things that can happen — for example, choices you could make while playing a game. We assume that if one of these outcomes happens, the others don't. A 'probability distribution' on this set assigns to each outcome a number called a 'probability' — which says, roughly speaking, how likely that outcome is. If we've got some outcome \( i,\) we'll call its probability \( p_i.\)

For example, suppose we're interested in whether it will rain today or not. Then we might look at a set of two outcomes:

\( X = \{\textrm{rain}, \textrm{no rain} \}\)

If the weatherman says the chance of rain is 20%, then

\( p_{\textrm{rain} } = 0.2 \)

since 20% is just a fancy way of saying 0.2. The chance of no rain will then be 80%, or 0.8, since the probabilities should add up to 1:

\( p_{\textrm{no rain}} = 0.8 \)

Let's make this precise with an official definition:

Definition. Given a finite set \( X\) of outcomes, a probability distribution \( p\) assigns a real number \( p_i\) called a probability to each outcome \( i \in X,\) such that:

1) \( 0 \le p_i \le 1 \)

and

2) \( \displaystyle{ \sum_{i \in X} p_i = 1} \)
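If you like to check such things on a computer, here is a minimal sketch in Python of conditions 1) and 2). It assumes a distribution is stored as a dictionary mapping each outcome to its probability. The function name `is_probability_distribution` and the tolerance `tol` are hypothetical choices made for this illustration, not part of the definition.

```python
import math

def is_probability_distribution(p, tol=1e-9):
    """Check conditions 1) and 2) for a dict mapping outcomes to numbers."""
    # Condition 1: every probability lies between 0 and 1.
    in_range = all(0 <= p_i <= 1 for p_i in p.values())
    # Condition 2: the probabilities add up to 1.
    # (Assumption: we allow a small tolerance for floating-point rounding.)
    sums_to_one = math.isclose(sum(p.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one

weather = {"rain": 0.2, "no rain": 0.8}
print(is_probability_distribution(weather))                         # True
print(is_probability_distribution({"rain": 0.2, "no rain": 0.7}))   # False
```

The second call fails only because the numbers don't add up to 1. Each of them is still between 0 and 1.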

Note that this official definition doesn't say what an outcome really is, and it doesn't say what probabilities really mean. But that's how it should be! As usual with math definitions, the terms being defined could be replaced by any other words and the definition would still do its main job, which is to let us prove theorems involving these words. If we wanted, we could call an outcome a doohickey, and call a probability a schnoofus. All our theorems would still be true.

Of course we hope our theorems will be useful in real world applications. And in these applications, the probabilities \( p_i\) will be some way of measuring 'how likely' outcomes are. But it's actually quite hard to say precisely what probabilities really mean! People have been arguing about this for centuries. So it's good that we separate this hard task from our definition above, which is quite simple and 100% precise.

Why is it hard to say what probabilities really are? Well, what does it mean to say "the probability of rain is 20%"? Suppose you see a weather report and read this. What does it mean?

A student suggests: "it means that if you looked at a lot of similar days, it would rain on 20% of them."

Yes, that's pretty good. But what counts as a "similar day"? How similar does it have to be? Does everyone have to wear the same clothes? No, that probably doesn't matter, because it presumably doesn't affect the weather. But what does affect the weather? A lot of things! Do all those things have to be exactly the same for it to count as a similar day?

And what counts as a "lot" of days? How many do we need?

And it won't rain on exactly 20% of those days. How close do we need to get?

Imagine I have a coin and I claim it lands heads up 50% of the time. Say I flip it 10 times and it lands heads up every time. Does that mean I was wrong? Not necessarily. It's possible that the coin will do this. It's just not very probable.

But look: now we're using the word 'probable', which is the word we're trying to understand! It's getting sort of circular: we're saying a coin has a 50% probability of landing heads up if, when you flip it a lot of times, it probably lands heads up close to 50% of the time. That's not very helpful if you don't already have some idea what 'probability' means.

For all these reasons, and many more, it's tricky to say exactly what probabilities really mean. People have made a lot of progress on this question, but we will sidestep it and focus on learning to calculate with probabilities.

If you want to dig in a bit deeper, try this:

• Probability interpretations, Wikipedia.

Equally likely outcomes

As I've tried to convince you, it can be hard to figure out the probabilities of outcomes. But it's easy if we assume all the outcomes are equally likely.

Suppose we have a set \( X\) consisting of \( n\) outcomes. And suppose that all the probabilities \( p_i\) are equal: say for some constant \( c\) we have

\( p_i = c \)

for all \( i \in X.\) Then by rule 2) above,

\( \displaystyle{ 1 = \sum_{i \in X} p_i = \sum_{i \in X} c = n c } \)

since we're just adding the number \( c\) to itself \( n\) times. So,

\( \displaystyle{ c = \frac{1}{n} } \)

and thus

\( \displaystyle{ p_i = \frac{1}{n} } \)

for all \( i \in X\).

I made this look harder than it really is. I was just trying to show you that it follows from the definitions, not any intuition. But it's obvious: if you have \( n\) outcomes that are equally likely, each one has probability \( 1/n.\)
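Here is a minimal sketch of the same fact in Python. It uses exact fractions so that the probabilities add up to exactly 1. The function name `uniform_distribution` and the six-outcome example set are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def uniform_distribution(outcomes):
    """Assign probability 1/n to each of the n given outcomes."""
    outcomes = list(outcomes)
    n = len(outcomes)
    if n == 0:
        raise ValueError("need at least one outcome")
    return {i: Fraction(1, n) for i in outcomes}

# A made-up set X with n = 6 outcomes.
p = uniform_distribution(["a", "b", "c", "d", "e", "f"])
print(p["a"])            # 1/6
print(sum(p.values()))   # 1
```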

Example 1. Suppose we have a coin that can land either heads up or tails up — let's ignore the possibility that it lands on its edge! Then

\( X = \{ H, T\}\)

If we assume these two outcomes are equally probable, we must have

\( \displaystyle{ p_H = p_T = \frac{1}{2} } \)

Note I said "if we assume" these two outcomes are equally probable. I didn't say they actually are! Are they? Suppose we take a penny and flip it a zillion times. Will it land heads up almost exactly half a zillion times?

Probably not! The treasury isn't interested in making pennies that do this. They're interested in making the head look like Lincoln, and the tail look like the Lincoln Memorial.

Or at least they used to. Since the two sides are different, there's no reason they should have the exact same probability of landing on top.

In fact nobody seems to have measured the difference between the probabilities of heads and tails for flipped pennies. For hand-flipped pennies, it seems whichever side starts on top has a roughly 51% chance of landing on top! But if you spin a penny, it's much more likely to land tails up:

• The coin flip: a fundamentally unfair proposition?, Coding the Wheel.

or if you're really serious, try this paper by three friends of mine:

• Persi Diaconis, Susan Holmes and Richard Montgomery, Dynamical bias in the coin flip.

Example 2. Suppose we have a standard deck of cards, well-shuffled, and assume that when I draw a card from this deck, each card is equally likely to be chosen. What is the probability that I draw the ace of spades?

If there's no joker in the deck, there are 52 cards, so the answer is 1/52.

Let me remind you how a deck of cards works: I wouldn't want someone to fail the course because they never played cards! A standard deck has 52 cards.

They come in 4 kinds, called suits. The suits are:

• clubs: ♣

• spades: ♠

• diamonds: ♦

• hearts: ♥

Two suits are black and two are red. Each suit has 13 cards in it, for a total of 4 × 13 = 52. The cards in each suit are numbered from 1 to 13, except that four of them are labeled with letters instead of numbers. They go like this:

A, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K

A stands for 'ace', J for 'jack', Q for 'queen' and K for 'king'.
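To make the structure of the deck concrete, here is a rough sketch that builds it in Python and recovers the answer to Example 2. Representing a card as a (rank, suit) pair is just one convenient choice, and the names `RANKS`, `SUITS`, `deck` and `p` are my own for this illustration.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "spades", "diamonds", "hearts"]

# Each card is a (rank, suit) pair, so there are 13 * 4 = 52 of them.
deck = list(product(RANKS, SUITS))

# Equally likely outcomes: every card gets probability 1/52.
p = {card: Fraction(1, len(deck)) for card in deck}

print(len(deck))            # 52
print(p[("A", "spades")])   # 1/52
```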

Probabilities of subsets

If we know a probability distribution on a finite set \( X\), we can define the probability that an outcome in some subset \( S \subseteq X\) will occur. We define this to be

\( \displaystyle{p(S) = \sum_{i \in S} p_i } \)

For example, suppose I always have one of three things for breakfast:

\( X = \{ \textrm{oatmeal}, \textrm{waffles}, \textrm{eggs} \} \)

This is my set of outcomes. If I eat one of these things, I don't eat the others, so the probabilities of these different outcomes should add up to 1.

Suppose I have an 86% chance of eating oatmeal for breakfast, a 10% chance of eating waffles, and a 4% chance of eating eggs. What's the probability that I will eat oatmeal or waffles? These choices form the subset

\( S = \{ \textrm{oatmeal}, \textrm{waffles} \} \)

and the probability for this subset is

\( p(S) = p_{\textrm{oatmeal}} + p_{\textrm{waffles}} = 0.86 + 0.1 = 0.96 \)
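The formula for \( p(S)\) translates almost word for word into code. Here is a small sketch using the breakfast numbers. The function name `probability_of_subset` is hypothetical, and I use exact fractions to avoid rounding.

```python
from fractions import Fraction

def probability_of_subset(p, S):
    """Return p(S), the sum of p_i over all outcomes i in S."""
    return sum(p[i] for i in S)

breakfast = {
    "oatmeal": Fraction(86, 100),
    "waffles": Fraction(10, 100),
    "eggs": Fraction(4, 100),
}

S = {"oatmeal", "waffles"}
print(float(probability_of_subset(breakfast, S)))   # 0.96
```

Taking \( S\) to be all of \( X\) gives 1, which is just rule 2) of the definition again.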

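Here's an example from cards. Suppose again that each card in a standard deck is equally likely to be drawn. What is the probability that I draw a heart?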

Since there are 13 cards in the suit of hearts, each with probability 1/52, we add up their probabilities and get

\( \displaystyle{ 13 \times \frac{1}{52} = \frac{1}{4} }\)

This should make sense, since there are 4 suits, with the same number of cards in each.
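The same sum can be carried out by brute force. This sketch assumes cards are stored as (rank, suit) pairs, each with probability 1/52. The helper `probability_of_suit` is a hypothetical name for this example.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "spades", "diamonds", "hearts"]

deck = list(product(RANKS, SUITS))
p = {card: Fraction(1, 52) for card in deck}

def probability_of_suit(suit):
    """Sum p_i over the subset S of cards belonging to the given suit."""
    return sum(p[card] for card in deck if card[1] == suit)

print(probability_of_suit("hearts"))   # 1/4
```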

Card tricks

This is just a fun digression. The deck of cards involves some weird numerology. For starters, it has 52 cards. That's a strange number! Where else have you seen this number?

A student says: "It's the number of weeks in a year."

Right! And these 52 cards are grouped in 4 suits. What does the year have 4 of?

A student says: "Seasons!"

Right! And we have 52 = 4 × 13. So what are there 13 of?

A student says: "Weeks in a season!"

Right! I have no idea if this is a coincidence or not. And have you ever added up the values of all the cards in a suit, where we count the ace as 1, and so on? We get

1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13

And what's that equal to?

After a long pause, a student says "91."

Yes, that's a really strange number. But let's say we total up the values of all the cards in the deck, not just one suit. What do we get?

A student says "We get 4 × 91... or 364."

Right. Three-hundred and sixty-four. Almost the number of days in a year.

"So add one more: the joker! Then you get 365!"

Right, maybe that's why they put an extra card called the joker in the deck.

One extra card for one extra day, joker-day... April Fool's Day! That brings the total up to 365.

Again, I have no idea if this is a coincidence or not. But the people who invented the Tarot deck were pretty weird—they packed it with symbolism—so maybe the ordinary cards were designed this way on purpose too.
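If you don't trust the mental arithmetic, here is a quick check. It assumes, as in the discussion above, that the ace counts as 1, the jack, queen and king count as 11, 12 and 13, and the joker adds one more.

```python
# Values of the 13 cards in one suit: ace = 1, ..., king = 13.
suit_total = sum(range(1, 14))
# Four suits in the deck.
deck_total = 4 * suit_total
# Assumption from the text: the joker counts as one extra.
with_joker = deck_total + 1

print(suit_total, deck_total, with_joker)   # 91 364 365
```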

Puzzle. What are the prime factors of the number 91? You should know by now... and you should know what they have to do with the calendar!

You can also read comments on Azimuth, and make your own comments or ask questions there!