Surprises in Logic

John Baez

April 4, 2016

There's a complexity barrier built into the very laws of logic: roughly speaking, while lots of things are more complex than this, we can't prove any specific thing is more complex than this. And this barrier is surprisingly low! Just how low? Read this.

And then see what happens when combine this idea with the famous 'surprise examination paradox', also known as the 'unexpected hanging paradox'.

Mathematically speaking, these ideas are called Chaitin's incompleteness theorem and the Kritchman–Raz proof of Gödel's second incompleteness theorem. But don't be intimidated: I'll explain everything you need to know!

After that I'll explain another surprise: there's a computer that computes any uncomputable function... in some model of arithmetic.

Chaitin's incompleteness theorem

Could we grow the whole universe with all its seeming complexity starting from a little seed? How much can you do with just a little information?

People have contests about this. Dan Piponi pointed out this animated video created in 2009 using a program less than 4 kilobytes long that runs on a Windows XP machine:

A beautiful alien planet compressed into a mere 4 kilobytes of information! As my friend the programmer Bruce Smith noted:

To be fair, the complexity of some of the OS, graphics drivers, and hardware should be included, but this is a lot less than you might think if you imagine rewriting it purely for compactness rather than for speed, and only including what this sort of program needs to produce output.

Still, it's quite amazing.

Mind you, people didn't figure out how to produce such fancy images from tiny amounts of information overnight! Iñigo Quílez, one of the guys who made this video, has explained some of the tricks. They're deep! And they were developed over decades in the demoscene — a computer art subculture where people produce demos: non-interactive audio-visual computer presentations that run in real time.

According to the Wikipedia article on the demoscene:

Recent computer hardware advancements include faster processors, more memory, faster video graphics processors, and hardware 3D acceleration. With many of the past's challenges removed, the focus in making demos has moved from squeezing as much out of the computer as possible to making stylish, beautiful, well-designed real time artwork.

The old tradition still lives on, though. Demo parties have competitions with varying limitations in program size or platform (different series are called compos). On a modern computer the executable size may be limited to 64 kB or 4 kB. Programs of limited size are usually called intros. In other compos the choice of platform is restricted; only old computers, like the 8-bit Commodore 64, or the 16-bit Amiga, or Atari ST, or mobile devices like handheld phones or PDAs are allowed. Such restrictions provide a challenge for coders, musicians and graphics artists and bring back the old motive of making a device do more than was intended in its original design.

What else can you do with just a little information? Bruce Smith listed a couple more:

So, amazingly complex things can be compressed into fairly little information. You can't help but wonder: how complex can something be?

The answer: arbitrarily complex! At least that's true if we're talking about the Kolmogorov complexity of a string of bits: namely, the length of the shortest computer program that prints it out. Lots of long strings of bits can't be compressed. You can't print out most of them using short programs, since there aren't enough short programs to go around.

Of course, we need to fix a computer language ahead of time, so this is well-defined. And we need to make sure the programs are written in binary, so the comparison is fair.

So, things can be arbitrarily complex. But here's a more interesting question: how complex can we prove something to be?

The answer is one of the most astounding facts I know. It's called Chaitin's incompleteness theorem. It says, very roughly:

There's a number L such that we can't prove the Kolmogorov complexity of any specific string of bits is more than L.

Make sure you understand this. For any number, we can prove there are infinitely many bit strings with Kolmogorov complexity more than that. But we can't point to any particular bit string and prove its Kolmogorov complexity is more than \(L\)!

Allen Knutson wrote:

That's an incredibly disturbing theorem, like driving to the edge of the universe and finding a wall.

I call \(L\) the 'complexity barrier'. So one question is, how big is \(L\)?

\(L\) depends not only on the programming language we use, but also on the system of math we use to prove things. By adjusting these in strange ways, we can make \(L\) as big as small as we like. But suppose we make a 'reasonable' choice?

Chaitin claims that for a certain version of the programming langauge LISP, and any system of math whose axioms can be encoded in a LISP program \(N\) bits long, the complexity barrier is $$ L \le N + 2359 $$ bits! It's hard, or perhaps even impossible, to find the smallest \(L\) that does the job for a certain programming language and system of math, so this is an upper bound. For details, see:

It's also interesting to see what a skilled programmer would guess about the value of \(L\), after the proof of Chaitin's theorem has been explained ot them. So, I asked my friend Bruce Smith to guess \(L\) for some other reasonable programming language and system of math. He estimated that it's a a few kilobytes! This is larger than Chaitin's value, but roughly consistent.

I'd like to see a program a few kilobytes long that produces a video showing a big bang, the formation of stars and galaxies, then planets, including one where life evolves, then intelligent life, then the development of computers... and finally someone writing the very same program.

I can't prove it's possible... but you can't prove it isn't!

Let's see why. Let's see the proof of Chaitin's incompleteness theorem.

Chaitin's incompleteness theorem—the proof

For starters, we need to choose some axioms for our system of math, so we know what 'provable' means. We need a system that's powerful enough to prove a bunch of basic facts about arithmetic, but simple enough that a computer program can check if a proof in this system is valid.

There are lots of systems like this. Three famous ones are Peano arithmetic, Robinson arithmetic (which is less powerful) and Zermelo-Fraenkel set theory (which is more powerful).

When you have a system of math like this, Gödel's first incompleteness theorem kicks in: if the system is consistent, it can't be complete. In other words, there are some questions it leaves unsettled. This is why we shouldn't be utterly shocked that while a bunch of bit strings have complexity more than \( L\), we can't prove this.

Gödel's second incompleteness theorem also kicks in: if the system can prove that it's consistent, it's not! (If it's not consistent, it can prove anything, so you shouldn't trust it.) So, there's a sense in which we can never be completely sure that our system of math is consistent. But let's assume it is.

Given this, Chaitin's theorem says:

There exists a constant \(L\) such that no string of bits has Kolmogorov complexity that provably exceeds \(L\).

How can we get a number that does the job? Any number \( L\) with

$$ U + \log_2(L) + K \lt L $$

will do. Here:

The length of \(L\) written out in binary is about \( \log_2(L)\). This bigger program thus has length

$$ U + \log_2(L) + K $$

and for the proof of Chaitin's incompleteness theorem to work, we need this to be smaller than \(L\). Obviously we can accomplish this by making \( L\) big enough, since \(\log_2 L\) grows slower than \(L\).

Given all the stuff I've told you, the proof of Chaitin's theorem almost writes itself! You run this bigger program I just described. If there were a bit string whose Kolmogorov complexity is provably greater than \( L\), this program would print one out. But that's a contradiction, because this program has length less than \( L\).

So, we just need to pick a computer language and a suitable system of math, and estimate \( U\) and, less importantly because it's so much smaller, \(K\). Then \( L\) will be just a bit bigger than \( U + K\).

I picked the language C and Peano arithmetic and asked Bruce if he could guess, roughly, what answer we get for \(L\). He replied:

I don't think it can be done in C, since C semantics are not well-defined unless you specify a particular finite machine size. (Since C programs can do things like convert pointers to integers and back, tell you the size of any datatype, and convert data of any specified datatype to bytes and back.) On a finite machine of \( N\) bits, all programs either finish in time less than about \( 2^N\) or take forever.

But if you take "C without size-specific operations", or a higher level language like Python, or for that matter a different sort of low-level language like a Turing machine, then that's not an issue—you can define a precise semantics that allows it to run a program for an arbitrarily long time and allocate an arbitrary number of objects in memory which contain pointers to each other. (To stick with the spirit of the question, for whatever language you choose, you'd want to disallow use of any large external batch of information like a "standard library", except for whatever is so basic that you think of it as part of the native language. This is not a serious handicap for this problem.)

The main things that the program 'U' (I'd rather call the program itself 'U' than call its length '\(U\)') needs to do are:

Assuming that U expresses the proofs it wants to check in a practical proof language (which will be more like what a practical theorem-prover like Coq uses than like what a traditional logician would recognize as "straight Peano arithmetic", but which will not be excessively powerful in the spirit of this question), I'd estimate that the most complex part is checking proof validity, but that that can still be expressed in at most a few dozen syntactic rules, each expressible in a few lines of code. (The authors of a system like Coq, which includes code to actually do that, would know better, as long as they remember that the vast majority of their system's actual code is not needed for this problem.)

This makes me think that even without trying to compact it much, in a reasonable language we could write U in a few hundred lines of code, or (after a bit of simple compression) a few thousand bytes. (And perhaps much less if we tried hard to compact the whole program in clever ways.)

So \(L\) will also be "a few thousand" (bytes or digits), or perhaps less, rather than some number you can never possibly count to.

The Kritchman–Raz proof of Gödel's second incompleteness theorem

In 2010, Shira Kritchman and Ran Raz used Chaitin's incompleteness theorem to prove Gödel's incompleteness theorem in an unexpected new way.

I'd like to explain to explain their argument a really sloppy way so everyone can understand it. Logic is cool, but most people never get to the cool part because they can't fight their way through the rather dry technicalities.

You've heard of the surprise examination paradox, right? The teacher says one day he'll give a quiz and it will be a surprise. So the kids think "well, it can't be on the last day then—we'd know it was coming." And then they think "well, so it can't be on the day before the last day, either!—we'd know it was coming." And so on... and they convince themselves it can't happen at all.

But then the teacher gives it the very next day, and they're completely surprised.

People still argue a lot about how to settle this paradox. But anyway, Kritchman and Raz used a rigorous version of this paradox together with Chaitin's incompleteness theorem to demonstrate Gödel's second incompleteness theorem—which says, very roughly, that:

If math can prove itself consistent, it's not.

If you're a logician, I bet this sort of sloppy statement really annoys you. If so, I'm sorry: I want to summarize Kritchman and Raz's argument in a really sloppy way. You can easily add the fine print to make this summary rigorous—or if you prefer, read their paper.

Okay, here we go:

Chaitin's theorem, which is already astounding, says there's some length \(L\) such that you can't prove any particular string of bits needs a program longer than \(L\) to print it out. At least, this is so if our system of math is consistent. If it's not consistent, you can prove anything!

On the other hand, there's some finite number of programs of length \(\le L\). So if you take a list of more numbers than that, say \(1, 2, \dots, N\), there's got to be at least one that needs a program longer than \(L\) to print it out.

Okay: there's got to be at least one. How many? Suppose just one. Then we can go through all programs of length \(\le L\), find those that print all the other numbers on our list... and thus, by a process of elimination, find the culprit.

But that means we've proved that this culprit is a number can only be computed by a program of length \(\gt L\). But Chaitin's theorem says that's impossible! At least, not if math is consistent.

So there can't be just one. At least, not if math is consistent.

Okay: suppose there are just two. Well, we can pull the same trick and find out who they are! So there can't be just two, either. At least, not if math is consistent.

We can keep playing this game until we rule out all the possibilities... and then we're really stuck. We get a contradiction. At least, if math is consistent.

So if we could prove math is consistent, we'd know it's not!


This article first appeared as three posts on my blog. I've polished them up here, but there are lots of interesting comments over there, and you may want to make your own, or ask questions:

As I mentioned, Chaitin proves a version of his incompleteness theorem here:

However, this book is hard to read without going back to this earlier one:

Kritchman and Raz proved their result here:

There's been a lot of discussion about the significance of Chaitin's incompleteness theorem. Here's an intelligent argument that some of the claims are overblown:

For more on Kolmogorov complexity and related ideas, try:

The computability of incomputable functions

Here's another surprising result:

Let me try to explain it without assuming you're an expert on mathematical logic. That may be hard, but I'll give it a try. We need to start with some background.

First, you need to know that there are many different models of arithmetic. If you write down the usual axioms for the natural numbers, the Peano axioms, you can then look around for different structures that obey these axioms. These are called 'models' of Peano arithmetic.

One of them is what you think the natural numbers are. For you, the natural numbers are just 0, 1, 2, 3, ..., with the usual way of adding and multiplying them. This is usually called the 'standard model' of Peano arithmetic. The numbers 0, 1, 2, 3, ... are called the 'standard' natural numbers.

But there are also nonstandard models of arithmetic. These models contain extra numbers beside the standard ones! These are called 'nonstandard' natural numbers.

This takes a while to get used to. There are several layers of understanding to pass through.

For starters, you should think of these extra 'nonstandard' natural numbers as bigger than all the standard ones. So, imagine a whole bunch of extra numbers tacked on after the standard natural numbers, with the operations of addition and multiplication cleverly defined in such a way that all the usual axioms still hold.

You can't just tack on finitely many extra numbers and get this to work. But there can be countably many, or uncountably many. There are infinitely many different ways to do this. They are all rather hard to describe.

To get a handle on them, it helps to realize this. Suppose you have a statement S in arithmetic that is neither provable nor disprovable from the Peano aioms. Then S will hold in some models of Peano arithmetic, while its negation not(S) will hold in some other models.

For example, the Gödel sentence G says "this sentence is not provable from the Peano axioms". If Peano arithmetic is consistent, neither G nor not(G) is provable in Peano arithmetic. So G holds in some models, while not(G) holds in others.

Thus, you can intuitively think of different models as "possible worlds". If you have an undecidable statement, meaning one that you can't prove or disprove using the Peano axioms, then it holds in some worlds, while its negation holds in other worlds.

In the case of the Gödel sentence G, most mathematicians think G is "true". Why the quotes? Truth is a slippery concept in logic — there's no precise definition of what it means for a sentence in arithmetic to be "true". All we can precisely define is:

1) whether or not a sentence is provable from some axioms


2) whether or not a sentence holds in some model.

Nonetheless, mathematicians are human, so they have beliefs about what's true. Many mathematicians believe that G is true: indeed, in popular accounts one often hears that G is "true but unprovable from the Peano axioms". So, these mathematicians are inclined to say that any model where G doesn't hold is nonstandard.

Anyway, what is Joel David Hamkins' result? It's this:

There is a Turing machine T with the following property. For any function \(f\) from the natural numbers to the natural numbers, there is a model of Peano arithmetic such that in this model, if we give T any standard natural \(n\) as input, it halts and outputs \(f(n).\)

So, take \(f\) to be your favorite uncomputable function. Then there's a model of arithmetic such that in this model, the Turing machine computes \(f,\) at least when you feed the machine standard numbers as inputs.

So, very very roughly, there's a possible world in which your uncomputable function becomes computable!

But you have to be very careful about how you interpret this result.

What's the trick? The proof is beautiful, but it would take real work to improve on Hamkins' blog article, so please read that. I'll just say that he makes extensive use of Rosser sentences, which say:

For any proof of this sentence in theory T, there is a smaller proof of the negation of this sentence.

Rosser sentences are already mind-blowing, but Hamkins uses an infinite sequence of such sentences and their negations, chosen in a way that depends on the function \(f,\) to cleverly craft a model of arithmetic in which the Turing machine T computes this function on standard inputs.

But what's really going on? How can using a nonstandard model make an uncomputable function become computable for standard natural numbers? Shouldn't nonstandard models agree with the standard one on this issue? After all, the only difference is that they have extra nonstandard numbers tacked on after all the standard ones! How can that make a Turing machine succeed in computing \(f\) on standard natural numbers?

I'm not 100% sure, but I think I know the answer. I hope some logicians will correct me if I'm wrong.

You have to read the result rather carefully:

There is a Turing machine T with the following property. For any function \(f\) from the natural numbers to the natural numbers, there is a model of Peano arithmetic such that in this model, if we give T any standard natural \(n\) as input, it halts and computes \(f(n).\)

When we say the Turing machine halts, we mean it halts after \(N\) steps for some natural number \(N.\) But this may not be a standard natural number! It's a natural number in the model we're talking about.

So, the Turing machine halts... but perhaps only after a nonstandard number of steps.

In short: you can compute the uncomputable, but only if you're willing to wait long enough. You may need to wait a nonstandard amount of time.

It's like that old Navy saying:


But the trick becomes more evident if you notice that _one single_ Turing machine T computes _different functions_ from the natural numbers to the natural numbers... in different models. That's even weirder than computing an uncomputable function.

The only way to build a machine that computes \(n+1\) in one model and \(n+2\) in another to build a machine that doesn't halt in a standard amount of time in either model. It only halts after a _nonstandard_ amount of time. In one model, it halts and outputs \(n+1.\) In another, it halts and outputs \(n+2.\)

A scary possibility

To dig a bit deeper — and this is where it gets a bit scary — we have to admit that the standard model is a somewhat elusive thing. I certainly didn't define it when I said this:

For you, the natural numbers are just 0, 1, 2, 3, ..., with the usual way of adding and multiplying them. This is usually called the 'standard model' of Peano arithmetic. The numbers 0, 1, 2, 3, ... are called the 'standard' natural numbers.

The point is that "0, 1, 2, 3, ..." here is vague. It makes sense if you already know what the standard natural numbers are. But if you don't already know, those three dots aren't going to tell you!

You might say the standard natural numbers are those of the form 1 + ··· + 1, where we add 1 to itself some finite number of times. But what does 'finite number' mean here? It means a standard natural number! So this is circular.

So, conceivably, the concept of 'standard' natural number, and the concept of 'standard' model of Peano arithmetic, are more subjective than most mathematicians think. Perhaps some of my 'standard' natural numbers are nonstandard for you!

I think most mathematicians would reject this possibility... but not all. Edward Nelson tackled it head-on in his marvelous book Internal Set Theory. He writes:

Perhaps it is fair to say that "finite" does not mean what we have always thought it to mean. What have we always thought it to mean? I used to think that I knew what I had always thought it to mean, but I no longer think so.

If we go down this road, Hamkins' result takes on a different significance. It says that any subjectivity in the notion of 'natural number' may also infect what it means for a Turing machine to halt, and what function a Turing machine computes when it does halt.

© 2016 John Baez