Turing Zeta Functions
and
Uncertainty
Mike Stay
stay@google.com
(Work done with Cristian Calude at University of Auckland, New Zealand)
What is information?
- Liebniz, 1686: The year before Newton's Principia
- How to distinguish a world describable by science from one that isn't?
- "Lawful behavior"
- Data fitting
- Law is a smaller description of the data
What's a description?
- Church, 1934: Lambda calculus
- Turing, 1936: Turing machines, compute the same functions as lambda calculus
- A description of x relative to a programming language is a program in that language with no inputs and whose output is x.
Turing Machines
000000111101101000000000
......^.................
- Turing machines have
- a bi-infinite tape
- a read/write head
- internal state
- A state is a row in a three column table:
- new bit
- new state
- right or left
or a special HALT state.
Turing Machines
000000111101101000000000
......^.................
- This makes a Turing machine a partial computable function from binary strings to binary strings:
`T:\{0,1\}^** \stackrel{\circ}{\to} \{0,1\}^**`
Universal Turing Machines
- A Turing machine `U` is universal if
- `\forall` Turing machines `T`,
- `\exists c_T \in \{0,1\}^**` such that
- `\forall p \in \{0,1\}^**`,
- `\exists p' \in \{0,1\}^**` such that
`U(p') = T(p)`
and
`|p'| \le |p| + |c_T|`.
Universal Programming Language
- A programming language `U` is universal if
- for any other programming language `T`
- there's a program `c_T`, called a compiler such that
- for any program `p` written in `T`,
- there's another program `p'` written in `U` such that
they give the same answer
and
`p'` is no longer than concatenating the compiler and the program for `T`.
Algorithmic Information Theory
- Kolmogorov, Solomonoff, Chaitin in the 1960's.
- The complexity `H_U(x)` of a string `x` with respect to a TM `U` is the shortest program for `U` whose output is `x`:
`H_U(x) = min(\{ |p| \ \ |\ \ p \in \{0,1\}^** and U(p) = x\})`
- The smallest program `p` is called the elegant program for `x`.
- Work with prefix-free programs. Then can use Lebesgue measure to assign probability.
Formal Axiomatic Systems
- Axioms and rules `=>` Theorems
- Theorem examiner is a program that enumerates all possible proofs and asks of each proof:
- Does this theorem prove that `p` is elegant for some `x`?
- Is `|p| > |\mbox{FAS}| + |\mbox{theorem examiner}|`?
If yes to both, then return `x`.
- Contradiction, so no such long proofs exist!
Chaitin's `\Omega` (1974)
`\Omega = \sum_{p \mbox{ halts}} 2^{-|p|}`
- Theorem: Bits of `\Omega` are random, i.e.
`H_U(\Omega[m]) \ge m - c_U`
where `x[m]` denotes the first `m` bits of the binary fraction `x`.
Chaitin's `\Omega` (1974)
Chaitin's `\Omega` (1974)
- Add up contributions of halting programs
- When `m` bits match, no more `p` such that `|p| < m` can halt, because otherwise the contribution `2^{-|p|} > 2^{-m}` and one of the bits would change!
Chaitin's `\Omega` (1974)
- There is a program `\Psi` that
- reads in `\Omega_U[m]`
- computes the outputs of the halting programs
- chooses a string `x` not in that set
Then
`H_U(\Psi(\Omega_U[m])) = H_U(x) > m`
Chaitin's `\Omega` (1974)
- By universality of `U`, there's a program `c_{\Psi}` that implements it, so
`|c_{\Psi}| + H_U(\Omega_U[m]) \ge H_U(x) > m`
and setting `c = |c_\Psi|`,
`H(\Omega_U[m]) \ge H(x) - |c_{\Psi}| \ge m - c`
`\square`
Chaitin's Incompleteness Theorem
- Given a program implementing any FAS and a theorem examiner that looks for proofs about the values of bits of `\Omega_U[m]`, we can find its length
`N = |\mbox{FAS}| + |\mbox{theorem examiner}|.`
This program cannot prove the values of more than `m = N + c` bits. So a FAS like ZFC can't tell you more than a few bits of `\Omega_U`.
- So, there are infinitely many unprovable true statements, namely the values of the bits of `\Omega_U.`
- Calude et al. computed 64 bits of `\Omega` for a simple programming language.
Chaitin's Incompleteness Theorem
- If you add `k` bits worth of axioms to your FAS, you'll be able to prove at most `k` more bits of `\Omega_U`.
- The bits of `\Omega_U` are irreducible information
- The answer to any finitely refutable problem is encoded in the bits of `\Omega:` Riemann hypothesis, Goldbach's conjecture, Fermat's last theorem.
- Given enough bits to cover a FAS and theorem examiner, you can tell (in principle) whether it will ever generate a proof of a given conjecture!
Tadaki: Partial Randomness (2002)
`\Omega_U(s) = \sum_{p \mbox{ halts}} 2^{-s|p|}`
- For computable real `s>1,` `\Omega_U(s)` is `\frac{1}{s}-`random, i.e.
`H(\Omega_U(s)[m]) \ge \frac{m}{s} - c`
Stay: Zeta function of a Turing machine (2005)
Stay: Zeta function of a Turing machine (2005)
`\zeta_T(s) = \sum_{\mbox{bin}(n) \mbox{ halts}} n^{-s}`
- For universal `U`,
`zeta_U(s)` is `\frac{1}{s}-`random.`
- For any machine `R` that halts on all inputs (not just prefix-free ones),
`\zeta_R(s)` is Riemann zeta.
Stay: Uncertainty principle
- Let `\zeta = \zeta_U(1)`.
- Given `\zeta[m],` the uncertainty in `\zeta` is `\Delta\zeta = 2^{-m}`.
- What is the uncertainty in the elegant program for `\zeta[m]` given an unknown string `s`?
- If all the information we need is in `s`---that is,
`U(s) = \zeta[m]`
---then `n=1.`
- If `s` is independent of `\zeta[m]`, then `n` needs to satisfy
`U(\mbox{bin}(n)) = \zeta[m]`
and since `|\mbox{bin}(n)| > m-c`, `n > 2^{-m+c}`.
- The difference between these extremes is the uncertainty, so `\Delta n \ge 2^{-m+c_U}`.
Stay: Uncertainty Principle
`\Delta \zeta * \Delta \n \ge 2^{-c_U}`