On Thursday, I probably should've told you explicitly that I was compressing a whole math course into one lecture. On the one hand, that means I don't really expect you to have understood everything. On the other hand, to the extent you did understand -- hey! You got a whole math course in one lecture! You're welcome.

But I do realize that in the last lecture, I went too fast in some places. In particular, I wrote an example of logical inference on the board. The example was, if all A's are B's, and there is an A, then there is a B. I'm told that the physicists were having trouble with that?

Hey, I'm just ribbin' ya. If you haven't seen this way of thinking before, then you haven't seen it. But maybe, for the benefit of the physicists, we should go over the basic rules of logic?

- **Propositional Tautologies:** "A or not A," "not(A and not A)," etc. are valid.
- **Modus Ponens:** If A is valid and "A implies B" is valid, then B is valid.
- **Equality Rules:** "x=x," "x=y implies y=x," "x=y and y=z implies x=z," and "x=y implies f(x)=f(y)" are all valid.
- **Change of Variables:** Changing variable names leaves a statement valid.
- **Quantifier Elimination:** If "For all x, A(x)" is valid, then A(y) is valid for any y.
- **Quantifier Addition:** If A(y) is valid, where y is an unrestricted variable, then "For all x, A(x)" is valid.
- **Quantifier Rules:** If "Not(For all x, A(x))" is valid, then "There exists an x such that Not(A(x))" is valid. And so on.
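
For the propositional part, at least, validity can be checked mechanically by brute force over truth assignments. Here's a minimal sketch in Python (the helper `is_tautology` is my own name for it, not anything standard):

```python
from itertools import product

def is_tautology(formula, num_vars):
    """Check a propositional formula (given as a function of booleans)
    by trying every possible truth assignment."""
    return all(formula(*assignment)
               for assignment in product([False, True], repeat=num_vars))

# "A or not A" and "not(A and not A)" are valid; "A implies B" alone is not.
print(is_tautology(lambda a: a or not a, 1))          # → True
print(is_tautology(lambda a: not (a and not a), 1))   # → True
print(is_tautology(lambda a, b: (not a) or b, 2))     # → False
```

Of course, this only works because propositional logic is decidable; once the quantifiers enter, no such brute-force check is available.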

There's an amazing result called Gödel's Completeness Theorem, which says that these rules are all you ever need. In other words: if, starting from some set of axioms, you can't derive a contradiction using these rules, then the axioms must have a model (i.e., they must be consistent). Conversely, if the axioms are inconsistent, then the inconsistency can be proven using these rules alone.

Think about what that means. It means that Fermat's Last Theorem, the Poincaré Conjecture, or any other mathematical achievement you care to name can be proved by starting from the axioms for set theory, and then applying these piddling little rules over and over again. Probably 300 million times, but still...

(How does Gödel prove the Completeness Theorem? The proof has been described as "extracting semantics from syntax." We simply "cook up objects to order" as the axioms request them! And if we ever run into an inconsistency, that can only be because there was an inconsistency in the original axioms.)

One immediate consequence of the Completeness Theorem is the
*Löwenheim-Skolem Theorem*: every consistent set of axioms has a model of at
most countable cardinality. (Note: One of the best predictors of success in
mathematical logic is having an umlaut in your name.) Why? Because the process
of "cooking up objects to order as the axioms request them" can only go on for a
countably infinite number of steps!

It's a shame that, after proving his Completeness Theorem, Gödel never really
did anything else of note. [Pause for comic effect] Well, alright, I guess a
year later he proved the *Incompleteness* Theorem. See, the Completeness
Theorem was his Master's thesis, and the Incompleteness Theorem was his PhD
thesis. Apparently, one of his PhD examiners didn't want to give him a degree
because the PhD thesis was "too similar to the Master's thesis."

The Incompleteness Theorem says that, given any consistent, computable set of
axioms, there's a true statement about the integers that can never be proved
from those axioms. Here *consistent* means that you can't derive a
contradiction, while *computable* means that either there are finitely many
axioms, or else if there are infinitely many, at least there's an algorithm to
generate all the axioms.

(If we didn't have the computability requirement, then we could simply take our "axioms" to consist of all true statements about the integers! In practice, that isn't a very useful set of axioms.)

But wait! Doesn't the Incompleteness Theorem contradict the Completeness Theorem, which says that any statement that's entailed by the axioms can be proved from the axioms? Hold that question; we're gonna clear it up later.

First, though, let's see how the Incompleteness Theorem is proved. People always say, "the proof of the Incompleteness Theorem was a technical tour de force, it took 30 pages, it requires an elaborate construction involving prime numbers," etc. Unbelievably, 80 years after Gödel, that's still how the proof is presented in math classes!

Alright, should I let you in on a secret? The proof of the Incompleteness Theorem is about two lines. The caveat is that, to give the two-line proof, you first need the concept of a computer.

When I was in junior high, I had a friend who was really good at math, but
maybe not so good at programming. He wanted to write a program using arrays, but
he didn't know what an array was. So what did he do? He associated each element
of the array with a unique prime number, then he multiplied them all together;
then, whenever he wanted to read something out of the array, he *factored*
the product. (If he was programming a quantum computer, maybe that wouldn't be
quite so bad!) Anyway, what my friend did, that's basically what Gödel did. He
made up an elaborate hack in order to program without programming.
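
Here's roughly what that trick looks like in Python (the function names are mine, and this version handles lists of nonnegative integers):

```python
def primes(n):
    """Return the first n primes, by trial division (fine for small n)."""
    ps = []
    candidate = 2
    while len(ps) < n:
        if all(candidate % p for p in ps):
            ps.append(candidate)
        candidate += 1
    return ps

def encode(array):
    """Pack a list of nonnegative integers into a single number:
    the i-th element becomes the exponent of the i-th prime."""
    n = 1
    for p, x in zip(primes(len(array)), array):
        n *= p ** x
    return n

def read(n, i):
    """Recover element i by 'factoring': count how many times
    the i-th prime divides n."""
    p = primes(i + 1)[-1]
    count = 0
    while n % p == 0:
        n //= p
        count += 1
    return count

code = encode([3, 1, 4])   # 2^3 * 3^1 * 5^4 = 15000
print(read(code, 2))       # → 4
```

Hideously inefficient as a data structure, but perfectly serviceable as a way of talking about sequences when all you officially have are integers -- which was exactly Gödel's predicament.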

**Turing Machines**

OK, time to bring Mr. T. on the scene. How many of you have seen Turing machines before? About three-quarters of you? I'll go pretty quickly then.

In 1936, the word "computer" meant a person (usually a woman) whose job was to compute with pencil and paper. Turing wanted to show that, in principle, such a "computer" could be simulated by a machine. What would the machine look like? Well, it would have to be able to write down its calculations somewhere. Since we don't really care about handwriting, font size, etc., it's easiest to imagine that the calculations are written on a sheet of paper divided into squares, with one symbol per square, and a finite number of possible symbols. Traditionally paper has two dimensions, but without loss of generality we can imagine a long, one-dimensional paper tape. How long? For the time being, we'll assume as long as we need.

What can the machine do? Well, clearly it has to be able to read symbols off the tape and modify them based on what it reads. We'll assume for simplicity that the machine reads only one symbol at a time. But in that case, it had better be able to move back and forth on the tape. It would also be nice if, once it's computed an answer, the machine can halt! But at any time, how does the machine decide which things to do? According to Turing, this decision should depend only on two pieces of information: (1) the symbol currently being read, and (2) the machine's current "internal configuration" or "state." Based on its internal state and the symbol currently being read, the machine should (1) write a new symbol in the current square, (2) move backwards or forwards one square, and (3) switch to a new state or halt.

Finally, since we want this machine to be physically realizable, the number of possible internal states should be finite. These are the only requirements.
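
To make the definition concrete, here's a minimal simulator in Python. The encoding of the transition table, and the little `flipper` example machine, are my own illustrative choices, not Turing's original formalism:

```python
def run(machine, tape, state='start', steps=10_000):
    """Simulate a one-tape Turing machine.
    `machine` maps (state, symbol) -> (new_symbol, move, new_state),
    where move is -1 or +1 and new_state may be 'halt'.
    Unvisited squares hold the blank symbol '_'."""
    tape = dict(enumerate(tape))
    pos = 0
    for _ in range(steps):
        if state == 'halt':
            return ''.join(tape[i] for i in sorted(tape)).strip('_')
        symbol = tape.get(pos, '_')
        new_symbol, move, state = machine[(state, symbol)]
        tape[pos] = new_symbol
        pos += move
    raise RuntimeError("didn't halt within the step limit")

# A one-state machine that flips every bit, then halts at the first blank.
flipper = {
    ('start', '0'): ('1', +1, 'start'),
    ('start', '1'): ('0', +1, 'start'),
    ('start', '_'): ('_', +1, 'halt'),
}
print(run(flipper, '0110'))  # → 1001
```

Notice that everything here is finite except the tape itself -- exactly the requirements above.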

Turing's first result is the existence of a "universal" machine: a machine
whose job is to simulate any other machine described via symbols on the tape. In
other words, *universal programmable computers can exist*. You don't have
to build one machine for email, another for playing DVD's, another for Tomb
Raider, and so on: you can build a single machine that simulates any of the
other machines, by running different programs stored in memory. This result is
actually a *lemma*, which Turing uses to prove his "real" result.

So what's the real result? It's that there's a basic problem, called the halting problem, that no program can ever solve. The halting problem is this: we're given a program, and we want to decide if it ever halts. Of course we can run the program for a while, but what if the program hasn't halted after a million years? At what point should we give up?

One piece of evidence that this problem might be hard is that, if we
*could* solve it, then we could also solve many famous unsolved math
problems. For example, Goldbach's Conjecture says that every even number 4 or
greater can be written as a sum of two primes. Now, we can easily write a
program that tests 4, 6, 8, and so on, halting only if it finds a number that
can't be written as a sum of two primes. Then deciding whether that program ever
halts is equivalent to deciding the truth of Goldbach's Conjecture.
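
That Goldbach-testing program can be written out explicitly. Here's a sketch in Python; the `limit` parameter is my addition so the demo terminates, whereas the "real" program from the argument would have no cutoff:

```python
def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def goldbach_holds(n):
    """Is the even number n a sum of two primes?"""
    return any(is_prime(p) and is_prime(n - p) for p in range(2, n // 2 + 1))

def search_for_counterexample(limit=None):
    """Halts only upon finding an even number >= 4 that is NOT a sum of
    two primes. (`limit` is purely for demonstration purposes.)"""
    n = 4
    while limit is None or n <= limit:
        if not goldbach_holds(n):
            return n  # halt: Goldbach's Conjecture is false!
        n += 2
    return None  # gave up (demo only)

print(search_for_counterexample(limit=1000))  # → None
```

Call `search_for_counterexample()` with no limit, and deciding whether that call ever returns is deciding Goldbach.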

But can we *prove* there's no program to solve the halting problem? This
is what Turing does. His key idea is not even to *try* to analyze the
internal dynamics of such a program, supposing it existed. Instead he simply
says, suppose by way of contradiction that such a program P exists. Then we can
modify P to produce a new program P' that does the following. Given another
program Q as input, P'

- runs forever if Q halts given its own code as input, or
- halts if Q runs forever given its own code as input.

Now feed P' its own code as input. Then P' must run forever if it halts, and halt if it runs forever -- a contradiction. Therefore P can't exist, and the halting problem is unsolvable.
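
The construction can be sketched in Python. To be clear, the decider `halts` is precisely the thing we're assuming, for contradiction, to exist; it cannot actually be implemented:

```python
def make_P_prime(halts):
    """Given a purported halting decider -- halts(program, input_code)
    returns True iff `program` halts on `input_code` -- build the
    self-defeating program P' from Turing's argument."""
    def P_prime(Q):
        if halts(Q, Q):      # Q halts given its own code as input...
            while True:      # ...so P' runs forever
                pass
        else:
            return           # ...otherwise P' halts
    return P_prime

# Running P' on its own code now forces the contradiction:
# halts(P', P') == True  would make P' run forever, while
# halts(P', P') == False would make P' halt.
```

No analysis of the internal dynamics of `halts` is needed anywhere; the contradiction comes purely from feeding the construction to itself.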

As I said, once you have Turing's results, Gödel's results fall out for free as a bonus. Why? Well, suppose the Incompleteness Theorem was false -- that is, there existed a consistent, computable proof system F from which any statement about integers could be either proved or disproved. Then given a computer program, we could simply search through every possible proof in F, until we found either a proof that the program halts or a proof that it doesn't halt. (This is possible because the statement that a particular computer program halts is ultimately just a statement about integers.) But this would give us an algorithm to solve the halting problem, which we already know is impossible. Therefore F can't exist.

By thinking more carefully, we can actually squeeze out a stronger result. Let P be a program that, given as input another program Q, tries to decide whether Q halts by the strategy above (i.e., searching through every possible proof and disproof that Q halts in some formal system F). Then as in Turing's proof, suppose we modify P to produce a new program P' that

- runs forever if Q is proved to halt given its own code as input, or
- halts if Q is proved to run forever given its own code as input.

Now feed P' its own code as input. Then, assuming F is consistent, P' must run forever without ever finding a proof of either fact: if it found a proof that it halts, it would run forever, and if it found a proof that it runs forever, it would halt. So "P' runs forever given its own code as input" is a true statement that F can never prove.

But there's an obvious paradox: why isn't the above argument, *itself*,
a proof that P' will run forever given its own code as input? And why won't P'
discover this proof that it runs forever -- and therefore halt, and therefore
run forever, and therefore halt, etc.?

The answer is that, in "proving" that P' runs forever, we made a hidden assumption: namely that the proof system F is consistent. If F was inconsistent, then there could perfectly well be a proof that P' halts, even if the reality was that P' ran forever.

But this means that, if F could *prove* that F was consistent, then F
could also prove that P' ran forever -- thereby bringing back the above
contradiction. The only possible conclusion is that *if F is consistent, then
F can't prove its own consistency*. This result is sometimes called Gödel's
Second Incompleteness Theorem.

The Second Incompleteness Theorem establishes what we maybe should have
expected all along: that the only mathematical theories pompous enough to prove
their own consistency, are the ones that don't *have* any consistency to
brag about! If we want to prove that a theory F is consistent, then we can only
do it within a *more powerful* theory -- a trivial example being F+Con(F)
(F plus the axiom that F is consistent). But then how do we know that F+Con(F)
is itself consistent? Well, we can only prove that in a still stronger theory:
F+Con(F)+Con(F+Con(F)) (F+Con(F) plus the axiom that F+Con(F) is consistent).
And so on infinitely. (Indeed, even beyond infinitely, into the countable
ordinals.)

To take a concrete example: the Second Incompleteness Theorem tells us that
the most popular axiom system for the integers, Peano Arithmetic, can't prove
its own consistency. Or in symbols, PA can't prove Con(PA). If we want to prove
Con(PA), then we need to move to a stronger axiom system, such as ZF (the
Zermelo-Fraenkel axioms for set theory). In ZF we can prove Con(PA) pretty
easily, by using the Axiom of Infinity to conjure up an infinite set that then
serves as a *model* for PA.

On the other hand, again by the Second Incompleteness Theorem, ZF can't prove
its *own* consistency. If we want to prove Con(ZF), the simplest way to do
it is to posit the existence of infinities bigger than anything that can be
defined in ZF. Such infinities are called "large cardinals." (When set theorists
say large, they *mean* large.) Once again, we can prove the consistency of
ZF in ZF+LC (where LC is the axiom that large cardinals exist). But if we want
to prove that ZF+LC is itself consistent, then we need a still more powerful
theory, such as one with even *bigger* infinities.

A quick question to test your understanding: while we can't prove Con(PA) in PA, can we at least prove in PA that Con(PA) implies Con(ZF)?

No, we can't. For then we could also prove in ZF that Con(PA) implies Con(ZF). But since ZF can prove Con(PA), this would mean that ZF can prove Con(ZF), which contradicts the Second Incompleteness Theorem.

I promised to explain why the Incompleteness Theorem doesn't contradict the
Completeness Theorem. The easiest way to do this is probably through an example.
Consider the "self-hating theory" PA+Not(Con(PA)), or Peano Arithmetic *plus
the assertion of its own inconsistency*. We know that if PA is consistent,
then this strange theory must be consistent as well -- since otherwise PA would
prove its own consistency, which the Incompleteness Theorem doesn't allow. It
follows, by the Completeness Theorem, that PA+Not(Con(PA)) must have a model.
But what could such a model possibly look like? In particular, what would happen
if, within that model, you just asked to *see* the proof that PA was
inconsistent?

I'll tell you what would happen: the axioms would tell you that the proof of PA's
inconsistency is encoded by a positive integer X. And then you would say, "but
what *is* X?" And the axioms would say, "X." And you would say, "But what
is X, as an *ordinary positive integer*?"

"No, no, no! Talk to the axioms."

"Alright, is X greater or less than 10^{500,000}?"

"Greater." (The axioms aren't stupid: they know that if they said "smaller", then you could simply try every smaller number and verify that none of them encode a proof of PA's inconsistency.)

"Alright then, what's X+1?"

"Y."

And so on. The axioms will keep cooking up fictitious numbers to satisfy your
requests, and assuming that PA itself is consistent, you'll never be able to
trap them in an inconsistency. The point of the Completeness Theorem is that the
whole infinite set of fictitious numbers the axioms cook up will constitute a
*model* for PA -- just not the usual model (i.e., the ordinary positive
integers)! If we insist on talking about the usual model, then we switch from
the domain of the Completeness Theorem to the domain of the Incompleteness
Theorem.

Do you remember the puzzle from Thursday? The puzzle was whether there's any
theorem that can only be proved by assuming as an axiom that it *can* be
proved. In other words, does "just believing in yourself" make any formal
difference in mathematics? We're now in a position to answer that question.

Let's suppose, for concreteness, that the theorem we want to prove is the
Riemann Hypothesis (RH), and the formal system we want to prove it in is
Zermelo-Fraenkel set theory (ZF). Suppose we can prove in ZF that, if ZF proves
RH, then RH is true. Then taking the contrapositive, we can also prove in ZF
that if RH is false, then ZF does *not* prove RH. In other words, we can
prove in ZF+not(RH) that not(RH) is perfectly consistent with ZF. But this means
that the theory ZF+not(RH) proves its own consistency -- and this, by Gödel,
means that ZF+not(RH) is inconsistent. But saying that ZF+not(RH) is
inconsistent is equivalent to saying that RH is a theorem of ZF. Therefore we've
proved RH. In general we find that, if a statement can be proved by assuming as
an axiom that it's provable, then it can also be proved *without* assuming
that axiom. This result is known as Löb's Theorem (again with the umlauts),
though personally I think that a better name would be the
"You-Had-The-Mojo-All-Along Theorem."

Oh, you remember on Thursday we talked about the Axiom of Choice and the Continuum Hypothesis? Natural statements about the continuum that, since the continuum is such a well-defined mathematical entity, must certainly be either true or false? So, how did those things ever get decided? Well, Gödel proved in 1939 that assuming the Axiom of Choice (AC) or the Continuum Hypothesis (CH) can never lead to an inconsistency. In other words, if the theories ZF+AC or ZF+CH were inconsistent, that could only be because ZF itself was inconsistent.

This raised an obvious question: can we also consistently assume that AC and
CH are *false*? Gödel worked on this problem but wasn't able to answer it.
Finally Paul Cohen gave an affirmative answer in 1963, by inventing a new
technique called "forcing." (For that, he won the only Fields Medal that's ever
been given for set theory and the foundations of math.)

So, we now know that the usual axioms of mathematics don't decide the Axiom of Choice and the Continuum Hypothesis one way or another. You're free to believe both, neither, or one and not the other without fear of contradiction. And sure enough, opinion among mathematicians about AC and CH remains divided to this day, with many interesting arguments for and against, whose details we unfortunately don't have time to explore.

Let me end with a possibly-surprising observation: the independence of AC and
CH from ZF set theory *is itself a theorem of Peano Arithmetic*. For,
ultimately, Gödel and Cohen's consistency theorems boil down to combinatorial
assertions about manipulations of first-order sentences -- which can in
principle be proven directly, without ever thinking about the transfinite sets
that those sentences purport to describe. This provides a nice illustration of
what, to me, is the central philosophical question underlying this whole
business: do we ever *really* talk about the continuum, or do we only ever
talk about finite sequences of symbols that talk about the continuum?

**Bonus Addendum**

What does any of this have to do with quantum mechanics? I will now attempt
the heroic task of making a connection. What I've tried to impress on you is
that there are profound difficulties if we want to assume the world is
continuous. Take this pen, for example: how many different positions can I put
it in on the surface of the table? ℵ_{1}? More than
ℵ_{1}? Less than ℵ_{1}?
We don't want the answers to "physics" questions to depend on the axioms of set
theory!

Ah, but you say my question is physically meaningless, since the pen's
position could never actually be measured to infinite precision? Sure -- but the
point is, you need a physical theory to *tell* you that!

Of course, quantum mechanics gets its very name from the fact that a lot of
the observables in the theory, like energy levels, are discrete -- "quantized."
This seems paradoxical, since one of the criticisms that computer scientists
level against quantum computing is that, as they see it, it's a
*continuous* model of computation!

My own view is that quantum mechanics, like classical probability theory,
should be seen as somehow "intermediate" between a continuous and discrete
theory. (Here I'm assuming that the Hilbert space or probability space are
finite-dimensional.) What I mean is that, while there *are* continuous
parameters (the probabilities or amplitudes respectively), those parameters are
not directly observable, and that has the effect of "shielding" us from the
bizarro universe of the Axiom of Choice and the Continuum Hypothesis. We don't
need a detailed physical theory to tell us that whether amplitudes are rational
or irrational, whether there are more or less than ℵ_{1} possible amplitudes, etc., are physically
meaningless questions. This follows directly from the fact that, if we wanted to
learn an amplitude exactly, then (even assuming no error!) we would need to
measure the appropriate state infinitely many times.

**Homework Assignment**

Let BB(n), or the "nth Busy Beaver number," be the maximum number of steps that an n-state Turing machine can make on an initially blank tape before halting. (Here the maximum is over all n-state Turing machines that eventually halt.)

- Prove that BB(n) grows faster than any computable function.

- Let S = 1/BB(1) + 1/BB(2) + 1/BB(3) + ...

Is S a computable real number? In other words, is there an algorithm that, given as input a positive integer k, outputs a rational number S' such that |S-S'|<1/k?
