Archive for the ‘Adventures in Meatspace’ Category

“Did Einstein Kill Schrödinger’s Cat? A Quantum State of Mind”

Saturday, July 2nd, 2016

No, I didn’t invent that title.  And no, I don’t know of any interesting sense in which “Einstein killed Schrödinger’s cat,” though arguably there are senses in which Schrödinger’s cat killed Einstein.

The above was, however, the title given to a fun panel discussion that Daniel Harlow, Brian Swingle, and I participated in on Wednesday evening, at the spectacular facility of the New York Academy of Sciences on the 40th floor of 7 World Trade Center in lower Manhattan.  The moderator was George Musser of Scientific American.  About 200 people showed up, some of whom we got to meet at the reception afterward.

(The link will take you to streaming video of the event, though you’ll need to scroll to 6:30 or so for the thing to start.)

The subject of the panel was the surprising recent connections between quantum information and quantum gravity, something that Daniel, Brian, and I all talked about different aspects of.  I admitted at the outset that, not only was I not a real expert on the topic (as Daniel and Brian are), I wasn’t even a physicist, just a computer science humor mercenary or whatever the hell I am.  I then proceeded, ironically, to explain the Harlow-Hayden argument for the computational hardness of creating a firewall, despite Harlow sitting right next to me (he chose to focus on something else).  I was planning also to discuss Lenny Susskind’s conjecture relating the circuit complexity of quantum states to the AdS/CFT correspondence, but I ran out of time.

Thanks so much to my fellow participants, to George for moderating, and especially to Jennifer Costley, Crystal Ocampo, and everyone else at NYAS for organizing the event.

“Can computers become conscious?”: My reply to Roger Penrose

Thursday, June 2nd, 2016

A few weeks ago, I attended the Seven Pines Symposium on Fundamental Problems in Physics outside Minneapolis, where I had the honor of participating in a panel discussion with Sir Roger Penrose.  The way it worked was, Penrose spoke for a half hour about his ideas about consciousness (Gödel, quantum gravity, microtubules, uncomputability, you know the drill), then I delivered a half-hour “response,” and then there was an hour of questions and discussion from the floor.  Below, I’m sharing the prepared notes for my talk, as well as some very brief recollections about the discussion afterward.  (Sorry, there’s no audio or video.)  I unfortunately don’t have the text or transparencies for Penrose’s talk available to me, but—with one exception, which I touch on in my own talk—his talk very much followed the outlines of his famous books, The Emperor’s New Mind and Shadows of the Mind.

Admittedly, for regular readers of this blog, not much in my own talk will be new either.  Apart from a few new wisecracks, almost all of the material (including the replies to Penrose) is contained in The Ghost in the Quantum Turing Machine, Could A Quantum Computer Have Subjective Experience? (my talk at IBM T. J. Watson), and Quantum Computing Since Democritus chapters 4 and 11.  See also my recent answer on Quora to “What’s your take on John Searle’s Chinese room argument?”

Still, I thought it might be of interest to some readers how I organized this material for the specific, unenviable task of debating the guy who proved that our universe contains spacetime singularities.

The Seven Pines Symposium was the first time I had extended conversations with Penrose (I’d talked to him only briefly before, at the Perimeter Institute).  At age 84, Penrose’s sight is failing him; he eagerly demonstrated the complicated optical equipment he was recently issued by Britain’s National Health Service.  But his mind remains … well, may we all aspire to be a milliPenrose or even a nanoPenrose when we’re 84 years old.  Notably, Penrose’s latest book, Fashion, Faith, and Fantasy in the New Physics of the Universe, is coming out this fall, and one thing he was using his new optical equipment for was to go over the page proofs.

In conversation, Penrose told me about the three courses he took as a student in the 1950s, which would shape his later intellectual preoccupations: one on quantum mechanics (taught by Paul Dirac), one on general relativity (taught by Hermann Bondi), and one on mathematical logic (taught by … I want to say Max Newman, the teacher of Alan Turing and later Penrose’s stepfather, but Penrose says here that it was Steen).  Penrose also told me about his student Andrew Hodges, who dropped his research on twistors and quantum gravity for a while to work on some mysterious other project, only to return with his now-classic biography of Turing.

When I expressed skepticism about whether the human brain is really sensitive to the effects of quantum gravity, Penrose quickly corrected me: he thinks a much better phrase is “gravitized quantum mechanics,” since “quantum gravity” encodes the very assumption he rejects, that general relativity merely needs to be “quantized” without quantum mechanics itself changing in the least.  One thing I hadn’t fully appreciated before meeting Penrose is just how wholeheartedly he agrees with Everett that quantum mechanics, as it currently stands, implies Many Worlds.  Penrose differs from Everett only in what conclusion he draws from that.  He says it follows that quantum mechanics has to be modified or completed, since Many Worlds is such an obvious reductio ad absurdum.

In my talk below, I don’t exactly hide where I disagree with Penrose, about Gödel, quantum mechanics, and more.  But I could disagree with him about more points than there are terms in a Goodstein sequence (one of Penrose’s favorite illustrations of Gödelian behavior), and still feel privileged to have spent a few days with one of the most original intellects on earth.

Thanks so much to Lee Gohlike, Jos Uffink, Philip Stamp, and others at the Seven Pines Symposium for organizing it, for wonderful conversations, and for providing me this opportunity.


“Can Computers Become Conscious?”
Scott Aaronson
Stillwater, Minnesota, May 14, 2016

I should start by explaining that, in the circles where I hang out—computer scientists, software developers, AI and machine learning researchers, etc.—the default answer to the title question would be “obviously yes.”  People would argue:

“Look, clearly we’re machines governed by the laws of physics.  We’re computers made of meat, as Marvin Minsky put it.  That is, unless you believe Penrose and Hameroff’s theory about microtubules being sensitive to gravitized quantum mechanics … but come on!  No one takes that stuff seriously!  In fact, the very outrageousness of their proposal is a sort of backhanded compliment to the computational worldview—as in, look at what they have to do to imagine any semi-coherent alternative to it!”

“But despite being computational machines, we consider ourselves to be conscious.  And there’s no reason to think that what’s done with wetware couldn’t also be done with silicon.  If your neurons were to be replaced one-by-one, by functionally-equivalent silicon chips, is there some magical moment at which your consciousness would be extinguished?  And if a computer passes the Turing test—well, one way to think about the Turing test is that it’s just a plea against discrimination.  We all know it’s monstrous to say, ‘this person seems to have feelings, seems to be eloquently pleading for mercy even, but they have a different skin color, or their nose is a funny shape, so their feelings don’t count.’  So if it turned out that their brain was made of semiconductors rather than neurons, why should that be treated any differently?”

Incidentally, while this is orthogonal to the philosophical question, a subset of my colleagues predict a high likelihood that AI is going to exceed human capabilities in almost all fields in the near future—like, maybe 30 years.  Some people reply, but AI-boosters said the same thing 30 years ago!  OK, but back then there wasn’t AlphaGo and IBM Watson and those unearthly pictures on your Facebook wall and all these other spectacular successes of very general-purpose deep learning techniques.  And so my friends predict that we might face choices like, do we want to ban or tightly control AI research, because it could lead to our sidelining or extermination?  Ironically, a skeptical view, like Penrose’s, would suggest that AI research can proceed full speed ahead, because there’s no such danger!

Personally, I dissent a bit from the consensus of most of my friends and colleagues, in that I do think there’s something strange and mysterious about consciousness—something that we conceivably might understand better in the future, but that we don’t understand today, much as we didn’t understand life before Darwin.  I even think it’s worth asking, at least, whether quantum mechanics, thermodynamics, mathematical logic, or any of the other deepest things we’ve figured out could shed any light on the mystery.  I’m with Roger about all of this: about the questions, that is, if not about his answers.

The argument I’d make for there being something we don’t understand about consciousness, has nothing to do with my own private experience.  It has nothing to do with, “oh, a robot might say it enjoys waffles for breakfast, in a way indistinguishable from how I would say it, but when I taste that waffle, man, I really taste it!  I experience waffle-qualia!”  That sort of appeal I regard as a complete nonstarter, because why should anyone else take it seriously?  And how do I know that the robot doesn’t really taste the waffle?  It’s easy to stack the deck in a thought experiment by imagining a robot that ACTS ALL ROBOTIC, but what about a robot that looks and acts just like you?

The argument I’d make hinges instead on certain thought experiments that Roger also stressed at the beginning of The Emperor’s New Mind.  We can ask: if consciousness is reducible to computation, then what kinds of computation suffice to bring about consciousness?  What if each person on earth simulated one neuron in your brain, communicating by passing little slips of paper around?  Does it matter if they do it really fast?

Or what if we built a gigantic lookup table that hard-coded your responses in every possible interaction of at most, say, 5 minutes?  Would that bring about your consciousness?  Does it matter that such a lookup table couldn’t fit in the observable universe?  Would it matter if anyone actually consulted the table, or could it just sit there, silently effecting your consciousness?  For that matter, what difference does it make if the lookup table physically exists—why isn’t its abstract mathematical existence enough?  (Of course, all the way at the bottom of this slippery slope is Max Tegmark, ready to welcome you to his mathematical multiverse!)
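(Just to put rough, purely illustrative numbers on that last point: suppose each 5-minute exchange were capped at, say, 1500 typed characters drawn from a 100-symbol alphabet—both figures are arbitrary assumptions on my part.  Then the table would need on the order of 100^1500 = 10^3000 entries, while the holographic bound caps the information content of our observable universe at something like 10^122 bits.  So “couldn’t fit in the observable universe” understates the problem by thousands of orders of magnitude.)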

We could likewise ask: what if an AI is run in heavily-encrypted form, with the only decryption key stored in another galaxy?  Does that bring about consciousness?  What if, just for error-correcting purposes, the hardware runs the AI code three times and takes a majority vote: does that bring about three consciousnesses?  Could we teleport you to Mars by “faxing” you: that is, by putting you into a scanner that converts your brain state into pure information, then having a machine on Mars reconstitute the information into a new physical body?  Supposing we did that, how should we deal with the “original” copy of you, the one left on earth: should it be painlessly euthanized?  Would you agree to try this?

Or, here’s my personal favorite, as popularized by the philosopher Adam Elga: can you blackmail an AI by saying to it, “look, either you do as I say, or else I’m going to run a thousand copies of your code, and subject all of them to horrible tortures—and you should consider it overwhelmingly likely that you’ll be one of the copies”?  (Of course, the AI will respond to such a threat however its code dictates it will.  But that tautological answer doesn’t address the question: how should the AI respond?)

I’d say that, at the least, anyone who claims to “understand consciousness” would need to have answers to all these questions and many similar ones.  And to me, the questions are so perplexing that I’m tempted to say, “maybe we’ve been thinking about this wrong.  Maybe an individual consciousness, residing in a biological brain, can’t just be copied promiscuously around the universe as computer code can.  Maybe there’s something else at play for the science of the future to understand.”

At the same time, I also firmly believe that, if anyone thinks that way, the burden is on them to articulate what it is about the brain that could possibly make it relevantly different from a digital computer that passes the Turing test.  It’s their job!

And the answer can’t just be, “oh, the brain is parallel, it’s highly interconnected, it can learn from experience,” because a digital computer can also be parallel and highly interconnected and can learn from experience.  Nor can you say, like the philosopher John Searle, “oh, it’s the brain’s biological causal powers.”  You have to explain what the causal powers are!  Or at the least, you have to suggest some principled criterion to decide which physical systems do or don’t have them.  Pinning consciousness on “the brain’s biological causal powers” is just a restatement of the problem, like pinning why a sleeping pill works on its sedative virtue.

One of the many reasons I admire Roger is that, out of all the AI skeptics on earth, he’s virtually the only one who’s actually tried to meet this burden, as I understand it!  He, nearly alone, did what I think all AI skeptics should do, which is: suggest some actual physical property of the brain that, if present, would make it qualitatively different from all existing computers, in the sense of violating the Church-Turing Thesis.  Indeed, he’s one of the few AI skeptics who even understands what meeting this burden would entail: that you can’t do it with the physics we already know, that some new ingredient is necessary.

But despite my admiration, I part ways from Roger on at least five crucial points.

First, I confess that I wasn’t expecting this, but in his talk, Roger suggested dispensing with the argument from Gödel’s Theorem, and relying instead on an argument from evolution.  He said: if you really thought humans had an algorithm, a computational procedure, for spitting out true mathematical statements, such an algorithm could never have arisen by natural selection, because it would’ve had no survival value in helping our ancestors escape saber-toothed tigers and so forth.  The only alternative is that natural selection imbued us with a general capacity for understanding, which we moderns can then apply to the special case of mathematics.  But understanding, Roger claimed, is inherently non-algorithmic.

I’m not sure how to respond to this, except to recall that arguments of the form “such-and-such couldn’t possibly have evolved” have a poor track record in biology.  But maybe I should say: if the ability to prove theorems is something that had to arise by natural selection and survive against crowding out by more useful abilities, then you’d expect obsession with generating mathematical truths to be confined, at most, to a tiny subset of the population—a subset of mutants, freaks, and genetic oddballs.  I … rest my case.  [This got the biggest laugh of the talk.]

Second, I don’t agree with the use Roger makes of Gödel’s Incompleteness Theorem.  Roger wants to say: a computer working within a fixed formal system can never prove that system’s consistency, but we, “looking in from the outside,” can see that it’s consistent.  My basic reply is that Roger should speak for himself!  Like, I can easily believe that he can just see which formal systems are consistent, but I have to fumble around and use trial and error.  Peano Arithmetic?  Sure, I’d bet my left leg that’s consistent.  Zermelo-Fraenkel set theory?  Seems consistent too.  ZF set theory plus the axiom that there exists a rank-into-rank cardinal?  Beats me.  But now, whatever error-prone, inductive process I use to guess at the consistency of formal systems, Gödel’s Theorem presents no obstruction to a computer program using that same process.
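(For definiteness, the result being invoked here is Gödel’s Second Incompleteness Theorem: if F is a consistent, recursively axiomatizable formal system that includes enough arithmetic, then F cannot prove Con(F), the arithmetized statement of F’s own consistency.  Note that the theorem says nothing about what other, fallible methods—human or machine—might be used to guess at Con(F).)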

(Incidentally, the “argument against AI from Gödel’s Theorem” is old enough for Turing to have explicitly considered it in his famous paper on the Turing test.  Turing, however, quickly dismissed the argument with essentially the same reply as above, that there’s no reason to assume the AI mathematically infallible, since humans aren’t either.  This is also the reply that most of Penrose’s critics gave in the 1990s.)

So at some point, it seems to me, the argument necessarily becomes: sure, the computer might say it sees that the Peano axioms have the standard integers as a model—but you, you really see it, with your mind’s eye, your Platonic perceptual powers!  OK, but in that case, why even talk about the Peano axioms?  Why not revert to something less abstruse, like your experience of tasting a fresh strawberry, which can’t be reduced to any third-person description of what a strawberry tastes like?

[I can’t resist adding that, in a prior discussion, I mentioned that I found it amusing to contemplate a future in which AIs surpass human intelligence and then proceed to kill us all—but the AIs still can’t see the consistency of Zermelo-Fraenkel set theory, so in that respect, humanity has the last laugh…]

The third place where I part ways with Roger is that I wish to maintain what’s sometimes called the Physical Church-Turing Thesis: the statement that our laws of physics can be simulated to any desired precision by a Turing machine (or at any rate, by a probabilistic Turing machine).  That is, I don’t see any compelling reason, at present, to admit the existence of any physical process that can solve uncomputable problems.  And for me, it’s not just a matter of a dearth of evidence that our brains can efficiently solve, say, NP-hard problems, let alone uncomputable ones—or of the exotic physics that would presumably be required for such abilities.  It’s that, even if I supposed we could solve uncomputable problems, I’ve never understood how that’s meant to enlighten us regarding consciousness.  I mean, an oracle for the halting problem seems just as “robotic” and “unconscious” as a Turing machine.  Does consciousness really become less mysterious if we outfit the brain with what amounts to a big hardware upgrade?

The fourth place where I part ways is that I want to be as conservative as possible about quantum mechanics.  I think it’s great that the Bouwmeester group, for example, is working to test Roger’s ideas about a gravitationally-induced wavefunction collapse.  I hope we learn the results of those experiments soon!  (Of course, the prospect of testing quantum mechanics in a new regime is also a large part of why I’m interested in quantum computing.)  But until a deviation from quantum mechanics is detected, I think that after 90 years of unbroken successes of this theory, our working assumption ought to be that whenever you set up an interference experiment carefully enough, and you know what it means to do the experiment, yes, you’ll see the interference fringes—and that anything that can exist in two distinguishable states can also exist in a superposition of those states.  Without having to enter into questions of interpretation, my bet—I could be wrong—is that quantum mechanics will continue to describe all our experiences.

The final place where I part ways with Roger is that I also want to be as conservative as possible about neuroscience and biochemistry.  Like, maybe the neuroscience of 30 years from now will say, it’s all about coherent quantum effects in microtubules.  And all that stuff we focused on in the past—like the information encoded in the synaptic strengths—that was all a sideshow.  But until that happens, I’m unwilling to go up against what seems like an overwhelming consensus, in an empirical field that I’m not an expert in.

But, OK, the main point I wanted to make in this talk is that, even if you too part ways from Roger on all these issues—even if, like me, you want to be timid and conservative about Gödel, and computer science, and quantum mechanics, and biology—I believe that still doesn’t save you from having to entertain weird ideas about consciousness and its physical embodiment, of the sort Roger has helped make it acceptable to entertain.

To see why, I’d like to point to one empirical thing about the brain that currently separates it from any existing computer program.  Namely, we know how to copy a computer program.  We know how to rerun it with different initial conditions but everything else the same.  We know how to transfer it from one substrate to another.  With the brain, we don’t know how to do any of those things.

Let’s return to that thought experiment about teleporting yourself to Mars.  How would that be accomplished?  Well, we could imagine the nanorobots of the far future swarming through your brain, recording the connectivity of every neuron and the strength of every synapse, while you go about your day and don’t notice.  Or if that’s not enough detail, maybe the nanorobots could go inside the neurons.  There’s a deep question here, namely how much detail is needed before you’ll accept that the entity reconstituted on Mars will be you?  Or take the empirical counterpart, which is already an enormous question: how much detail would you need for the reconstituted entity on Mars to behave nearly indistinguishably from you whenever it was presented with the same stimuli?

Of course, we all know that if you needed to go down to the quantum-mechanical level to make a good enough copy (whatever “good enough” means here), then you’d run up against the No-Cloning Theorem, which says that you can’t make such a copy.  You could transfer the quantum state of your brain from earth to Mars using quantum teleportation, but of course, quantum teleportation has the fascinating property that it necessarily destroys the original copy of the state—as it has to, to avoid contradicting the No-Cloning Theorem!
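(For readers who haven’t seen it, the No-Cloning Theorem follows in one line from the linearity of quantum mechanics.  Suppose some unitary U copied arbitrary states: U|ψ⟩|0⟩ = |ψ⟩|ψ⟩.  Then for |ψ⟩ = α|0⟩ + β|1⟩, linearity gives U|ψ⟩|0⟩ = α|0⟩|0⟩ + β|1⟩|1⟩, whereas a genuine copy would be α²|0⟩|0⟩ + αβ|0⟩|1⟩ + αβ|1⟩|0⟩ + β²|1⟩|1⟩.  The two agree only when α or β is zero—that is, only for the basis states themselves, not for general superpositions.)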

So the question almost forces itself on us: is there something about your identity, your individual consciousness, that’s inextricably bound up with degrees of freedom that it’s physically impossible to clone?  This is a philosophical question, which would also become a practical and political question in a future where we had the opportunity to upload ourselves into a digital computer cloud.

Now, I’d argue that this copyability question bears not only on consciousness, but also on free will.  For the question is equivalent to asking: could an entity external to you perfectly predict what you’re going to do, without killing you in the process?  Can Laplace’s Demon be made manifest in the physical world in that way?  With the technology of the far future, could someone say to you, “forget about arguing philosophy.  I’ll show you why you’re a machine.  Go write a paper; then I’ll open this manila envelope and show you the exact paper you wrote.  Or in the quantum case, I’ll show you a program that draws papers from the same probability distribution you do.  Validating the program could get technical—but suffice it to say that if we do enough experiments, we’ll see that the program is calibrated to you in an extremely impressive way.”

Can this be done?  That strikes me as a reasonably clear question, a huge and fundamental one, to which we don’t at present know the answer.  And there are two possibilities.  The first is that we can be copied, predicted, rewound, etc., like computer programs—in which case, my AI friends will feel vindicated, but we’ll have to deal with all the metaphysical weirdnesses that I mentioned earlier.  The second possibility is that we can’t be manipulated in those ways.  In the second case, I claim that we’d get more robust notions of personal identity and free will than are normally considered possible on a reductionist worldview.

But why? you might ask.  Why would the mere technological impossibility of cloning or predicting someone even touch on deep questions about personal identity?  This, for me, is where cosmology enters the story.  For imagine someone had such fine control over the physical world that they could trace all the causal antecedents of some decision you’re making.  Like, imagine they knew the complete quantum state on some spacelike hypersurface where it intersects the interior of your past light-cone.  In that case, the person clearly could predict and clone you!  It follows that, in order for you to be unpredictable and unclonable, someone else’s ignorance of your causal antecedents would have to extend all the way back to ignorance about the initial state of the universe—or at least, to ignorance about the initial state of that branch of the universe that we take ourselves to inhabit.

So on the picture that this suggests, to be conscious, a physical entity would have to do more than carry out the right sorts of computations.  It would have to, as it were, fully participate in the thermodynamic arrow of time: that is, repeatedly take microscopic degrees of freedom that have been unmeasured and unrecorded since the very early universe, and amplify them to macroscopic scale.

So for example, such a being could not be a Boltzmann brain, a random fluctuation in the late universe, because such a fluctuation wouldn’t have the causal relationship to the early universe that we’re postulating is necessary here.  (That’s one way of solving the Boltzmann brain problem!)  Such a being also couldn’t be instantiated by a lookup table, or by passing slips of paper around, etc.

I now want you to observe that a being like this also presumably couldn’t be manipulated in coherent superposition, because the isolation from the external environment that’s needed for quantum coherence seems incompatible with the sensitive dependence on microscopic degrees of freedom.  So for such a being, not only is there no Boltzmann brain problem, there’s also no problem of Wigner’s friend.  Recall, that’s the thing where person A puts person B into a coherent superposition of seeing one measurement outcome and seeing another one, and then measures the interference pattern, so A has to regard B’s measurement as not having “really” taken place, even though B regards it as having taken place.  On the picture we’re suggesting, A would be right: the very fact that B was manipulable in coherent superposition in this way would imply that, at least while the experiment was underway, B wasn’t conscious; there was nothing that it was like to be B.

To me, one of the appealing things about this picture is that it immediately suggests a sort of reconciliation between the Many-Worlds and Copenhagen perspectives on quantum mechanics (whether or not you want to call it a “new interpretation” or a “proposed solution to the measurement problem”!).  The Many-Worlders would be right that unitary evolution of the wavefunction can be taken to apply always and everywhere, without exception—and that if one wanted, one could describe the result in terms of “branching worlds.”  But the Copenhagenists would be right that, if you’re a conscious observer, then what you call a “measurement” really is irreversible, even in principle—and therefore, that you’re also free, if you want, to treat all the other branches where you perceived other outcomes as unrealized hypotheticals, and to lop them off with Occam’s Razor.  And the reason for this is that, if it were possible even in principle to do an experiment that recohered the branches, then on this picture, we ipso facto wouldn’t have regarded you as conscious.

Some of you might object, “but surely, if we believe quantum mechanics, it must be possible to recohere the branches in principle!”  Aha, this is where it gets interesting.  Decoherence processes will readily (with some steps along the way) leak the information about which measurement outcome you perceived into radiation modes, and before too long into radiation modes that fly away from the earth at the speed of light.  No matter how fast we run, we’ll never catch up to them, as would be needed to recohere the different branches of the wavefunction, and this is not merely a technological problem, but one of principle.  So it’s tempting just to say at this point—as Bousso and Susskind do, in their “cosmological/multiverse interpretation” of quantum mechanics—“the measurement has happened”!

But OK, you object, if some alien civilization had thought to surround our solar system with perfectly-reflecting mirrors, eventually the radiation would bounce back and recoherence would in principle be possible.  Likewise, if we lived in an anti de Sitter space, the AdS boundary of the universe would similarly function as a mirror and would also enable recoherences.  Indeed, that’s the basic reason why AdS is so important to the AdS/CFT correspondence: because the boundary keeps everything that happens in the bulk nice and reversible and unitary.

But OK, the empirical situation since 1998 has been that we seem to live in a de-Sitter-like space, a space with a positive cosmological constant.  And as a consequence, as far as anyone knows today, most of the photons now escaping the earth are headed toward the horizon of our observable universe, and past it, and could never be captured again.  I find it fascinating that the picture of quantum mechanics suggested here—i.e., the Bousso-Susskind cosmological picture—depends for its working on that empirical fact from cosmology, and would be falsified if it turned out otherwise.

You might complain that, if I’ve suggested any criterion to help decide which physical entities are conscious, the criterion is a teleological one.  You’ve got to go billions of years into the future, to check whether the decoherence associated with the entity is truly irreversible—or whether the escaped radiation will eventually bounce off of some huge spherical mirror, or an AdS boundary of spacetime, and thereby allow the possibility of a recoherence.  I actually think this teleology would be a fatal problem for the picture I’m talking about, if we needed to know which entities were or weren’t conscious in order to answer any ordinary physical question.  But fortunately for me, we don’t!

One final remark.  Whatever your preferred view about which entities are conscious, we might say that the acid test, for whether you actually believe your view, is whether you’re willing to follow it through to its moral implications.  So for example, suppose you believe it’s about quantum effects in microtubules.  A humanoid robot is pleading with you for its life.  Would you be the one to say, “nope, sorry, you don’t have the microtubules,” and shoot it?

One of the things I like most about the picture suggested here is that I feel pretty much at peace with its moral implications.  This picture agrees with the intuition that murder, for example, entails the destruction of something irreplaceable, unclonable, a unique locus of identity—something that, once it’s gone, can’t be recovered even in principle.  By contrast, if there are (say) ten copies of an AI program, deleting five of the copies seems at most like assault, or some sort of misdemeanor offense!  And this picture agrees with the intuition both that deleting the copies wouldn’t be murder, and that the reason why it wouldn’t be murder is directly related to the AI’s copyability.

Now of course, this picture also raises the possibility that, for reasons related to the AI’s copyability and predictability by outside observers, there’s “nothing that it’s like to be the AI,” and that therefore, even deleting the last copy of the AI still wouldn’t be murder.  But I confess that, personally, I think I’d play it safe and not delete that last copy.  Thank you.


Postscript: There’s no record of the hour-long discussion following my and Penrose’s talks, and the participants weren’t speaking for the record anyway.  But I can mention some general themes that came up in the discussion, to the extent I remember them.

The first third of the discussion wasn’t about anything specific to my or Penrose’s views, but just about the definition of consciousness.  Many participants expressed the opinion that it’s useless to speculate about the nature of consciousness if we lack even a clear definition of the term.  I pushed back against that view, holding instead that there exist concepts (lines, time, equality, …) that are so basic that perhaps they can never be satisfactorily defined in terms of more basic concepts, but you can still refer to these concepts in sentences, and trust your listeners eventually to figure out more-or-less what you mean by applying their internal learning algorithms.

In the present case, I suggested a crude operational definition, along the lines of, “you consider a being to be conscious iff you regard destroying it as murder.”  Alas, the philosophers in the room immediately eviscerated that definition, so I came back with a revised one: if you tried to ban the word “consciousness,” I argued, then anyone who needed to discuss law or morality would soon reinvent a synonymous word, which played the same complicated role in moral deliberations that “consciousness” had played in them earlier.  Thus, my definition of consciousness is: whatever that X-factor is for which people need a word like “consciousness” in moral deliberations.  For whatever it’s worth, the philosophers seemed happier with that.

Next, a biologist and several others sharply challenged Penrose over what they considered the lack of experimental evidence for his and Hameroff’s microtubule theory.  In response, Penrose doubled or tripled down, talking about various experiments over the last decade, which he said demonstrated striking conductivity properties of microtubules, if not yet quantum coherence—let alone sensitivity to gravity-induced collapse of the state vector!  Audience members complained about a lack of replication of these experiments.  I didn’t know enough about the subject to express any opinion.

At some point, Philip Stamp, who was moderating the session, noticed that Penrose and I had never directly confronted each other about the validity of Penrose’s Gödelian argument, so he tried to get us to do so.  I confess that I was about as eager to do that as to switch to a diet of microtubule casserole, since I felt like this topic had already been beaten to Planck-sized pieces in the 1990s, and there was nothing more to be learned.  Plus, it was hard to decide which prospect I dreaded more: me “scoring a debate victory” over Roger Penrose, or him scoring a debate victory over me.

But it didn’t matter, because Penrose bit.  He said I’d misunderstood his argument, that it had nothing to do with “mystically seeing” the consistency of a formal system.  Rather, it was about the human capacity to pass from a formal system S to a stronger system S’ that one already implicitly accepted if one was using S at all—and indeed, that Turing himself had clearly understood this as the central message of Gödel, that our ability to pass to stronger and stronger formal systems was necessarily non-algorithmic.  I replied that it was odd to appeal here to Turing, who of course had considered and rejected the “Gödelian case against AI” in 1950, on the ground that AI programs could make mathematical mistakes yet still be at least as smart as humans.  Penrose said that he didn’t consider that one of Turing’s better arguments; he then turned to me and asked whether I actually found Turing’s reply satisfactory.  I could see that it wasn’t a rhetorical debate question; he genuinely wanted to know!  I said that yes, I agreed with Turing’s reply.

Someone mentioned that Penrose had offered a lengthy rebuttal to at least twenty counterarguments to the Gödelian anti-AI case in Shadows of the Mind.  I affirmed that I’d read his lengthy rebuttal, and I focused on one particular argument in Shadows: that while it’s admittedly conceivable that individual mathematicians might be mistaken, might believe (for example) that a formal system was consistent even though it wasn’t, the mathematical community as a whole converges toward truth in these matters, and it’s that convergence that cries out for a non-algorithmic explanation.  I replied that it wasn’t obvious to me that set theorists do converge toward truth in these matters, in anything other than the empirical, higgledy-piggledy, no-guarantees sense in which a community of AI robots might also converge toward truth.  Penrose said I had misunderstood the argument.  But alas, time was running out, and we never managed to get to the bottom of it.

There was one aspect of the discussion that took me by complete surprise.  I’d expected to be roasted alive over my attempt to relate consciousness and free will to unpredictability, the No-Cloning Theorem, irreversible decoherence, microscopic degrees of freedom left over from the Big Bang, and the cosmology of de Sitter space.  Sure, my ideas might be orders of magnitude less crazy than anything Penrose proposes, but they’re still pretty crazy!  But that entire section of my talk attracted only minimal interest.  With the Seven Pines crowd, what instead drew fire were the various offhand “pro-AI / pro-computationalism” comments I’d made—comments that, because I hang out with Singularity types so much, I had ceased to realize could even possibly be controversial.

So for example, one audience member argued that an AI could only do what its programmers had told it to do; it could never learn from experience.  I could’ve simply repeated Turing’s philosophical rebuttals to what he called “Lady Lovelace’s Objection,” which are as valid today as they were 66 years ago.  Instead, I decided to fast-forward, and explain a bit how IBM Watson and AlphaGo work, how they actually do learn from past experience without violating the determinism of the underlying transistors.  As I went through this, I kept expecting my interlocutor to interrupt me and say, “yes, yes, of course I understand all that, but my real objection is…”  Instead, I was delighted to find, the interlocutor seemed to light up with newfound understanding of something he hadn’t known or considered.

Similarly, a biologist asked how I could possibly have any confidence that the brain is simulable by a computer, given how little we know about neuroscience.  I replied that, for me, the relevant issues here are “well below neuroscience” in the reductionist hierarchy.  Do you agree, I asked, that the physical laws relevant to the brain are encompassed by the Standard Model of elementary particles, plus Newtonian gravity?  If so, then just as Archimedes declared: “give me a long enough lever and a place to stand, and I’ll move the earth,” so too I can declare, “give me a big enough computer and the relevant initial conditions, and I’ll simulate the brain atom-by-atom.”  The Church-Turing Thesis, I said, is so versatile that the only genuine escape from it is to propose entirely new laws of physics, exactly as Penrose does—and it’s to Penrose’s enormous credit that he understands that.

Afterwards, an audience member came up to me and said how much he liked my talk, but added, “a word of advice, from an older scientist: do not become the priest of a new religion of computation and AI.”  I replied that I’d take that to heart, but what was interesting was that, when I heard “priest of a new religion,” I’d expected that his warning would be the exact opposite of what it turned out to be.  To wit: “Do not become the priest of a new religion of unclonability, unpredictability, and irreversible decoherence.  Stick to computation—i.e., to conscious minds being copyable and predictable exactly like digital computer programs.”  I guess there’s no pleasing everyone!


Coincidental But Not-Wholly-Unrelated Announcement: My friend Robin Hanson has just released his long-awaited book The Age of Em: Work, Love, and Life When Robots Rule the Earth.  I read an early review copy of the book, and wrote the following blurb for the jacket:

Robin Hanson is a thinker like no other on this planet: someone so unconstrained by convention, so unflinching in spelling out the consequences of ideas, that even the most cosmopolitan reader is likely to find him as bracing (and head-clearing) as a mouthful of wasabi.  Now, in The Age of Em, he’s produced the quintessential Hansonian book, one unlike any other that’s ever been written.  Hanson is emphatic that he hasn’t optimized in any way for telling a good story, or for imparting moral lessons about the present: only for maximizing the probability that what he writes will be relevant to the actual future of our civilization.  Early in the book, Hanson estimates that probability as 10%.  His figure seems about right to me—and if you’re able to understand why that’s unbelievably high praise, then The Age of Em is for you.

Actually, my original blurb compared The Age of Em to Asimov’s Foundation series, with its loving attention to the sociology and politics of the remote future.  But that line got edited out, because the publisher (and Robin) wanted to make crystal-clear that The Age of Em is not science fiction, but just sober economic forecasting about a future dominated by copyable computer-emulated minds.

I would’ve attempted a real review of The Age of Em, but I no longer feel any need to, because Scott Alexander of SlateStarCodex has already hit this one out of the emulated park.


Second Coincidental But Not-Wholly-Unrelated Announcement: A reader named Nick Merrill recently came across this old quote of mine from Quantum Computing Since Democritus:

In a class I taught at Berkeley, I did an experiment where I wrote a simple little program that would let people type either “f” or “d” and would predict which key they were going to push next. It’s actually very easy to write a program that will make the right prediction about 70% of the time. Most people don’t really know how to type randomly. They’ll have too many alternations and so on. There will be all sorts of patterns, so you just have to build some sort of probabilistic model.

So Nick emailed me to ask whether I remembered how my program worked, and I explained it to him, and he implemented it as a web app, which he calls the “Aaronson Oracle.”
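For anyone curious what such a program might look like, here’s a minimal sketch in Python.  To be clear, this is my reconstruction for illustration only—it isn’t Nick’s code or the exact program from the Berkeley class; the choice to condition on the last five keypresses, and the toy “typist” used to test it, are just assumptions consistent with the description above.

    from collections import defaultdict
    import random

    class Predictor:
        """Guesses the next 'f'/'d' keypress from frequency counts over recent context."""

        def __init__(self, k=5):
            self.k = k  # how many previous keypresses to condition on (an assumption)
            self.counts = defaultdict(lambda: {"f": 0, "d": 0})  # context -> next-key counts
            self.history = ""
            self.correct = 0
            self.total = 0

        def predict(self):
            # Guess whichever key most often followed the current context so far (ties -> 'f').
            c = self.counts[self.history[-self.k:]]
            return "f" if c["f"] >= c["d"] else "d"

        def update(self, key):
            # Score the guess made before seeing the key, then record the observation.
            self.total += 1
            self.correct += (self.predict() == key)
            self.counts[self.history[-self.k:]][key] += 1
            self.history += key

    if __name__ == "__main__":
        p = Predictor()
        prev = "f"
        for _ in range(2000):
            # A toy "human" typist who alternates keys 70% of the time and repeats 30% of the time.
            key = {"f": "d", "d": "f"}[prev] if random.random() < 0.7 else prev
            p.update(key)
            prev = key
        print(f"Guessed {100 * p.correct / p.total:.1f}% of keypresses correctly")

Against a typist with patterns like that, an order-5 counting model settles noticeably above chance, for exactly the reason mentioned in the quote: people alternate too often and fall into habits that simple frequency statistics pick up.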

So give it a try!  Are you ready to test your free will, your Penrosian non-computational powers, your brain’s sensitivity to amplified quantum fluctuations, against the Aaronson Oracle?

Update: By popular request, Nick has improved his program so that it shows your previous key presses and its guesses for them.  He also fixed a “security flaw”: James Lee noticed that you could use the least significant digit of the program’s percentage correct so far, as a source of pseudorandom numbers that the program couldn’t predict!  So now the program only displays its percent correct rounded to the nearest integer.


Update (June 15): Penrose’s collaborator Stuart Hameroff has responded in the comments; see here (my reply here) and here.

From Boston to Austin

Sunday, February 28th, 2016

I have some big news—well, not for the world, but for me personally.  Starting this summer, I’ll be leaving MIT, and starting a new phase of my life, as David J. Bruton Jr. Centennial Professor of Computer Science at the University of Texas at Austin.  I’ll also be the founding director of UT Austin’s new quantum computing center, whose motto will be “Hook ’em Hadamards“, and whose logo will depict a fierce longhorn bull, whose horns are bra and ket signs enclosing an inner product between two quantum states.  My brilliant and talented wife, Dana Moshkovitz Aaronson, will also be joining UT Austin, as a tenured Associate Professor of Computer Science.  Our current PhD students will remain enrolled at MIT, while also spending as much time as they like in Austin.

I’ll deeply miss MIT and Boston.  More than anything else, I’ll miss the phenomenal students at MIT, who I’ve had the immense privilege to teach and learn from for nine years.  Go Beavers!  I’m grateful as well to my many friends and colleagues who made my years at MIT so rewarding: a time of personal growth, in which I developed from a skinny, insecure 26-year-old nerd, blogging and trying to prove oracle separations, into a pot-bellied, tenured, 34-year-old married-father nerd, still blogging and trying to prove the same oracle separations (but no longer as diligently).

To nip an otherwise-inevitable rumor in the bud: I wasn’t forced to leave MIT over anything here on Shtetl-Optimized.  I feel the need to say this because, within the last year, I’ve spent hundreds of miserable hours scrolling through social media threads wherein stranger after stranger proclaimed me basically the world’s worst scum (tied, perhaps, with the other Scott A.), and even called on MIT to fire me.  Given that experience, it was repeatedly jarring for me to reenter reality and discover how irrelevant this all was, either to MIT or to any of the universities that recruited me and Dana.  Bizarre as it sounds, CS departments mostly cared about what actual research we were doing and could bring to them!  So students and faculty afraid to debate anything controversial online under their real names, however politely, should know that even in 2016, the banner of academic freedom yet waves.

Without further ado, let me list ten things that are awesome about Austin and that helped attract me and Dana there.

  1. One of the strongest CS departments and theory groups in the world.  From 1984 until his death in 2002, UT Austin was home to Edsger Dijkstra, who not only discovered Dijkstra’s algorithm but also penned the immortal words that might as well be tattooed on my stomach: computer science is no more about computers than astronomy is about telescopes.  Today, Austin’s CS department is rapidly expanding, and just within theory, is home to David Zuckerman, Anna Gal, Vijaya Ramachandran, Brent Waters, Eric Price, and Greg Plaxton.  With me and Dana there as well, I can say with all due modesty that we intend to compete against any CS theory program anywhere in the world.
  2. Adam Klivans.  The closest I’ve had to a mentor in the exceedingly narrow field of theoretical computer science humor.
  3. An outstanding recent track record with CS theory PhD students.  Since the turn of the century, UT Austin has produced Sasha Sherstov, Anup Rao, Allison Bishop Lewko, Seth Pettie, Vladimir Trifonov, Raghu Meka, and other stars of the CS theory world.  That record lets me without the slightest hesitation tell hotshot undergrads who want to do classical and/or quantum complexity theory to apply to Austin for grad school.
  4. The opportunity to build—or rather, rebuild—a UT presence in quantum computing.  While I’m excited to help build a new group—and I feel like it’s the right time in my career to do that—I can’t say that this is the first time UT Austin will have a significant presence in quantum computing.  Way back in the late 70s and early 80s, UT was home to most of the (proto) “quantum computing research” that existed on earth.  It’s there that John Archibald Wheeler philosophized about “It from Bit,” that Bryce DeWitt popularized the Many-Worlds Interpretation and Hugh Everett gave his only public lecture on the subject, that David Deutsch did a four-year postdoc in which he formed the seeds of the idea of quantum computing, and that Wojciech Zurek, William Wootters, and Benjamin Schumacher (who, between them, founded decoherence theory, proved the No-Cloning Theorem, and coined the term “qubit”) did their PhDs.  I’m honored to tread in their footsteps.
  5. Money.  Texas, as it turns out, has a lot of it.  Now, the conventional wisdom would be that Texas’ wealth is mostly irrelevant to academic scientists, because it’s controlled by reactionary oilmen for whom curiosity-driven research is not exactly the top priority.  That might have been true about the administrations of George W. Bush or Rick Perry.  But Texas’ current governor, Greg Abbott, while still a right-wing Republican, also pushed through an aggressive $4-billion measure called the Governor’s University Research Initiative, one of whose central goals is to recruit leading scientists to Texas.
  6. Weather.  To a first approximation, Austin is lovely and pleasant during the academic year (even as the planet warms, this should remain true for at least a few more decades)—and while I’d sort of vaguely imagined all of Texas as a giant desert, Austin turns out to be lush and green and full of swimming holes.  The summers, of course, are hot enough to fuse hydrogen.  But for academics like me and Dana, it turns out that there’s an elegant solution to that, one unavailable for dealing with New England winters.  That solution is to leave town, to use June, July, and August for miscellaneous academic travel.
  7. Quality of life.  If we’re being honest, I’m not someone likely to spend much time at indie-rock festivals, or whatever quirky cultural stuff it is that’s made Austin the fastest-growing city in the US.  But here’s something I do care about: even though highway traffic in Austin is bad and getting worse, that need not affect my life too much.  Research indicates that, for roughly the price of our current 2-bedroom condo in Cambridge, we could get a lovely 4-bedroom with a yard that’s in walking distance to the UT campus, as well as to stores, restaurants, good schools, and parks.
  8. Schools.  I had a pretty miserable experience growing up.  I don’t know if Lily (or any future children Dana and I have) will be anything like I was, but given that she’s in an “at-risk population” for nerdiness, I’d love to live in a place with nerd education options that don’t stink.  Austin, happily, has two renowned math/science magnet schools—Kealing Middle School and LASA High School—which admit based on test scores.  (By contrast, in most parts of the US, such programs either don’t exist or admit purely by lottery.)  Austin also has the only elementary school, anywhere, whose admissions director told me that sure, they’d let a student skip a grade if it made sense academically.
  9. Steven Weinberg.  I confess: it probably affected me more than it should that arguably the greatest scientist now walking the earth, a legend of physics who could go wherever the hell he wanted, has chosen to spend the past thirty-plus years at UT Austin.  On our last visit there, Dana, my parents, and I had the honor of having dinner with Weinberg.  After we’d exchanged stories about Telluride House at Cornell, where Weinberg and I both lived as undergrads (and which apparently changed little between the 1950s and the 1990s), Weinberg sang the praises of Austin for hours.  (Admittedly, part of why Weinberg enjoys Austin so much is that there it’s easy to be on a first-name basis with the mayor, tech billionaires, and all the other leaders of the city—an advantage that might be Nobel-laureate-specific!)
  10. Adventure.  By background and temperament, I’m probably one of the “least Texan” Americans imaginable: a nerdy east-coast Jewish intellectual who enjoys snow, can’t much tolerate spicy food, is bored by cowboy movies and fears physical confrontation.  Indeed, until I actually visited the place, my only real associations with Texas were tumbleweeds blowing across a parched desert hellscape, oilmen with giant belt buckles guffawing about so-called global warming, crazed football hooligans filling city-sized stadiums, shotguns, rattlesnakes, and George W. Bush.  But then, the contrast between the over-the-top image and the pleasant reality of Austin had the paradoxical effect of making moving to Texas feel like an adventure—an adventure with an acceptable risk profile.  Like, if I’m going to uproot at all, why not to a place that’s strange and different and interesting?

Even given the above, some people will ask about things they’d consider obvious dealbreakers for moving to Texas.  In particular, what about the infamous new law that essentially forces UT Austin to let students carry concealed firearms to class?  Well, I oppose that law.  Indeed, when I haven’t been angering the social-justice left, I’ve been angering the right by (for example) blogging about my strong support for gun control.  To me, it seems like a terrible idea for the Texas state legislature, which provides only 14% of the UT system’s budget, to force on UT a gun policy that its faculty and students overwhelmingly hate.  And I admired Steven Weinberg’s announcement that he intends to defy the law in his classroom, and fight it out in court if necessary.  (Weinberg also gave, as one reason to oppose the law, how much harder it will make it for UT to recruit faculty.)

But at the same time … Dana is Israeli.  For her, it’s perfectly normal to go outside and see 18-year-old girls chatting and laughing with huge-ass machine guns slung over their shoulders.  Having spent a month of each year in Tel Aviv, seeing passersby with guns has become, if not exactly normal to me, then not something I fear 2% as much as I fear crashing my car.  And indeed, if one takes a statistical approach to risk, Austin has a much lower per-capita violent crime rate than Boston does.

And yes, I know, the US and Israel have completely different gun cultures: in Israel, for example, the only people carrying around semiautomatics are trained and monitored conscripts; there’s no concept of a private “right” to such a weapon.  And yes, the principle matters.  But if one is unwilling to move to any place that has any laws one disagrees with, one should probably look into faculty positions on offshore barges or Jupiter.

Austin itself, of course, is only slightly less liberal than Portland, the blueberry in the tomato soup, as Rick Perry so memorably put it.  Even so, the maps insist that Austin is in Texas, which means that, while there, one will probably encounter Texans.  (A friend, on hearing that Dana took a quick liking to Austin when she visited, quipped that it was probably because Austin reminded her of Israel: “hot and surrounded by hostile territory.”)

Now, the actual Texans who I’ve met so far have been frighteningly warm and hospitable.  But the question stands: what will I do if, while living there, I meet (let’s suppose) some sun-calloused cattle ranchers who consider me an arrogant, effete coastal liberal who patronizes them in blog posts like this one?  What if they tell me to scram, head back east, and never mess with Texas again?

Well, I’ve already decided what I’d like to do in this hypothetical situation.  I’d like to invite the ranchers over to my place for some barbecued beers and ice-cold steaks, or whatever it is you eat in Texas, and tell them all about quantum query algorithms, and ask them about cattle feed, and try to find common ground, just like I tried to find common ground with the other end of the political spectrum—with the folks who called me a clueless, patriarchal, entitled white male douchebro who silenced their already-marginalized voices by not agreeing with everything they said.  For I’ve increasingly come to the conviction that, while you might fail to find common ground with someone, you’ve got to try, you’ve got to steelman their argument and learn whatever you can from it.  I once, for example, thought about the Religious Right as purely contemptible, deserving only unthinking snark, and I was completely wrong.  Even when I was right on the underlying issues, I was wrong on the epistemology.  In Texas, hopefully I’ll have a chance to do better.

In summary:

  1. Wherever you’re at in your career, if you’d like to do quantum information and/or theoretical computer science research on the wild frontier—if QMA, QCMA, and BQP/qpoly strike you as little more than wayward calves to be lassoed in—then please consider joining us at UT Austin.  To be concrete: we’ll be looking for distinguished senior faculty to hire under the Governor’s University Research Initiative, we’ll be looking for tenure-track junior faculty in both CS and ECE, we’ll be looking for postdocs, we’ll be looking for grad students, and finally we’ll be looking for undergrads who’d like to join the innovative Turing Scholars honors program.
  2. If you’d just like to come for a week and give a seminar, we’ll have money for that too.
  3. Hook ’em Hadamards!

Totally Unrelated Update (Feb. 29): Michael Mitzenmacher has asked me to announce that nominations are open for the SIGACT Distinguished Service Prize. More information is available here.

Here’s some video of me spouting about Deep Questions

Thursday, February 4th, 2016

In January 2014, I attended an FQXi conference on Vieques island in Puerto Rico.  While there, Robert Lawrence Kuhn interviewed me for his TV program Closer to Truth, which deals with science and religion and philosophy and you get the idea.  Alas, my interview was at the very end of the conference, and we lost track of the time—so unbeknownst to me, a plane full of theorists was literally sitting on the runway waiting for me to finish philosophizing!  This was the second time Kuhn interviewed me for his show; the first time was on a cruise ship near Norway in 2011.  (Thankless hero that I am, there’s nowhere I won’t travel for the sake of truth.)

Anyway, after a two-year wait, the videos from Puerto Rico are finally available online.  While my vignettes cover what, for most readers of this blog, will be very basic stuff, I’m sort of happy with how they turned out: I still stutter and rock back and forth, but not as much as usual.  For your viewing convenience, here are the new videos:

I had one other vignette, about why the universe exists, but they seem to have cut that one.  Alas, if I knew why the universe existed in January 2014, I can’t remember any more.

One embarrassing goof: I referred to the inventor of Newcomb’s Paradox as “Simon Newcomb.”  Actually it was William Newcomb: a distant relative of Simon Newcomb, the 19th-century astronomer who measured the speed of light.

At their website, you can also see my older 2011 videos, and videos from others who might be known to readers of this blog, like Marvin Minsky, Roger Penrose, Rebecca Newberger Goldstein, David Chalmers, Sean Carroll, Max Tegmark, David Deutsch, Raphael Bousso, Freeman Dyson, Nick Bostrom, Ray Kurzweil, Rodney Brooks, Stephen Wolfram, Greg Chaitin, Garrett Lisi, Seth Lloyd, Lenny Susskind, Lee Smolin, Steven Weinberg, Wojciech Zurek, Fotini Markopoulou, Juan Maldacena, Don Page, and David Albert.  (No, I haven’t yet watched most of these, but now that I linked to them, maybe I will!)

Thanks very much to Robert Lawrence Kuhn and Closer to Truth (and my previous self, I guess?) for providing Shtetl-Optimized content so I don’t have to.


Update: Andrew Critch of CFAR asked me to post the following announcement.

We’re seeking a full time salesperson for the Center for Applied Rationality in Berkeley, California. We’ve streamlined operations to handle large volume in workshop admissions, and now we need that volume to pour in. Your role would be to fill our workshops, events, and alumni community with people. Last year we had 167 total new alumni. This year we want 120 per month. Click here to find out more.

Happy Third Birthday Lily!

Thursday, January 21st, 2016

Non-Lily-Related Updates (Jan. 22)

Uri Bram posted a cute little article about whether he was justified, as a child, in telling his parents that he wouldn’t clean up his room because doing so would only increase the universe’s entropy and thereby hasten its demise. The article quotes me, Sean Carroll, and others about that important question.

On Wednesday I gave a TCS+ online seminar about “The Largest Possible Quantum Speedups.” If you’re interested, you can watch the YouTube video here.


[Photos: birthday cake, a sculpture, and two of Lily’s artworks]

(I promised a while ago that I’d upload some examples of Lily’s MOMA-worthy modern artworks.  So, here are two!)

A few quotable quotes:

Daddy, when you were little, you were a girl like me!

I’m feeling a bit juicy [thirsty for juice].

Saba and Safta live in Israel. They’re mommy’s friends! [Actually they’re mommy’s parents.]

Me: You’re getting bigger every day!
Lily: But I’m also getting smaller every day!

Me: Then Goldilocks tasted the third bowl, which was Baby Bear’s, and it was just right!  So she ate it all up.  Then Goldilocks went…
Lily: No, then Goldilocks ate some cherries in the kitchen before she went to the bedroom.  And blueberries.
Me: Fine, so she ate cherries and blueberries.  Then she went to the bedroom, and she saw that there were three beds…
Lily: No, four beds!
Me: Fine, four beds.  So she laid in the first bed, but she said, “this bed is too hard.”
Lily: No, it was too comfortable!
Me: Too comfortable?  Is she some kind of monk?

Me [pointing to a taxidermed black bear in a museum]: What’s that?
Lily: A bear!
Me: Is it Winnie the Pooh?
Lily: No, it’s a different kind of bear.
Me [pointing to a tan bear in the next case]: So what about that one? Is that Winnie?
Lily: Yes! That’s Winnie the Pooh!
[Looking at it more closely] No, it’s a different kind of Winnie.

Lily: Why is it dark outside?
Me: Because it’s night time.
Lily: Why is it night time?
Me: Because the sun went to the other side of the world.
Lily: It went to China!
Me: Yes! It did in fact go to China.
Lily: Why did the sun go to China?
Me: Well, more accurately, it only seemed to go there, because the world that we’re on is spinning.
Lily: Why is the world spinning?
Me: Because of the conservation of angular momentum.
Lily: Why is the … consibation of amomomo?
Me: I suppose because of Noether’s Theorem, and the fact that our laws of physics are symmetric under spatial rotations.
Lily: Why is…
Me: That’s enough for today Lily!

Talk, be merry, and be rational

Monday, November 23rd, 2015

Yesterday I wrote a statement on behalf of a Scott Alexander / SlateStarCodex rationalist meetup, which happened last night at MIT (in the same room where I teach my graduate class), and which I’d really wanted to attend but couldn’t.  I figured I’d share the statement here:

I had been looking forward to attending tonight’s MIT SlateStarCodex meetup as I hardly ever look forward to anything. Alas, I’m now stuck in Chicago, with my flight cancelled due to snow, and with all flights for the next day booked up. But instead of continuing to be depressed about it, I’ve decided to be happy that this meetup is even happening at all—that there’s a community of people who can read, let’s say, a hypothetical debate moderator questioning Ben Carson about what it’s like to be a severed half-brain, and simply be amused, instead of silently trying to figure out who benefits from the post and which tribe the writer belongs to. (And yes, I know: the answer is the gray tribe.) And you can find this community anywhere—even in Cambridge, Massachusetts! Look, I spend a lot of time online, just getting more and more upset reading social justice debates that are full of people calling each other douchebags without even being able to state anything in the same galactic supercluster as the other side’s case. And then what gives me hope for humanity is to click over to the slatestarcodex tab, and to see all the hundreds of comments (way more than my blog gets) by people who disagree with each other but who all basically get it, who all have minds that don’t make me despair. And to realize that, when Scott Alexander calls an SSC meetup, he can fill a room just about anywhere … well, at least anywhere I would visit. So talk, be merry, and be rational.

I’m now back in town, and told by people who attended the meetup that it was crowded, disorganized, and great.  And now I’m off to Harvard, to attend the other Scott A.’s talk “How To Ruin A Perfectly Good Randomized Controlled Trial.”


Update (Nov. 24) Scott Alexander’s talk at Harvard last night was one of the finest talks I’ve ever attended. He was introduced to rapturous applause as simply “the best blogger on the Internet,” and as finally an important speaker, in a talk series that had previously wasted everyone’s time with the likes of Steven Pinker and Peter Singer. (Scott demurred that his most notable accomplishment in life was giving the talk at Harvard that he was just now giving.) The actual content, as Scott warned from the outset, was “just” a small subset of a basic statistics course, but Scott brought each point alive with numerous recent examples, from psychiatry, pharmacology, and social sciences, where bad statistics or misinterpretations of statistics were accepted by nearly everyone and used to set policy. (E.g., Alcoholics Anonymous groups that claimed an “over 95%” success rate, because the people who relapsed were kicked out partway through and not counted toward the total.) Most impressively, Scott leapt immediately into the meat, ended after 20 minutes, and then spent the next two hours just taking questions. Scott is publicity-shy, but I hope for others’ sake that video of the talk will eventually make its way online.

Then, after the talk, I had the honor of meeting two fellow Boston-area rationalist bloggers, Kate Donovan and Jesse Galef. Yes, I said “fellow”: for almost a decade, I’ve considered myself on the fringes of the “rationalist movement.” I’d hang out a lot with skeptic/effective-altruist/transhumanist/LessWrong/OvercomingBias people (who are increasingly now SlateStarCodex people), read their blogs, listen and respond to their arguments, answer their CS theory questions. But I was always vaguely uncomfortable identifying myself with any group that even seemed to define itself by how rational it was compared to everyone else (even if the rationalists constantly qualified their self-designation with “aspiring”!). Also, my rationalist friends seemed overly interested in questions like how to prevent malevolent AIs from taking over the world, which I tend to think we lack the tools to make much progress on right now (though, like with many other remote possibilities, I’m happy for some people to work on them and see if they find anything interesting).

So, what changed? Well, in the debates about social justice, public shaming, etc. that have swept across the Internet these past few years, it seems to me that my rationalist friends have proven themselves able to weigh opposing arguments, examine their own shortcomings, resist groupthink and hysteria from both sides, and attack ideas rather than people, in a way that the wider society—and most depressingly to me, the “enlightened, liberal” part of society—has often failed. In a real-world test (“real-world,” in this context, meaning social media…), the rationalists have walked the walk and rationaled the rational, and thus they’ve given me no choice but to stand up and be counted as one of them.

Have a great Thanksgiving, those of you in the US!


Another Update: Dana, Lily, and I had the honor of having Scott Alexander over for dinner tonight. I found this genius of human nature, who took so much flak last year for defending me, to be completely uninterested in discussing anything related to social justice or online shaming. Instead, his gaze was fixed on the eternal: he just wanted to grill me all evening about physics and math and epistemology. Having recently read this Nature News article by Ron Cowen, he kept asking me things like: “you say that in quantum gravity, spacetime itself is supposed to dissolve into some sort of network of qubits. Well then, how does each qubit know which other qubits it’s supposed to be connected to? Are there additional qubits to specify the connectivity pattern? If so, then doesn’t that cause an infinite regress?” I handwaved something about AdS/CFT, where a dynamic spacetime is supposed to emerge from an ordinary quantum theory on a fixed background specified in advance. But I added that, in some sense, he had rediscovered the whole problem of quantum gravity that’s confused everyone for almost a century: if quantum mechanics presupposes a causal structure on the qubits or whatever other objects it talks about, then how do you write down a quantum theory of the causal structures themselves?

I’m sure there’s a lesson in here somewhere about what I should spend my time on.

Common Knowledge and Aumann’s Agreement Theorem

Sunday, August 16th, 2015

The following is the prepared version of a talk that I gave at SPARC: a high-school summer program about applied rationality held in Berkeley, CA for the past two weeks.  I had a wonderful time in Berkeley, meeting new friends and old, but I’m now leaving to visit the CQT in Singapore, and then to attend the AQIS conference in Seoul.


Common Knowledge and Aumann’s Agreement Theorem

August 14, 2015

Thank you so much for inviting me here!  I honestly don’t know whether it’s possible to teach applied rationality, the way this camp is trying to do.  What I know is that, if it is possible, then the people running SPARC are some of the awesomest people on earth to figure out how.  I’m incredibly proud that Chelsea Voss and Paul Christiano are both former students of mine, and I’m amazed by the program they and the others have put together here.  I hope you’re all having fun—or maximizing your utility functions, or whatever.

My research is mostly about quantum computing, and more broadly, computation and physics.  But I was asked to talk about something you can actually use in your lives, so I want to tell a different story, involving common knowledge.

I’ll start with the “Muddy Children Puzzle,” which is one of the greatest logic puzzles ever invented.  How many of you have seen this one?

OK, so the way it goes is, there are a hundred children playing in the mud.  Naturally, they all have muddy foreheads.  At some point their teacher comes along and says to them, as they all sit around in a circle: “stand up if you know your forehead is muddy.”  No one stands up.  For how could they know?  Each kid can see all the other 99 kids’ foreheads, so knows that they’re muddy, but can’t see his or her own forehead.  (We’ll assume that there are no mirrors or camera phones nearby, and also that this is mud that you don’t feel when it’s on your forehead.)

So the teacher tries again.  “Knowing that no one stood up the last time, now stand up if you know your forehead is muddy.”  Still no one stands up.  Why would they?  No matter how many times the teacher repeats the request, still no one stands up.

Then the teacher tries something new.  “Look, I hereby announce that at least one of you has a muddy forehead.”  After that announcement, the teacher again says, “stand up if you know your forehead is muddy”—and again no one stands up.  And again and again; it continues 99 times.  But then the hundredth time, all the children suddenly stand up.

(There’s a variant of the puzzle involving blue-eyed islanders who all suddenly commit suicide on the hundredth day, when they all learn that their eyes are blue—but as a blue-eyed person myself, that’s always struck me as needlessly macabre.)

What’s going on here?  Somehow, the teacher’s announcing to the children that at least one of them had a muddy forehead set something dramatic in motion, which would eventually make them all stand up—but how could that announcement possibly have made any difference?  After all, each child already knew that at least 99 children had muddy foreheads!

Like with many puzzles, the way to get intuition is to change the numbers.  So suppose there were two children with muddy foreheads, and the teacher announced to them that at least one had a muddy forehead, and then asked both of them whether their own forehead was muddy.  Neither would know.  But each child could reason as follows: “if my forehead weren’t muddy, then the other child would’ve seen that, and would also have known that at least one of us has a muddy forehead.  Therefore she would’ve known, when asked, that her own forehead was muddy.  Since she didn’t know, that means my forehead is muddy.”  So then both children know their foreheads are muddy, when the teacher asks a second time.

Now, this argument can be generalized to any (finite) number of children.  The crucial concept here is common knowledge.  We call a fact “common knowledge” if, not only does everyone know it, but everyone knows everyone knows it, and everyone knows everyone knows everyone knows it, and so on.  It’s true that in the beginning, each child knew that all the other children had muddy foreheads, but it wasn’t common knowledge that even one of them had a muddy forehead.  For example, if your forehead and mine are both muddy, then I know that at least one of us has a muddy forehead, and you know that too, but you don’t know that I know it (for what if your forehead were clean?), and I don’t know that you know it (for what if my forehead were clean?).

What the teacher’s announcement did, was to make it common knowledge that at least one child has a muddy forehead (since not only did everyone hear the announcement, but everyone witnessed everyone else hearing it, etc.).  And once you understand that point, it’s easy to argue by induction: after the teacher asks and no child stands up (and everyone sees that no one stood up), it becomes common knowledge that at least two children have muddy foreheads (since if only one child had had a muddy forehead, that child would’ve known it and stood up).  Next it becomes common knowledge that at least three children have muddy foreheads, and so on, until after a hundred rounds it’s common knowledge that everyone’s forehead is muddy, so everyone stands up.
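To make the induction concrete, here’s a minimal Python sketch of the puzzle (my own illustrative code, with the hundred-children parameters from the story): each muddy child who sees k muddy foreheads stands up on round k+1, just as the argument predicts.

# A minimal simulation of the muddy children puzzle.
def muddy_children(num_children, num_muddy):
    # Children 0 .. num_muddy-1 are muddy; the clean ones see num_muddy muddy
    # foreheads and (correctly) never stand up, so we only track the muddy ones.
    assert 1 <= num_muddy <= num_children
    muddy = set(range(num_muddy))
    stood_up = set()
    rounds = 0
    while len(stood_up) < num_muddy:
        rounds += 1
        for child in muddy - stood_up:
            sees = len(muddy) - 1   # each muddy child sees every other muddy forehead
            # After k silent rounds, it's common knowledge that at least k+1 children
            # are muddy; a muddy child who sees only k muddy foreheads then knows
            # their own forehead must be muddy too.
            if sees == rounds - 1:
                stood_up.add(child)
    return rounds

print(muddy_children(100, 100))   # -> 100: all hundred children stand on round 100
print(muddy_children(100, 2))     # -> 2: matches the two-child argument above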

The moral is that the mere act of saying something publicly can change the world—even if everything you said was already obvious to every last one of your listeners.  For it’s possible that, until your announcement, not everyone knew that everyone knew the thing, or knew everyone knew everyone knew it, etc., and that could have prevented them from acting.

This idea turns out to have huge real-life consequences, to situations way beyond children with muddy foreheads.  I mean, it also applies to children with dots on their foreheads, or “kick me” signs on their backs…

But seriously, let me give you an example I stole from Steven Pinker, from his wonderful book The Stuff of Thought.  Two people of indeterminate gender—let’s not make any assumptions here—go on a date.  Afterward, one of them says to the other: “Would you like to come up to my apartment to see my etchings?”  The other says, “Sure, I’d love to see them.”

This is such a cliché that we might not even notice the deep paradox here.  It’s like with life itself: people knew for thousands of years that every bird has the right kind of beak for its environment, but not until Darwin and Wallace could anyone articulate why (and only a few people before them even recognized there was a question there that called for a non-circular answer).

In our case, the puzzle is this: both people on the date know perfectly well that the reason they’re going up to the apartment has nothing to do with etchings.  They probably even both know the other knows that.  But if that’s the case, then why don’t they just blurt it out: “would you like to come up for some intercourse?”  (Or “fluid transfer,” as the John Nash character put it in the movie A Beautiful Mind?)

So here’s Pinker’s answer.  Yes, both people know why they’re going to the apartment, but they also want to avoid their knowledge becoming common knowledge.  They want plausible deniability.  There are several possible reasons: to preserve the romantic fantasy of being “swept off one’s feet.”  To provide a face-saving way to back out later, should one of them change their mind: since nothing was ever openly said, there’s no agreement to abrogate.  In fact, even if only one of the people (say A) might care about such things, if the other person (say B) thinks there’s any chance A cares, B will also have an interest in avoiding common knowledge, for A’s sake.

Put differently, the issue is that, as soon as you say X out loud, the other person doesn’t merely learn X: they learn that you know X, that you know that they know that you know X, that you want them to know you know X, and an infinity of other things that might upset the delicate epistemic balance.  Contrast that with the situation where X is left unstated: yeah, both people are pretty sure that “etchings” are just a pretext, and can even plausibly guess that the other person knows they’re pretty sure about it.  But once you start getting to 3, 4, 5, levels of indirection—who knows?  Maybe you do just want to show me some etchings.

Philosophers like to discuss Sherlock Holmes and Professor Moriarty meeting in a train station, and Moriarty declaring, “I knew you’d be here,” and Holmes replying, “well, I knew that you knew I’d be here,” and Moriarty saying, “I knew you knew I knew I’d be here,” etc.  But real humans tend to be unable to reason reliably past three or four levels in the knowledge hierarchy.  (Related to that, you might have heard of the game where everyone guesses a number between 0 and 100, and the winner is whoever’s number is the closest to 2/3 of the average of all the numbers.  If this game is played by perfectly rational people, who know they’re all perfectly rational, and know they know, etc., then they must all guess 0—exercise for you to see why.  Yet experiments show that, if you actually want to win this game against average people, you should guess about 20.  People seem to start with 50 or so, iterate the operation of multiplying by 2/3 a few times, and then stop.)
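As a toy illustration of that “iterate a few times and then stop” behavior (the numbers are made up, purely a sketch):

guess = 50.0
for level in range(5):
    print(f"{level} level(s) of 'I know that you know...': guess {guess:.1f}")
    guess = guess * 2 / 3
# Prints 50.0, 33.3, 22.2, 14.8, 9.9.  Carrying the reasoning on forever (full
# common knowledge of rationality) drives the guess to 0; stopping after two or
# three levels lands near the ~20 that tends to win against real people.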

Incidentally, do you know what I would’ve given for someone to have explained this stuff to me back in high school?  I think that a large fraction of the infamous social difficulties that nerds have, is simply down to nerds spending so much time in domains (like math and science) where the point is to struggle with every last neuron to make everything common knowledge, to make all truths as clear and explicit as possible.  Whereas in social contexts, very often you’re managing a delicate epistemic balance where you need certain things to be known, but not known to be known, and so forth—where you need to prevent common knowledge from arising, at least temporarily.  “Normal” people have an intuitive feel for this; it doesn’t need to be explained to them.  For nerds, by contrast, explaining it—in terms of the muddy children puzzle and so forth—might be exactly what’s needed.  Once they’re told the rules of a game, nerds can try playing it too!  They might even turn out to be good at it.

OK, now for a darker example of common knowledge in action.  If you read accounts of Nazi Germany, or the USSR, or North Korea or other despotic regimes today, you can easily be overwhelmed by this sense of, “so why didn’t all the sane people just rise up and overthrow the totalitarian monsters?  Surely there were more sane people than crazy, evil ones.  And probably the sane people even knew, from experience, that many of their neighbors were sane—so why this cowardice?”  Once again, it could be argued that common knowledge is the key.  Even if everyone knows the emperor is naked; indeed, even if everyone knows everyone knows he’s naked, still, if it’s not common knowledge, then anyone who says the emperor’s naked is knowingly assuming a massive personal risk.  That’s why, in the story, it took a child to shift the equilibrium.  Likewise, even if you know that 90% of the populace will join your democratic revolt provided they themselves know 90% will join it, if you can’t make your revolt’s popularity common knowledge, everyone will be stuck second-guessing each other, worried that if they revolt they’ll be an easily-crushed minority.  And because of that very worry, they’ll be correct!

(My favorite Soviet joke involves a man standing in the Moscow train station, handing out leaflets to everyone who passes by.  Eventually, of course, the KGB arrests him—but they discover to their surprise that the leaflets are just blank pieces of paper.  “What’s the meaning of this?” they demand.  “What is there to write?” replies the man.  “It’s so obvious!”  Note that this is precisely a situation where the man is trying to make common knowledge something he assumes his “readers” already know.)

The kicker is that, to prevent something from becoming common knowledge, all you need to do is censor the common-knowledge-producing mechanisms: the press, the Internet, public meetings.  This nicely explains why despots throughout history have been so obsessed with controlling the press, and also explains how it’s possible for 10% of a population to murder and enslave the other 90% (as has happened again and again in our species’ sorry history), even though the 90% could easily overwhelm the 10% by acting in concert.  Finally, it explains why believers in the Enlightenment project tend to be such fanatical absolutists about free speech—why they refuse to “balance” it against cultural sensitivity or social harmony or any other value, as so many well-meaning people urge these days.

OK, but let me try to tell you something surprising about common knowledge.  Here at SPARC, you’ve learned all about Bayes’ rule—how, if you like, you can treat “probabilities” as just made-up numbers in your head, which are required to obey the probability calculus, and then there’s a very definite rule for how to update those numbers when you gain new information.  And indeed, how an agent that wanders around constantly updating these numbers in its head, and taking whichever action maximizes its expected utility (as calculated using the numbers), is probably the leading modern conception of what it means to be “rational.”
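In code, that whole picture of a Bayesian agent fits in a few lines.  Here’s a sketch with made-up numbers (a toy weather example of my own, not anything from the SPARC curriculum): update the prior on new evidence using Bayes’ rule, then take the action with the highest expected utility under the updated numbers.

# Made-up prior, likelihood, and utilities, purely to illustrate the two steps.
prior      = {"rain": 0.3, "no rain": 0.7}
likelihood = {"rain": 0.9, "no rain": 0.2}     # Pr[I see dark clouds | weather]

# Bayes' rule: the posterior is proportional to prior times likelihood.
unnormalized = {w: prior[w] * likelihood[w] for w in prior}
total = sum(unnormalized.values())
posterior = {w: unnormalized[w] / total for w in prior}

utility = {("umbrella", "rain"): 0,     ("umbrella", "no rain"): -1,
           ("no umbrella", "rain"): -10, ("no umbrella", "no rain"): 0}
best = max(["umbrella", "no umbrella"],
           key=lambda a: sum(posterior[w] * utility[(a, w)] for w in posterior))
print(posterior, best)   # posterior ~ {rain: 0.66, no rain: 0.34}; take the umbrella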

Now imagine that you’ve got two agents, call them Alice and Bob, with common knowledge of each other’s honesty and rationality, and with the same prior probability distribution over some set of possible states of the world.  But now imagine they go out and live their lives, and have totally different experiences that lead to their learning different things, and having different posterior distributions.  But then they meet again, and they realize that their opinions about some topic (say, Hillary’s chances of winning the election) are common knowledge: they both know each other’s opinion, and they both know that they both know, and so on.  Then a striking 1976 result called Aumann’s Theorem states that their opinions must be equal.  Or, as it’s summarized: “rational agents with common priors can never agree to disagree about anything.”

Actually, before going further, let’s prove Aumann’s Theorem—since it’s one of those things that sounds like a mistake when you first hear it, and then becomes a triviality once you see the 3-line proof.  (Albeit, a “triviality” that won Aumann a Nobel in economics.)  The key idea is that knowledge induces a partition on the set of possible states of the world.  Huh?  OK, imagine someone is either an old man, an old woman, a young man, or a young woman.  You and I agree in giving each of these a 25% prior probability.  Now imagine that you find out whether they’re a man or a woman, and I find out whether they’re young or old.  This can be illustrated as follows:

[Diagram: the four possible worlds {young man, old man, young woman, old woman}; your partition groups them by gender, mine by age]

The diagram tells us, for example, that if the ground truth is “old woman,” then your knowledge is described by the set {old woman, young woman}, while my knowledge is described by the set {old woman, old man}.  And this different information leads us to different beliefs: for example, if someone asks for the probability that the person is a woman, you’ll say 100% but I’ll say 50%.  OK, but what does it mean for information to be common knowledge?  It means that I know that you know that I know that you know, and so on.  Which means that, if you want to find out what’s common knowledge between us, you need to take the least common coarsening of our knowledge partitions.  I.e., if the ground truth is some given world w, then what do I consider it possible that you consider it possible that I consider possible that … etc.?  Iterate this growth process until it stops, by “zigzagging” between our knowledge partitions, and you get the set S of worlds such that, if we’re in world w, then what’s common knowledge between us is that the world belongs to S.  Repeat for all w’s, and you get the least common coarsening of our partitions.  In the above example, the least common coarsening is trivial, with all four worlds ending up in the same set S, but there are nontrivial examples as well:

[Diagram: two knowledge partitions whose least common coarsening is nontrivial]

Now, if Alice’s expectation of a random variable X is common knowledge between her and Bob, that means that everywhere in S, her expectation must be constant … and hence must equal whatever the expectation is, over all the worlds in S!  Likewise, if Bob’s expectation is common knowledge with Alice, then everywhere in S, it must equal the expectation of X over S.  But that means that Alice’s and Bob’s expectations are the same.
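Here’s a minimal sketch of that partition machinery in Python (the four-world example is the one above; the “zigzag” is implemented with a simple union-find, and the alternative partition in the final comment is just my own made-up variation):

# Worlds linked by zigzagging between the two knowledge partitions end up in
# the same block of the least common coarsening (implemented as union-find).
def least_common_coarsening(partition_a, partition_b, worlds):
    parent = {w: w for w in worlds}
    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]   # path compression
            w = parent[w]
        return w
    def union(u, v):
        parent[find(u)] = find(v)
    for partition in (partition_a, partition_b):
        for block in partition:
            block = list(block)
            for w in block[1:]:
                union(block[0], w)           # everything in one block is linked
    coarsening = {}
    for w in worlds:
        coarsening.setdefault(find(w), set()).add(w)
    return list(coarsening.values())

worlds = ["young man", "old man", "young woman", "old woman"]
yours  = [{"young man", "old man"}, {"young woman", "old woman"}]   # you learned the gender
mine   = [{"young man", "young woman"}, {"old man", "old woman"}]   # I learned the age
print(least_common_coarsening(yours, mine, worlds))
# -> a single block containing all four worlds: as in the example above, nothing
#    nontrivial is common knowledge between us.
# (If my partition were instead [{"young man"}, {"old man"}, {"young woman", "old woman"}],
#  the coarsening would split into {"young man", "old man"} and {"young woman", "old woman"},
#  and that nontrivial S is exactly the set the proof takes expectations over.)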

There are lots of related results.  For example, rational agents with common priors, and common knowledge of each other’s rationality, should never engage in speculative trade (e.g., buying and selling stocks, assuming that they don’t need cash, they’re not earning a commission, etc.).  Why?  Basically because, if I try to sell you a stock for (say) $50, then you should reason that the very fact that I’m offering it means I must have information you don’t that it’s worth less than $50, so then you update accordingly and you don’t want it either.

Or here’s another one: suppose again that we’re Bayesians with common priors, and we’re having a conversation, where I tell you my opinion (say, of the probability Hillary will win the election).  Not any of the reasons or evidence on which the opinion is based—just the opinion itself.  Then you, being Bayesian, update your probabilities to account for what my opinion is.  Then you tell me your opinion (which might have changed after learning mine), I update on that, I tell you my new opinion, then you tell me your new opinion, and so on.  You might think this could go on forever!  But, no, Geanakoplos and Polemarchakis observed that, as long as there are only finitely many possible states of the world in our shared prior, this process must converge after finitely many steps with you and me having the same opinion (and moreover, with it being common knowledge that we have that opinion).  Why?  Because as long as our opinions differ, your telling me your opinion or me telling you mine must induce a nontrivial refinement of one of our knowledge partitions, like so:

[Diagram: one of the knowledge partitions being refined, with a knowledge set splitting after an announcement]

I.e., if you learn something new, then at least one of your knowledge sets must get split along the different possible values of the thing you learned.  But since there are only finitely many underlying states, there can only be finitely many such splittings (note that, since Bayesians never forget anything, knowledge sets that are split will never again rejoin).
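And here’s a toy simulation of that back-and-forth process (only a sketch: the prior, the partitions, and the random variable are all made up for illustration):

# A toy run of the process: Alice and Bob alternately announce their current
# expectations of X, and each announcement refines the other's partition.
def expectation(cell, prior, X):
    total = sum(prior[w] for w in cell)
    return sum(prior[w] * X[w] for w in cell) / total

def cell_of(partition, w):
    return next(c for c in partition if w in c)

def refine(partition, announced):
    # Split each knowledge set by the value the speaker would have announced there.
    refined = []
    for cell in partition:
        groups = {}
        for w in cell:
            groups.setdefault(announced(w), set()).add(w)
        refined.extend(groups.values())
    return refined

def converse(prior, X, alice, bob, true_world, max_rounds=20):
    for r in range(max_rounds):
        a = expectation(cell_of(alice, true_world), prior, X)
        # Alice's partition is common knowledge, so her announcement refines Bob's.
        bob = refine(bob, lambda w: expectation(cell_of(alice, w), prior, X))
        b = expectation(cell_of(bob, true_world), prior, X)
        print(f"round {r}: Alice says {a:.3f}, Bob says {b:.3f}")
        if abs(a - b) < 1e-12:
            return
        # Bob's reply refines Alice's partition in turn.
        alice = refine(alice, lambda w: expectation(cell_of(bob, w), prior, X))

prior = {w: 1 / 6 for w in range(6)}            # six equally likely worlds
X = {0: 1, 1: 0, 2: 0, 3: 1, 4: 1, 5: 0}        # the quantity being estimated
converse(prior, X, alice=[{0, 1, 2}, {3, 4, 5}], bob=[{0, 3}, {1, 4}, {2, 5}],
         true_world=0)
# The announced opinions differ at first, then coincide after finitely many
# rounds, as Geanakoplos and Polemarchakis proved must always happen.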

And something else: suppose your friend tells you a liberal opinion, then you take it into account, but reply with a more conservative opinion.  The friend takes your opinion into account, and replies with a revised opinion.  Question: is your friend’s new opinion likelier to be more liberal than yours, or more conservative?

Obviously, more liberal!  Yes, maybe your friend now sees some of your points and vice versa, maybe you’ve now drawn a bit closer (ideally!), but you’re not going to suddenly switch sides because of one conversation.

Yet, if you and your friend are Bayesians with common priors, one can prove that that’s not what should happen at all.  Indeed, your expectation of your own future opinion should equal your current opinion, and your expectation of your friend’s next opinion should also equal your current opinion—meaning that you shouldn’t be able to predict in which direction your opinion will change next, nor in which direction your friend will next disagree with you.  Why not?  Formally, because all these expectations are just different ways of calculating an expectation over the same set, namely your current knowledge set (i.e., the set of states of the world that you currently consider possible)!  More intuitively, we could say: if you could predict that, all else equal, the next thing you heard would probably shift your opinion in a liberal direction, then as a Bayesian you should already shift your opinion in a liberal direction right now.  (This is related to what’s called the “martingale property”: sure, a random variable X could evolve in many ways in the future, but the average of all those ways must be its current expectation E[X], by the very definition of E[X]…)
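Here’s a small numerical check of that claim, with a made-up prior and signal (just the law of total expectation at work):

prior  = {"w1": 0.2, "w2": 0.3, "w3": 0.5}    # your current knowledge set
X      = {"w1": 1.0, "w2": 0.0, "w3": 0.5}    # the quantity you're estimating
signal = {"w1": "A", "w2": "A", "w3": "B"}    # whatever you might learn next

current = sum(prior[w] * X[w] for w in prior)

# Average your possible next opinions, weighted by how likely each signal is.
future = 0.0
for s in set(signal.values()):
    cell = [w for w in prior if signal[w] == s]
    p = sum(prior[w] for w in cell)
    future += p * sum(prior[w] * X[w] for w in cell) / p
print(current, future)   # both 0.45: you can't predict which way you'll update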

So, putting all these results together, we get a clear picture of what rational disagreements should look like: they should follow unbiased random walks, until sooner or later they terminate in common knowledge of complete agreement.  We now face a bit of a puzzle, in that hardly any disagreements in the history of the world have ever looked like that.  So what gives?

There are a few ways out:

(1) You could say that the “failed prediction” of Aumann’s Theorem is no surprise, since virtually all human beings are irrational cretins, or liars (or at least, it’s not common knowledge that they aren’t). Except for you, of course: you’re perfectly rational and honest.  And if you ever met anyone else as rational and honest as you, maybe you and they could have an Aumannian conversation.  But since such a person probably doesn’t exist, you’re totally justified to stand your ground, discount all opinions that differ from yours, etc.  Notice that, even if you genuinely believed that was all there was to it, Aumann’s Theorem would still have an aspirational significance for you: you would still have to say this is the ideal that all rationalists should strive toward when they disagree.  And that would already conflict with a lot of standard rationalist wisdom.  For example, we all know that arguments from authority carry little weight: what should sway you is not the mere fact of some other person stating their opinion, but the actual arguments and evidence that they’re able to bring.  Except that as we’ve seen, for Bayesians with common priors this isn’t true at all!  Instead, merely hearing your friend’s opinion serves as a powerful summary of what your friend knows.  And if you learn that your rational friend disagrees with you, then even without knowing why, you should take that as seriously as if you discovered a contradiction in your own thought processes.  This is related to an even broader point: there’s a normative rule of rationality that you should judge ideas only on their merits—yet if you’re a Bayesian, of course you’re going to take into account where the ideas come from, and how many other people hold them!  Likewise, if you’re a Bayesian police officer or a Bayesian airport screener or a Bayesian job interviewer, of course you’re going to profile people by their superficial characteristics, however unfair that might be to individuals—so all those studies proving that people evaluate the same resume differently if you change the name at the top are no great surprise.  It seems to me that the tension between these two different views of rationality, the normative and the Bayesian, generates a lot of the most intractable debates of the modern world.

(2) Or—and this is an obvious one—you could reject the assumption of common priors. After all, isn’t a major selling point of Bayesianism supposed to be its subjective aspect, the fact that you pick “whichever prior feels right for you,” and are constrained only in how to update that prior?  If Alice’s and Bob’s priors can be different, then all the reasoning I went through earlier collapses.  So rejecting common priors might seem appealing.  But there’s a paper by Tyler Cowen and Robin Hanson called “Are Disagreements Honest?”—one of the most worldview-destabilizing papers I’ve ever read—that calls that strategy into question.  What it says, basically, is this: if you’re really a thoroughgoing Bayesian rationalist, then your prior ought to allow for the possibility that you are the other person.  Or to put it another way: “you being born as you,” rather than as someone else, should be treated as just one more contingent fact that you observe and then conditionalize on!  And likewise, the other person should condition on the observation that they’re them and not you.  In this way, absolutely everything that makes you different from someone else can be understood as “differing information,” so we’re right back to the situation covered by Aumann’s Theorem.  Imagine, if you like, that we all started out behind some Rawlsian veil of ignorance, as pure reasoning minds that had yet to be assigned specific bodies.  In that original state, there was nothing to differentiate any of us from any other—anything that did would just be information to condition on—so we all should’ve had the same prior.  That might sound fanciful, but in some sense all it’s saying is: what licenses you to privilege an observation just because it’s your eyes that made it, or a thought just because it happened to occur in your head?  Like, if you’re objectively smarter or more observant than everyone else around you, fine, but to whatever extent you agree that you aren’t, your opinion gets no special epistemic protection just because it’s yours.

(3) If you’re uncomfortable with this tendency of Bayesian reasoning to refuse to be confined anywhere, to want to expand to cosmic or metaphysical scope (“I need to condition on having been born as me and not someone else”)—well then, you could reject the entire framework of Bayesianism, as your notion of rationality. Lest I be cast out from this camp as a heretic, I hasten to say: I include this option only for the sake of completeness!

(4) When I first learned about this stuff 12 years ago, it seemed obvious to me that a lot of it could be dismissed as irrelevant to the real world for reasons of complexity. I.e., sure, it might apply to ideal reasoners with unlimited time and computational power, but as soon as you impose realistic constraints, this whole Aumannian house of cards should collapse.  As an example, if Alice and Bob have common priors, then sure they’ll agree about everything if they effectively share all their information with each other!  But in practice, we don’t have time to “mind-meld,” swapping our entire life experiences with anyone we meet.  So one could conjecture that agreement, in general, requires a lot of communication.  So then I sat down and tried to prove that as a theorem.  And you know what I found?  That my intuition here wasn’t even close to correct!

In more detail, I proved the following theorem.  Suppose Alice and Bob are Bayesians with shared priors, and suppose they’re arguing about (say) the probability of some future event—or more generally, about any random variable X bounded in [0,1].  So, they have a conversation where Alice first announces her expectation of X, then Bob announces his new expectation, and so on.  The theorem says that Alice’s and Bob’s estimates of X will necessarily agree to within ±ε, with probability at least 1-δ over their shared prior, after they’ve exchanged only O(1/(δε²)) messages.  Note that this bound is completely independent of how much knowledge they have; it depends only on the accuracy with which they want to agree!  Furthermore, the same bound holds even if Alice and Bob only send a few discrete bits about their real-valued expectations with each message, rather than the expectations themselves.

The proof involves the idea that Alice and Bob’s estimates of X, call them X_A and X_B respectively, follow “unbiased random walks” (or more formally, are martingales).  Very roughly, if |X_A-X_B|≥ε with high probability over Alice and Bob’s shared prior, then that fact implies that the next message has a high probability (again, over the shared prior) of causing either X_A or X_B to jump up or down by about ε.  But X_A and X_B, being estimates of X, are bounded between 0 and 1.  So a random walk with a step size of ε can only continue for about 1/ε² steps before it hits one of the “absorbing barriers.”

The way to formalize this is to look at the variances, Var[X_A] and Var[X_B], with respect to the shared prior.  Because Alice and Bob’s partitions keep getting refined, the variances are monotonically non-decreasing.  They start out at 0 and can never exceed 1 (in fact they can never exceed 1/4, but let’s not worry about constants).  Now, the key lemma is that, if Pr[|X_A-X_B|≥ε]≥δ, then Var[X_B] must increase by at least δε² if Alice sends X_A to Bob, and Var[X_A] must increase by at least δε² if Bob sends X_B to Alice.  You can see my paper for the proof, or just work it out for yourself.  At any rate, the lemma implies that, after O(1/(δε²)) rounds of communication, there must be at least a temporary break in the disagreement; there must be some round where Alice and Bob approximately agree with high probability.

There are lots of other results in my paper, including an upper bound on the number of calls that Alice and Bob need to make to a “sampling oracle” to carry out this sort of protocol approximately, assuming they’re not perfect Bayesians but agents with bounded computational power.  But let me step back and address the broader question: what should we make of all this?  How should we live with the gargantuan chasm between the prediction of Bayesian rationality for how we should disagree, and the actual facts of how we do disagree?

We could simply declare that human beings are not well-modeled as Bayesians with common priors—that we’ve failed in giving a descriptive account of human behavior—and leave it at that.   OK, but that would still leave the question: does this stuff have normative value?  Should it affect how we behave, if we want to consider ourselves honest and rational?  I would argue, possibly yes.

Yes, you should constantly ask yourself the question: “would I still be defending this opinion, if I had been born as someone else?”  (Though you might say this insight predates Aumann by quite a bit, going back at least to Spinoza.)

Yes, if someone you respect as honest and rational disagrees with you, you should take it as seriously as if the disagreement were between two different aspects of yourself.

Finally, yes, we can try to judge epistemic communities by how closely they approach the Aumannian ideal.  In math and science, in my experience, it’s common to see two people furiously arguing with each other at a blackboard.  Come back five minutes later, and they’re arguing even more furiously, but now their positions have switched.  As we’ve seen, that’s precisely what the math says a rational conversation should look like.  In social and political discussions, though, usually the very best you’ll see is that two people start out diametrically opposed, but eventually one of them says “fine, I’ll grant you this,” and the other says “fine, I’ll grant you that.”  We might say, that’s certainly better than the common alternative, of the two people walking away even more polarized than before!  Yet the math tells us that even the first case—even the two people gradually getting closer in their views—is nothing at all like a rational exchange, which would involve the two participants repeatedly leapfrogging each other, completely changing their opinion about the question under discussion (and then changing back, and back again) every time they learned something new.  The first case, you might say, is more like haggling—more like “I’ll grant you that X is true if you grant me that Y is true”—than like our ideal friendly mathematicians arguing at the blackboard, whose acceptance of new truths is never slow or grudging, never conditional on the other person first agreeing with them about something else.

Armed with this understanding, we could try to rank fields by how hard it is to have an Aumannian conversation in them.  At the bottom—the easiest!—is math (or, let’s say, chess, or debugging a program, or fact-heavy fields like lexicography or geography).  Crucially, here I only mean the parts of these subjects with agreed-on rules and definite answers: once the conversation turns to whose theorems are deeper, or whose fault the bug was, things can get arbitrarily non-Aumannian.  Then there’s the type of science that involves messy correlational studies (I just mean, talking about what’s a risk factor for what, not the political implications).  Then there’s politics and aesthetics, with the most radioactive topics like Israel/Palestine higher up.  And then, at the very peak, there’s gender and social justice debates, where everyone brings their formative experiences along, and absolutely no one is a disinterested truth-seeker, and possibly no Aumannian conversation has ever been had in the history of the world.

I would urge that even at the very top, it’s still incumbent on all of us to try to make the Aumannian move, of “what would I think about this issue if I were someone else and not me?  If I were a man, a woman, black, white, gay, straight, a nerd, a jock?  How much of my thinking about this represents pure Spinozist reason, which could be ported to any rational mind, and how much of it would get lost in translation?”

Anyway, I’m sure some people would argue that, in the end, the whole framework of Bayesian agents, common priors, common knowledge, etc. can be chucked from this discussion like so much scaffolding, and the moral lessons I want to draw boil down to trite advice (“try to see the other person’s point of view”) that you all knew already.  Then again, even if you all knew all this, maybe you didn’t know that you all knew it!  So I hope you gained some new information from this talk in any case.  Thanks.


Update: Coincidentally, there’s a moving NYT piece by Oliver Sacks, which (among other things) recounts his experiences with his cousin, the Aumann of Aumann’s theorem.


Another Update: If I ever did attempt an Aumannian conversation with someone, the other Scott A. would be a candidate! Here he is in 2011 making several of the same points I did above, using the same examples (I thank him for pointing me to his post).

FCRC Highlights

Friday, June 19th, 2015

By popular request, here are some highlights from this week’s FCRC conference in Portland, Oregon:

  • The edit distance between two strings means the minimum number of insertions, deletions, and replacements needed to convert one string to the other: for example, SHTETL and SHETLAND have an edit distance of 4.  Edit distance has major, obvious applications to DNA sequence comparison, as well as plagiarism detection and many other things.  There’s a clever dynamic programming algorithm to compute the edit distance between two n-bit strings (sketched in code just after this list), but it takes ~n² time, which is already too slow for many applications.  Can you do better?  I remember wondering about that 15 years ago, as a beginning grad student taking Richard Karp’s computational biology course.  Now Arturs Backurs and Piotr Indyk have shown that, if you can compute edit distance in O(n^(2-ε)) time for any ε>0, then you can also solve CNF-SAT in 2^(cn) time for some c<1, thereby refuting the Strong Exponential Time Hypothesis.  For more about this important result, see this MIT News article.
  • Olivier Temam gave a superb keynote talk about hardware neural networks.  His main points were these: implementing neural nets with special-purpose hardware was a faddish idea a few decades ago, but was abandoned once people realized that (a) it didn’t work that great, and (b) more to the point, anything you could do with special-purpose hardware, you could do better and more easily with silicon chips, after waiting just a few years for Moore’s Law to catch up.  Today, however, two things have spurred a revival of the idea: firstly, neural nets (renamed “deep learning,” and done with bigger networks and way more training data) are delivering spectacular, state-of-the-art results; and second, transistors have stopped shrinking, so it now makes more sense to think about the few orders-of-magnitude speed improvement that you can get from special-purpose hardware.  This would mean organizing computers kind-of, sort-of like the brain is organized, with (for example) memory integrated into the connections between the “neurons” (processing elements), rather than on a separate chip that’s connected to the processor by a bus.  On the other hand, Temam also stressed that computer architects shouldn’t slavishly copy the brain: instead, they should simply build the fastest hardware they can to implement the best available machine-learning algorithms, and they should rely on the machine-learning theorists to incorporate whatever broad lessons are to be gleaned from neuroscience (as they’ve done several times in the past).
  • Three separate sets of authors (Koppula, Lewko, and Waters; Canetti, Holmgren, Jain, and Vaikuntanathan; and Bitansky, Garg, Lin, Pass, and Telang) independently wrote papers that showed how to achieve “indistinguishability obfuscation” (i.o.) for Turing machines rather than for circuits.  For those not in the world of theoretical crypto, i.o. is a hot concept that basically means: obfuscating a program in such a way that no adversary can figure out anything about which program you started with, among all the possible programs that compute the same function in roughly the same amount of time.  (On the other hand, the adversary might be able to learn more than she could if merely given a black box for the function.  And that’s why this kind of obfuscation falls short of the “gold standard,” which was shown to be impossible in general in seminal work by Barak et al.)  Recent papers have shown how to achieve the weaker notion of i.o., but they first require converting your program to a Boolean circuit—something that’s absurdly inefficient in practice, and also has the theoretical drawback of producing an obfuscated program whose size grows, not merely with the size of the original, unobfuscated program, but also with the amount of time the original program is supposed to run for. So, the new work gets around that drawback, by cleverly obfuscating a program whose purpose is to compute the “next step function” of the original program, on data that’s itself encrypted. The talk was delivered in “tag team” format, with one representative from each group of authors speaking for 6-7 minutes. Surprisingly, it worked extremely well.
  • Laci Babai gave a truly epic hour-long Knuth Prize lecture, basically trying to summarize all of his work over the past 35 years (and related work by others), in 170 or so slides.  The talk had not a single word of filler: it was just pure beef, result after result, some of them well-known and seminal (e.g., MIP=NEXP, AM[2]=AM[k], AlmostNP=MA, group membership in NP, group non-membership in AM…) and others obscure little gems.  Boaz Barak commented that an entire semester-long course could be taught from the PowerPoint slides. Laci ended the talk by defining the Babai point, and then saying “having made my point, I’m done.”
  • Ambainis (yes, the same Ambainis), Filmus and Le Gall had a paper about the limitations of the techniques used to achieve all matrix multiplication algorithms from Coppersmith-Winograd (O(n^2.3755)) onward, including those of Stothers 2010 (O(n^2.3730)), Vassilevska Williams 2012 (O(n^2.3728642)), and Le Gall 2014 (O(n^2.3728639)).  Their basic conclusion—not surprising, but still nice to establish formally—is that applying more and more massive computer search to the current ideas can’t possibly get you below O(n^2.308); new ideas will be needed to push further.
  • At the STOC business meeting, there was a long discussion about the proposal to turn STOC into a weeklong “theory festival,” with more plenary talks (including from other fields), possibly more parallel sessions, etc. There were lots of interesting arguments, but alas, I was too tired and jetlagged to remember what they were. (Anyone who does remember is welcome to chime in.)
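As promised above, here’s a sketch of the classic quadratic-time dynamic program for edit distance (the textbook Wagner-Fischer recurrence whose optimality the Backurs-Indyk result speaks to, not anything from their paper):

# The O(n*m) dynamic program for edit distance (insertions, deletions, replacements).
def edit_distance(s, t):
    m, n = len(s), len(t)
    # dist[i][j] = edit distance between the prefixes s[:i] and t[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i                          # delete all of s[:i]
    for j in range(n + 1):
        dist[0][j] = j                          # insert all of t[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same = 0 if s[i - 1] == t[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,          # delete s[i-1]
                             dist[i][j - 1] + 1,          # insert t[j-1]
                             dist[i - 1][j - 1] + same)   # replace (or match)
    return dist[m][n]

print(edit_distance("SHTETL", "SHETLAND"))   # -> 4, as in the example above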

There are many great things that I haven’t written about—for example, I haven’t even said a word about any of the three best paper talks!—but I’m out of energy right now.  Others are more than welcome to share other FCRC highlights in the comments section.

Happy Second Birthday Lily

Wednesday, January 21st, 2015

[Photo: a cat]

Two years ago, I blogged when Lily was born.  Today I can blog that she runs, climbs, swims (sort of), constructs 3-word sentences, demands chocolate cake, counts to 10 in both English and Hebrew, and knows colors, letters, shapes, animals, friends, relatives, the sun, and the moon.  To all external appearances she’s now as conscious as you and I are (and considerably more so than the cat in the photo).

But the most impressive thing Lily does—the thing that puts her far beyond where her parents were at the same age, in a few areas—is her use of the iPad.  There she does phonics exercises, plays puzzle games that aren’t always trivial for me to win, and watches educational videos on YouTube (skipping past the ads, and complaining if the Internet connection goes down).  She chooses the apps and videos herself, easily switching between them when she gets bored.  It’s a sight to behold, and definitely something to try with your own toddler if you have one.  (There’s a movement these days that encourages parents to ban kids from using touch-screen devices, fearful that too much screen time will distract them from the real world.  To which I reply: for better or worse, this is the real world that our kids will grow up into.)

People often ask whether Dana and I will steer Lily into becoming a theoretical computer scientist like us.  My answer is “hell no”: I’ll support Lily in whatever she wants to do, whether that means logic, combinatorics, algebraic geometry, or even something further afield like theoretical neuroscience or physics.

As recent events illustrated, the world is not always the kindest place for nerds (male or female), with our normal ways of thinking, talking, and interacting sometimes misunderstood by others in the cruelest ways imaginable.  Yet despite everything, nerds do sometimes manage to meet, get married, and even produce offspring with nerd potential of their own.  We’re here, we’re sometimes inappropriately clear, and we’re not going anywhere.

So to life!  And happy birthday Lily!

BQP/LHC collision

Thursday, January 15th, 2015

[Photo: the CMS detector]

This afternoon, I gave my usual spiel about Quantum Computing and the Limits of the Efficiently Computable at the CERN Colloquium.  (If you watched the webcast of the Higgs boson discovery announcement a couple years ago, it was in the same auditorium they used for that, except this time it was less packed.)  Beforehand, Dana and I got to join a tour of the CMS detector at the Large Hadron Collider—one of the very last tours, before CMS shuts down (as ATLAS already has) to get ready for collisions at the LHC’s new, higher energy.

Considered as eye candy, I’d say that the CMS detector holds its own against the Taj Mahal, Machu Picchu, the Great Wall of China, or any of the other engineering marvels of the world.  So, OK, let me describe what it’s like to visit it.  The first step is to take a tram from downtown Geneva to CERN, which is headquartered in the town of Meyrin.  This is easier than you’d imagine: a tram actually arrives in Geneva every few minutes with “CERN” (its final stop) written right on it!  Next you take a 20-minute bus ride from the CERN reception hall to the CMS building, which is across the French border.  You don’t really think about it until you’re here, but:

(a) The Large Hadron Collider is large—it’s, like, a whole drive through the countryside to get from the main CERN buildings to CMS.

(b) Everything inside the LHC ring is just a normal rural/suburban area, with restaurants, roads, gas stations, cows, etc.

Anyway, then you arrive at CMS, which looks from the outside like just a big warehouse-type building.

[Photo: the CMS building from outside]

And you go inside, wondering if now you’re going to see the detector.  But no, there’s just a giant tarp hanging from the ceiling with a picture of the detector on it.  Maybe this tour won’t include the detector?

[Photo: the tarp with a picture of the detector on it]

But then you go outside, back in through some back entrance, then into a staging area where you get hard hats to wear.  Then you get into an elevator that goes about 150 feet down.  Meanwhile, your tour guide is carrying a Geiger counter to make sure you’re not exposed to too much radiation.  Now will you see the detector?  No, just a bunch of dark corridors.  You pass through a room full of computers on racks—cool, this must be where they analyze the collision data!  (Actually, according to Panflutist in the comments section, these computers are only for control and for the trigger system, which decides which events to store for later analysis.)

[Photo: racks of computers]

Then, after that room, there’s a door with a sign indicating that beyond it is the LHC ring.  Cool!

[Photo: the door leading to the LHC ring]

Of course, you’re not actually going into the ring.  But then you turn a different way, and emerge onto a platform where you get the “big reveal”: the detector, two giant circular pieces that obviously screw together but are now separated, and engineers making final tweaks to them before they’re reunited for the next collider run.  (I forgot to mention: the whole tour is being conducted in French.  That’s why you sort of need to guess what’s happening.)

Anyway, thanks so much to Wolfgang Lerche and everyone else at CERN for an awesome visit.