Archive for the ‘Adventures in Meatspace’ Category

Common Knowledge and Aumann’s Agreement Theorem

Sunday, August 16th, 2015

The following is the prepared version of a talk that I gave at SPARC: a high-school summer program about applied rationality held in Berkeley, CA for the past two weeks.  I had a wonderful time in Berkeley, meeting new friends and old, but I’m now leaving to visit the CQT in Singapore, and then to attend the AQIS conference in Seoul.

Common Knowledge and Aumann’s Agreement Theorem

August 14, 2015

Thank you so much for inviting me here!  I honestly don’t know whether it’s possible to teach applied rationality, the way this camp is trying to do.  What I know is that, if it is possible, then the people running SPARC are some of the awesomest people on earth to figure out how.  I’m incredibly proud that Chelsea Voss and Paul Christiano are both former students of mine, and I’m amazed by the program they and the others have put together here.  I hope you’re all having fun—or maximizing your utility functions, or whatever.

My research is mostly about quantum computing, and more broadly, computation and physics.  But I was asked to talk about something you can actually use in your lives, so I want to tell a different story, involving common knowledge.

I’ll start with the “Muddy Children Puzzle,” which is one of the greatest logic puzzles ever invented.  How many of you have seen this one?

OK, so the way it goes is, there are a hundred children playing in the mud.  Naturally, they all have muddy foreheads.  At some point their teacher comes along and says to them, as they all sit around in a circle: “stand up if you know your forehead is muddy.”  No one stands up.  For how could they know?  Each kid can see all the other 99 kids’ foreheads, so knows that they’re muddy, but can’t see his or her own forehead.  (We’ll assume that there are no mirrors or camera phones nearby, and also that this is mud that you don’t feel when it’s on your forehead.)

So the teacher tries again.  “Knowing that no one stood up the last time, now stand up if you know your forehead is muddy.”  Still no one stands up.  Why would they?  No matter how many times the teacher repeats the request, still no one stands up.

Then the teacher tries something new.  “Look, I hereby announce that at least one of you has a muddy forehead.”  After that announcement, the teacher again says, “stand up if you know your forehead is muddy”—and again no one stands up.  And again and again; it continues 99 times.  But then the hundredth time, all the children suddenly stand up.

(There’s a variant of the puzzle involving blue-eyed islanders who all suddenly commit suicide on the hundredth day, when they all learn that their eyes are blue—but as a blue-eyed person myself, that’s always struck me as needlessly macabre.)

What’s going on here?  Somehow, the teacher’s announcing to the children that at least one of them had a muddy forehead set something dramatic in motion, which would eventually make them all stand up—but how could that announcement possibly have made any difference?  After all, each child already knew that at least 99 children had muddy foreheads!

Like with many puzzles, the way to get intuition is to change the numbers.  So suppose there were two children with muddy foreheads, and the teacher announced to them that at least one had a muddy forehead, and then asked both of them whether their own forehead was muddy.  Neither would know.  But each child could reason as follows: “if my forehead weren’t muddy, then the other child would’ve seen that, and would also have known that at least one of us has a muddy forehead.  Therefore she would’ve known, when asked, that her own forehead was muddy.  Since she didn’t know, that means my forehead is muddy.”  So then both children know their foreheads are muddy, when the teacher asks a second time.

Now, this argument can be generalized to any (finite) number of children.  The crucial concept here is common knowledge.  We call a fact “common knowledge” if, not only does everyone know it, but everyone knows everyone knows it, and everyone knows everyone knows everyone knows it, and so on.  It’s true that in the beginning, each child knew that all the other children had muddy foreheads, but it wasn’t common knowledge that even one of them had a muddy forehead.  For example, if your forehead and mine are both muddy, then I know that at least one of us has a muddy forehead, and you know that too, but you don’t know that I know it (for what if your forehead were clean?), and I don’t know that you know it (for what if my forehead were clean?).

What the teacher’s announcement did, was to make it common knowledge that at least one child has a muddy forehead (since not only did everyone hear the announcement, but everyone witnessed everyone else hearing it, etc.).  And once you understand that point, it’s easy to argue by induction: after the teacher asks and no child stands up (and everyone sees that no one stood up), it becomes common knowledge that at least two children have muddy foreheads (since if only one child had had a muddy forehead, that child would’ve known it and stood up).  Next it becomes common knowledge that at least three children have muddy foreheads, and so on, until after a hundred rounds it’s common knowledge that everyone’s forehead is muddy, so everyone stands up.

The moral is that the mere act of saying something publicly can change the world—even if everything you said was already obvious to every last one of your listeners.  For it’s possible that, until your announcement, not everyone knew that everyone knew the thing, or knew everyone knew everyone knew it, etc., and that could have prevented them from acting.

This idea turns out to have huge real-life consequences, to situations way beyond children with muddy foreheads.  I mean, it also applies to children with dots on their foreheads, or “kick me” signs on their backs…

But seriously, let me give you an example I stole from Steven Pinker, from his wonderful book The Stuff of Thought.  Two people of indeterminate gender—let’s not make any assumptions here—go on a date.  Afterward, one of them says to the other: “Would you like to come up to my apartment to see my etchings?”  The other says, “Sure, I’d love to see them.”

This is such a cliché that we might not even notice the deep paradox here.  It’s like with life itself: people knew for thousands of years that every bird has the right kind of beak for its environment, but not until Darwin and Wallace could anyone articulate why (and only a few people before them even recognized there was a question there that called for a non-circular answer).

In our case, the puzzle is this: both people on the date know perfectly well that the reason they’re going up to the apartment has nothing to do with etchings.  They probably even both know the other knows that.  But if that’s the case, then why don’t they just blurt it out: “would you like to come up for some intercourse?”  (Or “fluid transfer,” as the John Nash character put it in the Beautiful Mind movie?)

So here’s Pinker’s answer.  Yes, both people know why they’re going to the apartment, but they also want to avoid their knowledge becoming common knowledge.  They want plausible deniability.  There are several possible reasons: to preserve the romantic fantasy of being “swept off one’s feet.”  To provide a face-saving way to back out later, should one of them change their mind: since nothing was ever openly said, there’s no agreement to abrogate.  In fact, even if only one of the people (say A) might care about such things, if the other person (say B) thinks there’s any chance A cares, B will also have an interest in avoiding common knowledge, for A’s sake.

Put differently, the issue is that, as soon as you say X out loud, the other person doesn’t merely learn X: they learn that you know X, that you know that they know that you know X, that you want them to know you know X, and an infinity of other things that might upset the delicate epistemic balance.  Contrast that with the situation where X is left unstated: yeah, both people are pretty sure that “etchings” are just a pretext, and can even plausibly guess that the other person knows they’re pretty sure about it.  But once you start getting to 3, 4, 5, levels of indirection—who knows?  Maybe you do just want to show me some etchings.

Philosophers like to discuss Sherlock Holmes and Professor Moriarty meeting in a train station, and Moriarty declaring, “I knew you’d be here,” and Holmes replying, “well, I knew that you knew I’d be here,” and Moriarty saying, “I knew you knew I knew I’d be here,” etc.  But real humans tend to be unable to reason reliably past three or four levels in the knowledge hierarchy.  (Related to that, you might have heard of the game where everyone guesses a number between 0 and 100, and the winner is whoever’s number is the closest to 2/3 of the average of all the numbers.  If this game is played by perfectly rational people, who know they’re all perfectly rational, and know they know, etc., then they must all guess 0—exercise for you to see why.  Yet experiments show that, if you actually want to win this game against average people, you should guess about 20.  People seem to start with 50 or so, iterate the operation of multiplying by 2/3 a few times, and then stop.)

Incidentally, do you know what I would’ve given for someone to have explained this stuff to me back in high school?  I think that a large fraction of the infamous social difficulties that nerds have, is simply down to nerds spending so much time in domains (like math and science) where the point is to struggle with every last neuron to make everything common knowledge, to make all truths as clear and explicit as possible.  Whereas in social contexts, very often you’re managing a delicate epistemic balance where you need certain things to be known, but not known to be known, and so forth—where you need to prevent common knowledge from arising, at least temporarily.  “Normal” people have an intuitive feel for this; it doesn’t need to be explained to them.  For nerds, by contrast, explaining it—in terms of the muddy children puzzle and so forth—might be exactly what’s needed.  Once they’re told the rules of a game, nerds can try playing it too!  They might even turn out to be good at it.

OK, now for a darker example of common knowledge in action.  If you read accounts of Nazi Germany, or the USSR, or North Korea or other despotic regimes today, you can easily be overwhelmed by this sense of, “so why didn’t all the sane people just rise up and overthrow the totalitarian monsters?  Surely there were more sane people than crazy, evil ones.  And probably the sane people even knew, from experience, that many of their neighbors were sane—so why this cowardice?”  Once again, it could be argued that common knowledge is the key.  Even if everyone knows the emperor is naked; indeed, even if everyone knows everyone knows he’s naked, still, if it’s not common knowledge, then anyone who says the emperor’s naked is knowingly assuming a massive personal risk.  That’s why, in the story, it took a child to shift the equilibrium.  Likewise, even if you know that 90% of the populace will join your democratic revolt provided they themselves know 90% will join it, if you can’t make your revolt’s popularity common knowledge, everyone will be stuck second-guessing each other, worried that if they revolt they’ll be an easily-crushed minority.  And because of that very worry, they’ll be correct!

(My favorite Soviet joke involves a man standing in the Moscow train station, handing out leaflets to everyone who passes by.  Eventually, of course, the KGB arrests him—but they discover to their surprise that the leaflets are just blank pieces of paper.  “What’s the meaning of this?” they demand.  “What is there to write?” replies the man.  “It’s so obvious!”  Note that this is precisely a situation where the man is trying to make common knowledge something he assumes his “readers” already know.)

The kicker is that, to prevent something from becoming common knowledge, all you need to do is censor the common-knowledge-producing mechanisms: the press, the Internet, public meetings.  This nicely explains why despots throughout history have been so obsessed with controlling the press, and also explains how it’s possible for 10% of a population to murder and enslave the other 90% (as has happened again and again in our species’ sorry history), even though the 90% could easily overwhelm the 10% by acting in concert.  Finally, it explains why believers in the Enlightenment project tend to be such fanatical absolutists about free speech—why they refuse to “balance” it against cultural sensitivity or social harmony or any other value, as so many well-meaning people urge these days.

OK, but let me try to tell you something surprising about common knowledge.  Here at SPARC, you’ve learned all about Bayes’ rule—how, if you like, you can treat “probabilities” as just made-up numbers in your head, which are required obey the probability calculus, and then there’s a very definite rule for how to update those numbers when you gain new information.  And indeed, how an agent that wanders around constantly updating these numbers in its head, and taking whichever action maximizes its expected utility (as calculated using the numbers), is probably the leading modern conception of what it means to be “rational.”

Now imagine that you’ve got two agents, call them Alice and Bob, with common knowledge of each other’s honesty and rationality, and with the same prior probability distribution over some set of possible states of the world.  But now imagine they go out and live their lives, and have totally different experiences that lead to their learning different things, and having different posterior distributions.  But then they meet again, and they realize that their opinions about some topic (say, Hillary’s chances of winning the election) are common knowledge: they both know each other’s opinion, and they both know that they both know, and so on.  Then a striking 1976 result called Aumann’s Theorem states that their opinions must be equal.  Or, as it’s summarized: “rational agents with common priors can never agree to disagree about anything.”

Actually, before going further, let’s prove Aumann’s Theorem—since it’s one of those things that sounds like a mistake when you first hear it, and then becomes a triviality once you see the 3-line proof.  (Albeit, a “triviality” that won Aumann a Nobel in economics.)  The key idea is that knowledge induces a partition on the set of possible states of the world.  Huh?  OK, imagine someone is either an old man, an old woman, a young man, or a young woman.  You and I agree in giving each of these a 25% prior probability.  Now imagine that you find out whether they’re a man or a woman, and I find out whether they’re young or old.  This can be illustrated as follows:


The diagram tells us, for example, that if the ground truth is “old woman,” then your knowledge is described by the set {old woman, young woman}, while my knowledge is described by the set {old woman, old man}.  And this different information leads us to different beliefs: for example, if someone asks for the probability that the person is a woman, you’ll say 100% but I’ll say 50%.  OK, but what does it mean for information to be common knowledge?  It means that I know that you know that I know that you know, and so on.  Which means that, if you want to find out what’s common knowledge between us, you need to take the least common coarsening of our knowledge partitions.  I.e., if the ground truth is some given world w, then what do I consider it possible that you consider it possible that I consider possible that … etc.?  Iterate this growth process until it stops, by “zigzagging” between our knowledge partitions, and you get the set S of worlds such that, if we’re in world w, then what’s common knowledge between us is that the world belongs to S.  Repeat for all w’s, and you get the least common coarsening of our partitions.  In the above example, the least common coarsening is trivial, with all four worlds ending up in the same set S, but there are nontrivial examples as well:


Now, if Alice’s expectation of a random variable X is common knowledge between her and Bob, that means that everywhere in S, her expectation must be constant … and hence must equal whatever the expectation is, over all the worlds in S!  Likewise, if Bob’s expectation is common knowledge with Alice, then everywhere in S, it must equal the expectation of X over S.  But that means that Alice’s and Bob’s expectations are the same.

There are lots of related results.  For example, rational agents with common priors, and common knowledge of each other’s rationality, should never engage in speculative trade (e.g., buying and selling stocks, assuming that they don’t need cash, they’re not earning a commission, etc.).  Why?  Basically because, if I try to sell you a stock for (say) $50, then you should reason that the very fact that I’m offering it means I must have information you don’t that it’s worth less than $50, so then you update accordingly and you don’t want it either.

Or here’s another one: suppose again that we’re Bayesians with common priors, and we’re having a conversation, where I tell you my opinion (say, of the probability Hillary will win the election).  Not any of the reasons or evidence on which the opinion is based—just the opinion itself.  Then you, being Bayesian, update your probabilities to account for what my opinion is.  Then you tell me your opinion (which might have changed after learning mine), I update on that, I tell you my new opinion, then you tell me your new opinion, and so on.  You might think this could go on forever!  But, no, Geanakoplos and Polemarchakis observed that, as long as there are only finitely many possible states of the world in our shared prior, this process must converge after finitely many steps with you and me having the same opinion (and moreover, with it being common knowledge that we have that opinion).  Why?  Because as long as our opinions differ, your telling me your opinion or me telling you mine must induce a nontrivial refinement of one of our knowledge partitions, like so:


I.e., if you learn something new, then at least one of your knowledge sets must get split along the different possible values of the thing you learned.  But since there are only finitely many underlying states, there can only be finitely many such splittings (note that, since Bayesians never forget anything, knowledge sets that are split will never again rejoin).

And something else: suppose your friend tells you a liberal opinion, then you take it into account, but reply with a more conservative opinion.  The friend takes your opinion into account, and replies with a revised opinion.  Question: is your friend’s new opinion likelier to be more liberal than yours, or more conservative?

Obviously, more liberal!  Yes, maybe your friend now sees some of your points and vice versa, maybe you’ve now drawn a bit closer (ideally!), but you’re not going to suddenly switch sides because of one conversation.

Yet, if you and your friend are Bayesians with common priors, one can prove that that’s not what should happen at all.  Indeed, your expectation of your own future opinion should equal your current opinion, and your expectation of your friend’s next opinion should also equal your current opinion—meaning that you shouldn’t be able to predict in which direction your opinion will change next, nor in which direction your friend will next disagree with you.  Why not?  Formally, because all these expectations are just different ways of calculating an expectation over the same set, namely your current knowledge set (i.e., the set of states of the world that you currently consider possible)!  More intuitively, we could say: if you could predict that, all else equal, the next thing you heard would probably shift your opinion in a liberal direction, then as a Bayesian you should already shift your opinion in a liberal direction right now.  (This is related to what’s called the “martingale property”: sure, a random variable X could evolve in many ways in the future, but the average of all those ways must be its current expectation E[X], by the very definition of E[X]…)

So, putting all these results together, we get a clear picture of what rational disagreements should look like: they should follow unbiased random walks, until sooner or later they terminate in common knowledge of complete agreement.  We now face a bit of a puzzle, in that hardly any disagreements in the history of the world have ever looked like that.  So what gives?

There are a few ways out:

(1) You could say that the “failed prediction” of Aumann’s Theorem is no surprise, since virtually all human beings are irrational cretins, or liars (or at least, it’s not common knowledge that they aren’t). Except for you, of course: you’re perfectly rational and honest.  And if you ever met anyone else as rational and honest as you, maybe you and they could have an Aumannian conversation.  But since such a person probably doesn’t exist, you’re totally justified to stand your ground, discount all opinions that differ from yours, etc.  Notice that, even if you genuinely believed that was all there was to it, Aumann’s Theorem would still have an aspirational significance for you: you would still have to say this is the ideal that all rationalists should strive toward when they disagree.  And that would already conflict with a lot of standard rationalist wisdom.  For example, we all know that arguments from authority carry little weight: what should sway you is not the mere fact of some other person stating their opinion, but the actual arguments and evidence that they’re able to bring.  Except that as we’ve seen, for Bayesians with common priors this isn’t true at all!  Instead, merely hearing your friend’s opinion serves as a powerful summary of what your friend knows.  And if you learn that your rational friend disagrees with you, then even without knowing why, you should take that as seriously as if you discovered a contradiction in your own thought processes.  This is related to an even broader point: there’s a normative rule of rationality that you should judge ideas only on their merits—yet if you’re a Bayesian, of course you’re going to take into account where the ideas come from, and how many other people hold them!  Likewise, if you’re a Bayesian police officer or a Bayesian airport screener or a Bayesian job interviewer, of course you’re going to profile people by their superficial characteristics, however unfair that might be to individuals—so all those studies proving that people evaluate the same resume differently if you change the name at the top are no great surprise.  It seems to me that the tension between these two different views of rationality, the normative and the Bayesian, generates a lot of the most intractable debates of the modern world.

(2) Or—and this is an obvious one—you could reject the assumption of common priors. After all, isn’t a major selling point of Bayesianism supposed to be its subjective aspect, the fact that you pick “whichever prior feels right for you,” and are constrained only in how to update that prior?  If Alice’s and Bob’s priors can be different, then all the reasoning I went through earlier collapses.  So rejecting common priors might seem appealing.  But there’s a paper by Tyler Cowen and Robin Hanson called “Are Disagreements Honest?”—one of the most worldview-destabilizing papers I’ve ever read—that calls that strategy into question.  What it says, basically, is this: if you’re really a thoroughgoing Bayesian rationalist, then your prior ought to allow for the possibility that you are the other person.  Or to put it another way: “you being born as you,” rather than as someone else, should be treated as just one more contingent fact that you observe and then conditionalize on!  And likewise, the other person should condition on the observation that they’re them and not you.  In this way, absolutely everything that makes you different from someone else can be understood as “differing information,” so we’re right back to the situation covered by Aumann’s Theorem.  Imagine, if you like, that we all started out behind some Rawlsian veil of ignorance, as pure reasoning minds that had yet to be assigned specific bodies.  In that original state, there was nothing to differentiate any of us from any other—anything that did would just be information to condition on—so we all should’ve had the same prior.  That might sound fanciful, but in some sense all it’s saying is: what licenses you to privilege an observation just because it’s your eyes that made it, or a thought just because it happened to occur in your head?  Like, if you’re objectively smarter or more observant than everyone else around you, fine, but to whatever extent you agree that you aren’t, your opinion gets no special epistemic protection just because it’s yours.

(3) If you’re uncomfortable with this tendency of Bayesian reasoning to refuse to be confined anywhere, to want to expand to cosmic or metaphysical scope (“I need to condition on having been born as me and not someone else”)—well then, you could reject the entire framework of Bayesianism, as your notion of rationality. Lest I be cast out from this camp as a heretic, I hasten to say: I include this option only for the sake of completeness!

(4) When I first learned about this stuff 12 years ago, it seemed obvious to me that a lot of it could be dismissed as irrelevant to the real world for reasons of complexity. I.e., sure, it might apply to ideal reasoners with unlimited time and computational power, but as soon as you impose realistic constraints, this whole Aumannian house of cards should collapse.  As an example, if Alice and Bob have common priors, then sure they’ll agree about everything if they effectively share all their information with each other!  But in practice, we don’t have time to “mind-meld,” swapping our entire life experiences with anyone we meet.  So one could conjecture that agreement, in general, requires a lot of communication.  So then I sat down and tried to prove that as a theorem.  And you know what I found?  That my intuition here wasn’t even close to correct!

In more detail, I proved the following theorem.  Suppose Alice and Bob are Bayesians with shared priors, and suppose they’re arguing about (say) the probability of some future event—or more generally, about any random variable X bounded in [0,1].  So, they have a conversation where Alice first announces her expectation of X, then Bob announces his new expectation, and so on.  The theorem says that Alice’s and Bob’s estimates of X will necessarily agree to within ±ε, with probability at least 1-δ over their shared prior, after they’ve exchanged only O(1/(δε2)) messages.  Note that this bound is completely independent of how much knowledge they have; it depends only on the accuracy with which they want to agree!  Furthermore, the same bound holds even if Alice and Bob only send a few discrete bits about their real-valued expectations with each message, rather than the expectations themselves.

The proof involves the idea that Alice and Bob’s estimates of X, call them XA and XB respectively, follow “unbiased random walks” (or more formally, are martingales).  Very roughly, if |XA-XB|≥ε with high probability over Alice and Bob’s shared prior, then that fact implies that the next message has a high probability (again, over the shared prior) of causing either XA or XB to jump up or down by about ε.  But XA and XB, being estimates of X, are bounded between 0 and 1.  So a random walk with a step size of ε can only continue for about 1/ε2 steps before it hits one of the “absorbing barriers.”

The way to formalize this is to look at the variances, Var[XA] and Var[XB], with respect to the shared prior.  Because Alice and Bob’s partitions keep getting refined, the variances are monotonically non-decreasing.  They start out 0 and can never exceed 1 (in fact they can never exceed 1/4, but let’s not worry about constants).  Now, the key lemma is that, if Pr[|XA-XB|≥ε]≥δ, then Var[XB] must increase by at least δε2 if Alice sends XA to Bob, and Var[XA] must increase by at least δε2 if Bob sends XB to Alice.  You can see my paper for the proof, or just work it out for yourself.  At any rate, the lemma implies that, after O(1/(δε2)) rounds of communication, there must be at least a temporary break in the disagreement; there must be some round where Alice and Bob approximately agree with high probability.

There are lots of other results in my paper, including an upper bound on the number of calls that Alice and Bob need to make to a “sampling oracle” to carry out this sort of protocol approximately, assuming they’re not perfect Bayesians but agents with bounded computational power.  But let me step back and address the broader question: what should we make of all this?  How should we live with the gargantuan chasm between the prediction of Bayesian rationality for how we should disagree, and the actual facts of how we do disagree?

We could simply declare that human beings are not well-modeled as Bayesians with common priors—that we’ve failed in giving a descriptive account of human behavior—and leave it at that.   OK, but that would still leave the question: does this stuff have normative value?  Should it affect how we behave, if we want to consider ourselves honest and rational?  I would argue, possibly yes.

Yes, you should constantly ask yourself the question: “would I still be defending this opinion, if I had been born as someone else?”  (Though you might say this insight predates Aumann by quite a bit, going back at least to Spinoza.)

Yes, if someone you respect as honest and rational disagrees with you, you should take it as seriously as if the disagreement were between two different aspects of yourself.

Finally, yes, we can try to judge epistemic communities by how closely they approach the Aumannian ideal.  In math and science, in my experience, it’s common to see two people furiously arguing with each other at a blackboard.  Come back five minutes later, and they’re arguing even more furiously, but now their positions have switched.  As we’ve seen, that’s precisely what the math says a rational conversation should look like.  In social and political discussions, though, usually the very best you’ll see is that two people start out diametrically opposed, but eventually one of them says “fine, I’ll grant you this,” and the other says “fine, I’ll grant you that.”  We might say, that’s certainly better than the common alternative, of the two people walking away even more polarized than before!  Yet the math tells us that even the first case—even the two people gradually getting closer in their views—is nothing at all like a rational exchange, which would involve the two participants repeatedly leapfrogging each other, completely changing their opinion about the question under discussion (and then changing back, and back again) every time they learned something new.  The first case, you might say, is more like haggling—more like “I’ll grant you that X is true if you grant me that Y is true”—than like our ideal friendly mathematicians arguing at the blackboard, whose acceptance of new truths is never slow or grudging, never conditional on the other person first agreeing with them about something else.

Armed with this understanding, we could try to rank fields by how hard it is to have an Aumannian conversation in them.  At the bottom—the easiest!—is math (or, let’s say, chess, or debugging a program, or fact-heavy fields like lexicography or geography).  Crucially, here I only mean the parts of these subjects with agreed-on rules and definite answers: once the conversation turns to whose theorems are deeper, or whose fault the bug was, things can get arbitrarily non-Aumannian.  Then there’s the type of science that involves messy correlational studies (I just mean, talking about what’s a risk factor for what, not the political implications).  Then there’s politics and aesthetics, with the most radioactive topics like Israel/Palestine higher up.  And then, at the very peak, there’s gender and social justice debates, where everyone brings their formative experiences along, and absolutely no one is a disinterested truth-seeker, and possibly no Aumannian conversation has ever been had in the history of the world.

I would urge that even at the very top, it’s still incumbent on all of us to try to make the Aumannian move, of “what would I think about this issue if I were someone else and not me?  If I were a man, a woman, black, white, gay, straight, a nerd, a jock?  How much of my thinking about this represents pure Spinozist reason, which could be ported to any rational mind, and how much of it would get lost in translation?”

Anyway, I’m sure some people would argue that, in the end, the whole framework of Bayesian agents, common priors, common knowledge, etc. can be chucked from this discussion like so much scaffolding, and the moral lessons I want to draw boil down to trite advice (“try to see the other person’s point of view”) that you all knew already.  Then again, even if you all knew all this, maybe you didn’t know that you all knew it!  So I hope you gained some new information from this talk in any case.  Thanks.

Update: Coincidentally, there’s a moving NYT piece by Oliver Sacks, which (among other things) recounts his experiences with his cousin, the Aumann of Aumann’s theorem.

Another Update: If I ever did attempt an Aumannian conversation with someone, the other Scott A. would be a candidate! Here he is in 2011 making several of the same points I did above, using the same examples (I thank him for pointing me to his post).

FCRC Highlights

Friday, June 19th, 2015

By popular request, here are some highlights from this week’s FCRC conference in Portland, Oregon:

  • The edit distance between two strings means the minimum number of insertions, deletions, and replacements needed to convert one string to the other: for example, SHTETL and SHETLAND have an edit distance of 4.  Edit distance has major, obvious applications to DNA sequence comparison, as well as plagiarism detection and many other things.  There’s a clever dynamic programming algorithm to compute the edit distance between two n-bit strings, but it takes ~n2 time, which is already too slow for many applications.  Can you do better?  I remember wondering about that 15 years ago, as a beginning grad student taking Richard Karp’s computational biology course.  Now Arturs Backurs and Piotr Indyk have shown that, if you can compute edit distance in O(n2-ε) time for any ε>0, then you can also solve CNF-SAT in 2cn time for some c<1, thereby refuting the Strong Exponential Time Hypothesis.  For more about this important result, see this MIT News article.
  • Olivier Temam gave a superb keynote talk about hardware neural networks.  His main points were these: implementing neural nets with special-purpose hardware was a faddish idea a few decades ago, but was abandoned once people realized that (a) it didn’t work that great, and (b) more to the point, anything you could do with special-purpose hardware, you could do better and more easily with silicon chips, after waiting just a few years for Moore’s Law to catch up.  Today, however, two things have spurred a revival of the idea: firstly, neural nets (renamed “deep learning,” and done with bigger networks and way more training data) are delivering spectacular, state-of-the-art results; and second, transistors have stopped shrinking, so it now makes more sense to think about the few orders-of-magnitude speed improvement that you can get from special-purpose hardware.  This would mean organizing computers kind-of, sort-of like the brain is organized, with (for example) memory integrated into the connections between the “neurons” (processing elements), rather than on a separate chip that’s connected to the processor by a bus.  On the other hand, Temam also stressed that computer architects shouldn’t slavishly copy the brain: instead, they should simply build the fastest hardware they can to implement the best available machine-learning algorithms, and they should rely on the machine-learning theorists to incorporate whatever broad lessons are to be gleaned from neuroscience (as they’ve done several times in the past).
  • Three separate sets of authors (Koppula, Lewko, and Waters; Canetti, Holmgren, Jain, and Vaikuntanathan; and Bitansky, Garg, Lin, Pass, and Telang) independently wrote papers that showed how to achieve “indistinguishability obfuscation” (i.o.) for Turing machines rather than for circuits.  For those not in the world of theoretical crypto, i.o. is a hot concept that basically means: obfuscating a program in such a way that no adversary can figure out anything about which program you started with, among all the possible programs that compute the same function in roughly the same amount of time.  (On the other hand, the adversary might be able to learn more than she could if merely given a black box for the function.  And that’s why this kind of obfuscation falls short of the “gold standard,” which was shown to be impossible in general in seminal work by Barak et al.)  Recent papers have shown how to achieve the weaker notion of i.o., but they first require converting your program to a Boolean circuit—something that’s absurdly inefficient in practice, and also has the theoretical drawback of producing an obfuscated program whose size grows, not merely with the size of the original, unobfuscated program, but also with the amount of time the original program is supposed to run for. So, the new work gets around that drawback, by cleverly obfuscating a program whose purpose is to compute the “next step function” of the original program, on data that’s itself encrypted. The talk was delivered in “tag team” format, with one representative from each group of authors speaking for 6-7 minutes. Surprisingly, it worked extremely well.
  • Laci Babai gave a truly epic hour-long Knuth Prize lecture, basically trying to summarize all of his work over the past 35 years (and related work by others), in 170 or so slides.  The talk had not a single word of filler: it was just pure beef, result after result, some of them well-known and seminal (e.g., MIP=NEXP, AM[2]=AM[k], AlmostNP=MA, group membership in NP, group non-membership in AM…) and others obscure little gems.  Boaz Barak commented that an entire semester-long course could be taught from the PowerPoint slides. Laci ended the talk by defining the Babai point, and then saying “having made my point, I’m done.”
  • Ambainis (yes, the same Ambainis), Filmus and Le Gall had a paper about the limitations of the techniques used to achieve all matrix multiplication algorithms from Coppersmith-Winograd (O(n2.3755)) onward, including those of Stothers 2010 (O(n2.3730)), Vassilevska Williams 2012 (O(n2.3728642)), and Le Gall 2014 (O(n2.3728639)).  Their basic conclusion—not surprising, but still nice to establish formally—is that applying more and more massive computer search to the current ideas can’t possibly get you below O(n2.308); new ideas will be needed to push further.
  • At the STOC business meeting, there was a long discussion about the proposal to turn STOC into a weeklong “theory festival,” with more plenary talks (including from other fields), possibly more parallel sessions, etc. There were lots of interesting arguments, but alas, I was too tired and jetlagged to remember what they were. (Anyone who does remember is welcome to chime in.)

There are many great things that I haven’t written about—for example, I haven’t even said a word about any of the three best paper talks!—but I’m out of energy right now.  Others are more than welcome to share other FCRC highlights in the comments section.

Happy Second Birthday Lily

Wednesday, January 21st, 2015


Two years ago, I blogged when Lily was born.  Today I can blog that she runs, climbs, swims (sort of), constructs 3-word sentences, demands chocolate cake, counts to 10 in both English and Hebrew, and knows colors, letters, shapes, animals, friends, relatives, the sun, and the moon.  To all external appearances she’s now conscious as you and I are (and considerably more so than the cat in the photo).

But the most impressive thing Lily does—the thing that puts her far beyond where her parents were at the same age, in a few areas—is her use of the iPad.  There she does phonics exercises, plays puzzle games that aren’t always trivial for me to win, and watches educational videos on YouTube (skipping past the ads, and complaining if the Internet connection goes down).  She chooses the apps and videos herself, easily switching between them when she gets bored.  It’s a sight to behold, and definitely something to try with your own toddler if you have one.  (There’s a movement these days that encourages parents to ban kids from using touch-screen devices, fearful that too much screen time will distract them from the real world.  To which I reply: for better or worse, this is the real world that our kids will grow up into.)

People often ask whether Dana and I will steer Lily into becoming a theoretical computer scientist like us.  My answer is “hell no”: I’ll support Lily in whatever she wants to do, whether that means logic, combinatorics, algebraic geometry, or even something further afield like theoretical neuroscience or physics.

As recent events illustrated, the world is not always the kindest place for nerds (male or female), with our normal ways of thinking, talking, and interacting sometimes misunderstood by others in the cruelest ways imaginable.  Yet despite everything, nerds do sometimes manage to meet, get married, and even produce offspring with nerd potential of their own.  We’re here, we’re sometimes inappropriately clear, and we’re not going anywhere.

So to life!  And happy birthday Lily!

BQP/LHC collision

Thursday, January 15th, 2015


This afternoon, I gave my usual spiel about Quantum Computing and the Limits of the Efficiently Computable at the CERN Colloquium.  (If you watched the webcast of the Higgs boson discovery announcement a couple years ago, it was in the same auditorium they used for that, except this time it was less packed.)  Beforehand, Dana and I got to join a tour of the CMS detector at the Large Hadron Collider—one of the very last tours, before CMS shuts down (as ATLAS already has) to get ready for collisions at the LHC’s new, higher energy.

Considered as eye candy, I’d say that the CMS detector holds its own against the Taj Mahal, Machu Picchu, the Great Wall of China, or any of the other engineering marvels of the world.  So, OK, let me describe what it’s like to visit it.  The first step is to take a tram from downtown Geneva to CERN, which is headquartered in the town of Meyrin.  This is easier than you’d imagine: a tram actually arrives in Geneva every few minutes with “CERN” (its final stop) written right on it!  Next you take a 20-minute bus ride from the CERN reception hall to the CMS building, which is across the French border.  You don’t really think about it until you’re here, but:

(a) The Large Hadron Collider is large—it’s, like, a whole drive through the countryside to get from the main CERN buildings to CMS.

(b) All inside the LHC ring is just a normal rural/suburban area, with restaurants, roads, gas stations, cows, etc.

Anyway, then you arrive at CMS, which looks from the outside like just a big warehouse-type building.


And you go inside, wondering if now you’re going to see the detector.  But no, there’s just a giant tarp hanging from the ceiling with a picture of the detector on it.  Maybe this tour won’t include the detector?


But then you go outside, back in through some back entrance, then into a staging area where you get hard hats to wear.  Then you get into an elevator that goes about 150 feet down.  Meanwhile, your tour guide is carrying a geiger counter to make sure you’re not exposed to too much radiation.  Now will you see the detector?  No, just a bunch of dark corridors.  You pass through a room full of computers on racks—cool, this must be where they analyze the collision data!  (Actually, according to Panflutist in the comments section, these computers are only for control and for the trigger system, which decides which events to store for later analysis.)


Then, after that room, there’s a door with a sign indicating that beyond it is the LHC ring.  Cool!


Of course, you’re not actually going into the ring.  But then you turn a different way, and emerge onto a platform where you to get to the “big reveal”: the detector, two giant circular pieces that obviously screw together but are now separated, and engineers making final tweaks to them before they’re reunited for the next collider run.  (I forgot to mention: the whole tour is being conducted in French.  That’s why you sort of need to guess what’s happening.)

Anyway, thanks so much to Wolfgang Lerche and everyone else at CERN for an awesome visit.

Lens of Computation on the Sciences

Tuesday, November 25th, 2014

This weekend, the Institute for Advanced Study in Princeton hosted a workshop on the “Lens of Computation in the Sciences,” which was organized by Avi Wigderson, and was meant to showcase theoretical computer science’s imperialistic ambitions to transform every other field.  I was proud to speak at the workshop, representing CS theory’s designs on physics.  But videos of all four of the talks are now available, and all are worth checking out:

Unfortunately, the videos were slow to buffer when I last tried it.  While you’re waiting, you could also check my PowerPoint slides, though they overlap considerably with my previous talks.  (As always, if you can’t read PowerPoint, then go ask another reader of this blog to convert the file into a format you like.)

Thanks so much to Avi, and everyone else at IAS, for organizing an awesome workshop!

Interstellar’s dangling wormholes

Monday, November 10th, 2014

Update (Nov. 15): A third of my confusions addressed by reading Kip Thorne’s book! Details at the bottom of this post.

On Saturday Dana and I saw Interstellar, the sci-fi blockbuster co-produced by the famous theoretical physicist Kip Thorne (who told me about his work on this movie when I met him eight years ago).  We had the rare privilege of seeing the movie on the same day that we got to hang out with a real astronaut, Dan Barry, who flew three shuttle missions and did four spacewalks in the 1990s.  (As the end result of a project that Dan’s roboticist daughter, Jenny Barry, did for my graduate course on quantum complexity theory, I’m now the coauthor with both Barrys on a paper in Physical Review A, about uncomputability in quantum partially-observable Markov decision processes.)

Before talking about the movie, let me say a little about the astronaut.  Besides being an inspirational example of someone who’s achieved more dreams in life than most of us—seeing the curvature of the earth while floating in orbit around it, appearing on Survivor, and publishing a Phys. Rev. A paper—Dan is also a passionate advocate of humanity’s colonizing other worlds.  When I asked him whether there was any future for humans in space, he answered firmly that the only future for humans was in space, and then proceeded to tell me about the technical viability of getting humans to Mars with limited radiation exposure, the abundant water there, the romantic appeal that would inspire people to sign up for the one-way trip, and the extinction risk for any species confined to a single planet.  Hearing all this from someone who’d actually been to space gave Interstellar, with its theme of humans needing to leave Earth to survive (and its subsidiary theme of the death of NASA’s manned space program meaning the death of humanity), a special vividness for me.  Granted, I remain skeptical about several points: the feasibility of a human colony on Mars in the foreseeable future (a self-sufficient human colony on Antarctica, or under the ocean, strike me as plenty hard enough for the next few centuries); whether a space colony, even if feasible, cracks the list of the top twenty things we ought to be doing to mitigate the risk of human extinction; and whether there’s anything more to be learned, at this point in history, by sending humans to space that couldn’t be learned a hundred times more cheaply by sending robots.  On the other hand, if there is a case for continuing to send humans to space, then I’d say it’s certainly the case that Dan Barry makes.

OK, but enough about the real-life space traveler: what did I think about the movie?  Interstellar is a work of staggering ambition, grappling with some of the grandest themes of which sci-fi is capable: the deterioration of the earth’s climate; the future of life in the universe; the emotional consequences of extreme relativistic time dilation; whether “our” survival would be ensured by hatching human embryos in a faraway world, while sacrificing almost all the humans currently alive; to what extent humans can place the good of the species above family and self; the malleability of space and time; the paradoxes of time travel.  It’s also an imperfect movie, one with many “dangling wormholes” and unbalanced parentheses that are still generating compile-time errors in my brain.  And it’s full of stilted dialogue that made me giggle—particularly when the characters discussed jumping into a black hole to retrieve its “quantum data.”  Also, despite Kip Thorne’s involvement, I didn’t find the movie’s science spectacularly plausible or coherent (more about that below).  On the other hand, if you just wanted a movie that scrupulously obeyed the laws of physics, rather than intelligently probing their implications and limits, you could watch any romantic comedy.  So sure, Interstellar might make you cringe, but if you like science fiction at all, then it will also make you ponder, stare awestruck, and argue with friends for days afterward—and enough of the latter to make it more than worth your while.  Just one tip: if you’re prone to headaches, do not sit near the front of the theater, especially if you’re seeing it in IMAX.

For other science bloggers’ takes, see John Preskill (who was at a meeting with Steven Spielberg to brainstorm the movie in 2006), Sean Carroll, Clifford Johnson, and Peter Woit.

In the rest of this post, I’m going to list the questions about Interstellar that I still don’t understand the answers to (yes, the ones still not answered by the Interstellar FAQ).  No doubt some of these are answered by Thorne’s book The Science of Interstellar, which I’ve ordered (it hasn’t arrived yet), but since my confusions are more about plot than science, I’m guessing that others are not.

SPOILER ALERT: My questions give away basically the entire plot—so if you’re planning to see the movie, please don’t read any further.  After you’ve seen it, though, come back and see if you can help with any of my questions.

1. What’s causing the blight, and the poisoning of the earth’s atmosphere?  The movie is never clear about this.  Is it a freak occurrence, or is it human-caused climate change?  If the latter, then wouldn’t it be worth some effort to try to reverse the damage and salvage the earth, rather than escaping through a wormhole to another galaxy?

2. What’s with the drone?  Who sent it?  Why are Cooper and Murph able to control it with their laptop?  Most important of all, what does it have to do with the rest of the movie?

3. If NASA wanted Cooper that badly—if he was the best pilot they’d ever had and NASA knew it—then why couldn’t they just call him up?  Why did they have to wait for beings from the fifth dimension to send a coded message to his daughter revealing their coordinates?  Once he did show up, did they just kind of decide opportunistically that it would be a good idea to recruit him?

4. What was with Cooper’s crash in his previous NASA career?  If he was their best pilot, how and why did the crash happen?  If this was such a defining, traumatic incident in his life, why is it never brought up for the rest of the movie?

5. How is NASA funded in this dystopian future?  If official ideology holds that the Apollo missions were faked, and that growing crops is the only thing that matters, then why have the craven politicians been secretly funneling what must be trillions of dollars to a shadow-NASA, over a period of fifty years?

6. Why couldn’t NASA have reconnoitered the planets using robots—especially since this is a future where very impressive robots exist?  Yes, yes, I know, Matt Damon explains in the movie that humans remain more versatile than robots, because of their “survival instinct.”  But the crew arrives at the planets missing extremely basic information about them, like whether they’re inhospitable to human life because of freezing temperatures or mile-high tidal waves.  This is information that robotic probes, even of the sort we have today, could have easily provided.

7. Why are the people who scouted out the 12 planets so limited in the data they can send back?  If they can send anything, then why not data that would make Cooper’s mission completely redundant (excepting, of course, the case of the lying Dr. Mann)?  Does the wormhole limit their transmissions to 1 bit per decade or something?

8. Rather than wasting precious decades waiting for Cooper’s mission to return, while (presumably) billions of people die of starvation on a fading earth, wouldn’t it make more sense for NASA to start colonizing the planets now?  They could simply start trial colonies on all the planets, even if they think most of the colonies will fail.  Yes, this plan involves sacrificing individuals for the greater good of humanity, but NASA is already doing that anyway, with its slower, riskier, stupider reconnaissance plan.  The point becomes even stronger when we remember that, in Professor Brand’s mind, the only feasible plan is “Plan B” (the one involving the frozen human embryos).  Frozen embryos are (relatively) cheap: why not just spray them all over the place?  And why wait for “Plan A” to fail before starting that?

9. The movie involves a planet, Miller, that’s so close to the black hole Gargantua, that every hour spent there corresponds to seven years on earth.  There was an amusing exchange on Slate, where Phil Plait made the commonsense point that a planet that deep in a black hole’s gravity well would presumably get ripped apart by tidal forces.  Plait later had to issue an apology, since, in conceiving this movie, Kip Thorne had made sure that Gargantua was a rapidly rotating black hole—and it turns out that the physics of rotating black holes are sufficiently different from those of non-rotating ones to allow such a planet in principle.  Alas, this clever explanation still leaves me unsatisfied.  Physicists, please help: even if such a planet existed, wouldn’t safely landing a spacecraft on it, and getting it out again, require a staggering amount of energy—well beyond what the humans shown in the movie can produce?  (If they could produce that much acceleration and deceleration, then why couldn’t they have traveled from Earth to Saturn in days rather than years?)  If one could land on Miller and then get off of it using the relatively conventional spacecraft shown in the movie, then the amusing thought suggests itself that one could get factor-of-60,000 computational speedups, “free of charge,” by simply leaving one’s computer in space while one spent some time on the planet.  (And indeed, something like that happens in the movie: after Cooper and Anne Hathaway return from Miller, Romilly—the character who stayed behind—has had 23 years to think about physics.)

10. Why does Cooper decide to go into the black hole?  Surely he could jettison enough weight to escape the black hole’s gravity by sending his capsule into the hole, while he himself shared Anne Hathaway’s capsule?

11. Speaking of which, does Cooper go into the black hole?  I.e., is the “tesseract” something he encounters before or after he crosses the event horizon?  (Or maybe it should be thought of as at the event horizon—like a friendlier version of the AMPS firewall?)

12. Why is Cooper able to send messages back in time—but only by jostling books around, moving the hands of a watch, and creating patterns of dust in one particular room of one particular house?  (Does this have something to do with love and gravity being the only two forces in the universe that transcend space and time?)

13. Why does Cooper desperately send the message “STAY” to his former self?  By this point in the movie, isn’t it clear that staying on Earth means the death of all humans, including Murph?  If Cooper thought that a message could get through at all, then why not a message like: “go, and go directly to Edmunds’ planet, since that’s the best one”?  Also, given that Cooper now exists outside of time, why does he feel such desperate urgency?  Doesn’t he get, like, infinitely many chances?

14. Why is Cooper only able to send “quantum data” that saves the world to the older Murph—the one who lives when (presumably) billions of people are already dying of starvation?  Why can’t he send the “quantum data” back to the 10-year-old Murph, for example?  Even if she can’t yet understand it, surely she could hand it over to Professor Brand.  And even if this plan would be unlikely to succeed: again, Cooper now exists outside of time.  So can’t he just keep going back to the 10-year-old Murph, rattling those books over and over until the message gets through?

15. What exactly is the “quantum data” needed for, anyway?  I gather it has something to do with building a propulsion system that can get the entire human population out of the earth’s gravity well at a reasonable cost?  (Incidentally, what about all the animals?  If the writers of the Old Testament noticed that issue, surely the writers of Interstellar could.)

16. How does Cooper ever make it out of the black hole?  (Maybe it was explained and I missed it: once he entered the black hole, things got extremely confusing.)  Do the fifth-dimensional beings create a new copy of Cooper outside the black hole?  Do they postselect on a branch of the wavefunction where he never entered the black hole in the first place?  Does Murph use the “quantum data” to get him out?

17. At his tearful reunion with the elderly Murph, why is Cooper totally uninterested in meeting his grandchildren and great-grandchildren, who are in the same room?  And why are they uninterested in meeting him?  I mean, seeing Murph again has been Cooper’s overriding motivation during his journey across the universe, and has repeatedly been weighed against the survival of the entire human race, including Murph herself.  But seeing Murph’s kids—his grandkids—isn’t even worth five minutes?

18. Speaking of which, when did Murph ever find time to get married and have kids?  Since she’s such a major character, why don’t we learn anything about this?

19. Also, why is Murph an old woman by the time Cooper gets back?  Yes, Cooper lost a few decades because of the time dilation on Miller’s planet.  I guess he lost the additional decades while entering and leaving Gargantua?  If the five-dimensional beings were able to use their time-travel / causality-warping powers to get Cooper out of the black hole, couldn’t they have re-synced his clock with Murph’s while they were at it?

20. Why does Cooper need to steal a spaceship to get to Anne Hathaway’s planet?  Isn’t Murph, like, the one in charge?  Can’t she order that a spaceship be provided for Cooper?

21. Astute readers will note that I haven’t yet said anything about the movie’s central paradox, the one that dwarfs all the others.  Namely, if humans were going to go extinct without a “wormhole assist” from the humans of the far future, then how were there any humans in the far future to provide the wormhole assist?  And conversely, if the humans of the far future find themselves already existing, then why do they go to the trouble to put the wormhole in their past (which now seems superfluous, except maybe for tidying up the story of their own origins)?  The reason I didn’t ask about this is that I realize it’s supposed to be paradoxical; we’re supposed to feel vertigo thinking about it.  (And also, it’s not entirely unrelated to how PSPACE-complete problems get solved with polynomial resources, in my and John Watrous’s paper on computation with closed timelike curves.)  My problem is a different one: if the fifth-dimensional, far-future humans have the power to mold their own past to make sure everything turned out OK, then what they actually do seems pathetic compared to what they could do.  For example, why don’t they send a coded message to the 21st-century humans (similar to the coded messages that Cooper sends to Murph), telling them how to avoid the blight that destroys their crops?  Or just telling them that Edmunds’ planet is the right one to colonize?  Like the God of theodicy arguments, do the future humans want to use their superpowers only to give us a little boost here and there, while still leaving us a character-forming struggle?  Even if this reticence means that billions of innocent people—ones who had nothing to do with the character-forming struggle—will die horrible deaths?  If so, then I don’t understand these supposedly transcendently-evolved humans any better than I understand the theodical God.

Anyway, rather than ending on that note of cosmic pessimism, I guess I could rejoice that we’re living through what must be the single biggest month in the history of nerd cinema—what with a sci-fi film co-produced by a great theoretical physicist, a Stephen Hawking biopic, and the Alan Turing movie coming out in a few weeks.  I haven’t yet seen the latter two.  But it looks like the time might be ripe to pitch my own decades-old film ideas, like “Radical: The Story of Évariste Galois.”

Update (Nov. 15): I just finished reading Kip Thorne’s interesting book The Science of Interstellar.  I’d say that it addresses (doesn’t always clear up, but at least addresses) 7 of my 21 confusions: 1, 4, 9, 10, 11, 15, and 19.  Briefly:

1. Thorne correctly notes that the movie is vague about what’s causing the blight and the change to the earth’s atmosphere, but he discusses a bunch of possibilities, which are more in the “freak disaster” than the “manmade” category.

4. Cooper’s crash was supposed to have been caused by a gravitational anomaly, as the bulk beings of the far future were figuring out how to communicate with 21st-century humans.  It was another foreshadowing of those bulk beings.

9. Thorne notices the problem of the astronomical amount of energy needed to safely land on Miller’s planet and then get off of it—given that this planet is deep inside the gravity well of the black hole Gargantua, and orbiting Gargantua at a large fraction of the speed of light.  Thorne offers a solution that can only be called creative: namely, while nothing about this was said in the movie (since Christopher Nolan thought it would confuse people), it turns out that the crew accelerated to relativistic speed and then decelerated using a gravitational slingshot around a second, intermediate-mass black hole, which just happened to be in the vicinity of Gargantua at precisely the right times for this.  Thorne again appeals to slingshots around unmentioned but strategically-placed intermediate-mass black holes several more times in the book, to explain other implausible accelerations and decelerations that I hadn’t even noticed.

10. Thorne acknowledges that Cooper didn’t really need to jump into Gargantua in order to jettison the mass of his body (which is trivial compared to the mass of the spacecraft).  Cooper’s real reason for jumping, he says, was the desperate hope that he could somehow find the quantum data there needed to save the humans on Earth, and then somehow get it out of the black hole and back to the humans.  (This being a movie, it of course turns out that Cooper was right.)

11. Yes, Cooper encounters the tesseract while inside the black hole.  Indeed, he hits it while flying into a singularity that’s behind the event horizon, but that isn’t the black hole’s “main” singularity—it’s a different, milder singularity.

15. While this wasn’t made clear in the movie, the purpose of the quantum data was indeed to learn how to manipulate the gravitational anomalies in order to decrease Newton’s constant G in the vicinity of the earth—destroying the earth but also allowing all the humans to escape its gravity with the rocket fuel that’s available.  (Again, nothing said about the poor animals.)

19. Yes, Cooper lost the additional decades while entering Gargantua.  (Furthermore, while Thorne doesn’t discuss this, I guess he must have lost them only when he was still with Anne Hathaway, not after he separates from her.  For otherwise, Anne Hathaway would also be an old woman by the time Cooper reaches her on Edmunds’ planet, contrary to what’s shown in the movie.)

Speaking Truth to Parallelism at Cornell

Friday, October 3rd, 2014

This week I was at my alma mater, Cornell, to give a talk at the 50th anniversary celebration of its computer science department.  You can watch the streaming video here; my talk runs from roughly 1:17:30 to 1:56 (though if you’ve seen other complexity/physics/humor shows by me, this one is pretty similar, except for the riff about Cornell at the beginning).

The other two things in that video—a talk by Tom Henzinger about IST Austria, a bold new basic research institute that he leads, closely modeled after the Weizmann Institute in Israel; and a discussion panel about the future of programming languages—are also really interesting and worth watching.  There was lots of other good stuff at this workshop, including a talk about Google Glass and its applications to photography (by, not surprisingly, a guy wearing a Google Glass—Marc Levoy); a panel discussion with three Turing Award winners, Juris Hartmanis, John Hopcroft, and Ed Clarke, about the early days of Cornell’s CS department; a talk by Amit Singhal, Google’s director of search; a talk about differential privacy by Cynthia Dwork, one of the leading researchers at the recently-closed Microsoft SVC lab (with a poignant and emotional ending); and a talk by my own lab director at MIT, Daniela Rus, about her research in robotics.

Along with the 50th anniversary celebration, Bill Gates was also on campus to dedicate Bill and Melinda Gates Hall, the new home of Cornell’s CS department.  Click here for streaming video of a Q&A that Gates did with Cornell students, where I thought he acquitted himself quite well, saying many sensible things about education, the developing world, etc. that other smart people could also say, but that have extra gravitas coming from him.  Gates has also become extremely effective at wrapping barbs of fact inside a soft mesh of politically-unthreatening platitudes—but listen carefully and you’ll hear the barbs.  The amount of pomp and preparation around Gates’s visit reminded me of when President Obama visited MIT, befitting the two men’s approximately equal power.  (Obama has nuclear weapons, but then again, he also has Congress.)

And no, I didn’t get to meet Gates or shake his hand, though I did get to stand about ten feet from him at the Gates Hall dedication.  (He apparently spent most of his time at Cornell meeting with plant breeders, and other people doing things relevant to the Gates Foundation’s interests.)

Thanks so much to Bobby and Jon Kleinberg, and everyone else who invited me to this fantastic event and helped make it happen.  May Cornell’s CS department have a great next 50 years.

One last remark before I close this post.  Several readers have expressed disapproval and befuddlement over the proposed title of my next book, “Speaking Truth to Parallelism.”  In the words of commenter TonyK:

That has got to be the worst title in the history of publishing! “Speaking Truth to Parallelism”? It doesn’t even make sense! I count myself as one of your fans, Scott, but you’re going to have to do better than that if you want anybody else to buy your book. I know you can do better — witness “Quantum Computing Since Democritus”.

However, my experiences at Cornell this week helped to convince me that, not only does “Speaking Truth to Parallelism” make perfect sense, it’s an activity that’s needed now more than ever.  What it means, of course, is fighting a certain naïve, long-ago-debunked view of quantum computers—namely, that they would achieve exponential speedups by simply “trying every possible answer in parallel”—that’s become so entrenched in the minds of many journalists, laypeople, and even scientists from other fields that it feels like nothing you say can possibly dislodge it.  The words out of your mouth will literally be ignored, misheard, or even contorted to the opposite of what they mean, if that’s what it takes to preserve the listener’s misconception about quantum computers being able to solve NP-hard optimization problems by sheer magic.  (Much like in the Simpsons-visit-Australia episode, where Marge’s request for “coffee” is misheard over and over as “beer.”)  You probably think I’m exaggerating, and I’d agree with you—if I hadn’t experienced this phenomenon hundreds of times over the last decade.

So, to take one example: after my talk at Cornell, an audience member came up to me to say that it was a wonderful talk, but that what he really wanted to know was whether I thought quantum computers could solve problems in the “NP space” in linear time, by trying all the possible solutions at once.  He didn’t seem to realize that I’d spent the entire previous half hour answering that exact question, explaining why the answer was “no.”  Coincidentally, this week I also got an email from a longtime reader of this blog, saying that he read and loved Quantum Computing Since Democritus, and wanted my feedback on a popular article he’d written about quantum computing.  What was the gist of the article?  You guessed it: “quantum computing = generic exponential speedups for optimization, machine learning, and Big Data problems, by trying all the possible answers at once.”

These people’s enthusiasm for quantum computing tends to be so genuine, so sincere, that I find myself unable to blame them—even when they’ve done the equivalent of going up to Richard Dawkins and thanking him for having taught them that evolution works for the good of the entire species, just as its wise Designer intended.  I do blame the media and other careless or unscrupulous parties for misleading people about quantum computing, but most of all I blame myself, for not making my explanations clear enough.  In the end, then, meeting the “NP space” folks only makes me want to redouble my efforts to Speak Truth to Parallelism: eventually, I feel, the nerd world will get this point.

Update (Oct. 4): I had regarded this (perhaps wrongly) as too obvious to state, but particularly for non-native English speakers, I’d better clarify: “speaking truth to parallelism” is a deliberate pun on the left-wing protester phrase “speaking truth to power.”  So whatever linguistic oddness there is in my phrase, I’d say it simply inherits from the original.

Another Update (Oct. 7): See this comment for my short summary of what’s known about the actual technical question (can quantum computers solve NP-complete problems in polynomial time, or not?).

Another Update (Oct. 8): Many commenters wrote to point out that the video of my talk at Cornell is now password-protected, and no longer publicly available.  I wrote to my contacts at Cornell to ask about this, and they said they’re planning to release lightly-edited versions of the videos soon, but will look into the matter in the meantime.

BosonSampling Lecture Notes from Rio

Saturday, December 28th, 2013

Update (January 3): There’s now a long interview with me about quantum computing in the Washington Post (or at least, on their website).  The interview accompanies their lead article about quantum computing and the NSA, which also quotes me (among many others), and which reports—unsurprisingly—that the NSA is indeed interested in building scalable quantum computers but, based on the Snowden documents, appears to be quite far from that goal.

(Warning: The interview contains a large number of typos and other errors, which might have arisen from my infelicities in speaking or the poor quality of the phone connection.  Some were corrected but others remain.)

The week before last, I was in Rio de Janeiro to give a mini-course on “Complexity Theory and Quantum Optics” at the Instituto de Física of the Universidade Federal Fluminense.  Next week I’ll be giving a similar course at the Jerusalem Winter School on Quantum Information.

In the meantime, my host in Rio, Ernesto Galvão, and others were kind enough to make detailed, excellent notes for my five lectures in Rio.  You can click the link in the last sentence to get them, or here are links for the five lectures individually:

If you have questions or comments about the lectures, leave them here (since I might not check the quantumrio blog).

One other thing: I can heartily recommend a trip to Rio to anyone interested in quantum information—or, for that matter, to anyone interested in sunshine, giant Jesus statues, or (especially) fruit juices you’ve never tasted before.  My favorite from among the latter was acerola.  Also worth a try are caja, mangaba, guarana, umbu, seriguela, amora, and fruta do conde juices—as well as caju and cacao, even though they taste almost nothing like the more commercially exportable products from the same plants (cashews and chocolate respectively).  I didn’t like cupuaçu or graviola juices.  Thanks so much to Ernesto and everyone else for inviting me (not just because of the juice).

Update (January 2): You can now watch videos of my mini-course at the Jerusalem Winter School here.

Videos of the other talks at the Jerusalem Winter School are available from the same site (just scroll through them on the right).

Five announcements

Tuesday, October 1st, 2013

Update (Oct. 3): OK, a sixth announcement.  I just posted a question on CS Theory StackExchange, entitled Overarching reasons why problems are in P or BPP.  If you have suggested additions or improvements to my rough list of “overarching reasons,” please post them over there — thanks!

1. I’m in Oxford right now, for a Clay Institute workshop on New Insights into Computational Intractability.  The workshop is concurrent with three others, including one on Number Theory and Physics that includes an amplituhedron-related talk by Andrew Hodges.  (Speaking of which, see here for a small but non-parodic observation about expressing amplitudes as volumes of polytopes.)

2. I was hoping to stay in the UK one more week, to attend the Newton Institute’s special semester on Mathematical Challenges in Quantum Information over in Cambridge.  But alas I had to cancel, since my diaper-changing services are needed in the other Cambridge.  So, if anyone in Cambridge (or anywhere else in the United Kingdom) really wants to talk to me, come to Oxford this week!

3. Back in June, Jens Eisert and three others posted a preprint claiming that the output of a BosonSampling device would be “indistinguishable from the uniform distribution” in various senses.  Ever since then, people have emailing me, leaving comments on this blog, and cornering me at conferences to ask whether Alex Arkhipov and I had any response to these claims.  OK, so just this weekend, we posted our own 41-page preprint, entitled “BosonSampling Is Far From Uniform.”  I hope it suffices by way of reply!  (Incidentally, this is also the paper I hinted at in a previous post: the one where π2/6 and the Euler-Mascheroni constant make cameo appearances.)  To clarify, if we just wanted to answer the claims of the Eisert group, then I think a couple paragraphs would suffice for that (see, for example, these PowerPoint slides).  In our new paper, however, Alex and I take the opportunity to go further: we study lots of interesting questions about the statistical properties of Haar-random BosonSampling distributions, and about how one might test efficiently whether a claimed BosonSampling device worked, even with hundreds or thousands of photons.

4. Also on the arXiv last night, there was a phenomenal survey about the quantum PCP conjecture by Dorit Aharonov, Itai Arad, and my former postdoc Thomas Vidick (soon to be a professor at Caltech).  I recommend reading it in the strongest possible terms, if you’d like to see how far people have come with this problem (but also, how far they still have to go) since my “Quantum PCP Manifesto” seven years ago.

5. Christos Papadimitriou asked me to publicize that the deadline for early registration and hotel reservations for the upcoming FOCS in Berkeley is fast approaching!  Indeed, it’s October 4 (three days from now).  See here for details, and here for information about student travel support.  (The links were down when I just tried them, but hopefully the server will be back up soon.)


Tuesday, August 27th, 2013

Updates (Aug. 29): John Preskill now has a very nice post summarizing the different views on offer at the firewall workshop, thereby alleviating my guilt for giving you only the mess below.  Thanks, John!

And if you check out John’s Twitter feed (which you should), you’ll find another, unrelated gem: a phenomenal TEDx talk on quantum computing by my friend, coauthor, and hero, the Lowerboundsman of Latvia, Andris Ambainis.  (Once again, when offered a feast of insight to dispel their misconceptions and ennoble their souls, the YouTube commenters are distinguishing themselves by focusing on the speaker’s voice.  Been there, man, been there.)

So, last week I was at the Fuzzorfire workshop at the Kavli Institute for Theoretical Physics in Santa Barbara, devoted to the black hole firewall paradox.  (The workshop is still going on this week, but I needed to get back early.)  For some background:

I had fantasies of writing a long, witty blog post that would set out my thoughts about firewalls, full of detailed responses to everything I’d heard at the conference, as well as ruminations about Harlow and Hayden’s striking argument that computational complexity might provide a key to resolving the paradox.  But the truth is, I’m recovering from a nasty stomach virus, am feeling “firewalled out,” and wish to use my few remaining non-childcare hours before the semester starts to finish writing papers.  So I decided that better than nothing would be a hastily-assembled pastiche of links.

First and most important, you can watch all the talks online.  In no particular order:

Here’s my own attempt to summarize what’s at stake, adapted from a comment on Peter Woit’s blog (see also a rapid response by Lubos):

As I understand it, the issue is actually pretty simple. Do you agree that
(1) the Hawking evaporation process should be unitary, and
(2) the laws of physics should describe the experiences of an infalling observer, not just those of an observer who stays outside the horizon?
If so, then you seem forced to accept
(3) the interior degrees of freedom should just be some sort of scrambled re-encoding of the exterior degrees, rather than living in a separate subfactor of Hilbert space (since otherwise we’d violate unitarity).
But then we get
(4) by applying a suitable unitary transformation to the Hawking radiation of an old enough black hole before you jump into it, someone ought to be able, in principle, to completely modify what you experience when you do jump in.  Moreover, that person could be far away from you—an apparent gross violation of locality.

So, there are a few options: you could reject either (1) or (2). You could bite the bullet and accept (4). You could say that the “experience of an infalling observer” should just be to die immediately at the horizon (firewalls). You could argue that for some reason (e.g., gravitational backreaction, or computational complexity), the unitary transformations required in (4) are impossible to implement even in principle. Or you could go the “Lubosian route,” and simply assert that the lack of any real difficulty is so obvious that, if you admit to being confused, then that just proves you’re an idiot.  AdS/CFT is clearly relevant, but as Polchinski pointed out, it does surprisingly little to solve the problem.

Now, what Almheiri et al. (AMPS) added to the simple logical argument above was really to make the consequence (4) more “concrete” and “vivid”—by describing something that, in principle, someone could actually do to the Hawking radiation before jumping in, such that after you jumped in, if there wasn’t anything dramatic that happened—something violating local QFT and the equivalence principle—then you’d apparently observe a violation of the monogamy of entanglement, a basic principle of quantum mechanics.  I’m sure the bare logic (1)-(4) was known to many people before AMPS: I certainly knew it, but I didn’t call it a “paradox,” I just called it “I don’t understand black hole complementarity”!

At any rate, thinking about the “Hawking radiation decoding problem” already led me to some very nice questions in quantum computing theory, which remain interesting even if you remove the black hole motivation entirely. And that helped convince me that something new and worthwhile might indeed come out of this business, despite how much fun it is. (Hopefully whatever does come out won’t be as garbled as Hawking radiation.)

For continuing live updates from the workshop, check out John Preskill’s Twitter feed.

Or you can ask me to expand on various things in the comments, and I’ll do my best.  (As I said in my talk, while I’m not sure that the correct quantum description of the black hole interior is within anyone‘s professional expertise, it’s certainly outside of mine!  But I do find this sort of thing fun to think about—how could I not?)

Unrelated, but also of interest: check out an excellent article in Quanta by Erica Klarreich, about the recent breakthroughs by Reichardt-Unger-Vazirani, Vazirani-Vidick, and others on classical command of quantum systems.