## Archive for the ‘Metaphysical Spouting’ Category

### Common Knowledge and Aumann’s Agreement Theorem

Sunday, August 16th, 2015

The following is the prepared version of a talk that I gave at SPARC: a high-school summer program about applied rationality held in Berkeley, CA for the past two weeks.  I had a wonderful time in Berkeley, meeting new friends and old, but I’m now leaving to visit the CQT in Singapore, and then to attend the AQIS conference in Seoul.

Common Knowledge and Aumann’s Agreement Theorem

August 14, 2015

Thank you so much for inviting me here!  I honestly don’t know whether it’s possible to teach applied rationality, the way this camp is trying to do.  What I know is that, if it is possible, then the people running SPARC are some of the awesomest people on earth to figure out how.  I’m incredibly proud that Chelsea Voss and Paul Christiano are both former students of mine, and I’m amazed by the program they and the others have put together here.  I hope you’re all having fun—or maximizing your utility functions, or whatever.

My research is mostly about quantum computing, and more broadly, computation and physics.  But I was asked to talk about something you can actually use in your lives, so I want to tell a different story, involving common knowledge.

I’ll start with the “Muddy Children Puzzle,” which is one of the greatest logic puzzles ever invented.  How many of you have seen this one?

OK, so the way it goes is, there are a hundred children playing in the mud.  Naturally, they all have muddy foreheads.  At some point their teacher comes along and says to them, as they all sit around in a circle: “stand up if you know your forehead is muddy.”  No one stands up.  For how could they know?  Each kid can see all the other 99 kids’ foreheads, so knows that they’re muddy, but can’t see his or her own forehead.  (We’ll assume that there are no mirrors or camera phones nearby, and also that this is mud that you don’t feel when it’s on your forehead.)

So the teacher tries again.  “Knowing that no one stood up the last time, now stand up if you know your forehead is muddy.”  Still no one stands up.  Why would they?  No matter how many times the teacher repeats the request, still no one stands up.

Then the teacher tries something new.  “Look, I hereby announce that at least one of you has a muddy forehead.”  After that announcement, the teacher again says, “stand up if you know your forehead is muddy”—and again no one stands up.  And again and again; it continues 99 times.  But then the hundredth time, all the children suddenly stand up.

(There’s a variant of the puzzle involving blue-eyed islanders who all suddenly commit suicide on the hundredth day, when they all learn that their eyes are blue—but as a blue-eyed person myself, that’s always struck me as needlessly macabre.)

What’s going on here?  Somehow, the teacher’s announcing to the children that at least one of them had a muddy forehead set something dramatic in motion, which would eventually make them all stand up—but how could that announcement possibly have made any difference?  After all, each child already knew that at least 99 children had muddy foreheads!

Now, this argument can be generalized to any (finite) number of children.  The crucial concept here is common knowledge.  We call a fact “common knowledge” if, not only does everyone know it, but everyone knows everyone knows it, and everyone knows everyone knows everyone knows it, and so on.  It’s true that in the beginning, each child knew that all the other children had muddy foreheads, but it wasn’t common knowledge that even one of them had a muddy forehead.  For example, if your forehead and mine are both muddy, then I know that at least one of us has a muddy forehead, and you know that too, but you don’t know that I know it (for what if your forehead were clean?), and I don’t know that you know it (for what if my forehead were clean?).

What the teacher’s announcement did, was to make it common knowledge that at least one child has a muddy forehead (since not only did everyone hear the announcement, but everyone witnessed everyone else hearing it, etc.).  And once you understand that point, it’s easy to argue by induction: after the teacher asks and no child stands up (and everyone sees that no one stood up), it becomes common knowledge that at least two children have muddy foreheads (since if only one child had had a muddy forehead, that child would’ve known it and stood up).  Next it becomes common knowledge that at least three children have muddy foreheads, and so on, until after a hundred rounds it’s common knowledge that everyone’s forehead is muddy, so everyone stands up.

The moral is that the mere act of saying something publicly can change the world—even if everything you said was already obvious to every last one of your listeners.  For it’s possible that, until your announcement, not everyone knew that everyone knew the thing, or knew everyone knew everyone knew it, etc., and that could have prevented them from acting.

This idea turns out to have huge real-life consequences, to situations way beyond children with muddy foreheads.  I mean, it also applies to children with dots on their foreheads, or “kick me” signs on their backs…

But seriously, let me give you an example I stole from Steven Pinker, from his wonderful book The Stuff of Thought.  Two people of indeterminate gender—let’s not make any assumptions here—go on a date.  Afterward, one of them says to the other: “Would you like to come up to my apartment to see my etchings?”  The other says, “Sure, I’d love to see them.”

This is such a cliché that we might not even notice the deep paradox here.  It’s like with life itself: people knew for thousands of years that every bird has the right kind of beak for its environment, but not until Darwin and Wallace could anyone articulate why (and only a few people before them even recognized there was a question there that called for a non-circular answer).

In our case, the puzzle is this: both people on the date know perfectly well that the reason they’re going up to the apartment has nothing to do with etchings.  They probably even both know the other knows that.  But if that’s the case, then why don’t they just blurt it out: “would you like to come up for some intercourse?”  (Or “fluid transfer,” as the John Nash character put it in the Beautiful Mind movie?)

So here’s Pinker’s answer.  Yes, both people know why they’re going to the apartment, but they also want to avoid their knowledge becoming common knowledge.  They want plausible deniability.  There are several possible reasons: to preserve the romantic fantasy of being “swept off one’s feet.”  To provide a face-saving way to back out later, should one of them change their mind: since nothing was ever openly said, there’s no agreement to abrogate.  In fact, even if only one of the people (say A) might care about such things, if the other person (say B) thinks there’s any chance A cares, B will also have an interest in avoiding common knowledge, for A’s sake.

Put differently, the issue is that, as soon as you say X out loud, the other person doesn’t merely learn X: they learn that you know X, that you know that they know that you know X, that you want them to know you know X, and an infinity of other things that might upset the delicate epistemic balance.  Contrast that with the situation where X is left unstated: yeah, both people are pretty sure that “etchings” are just a pretext, and can even plausibly guess that the other person knows they’re pretty sure about it.  But once you start getting to 3, 4, 5, levels of indirection—who knows?  Maybe you do just want to show me some etchings.

Philosophers like to discuss Sherlock Holmes and Professor Moriarty meeting in a train station, and Moriarty declaring, “I knew you’d be here,” and Holmes replying, “well, I knew that you knew I’d be here,” and Moriarty saying, “I knew you knew I knew I’d be here,” etc.  But real humans tend to be unable to reason reliably past three or four levels in the knowledge hierarchy.  (Related to that, you might have heard of the game where everyone guesses a number between 0 and 100, and the winner is whoever’s number is the closest to 2/3 of the average of all the numbers.  If this game is played by perfectly rational people, who know they’re all perfectly rational, and know they know, etc., then they must all guess 0—exercise for you to see why.  Yet experiments show that, if you actually want to win this game against average people, you should guess about 20.  People seem to start with 50 or so, iterate the operation of multiplying by 2/3 a few times, and then stop.)

Incidentally, do you know what I would’ve given for someone to have explained this stuff to me back in high school?  I think that a large fraction of the infamous social difficulties that nerds have, is simply down to nerds spending so much time in domains (like math and science) where the point is to struggle with every last neuron to make everything common knowledge, to make all truths as clear and explicit as possible.  Whereas in social contexts, very often you’re managing a delicate epistemic balance where you need certain things to be known, but not known to be known, and so forth—where you need to prevent common knowledge from arising, at least temporarily.  “Normal” people have an intuitive feel for this; it doesn’t need to be explained to them.  For nerds, by contrast, explaining it—in terms of the muddy children puzzle and so forth—might be exactly what’s needed.  Once they’re told the rules of a game, nerds can try playing it too!  They might even turn out to be good at it.

OK, now for a darker example of common knowledge in action.  If you read accounts of Nazi Germany, or the USSR, or North Korea or other despotic regimes today, you can easily be overwhelmed by this sense of, “so why didn’t all the sane people just rise up and overthrow the totalitarian monsters?  Surely there were more sane people than crazy, evil ones.  And probably the sane people even knew, from experience, that many of their neighbors were sane—so why this cowardice?”  Once again, it could be argued that common knowledge is the key.  Even if everyone knows the emperor is naked; indeed, even if everyone knows everyone knows he’s naked, still, if it’s not common knowledge, then anyone who says the emperor’s naked is knowingly assuming a massive personal risk.  That’s why, in the story, it took a child to shift the equilibrium.  Likewise, even if you know that 90% of the populace will join your democratic revolt provided they themselves know 90% will join it, if you can’t make your revolt’s popularity common knowledge, everyone will be stuck second-guessing each other, worried that if they revolt they’ll be an easily-crushed minority.  And because of that very worry, they’ll be correct!

(My favorite Soviet joke involves a man standing in the Moscow train station, handing out leaflets to everyone who passes by.  Eventually, of course, the KGB arrests him—but they discover to their surprise that the leaflets are just blank pieces of paper.  “What’s the meaning of this?” they demand.  “What is there to write?” replies the man.  “It’s so obvious!”  Note that this is precisely a situation where the man is trying to make common knowledge something he assumes his “readers” already know.)

The kicker is that, to prevent something from becoming common knowledge, all you need to do is censor the common-knowledge-producing mechanisms: the press, the Internet, public meetings.  This nicely explains why despots throughout history have been so obsessed with controlling the press, and also explains how it’s possible for 10% of a population to murder and enslave the other 90% (as has happened again and again in our species’ sorry history), even though the 90% could easily overwhelm the 10% by acting in concert.  Finally, it explains why believers in the Enlightenment project tend to be such fanatical absolutists about free speech—why they refuse to “balance” it against cultural sensitivity or social harmony or any other value, as so many well-meaning people urge these days.

OK, but let me try to tell you something surprising about common knowledge.  Here at SPARC, you’ve learned all about Bayes’ rule—how, if you like, you can treat “probabilities” as just made-up numbers in your head, which are required obey the probability calculus, and then there’s a very definite rule for how to update those numbers when you gain new information.  And indeed, how an agent that wanders around constantly updating these numbers in its head, and taking whichever action maximizes its expected utility (as calculated using the numbers), is probably the leading modern conception of what it means to be “rational.”

Now imagine that you’ve got two agents, call them Alice and Bob, with common knowledge of each other’s honesty and rationality, and with the same prior probability distribution over some set of possible states of the world.  But now imagine they go out and live their lives, and have totally different experiences that lead to their learning different things, and having different posterior distributions.  But then they meet again, and they realize that their opinions about some topic (say, Hillary’s chances of winning the election) are common knowledge: they both know each other’s opinion, and they both know that they both know, and so on.  Then a striking 1976 result called Aumann’s Theorem states that their opinions must be equal.  Or, as it’s summarized: “rational agents with common priors can never agree to disagree about anything.”

Actually, before going further, let’s prove Aumann’s Theorem—since it’s one of those things that sounds like a mistake when you first hear it, and then becomes a triviality once you see the 3-line proof.  (Albeit, a “triviality” that won Aumann a Nobel in economics.)  The key idea is that knowledge induces a partition on the set of possible states of the world.  Huh?  OK, imagine someone is either an old man, an old woman, a young man, or a young woman.  You and I agree in giving each of these a 25% prior probability.  Now imagine that you find out whether they’re a man or a woman, and I find out whether they’re young or old.  This can be illustrated as follows:

The diagram tells us, for example, that if the ground truth is “old woman,” then your knowledge is described by the set {old woman, young woman}, while my knowledge is described by the set {old woman, old man}.  And this different information leads us to different beliefs: for example, if someone asks for the probability that the person is a woman, you’ll say 100% but I’ll say 50%.  OK, but what does it mean for information to be common knowledge?  It means that I know that you know that I know that you know, and so on.  Which means that, if you want to find out what’s common knowledge between us, you need to take the least common coarsening of our knowledge partitions.  I.e., if the ground truth is some given world w, then what do I consider it possible that you consider it possible that I consider possible that … etc.?  Iterate this growth process until it stops, by “zigzagging” between our knowledge partitions, and you get the set S of worlds such that, if we’re in world w, then what’s common knowledge between us is that the world belongs to S.  Repeat for all w’s, and you get the least common coarsening of our partitions.  In the above example, the least common coarsening is trivial, with all four worlds ending up in the same set S, but there are nontrivial examples as well:

Now, if Alice’s expectation of a random variable X is common knowledge between her and Bob, that means that everywhere in S, her expectation must be constant … and hence must equal whatever the expectation is, over all the worlds in S!  Likewise, if Bob’s expectation is common knowledge with Alice, then everywhere in S, it must equal the expectation of X over S.  But that means that Alice’s and Bob’s expectations are the same.

There are lots of related results.  For example, rational agents with common priors, and common knowledge of each other’s rationality, should never engage in speculative trade (e.g., buying and selling stocks, assuming that they don’t need cash, they’re not earning a commission, etc.).  Why?  Basically because, if I try to sell you a stock for (say) $50, then you should reason that the very fact that I’m offering it means I must have information you don’t that it’s worth less than$50, so then you update accordingly and you don’t want it either.

Or here’s another one: suppose again that we’re Bayesians with common priors, and we’re having a conversation, where I tell you my opinion (say, of the probability Hillary will win the election).  Not any of the reasons or evidence on which the opinion is based—just the opinion itself.  Then you, being Bayesian, update your probabilities to account for what my opinion is.  Then you tell me your opinion (which might have changed after learning mine), I update on that, I tell you my new opinion, then you tell me your new opinion, and so on.  You might think this could go on forever!  But, no, Geanakoplos and Polemarchakis observed that, as long as there are only finitely many possible states of the world in our shared prior, this process must converge after finitely many steps with you and me having the same opinion (and moreover, with it being common knowledge that we have that opinion).  Why?  Because as long as our opinions differ, your telling me your opinion or me telling you mine must induce a nontrivial refinement of one of our knowledge partitions, like so:

I.e., if you learn something new, then at least one of your knowledge sets must get split along the different possible values of the thing you learned.  But since there are only finitely many underlying states, there can only be finitely many such splittings (note that, since Bayesians never forget anything, knowledge sets that are split will never again rejoin).

And something else: suppose your friend tells you a liberal opinion, then you take it into account, but reply with a more conservative opinion.  The friend takes your opinion into account, and replies with a revised opinion.  Question: is your friend’s new opinion likelier to be more liberal than yours, or more conservative?

Obviously, more liberal!  Yes, maybe your friend now sees some of your points and vice versa, maybe you’ve now drawn a bit closer (ideally!), but you’re not going to suddenly switch sides because of one conversation.

So, putting all these results together, we get a clear picture of what rational disagreements should look like: they should follow unbiased random walks, until sooner or later they terminate in common knowledge of complete agreement.  We now face a bit of a puzzle, in that hardly any disagreements in the history of the world have ever looked like that.  So what gives?

There are a few ways out:

(1) You could say that the “failed prediction” of Aumann’s Theorem is no surprise, since virtually all human beings are irrational cretins, or liars (or at least, it’s not common knowledge that they aren’t). Except for you, of course: you’re perfectly rational and honest.  And if you ever met anyone else as rational and honest as you, maybe you and they could have an Aumannian conversation.  But since such a person probably doesn’t exist, you’re totally justified to stand your ground, discount all opinions that differ from yours, etc.  Notice that, even if you genuinely believed that was all there was to it, Aumann’s Theorem would still have an aspirational significance for you: you would still have to say this is the ideal that all rationalists should strive toward when they disagree.  And that would already conflict with a lot of standard rationalist wisdom.  For example, we all know that arguments from authority carry little weight: what should sway you is not the mere fact of some other person stating their opinion, but the actual arguments and evidence that they’re able to bring.  Except that as we’ve seen, for Bayesians with common priors this isn’t true at all!  Instead, merely hearing your friend’s opinion serves as a powerful summary of what your friend knows.  And if you learn that your rational friend disagrees with you, then even without knowing why, you should take that as seriously as if you discovered a contradiction in your own thought processes.  This is related to an even broader point: there’s a normative rule of rationality that you should judge ideas only on their merits—yet if you’re a Bayesian, of course you’re going to take into account where the ideas come from, and how many other people hold them!  Likewise, if you’re a Bayesian police officer or a Bayesian airport screener or a Bayesian job interviewer, of course you’re going to profile people by their superficial characteristics, however unfair that might be to individuals—so all those studies proving that people evaluate the same resume differently if you change the name at the top are no great surprise.  It seems to me that the tension between these two different views of rationality, the normative and the Bayesian, generates a lot of the most intractable debates of the modern world.

(2) Or—and this is an obvious one—you could reject the assumption of common priors. After all, isn’t a major selling point of Bayesianism supposed to be its subjective aspect, the fact that you pick “whichever prior feels right for you,” and are constrained only in how to update that prior?  If Alice’s and Bob’s priors can be different, then all the reasoning I went through earlier collapses.  So rejecting common priors might seem appealing.  But there’s a paper by Tyler Cowen and Robin Hanson called “Are Disagreements Honest?”—one of the most worldview-destabilizing papers I’ve ever read—that calls that strategy into question.  What it says, basically, is this: if you’re really a thoroughgoing Bayesian rationalist, then your prior ought to allow for the possibility that you are the other person.  Or to put it another way: “you being born as you,” rather than as someone else, should be treated as just one more contingent fact that you observe and then conditionalize on!  And likewise, the other person should condition on the observation that they’re them and not you.  In this way, absolutely everything that makes you different from someone else can be understood as “differing information,” so we’re right back to the situation covered by Aumann’s Theorem.  Imagine, if you like, that we all started out behind some Rawlsian veil of ignorance, as pure reasoning minds that had yet to be assigned specific bodies.  In that original state, there was nothing to differentiate any of us from any other—anything that did would just be information to condition on—so we all should’ve had the same prior.  That might sound fanciful, but in some sense all it’s saying is: what licenses you to privilege an observation just because it’s your eyes that made it, or a thought just because it happened to occur in your head?  Like, if you’re objectively smarter or more observant than everyone else around you, fine, but to whatever extent you agree that you aren’t, your opinion gets no special epistemic protection just because it’s yours.

(3) If you’re uncomfortable with this tendency of Bayesian reasoning to refuse to be confined anywhere, to want to expand to cosmic or metaphysical scope (“I need to condition on having been born as me and not someone else”)—well then, you could reject the entire framework of Bayesianism, as your notion of rationality. Lest I be cast out from this camp as a heretic, I hasten to say: I include this option only for the sake of completeness!

(4) When I first learned about this stuff 12 years ago, it seemed obvious to me that a lot of it could be dismissed as irrelevant to the real world for reasons of complexity. I.e., sure, it might apply to ideal reasoners with unlimited time and computational power, but as soon as you impose realistic constraints, this whole Aumannian house of cards should collapse.  As an example, if Alice and Bob have common priors, then sure they’ll agree about everything if they effectively share all their information with each other!  But in practice, we don’t have time to “mind-meld,” swapping our entire life experiences with anyone we meet.  So one could conjecture that agreement, in general, requires a lot of communication.  So then I sat down and tried to prove that as a theorem.  And you know what I found?  That my intuition here wasn’t even close to correct!

In more detail, I proved the following theorem.  Suppose Alice and Bob are Bayesians with shared priors, and suppose they’re arguing about (say) the probability of some future event—or more generally, about any random variable X bounded in [0,1].  So, they have a conversation where Alice first announces her expectation of X, then Bob announces his new expectation, and so on.  The theorem says that Alice’s and Bob’s estimates of X will necessarily agree to within ±ε, with probability at least 1-δ over their shared prior, after they’ve exchanged only O(1/(δε2)) messages.  Note that this bound is completely independent of how much knowledge they have; it depends only on the accuracy with which they want to agree!  Furthermore, the same bound holds even if Alice and Bob only send a few discrete bits about their real-valued expectations with each message, rather than the expectations themselves.

The proof involves the idea that Alice and Bob’s estimates of X, call them XA and XB respectively, follow “unbiased random walks” (or more formally, are martingales).  Very roughly, if |XA-XB|≥ε with high probability over Alice and Bob’s shared prior, then that fact implies that the next message has a high probability (again, over the shared prior) of causing either XA or XB to jump up or down by about ε.  But XA and XB, being estimates of X, are bounded between 0 and 1.  So a random walk with a step size of ε can only continue for about 1/ε2 steps before it hits one of the “absorbing barriers.”

The way to formalize this is to look at the variances, Var[XA] and Var[XB], with respect to the shared prior.  Because Alice and Bob’s partitions keep getting refined, the variances are monotonically non-decreasing.  They start out 0 and can never exceed 1 (in fact they can never exceed 1/4, but let’s not worry about constants).  Now, the key lemma is that, if Pr[|XA-XB|≥ε]≥δ, then Var[XB] must increase by at least δε2 if Alice sends XA to Bob, and Var[XA] must increase by at least δε2 if Bob sends XB to Alice.  You can see my paper for the proof, or just work it out for yourself.  At any rate, the lemma implies that, after O(1/(δε2)) rounds of communication, there must be at least a temporary break in the disagreement; there must be some round where Alice and Bob approximately agree with high probability.

There are lots of other results in my paper, including an upper bound on the number of calls that Alice and Bob need to make to a “sampling oracle” to carry out this sort of protocol approximately, assuming they’re not perfect Bayesians but agents with bounded computational power.  But let me step back and address the broader question: what should we make of all this?  How should we live with the gargantuan chasm between the prediction of Bayesian rationality for how we should disagree, and the actual facts of how we do disagree?

We could simply declare that human beings are not well-modeled as Bayesians with common priors—that we’ve failed in giving a descriptive account of human behavior—and leave it at that.   OK, but that would still leave the question: does this stuff have normative value?  Should it affect how we behave, if we want to consider ourselves honest and rational?  I would argue, possibly yes.

Yes, you should constantly ask yourself the question: “would I still be defending this opinion, if I had been born as someone else?”  (Though you might say this insight predates Aumann by quite a bit, going back at least to Spinoza.)

Yes, if someone you respect as honest and rational disagrees with you, you should take it as seriously as if the disagreement were between two different aspects of yourself.

Finally, yes, we can try to judge epistemic communities by how closely they approach the Aumannian ideal.  In math and science, in my experience, it’s common to see two people furiously arguing with each other at a blackboard.  Come back five minutes later, and they’re arguing even more furiously, but now their positions have switched.  As we’ve seen, that’s precisely what the math says a rational conversation should look like.  In social and political discussions, though, usually the very best you’ll see is that two people start out diametrically opposed, but eventually one of them says “fine, I’ll grant you this,” and the other says “fine, I’ll grant you that.”  We might say, that’s certainly better than the common alternative, of the two people walking away even more polarized than before!  Yet the math tells us that even the first case—even the two people gradually getting closer in their views—is nothing at all like a rational exchange, which would involve the two participants repeatedly leapfrogging each other, completely changing their opinion about the question under discussion (and then changing back, and back again) every time they learned something new.  The first case, you might say, is more like haggling—more like “I’ll grant you that X is true if you grant me that Y is true”—than like our ideal friendly mathematicians arguing at the blackboard, whose acceptance of new truths is never slow or grudging, never conditional on the other person first agreeing with them about something else.

Armed with this understanding, we could try to rank fields by how hard it is to have an Aumannian conversation in them.  At the bottom—the easiest!—is math (or, let’s say, chess, or debugging a program, or fact-heavy fields like lexicography or geography).  Crucially, here I only mean the parts of these subjects with agreed-on rules and definite answers: once the conversation turns to whose theorems are deeper, or whose fault the bug was, things can get arbitrarily non-Aumannian.  Then there’s the type of science that involves messy correlational studies (I just mean, talking about what’s a risk factor for what, not the political implications).  Then there’s politics and aesthetics, with the most radioactive topics like Israel/Palestine higher up.  And then, at the very peak, there’s gender and social justice debates, where everyone brings their formative experiences along, and absolutely no one is a disinterested truth-seeker, and possibly no Aumannian conversation has ever been had in the history of the world.

I would urge that even at the very top, it’s still incumbent on all of us to try to make the Aumannian move, of “what would I think about this issue if I were someone else and not me?  If I were a man, a woman, black, white, gay, straight, a nerd, a jock?  How much of my thinking about this represents pure Spinozist reason, which could be ported to any rational mind, and how much of it would get lost in translation?”

Anyway, I’m sure some people would argue that, in the end, the whole framework of Bayesian agents, common priors, common knowledge, etc. can be chucked from this discussion like so much scaffolding, and the moral lessons I want to draw boil down to trite advice (“try to see the other person’s point of view”) that you all knew already.  Then again, even if you all knew all this, maybe you didn’t know that you all knew it!  So I hope you gained some new information from this talk in any case.  Thanks.

Update: Coincidentally, there’s a moving NYT piece by Oliver Sacks, which (among other things) recounts his experiences with his cousin, the Aumann of Aumann’s theorem.

Another Update: If I ever did attempt an Aumannian conversation with someone, the other Scott A. would be a candidate! Here he is in 2011 making several of the same points I did above, using the same examples (I thank him for pointing me to his post).

### “Can Quantum Computing Reveal the True Meaning of Quantum Mechanics?”

Thursday, June 25th, 2015

I now have a 3500-word post on that question up at NOVA’s “Nature of Reality” blog.  If you’ve been reading Shtetl-Optimized religiously for the past decade (why?), there won’t be much new to you there, but if not, well, I hope you like it!  Comments are welcome, either here or there.  Thanks so much to Kate Becker at NOVA for commissioning this piece, and for her help editing it.

### Can blog posts nourish the soul? Scott A. (alas, not me) as existence proof

Wednesday, June 3rd, 2015

Reading the essays and speculative fiction of Scott Alexander, as they’ve grown in awesomeness even just within the past half-year, has for me been like witnessing the birth of a new Asimov.  (For more Alexandery goodness, check out Universal Love, Said the Cactus Person.)  That this nerd-bard, this spinner of stupid Internet memes into reflections on eternity, came to my attention by way of his brilliantly defending me, is almost immaterial at this point; I don’t think it plays any role in my continuing admiration for his work.  Whatever you do, just keep writing, other Scott A.

### The End of Suffering?

Monday, June 1st, 2015

A computer science undergrad who reads this blog recently emailed me about an anxiety he’s been feeling connected to the Singularity—not that it will destroy all human life, but rather that it will make life suffering-free and therefore no longer worth living (more Brave New World than Terminator, one might say).

As he puts it:

This probably sounds silly, but I’ve been existentially troubled by certain science fiction predictions for about a year or two, most of them coming from the Ray Kurzweil/Singularity Institute types … What really bothers me is the idea of the “abolition of suffering” as some put it. I just don’t see the point. Getting rid of cancer, premature death, etc., that all sounds great. But death itself? All suffering? At what point do we just sit down and ask ourselves, why not put our brains in a jar, and just activate our pleasure receptors for all eternity? That seems to be the logical conclusion of that line of thinking. If we want to reduce the conscious feeling of pleasure to the release of dopamine in the brain, well, why not?

I guess what I think I’m worried about is having to make the choice to become a cyborg, or to upload my mind to a computer, to live forever, or to never suffer again. I don’t know how I’d answer, given the choice. I enjoy being human, and that includes my suffering. I really don’t want to live forever. I see that as a hedonic treadmill more than anything else. Crazy bioethicists like David Pearce, who want to genetically re-engineer all species on planet Earth to be herbivores, and literally abolish all suffering, just add fuel to my anxiety.

… Do you think we’re any closer to what Kurzweil (or Pearce) predicted (and by that I mean, will we see it in our lifetimes)? I want to stop worrying about these things, but something is preventing me from doing so. Thoughts about the far flung (or near) future are just intrusive for me. And it seems like everywhere I go I’m reminded of my impending fate. Ernst Jünger would encourage me to take up an attitude of amor fati, but I can’t see myself doing that. My father says I’m too young to worry about these things, and that the answer will be clear when I’ve actually lived my life. But I just don’t know. I want to stop caring, more than anything else. It’s gotten to a point where the thoughts keep me up at night.

I don’t know how many readers might have had similar anxieties, but in any case, I thought my reply might be of some interest to others, so with the questioner’s kind permission, I’m reproducing it below.

1. An end to suffering removing the meaning from life? As my grandmother might say, “we should only have such problems”! I believe, alas, that suffering will always be with us, even after a hypothetical technological singularity, because of basic Malthusian logic. I.e., no matter how many resources there are, population will expand exponentially to exploit them and make the resources scarce again, thereby causing fighting, deprivation, and suffering. What’s terrifying about Malthus’s logic is how fully general it is: it applies equally to tenure-track faculty positions, to any extraterrestrial life that might exist in our universe or in any other bounded universe, and to the distant post-Singularity future.

But if, by some miracle, we were able to overcome Malthus and eliminate all suffering, my own inclination would be to say “go for it”! I can easily imagine a life that was well worth living—filled with beauty, humor, play, love, sex, and mathematical and scientific discovery—even though it was devoid of any serious suffering. (We could debate whether the “ideal life” would include occasional setbacks, frustrations, etc., even while agreeing that at any rate, it should certainly be devoid of cancer, poverty, bullying, suicidal depression, and one’s Internet connection going down.)

2. If you want to worry about something, then rather than an end to suffering, I might humbly suggest worrying about a large increase in human suffering within our lifetimes. A few possible culprits: climate change, resurgent religious fundamentalism, large parts of the world running out of fresh water.

3. It’s fun to think about these questions from time to time, to use them to hone our moral intuitions—and I even agree with Scott Alexander that it’s worthwhile to have a small number of smart people think about them full-time for a living.  But I should tell you that, as I wrote in my post The Singularity Is Far, I don’t expect a Singularity in my lifetime or my grandchildrens’ lifetimes. Yes, technically, if there’s ever going to be a Singularity, then we’re 10 years closer to it now than we were 10 years ago, but it could still be one hell of a long way away! And yes, I expect that technology will continue to change in my lifetime in amazing ways—not as much as it changed in my grandparents’ lifetimes, probably, but still by a lot—but how to put this? I’m willing to bet any amount of money that when I die, people’s shit will still stink.

### “Is There Something Mysterious About Math?”

Wednesday, April 22nd, 2015

When it rains, it pours: after not blogging for a month, I now have a second thing to blog about in as many days.  Aeon, an online magazine, asked me to write a short essay responding to the question above, so I did.  My essay is here.  Spoiler alert: my thesis is that yes, there’s something “mysterious” about math, but the main mystery is why there isn’t even more mystery than there is.  Also—shameless attempt to get you to click—the essay discusses the “discrete math is just a disorganized mess of random statements” view of Luboš Motl, who’s useful for putting flesh on what might otherwise be a strawman position.  Comments welcome (when aren’t they?).  You should also read other interesting responses to the same question by Penelope Maddy, James Franklin, and Neil Levy.  Thanks very much to Ed Lake at Aeon for commissioning these pieces.

Update (4/22): On rereading my piece, I felt bad that it didn’t make a clear enough distinction between two separate questions:

1. Are there humanly-comprehensible explanations for why the mathematical statements that we care about are true or false—thereby rendering their truth or falsity “non-mysterious” to us?
2. Are there formal proofs or disproofs of the statements?

Interestingly, neither of the above implies the other.  Thus, to take an example from the essay, no one has any idea how to prove that the digits 0 through 9 occur with equal frequency in the decimal expansion of π, and yet it’s utterly non-mysterious (at a “physics level of rigor”) why that particular statement should be true.  Conversely, there are many examples of statements for which we do have proofs, but which experts in the relevant fields still see as “mysterious,” because the proofs aren’t illuminating or explanatory enough.  Any proofs that require gigantic manipulations of formulas, “magically” terminating in the desired outcome, probably fall into that class, as do proofs that require computer enumeration of cases (like that of the Four-Color Theorem).

But it’s not just that proof and explanation are incomparable; sometimes they might even be at odds.  In this MathOverflow post, Timothy Gowers relates an interesting speculation of Don Zagier, that statements like the equidistribution of the digits of π might be unprovable from the usual axioms of set theory, precisely because they’re so “obviously” true—and for that very reason, there need not be anything deeper underlying their truth.  As Gowers points out, we shouldn’t go overboard with this speculation, because there are plenty of other examples of mathematical statements (the Green-Tao theorem, Vinogradov’s theorem, etc.) that also seem like they might be true “just because”—true only because their falsehood would require a statistical miracle—but for which mathematicians nevertheless managed to give fully rigorous proofs, in effect formalizing the intuition that it would take a miracle to make them false.

Zagier’s speculation is related to another objection one could raise against my essay: while I said that the “Gödelian gremlin” has remained surprisingly dormant in the 85 years since its discovery (and that this is a fascinating fact crying out for explanation), who’s to say that it’s not lurking in some of the very open problems that I mentioned, like π’s equidistribution, the Riemann Hypothesis, the Goldbach Conjecture, or P≠NP?  Conceivably, not only are all those conjectures unprovable from the usual axioms of set theory, but their unprovability is itself unprovable, and so on, so that we could never even have the satisfaction of knowing why we’ll never know.

My response to these objections is basically just to appeal yet again to the empirical record.  First, while proof and explanation need not go together and sometimes don’t, by and large they do go together: over thousands over years, mathematicians learned to seek formal proofs largely because they discovered that without them, their understanding constantly went awry.  Also, while no one can rule out that P vs. NP, the Riemann Hypothesis, etc., might be independent of set theory, there’s very little in the history of math—including in the recent history, which saw spectacular proofs of (e.g.) Fermat’s Last Theorem and the Poincaré Conjecture—that lends concrete support to such fatalism.

So in summary, I’d say that history does present us with “two mysteries of the mathematical supercontinent”—namely, why do so many of the mathematical statements that humans care about turn out to be tightly linked in webs of explanation, and also in webs of proof, rather than occupying separate islands?—and that these two mysteries are very closely related, if not quite the same.

### The ultimate physical limits of privacy

Wednesday, March 11th, 2015

Somewhat along the lines of my last post, the other day a reader sent me an amusing list of questions about privacy and fundamental physics.  The questions, and my answers, are below.

1. Does the universe provide us with a minimum level of information security?

I’m not sure what the question means. Yes, there are various types of information security that are rooted in the known laws of physics—some of them (like quantum key distribution) even relying on specific aspects of quantum physics—whose security one can argue for by appealing to the known properties of the physical world. Crucially, however, any information security protocol is only as good as the assumptions it rests on: for example, that the attacker can’t violate the attack model by, say, breaking into your house with an ax!

2. For example, is my information safe from entities outside the light-cone I project?

Yes, I think it’s safe to assume that your information is safe from any entities outside your future light-cone. Indeed, if information is not in your future light-cone, then almost by definition, you had no role in creating it, so in what sense should it be called “yours”?

3. Assume that there are distant alien cultures with infinite life spans – would they always be able to wait long enough for my light cone to spread to them, and then have a chance of detecting my “private” information?

First of all, the aliens would need to be in your future light-cone (see my answer to 2). In 1998, it was discovered that there’s a ‘dark energy’ pushing the galaxies apart at an exponentially-increasing rate. Assuming the dark energy remains there at its current density, galaxies that are far enough away from us (more than a few tens of billions of light-years) will always recede from us faster than the speed of light, meaning that they’ll remain outside our future light-cone, and signals from us can never reach them. So, at least you’re safe from those aliens!

For the aliens in your future light-cone, the question is subtler. Suppose you took the only piece of paper on which your secrets were written, and burned it to ash—nothing high-tech, just burned it. Then there’s no technology that we know today, or could even seriously envision, that would piece the secrets together. It would be like unscrambling an egg, or bringing back the dead from decomposing corpses, or undoing a quantum measurement. It would mean, effectively, reversing the Arrow of Time in the relevant part of the universe. This is formally allowed by the Second Law of Thermodynamics, since the decrease in entropy within that region could be balanced by an increase in entropy elsewhere, but it would require a staggering level of control over the region’s degrees of freedom.

On the other hand, it’s also true that the microscopic laws of physics are reversible: they never destroy information. And for that reason, as a matter of principle, we can’t rule out the possibility that some civilization of the very far future, whether human or alien, could piece together what was written on your paper even after you’d burned it to a crisp. Indeed, with such godlike knowledge and control, maybe they could even reconstruct the past states of your brain, and thereby piece together private thoughts that you’d never written anywhere!

4. Does living in a black hole provide privacy? Couldn’t they follow you into the hole?

No, I would not recommend jumping into a black hole as a way to ensure your privacy. For one thing, you won’t get to enjoy the privacy for long (a couple hours, maybe, for a supermassive black hole at the center of a galaxy?) before getting spaghettified on your way to the singularity. For another, as you correctly pointed out, other people could still snoop on you by jumping into the black hole themselves—although they’d have to want badly enough to learn your secrets that they wouldn’t mind dying themselves along with you, and also not being able to share whatever they learned with anyone outside the hole.

But a third problem is that even inside a black hole, your secrets might not be safe forever! Since the 1970s, it’s been thought that all information dropped into a black hole eventually comes out, in extremely-scrambled form, in the Hawking radiation that black holes produce as they slowly shrink and evaporate. What do I mean by “slowly”? Well, the evaporation would take about 1070 years for a black hole the mass of the sun, or about 10100 years for the black holes at the centers of galaxies. Furthermore, even after the black hole had evaporated, piecing together the infalling secrets from the Hawking radiation would probably make reconstructing what was on the burned paper from the smoke and ash seem trivial by comparison! But just like in the case of the burned paper, the information is still formally present (if current ideas about quantum gravity are correct), so one can’t rule out that it could be reconstructed by some civilization of the extremely remote future.

### The flow of emails within the block inbox

Saturday, March 7th, 2015

As a diversion from the important topics of shaming, anti-shaming, and anti-anti-shaming, I thought I’d share a little email exchange (with my interlocutor’s kind permission), which gives a good example of what I find myself doing all day when I’m not blogging, changing diapers, or thinking about possibly doing some real work (but where did all the time go?).

Dear Professor Aaronson,

I would be very pleased to know your opinion about time.  In a letter of condolence to the Besso family, Albert Einstein wrote: “Now he has departed from this strange world a little ahead of me. That means nothing. People like us, who believe in physics, know that the distinction between past, present and future is only a stubbornly persistent illusion.” I’m a medical doctor and everyday I see time’s effect over human bodies. Is Einstein saying time is an illusion?  For who ‘believe in physics’ is death an illusion?  Don’t we lose our dears and will they continue to live in an ‘eternal world’?

Is time only human perceptive illusion (as some scientists say physics has proved)?

Dear [redacted],

I don’t read Einstein in that famous quote as saying that time itself is an illusion, but rather, that the sense of time flowing from past to present to future is an illusion. He meant, for example, that the differential equations of physics can just as easily be run backward (from future to past) as forward (from past to future), and that studying physics can strongly encourage a perspective—which philosophers call the “block universe” perspective—where you treat the entire history of spacetime as just a fixed, 4-dimensional manifold, with time simply another dimension in addition to the three spatial ones (admittedly, a dimension that the laws of physics treat somewhat differently than the other three). And yes, relativity encourages this perspective, by showing that different observers, moving at different speeds relative to each other, will divide up the 4-dimensional manifold into time slices in different ways, with two events judged to be simultaneous by one observer judged to be happening at different times by another.

But even after Einstein is read this way, I’d personally respond: well, that’s just one perspective you can take. A perfectly understandable one, if you’re Einstein, and especially if you’re Einstein trying to comfort the bereaved. But still: would you want to say, for example, that because physics treats the table in front of you as just a collection of elementary particles held together by forces, therefore the table, as such, doesn’t “exist”? That seems overwrought. Physics deepens your understanding of the table, of course—showing you what its microscopic constituents are and why they hold themselves together—but the table still “exists.”  In much the same way, physics enormously deepened our understanding of what we mean by the “flow of time”—showing how the “flow” emerges from the time-symmetric equations of physics, combined with the time-asymmetric phenomena of thermodynamics, which increase the universe’s entropy as we move away from the Big Bang, and thereby allow for the creation of memories, records, and other irreversible effects (a part of the story that I didn’t even get into here). But it feels overwrought to say that, because physics gives us a perspective from which we can see the “flow of time” as emerging from something deeper, therefore the “flow” doesn’t exist, or is just an illusion.

Hope that helps!

Best,
Scott

(followup question)

Dear Professor,

I’ve been thinking about the “block universe” and it seems to me that in it past, present and future all coexist.  So on the basis of Einstein’s theory, do all exist eternally, and why do we perceive only the present?

But you don’t perceive only the present!  In the past, you perceived what’s now the past (and which you now remember), and in the future, you’ll perceive what’s now the future (and which you now look forward to), right?  And as for why the present is the present, and not some other point in time?  Well, that strikes me as one of those questions like why you’re you, out of all the possible people who you could have been instead, or why, assuming there are billions of habitable planets, you find yourself on earth and not on any of the other planets.  Maybe the best answer is that you had to be someone, living somewhere, at some particular point in time when you asked this question—and you could’ve wondered the same thing regardless of what the answer had turned out to be.

### “Could a Quantum Computer Have Subjective Experience?”

Monday, August 25th, 2014

Author’s Note: Below is the prepared version of a talk that I gave two weeks ago at the workshop Quantum Foundations of a Classical Universe, which was held at IBM’s TJ Watson Research Center in Yorktown Heights, NY.  My talk is for entertainment purposes only; it should not be taken seriously by anyone.  If you reply in a way that makes clear you did take it seriously (“I’m shocked and outraged that someone who dares to call himself a scientist would … [blah blah]”), I will log your IP address, hunt you down at night, and force you to put forward an account of consciousness and decoherence that deals with all the paradoxes discussed below—and then reply at length to all criticisms of your account.

If you’d like to see titles, abstracts, and slides for all the talks from the workshop—including by Charles Bennett, Sean Carroll, James Hartle, Adrian Kent, Stefan Leichenauer, Ken Olum, Don Page, Jason Pollack, Jess Riedel, Mark Srednicki, Wojciech Zurek, and Michael Zwolak—click here.  You’re also welcome to discuss these other nice talks in the comments section, though I might or might not be able to answer questions about them.  Apparently videos of all the talks will be available before long (Jess Riedel has announced that videos are now available).

(Note that, as is probably true for other talks as well, the video of my talk differs substantially from the prepared version—it mostly just consists of interruptions and my responses to them!  On the other hand, I did try to work some of the more salient points from the discussion into the text below.)

Thanks so much to Charles Bennett and Jess Riedel for organizing the workshop, and to all the participants for great discussions.

I didn’t prepare slides for this talk—given the topic, what slides would I use exactly?  “Spoiler alert”: I don’t have any rigorous results about the possibility of sentient quantum computers, to state and prove on slides.  I thought of giving a technical talk on quantum computing theory, but then I realized that I don’t really have technical results that bear directly on the subject of the workshop, which is how the classical world we experience emerges from the quantum laws of physics.  So, given the choice between a technical talk that doesn’t really address the questions we’re supposed to be discussing, or a handwavy philosophical talk that at least tries to address them, I opted for the latter, so help me God.

Let me start with a story that John Preskill told me years ago.  In the far future, humans have solved not only the problem of building scalable quantum computers, but also the problem of human-level AI.  They’ve built a Turing-Test-passing quantum computer.  The first thing they do, to make sure this is actually a quantum computer, is ask it to use Shor’s algorithm to factor a 10,000-digit number.  So the quantum computer factors the number.  Then they ask it, “while you were factoring that number, what did it feel like?  did you feel yourself branching into lots of parallel copies, which then recohered?  or did you remain a single consciousness—a ‘unitary’ consciousness, as it were?  can you tell us from introspection which interpretation of quantum mechanics is the true one?”  The quantum computer ponders this for a while and then finally says, “you know, I might’ve known before, but now I just … can’t remember.”

I like to tell this story when people ask me whether the interpretation of quantum mechanics has any empirical consequences.

Look, I understand the impulse to say “let’s discuss the measure problem, or the measurement problem, or derivations of the Born rule, or Boltzmann brains, or observer-counting, or whatever, but let’s take consciousness off the table.”  (Compare: “let’s debate this state law in Nebraska that says that, before getting an abortion, a woman has to be shown pictures of cute babies.  But let’s take the question of whether or not fetuses have human consciousness—i.e., the actual thing that’s driving our disagreement about that and every other subsidiary question—off the table, since that one is too hard.”)  The problem, of course, is that even after you’ve taken the elephant off the table (to mix metaphors), it keeps climbing back onto the table, often in disguises.  So, for better or worse, my impulse tends to be the opposite: to confront the elephant directly.

Having said that, I still need to defend the claim that (a) the questions we’re discussing, centered around quantum mechanics, Many Worlds, and decoherence, and (b) the question of which physical systems should be considered “conscious,” have anything to do with each other.  Many people would say that the connection doesn’t go any deeper than: “quantum mechanics is mysterious, consciousness is also mysterious, ergo maybe they’re related somehow.”  But I’m not sure that’s entirely true.  One thing that crystallized my thinking about this was a remark made in a lecture by Peter Byrne, who wrote a biography of Hugh Everett.  Byrne was discussing the question, why did it take so many decades for Everett’s Many-Worlds Interpretation to become popular?  Of course, there are people who deny quantum mechanics itself, or who have basic misunderstandings about it, but let’s leave those people aside.  Why did people like Bohr and Heisenberg dismiss Everett?  More broadly: why wasn’t it just obvious to physicists from the beginning that “branching worlds” is a picture that the math militates toward, probably the simplest, easiest story one can tell around the Schrödinger equation?  Even if early quantum physicists rejected the Many-Worlds picture, why didn’t they at least discuss and debate it?

Here was Byrne’s answer: he said, before you can really be on board with Everett, you first need to be on board with Daniel Dennett (the philosopher).  He meant: you first need to accept that a “mind” is just some particular computational process.  At the bottom of everything is the physical state of the universe, evolving via the equations of physics, and if you want to know where consciousness is, you need to go into that state, and look for where computations are taking place that are sufficiently complicated, or globally-integrated, or self-referential, or … something, and that’s where the consciousness resides.  And crucially, if following the equations tells you that after a decoherence event, one computation splits up into two computations, in different branches of the wavefunction, that thereafter don’t interact—congratulations!  You’ve now got two consciousnesses.

And if everything above strikes you as so obvious as not to be worth stating … well, that’s a sign of how much things changed in the latter half of the 20th century.  Before then, many thinkers would’ve been more likely to say, with Descartes: no, my starting point is not the physical world.  I don’t even know a priori that there is a physical world.  My starting point is my own consciousness, which is the one thing besides math that I can be certain about.  And the point of a scientific theory is to explain features of my experience—ultimately, if you like, to predict the probability that I’m going to see X or Y if I do A or B.  (If I don’t have prescientific knowledge of myself, as a single, unified entity that persists in time, makes choices, and later observes their consequences, then I can’t even get started doing science.)  I’m happy to postulate a world external to myself, filled with unseen entities like electrons behaving in arbitrarily unfamiliar ways, if it will help me understand my experience—but postulating other versions of me is, at best, irrelevant metaphysics.  This is a viewpoint that could lead you Copenhagenism, or to its newer variants like quantum Bayesianism.

I’m guessing that many people in this room side with Dennett, and (not coincidentally, I’d say) also with Everett.  I certainly have sympathies in that direction too.  In fact, I spent seven or eight years of my life as a Dennett/Everett hardcore believer.  But, while I don’t want to talk anyone out of the Dennett/Everett view, I’d like to take you on a tour of what I see as some of the extremely interesting questions that that view leaves unanswered.  I’m not talking about “deep questions of meaning,” but about something much more straightforward: what exactly does a computational process have to do to qualify as “conscious”?

Of course, there are already tremendous difficulties here, even if we ignore quantum mechanics entirely.  Ken Olum was over much of this ground in his talk yesterday (see here for a relevant paper by Davenport and Olum).  You’ve all heard the ones about, would you agree to be painlessly euthanized, provided that a complete description of your brain would be sent to Mars as an email attachment, and a “perfect copy” of you would be reconstituted there?  Would you demand that the copy on Mars be up and running before the original was euthanized?  But what do we mean by “before”—in whose frame of reference?

Some people say: sure, none of this is a problem!  If I’d been brought up since childhood taking family vacations where we all emailed ourselves to Mars and had our original bodies euthanized, I wouldn’t think anything of it.  But the philosophers of mind are barely getting started.

To these standard thought experiments, we can add more.  Let’s suppose that, purely for error-correction purposes, the computer that’s simulating your brain runs the code three times, and takes the majority vote of the outcomes.  Would that bring three “copies” of your consciousness into being?  Does it make a difference if the three copies are widely separated in space or time—say, on different planets, or in different centuries?  Is it possible that the massive redundancy taking place in your brain right now is bringing multiple copies of you into being?

Maybe my favorite thought experiment along these lines was invented by my former student Andy Drucker.  In the past five years, there’s been a revolution in theoretical cryptography, around something called Fully Homomorphic Encryption (FHE), which was first discovered by Craig Gentry.  What FHE lets you do is to perform arbitrary computations on encrypted data, without ever decrypting the data at any point.  So, to someone with the decryption key, you could be proving theorems, simulating planetary motions, etc.  But to someone without the key, it looks for all the world like you’re just shuffling random strings and producing other random strings as output.

You can probably see where this is going.  What if we homomorphically encrypted a simulation of your brain?  And what if we hid the only copy of the decryption key, let’s say in another galaxy?  Would this computation—which looks to anyone in our galaxy like a reshuffling of gobbledygook—be silently producing your consciousness?

When we consider the possibility of a conscious quantum computer, in some sense we inherit all the previous puzzles about conscious classical computers, but then also add a few new ones.  So, let’s say I run a quantum subroutine that simulates your brain, by applying some unitary transformation U.  But then, of course, I want to “uncompute” to get rid of garbage (and thereby enable interference between different branches), so I apply U-1.  Question: when I apply U-1, does your simulated brain experience the same thoughts and feelings a second time?  Is the second experience “the same as” the first, or does it differ somehow, by virtue of being reversed in time?  Or, since U-1U is just a convoluted implementation of the identity function, are there no experiences at all here?

Here’s a better one: many of you have heard of the Vaidman bomb.  This is a famous thought experiment in quantum mechanics where there’s a package, and we’d like to “query” it to find out whether it contains a bomb—but if we query it and there is a bomb, it will explode, killing everyone in the room.  What’s the solution?  Well, suppose we could go into a superposition of querying the bomb and not querying it, with only ε amplitude on querying the bomb, and √(1-ε2) amplitude on not querying it.  And suppose we repeat this over and over—each time, moving ε amplitude onto the “query the bomb” state if there’s no bomb there, but moving ε2 probability onto the “query the bomb” state if there is a bomb (since the explosion decoheres the superposition).  Then after 1/ε repetitions, we’ll have order 1 probability of being in the “query the bomb” state if there’s no bomb.  By contrast, if there is a bomb, then the total probability we’ve ever entered that state is (1/ε)×ε2 = ε.  So, either way, we learn whether there’s a bomb, and the probability that we set the bomb off can be made arbitrarily small.  (Incidentally, this is extremely closely related to how Grover’s algorithm works.)

OK, now how about the Vaidman brain?  We’ve got a quantum subroutine simulating your brain, and we want to ask it a yes-or-no question.  We do so by querying that subroutine with ε amplitude 1/ε times, in such a way that if your answer is “yes,” then we’ve only ever activated the subroutine with total probability ε.  Yet you still manage to communicate your “yes” answer to the outside world.  So, should we say that you were conscious only in the ε fraction of the wavefunction where the simulation happened, or that the entire system was conscious?  (The answer could matter a lot for anthropic purposes.)

You might say, sure, maybe these questions are puzzling, but what’s the alternative?  Either we have to say that consciousness is a byproduct of any computation of the right complexity, or integration, or recursiveness (or something) happening anywhere in the wavefunction of the universe, or else we’re back to saying that beings like us are conscious, and all these other things aren’t, because God gave the souls to us, so na-na-na.  Or I suppose we could say, like the philosopher John Searle, that we’re conscious, and the lookup table and homomorphically-encrypted brain and Vaidman brain and all these other apparitions aren’t, because we alone have “biological causal powers.”  And what do those causal powers consist of?  Hey, you’re not supposed to ask that!  Just accept that we have them.  Or we could say, like Roger Penrose, that we’re conscious and the other things aren’t because we alone have microtubules that are sensitive to uncomputable effects from quantum gravity.  But neither of those two options ever struck me as much of an improvement.

Yet I submit to you that, between these extremes, there’s another position we can stake out—one that I certainly don’t know to be correct, but that would solve so many different puzzles if it were correct that, for that reason alone, it seems to me to merit more attention than it usually receives.  (In an effort to give the view that attention, a couple years ago I wrote an 85-page essay called The Ghost in the Quantum Turing Machine, which one or two people told me they actually read all the way through.)  If, after a lifetime of worrying (on weekends) about stuff like whether a giant lookup table would be conscious, I now seem to be arguing for this particular view, it’s less out of conviction in its truth than out of a sense of intellectual obligation: to whatever extent people care about these slippery questions at all, to whatever extent they think various alternative views deserve a hearing, I believe this one does as well.

The intermediate position that I’d like to explore says the following.  Yes, consciousness is a property of any suitably-organized chunk of matter.  But, in addition to performing complex computations, or passing the Turing Test, or other information-theoretic conditions that I don’t know (and don’t claim to know), there’s at least one crucial further thing that a chunk of matter has to do before we should consider it conscious.  Namely, it has to participate fully in the Arrow of Time.  More specifically, it has to produce irreversible decoherence as an intrinsic part of its operation.  It has to be continually taking microscopic fluctuations, and irreversibly amplifying them into stable, copyable, macroscopic classical records.

Before I go further, let me be extremely clear about what this view is not saying.  Firstly, it’s not saying that the brain is a quantum computer, in any interesting sense—let alone a quantum-gravitational computer, like Roger Penrose wants!  Indeed, I see no evidence, from neuroscience or any other field, that the cognitive information processing done by the brain is anything but classical.  The view I’m discussing doesn’t challenge conventional neuroscience on that account.

Secondly, this view doesn’t say that consciousness is in any sense necessary for decoherence, or for the emergence of a classical world.  I’ve never understood how one could hold such a belief, while still being a scientific realist.  After all, there are trillions of decoherence events happening every second in stars and asteroids and uninhabited planets.  Do those events not “count as real” until a human registers them?  (Or at least a frog, or an AI?)  The view I’m discussing only asserts the converse: that decoherence is necessary for consciousness.  (By analogy, presumably everyone agrees that some amount of computation is necessary for an interesting consciousness, but that doesn’t mean consciousness is necessary for computation.)

Thirdly, the view I’m discussing doesn’t say that “quantum magic” is the explanation for consciousness.  It’s silent on the explanation for consciousness (to whatever extent that question makes sense); it seeks only to draw a defensible line between the systems we want to regard as conscious and the systems we don’t—to address what I recently called the Pretty-Hard Problem.  And the (partial) answer it suggests doesn’t seem any more “magical” to me than any other proposed answer to the same question.  For example, if one said that consciousness arises from any computation that’s sufficiently “integrated” (or something), I could reply: what’s the “magical force” that imbues those particular computations with consciousness, and not other computations I can specify?  Or if one said (like Searle) that consciousness arises from the biology of the brain, I could reply: so what’s the “magic” of carbon-based biology, that could never be replicated in silicon?  Or even if one threw up one’s hands and said everything was conscious, I could reply: what’s the magical power that imbues my stapler with a mind?  Each of these views, along with the view that stresses the importance of decoherence and the arrow of time, is worth considering.  In my opinion, each should be judged according to how well it holds up under the most grueling battery of paradigm-cases, thought experiments, and reductios ad absurdum we can devise.

So, why might one conjecture that decoherence, and participation in the arrow of time, were necessary conditions for consciousness?  I suppose I could offer some argument about our subjective experience of the passage of time being a crucial component of our consciousness, and the passage of time being bound up with the Second Law.  Truthfully, though, I don’t have any a-priori argument that I find convincing.  All I can do is show you how many apparent paradoxes get resolved if you make this one speculative leap.

For starters, if you think about exactly how our chunk of matter is going to amplify microscopic fluctuations, it could depend on details like the precise spin orientations of various subatomic particles in the chunk.  But that has an interesting consequence: if you’re an outside observer who doesn’t know the chunk’s quantum state, it might be difficult or impossible for you to predict what the chunk is going to do next—even just to give decent statistical predictions, like you can for a hydrogen atom.  And of course, you can’t in general perform a measurement that will tell you the chunk’s quantum state, without violating the No-Cloning Theorem.  For the same reason, there’s in general no physical procedure that you can apply to the chunk to duplicate it exactly: that is, to produce a second chunk that you can be confident will behave identically (or almost identically) to the first, even just in a statistical sense.  (Again, this isn’t assuming any long-range quantum coherence in the chunk: only microscopic coherence that then gets amplified.)

It might be objected that there are all sorts of physical systems that “amplify microscopic fluctuations,” but that aren’t anything like what I described, at least not in any interesting sense: for example, a Geiger counter, or a photodetector, or any sort of quantum-mechanical random-number generator.  You can make, if not an exact copy of a Geiger counter, surely one that’s close enough for practical purposes.  And, even though the two counters will record different sequences of clicks when pointed at identical sources, the statistical distribution of clicks will be the same (and precisely calculable), and surely that’s all that matters.  So, what separates these examples from the sorts of examples I want to discuss?

What separates them is the undisputed existence of what I’ll call a clean digital abstraction layer.  By that, I mean a macroscopic approximation to a physical system that an external observer can produce, in principle, without destroying the system; that can be used to predict what the system will do to excellent accuracy (given knowledge of the environment); and that “sees” quantum-mechanical uncertainty—to whatever extent it does—as just a well-characterized source of random noise.  If a system has such an abstraction layer, then we can regard any quantum noise as simply part of the “environment” that the system observes, rather than part of the system itself.  I’ll take it as clear that such clean abstraction layers exist for a Geiger counter, a photodetector, or a computer with a quantum random number generator.  By contrast, for (say) an animal brain, I regard it as currently an open question whether such an abstraction layer exists or not.  If, someday, it becomes routine for nanobots to swarm through people’s brains and make exact copies of them—after which the “original” brains can be superbly predicted in all circumstances, except for some niggling differences that are traceable back to different quantum-mechanical dice rolls—at that point, perhaps educated opinion will have shifted to the point where we all agree the brain does have a clean digital abstraction layer.  But from where we stand today, it seems entirely possible to agree that the brain is a physical system obeying the laws of physics, while doubting that the nanobots would work as advertised.  It seems possible that—as speculated by Bohr, Compton, Eddington, and even Alan Turing—if you want to get it right you’ll need more than just the neural wiring graph, the synaptic strengths, and the approximate neurotransmitter levels.  Maybe you also need (e.g.) the internal states of the neurons, the configurations of sodium-ion channels, or other data that you simply can’t get without irreparably damaging the original brain—not only as a contingent matter of technology but as a fundamental matter of physics.

(As a side note, I should stress that obviously, even without invasive nanobots, our brains are constantly changing, but we normally don’t say as a result that we become completely different people at each instant!  To my way of thinking, though, this transtemporal identity is fundamentally different from a hypothetical identity between different “copies” of you, in the sense we’re talking about.  For one thing, all your transtemporal doppelgängers are connected by a single, linear chain of causation.  For another, outside movies like Bill and Ted’s Excellent Adventure, you can’t meet your transtemporal doppelgängers and have a conversation with them, nor can scientists do experiments on some of them, then apply what they learned to others that remained unaffected by their experiments.)

So, on this view, a conscious chunk of matter would be one that not only acts irreversibly, but that might well be unclonable for fundamental physical reasons.  If so, that would neatly resolve many of the puzzles that I discussed before.  So for example, there’s now a straightforward reason why you shouldn’t consent to being killed, while your copy gets recreated on Mars from an email attachment.  Namely, that copy will have a microstate with no direct causal link to your “original” microstate—so while it might behave similarly to you in many ways, you shouldn’t expect that your consciousness will “transfer” to it.  If you wanted to get your exact microstate to Mars, you could do that in principle using quantum teleportation—but as we all know, quantum teleportation inherently destroys the original copy, so there’s no longer any philosophical problem!  (Or, of course, you could just get on a spaceship bound for Mars: from a philosophical standpoint, it amounts to the same thing.)

Similarly, in the case where the simulation of your brain was run three times for error-correcting purposes: that could bring about three consciousnesses if, and only if, the three simulations were tied to different sets of decoherence events.  The giant lookup table and the Earth-sized brain simulation wouldn’t bring about any consciousness, unless they were implemented in such a way that they no longer had a clean digital abstraction layer.  What about the homomorphically-encrypted brain simulation?  That might no longer work, simply because we can’t assume that the microscopic fluctuations that get amplified are homomorphically encrypted.  Those are “in the clear,” which inevitably leaks information.  As for the quantum computer that simulates your thought processes and then perfectly reverses the simulation, or that queries you like a Vaidman bomb—in order to implement such things, we’d of course need to use quantum fault-tolerance, so that the simulation of you stayed in an encoded subspace and didn’t decohere.  But under our assumption, that would mean the simulation wasn’t conscious.

Now, it might seem to some of you like I’m suggesting something deeply immoral.  After all, the view I’m considering implies that, even if a system passed the Turing Test, and behaved identically to a human, even if it eloquently pleaded for its life, if it wasn’t irreversibly decohering microscopic events then it wouldn’t be conscious, so it would be fine to kill it, torture it, whatever you want.

But wait a minute: if a system isn’t doing anything irreversible, then what exactly does it mean to “kill” it?  If it’s a classical computation, then at least in principle, you could always just restore from backup.  You could even rewind and not only erase the memories of, but “uncompute” (“untorture”?) whatever tortures you had performed.  If it’s a quantum computation, you could always invert the unitary transformation U that corresponded to killing the thing (then reapply U and invert it again for good measure, if you wanted).  Only for irreversible systems are there moral acts with irreversible consequences.

This is related to something that’s bothered me for years in quantum foundations.  When people discuss Schrödinger’s cat, they always—always—insert some joke about, “obviously, this experiment wouldn’t pass the Ethical Review Board.  Nowadays, we try to avoid animal cruelty in our quantum gedankenexperiments.”  But actually, I claim that there’s no animal cruelty at all in the Schrödinger’s cat experiment.  And here’s why: in order to prove that the cat was ever in a coherent superposition of |Alive〉 and |Dead〉, you need to be able to measure it in a basis like {|Alive〉+|Dead〉,|Alive〉-|Dead〉}.  But if you can do that, you must have such precise control over all the cat’s degrees of freedom that you can also rotate unitarily between the |Alive〉 and |Dead〉 states.  (To see this, let U be the unitary that you applied to the |Alive〉 branch, and V the unitary that you applied to the |Dead〉 branch, to bring them into coherence with each other; then consider applying U-1V.)  But if you can do that, then in what sense should we say that the cat in the |Dead〉 state was ever “dead” at all?  Normally, when we speak of “killing,” we mean doing something irreversible—not rotating to some point in a Hilbert space that we could just as easily rotate away from.

(There followed discussion among some audience members about the question of whether, if you destroyed all records of some terrible atrocity, like the Holocaust, everywhere in the physical world, you would thereby cause the atrocity “never to have happened.”  Many people seemed surprised by my willingness to accept that implication of what I was saying.  By way of explaining, I tried to stress just how far our everyday, intuitive notion of “destroying all records of something” falls short of what would actually be involved here: when we think of “destroying records,” we think about burning books, destroying the artifacts in museums, silencing witnesses, etc.  But even if all those things were done and many others, still the exact configurations of the air, the soil, and photons heading away from the earth at the speed of light would retain their silent testimony to the Holocaust’s reality.  “Erasing all records” in the physics sense would be something almost unimaginably more extreme: it would mean inverting the entire physical evolution in the vicinity of the earth, stopping time’s arrow and running history itself backwards.  Such ‘unhappening’ of what’s happened is something that we lack any experience of, at least outside of certain quantum interference experiments—though in the case of the Holocaust, one could be forgiven for wishing it were possible.)

OK, so much for philosophy of mind and morality; what about the interpretation of quantum mechanics?  If we think about consciousness in the way I’ve suggested, then who’s right: the Copenhagenists or the Many-Worlders?  You could make a case for either.  The Many-Worlders would be right that we could always, if we chose, think of decoherence events as “splitting” our universe into multiple branches, each with different versions of ourselves, that thereafter don’t interact.  On the other hand, the Copenhagenists would be right that, even in principle, we could never do any experiment where this “splitting” of our minds would have any empirical consequence.  On this view, if you can control a system well enough that you can actually observe interference between the different branches, then it follows that you shouldn’t regard the system as conscious, because it’s not doing anything irreversible.

In my essay, the implication that concerned me the most was the one for “free will.”  If being conscious entails amplifying microscopic events in an irreversible and unclonable way, then someone looking at a conscious system from the outside might not, in general, be able to predict what it’s going to do next, not even probabilistically.  In other words, its decisions might be subject to at least some “Knightian uncertainty”: uncertainty that we can’t even quantify in a mutually-agreed way using probabilities, in the same sense that we can quantify our uncertainty about (say) the time of a radioactive decay.  And personally, this is actually the sort of “freedom” that interests me the most.  I don’t really care if my choices are predictable by God, or by a hypothetical Laplace demon: that is, if they would be predictable (at least probabilistically), given complete knowledge of the microstate of the universe.  By definition, there’s essentially no way for my choices not to be predictable in that weak and unempirical sense!  On the other hand, I’d prefer that my choices not be completely predictable by other people.  If someone could put some sheets of paper into a sealed envelope, then I spoke extemporaneously for an hour, and then the person opened the envelope to reveal an exact transcript of everything I said, that’s the sort of thing that really would cause me to doubt in what sense “I” existed as a locus of thought.  But you’d have to actually do the experiment (or convince me that it could be done): it doesn’t count just to talk about it, or to extrapolate from fMRI experiments that predict which of two buttons a subject is going to press with 60% accuracy a few seconds in advance.

But since we’ve got some cosmologists in the house, let me now turn to discussing the implications of this view for Boltzmann brains.

(For those tuning in from home: a Boltzmann brain is a hypothetical chance fluctuation in the late universe, which would include a conscious observer with all the perceptions that a human being—say, you—is having right now, right down to false memories and false beliefs of having arisen via Darwinian evolution.  On statistical grounds, the overwhelming majority of Boltzmann brains last just long enough to have a single thought—like, say, the one you’re having right now—before they encounter the vacuum and freeze to death.  If you measured some part of the vacuum state toward which our universe seems to be heading, asking “is there a Boltzmann brain here?,” quantum mechanics predicts that the probability would be ridiculously astronomically small, but nonzero.  But, so the argument goes, if the vacuum lasts for infinite time, then as long as the probability is nonzero, it doesn’t matter how tiny it is: you’ll still get infinitely many Boltzmann brains indistinguishable from any given observer; and for that reason, any observer should consider herself infinitely likelier to be a Boltzmann brain than to be the “real,” original version.  For the record, even among the strange people at the IBM workshop, no one actually worried about being a Boltzmann brain.  The question, rather, is whether, if a cosmological model predicts Boltzmann brains, then that’s reason enough to reject the model, or whether we can live with such a prediction, since we have independent grounds for knowing that we can’t be Boltzmann brains.)

At this point, you can probably guess where this is going.  If decoherence, entropy production, full participation in the arrow of time are necessary conditions for consciousness, then it would follow, in particular, that a Boltzmann brain is not conscious.  So we certainly wouldn’t be Boltzmann brains, even under a cosmological model that predicts infinitely more of them than of us.  We can wipe our hands; the problem is solved!

I find it extremely interesting that, in their recent work, Kim Boddy, Sean Carroll, and Jason Pollack reached a similar conclusion, but from a completely different starting point.  They said: look, under reasonable assumptions, the late universe is just going to stay forever in an energy eigenstate—just sitting there doing nothing.  It’s true that, if someone came along and measured the energy eigenstate, asking “is there a Boltzmann brain here?,” then with a tiny but nonzero probability the answer would be yes.  But since no one is there measuring, what licenses us to interpret the nonzero overlap in amplitude with the Boltzmann brain state, as a nonzero probability of there being a Boltzmann brain?  I think they, too, are implicitly suggesting: if there’s no decoherence, no arrow of time, then we’re not authorized to say that anything is happening that “counts” for anthropic purposes.

Let me now mention an obvious objection.  (In fact, when I gave the talk, this objection was raised much earlier.)  You might say, “look, if you really think irreversible decoherence is a necessary condition for consciousness, then you might find yourself forced to say that there’s no consciousness, because there might not be any such thing as irreversible decoherence!  Imagine that our entire solar system were enclosed in an anti de Sitter (AdS) boundary, like in Greg Egan’s science-fiction novel Quarantine.  Inside the box, there would just be unitary evolution in some Hilbert space: maybe even a finite-dimensional Hilbert space.  In which case, all these ‘irreversible amplifications’ that you lay so much stress on wouldn’t be irreversible at all: eventually all the Everett branches would recohere; in fact they’d decohere and recohere infinitely many times.  So by your lights, how could anything be conscious inside the box?”

My response to this involves one last speculation.  I speculate that the fact that we don’t appear to live in AdS space—that we appear to live in (something evolving toward) a de Sitter space, with a positive cosmological constant—might be deep and important and relevant.  I speculate that, in our universe, “irreversible decoherence” means: the records of what you did are now heading toward our de Sitter horizon at the speed of light, and for that reason alone—even if for no others—you can’t put Humpty Dumpty back together again.  (Here I should point out, as several workshop attendees did to me, that Bousso and Susskind explored something similar in their paper The Multiverse Interpretation of Quantum Mechanics.)

Does this mean that, if cosmologists discover tomorrow that the cosmological constant is negative, or will become negative, then it will turn out that none of us were ever conscious?  No, that’s stupid.  What it would suggest is that the attempt I’m now making on the Pretty-Hard Problem had smacked into a wall (an AdS wall?), so that I, and anyone else who stressed in-principle irreversibility, should go back to the drawing board.  (By analogy, if some prescription for getting rid of Boltzmann brains fails, that doesn’t mean we are Boltzmann brains; it just means we need a new prescription.  Tempting as it is to skewer our opponents’ positions with these sorts of strawman inferences, I hope we can give each other the courtesy of presuming a bare minimum of sense.)

Another question: am I saying that, in order to be absolutely certain of whether some entity satisfied the postulated precondition for consciousness, one might, in general, need to look billions of years into the future, to see whether the “decoherence” produced by the entity was really irreversible?  Yes (pause to gulp bullet).  I am saying that.  On the other hand, I don’t think it’s nearly as bad as it sounds.  After all, the category of “consciousness” might be morally relevant, or relevant for anthropic reasoning, but presumably we all agree that it’s unlikely to play any causal role in the fundamental laws of physics.  So it’s not as if we’ve introduced any teleology into the laws of physics by this move.

Let me end by pointing out what I’ll call the “Tegmarkian slippery slope.”  It feels scientific and rational—from the perspective of many of us, even banal—to say that, if we’re conscious, then any sufficiently-accurate computer simulation of us would also be.  But I tried to convince you that this view depends, for its aura of obviousness, on our agreeing not to probe too closely exactly what would count as a “sufficiently-accurate” simulation.  E.g., does it count if the simulation is done in heavily-encrypted form, or encoded as a giant lookup table?  Does it matter if anyone actually runs the simulation, or consults the lookup table?  Now, all the way at the bottom of the slope is Max Tegmark, who asks: to produce consciousness, what does it matter if the simulation is physically instantiated at all?  Why isn’t it enough for the simulation to “exist” mathematically?  Or, better yet: if you’re worried about your infinitely-many Boltzmann brain copies, then why not worry equally about the infinitely many descriptions of your life history that are presumably encoded in the decimal expansion of π?  Why not hold workshops about how to avoid the prediction that we’re infinitely likelier to be “living in π” than to be our “real” selves?

From this extreme, even most scientific rationalists recoil.  They say, no, even if we don’t yet know exactly what’s meant by “physical instantiation,” we agree that you only get consciousness if the computer program is physically instantiated somehow.  But now I have the opening I want.  I can say: once we agree that physical existence is a prerequisite for consciousness, why not participation in the Arrow of Time?  After all, our ordinary ways of talking about sentient beings—outside of quantum mechanics, cosmology, and maybe theology—don’t even distinguish between the concepts “exists” and “exists and participates in the Arrow of Time.”  And to say we have no experience of reversible, clonable, coherently-executable, atemporal consciousnesses is a massive understatement.

Of course, we should avoid the sort of arbitrary prejudice that Turing warned against in Computing Machinery and Intelligence.  Just because we lack experience with extraterrestrial consciousnesses, doesn’t mean it would be OK to murder an intelligent extraterrestrial if we met one tomorrow.  In just the same way, just because we lack experience with clonable, atemporal consciousnesses, doesn’t mean it would be OK to … wait!  As we said before, clonability, and aloofness from time’s arrow, call severely into question what it even means to “murder” something.  So maybe this case isn’t as straightforward as the extraterrestrials after all.

At this point, I’ve probably laid out enough craziness, so let me stop and open things up for discussion.

### Integrated Information Theory: Virgil Griffith opines

Wednesday, June 25th, 2014

Remember the two discussions about Integrated Information Theory that we had a month ago on this blog?  You know, the ones where I argued that IIT fails because “the brain might be an expander, but not every expander is a brain”; where IIT inventor Giulio Tononi wrote a 14-page response biting the bullet with mustard; and where famous philosopher of mind David Chalmers, and leading consciousness researcher (and IIT supporter) Christof Koch, also got involved in the comments section?

OK, so one more thing about that.  Virgil Griffith recently completed his PhD under Christof Koch at Caltech—as he puts it, “immersing [him]self in the nitty-gritty of IIT for the past 6.5 years.”  This morning, Virgil sent me two striking letters about his thoughts on the recent IIT exchanges on this blog.  He asked me to share them here, something that I’m more than happy to do:

Reading these letters, what jumped out at me—given Virgil’s long apprenticeship in the heart of IIT-land—was the amount of agreement between my views and his.  In particular, Virgil agrees with my central contention that Φ, as it stands, can at most be a necessary condition for consciousness, not a sufficient condition, and remarks that “[t]o move IIT from talked about to accepted among hard scientists, it may be necessary for [Tononi] to wash his hands of sufficiency claims.”  He agrees that a lack of mathematical clarity in the definition of Φ is a “major problem in the IIT literature,” commenting that “IIT needs more mathematically inclined people at its helm.”  He also says he agrees “110%” that the lack of a derivation of the form of Φ from IIT’s axioms is “a pothole in the theory,” and further agrees 110% that the current prescriptions for computing Φ contain many unjustified idiosyncrasies.

Indeed, given the level of agreement here, there’s not all that much for me to rebut, defend, or clarify!

I suppose there are a few things.

1. Just as a clarifying remark, in a few places where it looks from the formatting like Virgil is responding to something I said (for example, “The conceptual structure is unified—it cannot be decomposed into independent components” and “Clearly, a theory of consciousness must be able to provide an adequate account for such seemingly disparate but largely uncontroversial facts”), he’s actually responding to something Giulio said (and that I, at most, quoted).
2. Virgil says, correctly, that Giulio would respond to my central objection against IIT by challenging my “intuition for things being unconscious.”  (Indeed, because Giulio did respond, there’s no need to speculate about how he would respond!)  However, Virgil then goes on to explicate Giulio’s response using the analogy of temperature (interestingly, the same analogy I used for a different purpose).  He points out how counterintuitive it would be for Kelvin’s contemporaries to accept that “even the coldest thing you’ve touched actually has substantial heat in it,” and remarks: “I find this ‘Kelvin scale for C’ analogy makes the panpsychism much more palatable.”  The trouble is that I never objected to IIT’s panpsychism per se: I only objected to its seemingly arbitrary and selective panpsychism.  It’s one thing for a theory to ascribe some amount of consciousness to a 2D grid or an expander graph.  It’s quite another for a theory to ascribe vastly more consciousness to those things than it ascribes to a human brain—even while denying consciousness to things that are intuitively similar but organized a little differently (say, a 1D grid).  A better analogy here would be if Kelvin’s theory of temperature had predicted, not merely that all ordinary things had some heat in them, but that an ice cube was hotter than the Sun, even though a popsicle was, of course, colder than the Sun.  (The ice cube, you see, “integrates heat” in a way that the popsicle doesn’t…)
3. Virgil imagines two ways that an IIT proponent could respond to my argument involving the cerebellum—the argument that accuses IIT proponents of changing the rules of the game according to convenience (a 2D grid has a large Φ?  suck it up and accept it; your intuitions about a grid’s lack of consciousness are irrelevant.  the human cerebellum has a small Φ?  ah, that’s a victory for IIT, since the cerebellum is intuitively unconscious).  The trouble is that both of Virgil’s imagined responses are by reference to the IIT axioms.  But I wasn’t talking about the axioms themselves, but about whether we’re allowed to validate the axioms, by checking their consequences against earlier, pre-theoretic intuitions.  And I was pointing out that Giulio seemed happy to do so when the results “went in IIT’s favor” (in the cerebellum example), even though he lectured me against doing so in the cases of the expander and the 2D grid (cases where IIT does less well, to put it mildly, at capturing our intuitions).
4. Virgil chastises me for ridiculing Giulio’s phenomenological argument for the consciousness of a 2D grid by way of nursery rhymes: “Just because it feels like something to see a wall, doesn’t mean it feels like something to be a wall.  You can smell a rose, and the rose can smell good, but that doesn’t mean the rose can smell you.”  Virgil amusingly comments: “Even when both are inebriated, I’ve never heard [Giulio] nor [Christof] separately or collectively imply anything like this.  Moreover, they’re each far too clueful to fall for something so trivial.”  For my part, I agree that neither Giulio nor Christof would ever advocate something as transparently silly as, “if you have a rich inner experience when thinking about X, then that’s evidence X itself is conscious.”  And I apologize if I seemed to suggest they would.  To clarify, my point was not that Giulio was making such an absurd statement, but rather that, assuming he wasn’t, I didn’t know what he was trying to say in the passages of his that I’d just quoted at length.  The silly thing seemed like the “obvious” reading of his words, and my hermeneutic powers were unequal to the task of figuring out the non-silly, non-obvious reading that he surely intended.

Anyway, there’s much more to Virgil’s letters than the above—including answers to some of my subsidiary questions about the details of IIT (e.g., how to handle unbalanced partitions, and the mathematical meanings of terms like “mechanism” and “system of mechanisms”).  Also, in parts of the letters, Virgil’s main concern is neither to agree with me nor to agree with Giulio, but rather to offer his own ideas, developed in the course of his PhD work, for how to move forward and fix some of the problems with IIT.  All in all, these are recommended reads for anyone who’s been following this debate.

### Giulio Tononi and Me: A Phi-nal Exchange

Friday, May 30th, 2014

You might recall that last week I wrote a post criticizing Integrated Information Theory (IIT), and its apparent implication that a simple Reed-Solomon decoding circuit would, if scaled to a large enough size, bring into being a consciousness vastly exceeding our own.  On Wednesday Giulio Tononi, the creator of IIT, was kind enough to send me a fascinating 14-page rebuttal, and to give me permission to share it here:

Why Scott should stare at a blank wall and reconsider (or, the conscious grid)

If you’re interested in this subject at all, then I strongly recommend reading Giulio’s response before continuing further.   But for those who want the tl;dr: Giulio, not one to battle strawmen, first restates my own argument against IIT with crystal clarity.  And while he has some minor quibbles (e.g., apparently my calculations of Φ didn’t use the most recent, “3.0” version of IIT), he wisely sets those aside in order to focus on the core question: according to IIT, are all sorts of simple expander graphs conscious?

There, he doesn’t “bite the bullet” so much as devour a bullet hoagie with mustard.  He affirms that, yes, according to IIT, a large network of XOR gates arranged in a simple expander graph is conscious.  Indeed, he goes further, and says that the “expander” part is superfluous: even a network of XOR gates arranged in a 2D square grid is conscious.  In my language, Giulio is simply pointing out here that a √n×√n square grid has decent expansion: good enough to produce a Φ-value of about √n, if not the information-theoretic maximum of n (or n/2, etc.) that an expander graph could achieve.  And apparently, by Giulio’s lights, Φ=√n is sufficient for consciousness!

While Giulio never mentions this, it’s interesting to observe that logic gates arranged in a 1-dimensional line would produce a tiny Φ-value (Φ=O(1)).  So even by IIT standards, such a linear array would not be conscious.  Yet the jump from a line to a two-dimensional grid is enough to light the spark of Mind.

Personally, I give Giulio enormous credit for having the intellectual courage to follow his theory wherever it leads.  When the critics point out, “if your theory were true, then the Moon would be made of peanut butter,” he doesn’t try to wiggle out of the prediction, but proudly replies, “yes, chunky peanut butter—and you forgot to add that the Earth is made of Nutella!”

Yet even as we admire Giulio’s honesty and consistency, his stance might also prompt us, gently, to take another look at this peanut-butter-moon theory, and at what grounds we had for believing it in the first place.  In his response essay, Giulio offers four arguments (by my count) for accepting IIT despite, or even because of, its conscious-grid prediction: one “negative” argument and three “positive” ones.  Alas, while your Φ-lage may vary, I didn’t find any of the four arguments persuasive.  In the rest of this post, I’ll go through them one by one and explain why.

I. The Copernicus-of-Consciousness Argument

Like many commenters on my last post, Giulio heavily criticizes my appeal to “common sense” in rejecting IIT.  Sure, he says, I might find it “obvious” that a huge Vandermonde matrix, or its physical instantiation, isn’t conscious.  But didn’t people also find it “obvious” for millennia that the Sun orbits the Earth?  Isn’t the entire point of science to challenge common sense?  Clearly, then, the test of a theory of consciousness is not how well it upholds “common sense,” but how well it fits the facts.

The above position sounds pretty convincing: who could dispute that observable facts trump personal intuitions?  The trouble is, what are the observable facts when it comes to consciousness?  The anti-common-sense view gets all its force by pretending that we’re in a relatively late stage of research—namely, the stage of taking an agreed-upon scientific definition of consciousness, and applying it to test our intuitions—rather than in an extremely early stage, of agreeing on what the word “consciousness” is even supposed to mean.

Since I think this point is extremely important—and of general interest, beyond just IIT—I’ll expand on it with some analogies.

Suppose I told you that, in my opinion, the ε-δ definition of continuous functions—the one you learn in calculus class—failed to capture the true meaning of continuity.  Suppose I told you that I had a new, better definition of continuity—and amazingly, when I tried out my definition on some examples, it turned out that ⌊x⌋ (the floor function) was continuous, whereas x2  had discontinuities, though only at 17.5 and 42.

You would probably ask what I was smoking, and whether you could have some.  But why?  Why shouldn’t the study of continuity produce counterintuitive results?  After all, even the standard definition of continuity leads to some famously weird results, like that x sin(1/x) is a continuous function, even though sin(1/x) is discontinuous.  And it’s not as if the standard definition is God-given: people had been using words like “continuous” for centuries before Bolzano, Weierstrass, et al. formalized the ε-δ definition, a definition that millions of calculus students still find far from intuitive.  So why shouldn’t there be a different, better definition of “continuous,” and why shouldn’t it reveal that a step function is continuous while a parabola is not?

In my view, the way out of this conceptual jungle is to realize that, before any formal definitions, any ε’s and δ’s, we start with an intuition for we’re trying to capture by the word “continuous.”  And if we press hard enough on what that intuition involves, we’ll find that it largely consists of various “paradigm-cases.”  A continuous function, we’d say, is a function like 3x, or x2, or sin(x), while a discontinuity is the kind of thing that the function 1/x has at x=0, or that ⌊x⌋ has at every integer point.  Crucially, we use the paradigm-cases to guide our choice of a formal definition—not vice versa!  It’s true that, once we have a formal definition, we can then apply it to “exotic” cases like x sin(1/x), and we might be surprised by the results.  But the paradigm-cases are different.  If, for example, our definition told us that x2 was discontinuous, that wouldn’t be a “surprise”; it would just be evidence that we’d picked a bad definition.  The definition failed at the only task for which it could have succeeded: namely, that of capturing what we meant.

Some people might say that this is all well and good in pure math, but empirical science has no need for squishy intuitions and paradigm-cases.  Nothing could be further from the truth.  Suppose, again, that I told you that physicists since Kelvin had gotten the definition of temperature all wrong, and that I had a new, better definition.  And, when I built a Scott-thermometer that measures true temperatures, it delivered the shocking result that boiling water is actually colder than ice.  You’d probably tell me where to shove my Scott-thermometer.  But wait: how do you know that I’m not the Copernicus of heat, and that future generations won’t celebrate my breakthrough while scoffing at your small-mindedness?

I’d say there’s an excellent answer: because what we mean by heat is “whatever it is that boiling water has more of than ice” (along with dozens of other paradigm-cases).  And because, if you use a thermometer to check whether boiling water is hotter than ice, then the term for what you’re doing is calibrating your thermometer.  When the clock strikes 13, it’s time to fix the clock, and when the thermometer says boiling water’s colder than ice, it’s time to replace the thermometer—or if needed, even the entire theory on which the thermometer is based.

Ah, you say, but doesn’t modern physics define heat in a completely different, non-intuitive way, in terms of molecular motion?  Yes, and that turned out to be a superb definition—not only because it was precise, explanatory, and applicable to cases far beyond our everyday experience, but crucially, because it matched common sense on the paradigm-cases.  If it hadn’t given sensible results for boiling water and ice, then the only possible conclusion would be that, whatever new quantity physicists had defined, they shouldn’t call it “temperature,” or claim that their quantity measured the amount of “heat.”  They should call their new thing something else.

The implications for the consciousness debate are obvious.  When we consider whether to accept IIT’s equation of integrated information with consciousness, we don’t start with any agreed-upon, independent notion of consciousness against which the new notion can be compared.  The main things we start with, in my view, are certain paradigm-cases that gesture toward what we mean:

• You are conscious (though not when anesthetized).
• (Most) other people appear to be conscious, judging from their behavior.
• Many animals appear to be conscious, though probably to a lesser degree than humans (and the degree of consciousness in each particular species is far from obvious).
• A rock is not conscious.  A wall is not conscious.  A Reed-Solomon code is not conscious.  Microsoft Word is not conscious (though a Word macro that passed the Turing test conceivably would be).

Fetuses, coma patients, fish, and hypothetical AIs are the x sin(1/x)’s of consciousness: they’re the tougher cases, the ones where we might actually need a formal definition to adjudicate the truth.

Now, given a proposed formal definition for an intuitive concept, how can we check whether the definition is talking about same thing we were trying to get at before?  Well, we can check whether the definition at least agrees that parabolas are continuous while step functions are not, that boiling water is hot while ice is cold, and that we’re conscious while Reed-Solomon decoders are not.  If so, then the definition might be picking out the same thing that we meant, or were trying to mean, pre-theoretically (though we still can’t be certain).  If not, then the definition is certainly talking about something else.

What else can we do?

II. The Axiom Argument

According to Giulio, there is something else we can do, besides relying on paradigm-cases.  That something else, in his words, is to lay down “postulates about how the physical world should be organized to support the essential properties of experience,” then use those postulates to derive a consciousness-measuring quantity.

OK, so what are IIT’s postulates?  Here’s how Giulio states the five postulates leading to Φ in his response essay (he “derives” these from earlier “phenomenological axioms,” which you can find in the essay):

1. A system of mechanisms exists intrinsically if it can make a difference to itself, by affecting the probability of its past and future states, i.e. it has causal power (existence).
2. It is composed of submechanisms each with their own causal power (composition).
3. It generates a conceptual structure that is the specific way it is, as specified by each mechanism’s concept — this is how each mechanism affects the probability of the system’s past and future states (information).
4. The conceptual structure is unified — it cannot be decomposed into independent components (integration).
5. The conceptual structure is singular — there can be no superposition of multiple conceptual structures over the same mechanisms and intervals of time.

From my standpoint, these postulates have three problems.  First, I don’t really understand them.  Second, insofar as I do understand them, I don’t necessarily accept their truth.  And third, insofar as I do accept their truth, I don’t see how they lead to Φ.

To elaborate a bit:

I don’t really understand the postulates.  I realize that the postulates are explicated further in the many papers on IIT.  Unfortunately, while it’s possible that I missed something, in all of the papers that I read, the definitions never seemed to “bottom out” in mathematical notions that I understood, like functions mapping finite sets to other finite sets.  What, for example, is a “mechanism”?  What’s a “system of mechanisms”?  What’s “causal power”?  What’s a “conceptual structure,” and what does it mean for it to be “unified”?  Alas, it doesn’t help to define these notions in terms of other notions that I also don’t understand.  And yes, I agree that all these notions can be given fully rigorous definitions, but there could be many different ways to do so, and the devil could lie in the details.  In any case, because (as I said) it’s entirely possible that the failure is mine, I place much less weight on this point than I do on the two points to follow.

I don’t necessarily accept the postulates’ truth.  Is consciousness a “unified conceptual structure”?  Is it “singular”?  Maybe.  I don’t know.  It sounds plausible.  But at any rate, I’m far less confident about any these postulates—whatever one means by them!—than I am about my own “postulate,” which is that you and I are conscious while my toaster is not.  Note that my postulate, though not phenomenological, does have the merit of constraining candidate theories of consciousness in an unambiguous way.

I don’t see how the postulates lead to Φ.  Even if one accepts the postulates, how does one deduce that the “amount of consciousness” should be measured by Φ, rather than by some other quantity?  None of the papers I read—including the ones Giulio linked to in his response essay—contained anything that looked to me like a derivation of Φ.  Instead, there was general discussion of the postulates, and then Φ just sort of appeared at some point.  Furthermore, given the many idiosyncrasies of Φ—the minimization over all bipartite (why just bipartite? why not tripartite?) decompositions of the system, the need for normalization (or something else in version 3.0) to deal with highly-unbalanced partitions—it would be quite a surprise were it possible to derive its specific form from postulates of such generality.

I was going to argue for that conclusion in more detail, when I realized that Giulio had kindly done the work for me already.  Recall that Giulio chided me for not using the “latest, 2014, version 3.0” edition of Φ in my previous post.  Well, if the postulates uniquely determined the form of Φ, then what’s with all these upgrades?  Or has Φ’s definition been changing from year to year because the postulates themselves have been changing?  If the latter, then maybe one should wait for the situation to stabilize before trying to form an opinion of the postulates’ meaningfulness, truth, and completeness?

III. The Ironic Empirical Argument

Or maybe not.  Despite all the problems noted above with the IIT postulates, Giulio argues in his essay that there’s a good a reason to accept them: namely, they explain various empirical facts from neuroscience, and lead to confirmed predictions.  In his words:

[A] theory’s postulates must be able to explain, in a principled and parsimonious way, at least those many facts about consciousness and the brain that are reasonably established and non-controversial.  For example, we know that our own consciousness depends on certain brain structures (the cortex) and not others (the cerebellum), that it vanishes during certain periods of sleep (dreamless sleep) and reappears during others (dreams), that it vanishes during certain epileptic seizures, and so on.  Clearly, a theory of consciousness must be able to provide an adequate account for such seemingly disparate but largely uncontroversial facts.  Such empirical facts, and not intuitions, should be its primary test…

[I]n some cases we already have some suggestive evidence [of the truth of the IIT postulates’ predictions].  One example is the cerebellum, which has 69 billion neurons or so — more than four times the 16 billion neurons of the cerebral cortex — and is as complicated a piece of biological machinery as any.  Though we do not understand exactly how it works (perhaps even less than we understand the cerebral cortex), its connectivity definitely suggests that the cerebellum is ill suited to information integration, since it lacks lateral connections among its basic modules.  And indeed, though the cerebellum is heavily connected to the cerebral cortex, removing it hardly affects our consciousness, whereas removing the cortex eliminates it.

I hope I’m not alone in noticing the irony of this move.  But just in case, let me spell it out: Giulio has stated, as “largely uncontroversial facts,” that certain brain regions (the cerebellum) and certain states (dreamless sleep) are not associated with our consciousness.  He then views it as a victory for IIT, if those regions and states turn out to have lower information integration than the regions and states that he does take to be associated with our consciousness.

But how does Giulio know that the cerebellum isn’t conscious?  Even if it doesn’t produce “our” consciousness, maybe the cerebellum has its own consciousness, just as rich as the cortex’s but separate from it.  Maybe removing the cerebellum destroys that other consciousness, unbeknownst to “us.”  Likewise, maybe “dreamless” sleep brings about its own form of consciousness, one that (unlike dreams) we never, ever remember in the morning.

Giulio might take the implausibility of those ideas as obvious, or at least as “largely uncontroversial” among neuroscientists.  But here’s the problem with that: he just told us that a 2D square grid is conscious!  He told us that we must not rely on “commonsense intuition,” or on any popular consensus, to say that if a square mesh of wires is just sitting there XORing some input bits, doing nothing at all that we’d want to call intelligent, then it’s probably safe to conclude that the mesh isn’t conscious.  So then why shouldn’t he say the same for the cerebellum, or for the brain in dreamless sleep?  By Giulio’s own rules (the ones he used for the mesh), we have no a-priori clue whether those systems are conscious or not—so even if IIT predicts that they’re not conscious, that can’t be counted as any sort of success for IIT.

For me, the point is even stronger: I, personally, would be a million times more inclined to ascribe consciousness to the human cerebellum, or to dreamless sleep, than I would to the mesh of XOR gates.  For it’s not hard to imagine neuroscientists of the future discovering “hidden forms of intelligence” in the cerebellum, and all but impossible to imagine them doing the same for the mesh.  But even if you put those examples on the same footing, still the take-home message seems clear: you can’t count it as a “success” for IIT if it predicts that the cerebellum in unconscious, while at the same time denying that it’s a “failure” for IIT if it predicts that a square mesh of XOR gates is conscious.  If the unconsciousness of the cerebellum can be considered an “empirical fact,” safe enough for theories of consciousness to be judged against it, then surely the unconsciousness of the mesh can also be considered such a fact.

IV. The Phenomenology Argument

I now come to, for me, the strangest and most surprising part of Giulio’s response.  Despite his earlier claim that IIT need not dovetail with “commonsense intuition” about which systems are conscious—that it can defy intuition—at some point, Giulio valiantly tries to reprogram our intuition, to make us feel why a 2D grid could be conscious.  As best I can understand, the argument seems to be that, when we stare at a blank 2D screen, we form a rich experience in our heads, and that richness must be mirrored by a corresponding “intrinsic” richness in 2D space itself:

[I]f one thinks a bit about it, the experience of empty 2D visual space is not at all empty, but contains a remarkable amount of structure.  In fact, when we stare at the blank screen, quite a lot is immediately available to us without any effort whatsoever.  Thus, we are aware of all the possible locations in space (“points”): the various locations are right “there”, in front of us.  We are aware of their relative positions: a point may be left or right of another, above or below, and so on, for every position, without us having to order them.  And we are aware of the relative distances among points: quite clearly, two points may be close or far, and this is the case for every position.  Because we are aware of all of this immediately, without any need to calculate anything, and quite regularly, since 2D space pervades most of our experiences, we tend to take for granted the vast set of relationship[s] that make up 2D space.

And yet, says IIT, given that our experience of the blank screen definitely exists, and it is precisely the way it is — it is 2D visual space, with all its relational properties — there must be physical mechanisms that specify such phenomenological relationships through their causal power … One may also see that the causal relationships that make up 2D space obtain whether the elements are on or off.  And finally, one may see that such a 2D grid is necessary not so much to represent space from the extrinsic perspective of an observer, but to create it, from its own intrinsic perspective.

Now, it would be child’s-play to criticize the above line of argument for conflating our consciousness of the screen with the alleged consciousness of the screen itself.  To wit:  Just because it feels like something to see a wall, doesn’t mean it feels like something to be a wall.  You can smell a rose, and the rose can smell good, but that doesn’t mean the rose can smell you.

However, I actually prefer a different tack in criticizing Giulio’s “wall argument.”  Suppose I accepted that my mental image of the relationships between certain entities was relevant to assessing whether those entities had their own mental life, independent of me or any other observer.  For example, suppose I believed that, if my experience of 2D space is rich and structured, then that’s evidence that 2D space is rich and structured enough to be conscious.

Then my question is this: why shouldn’t the same be true of 1D space?  After all, my experience of staring at a rope is also rich and structured, no less than my experience of staring at a wall.  I perceive some points on the rope as being toward the left, others as being toward the right, and some points as being between two other points.  In fact, the rope even has a structure—namely, a natural total ordering on its points—that the wall lacks.  So why does IIT cruelly deny subjective experience to a row of logic gates strung along a rope, reserving it only for a mesh of logic gates pasted to a wall?

And yes, I know the answer: because the logic gates on the rope aren’t “integrated” enough.  But who’s to say that the gates in the 2D mesh are integrated enough?  As I mentioned before, their Φ-value grows only as the square root of the number of gates, so that the ratio of integrated information to total information tends to 0 as the number of gates increases.  And besides, aren’t what Giulio calls “the facts of phenomenology” the real arbiters here, and isn’t my perception of the rope’s structure a phenomenological fact?  When you cut a rope, does it not split?  When you prick it, does it not fray?

Conclusion

At this point, I fear we’re at a philosophical impasse.  Having learned that, according to IIT,

1. a square grid of XOR gates is conscious, and your experience of staring at a blank wall provides evidence for that,
2. by contrast, a linear array of XOR gates is not conscious, your experience of staring at a rope notwithstanding,
3. the human cerebellum is also not conscious (even though a grid of XOR gates is), and
4. unlike with the XOR gates, we don’t need a theory to tell us the cerebellum is unconscious, but can simply accept it as “reasonably established” and “largely uncontroversial,”

I personally feel completely safe in saying that this is not the theory of consciousness for me.  But I’ve also learned that other people, even after understanding the above, still don’t reject IIT.  And you know what?  Bully for them.  On reflection, I firmly believe that a two-state solution is possible, in which we simply adopt different words for the different things that we mean by “consciousness”—like, say, consciousnessReal for my kind and consciousnessWTF for the IIT kind.  OK, OK, just kidding!  How about “paradigm-case consciousness” for the one and “IIT consciousness” for the other.

Completely unrelated announcement: Some of you might enjoy this Nature News piece by Amanda Gefter, about black holes and computational complexity.