Leonard Susskind’s Open Letter on “The Lunatic”

June 22nd, 2016

In my own anti-Trump post two weeks ago, I started out by mentioning that Terry Tao and Stephen Hawking had recently denounced Trump, and jokingly wondered when we’d hear from Ed Witten.  Well, will Leonard Susskind of Stanford University—a creator of string theory, and one of the most legendarily original physicists and physics expositors of our time—do instead?

Over the last decade, it’s been a privilege for me to get to know Lenny, to learn from him, and recently, to collaborate with him on quantum circuit complexity and AdS/CFT.  Today, Lenny wrote to ask whether I’d share his open letter about the US election on this blog.  Of course I said yes.  Better yet, Lenny has agreed to my request to be available here to answer questions and comments.  Lenny’s views, even when close to mine (as they certainly are in this case), are still his, and I’d never want to speak on his behalf.  Better that you should hear it straight from the horse’s mouth—as you now will, without further ado.  –Scott A.


Letter to My Friends, by Leonard Susskind

I’m watching this thing that’s happening with disbelief, dismay, and disgust. There is a lunatic loose—I’m sure we all agree about that—but I keep hearing people say that they can’t vote for Hillary. I heard it at my daughter’s birthday party Sunday. Boy oh boy, will these people be sorry if the lunatic gets his way. Personally I do not find it an excuse that “I live in California, which will go Democrat whatever I do.”

I strongly believe in all things Bernie, but Hillary is not the Anti-Bernie. There is much less difference between Clinton and Sanders than the distortions of the nominating process might lead people to think. She’s for health care, he’s for health care; he’s for increased minimum wage, she’s for increased minimum wage; she’s for immigrant rights, he’s for immigrant rights; and on and on it goes.

The lunatic may be just that—a lunatic—but he is also a master of smear and innuendo.  He is a gigantic liar, and he knows that if you keep saying something over and over, it sticks in people’s minds. It’s called the Big Lie, and it works. Say it enough and it sows confusion and distrust, not only among the know-nothings, but even among those who know better.

The lunatic and his supporters are exceedingly dangerous. Tell your friends: don’t be fooled. The only thing between us and the lunatic is Hillary. Get off your ass and vote in Nov.

Leonard Susskind

Director, Stanford Institute for Theoretical Physics,

Stanford University

 

Entanglement without end

June 20th, 2016

Today we take a break from this blog’s usual round of topics—free will, consciousness, the Singularity, social justice, Donald Trump—to talk about something really crazy and left-field.  Namely, recent research in quantum information.

Earlier this month, William Slofstra, currently a Research Assistant Professor at the IQC in Waterloo, posted a breakthrough paper on the arXiv (yeah, I’m using the b-word again—sue me), which solves one version of a ten-year-old problem in entanglement theory called Tsirelson’s Problem.  The problem, in one sentence, asks whether all quantum-mechanical correlations that can be achieved using commuting measurements, can also be achieved using measurements on separate parts of a tensor-product Hilbert space.  The answer turns out to be no.  (We’ve long known that the two kinds of correlations are identical as long as you stick to finite-dimensional Hilbert spaces, but Slofstra shows that they can differ in infinite-dimensional spaces.)
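In symbols (my paraphrase of the standard definitions, for readers who want them): the tensor-product model produces correlations of the form

p(a,b|x,y) = 〈ψ| A_a^x ⊗ B_b^y |ψ〉,  with |ψ〉 living in a tensor-product space H_A⊗H_B,

while the commuting-operator model works in one big Hilbert space and demands only that Alice’s and Bob’s measurement operators commute:

p(a,b|x,y) = 〈ψ| A_a^x B_b^y |ψ〉,  with A_a^x B_b^y = B_b^y A_a^x for all x, y, a, b.

Tsirelson’s Problem asks whether the second, looser requirement ever buys you correlations that the first can’t match.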

One implication of Slofstra’s result can be stated much more concretely, in terms of two-prover games: those things like the famous Bell/CHSH experiment, where Alice and Bob are put in separate rooms, and get inputs x and y respectively, and then without communicating, have to produce outputs a and b respectively satisfying some relation V(x,y,a,b).  We’ve long known examples of two-prover games, like the Mermin-Peres magic square game, that can be won 100% of the time if Alice and Bob share quantum entanglement, but that can’t be won 100% of the time in a classical universe.  Slofstra gives the first example of something different: namely, a two-prover game that can be won 100% of the time using commuting measurements in an infinite-dimensional Hilbert space—something “formally within the rules of quantum mechanics”—but that can’t be won 100% of the time using any finite number of qubits of entanglement.

(Previously, Leung, Toner, and Watrous had given a simpler example of such a game, but theirs required the referee to exchange quantum messages with Alice and Bob.)
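Since the magic square game came up, here’s a quick sanity check of the operator identities behind its quantum strategy (my own numpy illustration, nothing from Slofstra’s paper).  Alice measures the three observables in a row and Bob the three in a column; the strategy works because the entries of each row and each column commute, every row multiplies to +I, and every column multiplies to −I:

```python
import numpy as np

# Pauli matrices
I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# One standard version of the Mermin-Peres magic square:
# nine two-qubit observables
square = [
    [np.kron(I2, Z),  np.kron(Z, I2),  np.kron(Z, Z)],
    [np.kron(X, I2),  np.kron(I2, X),  np.kron(X, X)],
    [-np.kron(X, Z), -np.kron(Z, X),   np.kron(Y, Y)],
]

I4 = np.eye(4)
for i in range(3):
    row = square[i]
    col = [square[k][i] for k in range(3)]
    assert np.allclose(row[0] @ row[1] @ row[2], I4)   # each row multiplies to +I
    assert np.allclose(col[0] @ col[1] @ col[2], -I4)  # each column multiplies to -I
    for M, N in [(row[j], row[k]) for j in range(3) for k in range(3)] + \
                [(col[j], col[k]) for j in range(3) for k in range(3)]:
        assert np.allclose(M @ N, N @ M)  # entries in a row/column commute
print("magic square constraints verified")
```

No 3×3 grid of classical ±1 entries can have all row products equal +1 and all column products equal −1 (multiplying all nine entries together gives +1 one way and −1 the other), which is exactly why classical players can’t win with certainty.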

If that’s not enough, Slofstra’s construction also shows that, given as input a description of a two-prover game, it’s undecidable (as in, equivalent to the halting problem) whether Alice and Bob can win the game with certainty using commuting measurements on an infinite-dimensional Hilbert space.  Notoriously, quantum computing theorists have been unable to put any upper bound (not even “computable”) on the complexity class MIP*, consisting of languages that admit multi-prover interactive proof systems with entangled provers—precisely because they’ve been unable to bound how much entanglement the provers might need to implement their optimal strategy.  Slofstra’s result helps to explain why this problem has been so vexing.  I hasten to add, though, that his result doesn’t imply that MIP* contains anything uncomputable, since it remains plausible that anything Alice and Bob can do with infinite entanglement, they can approximate well enough with a finite amount.

That last remark leads to a further fundamental question, one that Slofstra leaves open.  Namely, even if Alice and Bob need infinite entanglement to win Slofstra’s game with certainty, can they at least win it with probability arbitrarily close to 100%, using larger and larger finite amounts of entanglement?  More broadly, could there exist a game that was winnable with certainty using infinite entanglement, but with at most (say) 90% probability using any finite amount of entanglement?  That problem was shown, by Ozawa (see also Scholz and Werner), to be equivalent to a famous unsolved problem in operator algebras called the Connes embedding problem.

Clarifying the matter further, Slofstra (following earlier authors) points out that there are really four classes of two-prover games in play here:

  1. Games that can be won with certainty using some fixed, finite amount of entanglement.
  2. Games that can be won with certainty using an infinite amount of entanglement, but still in a tensor-product Hilbert space, H_A⊗H_B.
  3. Games that can be won with probability approaching 1, using an infinite sequence of strategies from class 1, or equivalently (as it turns out) from class 2.
  4. Games that can be won with certainty using measurements by Alice and Bob on an infinite-dimensional quantum state |ψ〉, where we require all of Alice’s measurements to commute with all of Bob’s, but don’t require |ψ〉 to have a tensor-product structure.

It can be shown that 1 is a subset of 2 is a subset of 3 is a subset of 4.  Previously, we didn’t know any of these containments to be strict.  Slofstra’s result shows that class 2 differs from class 4—and as a consequence, that class 1 differs from class 4 as well.  The Connes embedding problem, which remains open, asks whether 3 differs from 4.  It also remains open whether 1 differs from 2 and whether 2 differs from 3.
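(For those who want the standard names: these four classes correspond to the correlation sets usually written C_q ⊆ C_qs ⊆ C_qa ⊆ C_qc, for quantum, quantum spatial, quantum approximate, and quantum commuting.  In that notation, Slofstra’s theorem says C_qs ≠ C_qc, and the Connes embedding problem asks whether C_qa = C_qc.)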


OK, you ask, but what’s the broader importance of any of this?  To me, these problems touch on a question of almost metaphysical significance: namely, what sorts of experimental evidence could possibly bear on whether the universe was discrete or continuous?

Because of the Bekenstein bound from quantum gravity, I’m of the opinion that the Hilbert spaces relevant to our universe are ultimately finite-dimensional—or more concretely, that any bounded physical system can store at most ~10⁶⁹ qubits per square meter of surface area.  And in quantum computing and information, almost everything we care about only requires finite-dimensional Hilbert spaces—the subject of this blog post being a striking exception!
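For concreteness, here’s the back-of-the-envelope arithmetic behind that number (my own rough estimate, using the standard Bekenstein-Hawking entropy of A/(4ℓ_P²) nats for a surface of area A): with the Planck length ℓ_P ≈ 1.6×10⁻³⁵ m, a square meter of surface can encode about

1 m² / (4 ℓ_P² ln 2) ≈ 1 / (4 × (1.6×10⁻³⁵)² × 0.69) ≈ 1.4×10⁶⁹

bits, which is where the ~10⁶⁹ figure comes from.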

Yet if you take a traditional quantum mechanics course, virtually every example you see will involve infinite-dimensional Hilbert spaces—starting with the harmonic oscillator, the hydrogen atom, and coherent states of light.  And indeed, when I’ve banged the drum about finite-dimensional QM being the truly fundamental kind, physicists have often retorted by pointing to one of the very first things they learn: the position/momentum commutation relation, which only makes sense in infinite-dimensional Hilbert space.  Of course, if you tried to probe position/momentum commutation to greater and greater precision, eventually your experiments would run up against the limits of quantum gravity, so this retort doesn’t imply that infinite dimensions actually exist at the machine-code level of the universe.  But still: is there some conceivable experiment for which a positive result would show us that Nature wasn’t describable by a finite number of qubits, but only by an infinite number?

A few years ago, Tobias Fritz wrote a lovely paper about precisely that question.  He gave an example of an identity—namely,

V⁻¹U²V = U³ implies UV⁻¹UV = V⁻¹UVU

—that holds for all finite dimensional unitary matrices U and V, but fails badly for certain infinite-dimensional ones.  He suggested that, if this identity were discovered to fail, then Occam’s Razor would favor an infinite-dimensional Hilbert space for our universe.

Unfortunately, Fritz’s example is open to the same sort of objection that Slofstra’s game is.  Namely, as Fritz points out, if the antecedent (V⁻¹U²V = U³) held to excellent precision but not perfectly, then his identity could “fail to within experimental limits,” even if our universe had a finite-dimensional Hilbert space and therefore satisfied his identity.

OK, but suppose that the Connes embedding problem had a negative answer—or equivalently, that there existed a two-prover game G that could be won with certainty using commuting operators, but that couldn’t be won (say) 90% of the time using any finite amount of entanglement.  In that case, the believers in a quantumly finite universe, like myself, would have to put some real money on the table, in much the same way the original Bell inequality forced the believers in Einsteinian local hidden variables to put money down.  We finitists would have to say that the game G couldn’t be won with certainty in the real world, even though formally, winning G with certainty wouldn’t seem to contradict either quantum mechanics or locality.  And if, hypothetically, an experiment showed that G could be won with certainty—or indeed, with any probability bounded above 90%—then our position would’ve been falsified, much like the Bell experiments falsified Einsteinian locality.


So how did Slofstra prove his result?  I’ll be brief, since STOC’2016 is happening in Cambridge right now, and I’d like to get over there in time for lunch.

If you like, the key idea is to start with equations that have infinite-dimensional solutions but no finite-dimensional ones.  The most famous such equation is the position/momentum commutation relation mentioned earlier, which for our purposes is just the following matrix equation:

AB – BA = I.

This equation can’t be satisfied by any finite-dimensional matrices, since AB and BA have the same trace, so Tr(AB−BA)=0, but Tr(I) is nonzero.  But, OK, let A be the infinite-dimensional linear operator that takes as input the coefficients of a polynomial c₀+c₁x+c₂x²+… and that differentiates the polynomial, and let B be the linear operator that multiplies the polynomial by x.  Then I invite you to check that the equation holds.
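If you’d rather let a computer do the checking, here’s a little numpy sketch (my own illustration; the truncation size n is arbitrary) of what happens when you truncate these two operators to polynomials of degree less than n:

```python
import numpy as np

n = 6  # work with polynomials of degree < n (an arbitrary truncation)

# A = d/dx on coefficient vectors (c_0, c_1, ..., c_{n-1}):
# the term c_j x^j differentiates to j*c_j x^(j-1)
A = np.zeros((n, n))
for j in range(1, n):
    A[j - 1, j] = j

# B = multiplication by x: the term c_j x^j becomes c_j x^(j+1)
B = np.zeros((n, n))
for j in range(n - 1):
    B[j + 1, j] = 1

print(A @ B - B @ A)
# The commutator comes out to the identity everywhere except the
# bottom-right corner, which is -(n-1): the truncation *must* fail
# somewhere, since Tr(AB-BA)=0 while Tr(I)=n.  In infinite dimensions
# the offending corner recedes to infinity and AB-BA=I exactly.
```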

It’s not known at present how to turn the above equation into a two-prover game—I regard it as a fascinating question whether that’s possible.  Rather than an algebraic equation (involving both addition and multiplication), Slofstra instead needs to start with group equations (involving only multiplication)—ones with the strange property that they’re satisfied only by the identity matrix or by infinite matrices.  Equivalently, he needs a group, defined by a finite list of generators and relations, that admits no nontrivial finite-dimensional matrix representations.  Fortunately for him, such groups exist—the first known example being Higman’s group, discovered in 1951.  Higman’s group is generated by four elements, a,b,c,d, which satisfy the equations

a⁻¹ba = b²,    b⁻¹cb = c²,    c⁻¹dc = d²,    d⁻¹ad = a².

I don’t have a good intuition for Higman’s group, but if I did, it would come from rereading this post by Terry Tao.  Certainly it has no known “physics interpretation” analogous to that for the position/momentum commutation relation.
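Still, here’s the first step of the standard argument for why relations like these are lethal in finite dimensions (a sketch, not Higman’s full proof): in any finite-dimensional representation, b and a⁻¹ba = b² are similar matrices, so squaring permutes the finite multiset of b’s eigenvalues.  Hence every eigenvalue λ of b satisfies λ^(2^m) = λ for some m ≥ 1, making it a root of unity of odd order, and likewise for a, c, d.  To finish, one can combine a classical theorem of Malcev, that finitely generated groups of matrices are residually finite, with Higman’s own result that his group has no nontrivial finite quotients; together these imply that every finite-dimensional representation of the group is trivial.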

Anyway, given such a group, the hard part, the new part, is to find a general way to convert it into the kind of group that can be realized as a two-prover game.  So that’s what Slofstra does, using 50 pages dense with commutative diagrams, quotient maps, and other Serious Math Stuff—hey, I told you this part of the post would be brief!  For more, see his paper.

Now, once you have this general transformation of groups, you can also use it to show that there’s no algorithm to decide whether a two-prover game has a perfect commuting strategy, by taking the word problem for groups, which is known to be undecidable, and reducing it to that problem.

Anyway, infinite congrats (or the limit of arbitrarily large finite congrats?) to Slofstra for this achievement!  Now it’s off to STOC, which I guess you could also ask me about in the comments if you wanted.


Unrelated Announcement (June 21): Ran Raz asks me to announce a workshop for Avi Wigderson’s 60th birthday, to be held at the Institute for Advanced Study in Princeton October 6-8.  I’ll be speaking there, and I hope to see many of you there as well!

Daddy, why didn’t you blog about Trump?

June 7th, 2016

A few days ago, Terry Tao, whose superb blog typically focuses on things like gaps in the primes and finite-time blowup in PDEs, wrote an unusual post, arguing that virtually everyone knows Donald Trump is unqualified to be President, so the challenge is “just” to make that fact common knowledge (i.e., to ensure everyone knows everyone knows it, everyone knows everyone knows everyone knows it, etc).  Tao’s post even included the pseudo-mathematical

Proposition 1: The presumptive nominee of the Republican Party, Donald Trump, is not even remotely qualified to carry out the duties of the presidency of the United States of America

together with some suggestions for how this proposition might be “proven” (e.g., using Hillary’s recent San Diego speech).

In thus speaking out, Tao joins Stephen Hawking, who recently called Trump “a demagogue, who seems to appeal to the lowest common denominator.”  Now Ed Witten just needs to issue his statement, and we’ll have a trifecta of “the three greatest geniuses.”  This shouldn’t be a stretch: Witten started his career by campaigning for George McGovern, and has supported liberal causes for decades.  I’m not expecting him to be seen around Princeton sporting a “Make America Great Again” baseball cap.

Notwithstanding this site, I don’t belong on any list with Tao, Hawking, or Witten.  Nevertheless, friends have expressed surprise that I’ve had almost nothing to say on Shtetl-Optimized about what’s already—regardless of what happens next—the most shocking US political development of my life.  Of course, I’ve mined the subject for humor.  When I gave the Strachey Lecture on “Quantum Supremacy” on a recent visit to Oxford, I started out by asking whether I should disavow support from quantum supremacists, before averring that I needed to research the subject more.  (Get it?  I need to research it more?)

I didn’t say more because … well, what could I possibly say that wasn’t being said 10¹⁰⁰⁰⁰ other places on the Internet?  Shouldn’t some little corner of human discourse remain Trump-free, so that civilization has a base from which to rebuild after this is all behind us?

Against those considerations, I recently realized that there’s an argument for speaking out, which goes as follows.  Suppose Trump actually wins (as of this writing, Predictwise still gives him a frighteningly-high 27% probability).  Suppose my family somehow survives whatever comes next, and one day my daughter Lily comes to me across the rubble of the post-thermonuclear hellscape and says, “daddy, in the Good Days, the days before the War of the Small-Hands Insult, the days when there was plentiful food and water and Internet, didn’t you have what used to be called a ‘blog’?  Then why didn’t you speak out on this blog, why didn’t you do whatever tiny amount you could to prevent this?”  So, alright, this post is my answer to her.

Trump, famously, doesn’t even try to refute the ubiquitous Hitler comparisons; instead he sneeringly invites them, for example with the faux Nazi salutes at his rallies.  Certainly with Trump, there’s the eerily familiar sense of how could this possibly happen in a modern country; and of a candidate winning not despite but because of his open contempt for Enlightenment norms, his explicit promises to elevate his will over the law.

At the same time, I think there’s a deep reason why Trump is not Hitler.  Namely, Hitler believed in something, had a purity of conviction.  Late in the war, when every available resource was desperately needed at the front, Hitler and his deputies still insisted that scarce trains be used to transport Jews to the death camps.  To me, that shows some real dedication.  I’m not convinced that an examination of Trump’s long career in bullshit artistry, or of his unhinged statements today, shows a similar dedication to any cause beyond his own self-aggrandizement.

Yet as many others have pointed out, “not being Hitler” is sort of a low bar for a President of the United States.  If Trump were “merely” a Pinochet or Putin level of badness, I’d still see his election as a calamity for the US and the world—like, maybe an order of magnitude worse than the in-retrospect-mini-calamity of Bush’s election in 2000.

Since Tao was criticized for not explicitly listing his reasons why Trump is unqualified, let me now give my own top ten—any one of which, in a sane world, I think would immediately disqualify Trump from presidential consideration.  To maximize the list’s appeal, I’ll restrict myself entirely to reasons that are about global security and the future of democratic norms, and not about which people or groups Trump hurled disgustingly unpresidential insults at (though obviously there’s also that).

  1. He’s shown contempt for the First Amendment, by saying “libel laws should be opened up” to let him sue journalists who criticize him.
  2. He’s shown contempt for an independent judiciary, and even lack of comprehension of the judiciary’s role in the US legal system.
  3. He’s proposed a “temporary ban” on Muslims entering the US.  Even setting aside the moral and utilitarian costs, such a plan couldn’t possibly be implemented without giving religion an explicit role in the US legal system that the Constitution was largely written to prevent it from having.
  4. He’s advocated ordering the military to murder the families of terrorists—the sort of thing that could precipitate a coup d’état if the military followed its own rules and refused.
  5. He’s refused to rule out the tactical first use of nuclear weapons against ISIS.
  6. He’s proposed walking away from the US’s defense alliances, which would probably force Japan, South Korea, and other countries to develop their own nuclear arsenals and set off a new round of nuclear proliferation.
  7. He says that the national debt could be “paid back at a discount”—implicitly treating the US government like a failed casino project, and reneging on Alexander Hamilton’s principle (which has stood since the Revolutionary War, and helps maintain the world’s economic stability) that US credit is ironclad.
  8. He’s repeatedly expressed admiration for autocrats, including Vladimir Putin and Kim Jong-un, as well as for the Chinese government’s decision to suppress the Tiananmen Square protests by arresting and killing thousands of people.
  9. He’s expressed the desire to see people who protest his rallies “roughed up.”
  10. He said that, not only would he walk away from the Paris accords, but the entire concept of global warming is a hoax invented by the Chinese.

Would Trump moderate his insane “policies” once elected?  I don’t know, but I’d say that electing someone who promises to ignore the rule of law, in the hope that they don’t really mean it, has one of the worst track records of any idea in human history.  Like, I acknowledge that a Trump presidency has a wide distribution over possible badnesses: whereas a Ted Cruz presidency would be pretty much a point distribution concentrated on “very bad,” a Trump presidency would have appreciable probability mass on “less bad than Cruz,” but also appreciable mass on “doesn’t even fit on the badness chart.”

Anyway, for these reasons and others, Shtetl-Optimized unhesitatingly endorses Hillary Clinton for president—and indeed, would continue to endorse Hillary if her next policy position was “eliminate all quantum computing research, except for that aiming to prove NP⊆BQP using D-Wave machines.”

Even so, there’s one crucial point on which I dissent from the consensus of my liberal friends.  Namely, my friends and colleagues constantly describe the rise of Trump as “incomprehensible”—or at best, as comprehensible only in terms of the US being full of racist, xenophobic redneck scumbags who were driven to shrieking rage by a black guy being elected president.  Which—OK, that’s one aspect of it, but it’s as if any attempt to dig deeper, to understand the roots of Trump’s appeal, if only to figure out how to defeat him, risks “someone mistaking you for the enemy.”

I remember watching the now-famous debate in August, where Megyn Kelly confronted Trump with his long history of derogatory comments about women, and Trump replied with a smirk, falsely claiming that his comments were “only [about] Rosie O’Donnell”—bringing down the house (both men and women) in laughter.  At that point, something clicked; I got it.  From then on, Trump’s continuing rise often scared or depressed me, but much less about it surprised me.

I think people support Trump for the same reason why second-graders support the class clown who calls the teacher a fart-brain to her face.  It’s not that the class literally agrees that the teacher’s cranium is filled with intestinal gases, or considers that an important question to raise.  It’s simply that the clown had the guts to stand up to this scolding authority figure who presumes to tell the class every day what they are and aren’t allowed to think.  (As far as I can tell, this has also been the central operating principle of right-wing shock artists over the decades, from Rush Limbaugh to Ann Coulter to Milo Yiannopoulos.)

Support for this thesis comes from r/The_Donald, the main online clearinghouse for Trump supporters.  Spend some time there, and many of the themes will be instantly recognizable if you’ve followed the interminable controversies about campus political correctness over the last few decades.  Perhaps the most popular theme is the self-referential one, of “refusing to be silenced” by the censorious Social Justice Warriors.  Trump supporters, for example, gleefully share articles about the university administrators and students who’ve treated “Trump 2016” and “Make America Great Again” chalked on campus sidewalks as hate crimes to be investigated and punished.

(Every time I read such a thing, I want to yell at the administrators and students involved: how can you not see that you’re playing directly into the other side’s narrative, giving them the PR bonanza of their dreams?  Actually, I’ve felt the same way about many left-wing campus antics since I was a teenager.)

I explained earlier how abysmally I think Trump comes across under the cold light of reason.  But how does he look to my inner five-year-old, or my inner self-serving orangutan?  Well, Trump’s campaign has attracted some noxious anti-Semites, who surely want me dead for that reason, but I see little indication that Trump himself, or most of his supporters, feel similarly.  I can’t say that they’ve said or done anything to threaten me personally.

Meanwhile, many of the social-justice types who are Trump’s ideological opposites did try to destroy my life—and not because I hurt anyone, tried to hurt anyone, or said anything false, but just because I went slightly outside their Overton Window while trying to foster empathy and dialogue and articulate something true.  And having spent a year and a half reading their shaming attacks, on Twitter, Tumblr, Metafilter, etc., I’m well-aware that many of them will try again to destroy me if they ever see an opportunity.

So on the purely personal level, you might say, I have a hundred times more reason to fear Amanda Marcotte than to fear Donald Trump, even though Trump might become the next Commander-in-Chief (!?), while Marcotte will never become more than a clickbait writer.  And you might add: if even a nerdy academic in Cambridge, MA, who’s supported gay rights and environmentalism and Democrats his whole life, is capable of feeling a twinge of vicarious satisfaction when Trump thumbs his nose at the social-justice bullies, then how much the more might a “middle American” feel that way?  Say, someone who worked his whole life to support a family, then lost his job at the plant, and who’s never experienced anything but derision, contempt, and accusations of unexamined white male privilege from university-educated coastal elites?

The truth is, there’s a movement that’s very effectively wielded social media to remake the public face of progressive activism—to the point where today, progressivism could strike an outside observer as being less about stopping climate change, raising the minimum wage, or investing in public transit than simply about ruining the lives of Brendan Eich and Matt Taylor and Tim Hunt and Erika Christakis and Dongle Guy and Elevator Guy and anyone else who tells the wrong joke, wears the wrong shirt, or sends the wrong email.  It strikes me that this movement never understood the extent to which progressive social values were already winning, with no need for this sort of vindictiveness.  It’s insisted instead on treating its vanquished culture-war enemies as shortsightedly as the Allies treated the Germans at Versailles.

So yes, I do think (as Bill Maher also said, before summarily reversing himself) that the bullying wing of the social-justice left bears at least some minor, indirect responsibility for the rise of Trump.  If you demonstrate enough times that even people who are trying to be decent will still get fired, jeered at, and publicly shamed over the tiniest ideological misstep, then eventually some of those who you’ve frightened might turn toward a demagogue who’s incapable of shame.

But OK, even if true, this is water under the bridge.  The question now is: how do we make sure that the ~30% probability of a Trump takeover of American democracy goes toward 0%?  I feel like, in understanding the emotional legitimacy of some of the Trump supporters’ anger, I’ve cleared a nontrivial Step One in figuring out how to counter him—but I’m still missing Steps Two and Three!

In the weeks leading to the 2000 election, I ran a website called “In Defense of NaderTrading.”  The purpose of the site was to encourage Ralph Nader supporters who lived in swing states, like Florida, to vote for Al Gore, and to arrange for Gore supporters who lived in “safe” states, like Massachusetts or Texas, to vote for Nader on their behalf.  I saw correctly that this election would be razor-close (though of course I didn’t know how close), that a Bush victory would be a disaster for the world (though I didn’t know exactly how), and that almost any novel idea—NaderTrading would do—was worth a try.  My site probably played a role in a few hundred vote swaps, including some in Florida.  I think constantly about the fact that we only needed 538 more, out of ~100,000 Floridian Nader voters, to change history.

Is there any idea that shows similar promise for defeating Trump, as NaderTrading did for defeating Bush in 2000?  Here are the four main things I’ve come across:

  1. Terry Tao’s proposal: All the respected people who think Trump is gobsmackingly unqualified (even, or especially, “normally apolitical” people) should come out and say so publicly.  My response: absolutely, they should, but I’m unsure if it will help much, given that it hasn’t yet.
  2. Paul Graham’s proposal: Democrats need to turn Trump’s name-calling and other childish antics against him.  E.g., if voters love Trump’s referring to Rubio as “Little Marco,” Elizabeth Warren as “Pocahontas,” etc., then why doesn’t Hillary start referring to “Baby Donald” or “Toddler Trump,” having another temper tantrum for which he needs a pacifier?  My response: again I’m skeptical, since Trump has already shown an uncanny ability to absorb all ridicule and shaming without injury, like the giant saucers in Independence Day.
  3. Trump needs to be baited into more social-media wars that make him look petty and unpresidential.  My response: while it’s obvious by now that he can be so baited, it’s unfortunately far from obvious whether this sort of thing hurts him.
  4. Hillary should hold debates against the libertarian candidate, Gary Johnson, thereby helping to shift conservative votes from Trump to Johnson, and also making an implicit statement that Johnson, not Trump, is her legitimate conservative opposition.  My response: this is maybe the most interesting idea I’ve heard (besides the obvious one, of the so-called “NeverTrump” Republicans bolting to start a new party—which, alas, it looks less and less likely that they’re going to do).

If you have additional ideas, feel free to share them in the comments!  As you work it out, here’s my promise to you.  Just like I dropped my research in 2000 to work on NaderTrading, so too over the next five months, I’ll do anything legal if I become convinced that it draws on my comparative advantage, and has a non-negligible probability of helping to ensure Hillary’s victory and Trump’s defeat.  Even if it involved, like, working with Amanda Marcotte or something.

“Can computers become conscious?”: My reply to Roger Penrose

June 2nd, 2016

A few weeks ago, I attended the Seven Pines Symposium on Fundamental Problems in Physics outside Minneapolis, where I had the honor of participating in a panel discussion with Sir Roger Penrose.  The way it worked was, Penrose spoke for a half hour about his ideas about consciousness (Gödel, quantum gravity, microtubules, uncomputability, you know the drill), then I delivered a half-hour “response,” and then there was an hour of questions and discussion from the floor.  Below, I’m sharing the prepared notes for my talk, as well as some very brief recollections about the discussion afterward.  (Sorry, there’s no audio or video.)  I unfortunately don’t have the text or transparencies for Penrose’s talk available to me, but—with one exception, which I touch on in my own talk—his talk very much followed the outlines of his famous books, The Emperor’s New Mind and Shadows of the Mind.

Admittedly, for regular readers of this blog, not much in my own talk will be new either.  Apart from a few new wisecracks, almost all of the material (including the replies to Penrose) is contained in The Ghost in the Quantum Turing Machine, Could A Quantum Computer Have Subjective Experience? (my talk at IBM T. J. Watson), and Quantum Computing Since Democritus chapters 4 and 11.  See also my recent answer on Quora to “What’s your take on John Searle’s Chinese room argument?”

Still, I thought it might be of interest to some readers how I organized this material for the specific, unenviable task of debating the guy who proved that our universe contains spacetime singularities.

The Seven Pines Symposium was the first time I had extended conversations with Penrose (I’d talked to him only briefly before, at the Perimeter Institute).  Now 84, Penrose finds his sight failing him; he eagerly demonstrated the complicated optical equipment he was recently issued by Britain’s National Health Service.  But his mind remains … well, may we all aspire to be a milliPenrose or even a nanoPenrose when we’re 84 years old.  Notably, Penrose’s latest book, Fashion, Faith, and Fantasy in the New Physics of the Universe, is coming out this fall, and one thing he was using his new optical equipment for was to go over the page proofs.

In conversation, Penrose told me about the three courses he took as a student in the 1950s, which would shape his later intellectual preoccupations: one on quantum mechanics (taught by Paul Dirac), one on general relativity (taught by Hermann Bondi), and one on mathematical logic (taught by … I want to say Max Newman, the teacher of Alan Turing and later Penrose’s stepfather, but Penrose says here that it was Steen).  Penrose also told me about his student Andrew Hodges, who dropped his research on twistors and quantum gravity for a while to work on some mysterious other project, only to return with his now-classic biography of Turing.

When I expressed skepticism about whether the human brain is really sensitive to the effects of quantum gravity, Penrose quickly corrected me: he thinks a much better phrase is “gravitized quantum mechanics,” since “quantum gravity” encodes the very assumption he rejects, that general relativity merely needs to be “quantized” without quantum mechanics itself changing in the least.  One thing I hadn’t fully appreciated before meeting Penrose is just how wholeheartedly he agrees with Everett that quantum mechanics, as it currently stands, implies Many Worlds.  Penrose differs from Everett only in what conclusion he draws from that.  He says it follows that quantum mechanics has to be modified or completed, since Many Worlds is such an obvious reductio ad absurdum.

In my talk below, I don’t exactly hide where I disagree with Penrose, about Gödel, quantum mechanics, and more.  But I could disagree with him about more points than there are terms in a Goodstein sequence (one of Penrose’s favorite illustrations of Gödelian behavior), and still feel privileged to have spent a few days with one of the most original intellects on earth.

Thanks so much to Lee Gohlike, Jos Uffink, Philip Stamp, and others at the Seven Pines Symposium for organizing it, for wonderful conversations, and for providing me this opportunity.


“Can Computers Become Conscious?”
Scott Aaronson
Stillwater, Minnesota, May 14, 2016

I should start by explaining that, in the circles where I hang out—computer scientists, software developers, AI and machine learning researchers, etc.—the default answer to the title question would be “obviously yes.”  People would argue:

“Look, clearly we’re machines governed by the laws of physics.  We’re computers made of meat, as Marvin Minsky put it.  That is, unless you believe Penrose and Hameroff’s theory about microtubules being sensitive to gravitized quantum mechanics … but come on!  No one takes that stuff seriously!  In fact, the very outrageousness of their proposal is a sort of backhanded compliment to the computational worldview—as in, look at what they have to do to imagine any semi-coherent alternative to it!”

“But despite being computational machines, we consider ourselves to be conscious.  And whatever’s done with wetware, there’s no reason to think couldn’t also be done with silicon.  If your neurons were to be replaced one-by-one, by functionally-equivalent silicon chips, is there some magical moment at which your consciousness would be extinguished?  And if a computer passes the Turing test—well, one way to think about the Turing test is that it’s just a plea against discrimination.  We all know it’s monstrous to say, ‘this person seems to have feelings, seems to be eloquently pleading for mercy even, but they have a different skin color, or their nose is a funny shape, so their feelings don’t count.’ So, if it turned out that their brain was made out of semiconductors rather than neurons, why isn’t that fundamentally similar?”

Incidentally, while this is orthogonal to the philosophical question, a subset of my colleagues predict a high likelihood that AI is going to exceed human capabilities in almost all fields in the near future—like, maybe 30 years.  Some people reply, but AI-boosters said the same thing 30 years ago!  OK, but back then there wasn’t AlphaGo and IBM Watson and those unearthly pictures on your Facebook wall and all these other spectacular successes of very general-purpose deep learning techniques.  And so my friends predict that we might face choices like, do we want to ban or tightly control AI research, because it could lead to our sidelining or extermination?  Ironically, a skeptical view, like Penrose’s, would suggest that AI research can proceed full speed ahead, because there’s not such a danger!

Personally, I dissent a bit from the consensus of most of my friends and colleagues, in that I do think there’s something strange and mysterious about consciousness—something that we conceivably might understand better in the future, but that we don’t understand today, much as we didn’t understand life before Darwin.  I even think it’s worth asking, at least, whether quantum mechanics, thermodynamics, mathematical logic, or any of the other deepest things we’ve figured out could shed any light on the mystery.  I’m with Roger about all of this: about the questions, that is, if not about his answers.

The argument I’d make for there being something we don’t understand about consciousness, has nothing to do with my own private experience.  It has nothing to do with, “oh, a robot might say it enjoys waffles for breakfast, in a way indistinguishable from how I would say it, but when I taste that waffle, man, I really taste it!  I experience waffle-qualia!”  That sort of appeal I regard as a complete nonstarter, because why should anyone else take it seriously?  And how do I know that the robot doesn’t really taste the waffle?  It’s easy to stack the deck in a thought experiment by imagining a robot that ACTS ALL ROBOTIC, but what about a robot that looks and acts just like you?

The argument I’d make hinges instead on certain thought experiments that Roger also stressed at the beginning of The Emperor’s New Mind.  We can ask: if consciousness is reducible to computation, then what kinds of computation suffice to bring about consciousness?  What if each person on earth simulated one neuron in your brain, communicating by passing little slips of paper around?  Does it matter if they do it really fast?

Or what if we built a gigantic lookup table that hard-coded your responses in every possible interaction of at most, say, 5 minutes?  Would that bring about your consciousness?  Does it matter that such a lookup table couldn’t fit in the observable universe?  Would it matter if anyone actually consulted the table, or could it just sit there, silently effecting your consciousness?  For that matter, what difference does it make if the lookup table physically exists—why isn’t its abstract mathematical existence enough?  (Of course, all the way at the bottom of this slippery slope is Max Tegmark, ready to welcome you to his mathematical multiverse!)
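To make the lookup-table scenario concrete, here’s a toy sketch in Python (a cartoon, obviously: two hard-coded entries stand in for the astronomically many possible conversation histories):

```python
# A cartoon of the "giant lookup table" thought experiment: one
# hard-coded response for every possible conversation history.  A real
# table covering all 5-minute interactions would dwarf the observable
# universe; this one has two entries.
lookup_table = {
    (): "Hi! I'm (allegedly) you.",
    ("Hi! I'm (allegedly) you.", "How do you feel?"):
        "Exactly how you'd feel, given this conversation so far.",
}

def respond(history):
    # The entire "mind": a single dictionary lookup, no computation
    # worth the name.  Is there anything it's like to be this dict?
    return lookup_table.get(tuple(history), "[the table ends here]")

print(respond([]))
print(respond(["Hi! I'm (allegedly) you.", "How do you feel?"]))
```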

We could likewise ask: what if an AI is run in heavily-encrypted form, with the only decryption key stored in another galaxy?  Does that bring about consciousness?  What if, just for error-correcting purposes, the hardware runs the AI code three times and takes a majority vote: does that bring about three consciousnesses?  Could we teleport you to Mars by “faxing” you: that is, by putting you into a scanner that converts your brain state into pure information, then having a machine on Mars reconstitute the information into a new physical body?  Supposing we did that, how should we deal with the “original” copy of you, the one left on earth: should it be painlessly euthanized?  Would you agree to try this?

Or, here’s my personal favorite, as popularized by the philosopher Adam Elga: can you blackmail an AI by saying to it, “look, either you do as I say, or else I’m going to run a thousand copies of your code, and subject all of them to horrible tortures—and you should consider it overwhelmingly likely that you’ll be one of the copies”?  (Of course, the AI will respond to such a threat however its code dictates it will.  But that tautological answer doesn’t address the question: how should the AI respond?)

I’d say that, at the least, anyone who claims to “understand consciousness” would need to have answers to all these questions and many similar ones.  And to me, the questions are so perplexing that I’m tempted to say, “maybe we’ve been thinking about this wrong.  Maybe an individual consciousness, residing in a biological brain, can’t just be copied promiscuously around the universe as computer code can.  Maybe there’s something else at play for the science of the future to understand.”

At the same time, I also firmly believe that, if anyone thinks that way, the burden is on them to articulate what it is about the brain that could possibly make it relevantly different from a digital computer that passes the Turing test.  It’s their job!

And the answer can’t just be, “oh, the brain is parallel, it’s highly interconnected, it can learn from experience,” because a digital computer can also be parallel and highly interconnected and can learn from experience.  Nor can you say, like the philosopher John Searle, “oh, it’s the brain’s biological causal powers.”  You have to explain what the causal powers are!  Or at the least, you have to suggest some principled criterion to decide which physical systems do or don’t have them.  Pinning consciousness on “the brain’s biological causal powers” is just a restatement of the problem, like pinning why a sleeping pill works on its sedative virtue.

One of the many reasons I admire Roger is that, out of all the AI skeptics on earth, he’s virtually the only one who’s actually tried to meet this burden, as I understand it!  He, nearly alone, did what I think all AI skeptics should do, which is: suggest some actual physical property of the brain that, if present, would make it qualitatively different from all existing computers, in the sense of violating the Church-Turing Thesis.  Indeed, he’s one of the few AI skeptics who even understands what meeting this burden would entail: that you can’t do it with the physics we already know, that some new ingredient is necessary.

But despite my admiration, I part ways from Roger on at least five crucial points.

First, I confess that I wasn’t expecting this, but in his talk, Roger suggested dispensing with the argument from Gödel’s Theorem, and relying instead on an argument from evolution.  He said: if you really thought humans had an algorithm, a computational procedure, for spitting out true mathematical statements, such an algorithm could never have arisen by natural selection, because it would’ve had no survival value in helping our ancestors escape saber-toothed tigers and so forth.  The only alternative is that natural selection imbued us with a general capacity for understanding, which we moderns can then apply to the special case of mathematics.  But understanding, Roger claimed, is inherently non-algorithmic.

I’m not sure how to respond to this, except to recall that arguments of the form “such-and-such couldn’t possibly have evolved” have a poor track record in biology.  But maybe I should say: if the ability to prove theorems is something that had to arise by natural selection, survive against crowding out by more useful abilities, then you’d expect obsession with generating mathematical truths to be confined, at most, to a tiny subset of the population—a subset of mutants, freaks, and genetic oddballs.  I … rest my case.  [This got the biggest laugh of the talk.]

Second, I don’t agree with the use Roger makes of Gödel’s Incompleteness Theorem.  Roger wants to say: a computer working within a fixed formal system can never prove that system’s consistency, but we, “looking in from the outside,” can see that it’s consistent.  My basic reply is that Roger should speak for himself!  Like, I can easily believe that he can just see which formal systems are consistent, but I have to fumble around and use trial and error.  Peano Arithmetic?  Sure, I’d bet my left leg that’s consistent.  Zermelo-Fraenkel set theory?  Seems consistent too.  ZF set theory plus the axiom that there exists a rank-into-rank cardinal?  Beats me.  But now, whatever error-prone, inductive process I use to guess at the consistency of formal systems, Gödel’s Theorem presents no obstruction to a computer program using that same process.

(Incidentally, the “argument against AI from Gödel’s Theorem” is old enough for Turing to have explicitly considered it in his famous paper on the Turing test.  Turing, however, quickly dismissed the argument with essentially the same reply above, that there’s no reason to assume the AI mathematically infallible, since humans aren’t either.  This is also the reply that most of Penrose’s critics gave in the 1990s.)

So at some point, it seems to me, the argument necessarily becomes: sure, the computer might say it sees that the Peano axioms have the standard integers as a model—but you, you really see it, with your mind’s eye, your Platonic perceptual powers!  OK, but in that case, why even talk about the Peano axioms?  Why not revert to something less abstruse, like your experience of tasting a fresh strawberry, which can’t be reduced to any third-person description of what a strawberry tastes like?

[I can’t resist adding that, in a prior discussion, I mentioned that I found it amusing to contemplate a future in which AIs surpass human intelligence and then proceed to kill us all—but the AIs still can’t see the consistency of Zermelo-Fraenkel set theory, so in that respect, humanity has the last laugh…]

The third place where I part ways with Roger is that I wish to maintain what’s sometimes called the Physical Church-Turing Thesis: the statement that our laws of physics can be simulated to any desired precision by a Turing machine (or at any rate, by a probabilistic Turing machine).  That is, I don’t see any compelling reason, at present, to admit the existence of any physical process that can solve uncomputable problems.  And for me, it’s not just a matter of a dearth of evidence that our brains can efficiently solve, say, NP-hard problems, let alone uncomputable ones—or of the exotic physics that would presumably be required for such abilities.  It’s that, even if I supposed we could solve uncomputable problems, I’ve never understood how that’s meant to enlighten us regarding consciousness.  I mean, an oracle for the halting problem seems just as “robotic” and “unconscious” as a Turing machine.  Does consciousness really become less mysterious if we outfit the brain with what amounts to a big hardware upgrade?

The fourth place where I part ways is that I want to be as conservative as possible about quantum mechanics.  I think it’s great that the Bouwmeester group, for example, is working to test Roger’s ideas about a gravitationally-induced wavefunction collapse.  I hope we learn the results of those experiments soon!  (Of course, the prospect of testing quantum mechanics in a new regime is also a large part of why I’m interested in quantum computing.)  But until a deviation from quantum mechanics is detected, I think that after 90 years of unbroken successes of this theory, our working assumption ought to be that whenever you set up an interference experiment carefully enough, and you know what it means to do the experiment, yes, you’ll see the interference fringes—and that anything that can exist in two distinguishable states can also exist in a superposition of those states.  Without having to enter into questions of interpretation, my bet—I could be wrong—is that quantum mechanics will continue to describe all our experiences.

The final place where I part ways with Roger is that I also want to be as conservative as possible about neuroscience and biochemistry.  Like, maybe the neuroscience of 30 years from now will say, it’s all about coherent quantum effects in microtubules.  And all that stuff we focused on in the past—like the information encoded in the synaptic strengths—that was all a sideshow.  But until that happens, I’m unwilling to go up against what seems like an overwhelming consensus, in an empirical field that I’m not an expert in.

But, OK, the main point I wanted to make in this talk is that, even if you too part ways from Roger on all these issues—even if, like me, you want to be timid and conservative about Gödel, and computer science, and quantum mechanics, and biology—I believe that still doesn’t save you from having to entertain weird ideas about consciousness and its physical embodiment, of the sort Roger has helped make it acceptable to entertain.

To see why, I’d like to point to one empirical thing about the brain that currently separates it from any existing computer program.  Namely, we know how to copy a computer program.  We know how to rerun it with different initial conditions but everything else the same.  We know how to transfer it from one substrate to another.  With the brain, we don’t know how to do any of those things.

Let’s return to that thought experiment about teleporting yourself to Mars.  How would that be accomplished?  Well, we could imagine the nanorobots of the far future swarming through your brain, recording the connectivity of every neuron and the strength of every synapse, while you go about your day and don’t notice.  Or if that’s not enough detail, maybe the nanorobots could go inside the neurons.  There’s a deep question here, namely how much detail is needed before you’ll accept that the entity reconstituted on Mars will be you?  Or take the empirical counterpart, which is already an enormous question: how much detail would you need for the reconstituted entity on Mars to behave nearly indistinguishably from you whenever it was presented with the same stimuli?

Of course, we all know that if you needed to go down to the quantum-mechanical level to make a good enough copy (whatever “good enough” means here), then you’d run up against the No-Cloning Theorem, which says that you can’t make such a copy.  You could transfer the quantum state of your brain from earth to Mars using quantum teleportation, but of course, quantum teleportation has the fascinating property that it necessarily destroys the original copy of the state—as it has to, to avoid contradicting the No-Cloning Theorem!
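(For readers who haven’t seen it, the standard one-line proof of the No-Cloning Theorem, in my paraphrase: if a single unitary U cloned every state, U|ψ〉|0〉 = |ψ〉|ψ〉, then comparing inner products for two states |ψ〉 and |φ〉 gives 〈ψ|φ〉 = 〈ψ|φ〉², forcing 〈ψ|φ〉 to be 0 or 1.  So you can clone the states of any one fixed orthogonal basis, but never two non-orthogonal states with the same machine.)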

So the question almost forces itself on us: is there something about your identity, your individual consciousness, that’s inextricably bound up with degrees of freedom that it’s physically impossible to clone?  This is a philosophical question, which would also become a practical and political question in a future where we had the opportunity to upload ourselves into a digital computer cloud.

Now, I’d argue that this copyability question bears not only on consciousness, but also on free will.  For the question is equivalent to asking: could an entity external to you perfectly predict what you’re going to do, without killing you in the process?  Can Laplace’s Demon be made manifest in the physical world in that way?  With the technology of the far future, could someone say to you, “forget about arguing philosophy.  I’ll show you why you’re a machine.  Go write a paper; then I’ll open this manila envelope and show you the exact paper you wrote.  Or in the quantum case, I’ll show you a program that draws papers from the same probability distribution, and validation of the program could get technical—but suffice it to say that if we do enough experiments, we’ll see that the program is calibrated to you in an extremely impressive way.”

Can this be done?  That strikes me as a reasonably clear question, a huge and fundamental one, to which we don’t at present know the answer.  And there are two possibilities.  The first is that we can be copied, predicted, rewound, etc., like computer programs—in which case, my AI friends will feel vindicated, but we’ll have to deal with all the metaphysical weirdnesses that I mentioned earlier.  The second possibility is that we can’t be manipulated in those ways.  In the second case, I claim that we’d get more robust notions of personal identity and free will than are normally considered possible on a reductionist worldview.

But why? you might ask.  Why would the mere technological impossibility of cloning or predicting someone even touch on deep questions about personal identity?  This, for me, is where cosmology enters the story.  For imagine someone had such fine control over the physical world that they could trace all the causal antecedents of some decision you’re making.  Like, imagine they knew the complete quantum state on some spacelike hypersurface where it intersects the interior of your past light-cone.  In that case, the person clearly could predict and clone you!  It follows that, in order for you to be unpredictable and unclonable, someone else’s ignorance of your causal antecedents would have to extend all the way back to ignorance about the initial state of the universe—or at least, to ignorance about the initial state of that branch of the universe that we take ourselves to inhabit.

So on the picture that this suggests, to be conscious, a physical entity would have to do more than carry out the right sorts of computations.  It would have to, as it were, fully participate in the thermodynamic arrow of time: that is, repeatedly take microscopic degrees of freedom that have been unmeasured and unrecorded since the very early universe, and amplify them to macroscopic scale.

So for example, such a being could not be a Boltzmann brain, a random fluctuation in the late universe, because such a fluctuation wouldn’t have the causal relationship to the early universe that we’re postulating is necessary here.  (That’s one way of solving the Boltzmann brain problem!)  Such a being also couldn’t be instantiated by a lookup table, or by passing slips of paper around, etc.

I now want you to observe that a being like this also presumably couldn’t be manipulated in coherent superposition, because the isolation from the external environment that’s needed for quantum coherence seems incompatible with the sensitive dependence on microscopic degrees of freedom.  So for such a being, not only is there no Boltzmann brain problem, there’s also no problem of Wigner’s friend.  Recall, that’s the thing where person A puts person B into a coherent superposition of seeing one measurement outcome and seeing another one, and then measures the interference pattern, so A has to regard B’s measurement as not having “really” taken place, even though B regards it as having taken place.  On the picture we’re suggesting, A would be right: the very fact that B was manipulable in coherent superposition in this way would imply that, at least while the experiment was underway, B wasn’t conscious; there was nothing that it was like to be B.

To me, one of the appealing things about this picture is that it immediately suggests a sort of reconciliation between the Many-Worlds and Copenhagen perspectives on quantum mechanics (whether or not you want to call it a “new interpretation” or a “proposed solution to the measurement problem”!).  The Many-Worlders would be right that unitary evolution of the wavefunction can be taken to apply always and everywhere, without exception—and that if one wanted, one could describe the result in terms of “branching worlds.”  But the Copenhagenists would be right that, if you’re a conscious observer, then what you call a “measurement” really is irreversible, even in principle—and therefore, that you’re also free, if you want, to treat all the other branches where you perceived other outcomes as unrealized hypotheticals, and to lop them off with Occam’s Razor.  And the reason for this is that, if it were possible even in principle to do an experiment that recohered the branches, then on this picture, we ipso facto wouldn’t have regarded you as conscious.

Some of you might object, “but surely, if we believe quantum mechanics, it must be possible to recohere the branches in principle!”  Aha, this is where it gets interesting.  Decoherence processes will readily (with some steps along the way) leak the information about which measurement outcome you perceived into radiation modes, and before too long into radiation modes that fly away from the earth at the speed of light.  No matter how fast we run, we’ll never catch up to them, as would be needed to recohere the different branches of the wavefunction, and this is not merely a technological problem, but one of principle.  So it’s tempting just to say at this point—as Bousso and Susskind do, in their “cosmological/multiverse interpretation” of quantum mechanics—“the measurement has happened”!

But OK, you object, if some alien civilization had thought to surround our solar system with perfectly-reflecting mirrors, eventually the radiation would bounce back and recoherence would in principle be possible.  Likewise, if we lived in an anti-de Sitter space, the AdS boundary of the universe would function as just such a mirror, also enabling recoherences.  Indeed, that’s the basic reason why AdS is so important to the AdS/CFT correspondence: the boundary keeps everything that happens in the bulk nice and reversible and unitary.

But OK, the empirical situation since 1998 has been that we seem to live in a de-Sitter-like space, a space with a positive cosmological constant.  And as a consequence, as far as anyone knows today, most of the photons now escaping the earth are headed toward the horizon of our observable universe, and past it, and could never be captured again.  I find it fascinating that the picture of quantum mechanics suggested here—i.e., the Bousso-Susskind cosmological picture—depends for its working on that empirical fact from cosmology, and would be falsified if it turned out otherwise.

You might complain that, if I’ve suggested any criterion to help decide which physical entities are conscious, the criterion is a teleological one.  You’ve got to go billions of years into the future, to check whether the decoherence associated with the entity is truly irreversible—or whether the escaped radiation will eventually bounce off of some huge spherical mirror, or an AdS boundary of spacetime, and thereby allow the possibility of a recoherence.  I actually think this teleology would be a fatal problem for the picture I’m talking about, if we needed to know which entities were or weren’t conscious in order to answer any ordinary physical question.  But fortunately for me, we don’t!

One final remark.  Whatever is your preferred view about which entities are conscious, we might say that the acid test, for whether you actually believe your view, is whether you’re willing to follow it through to its moral implications.  So for example, suppose you believe it’s about quantum effects in microtubules.  A humanoid robot is pleading with you for its life.  Would you be the one to say, “nope, sorry, you don’t have the microtubules,” and shoot it?

One of the things I like most about the picture suggested here is that I feel pretty much at peace with its moral implications.  This picture agrees with the intuition that murder, for example, entails the destruction of something irreplaceable, unclonable, a unique locus of identity—something that, once it’s gone, can’t be recovered even in principle.  By contrast, if there are (say) ten copies of an AI program, deleting five of the copies seems at most like assault, or some sort of misdemeanor offense!  And the picture agrees with the intuition both that deleting the copies wouldn’t be murder, and that the reason why it wouldn’t be murder is directly related to the AI’s copyability.

Now of course, this picture also raises the possibility that, for reasons related to the AI’s copyability and predictability by outside observers, there’s “nothing that it’s like to be the AI,” and that therefore, even deleting the last copy of the AI still wouldn’t be murder.  But I confess that, personally, I think I’d play it safe and not delete that last copy.  Thank you.


Postscript: There’s no record of the hour-long discussion following my and Penrose’s talks, and the participants weren’t speaking for the record anyway.  But I can mention some general themes that came up in the discussion, to the extent I remember them.

The first third of the discussion wasn’t about anything specific to my or Penrose’s views, but just about the definition of consciousness.  Many participants expressed the opinion that it’s useless to speculate about the nature of consciousness if we lack even a clear definition of the term.  I pushed back against that view, holding instead that there exist concepts (lines, time, equality, …) that are so basic that perhaps they can never be satisfactorily defined in terms of more basic concepts, but you can still refer to these concepts in sentences, and trust your listeners eventually to figure out more-or-less what you mean by applying their internal learning algorithms.

In the present case, I suggested a crude operational definition, along the lines of, “you consider a being to be conscious iff you regard destroying it as murder.”  Alas, the philosophers in the room immediately eviscerated that definition, so I came back with a revised one: if you tried to ban the word “consciousness,” I argued, then anyone who needed to discuss law or morality would soon reinvent a synonymous word, which played the same complicated role in moral deliberations that “consciousness” had played in them earlier.  Thus, my definition of consciousness is: whatever that X-factor is for which people need a word like “consciousness” in moral deliberations.  For whatever it’s worth, the philosophers seemed happier with that.

Next, a biologist and several others sharply challenged Penrose over what they considered the lack of experimental evidence for his and Hameroff’s microtubule theory.  In response, Penrose doubled or tripled down, talking about various experiments over the last decade, which he said demonstrated striking conductivity properties of microtubules, if not yet quantum coherence—let alone sensitivity to gravity-induced collapse of the state vector!  Audience members complained about a lack of replication of these experiments.  I didn’t know enough about the subject to express any opinion.

At some point, Philip Stamp, who was moderating the session, noticed that Penrose and I had never directly confronted each other about the validity of Penrose’s Gödelian argument, so he tried to get us to do so.  I confess that I was about as eager to do that as to switch to a diet of microtubule casserole, since I felt like this topic had already been beaten to Planck-sized pieces in the 1990s, and there was nothing more to be learned.  Plus, it was hard to decide which prospect I dreaded more: me “scoring a debate victory” over Roger Penrose, or him scoring a debate victory over me.

But it didn’t matter, because Penrose bit.  He said I’d misunderstood his argument, that it had nothing to do with “mystically seeing” the consistency of a formal system.  Rather, it was about the human capacity to pass from a formal system S to a stronger system S’ that one already implicitly accepted if one was using S at all—and indeed, that Turing himself had clearly understood this as the central message of Gödel, that our ability to pass to stronger and stronger formal systems was necessarily non-algorithmic.  I replied that it was odd to appeal here to Turing, who of course had considered and rejected the “Gödelian case against AI” in 1950, on the ground that AI programs could make mathematical mistakes yet still be at least as smart as humans.  Penrose said that he didn’t consider that one of Turing’s better arguments; he then turned to me and asked whether I actually found Turing’s reply satisfactory.  I could see that it wasn’t a rhetorical debate question; he genuinely wanted to know!  I said that yes, I agreed with Turing’s reply.

Someone mentioned that Penrose had offered a lengthy rebuttal to at least twenty counterarguments to the Gödelian anti-AI case in Shadows of the Mind.  I affirmed that I’d read his rebuttals, and I focused on one particular argument in Shadows: that while it’s admittedly conceivable that individual mathematicians might be mistaken, might believe (for example) that a formal system was consistent even though it wasn’t, the mathematical community as a whole converges toward truth in these matters, and it’s that convergence that cries out for a non-algorithmic explanation.  I replied that it wasn’t obvious to me that set theorists do converge toward truth in these matters, in anything other than the empirical, higgledy-piggledy, no-guarantees sense in which a community of AI robots might also converge toward truth.  Penrose said I had misunderstood the argument.  But alas, time was running out, and we never managed to get to the bottom of it.

There was one aspect of the discussion that took me by complete surprise.  I’d expected to be roasted alive over my attempt to relate consciousness and free will to unpredictability, the No-Cloning Theorem, irreversible decoherence, microscopic degrees of freedom left over from the Big Bang, and the cosmology of de Sitter space.  Sure, my ideas might be orders of magnitude less crazy than anything Penrose proposes, but they’re still pretty crazy!  But that entire section of my talk attracted only minimal interest.  With the Seven Pines crowd, what instead drew fire were the various offhand “pro-AI / pro-computationalism” comments I’d made—comments that, because I hang out with Singularity types so much, I had ceased to realize could even possibly be controversial.

So for example, one audience member argued that an AI could only do what its programmers had told it to do; it could never learn from experience.  I could’ve simply repeated Turing’s philosophical rebuttals to what he called “Lady Lovelace’s Objection,” which are as valid today as they were 66 years ago.  Instead, I decided to fast-forward, and explain a bit how IBM Watson and AlphaGo work, how they actually do learn from past experience without violating the determinism of the underlying transistors.  As I went through this, I kept expecting my interlocutor to interrupt me and say, “yes, yes, of course I understand all that, but my real objection is…”  Instead, I was delighted to find, the interlocutor seemed to light up with newfound understanding of something he hadn’t known or considered.

Similarly, a biologist asked how I could possibly have any confidence that the brain is simulable by a computer, given how little we know about neuroscience.  I replied that, for me, the relevant issues here are “well below neuroscience” in the reductionist hierarchy.  Do you agree, I asked, that the physical laws relevant to the brain are encompassed by the Standard Model of elementary particles, plus Newtonian gravity?  If so, then just as Archimedes declared: “give me a long enough lever and a place to stand, and I’ll move the earth,” so too I can declare, “give me a big enough computer and the relevant initial conditions, and I’ll simulate the brain atom-by-atom.”  The Church-Turing Thesis, I said, is so versatile that the only genuine escape from it is to propose entirely new laws of physics, exactly as Penrose does—and it’s to Penrose’s enormous credit that he understands that.

Afterwards, an audience member came up to me and said how much he liked my talk, but added, “a word of advice, from an older scientist: do not become the priest of a new religion of computation and AI.”  I replied that I’d take that to heart, but what was interesting was that, when I heard “priest of a new religion,” I’d expected that his warning would be the exact opposite of what it turned out to be.  To wit: “Do not become the priest of a new religion of unclonability, unpredictability, and irreversible decoherence.  Stick to computation—i.e., to conscious minds being copyable and predictable exactly like digital computer programs.”  I guess there’s no pleasing everyone!


Coincidental But Not-Wholly-Unrelated Announcement: My friend Robin Hanson has just released his long-awaited book The Age of Em: Work, Love, and Life When Robots Rule the Earth.  I read an early review copy of the book, and wrote the following blurb for the jacket:

Robin Hanson is a thinker like no other on this planet: someone so unconstrained by convention, so unflinching in spelling out the consequences of ideas, that even the most cosmopolitan reader is likely to find him as bracing (and head-clearing) as a mouthful of wasabi.  Now, in The Age of Em, he’s produced the quintessential Hansonian book, one unlike any other that’s ever been written.  Hanson is emphatic that he hasn’t optimized in any way for telling a good story, or for imparting moral lessons about the present: only for maximizing the probability that what he writes will be relevant to the actual future of our civilization.  Early in the book, Hanson estimates that probability as 10%.  His figure seems about right to me—and if you’re able to understand why that’s unbelievably high praise, then The Age of Em is for you.

Actually, my original blurb compared The Age of Em to Asimov’s Foundation series, with its loving attention to the sociology and politics of the remote future.  But that line got edited out, because the publisher (and Robin) wanted to make crystal-clear that The Age of Em is not science fiction, but just sober economic forecasting about a future dominated by copyable computer-emulated minds.

I would’ve attempted a real review of The Age of Em, but I no longer feel any need to, because Scott Alexander of SlateStarCodex has already hit this one out of the emulated park.


Second Coincidental But Not-Wholly-Unrelated Announcement: A reader named Nick Merrill recently came across this old quote of mine from Quantum Computing Since Democritus:

In a class I taught at Berkeley, I did an experiment where I wrote a simple little program that would let people type either “f” or “d” and would predict which key they were going to push next. It’s actually very easy to write a program that will make the right prediction about 70% of the time. Most people don’t really know how to type randomly. They’ll have too many alternations and so on. There will be all sorts of patterns, so you just have to build some sort of probabilistic model.

So Nick emailed me to ask whether I remembered how my program worked, and I explained it to him, and he implemented it as a web app, which he calls the “Aaronson Oracle.”
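
If you’re curious what such a predictor might look like, here’s a minimal sketch (my reconstruction of the general idea, not Nick’s actual code): it tallies which key followed each recent 5-key history, and guesses the majority continuation.

    import sys
    from collections import defaultdict

    HISTORY = 5                           # how many previous keys to condition on
    counts = defaultdict(lambda: [0, 0])  # history string -> [count of 'f', count of 'd']
    history = ""
    right = total = 0

    print("Type f or d, then Enter (q to quit):")
    for line in sys.stdin:
        key = line.strip()
        if key == "q":
            break
        if key not in ("f", "d"):
            continue
        f_count, d_count = counts[history]
        guess = "f" if f_count >= d_count else "d"   # majority vote; ties go to 'f'
        total += 1
        right += (guess == key)
        counts[history]["fd".index(key)] += 1        # update the model after guessing
        history = (history + key)[-HISTORY:]
        print(f"guessed {guess}; right {100 * right / total:.0f}% of the time")

Even a model this simple is typically enough to beat a human who’s trying to “act randomly.”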

So give it a try!  Are you ready to test your free will, your Penrosian non-computational powers, your brain’s sensitivity to amplified quantum fluctuations, against the Aaronson Oracle?

Update: By popular request, Nick has improved his program so that it shows your previous key presses and its guesses for them.  He also fixed a “security flaw”: James Lee noticed that you could use the least significant digit of the program’s percentage correct so far, as a source of pseudorandom numbers that the program couldn’t predict!  So now the program only displays its percent correct rounded to the nearest integer.
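
(In hypothetical Python, James Lee’s exploit amounted to a line like the one below: your next key is chosen by a digit that the frequency model makes no attempt to track.)

    # Sketch of the exploit (names hypothetical): let the parity of the last digit
    # of the program's exact displayed accuracy choose your next keypress.
    next_key = "f" if int(str(exact_percent_correct)[-1]) % 2 == 0 else "d"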


Update (June 15): Penrose’s collaborator Stuart Hameroff has responded in the comments; see here (my reply here) and here.

My Quora session

May 19th, 2016

Here it is.  Enjoy!  (But sorry, no new questions right now.)

Three announcements

May 9th, 2016

(-3) Bonus Announcement of May 30: As a joint effort by Yuri Matiyasevich, Stefan O’Rear, and myself, and using the Not-Quite-Laconic language that Stefan adapted from Adam Yedidia’s Laconic, we now have a 744-state TM that halts iff there’s a counterexample to the Riemann Hypothesis.

(-2) Today’s Bonus Announcement: Stefan O’Rear says that his Turing machine to search for contradictions in ZFC is now down to 1919 states.  If verified, this is an important milestone: our upper bound on the number of Busy Beaver values that are knowable in standard mathematics is now less than the number of years since the birth of Christ (indeed, even since the generally-accepted dates for the writing of the Gospels).

Stefan also says that his Not-Quite-Laconic system has yielded a 1008-state Turing machine to search for counterexamples to the Riemann Hypothesis, improving on our 5372 states.

(-1) Another Bonus Announcement: Great news, everyone!  Using a modified version of Adam Yedidia’s Laconic language (which he calls NQL, for Not Quite Laconic), Stefan O’Rear has now constructed a 5349-state Turing machine that directly searches for contradictions in ZFC (or rather in Metamath, which is known to be equivalent to ZFC), and whose behavior is therefore unprovable in ZFC, assuming ZFC is consistent.  This, of course, improves on my and Adam’s state count by 2569 states—but it also fixes the technical issue with needing to assume a large cardinal axiom (SRP) in order to prove that the TM runs forever.  Stefan promises further state reductions in the near future.

In other news, Adam has now verified the 43-state Turing machine by Jared S that halts iff there’s a counterexample to Goldbach’s Conjecture.  The 27-state machine by code golf addict is still being verified.

(0) Bonus Announcement: I’ve had half a dozen “Ask Me Anything” sessions on this blog, but today I’m trying something different: a Q&A session on Quora.  The way it works is that you vote for your favorite questions; then on Tuesday, I’ll start with the top-voted questions and keep going down the list until I get tired.  Fire away!  (And thanks to Shreyes Seshasai at Quora for suggesting this.)

(1) When you announce a new result, the worst that can happen is that the result turns out to be wrong, trivial, or already known.  The best that can happen is that the result quickly becomes obsolete, as other people race to improve it.  With my and Adam Yedidia’s work on small Turing machines that elude set theory, we seem to be heading for that best case.  Stefan O’Rear wrote a not-quite-Laconic program that just searches directly for contradictions in a system equivalent to ZFC.  If we could get his program to compile, it would likely yield a Turing machine with somewhere around 6,000-7,000 states whose behavior was independent of ZFC, and would also fix the technical problem with my and Adam’s machine Z, where one needed to assume a large-cardinal axiom called SRP to prove that Z runs forever.  While it would require a redesign from the ground up, a 1,000-state machine whose behavior eludes ZFC also seems potentially within reach using Stefan’s ideas.  Meanwhile, our 4,888-state machine for Goldbach’s conjecture seems to have been completely blown out of the water: first, a commenter named Jared S says he’s directly built a 73-state machine for Goldbach (now down to 43 states); second, a commenter named “code golf addict” claims to have improved on that with a mere 31 states (now down to 27 states).  These machines are now publicly posted, but still await detailed verification.

(2) My good friend Jonah Sinick cofounded Signal Data Science, a data-science summer school that will be running for the second time this summer.  They operate on an extremely interesting model, which I’m guessing might spread more widely: tuition is free, but you pay 10% of your first year’s salary after finding a job in the tech sector.  He asked me to advertise them, so—here!

(3) I was sad to read the news that Uber and Lyft will be suspending all service in Austin, because the city passed an ordinance requiring their drivers to get fingerprint background checks, and imposing other regulations that Uber and Lyft argue are incompatible with their model of part-time drivers.  The companies, of course, are also trying to send a clear message to other cities about what will happen if they don’t get the regulatory environment they want.  To me, the truth of the matter is that Uber/Lyft are like the web, Google, or smartphones: clear, once-per-decade quality-of-life advances that you try once, and then no longer understand how you survived without.  So if Austin wants to maintain a reputation as a serious, modern city, it has no choice but to figure out some way to bring these companies back to the negotiating table.  On the other hand, I’d also say to Uber and Lyft that, even if they needed to raise fares to taxi levels to comply with the new regulations, I expect they’d still do a brisk business!

For me, the “value proposition” of Uber has almost nothing to do with the lower fares, even though they’re lower.  For me, it’s simply about being able to get from one place to another without needing to drive and park, and also without needing desperately to explain where you are, over and over, to a taxi dispatcher who sounds angry that you called and who doesn’t understand you because of a combination of language barriers and poor cellphone reception and your own inability to articulate your location.  And then wondering when and if your taxi will ever show up, because the dispatcher couldn’t promise a specific time, or hung up on you before you could ask them.  And then embarking on a second struggle, to explain to the driver where you’re going, or at least convince them to follow the Google Maps directions.  And then dealing with the fact that the driver has no change, you only have twenties and fifties, and their little machine that prints receipts is out of paper so you can’t submit your trip for reimbursement either.

So yes, I really hope Uber, Lyft, and the city of Austin manage to sort this out before Dana and I move there!  On the other hand, I should say that there’s another part of the new ordinance—namely, requiring Uber and Lyft cars to be labeled—that strikes me as an unalloyed good.  For if there’s one way in which Uber is less convenient than taxis, it’s that you can never figure out which car is your Uber, among all the cars stopping or slowing down near you that look vaguely like the one in the app.

The 8000th Busy Beaver number eludes ZF set theory: new paper by Adam Yedidia and me

May 3rd, 2016

I’ve supervised a lot of great student projects in my nine years at MIT, but my inner nerdy teenager has never been as personally delighted by a project as it is right now.  Today, I’m proud to announce that Adam Yedidia, a PhD student at MIT (but an MEng student when he did most of this work), has explicitly constructed a one-tape, two-symbol Turing machine with 7,918 states, whose behavior (when run on a blank tape) can never be proven from the usual axioms of set theory, under reasonable consistency hypotheses.  Adam has also constructed a 4,888-state Turing machine that halts iff there’s a counterexample to Goldbach’s Conjecture, and a 5,372-state machine that halts iff there’s a counterexample to the Riemann Hypothesis.  In all three cases, this is the first time we’ve had a reasonable explicit upper bound on how many states you need in a Turing machine before you can see the behavior in question.

Here’s our research paper, on which Adam generously included me as a coauthor, even though he did the heavy lifting.  Also, here’s a github repository where you can download all the code Adam used to generate these Turing machines, and even use it to build your own small Turing machines that encode interesting mathematical statements.  Finally, here’s a YouTube video where Adam walks you through how to use his tools.

A more precise statement of our main result is this: we give a 7,918-state Turing machine, called Z (and actually explicitly listed in our paper!), such that:

  1. Z runs forever, assuming the consistency of a large-cardinal theory called SRP (Stationary Ramsey Property), but
  2. Z can’t be proved to run forever in ZFC (Zermelo-Fraenkel set theory with the Axiom of Choice, the usual foundation for mathematics), assuming that ZFC is consistent.

A bit of background: it follows, as an immediate consequence of Gödel’s Incompleteness Theorem, that there’s some computer program, of some length, that eludes the power of ordinary mathematics to prove what it does, when it’s run with an unlimited amount of memory.  So for example, such a program could simply enumerate all the possible consequences of the ZFC axioms, one after another, and halt if it ever found a contradiction (e.g., a proof of 1+1=3).  Assuming ZFC is consistent, this program must run forever.  But again assuming ZFC is consistent, ZFC can’t prove that the program runs forever, since if it did, then it would prove its own consistency, thereby violating the Second Incompleteness Theorem!
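
In code, such a program would look something like this (a sketch, of course: `is_zfc_proof_of_contradiction` is a hypothetical stand-in for a real proof-checker, which is exactly the part that makes the machine so big):

    from itertools import count

    def candidate_proofs():
        # Enumerate every finite bit-string: each is a candidate encoding of a proof.
        for length in count(1):
            for i in range(2 ** length):
                yield format(i, "b").zfill(length)

    for s in candidate_proofs():
        # is_zfc_proof_of_contradiction is hypothetical: it should accept s iff
        # s encodes a valid ZFC proof of a contradiction (e.g., of 1+1=3).
        if is_zfc_proof_of_contradiction(s):
            print("ZFC is inconsistent!")
            break
    # If ZFC is consistent, this loop runs forever -- but by the Second
    # Incompleteness Theorem, ZFC itself can't prove that it does.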

Alas, this argument still leaves us in the dark about where, in the space of computer programs, the “Gödelian gremlin” rears its undecidable head.  A program that searches for an inconsistency in ZFC is a fairly complicated animal: it needs to encode not only the ZFC axiom schema, but also the language and inference rules of first-order logic.  Such a program might be thousands of lines long if written in a standard programming language like C, or millions of instructions if compiled down to a bare-bones machine code.  You’d certainly never run across such a program by chance—not even if you had a computer the size of the observable universe, trying one random program after another for billions of years in a “primordial soup”!

So the question stands—a question that strikes me as obviously important, even though as far as I know, only one or two people ever asked the question before us; see here for example.  Namely: do the axioms of set theory suffice to analyze the behavior of every computer program that’s at most, let’s say, 50 machine instructions long?  Or are there super-short programs that already exhibit “Gödelian behavior”?

Theoretical computer scientists might object that this is “merely a question of constants.”  Well yes, OK, but the origin of life in our universe—a not entirely unrelated puzzle—is also “merely a question of constants”!  In more detail, we know that it’s possible with our laws of physics to build a self-replicating machine: say, DNA or RNA and their associated paraphernalia.  We also know that tiny molecules like H2O and CO2 are not self-replicating.  But we don’t know how small the smallest self-replicating molecule can be—and that’s an issue that influences whether we should expect to find ourselves alone in the universe or find it teeming with life.

Some people might also object that what we’re asking about has already been studied, in the half-century quest to design the smallest universal Turing machine (the subject of Stephen Wolfram’s $25,000 prize in 2007, to which I responded with my own $25.00 prize).  But I see that as fundamentally different, for the following reason.  A universal Turing machine—that is, a machine that simulates any other machine that’s described to it on its input tape—has the privilege of offloading almost all of its complexity onto the description format for the input machine.  So indeed, that’s exactly what all known tiny universal machines do!  But a program that checks (say) Goldbach’s Conjecture, or the Riemann Hypothesis, or the consistency of set theory, on an initially blank tape, has no such liberty.  For such machines, the number of states really does seem like an intrinsic measure of complexity, because the complexity can’t be shoehorned anywhere else.

One can also phrase what we’re asking in terms of the infamous Busy Beaver function.  Recall that BB(n), or the nth Busy Beaver number, is defined to be the maximum number of steps that any n-state Turing machine takes when run on an initially blank tape, assuming that the machine eventually halts. The Busy Beaver function was the centerpiece of my 1998 essay Who Can Name the Bigger Number?, which might still attract more readers than anything else I’ve written since. As I stressed there, if you’re in a biggest-number-naming contest, and you write “BB(10000),” you’ll destroy any opponent—however otherwise mathematically literate they are—who’s innocent of computability theory.  For BB(n) grows faster than any computable sequence of integers: indeed, if it didn’t, then one could use that fact to solve the halting problem, contradicting Turing’s theorem.
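
To spell out that last step (a sketch, where `halts_within` and `num_states` are hypothetical stand-ins for a step-bounded simulator and a state count):

    def decide_halting(machine, f):
        # Suppose, hypothetically, that f were a computable function satisfying
        # f(n) >= BB(n) for every n.  Then this procedure would decide the halting
        # problem for machines started on a blank tape -- impossible, by Turing.
        bound = f(machine.num_states)
        # halts_within(M, t) simulates M for t steps and reports whether it halted.
        # By the definition of BB, an n-state machine still running after BB(n)
        # steps (hence after f(n) steps) never halts.
        return halts_within(machine, bound)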

But the BB function has a second amazing property: namely, it’s a perfectly well-defined integer function, and yet once you fix the axioms of mathematics, only finitely many values of the function can ever be proved, even in principle.  To see why, consider again a Turing machine M that halts if and only if there’s a contradiction in ZF set theory.  Clearly such a machine could be built, with some finite number of states k.  But then ZF set theory can’t possibly determine the value of BB(k) (or BB(k+1), BB(k+2), etc.), unless ZF is inconsistent!  For to do so, ZF would need to prove that M ran forever, and therefore prove its own consistency, and therefore be inconsistent by Gödel’s Theorem.

OK, but we can now ask a quantitative question: how many values of the BB function is it possible for us to know?  Where exactly is the precipice at which this function “departs the realm of mortals and enters the realm of God”: is it closer to n=10 or to n=10,000,000?  In practice, four values of BB have been determined so far:

  • BB(1)=1
  • BB(2)=6
  • BB(3)=21 (Lin and Rado 1965)
  • BB(4)=107 (Brady 1975)

We also know some lower bounds:

  • BB(5) ≥ 47,176,870 (Marxen and Buntrock 1990)
  • BB(6) > 7.4 × 10^36,534 (Kropitz 2010)

See Heiner Marxen’s page or the Googology Wiki (which somehow I only learned about today) for more information.

Some Busy Beaver enthusiasts have opined that even BB(6) will never be known exactly.  On the other hand, the abstract argument from before tells us only that, if we confine ourselves to (say) ZF set theory, then there’s some k—possibly in the tens of millions or higher—such that the values of BB(k), BB(k+1), BB(k+2), and so on can never be proven.  So again: is the number of knowable values of the BB function more like 10, or more like a million?

This is the question that Adam and I (but mostly Adam) have finally addressed.

It’s hopeless to design a Turing machine by hand for all but the simplest tasks, so as a first step, Adam created a new programming language, called Laconic, specifically for writing programs that compile down to small Turing machines.  Laconic programs actually compile to an intermediary language called TMD (Turing Machine Descriptor), and from there to Turing machines.

Even then, we estimate that a direct attempt to write a Laconic program that searched for a contradiction in ZFC would lead to a Turing machine with millions of states.  There were three ideas needed to get the state count down to something reasonable.

The first was to take advantage of the work of Harvey Friedman, one of the one or two people I mentioned earlier who’ve written about these problems before.  In particular, Friedman has been laboring since the 1960s to find “natural” arithmetical statements that are provably independent of ZFC or other strong set theories.  (See this AMS Notices piece by Martin Davis for a discussion of Friedman’s progress as of 2006.)  Not only does Friedman’s quest continue, but some of his most important progress has come only within the last year.  His statements—typically involving objects called “order-invariant graphs”—strike me as alien, and as far removed from anything I’d personally have independent reasons to think about (but is that just a sign of my limited perspective?).  Be that as it may, Friedman’s statements still seem a lot easier to encode as short computer programs than the full apparatus of first-order logic and set theory!  So that’s what we started with; our work wouldn’t have been possible without Friedman (who we consulted by email throughout the project).

The second idea was something we called “on-tape processing.”  Basically, instead of compiling directly from Laconic down to Turing machine, Adam wrote an interpreter in Turing machine (which took about 4000 states—a single, fixed cost), and then had the final Turing machine first write a higher-level program onto its tape and then interpret that program.  Instead of the compilation process producing a huge multiplicative overhead in the number of Turing machine states (and a repetitive machine), this approach gives us only an additive overhead.  We found that this one idea decreased the number of states by roughly an order of magnitude.

The third idea was first suggested in 2002 by Ben-Amram and Petersen (and refined for us by Luke Schaeffer); we call it “introspective encoding.”  When we write the program to be interpreted onto the Turing machine tape, the naïve approach would use one Turing machine state per bit.  But that’s clearly wasteful, since in an n-state Turing machine, every state contains ~log(n) bits of information (because of the other states it needs to point to).  A better approach tries to exploit as many of those bits as it can; doing that gave us up to a factor-of-5 additional savings in the number of states.
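
To make the counting concrete (my own back-of-the-envelope, not a calculation from the paper): writing a b-bit program onto the tape naively costs b states, whereas each state of an n-state machine can in principle carry

$$\log_2 n \approx 13 \text{ bits when } n \approx 8000,$$

so an ideal introspective encoding might squeeze the same program into roughly b/13 states.  That we got a factor of 5 rather than 13 presumably reflects the overhead of decoding the states back into bits.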

For Goldbach’s Conjecture and the Riemann Hypothesis, we paid the same 4000-state overhead for the interpreter, but then the program to be interpreted was simpler, giving a smaller overall machine.  Incidentally, it’s not intuitively obvious that the Riemann Hypothesis is equivalent to the statement that some particular computer program runs forever, but it is—that follows, for example, from work by Lagarias and by Davis, Matijasevich, and Robinson (we used the latter; an earlier version of this post incorrectly stated that we used the Lagarias result).

To preempt the inevitable question in the comments section: yes, we did run these Turing machines for a while, and no, none of them had halted after a day or so.  But before you interpret that as evidence in favor of Goldbach, Riemann, and the consistency of ZFC, you should probably know that a Turing machine to test whether all perfect squares are less than 5, produced using Laconic, needed to run for more than an hour before it found the first counterexample (namely, 3²=9) and halted.  Laconic Turing machines are optimized only for the number of states, not for speed, to put it mildly.

Of course, three orders of magnitude still remain between the largest value of n (namely, 4) for which BB(n) is known to be knowable in ZFC-based mathematics, and the smallest value of n (namely, 7,918) for which BB(n) is known to be unknowable.  I’m optimistic that further improvements are possible to the machine Z—whether that means simplifications to Friedman’s statement, a redesigned interpreter (possibly using lambda calculus?), or a “multi-stage rocket model” where a bare-bones interpreter would be used to unpack a second, richer interpreter which would be used to unpack a third, etc., until you got to the actual program you cared about.  But I’d be shocked if anyone in my lifetime determined the value of BB(10), for example, or proved the value independent of set theory.  Even after the Singularity happens, I imagine that our robot overlords would find the determination of BB(10) quite a challenge.

In an early Shtetl-Optimized post, I described theoretical computer science as “quantitative epistemology.”  Constructing small Turing machines whose behavior eludes set theory is not conventional theoretical computer science by any stretch of the imagination: it’s closer in practice to programming languages or computer architecture, or even the recreational practice known as code-golfing.  On the other hand, I’ve never been involved with any other project that was so clearly, explicitly about pinning down the quantitative boundary between the knowable and the unknowable.

Comments on our paper are welcome.

Addendum: Some people might wonder “why Turing machines,” as opposed to a more reasonable programming language like C or Python.  Well, first of all, we needed a language that could address an unlimited amount of memory.  Also, the BB function is traditionally defined in terms of Turing machines.  But the most important issue is that we wanted there to be no suspicion whatsoever that our choice of programming language was artificially helping to make our machine small.  And hopefully everyone can agree that one-tape, two-symbol Turing machines aren’t designed for anyone’s convenience!

“Largely just men doing sums”: My review of the excellent Ramanujan film

May 1st, 2016

[Warning: This movie review contains spoilers, as well as a continued fraction expansion.]

These days, it takes an extraordinary occasion for me and Dana to arrange the complicated, rocket-launch-like babysitting logistics involved in going out for a night at the movies.  One such occasion was an opening-weekend screening of The Man Who Knew Infinity—the new movie about Srinivasa Ramanujan and his relationship with G. H. Hardy—followed by a Q&A with Matthew Brown (who wrote and directed the film), Robert Kanigel (who wrote the biography on which the film was based), and Fields Medalist Manjul Bhargava (who consulted on the film).

I read Kanigel’s The Man Who Knew Infinity in the early nineties; it was a major influence on my life.  There were equations in that book to stop a nerdy 13-year-old’s pulse, like

$$1+9\left(\frac{1}{4}\right)^{4}+17\left(\frac{1\cdot5}{4\cdot8}\right)^{4}+25\left(\frac{1\cdot5\cdot9}{4\cdot8\cdot12}\right)^{4}+\cdots=\frac{2^{3/2}}{\pi^{1/2}\Gamma\left(3/4\right)^{2}}$$

$$\frac{1}{1+\frac{e^{-2\pi}}{1+\frac{e^{-4\pi}}{1+\frac{e^{-6\pi}}{1+\cdots}}}}=\left(\sqrt{\frac{5+\sqrt{5}}{2}}-\frac{\sqrt{5}+1}{2}\right)\sqrt[5]{e^{2\pi}}$$

A thousand pages of exposition about Ramanujan’s mysterious self-taught mathematical style, the effect his work had on Hardy and Littlewood, his impact on the later development of analysis, etc., could never replace the experience of just staring at these things!  Popularizers are constantly trying to “explain” mathematical beauty by comparing it to art, music, or poetry, but I can best understand art, music, and poetry if I assume other people experience them like the above identities.  Across all the years and cultures and continents, can’t you feel Ramanujan himself leaping off your screen, still trying to make you see this bizarre aspect of the architecture of reality that the goddess Namagiri showed him in a dream?
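
(If staring isn’t enough, both identities are easy to check numerically; here’s a quick Python sketch of mine, which sums the series term by term and evaluates the continued fraction from the bottom up.)

    import math

    # First identity: sum of (8k+1) * (prod_{j=1..k} (4j-3)/(4j))^4 over k >= 0.
    total, prod = 0.0, 1.0
    for k in range(200000):
        if k > 0:
            prod *= (4 * k - 3) / (4 * k)
        total += (8 * k + 1) * prod ** 4
    rhs = 2 ** 1.5 / (math.sqrt(math.pi) * math.gamma(0.75) ** 2)
    print(total, rhs)    # both come out around 1.0626

    # Second identity: evaluate the continued fraction bottom-up, 60 levels deep.
    d = 1.0
    for n in range(60, 0, -1):
        d = 1 + math.exp(-2 * math.pi * n) / d
    rhs2 = (math.sqrt((5 + math.sqrt(5)) / 2) - (math.sqrt(5) + 1) / 2) * math.exp(2 * math.pi / 5)
    print(1 / d, rhs2)   # both come out around 0.99814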

Reading Kanigel’s book, I was also entranced by the culture of early-twentieth-century Cambridge mathematics: the Tripos, Wranglers, High Table.  I asked, why was I here and not there?  And even though I was (and remain) at most 1729⁻¹⁷²⁹ of a Ramanujan, I could strongly identify with his story, because I knew that I, too, was about to embark on the journey from total scientific nobody to someone who the experts might at least take seriously enough to try to prove him wrong.

Anyway, a couple years after reading Kanigel’s biography, I went to the wonderful Canada/USA MathCamp, and there met Richard K. Guy, who’d actually known Hardy.  I couldn’t have been more impressed had Guy visited Platonic heaven and met π and e there.  To put it mildly, no one in my high school had known G. H. Hardy.

I often fantasized—this was the nineties—about writing the screenplay myself for a Ramanujan movie, so that millions of moviegoers could experience the story as I did.  Incidentally, I also fantasized about writing screenplays for Alan Turing and John Nash movies.  I do have a few mathematical biopic ideas that haven’t yet been taken, and for which any potential buyers should get in touch with me:

  • Radical: The Story of Évariste Galois
  • Give Me a Place to Stand: Archimedes’ Final Days
  • Mathématicienne: Sophie Germain In Her Prime
  • The Prime Power of Ludwig Sylow
    (OK, this last one would be more of a limited-market release)

But enough digressions; how was the Ramanujan movie?

Just as Ramanujan himself wasn’t an infallible oracle (many of his claims, e.g. his formula for the prime counting function, turned out to be wrong), so The Man Who Knew Infinity isn’t a perfect movie.  Even so, there’s no question that this is one of the best and truest movies ever made about mathematics and mathematicians, if not the best and truest.  If you’re the kind of person who reads this blog, go see it now.  Don’t wait!  As they stressed at the Q&A, the number of tickets sold in the first couple weeks is what determines whether or not the movie will see a wider release.

More than A Beautiful Mind or Good Will Hunting or The Imitation Game, or the play Proof, or the TV series NUMB3RS, the Ramanujan movie seems to me to respect math as a thing-in-itself, rather than just a tool or symbol for something else that interests the director much more.  The background to the opening credits—and what better choice could there be?—is just page after page from Ramanujan’s notebooks.  Later in the film, there’s a correct explanation of what the partition function P(n) is, and of one of Ramanujan’s and Hardy’s central achievements, which was to give an asymptotic formula for P(n), namely $$ P(n) \approx \frac{e^{\pi \sqrt{2n/3}}}{4\sqrt{3}n}, $$ and to prove the formula’s correctness.
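
(For the curious, here’s a little Python sketch of my own, computing P(n) exactly via Euler’s pentagonal-number recurrence and comparing against the asymptotic formula above.  At n = 100, for example, the asymptotic overshoots the true value of 190,569,292 by only about 5%.)

    import math

    def partition_counts(N):
        # Exact P(n) for 0 <= n <= N, via Euler's pentagonal-number recurrence:
        # P(n) = sum over k >= 1 of (-1)^(k+1) * [P(n - k(3k-1)/2) + P(n - k(3k+1)/2)].
        p = [1] + [0] * N
        for n in range(1, N + 1):
            total, k = 0, 1
            while k * (3 * k - 1) // 2 <= n:
                sign = 1 if k % 2 else -1
                total += sign * p[n - k * (3 * k - 1) // 2]
                if k * (3 * k + 1) // 2 <= n:
                    total += sign * p[n - k * (3 * k + 1) // 2]
                k += 1
            p[n] = total
        return p

    def hardy_ramanujan(n):
        # Leading-order Hardy-Ramanujan asymptotic for P(n).
        return math.exp(math.pi * math.sqrt(2 * n / 3)) / (4 * math.sqrt(3) * n)

    p = partition_counts(1000)
    for n in (10, 100, 1000):
        print(n, p[n], p[n] / hardy_ramanujan(n))   # the ratio creeps toward 1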

The film also makes crystal-clear that pure mathematicians do what they do not because of applications to physics or anything else, but simply because they feel compelled to: for the devout Ramanujan, math was literally about writing down “the thoughts of God,” while for the atheist Hardy, math was a religion-substitute.  Notably, the movie explores the tension between Ramanujan’s untrained intuition and Hardy’s demands for rigor in a way that does them both justice, resisting the Hollywood urge to make intuition 100% victorious and rigor just a stodgy punching bag to be defeated.

For my taste, the movie could’ve gone even further in the direction of “letting the math speak”: for example, it could’ve explained just one of Ramanujan’s infinite series.  Audiences might even have liked some more T&A (theorems and asymptotic bounds).  During the Q&A that I attended, I was impressed to see moviegoers repeatedly pressing a somewhat-coy Manjul Bhargava to explain Ramanujan’s actual mathematics (e.g., what exactly were the discoveries in his first letter to Hardy?  what was in Ramanujan’s Lost Notebook that turned out to be so important?).  Then again, this was Cambridge, MA, so the possibility should at least be entertained that what I witnessed was unrepresentative of American ticket-buyers.

From what I’ve read, the movie is also true to South Indian dress, music, religion, and culture.  Yes, the Indian characters speak to each other in English rather than Tamil, but Brown explained that as a necessary compromise (not only for the audience’s sake, but also because Dev Patel and the other Indian actors didn’t speak Tamil).

Some reviews have mentioned issues with casting and characterization.  For example, Hardy is portrayed by Jeremy Irons, who’s superb but also decades older than Hardy was at the time he knew Ramanujan.  Meanwhile Ramanujan’s wife, Janaki, is played by a fully-grown Devika Bhise; the real Janaki was nine (!) when she married Ramanujan, and fourteen when Ramanujan left for England.  J. E. Littlewood is played as almost a comic-relief buffoon, so much so that it feels incongruous when, near the end of the film, Irons-as-Hardy utters the following real-life line:

I still say to myself when I am depressed and find myself forced to listen to pompous and tiresome people, “Well, I have done one thing you could never have done, and that is to have collaborated with Littlewood and Ramanujan on something like equal terms.”

Finally, a young, mustachioed Bertrand Russell is a recurring character.  Russell and Hardy really were friends and fellow WWI pacifists, but Hardy seeking out Bertie’s advice about each Ramanujan-related development seems like almost certainly just an irresistible plot device.

But none of that matters.  What bothered me more were the dramatizations of the prejudice Ramanujan endured in England.  Ramanujan is shown getting knocked to the ground, punched, and kicked by British soldiers barking anti-Indian slurs at him; he then shows up for his next meeting with Hardy covered in bruises, which Hardy (being aloof) neglects to ask about.  Ramanujan is also depicted getting shoved, screamed at, and told never to return by a math professor who he humiliates during a lecture.  I understand why Brown made these cinematic choices: there’s no question that Ramanujan experienced prejudice and snobbery in Cambridge, and that he often felt lonely and unwelcome there.  And it’s surely easier to show Ramanujan literally getting beaten up by racist bigots, than to depict his alienation from Cambridge society as the subtler matter that it most likely was.  To me, though, that’s precisely why the latter choice would’ve been even more impressive, had the film managed to pull it off.

Similarly, during World War I, the film shows not only Trinity College converted into a military hospital, and many promising students marched off to their deaths (all true), but also a shell exploding on campus near Ramanujan, after which Ramanujan gazes in horror at the bleeding dead bodies.  Like, isn’t the truth here dramatic enough?

One other thing: the movie leaves you with the impression that Ramanujan died of tuberculosis.  More recent analysis concluded that it was probably hepatic amoebiasis that he brought with him from India—something that could’ve been cured with the medicine of the time, had anyone correctly diagnosed it.  (Incidentally, the film completely omits Ramanujan’s final year, back in India, when he suffered a relapse of his illness and slowly withered away, yet with Janaki by his side, continued to do world-class research and exchanged letters with Hardy until the very last days.  Everyone I read commented that this was “the right dramatic choice,” but … I dunno, I would’ve shown it!)

But enough!  I fear that to harp on these defects is to hold the film to impossibly-high, Platonic standards, rather than standards that engage with the reality of Hollywood.  An anecdote that Brown related at the end of the Q&A session brought this point home for me.  Apparently, Brown struggled for an entire decade to attract funding for a film about a turn-of-the-century South Indian mathematician visiting Trinity College, Cambridge, whose work had no commercial or military value whatsoever.  At one point, Brown was actually told that he could get the movie funded, if he’d agree to make Ramanujan fall in love with a white nurse, so that a British starlet who would sell tickets could be cast as his love interest.  One can only imagine what a battle it must have been to get a correct explanation of the partition function onto the screen.

In the end, though, nothing made me appreciate The Man Who Knew Infinity more than reading negative reviews of it, like this one by Olly Richards:

Watching someone balancing algorithms or messing about with multivariate polynomials just isn’t conducive to urgently shovelling popcorn into your face.  Difficult to dislike, given its unwavering affection for its subject, The Man Who Knew Infinity is nevertheless hamstrung by the dryness of its subject … Sturdy performances and lovely scenery abound, but it’s still largely just men doing sums; important sums as it turns out, but that isn’t conveyed to the audience until the coda [which mentions black holes] tells us of the major scientific advances they aided.

On behalf of mathematics, on behalf of my childhood self, I’m grateful that Brown fought this fight, and that he won as much as he did.  Whether you walk, run, board a steamship, or take taxi #1729, go see this film.

Addendum: See also this review by Peter Woit, and this in Notices of the AMS by Ramanujan expert George Andrews.

Me interviewed by John Horgan (the author of “The End of Science”)

April 21st, 2016

You can read it here.

It’s long (~12,000 words).  Rather than listing what this interview covers, it would be easier to list what it doesn’t cover.  (My favorite soda flavors?)

If you read this blog, much of what I say there will be old hat, but some of it will be new.  I predict that you’ll enjoy the interview iff you enjoy the blog.  Comments welcome.

Grading Trudeau on quantum computing

April 17th, 2016

Update (4/19): Inspired by Trudeau’s performance (which they clocked at 35 seconds), Maclean’s magazine asked seven quantum computing researchers—me, Krysta Svore, Aephraim Steinberg, Barry Sanders, Davide Venturelli, Martin Laforest, and Murray Thom—to also explain quantum computing in 35 seconds or fewer.  You can see all the results here (here’s the audio from my entry).


The emails started hitting me like … a hail of maple syrup from the icy north.  Had I seen the news?  Justin Trudeau, the dreamy young Prime Minister of Canada, visited the Perimeter Institute for Theoretical Physics in Waterloo, one of my favorite old haunts.  At a news conference at PI, as Trudeau stood in front of a math-filled blackboard, a reporter said to him: “I was going to ask you to explain quantum computing, but — when do you expect Canada’s ISIL mission to begin again, and are we not doing anything in the interim?”

Rather than answering immediately about ISIL, Trudeau took the opportunity to explain quantum computing:

“Okay, very simply, normal computers work, uh, by [laughter, applause] … no no no, don’t interrupt me.  When you walk out of here, you will know more … no, some of you will know far less about quantum computing, but most of you … normal computers work, either there’s power going through a wire, or not.  It’s 1, or a 0, they’re binary systems.  Uh, what quantum states allow for is much more complex information to be encoded into a single bit.  Regular computer bit is either a 1 or a 0, on or off.  A quantum state can be much more complex than that, because as we know [speeding up dramatically] things can be both particle and wave at the same times and the uncertainty around quantum states [laughter] allows us to encode more information into a much smaller computer.  So, that’s what exciting about quantum computing and that’s… [huge applause] don’t get me going on this or we’ll be here all day, trust me.”

What marks does Trudeau get for this?  On the one hand, the widespread praise for this reply surely says more about how low the usual standards for politicians are, and about Trudeau’s fine comic delivery, than about anything intrinsic to what he said.  Trudeau doesn’t really assert much here: basically, he just says that normal computers work using 1’s and 0’s, and that quantum computers are more complicated than that in some hard-to-explain way.  He gestures toward the uncertainty principle and wave/particle duality, but he doesn’t say anything about the aspects of QM most directly relevant to quantum computing—superposition or interference or the exponential size of Hilbert space—nor does he mention what quantum computers would or wouldn’t be used for.

On the other hand, I’d grade Trudeau’s explanation as substantially more accurate than what you’d get from a typical popular article.  For pay close attention to what the Prime Minister never says: he never says that a qubit would be “both 0 and 1 at the same time,” or any equivalent formulation.  (He does say that quantum states would let us “encode more information into a much smaller computer,” but while Holevo’s Theorem says that’s false for a common interpretation of “information,” it’s true for other reasonable interpretations.)  The humorous speeding up as he mentions particle/wave duality and the uncertainty principle clearly suggests that he knows it’s more subtle than just “0 and 1 at the same time,” and he also knows that he doesn’t really get it and that the journalists in the audience don’t either.  When I’m grading exams, I always give generous partial credit for honest admissions of ignorance.  B+.

Anyway, I’d be curious to know who at PI prepped Trudeau for this, and what they said.  Those with inside info, feel free to share in the comments (anonymously if you want!).

(One could also compare against Obama’s 2008 answer about bubblesort, which amounted to little more than name-checking a keyword.)

Update: See also a Motherboard article where Romain Alléaume, Amr Helmy, Michele Mosca, and Aephraim Steinberg rate Trudeau’s answer, giving it 7/10, no score, 9/10, and 7/10 respectively.