## Talk, be merry, and be rational

November 23rd, 2015

Yesterday I wrote a statement on behalf of a Scott Alexander SlateStarCodex/rationalist meetup, which happened last night at MIT (in the same room where I teach my graduate class), and which I’d really wanted to attend but couldn’t.  I figured I’d share the statement here:

I had been looking forward to attending tonight’s MIT SlateStarCodex meetup as I hardly ever look forward to anything. Alas, I’m now stuck in Chicago, with my flight cancelled due to snow, and with all flights for the next day booked up. But instead of continuing to be depressed about it, I’ve decided to be happy that this meetup is even happening at all—that there’s a community of people who can read, let’s say, a hypothetical debate moderator questioning Ben Carson about what it’s like to be a severed half-brain, and simply be amused, instead of silently trying to figure out who benefits from the post and which tribe the writer belongs to. (And yes, I know: the answer is the gray tribe.) And you can find this community anywhere—even in Cambridge, Massachusetts! Look, I spend a lot of time online, just getting more and more upset reading social justice debates that are full of people calling each other douchebags without even being able to state anything in the same galactic supercluster as the other side’s case. And then what gives me hope for humanity is to click over to the slatestarcodex tab, and to see all the hundreds of comments (way more than my blog gets) by people who disagree with each other but who all basically get it, who all have minds that don’t make me despair. And to realize that, when Scott Alexander calls an SSC meetup, he can fill a room just about anywhere … well, at least anywhere I would visit. So talk, be merry, and be rational.

I’m now back in town, and told by people who attended the meetup that it was crowded, disorganized, and great.  And now I’m off to Harvard, to attend the other Scott A.’s talk “How To Ruin A Perfectly Good Randomized Controlled Trial.”

Update (Nov. 24) Scott Alexander’s talk at Harvard last night was one of the finest talks I’ve ever attended. He was introduced to rapturous applause as simply “the best blogger on the Internet,” and as finally an important speaker, in a talk series that had previously wasted everyone’s time with the likes of Steven Pinker and Peter Singer. (Scott demurred that his most notable accomplishment in life was giving the talk at Harvard that he was just now giving.) The actual content, as Scott warned from the outset, was “just” a small subset of a basic statistics course, but Scott brought each point alive with numerous recent examples, from psychiatry, pharmacology, and social sciences, where bad statistics or misinterpretations of statistics were accepted by nearly everyone and used to set policy. (E.g., Alcoholics Anonymous groups that claimed an “over 95%” success rate, because the people who relapsed were kicked out partway through and not counted toward the total.) Most impressively, Scott leapt immediately into the meat, ended after 20 minutes, and then spent the next two hours just taking questions. Scott is publicity-shy, but I hope for others’ sake that video of the talk will eventually make its way online.

Then, after the talk, I had the honor of meeting two fellow Boston-area rationalist bloggers, Kate Donovan and Jesse Galef. Yes, I said “fellow”: for almost a decade, I’ve considered myself on the fringes of the “rationalist movement.” I’d hang out a lot with skeptic/effective-altruist/transhumanist/LessWrong/OvercomingBias people (who are increasingly now SlateStarCodex people), read their blogs, listen and respond to their arguments, answer their CS theory questions. But I was always vaguely uncomfortable identifying myself with any group that even seemed to define itself by how rational it was compared to everyone else (even if the rationalists constantly qualified their self-designation with “aspiring”!). Also, my rationalist friends seemed overly interested in questions like how to prevent malevolent AIs from taking over the world, which I tend to think we lack the tools to make much progress on right now (though, like with many other remote possibilities, I’m happy for some people to work on them and see if they find anything interesting).

So, what changed? Well, in the debates about social justice, public shaming, etc. that have swept across the Internet these past few years, it seems to me that my rationalist friends have proven themselves able to weigh opposing arguments, examine their own shortcomings, resist groupthink and hysteria from both sides, and attack ideas rather than people, in a way that the wider society—and most depressingly to me, the “enlightened, liberal” part of society—has often failed. In a real-world test (“real-world,” in this context, meaning social media…), the rationalists have walked the walk and rationaled the rational, and thus they’ve given me no choice but to stand up and be counted as one of them.

Have a great Thanksgiving, those of you in the US!

I’m sure there’s a lesson in here somewhere about what I should spend my time on.

## G. Phi. Fo. Fum.

November 4th, 2015

Update (Nov. 17): Video of Laci’s first talk is now available.

Breaking News (Nov. 12): Jeremy Kun has written up a phenomenal summary of Babai’s first lecture.  I haven’t carefully studied all of it, and in any case, there are many missing details to be filled in later (Babai told Kun that the preprint will be available “soon, soon!”).  But from the summary, four points stood out to me:

1. Babai actually claims a quasipolynomial-time algorithm for an interestingly more general problem than graph isomorphism, called string isomorphism.  This was already in the abstract, but googling didn’t reveal what string isomorphism was.  So, OK, here’s what it is: you’re given two strings x and y over some finite alphabet, as well as the generators of a group G of permutations of the string indices.  The problem is to determine whether you can transform x to y by applying a permutation in G.  Or even more generally: given a string x, find a full set of generators for the subgroup of G that fixes x.  See Kun’s post for the straightforward reductions from GI to these group-theoretic problems.
2. As was hinted in the abstract, in Babai’s analysis of his algorithm, there’s one step that relies on a statement whose only known proof depends on the Classification of Finite Simple Groups.  (Thus, it’s not the algorithm itself requires iterating through all the sporadic simple groups or anything like that; this only shows up in the correctness proof.)  This is not the first-ever computer-science application of the Classification of Finite Simple Groups (indeed, Babai himself has some previous ones), but it’s certainly the most dramatic.
3. In previous work on GI, the Johnson graph emerged over and over as a forehead-bangingly hard case that caused numerous algorithms to fail.  In the new work, it looks like Babai’s central technical innovation is to show that, in some sense, the Johnson graph is the only obstruction to taking the divide-and-conquer approaches that people that had tried before, and making them run in quasipolynomial time.  I.e., in each step of the recursion, either you can find a Johnson graph on a large fraction of the vertices and handle it specially, or else you can do something that works whenever there’s not a Johnson graph on a large fraction of the vertices.  Babai calls this “split-or-Johnson.”
4. Babai stressed that in some sense, his new algorithm is the culmination of a program laid out by Eugene Luks in 1982.  Now, the Classification of Finite Simple Groups was also more-or-less completed in the early 1980s.  To my mind, this raises a fascinating socio-mathematical question: which aspects of the new work, if any, could not have been done in the early 80s, possibly by Babai or Luks themselves?  what is it that needed another 30 years?  If the answer turns out to be “nothing,” then to me that’s an astounding illustration of the role of the individual in mathematical progress.  As in: Laci was nice enough to take a third-of-a-century break between his and Luks’ work in the early 80s, and the “natural next step” in their program … and still no one managed to use that break to beat him to the next step!

Earlier today, I was tipped off to what might be the theoretical computer science result of the decade.  My source asked me not to break the news on this blog—but since other theory bloggers (and twitterers) are now covering the story, I guess the graph is out of the Babai.

According to the University of Chicago’s theory seminar calendar, on Tuesday of next week (November 10), the legendary Laszlo Babai will be giving a talk about a new algorithm that solves the graph isomorphism problem in quasipolynomial time.  The previous fastest algorithm to decide whether two n-vertex graphs G and H are isomorphic—by Babai and Luks, back in 1983—ran in exp(√(n log n)) time.  If we credit the announcement, Babai has now gotten that down to exp(polylog(n)), putting one of the central problems of computer science “just barely above P.”  (For years, I’ve answered questions on this blog about the status of graph isomorphism—would I bet that it’s in BQP? in coNP? etc.—by saying that, as far as I and many others are concerned, it might as well just be in P.  Of course I’m happy to reaffirm that conjecture tonight.)

Next week, I assume, Laci will lecture to a packed house; then the experts will race to unpack the details.  Until then, we probably need to sit tight; I don’t know any more than what’s in the abstract.  For now, I’m delighted if commenters want to share general thoughts or questions about graph isomorphism (and I’ll try to answer what I can), but I won’t allow uninformed speculations or rumors about the details of the new result—not until Laci has had a chance to speak.

Update (Nov. 5): While we all wait with bated breath for more details, you can amuse yourself with the talk I gave at Laci’s 60th birthday conference five years ago.

Also, a comment of mine that I should probably promote to the main post:

Dana points out to me that non-native English speakers might not get the staggeringly clever pun in this post’s title (hey, it was the best I could do on a deadline).

So, alright, fee fi fo fum is what the approaching giant bellows in Jack and the Beanstalk. It means something big is on the horizon. Also, G is a graph, and Phi is an isomorphism.

Update (Nov. 12): So, Laci gave his talk. Video was made but does not appear to be available yet. However, Gabriel Gaster, who was in attendance, graciously live-tweeted everything. Here’s a Storify of the live-tweets. (What’s a “Storify”?)

## A breakthrough on QMA(2)?

October 30th, 2015

Last night, Martin Schwarz posted a preprint to the arXiv that claims to show the new complexity class containment QMA(2) ⊆ EXP.  (See also his brief blog post about this result.)  Here QMA(2) means Quantum Merlin-Arthur with two Merlins—i.e., the set of languages for which a “yes” answer can be witnessed by two unentangled quantum states, |ψ〉⊗|φ〉, on polynomially many qubits each, which are checked by a polynomial-time quantum algorithm—while EXP means deterministic exponential time.  Previously, the best upper bound we had was the trivial QMA(2) ⊆ NEXP (Nondeterministic Exponential Time), which comes from guessing exponential-size classical descriptions of the two quantum proofs.

Whether QMA(2) is contained in EXP is a problem that had fascinated me for a decade.  With Salman Beigi, Andy Drucker, Bill Fefferman, and Peter Shor, we discussed this problem in our 2008 paper The Power of Unentanglement.  That paper (with an additional ingredient supplied by Harrow and Montanaro) shows how to prove that a 3SAT instance of size n is satisfiable, using two unentangled quantum proofs with only Õ(√n) qubits each.  This implies that searching over all n-qubit unentangled proofs must take at least exp(n2) time, unless 3SAT is solvable in 2o(n) time (i.e., unless the Exponential Time Hypothesis is false).  However, since EXP is defined as the set of problems solvable in 2p(n) time, for any polynomial p, this is no barrier to QMA(2) ⊆ EXP being true—it merely constrains the possible techniques that could prove such a containment.

In trying to prove QMA(2) ⊆ EXP, the fundamental difficulty is that you need to optimize over unentangled quantum states only.  That might sound easier than optimizing over all states (including the entangled ones), but ironically, it’s harder!  The reason why it’s harder is that optimizing over all quantum states (say, to find the one that’s accepted by some measurement with the maximum probability) is a convex optimization problem: in fact, it boils down to finding the principal eigenvector of a Hermitian matrix.  By contrast, optimizing over only the separable states is a non-convex optimization problem, which is NP-hard to solve exactly (treating the dimension of the Hilbert space as the input size n)—meaning that the question shifts to what sorts of approximations are possible.

Last week, I had the pleasure of speaking with Martin in person, when I visited Vienna, Austria to give a public lecture at the wonderful new research institute IST.  Martin was then ironing out some final wrinkles in his proof, and I got to watch him in action—in particular, to see the care and detachment with which he examined the possibility that his proof might imply too much (e.g., that NP-complete problems are solvable in quasipolynomial time).  Fortunately, his proof turned out not to imply anything of the kind.

The reason why it didn’t is directly related to the most striking feature of Martin’s proof—namely, that it’s non-relativizing, leaving completely open the question of whether QMA(2)A ⊆ EXPA relative to all oracles A.  To explain how this is possible requires saying a bit about how the proof works.  The obvious way to prove QMA(2) ⊆ EXP—what I had assumed from the beginning was the only realistic way—would be to give a quasipolynomial-time approximation algorithm for the so-called Best Separable State or BSS problem.  The BSS problem, as defined in this paper by Russell Impagliazzo, Dana Moshkovitz, and myself (see also this one by Barak et al.), is as follows: you’re given as input an n2×n2 Hermitian matrix A, with all its eigenvalues in [0,1].  Your goal is to find length-n unit vectors, u and w, that maximize

(uT⊗wT)A(u⊗w),

to within an additive error of ±ε, for some constant ε.

Of course, if we just asked for a length-n2 unit vector v that maximized vTAv, we’d be asking for the principal eigenvector of A, which is easy to find in polynomial time.  By contrast, from the ABDFS and Harrow-Montanaro results, it follows that the BSS problem, for constant ε, cannot be solved in poly(n) time, unless 3SAT is solvable in 2o(n) time.  But this still leaves the possibility that BSS is solvable in nlog(n) time—and that possibility would immediately imply QMA(2) ⊆ EXP.  So, as I and others saw it, the real challenge here was to find a quasipolynomial-time approximation algorithm for BSS—something that remained elusive, although Brandao-Christandl-Yard made partial progress towards it.

But now Martin comes along, and proves QMA(2) ⊆ EXP in a way that sidesteps the BSS problem.  The way he does it is by using the fact that, if a problem is in QMA(2), then we don’t merely know a Hermitian operator A corresponding to the measurement of |ψ〉⊗|φ〉: rather, we know an actual polynomial-size sequence of quantum gates that get multiplied together to produce A.  Using that fact, Chailloux and Sattath showed that a natural variant of the QMA-complete Local Hamiltonians problem, which they call Separable Sparse Hamiltonians, is complete for QMA(2).  Thus, it suffices for Martin to show how to solve the Separable Sparse Hamiltonians problem in singly-exponential time.  This he does by using perturbation theory gadgets to reduce Separable Sparse Hamiltonians to Separable Local Hamiltonians with an exponentially-small promise gap, and then using a result of Shi and Wu to solve the latter problem in singly-exponential time.  All in all, given the significance of the advance, Martin’s paper is remarkably short; a surprising amount of it boils down to deeply understanding some not-especially-well-known results that were already in the literature.

One obvious problem left open is whether the full BSS problem—rather than just the special case of it corresponding to QMA(2)—is solvable in quasipolynomial time after all.  A second obvious problem is whether the containment QMA(2) ⊆ EXP can be improved to QMA(2) ⊆ PSPACE, or even (say) QMA(2) ⊆ PP.  (By comparison, note that QMA ⊆ PP, by a result of Kitaev and Watrous.)

Update (Nov. 10): I thought I should let people know that a serious concern has been raised by an expert about the correctness of the proof—and in particular, about the use of perturbation theory gadgets. Martin tells me that he’s working on a fix, and I very much hope he’ll succeed, but not much to do for now except let the scientific process trundle along (which doesn’t happen at blog-speed).

## Ordinary Words Will Do

October 18th, 2015

Izabella Laba, a noted mathematician at the University of British Columbia, recently posted some tweets that used me as a bad, cautionary example for how “STEM faculty should be less contemptuous of social sciences.”  Here was the offending comment of mine, from the epic Walter Lewin thread last fall:

[W]hy not dispense with the empirically-empty notion of “privilege,” and just talk directly about the actual well-being of actual people, or groups of people?  If men are doing horrific things to women—for example, lashing them for driving cars, like in Saudi Arabia—then surely we can just say so in plain language.  Stipulating that the torturers are “exercising their male privilege” with every lash adds nothing to anyone’s understanding of the evil.  It’s bad writing.  More broadly, it seems to me that the entire apparatus of “privilege,” “delegitimation,” etc. etc. can simply be tossed overboard, to rust on the ocean floor alongside dialectical materialism and other theoretical superstructures that were once pompously insisted upon as preconditions of enlightened social discourse.  This isn’t quantum field theory.  Ordinary words will do.

Prof. Laba derisively commented:

Might as well ask you to explain calculus without using fancy words like “derivative” or “continuous.”  Simple number arithmetic will do.

Prof. Laba’s tweets were favorited by Jordan Ellenberg, a mathematician who wrote the excellent popular book How Not to Be Wrong.  (Ellenberg had also criticized me last year for my strange, naïve idea that human relations can be thought about using logic.)

Given my respect for the critics, I guess I’m honor-bound to respond.

For the record, I tend not to think about the social sciences—or for that matter, the natural sciences—as monolithic entities at all.  I admire any honest attempt to discover the truth about anything.  And not being a postmodern relativist, I believe there are deep truths worth discovering in history, psychology, economics, linguistics, possibly even sociology.  Reading the books of Steven Pinker underscored for me how much is actually understood nowadays about human nature—much of it only figured out within the last half-century.  Likewise, reading the voluminous profundities of Scott Alexander taught me that even in psychiatry, there are truths (and even a few definite cures) to be had for those who seek.

I also believe that the social sciences are harder—way harder—than math or physics or CS.  They’re harder because of the tenuousness of the correlations, because of the complexity of each individual human brain (let alone 7 billion of them on the same planet), but most of all, because politics and ideology and the scientist’s own biases place such powerful thumbs on the scale.  This makes it all the more impressive when a social scientist, like (say) Stanley Milgram or Judith Rich Harris or Napoleon Chagnon, teaches the world something important and new.

I will confess to contempt for anything that I regard as pompous obscurantism—for self-referential systems of jargon whose main purposes are to bar outsiders, to mask a lack of actual understanding, and to confer power on certain favored groups.  And I regard the need to be alert to such systems, to nip them in the bud before they grow into Lysenkoism, as in some sense the problem of intellectual life.  Which brings me to the most fundamental asymmetry between the hard and soft sciences.  Namely, the very fact that it’s so much harder to nurture new truths to maturity in the social sciences than it is in math or physics, means that in the former, the jargon-weeds have an easier time filling the void—and we know they’ve done it again and again, even in the post-Enlightenment West.

Time for a thought experiment.  Suppose you showed up at a university anytime between, let’s say, 1910 and 1970, and went from department to department asking (in so many words): what are you excited about this century?  Where are your new continents, what’s the future of your field?  Who should I read to learn about that future?

In physics, the consensus answer would’ve been something like: Planck, Einstein, Bohr, Schrödinger, Dirac.

In psychology, it would’ve been: Freud and Jung (with another faction for B. F. Skinner).

In politics and social sciences, over an enormous swath of academia (including in the West), it would’ve been: Marx, Engels, Trotsky, Lenin.

With hindsight, we now know that the physics advice would’ve been absolute perfection, the psychology and politics advice an unmitigated disaster.  Yes, physicists today know more than Einstein, can even correct him on some points, but the continents he revealed to us actually existed—indeed, have only become more important since Einstein’s time.

But Marx and Freud?  You would’ve done better to leave the campus, and ask a random person on the street what she or he thought about economics and psychology.  In high school, I remember cringing through a unit on the 1920s, when we learned about how “two European professors upset a war-weary civilization’s established certainties—with Einstein overturning received wisdom about space and time, and Freud doing just the same for the world of the mind.”  It was never thought important to add that Einstein’s theories turned out to be true while Freud’s turned out to be false.  Still, at least Freud’s ideas led “only” to decades of bad psychology and hundreds of innocent people sent to jail because of testimony procured through hypnosis, rather than to tens of millions of dead, as with the other social-scientific theory that reigned supreme among 20th-century academics.

Marx and Freud built impressive intellectual edifices—sufficiently impressive for a large fraction of intellectuals to have accepted those men as gurus on par with Darwin and Einstein for almost a century.  Yet on nearly every topic they wrote about, we now know that Marx and Freud couldn’t have been any more catastrophically wrong.  Moreover, their wrongness was knowable at the time—and was known to many, though the ones who knew were typically the ones who the intellectual leaders sneered at, as deluded reactionaries.

Which raises a question: suppose that, in the 1920s, I’d taken the social experts’ advice to study Marx and Freud, didn’t understand much of what they said (and found nonsensical much of what I did understand), and eventually rejected them as pretentious charlatans.  Then why wouldn’t I have been just like Prof. Laba’s ignorant rube, who dismisses calculus because he doesn’t understand technical terms like “continuous” and “derivative”?

On reflection, I don’t think that the two cases are comparable at all.

The hard sciences need technical vocabularies for a simple reason: because they’re about things that normal people don’t spend their hours obsessively worrying about.  Yes, I’d have a hard time understanding organic chemists or differential geometers, but largely for the same reasons I’d have a hard time understanding football fans or pirates.  It’s not just that I don’t understand the arguments; it’s that the arguments are about a world that’s alien to me (and that, to be honest, I don’t care about as much as I do my world).

Suppose, by contrast, that you’re writing about the topics everyone spends their time obsessively worrying about: politics, society, the human mind, the relations between the races and sexes.  In other words, suppose you’re writing about precisely the topics for which the ordinary English language has been honed over centuries—for which Shakespeare and Twain and Dr. King and so many others deployed the language to such spectacular effect.  In that case, what excuse could you possibly have to write in academese, to pepper your prose with undefined in-group neologisms?

Well, let’s be charitable; maybe you have a reason.  For example, maybe you’re doing a complicated meta-analysis of psychology papers, so you need to talk about r-values and kurtosis and heteroskedasticity.  Or maybe you’re putting people in an fMRI machine while you ask them questions, so you need to talk about the temporal resolution in the anterior cingulate cortex.  Or maybe you’re analyzing sibling rivalries using game theory, so you need Nash equilibria.  Or you’re picking apart sentences using Chomskyan formal grammar.  In all these cases, armchair language doesn’t suffice because you’re not just sitting in your armchair: you’re using a new tool to examine the everyday from a different perspective.  For present purposes, you might as well be doing algebraic geometry.

The Freudians and Marxists would, of course, claim that they’re doing the exact same thing.  Yes, they’d say, you thought you had the words to discuss your own mind or the power structure of society, but really you didn’t, because you lacked the revolutionary theoretical framework that we now possess.  (Trotsky’s writings  are suffused with this brand of arrogance in nearly every sentence: for example, when he ridicules the bourgeoisie liberals who whine about “human rights violations” in the early USSR, yet who are too dense to phrase their objections within the framework of dialectical materialism.)

I submit that, even without the hindsight of 2015, there would’ve been excellent reasons to be skeptical of these claims.  Has it ever happened, you might ask yourself, that someone sat in their study and mused about the same human questions that occupied Plato and Shakespeare and Hume, in the same human way they did, and then came up with a new, scientific conclusion that was as rigorous and secure as relativity or evolution?

Let me know if I missed something, but I can’t think of a single example.  Sure, it seems to me, there have been geniuses of human nature, who enlarged our vision without any recourse to the quantitative methods of science.  But even those geniuses “only” contributed melodies for other geniuses to answer in counterpoint, rather than stones for everyone who came later to build upon.  Also, the geniuses usually wrote well.

Am I claiming that progress is impossible in the social realm?  Not at all.  The emancipation of slaves, the end of dueling and blasphemy laws and the divine right of kings, women’s suffrage and participation in the workforce, gay marriage—all these strike me as crystal-clear examples of moral progress, as advances that will still be considered progress a thousand years from now, if there’s anyone around then to discuss such things.  Evolutionary psychology, heuristics and biases, reciprocal altruism, and countless other developments likewise strike me as intellectual progress within the sciences of human nature.  But none of these advances needed recondite language!  Ordinary words sufficed for Thomas Paine and Frederick Douglass and John Stuart Mill, as they sufficed for Robert Axelrod and for Kahneman and Tversky.  So forgive me for thinking that whatever is true and important in the social world today, should likewise be defensible to every smart person in ordinary words, and that this represents a genuine difference between the social sciences and physics.

Which brings us to the central point that Prof. Laba disputed in that comment of mine.  I believe there are countless moral heroes in our time, as well as social scientists who struggle heroically to get the right answers.  But as far as I can tell, the people who build complex intellectual edifices around words like “privilege” and “delegitimation” and “entitlement” and “marginalized” are very much the same sort of people who, a few generations ago, built similar edifices around “bourgeoisie” and “dialectical” and “false consciousness.”  In both cases, there’s an impressive body of theory that’s held up as the equivalent in its domain of relativity, quantum mechanics, and Darwinism, with any skeptics denounced as science-deniers.  In both cases, enlightened liberals are tempted to side with the theorists, since the theorists believe in so many of the same causes that the enlightened liberals believe in, and hate so many of the same people who the enlightened liberals hate.  But in both cases, the theorists’ language seems to alternate between incomprehensible word-salad and fervid, often profanity-laced denunciations, skipping entirely over calm clarity.  And in both cases, the only thing that the impressive theoretical edifice ever seems to get used for, is to prove over and over that certain favored groups should get more power while disfavored ones should get less.

So I’m led to the view that, if you want to rouse people’s anger about injustice or their pity about preventable suffering, or end arbitrary discrimination codified into law, or give individuals more freedom to pursue their own happiness, or come up with a new insight about human nature, or simply describe the human realities that you see around you—for all these purposes, the words that sufficed for every previous generation’s great humanists will also suffice for you.

On the other hand, to restrict freedom and invent new forms of discrimination—and to do it in the name of equality and justice—that takes theory.  You’ll need a sophisticated framework, for example, to prove that even if two adults both insist they’re consenting to a relationship, really they might not be, because of power structures in the wider society that your superior insight lets you see.  You’ll need advanced discourse to assure you that, even though your gut reaction might be horror at (say) someone who misspoke once and then had their life gleefully destroyed on social media, your gut is not to be trusted, because it’s poisoned by the same imperialist, patriarchal biases as everything else—and because what looks like a cruel lynching needs to be understood in a broader social context (did the victim belong to a dominant group, or to a marginalized one?).  Finally, you’ll need oodles of theory (bring out the Marcuse) to explain why the neoliberal fanaticism about “free speech” and “tolerance” and “due process” and “the presumption of innocence” is too abstract and simplistic—for those concepts, too, fail to distinguish between a marginalized group that deserves society’s protection and a dominant group that doesn’t.

So I concede to Prof. Laba that the complicated discourse of privilege, hegemony, etc. serves a definite purpose for the people who wield it, just as much as the complicated discourse of quantum field theory serves a purpose for physicists.  It’s just that the purposes of the privilege-warriors aren’t my purposes.  For my purposes—which include fighting injustice, advancing every social and natural science as quickly as possible, and helping all well-meaning women and men see each other’s common humanity—I said last year and I say again that ordinary words will do.

Update (Oct. 26): Izabella Laba has written a response to this post, for which I’m extremely grateful. Her reply reveals that she and I have a great deal of common ground, and also a few clear areas of disagreement (e.g., what’s wrong with Steven Pinker?). But my most important objection is simply that, the first time I loaded her blog, the text went directly over the rock image in the background, making it impossible to read without highlighting it.

## Six announcements

September 21st, 2015
1. I did a podcast interview with Julia Galef for her series “Rationally Speaking.”  See also here for the transcript (which I read rather than having to listen to myself stutter).  The interview is all about Aumann’s Theorem, and whether rational people can agree to disagree.  It covers a lot of the same ground as my recent post on the same topic, except with less technical detail about agreement theory and more … well, agreement.  At Julia’s suggestion, we’re planning to do a follow-up podcast about the particular intractability of online disagreements.  I feel confident that we’ll solve that problem once and for all.  (Update: Also check out this YouTube video, where Julia offers additional thoughts about what we discussed.)
2. When Julia asked me to recommend a book at the end of the interview, I picked probably my favorite contemporary novel: The Mind-Body Problem by Rebecca Newberger Goldstein.  Embarrassingly, I hadn’t realized that Rebecca had already been on Julia’s show twice as a guest!  Anyway, one of the thrills of my life over the last year has been to get to know Rebecca a little, as well as her husband, who’s some guy named Steve Pinker.  Like, they both live right here in Boston!  You can talk to them!  I was especially pleased two weeks ago to learn that Rebecca won the National Humanities Medal—as I told Julia, Rebecca Goldstein getting a medal at the White House is the sort of thing I imagine happening in my ideal fantasy world, making it a pleasant surprise that it happened in this one.  Huge congratulations to Rebecca!
3. The NSA has released probably its most explicit public statement so far about its plans to move to quantum-resistant cryptography.  For more see Bruce Schneier’s Crypto-Gram.  Hat tip for this item goes to reader Ole Aamot, one of the only people I’ve ever encountered whose name alphabetically precedes mine.
4. Last Tuesday, I got to hear Ayaan Hirsi Ali speak at MIT about her new book, Heretic, and then spend almost an hour talking to students who had come to argue with her.  I found her clear, articulate, and courageous (as I guess one has to be in her line of work, even with armed cops on either side of the lecture hall).  After the shameful decision of Brandeis in caving in to pressure and cancelling Hirsi Ali’s commencement speech, I thought it spoke well of MIT that they let her speak at all.  The bar shouldn’t be that low, but it is.
5. From far away on the political spectrum, I also heard Noam Chomsky talk last week (my first time hearing him live), about the current state of linguistics.  Much of the talk, it struck me, could have been given in the 1950s with essentially zero change (and I suspect Chomsky would agree), though a few parts of it were newer, such as the speculation that human languages have many of the features they do in order to minimize the amount of computation that the speaker needs to perform.  The talk was full of declarations that there had been no useful work whatsoever on various questions (e.g., about the evolutionary function of language), that they were total mysteries and would perhaps remain total mysteries forever.
6. Many of you have surely heard by now that Terry Tao solved the Erdös Discrepancy Problem, by showing that for every infinite sequence of heads and tails and every positive integer C, there’s a positive integer k such that, if you look at the subsequence formed by every kth flip, there comes a point where the heads outnumber tails or vice versa by at least C.  This resolves a problem that’s been open for more than 80 years.  For more details, see this post by Timothy Gowers.  Notably, Tao’s proof builds, in part, on a recent Polymath collaborative online effort.  It was a big deal last year when Konev and Lisitsa used a SAT-solver to prove that there’s always a subsequence with discrepancy at least 3; Tao’s result now improves on that bound by ∞.

## Bell inequality violation finally done right

September 15th, 2015

A few weeks ago, Hensen et al., of the Delft University of Technology and Barcelona, Spain, put out a paper reporting the first experiment that violates the Bell inequality in a way that closes off the two main loopholes simultaneously: the locality and detection loopholes.  Well, at least with ~96% confidence.  This is big news, not only because of the result itself, but because of the advances in experimental technique needed to achieve it.  Last Friday, two renowned experimentalists—Chris Monroe of U. of Maryland and Jungsang Kim of Duke—visited MIT, and in addition to talking about their own exciting ion-trap work, they did a huge amount to help me understand the new Bell test experiment.  So OK, let me try to explain this.

While some people like to make it more complicated, the Bell inequality is the following statement. Alice and Bob are cooperating with each other to win a certain game (the “CHSH game“) with the highest possible probability. They can agree on a strategy and share information and particles in advance, but then they can’t communicate once the game starts. Alice gets a uniform random bit x, and Bob gets a uniform random bit y (independent of x).  Their goal is to output bits, a and b respectively, such that a XOR b = x AND y: in other words, such that a and b are different if and only if x and y are both 1.  The Bell inequality says that, in any universe that satisfies the property of local realism, no matter which strategy they use, Alice and Bob can win the game at most 75% of the time (for example, by always outputting a=b=0).

What does local realism mean?  It means that, after she receives her input x, any experiment Alice can perform in her lab has a definite result that might depend on x, on the state of her lab, and on whatever information she pre-shared with Bob, but at any rate, not on Bob’s input y.  If you like: a=a(x,w) is a function of x and of the information w available before the game started, but is not a function of y.  Likewise, b=b(y,w) is a function of y and w, but not of x.  Perhaps the best way to explain local realism is that it’s the thing you believe in, if you believe all the physicists babbling about “quantum entanglement” just missed something completely obvious.  Clearly, at the moment two “entangled” particles are created, but before they separate, one of them flips a tiny coin and then says to the other, “listen, if anyone asks, I’ll be spinning up and you’ll be spinning down.”  Then the naïve, doofus physicists measure one particle, find it spinning down, and wonder how the other particle instantly “knows” to be spinning up—oooh, spooky! mysterious!  Anyway, if that’s how you think it has to work, then you believe in local realism, and you must predict that Alice and Bob can win the CHSH game with probability at most 3/4.

What Bell observed in 1964 is that, even though quantum mechanics doesn’t let Alice send a signal to Bob (or vice versa) faster than the speed of light, it still makes a prediction about the CHSH game that conflicts with local realism.  (And thus, quantum mechanics exhibits what one might not have realized beforehand was even a logical possibility: it doesn’t allow communication faster than light, but simulating the predictions of quantum mechanics in a classical universe would require faster-than-light communication.)  In particular, if Alice and Bob share entangled qubits, say $$\frac{\left| 00 \right\rangle + \left| 11 \right\rangle}{\sqrt{2}},$$ then there’s a simple protocol that lets them violate the Bell inequality, winning the CHSH game ~85% of the time (with probability (1+1/√2)/2 > 3/4).  Starting in the 1970s, people did experiments that vindicated the prediction of quantum mechanics, and falsified local realism—or so the story goes.

The violation of the Bell inequality has a schizophrenic status in physics.  To many of the physicists I know, Nature’s violating the Bell inequality is so trivial and obvious that it’s barely even worth doing the experiment: if people had just understood and believed Bohr and Heisenberg back in 1925, there would’ve been no need for this whole tiresome discussion.  To others, however, the Bell inequality violation remains so unacceptable that some way must be found around it—from casting doubt on the experiments that have been done, to overthrowing basic presuppositions of science (e.g., our own “freedom” to generate random bits x and y to send to Alice and Bob respectively).

For several decades, there was a relatively conservative way out for local realist diehards, and that was to point to “loopholes”: imperfections in the existing experiments which meant that local realism was still theoretically compatible with the results, at least if one was willing to assume a sufficiently strange conspiracy.

Fine, you interject, but surely no one literally believed these little experimental imperfections would be the thing that would rescue local realism?  Not so fast.  Right here, on this blog, I’ve had people point to the loopholes as a reason to accept local realism and reject the reality of quantum entanglement.  See, for example, the numerous comments by Teresa Mendes in my Whether Or Not God Plays Dice, I Do post.  Arguing with Mendes back in 2012, I predicted that the two main loopholes would both be closed in a single experiment—and not merely eventually, but in, like, a decade.  I was wrong: achieving this milestone took only a few years.

Before going further, let’s understand what the two main loopholes are (or rather, were).

The locality loophole arises because the measuring process takes time and Alice and Bob are not infinitely far apart.  Thus, suppose that, the instant Alice starts measuring her particle, a secret signal starts flying toward Bob’s particle at the speed of light, revealing her choice of measurement setting (i.e., the value of x).  Likewise, the instant Bob starts measuring his particle, his doing so sends a secret signal flying toward Alice’s particle, revealing the value of y.  By the time the measurements are finished, a few microseconds later, there’s been plenty of time for the two particles to coordinate their responses to the measurements, despite being “classical under the hood.”

Meanwhile, the detection loophole arises because in practice, measurements of entangled particles—especially of photons—don’t always succeed in finding the particles, let alone ascertaining their properties.  So one needs to select those runs of the experiment where Alice and Bob both find the particles, and discard all the “bad” runs where they don’t.  This by itself wouldn’t be a problem, if not for the fact that the very same measurement that reveals whether the particles are there, is also the one that “counts” (i.e., where Alice and Bob feed x and y and get out a and b)!

To someone with a conspiratorial mind, this opens up the possibility that the measurement’s success or failure is somehow correlated with its result, in a way that could violate the Bell inequality despite there being no real entanglement.  To illustrate, suppose that at the instant they’re created, one entangled particle says to the other: “listen, if Alice measures me in the x=0 basis, I’ll give the a=1 result.  If Bob measures you in the y=1 basis, you give the b=1 result.  In any other case, we’ll just evade detection and count this run as a loss.”  In such a case, Alice and Bob will win the game with certainty, whenever it gets played at all—but that’s only because of the particles’ freedom to choose which rounds will count.  Indeed, by randomly varying their “acceptable” x and y values from one round to the next, the particles can even make it look like x and y have no effect on the probability of a round’s succeeding.

Until a month ago, the state-of-the-art was that there were experiments that closed the locality loophole, and other experiments that closed the detection loophole, but there was no single experiment that closed both of them.

To close the locality loophole, “all you need” is a fast enough measurement on photons that are far enough apart.  That way, even if the vast Einsteinian conspiracy is trying to send signals between Alice’s and Bob’s particles at the speed of light, to coordinate the answers classically, the whole experiment will be done before the signals can possibly have reached their destinations.  Admittedly, as Nicolas Gisin once pointed out to me, there’s a philosophical difficulty in defining what we mean by the experiment being “done.”  To some purists, a Bell experiment might only be “done” once the results (i.e., the values of a and b) are registered in human experimenters’ brains!  And given the slowness of human reaction times, this might imply that a real Bell experiment ought to be carried out with astronauts on faraway space stations, or with Alice on the moon and Bob on earth (which, OK, would be cool).  If we’re being reasonable, however, we can grant that the experiment is “done” once a and b are safely recorded in classical, macroscopic computer memories—in which case, given the speed of modern computer memories, separating Alice and Bob by half a kilometer can be enough.  And indeed, experiments starting in 1998 (see for example here) have done exactly that; the current record, unless I’m mistaken, is 18 kilometers.  (Update: I was mistaken; it’s 144 kilometers.)  Alas, since these experiments used hard-to-measure photons, they were still open to the detection loophole.

To close the detection loophole, the simplest approach is to use entangled qubits that (unlike photons) are slow and heavy and can be measured with success probability approaching 1.  That’s exactly what various groups did starting in 2001 (see for example here), with trapped ions, superconducting qubits, and other systems.  Alas, given current technology, these sorts of qubits are virtually impossible to move miles apart from each other without decohering them.  So the experiments used qubits that were close together, leaving the locality loophole wide open.

So the problem boils down to: how do you create long-lasting, reliably-measurable entanglement between particles that are very far apart (e.g., in separate labs)?  There are three basic ideas in Hensen et al.’s solution to this problem.

The first idea is to use a hybrid system.  Ultimately, Hensen et al. create entanglement between electron spins in nitrogen vacancy centers in diamond (one of the hottest—or coolest?—experimental quantum information platforms today), in two labs that are about a mile away from each other.  To get these faraway electron spins to talk to each other, they make them communicate via photons.  If you stimulate an electron, it’ll sometimes emit a photon with which it’s entangled.  Very occasionally, the two electrons you care about will even emit photons at the same time.  In those cases, by routing those photons into optical fibers and then measuring the photons, it’s possible to entangle the electrons.

Wait, what?  How does measuring the photons entangle the electrons from whence they came?  This brings us to the second idea, entanglement swapping.  The latter is a famous procedure to create entanglement between two particles A and B that have never interacted, by “merely” entangling A with another particle A’, entangling B with another particle B’, and then performing an entangled measurement on A’ and B’ and conditioning on its result.  To illustrate, consider the state

$$\frac{\left| 00 \right\rangle + \left| 11 \right\rangle}{\sqrt{2}} \otimes \frac{\left| 00 \right\rangle + \left| 11 \right\rangle}{\sqrt{2}}$$

and now imagine that we project the first and third qubits onto the state $$\frac{\left| 00 \right\rangle + \left| 11 \right\rangle}{\sqrt{2}}.$$

If the measurement succeeds, you can check that we’ll be left with the state $$\frac{\left| 00 \right\rangle + \left| 11 \right\rangle}{\sqrt{2}}$$ in the second and fourth qubits, even though those qubits were not entangled before.

So to recap: these two electron spins, in labs a mile away from each other, both have some probability of producing a photon.  The photons, if produced, are routed to a third site, where if they’re both there, then an entangled measurement on both of them (and a conditioning on the results of that measurement) has some nonzero probability of causing the original electron spins to become entangled.

But there’s a problem: if you’ve been paying attention, all we’ve done is cause the electron spins to become entangled with some tiny, nonzero probability (something like 6.4×10-9 in the actual experiment).  So then, why is this any improvement over the previous experiments, which just directly measured faraway entangled photons, and also had some small but nonzero probability of detecting them?

This leads to the third idea.  The new setup is an improvement because, whenever the photon measurement succeeds, we know that the electron spins are there and that they’re entangled, without having to measure the electron spins to tell us that.  In other words, we’ve decoupled the measurement that tells us whether we succeeded in creating an entangled pair, from the measurement that uses the entangled pair to violate the Bell inequality.  And because of that decoupling, we can now just condition on the runs of the experiment where the entangled pair was there, without worrying that that will open up the detection loophole, biasing the results via some bizarre correlated conspiracy.  It’s as if the whole experiment were simply switched off, except for those rare lucky occasions when an entangled spin pair gets created (with its creation heralded by the photons).  On those rare occasions, Alice and Bob swing into action, measuring their respective spins within the brief window of time—about 4 microseconds—allowed by the locality loophole, seeking an additional morsel of evidence that entanglement is real.  (Well, actually, Alice and Bob swing into action regardless; they only find out later whether this was one of the runs that “counted.”)

So, those are the main ideas (as well as I understand them); then there’s lots of engineering.  In their setup, Hensen et al. were able to create just a few heralded entangled pairs per hour.  This allowed them to produce 245 CHSH games for Alice and Bob to play, and to reject the hypothesis of local realism at ~96% confidence.  Jungsang Kim explained to me that existing technologies could have produced many more events per hour, and hence, in a similar amount of time, “particle physics” (5σ or more) rather than “psychology” (2σ) levels of confidence that local realism is false.  But in this type of experiment, everything is a tradeoff.  Building not one but two labs for manipulating NV centers in diamond is extremely onerous, and Hensen et al. did what they had to do to get a significant result.

The basic idea here, of using photons to entangle longer-lasting qubits, is useful for more than pulverizing local realism.  In particular, the idea is a major part of current proposals for how to build a scalable ion-trap quantum computer.  Because of cross-talk, you can’t feasibly put more than 10 or so ions in the same trap while keeping all of them coherent and controllable.  So the current ideas for scaling up involve having lots of separate traps—but in that case, one will sometimes need to perform a Controlled-NOT, or some other 2-qubit gate, between a qubit in one trap and a qubit in another.  This can be achieved using the Gottesman-Chuang technique of gate teleportation, provided you have reliable entanglement between the traps.  But how do you create such entanglement?  Aha: the current idea is to entangle the ions by using photons as intermediaries, very similar in spirit to what Hensen et al. do.

At a more fundamental level, will this experiment finally convince everyone that local realism is dead, and that quantum mechanics might indeed be the operating system of reality?  Alas, I predict that those who confidently predicted that a loophole-free Bell test could never be done, will simply find some new way to wiggle out, without admitting the slightest problem for their previous view.  This prediction, you might say, is based on a different kind of realism.

## Ask Me Anything: Diversity Edition

September 5th, 2015

With the fall semester imminent, and by popular request, I figured I’d do another Ask Me Anything (see here for the previous editions).  This one has a special focus: I’m looking for questions from readers who consider themselves members of groups that have historically been underrepresented in the Shtetl-Optimized comments section.  Besides the “obvious”—e.g., women and underrepresented ethnic groups—other examples might include children, traditionally religious people, jocks, liberal-arts majors… (but any group that includes John Sidles is probably not an example).  If I left out your group, please go ahead and bring it to my and your fellow readers’ attention!

My overriding ideal in life—what is to me as Communism was to Lenin, as Frosted Flakes are to Tony the Tiger—is people of every background coming together to discover and debate universal truths that transcend their backgrounds.  So few things have ever stung me more than accusations of being a closed-minded ivory-tower elitist white male nerd etc. etc.  Anyway, to anyone who’s ever felt excluded here for whatever reason, I hope this AMA will be taken as a small token of goodwill.

Similar rules apply as to my previous AMAs:

• Only one question per person.
• No multi-part questions, or questions that require me to read a document or watch a video and then comment on it.
• Questions need not have anything to do with your underrepresented group (though they could). Math, science, futurology, academic career advice, etc. are all fine.  But please be courteous; anything gratuitously nosy or hostile will be left in the moderation queue.
• I’ll stop taking further questions most likely after 24 hours (I’ll post a warning before closing the thread).

Update (Sep. 6): For anyone from the Boston area, or planning to visit it, I have an important piece of advice.  Do not ever, under any circumstances, attempt to visit Walden Pond, and tell everyone you know to stay away.  After we spent 40 minutes driving there with a toddler, the warden literally screamed at us to go away, that the park was at capacity. It wasn’t an issue of parking: even if we’d parked elsewhere, we just couldn’t go.  Exceptions were made for the people in front of us, but not for us, the ones with the 2-year-old who’d been promised her weekend outing would be to meet her best friend at Walden Pond.  It’s strangely fitting that what for Thoreau was a place of quiet contemplation, is today purely a site of overcrowding and frustration.

Another Update: OK, no new questions please, only comments on existing questions! I’ll deal with the backlog later today. Thanks to everyone who contributed.

August 26th, 2015

A bunch of people have asked me to comment on D-Wave’s release of its 1000-qubit processor, and a paper by a group including Cathy McGeoch saying that the machine is 1 or 2 orders of faster (in annealing time, not wall-clock time) than simulated annealing running on a single-core classical computer.  It’s even been suggested that the “Scott-signal” has been shining brightly for a week above Quantham City, but that Scott-man has been too lazy and out-of-shape even to change into his tights.

Scientifically, it’s not clear if much has changed.  D-Wave now has a chip with twice as many qubits as the last one.  That chip continues to be pretty effective at finding its own low-energy states: indeed, depending on various details of definition, the machine can even find its own low-energy states “faster” than some implementation of simulated annealing running on a single-core chip.  Of course, it’s entirely possible that Matthias Troyer or Sergei Isakov or Troels Ronnow or someone like that will be able to find a better implementation of simulated annealing that closes even the modest gap—as happened the last time—but I’ll give the authors the benefit of the doubt that they put good-faith effort into optimizing the classical code.

More importantly, I’d say it remains unclear whether any of the machine’s performance on the instances tested here can be attributed to quantum tunneling effects.  In fact, the paper explicitly states (see page 3) that it’s not going to consider such questions, and I think the authors would agree that you could very well see results like theirs, even if what was going on was fundamentally classical annealing.  Also, of course, it’s still true that, if you wanted to solve a practical optimization problem, you’d first need to encode it into the Chimera graph, and that reduction entails a loss that could hand a decisive advantage to simulated annealing, even without the need to go to multiple cores.  (This is what I’ve described elsewhere as essentially all of these performance comparisons taking place on “the D-Wave machine’s home turf”: that is, on binary constraint satisfaction problems that have precisely the topology of D-Wave’s Chimera graph.)

But, I dunno, I’m just not feeling the urge to analyze this in more detail.  Part of the reason is that I think the press might be getting less hyper-excitable these days, thereby reducing the need for a Chief D-Wave Skeptic.  By this point, there may have been enough D-Wave announcements that papers realize they no longer need to cover each one like an extraterrestrial landing.  And there are more hats in the ring now, with John Martinis at Google seeking to build superconducting quantum annealing machines but with ~10,000x longer coherence times than D-Wave’s, and with IBM Research and some others also trying to scale superconducting QC.  The realization has set in, I think, that both D-Wave and the others are in this for the long haul, with D-Wave currently having lots of qubits, but with very short coherence times and unclear prospects for any quantum speedup, and Martinis and some others having qubits of far higher quality, but not yet able to couple enough of them.

The other issue is that, on my flight from Seoul back to Newark, I watched two recent kids’ movies that were almost defiant in their simple, unironic, 1950s-style messages of hope and optimism.  One was Disney’s new live-action Cinderella; the other was Brad Bird’s Tomorrowland.  And seeing these back-to-back filled me with such positivity and good will that, at least for these few hours, it’s hard to summon my usual crusty self.  I say, let’s invent the future together, and build flying cars and jetpacks in our garages!  Let a thousand creative ideas bloom for how to tackle climate change and the other crises facing civilization!  (Admittedly, mass-market flying cars and jetpacks are probably not a step forward on climate change … but, see, there’s that negativity coming back.)  And let another thousand ideas bloom for how to build scalable quantum computers—sure, including D-Wave’s!  Have courage and be kind!

So yeah, if readers would like to discuss the recent D-Wave paper further (especially those who know something about it), they’re more than welcome to do so in the comments section.  But I’ve been away from Dana and Lily for two weeks, and will endeavor to spend time with them rather than obsessively reloading the comments (let’s see if I succeed).

As a small token of my goodwill, I enclose two photos from my last visit to a D-Wave machine, which occurred when I met with some grad students in Waterloo this past spring.  As you can see, I even personally certified that the machine was operating as expected.  But more than that: surpassing all reasonable expectations for quantum AI, this model could actually converse intelligently, through a protruding head resembling that of IQC grad student Sarah Kaiser.

## 6-photon BosonSampling

August 19th, 2015

The news is more-or-less what the title says!

In Science, a group led by Anthony Laing at Bristol has now reported BosonSampling with 6 photons, beating their own previous record of 5 photons, as well as the earlier record of 4 photons achieved a few years ago by the Walmsley group at Oxford (as well as the 3-photon experiments done by groups around the world).  I only learned the big news from a commenter on this blog, after the paper was already published (protip: if you’ve pushed forward the BosonSampling frontier, feel free to shoot me an email about it).

As several people explain in the comments, the main advance in the paper is arguably not increasing the number of photons, but rather the fact that the device is completely reconfigurable: you can try hundreds of different unitary transformations with the same chip.  In addition, the 3-photon results have an unprecedentedly high fidelity (about 95%).

The 6-photon results are, of course, consistent with quantum mechanics: the transition amplitudes are indeed given by permanents of 6×6 complex matrices.  Key sentence:

After collecting 15 sixfold coincidence events, a confidence of P = 0.998 was determined that these are drawn from a quantum (not classical) distribution.

No one said scaling BosonSampling would be easy: I’m guessing that it took weeks of data-gathering to get those 15 coincidence events.  Scaling up further will probably require improvements to the sources.

There’s also a caveat: their initial state consisted of 2 modes with 3 photons each, as opposed to what we really want, which is 6 modes with 1 photon each.  (Likewise, in the Walmsley group’s 4-photon experiments, the initial state consisted of 2 modes with 2 photons each.)  If the number of modes stayed 2 forever, then the output distributions would remain easy to sample with a classical computer no matter how many photons we had, since we’d then get permanents of matrices with only 2 distinct rows.  So “scaling up” needs to mean increasing not only the number of photons, but also the number of sources.

Nevertheless, this is an obvious step forward, and it came sooner than I expected.  Huge congratulations to the authors on their accomplishment!

But you might ask: given that 6×6 permanents are still pretty easy for a classical computer (the more so when the matrices have only 2 distinct rows), why should anyone care?  Well, the new result has major implications for what I’ve always regarded as the central goal of quantum computing research, much more important than breaking RSA or Grover search or even quantum simulation: namely, getting Gil Kalai to admit he was wrong.  Gil is on record, repeatedly, on this blog as well as his (see for example here), as saying that he doesn’t think BosonSampling will ever be possible even with 7 or 8 photons.  I don’t know whether the 6-photon result is giving him second thoughts (or sixth thoughts?) about that prediction.

## Jacob Bekenstein (1947-2015)

August 18th, 2015

Today I learned the sad news that Jacob Bekenstein, one of the great theoretical physicists of our time, passed away at the too-early age of 68.

Everyone knows what a big deal it was when Stephen Hawking discovered in 1975 that black holes radiate.  Bekenstein was the guy who, as a grad student in Princeton in the early 1970s, was already raving about black holes having nonzero entropy and temperature, and satisfying the Second Law of Thermodynamics—something just about everyone, including Hawking, considered nuts at the time.  It was, as I understand it, Hawking’s failed attempt to prove Bekenstein wrong that led to Hawking’s discovery of the Hawking radiation, and thence to the modern picture of black holes.

In the decades since, Bekenstein continued to prove ingenious physical inequalities, often using thought experiments involving black holes.  The most famous of these, the Bekenstein bound, says that the number of bits that can be stored in any bounded physical system is finite, and is upper-bounded by ~2.6×1043 MR, where M is the system’s mass in kilograms and R is its radius in meters.  (This bound is saturated by black holes, and only by black holes, which therefore emerge as the most compact possible storage medium—though probably not the best for retrieval!)  Bekenstein’s lectures were models of clarity and rigor: at conferences full of audacious speculations, he stood out to my non-expert eyes as someone who was simply trying to follow chains of logic from accepted physical principles, however mind-bogglingly far those chains led but no further.

I first met Bekenstein in 2003, when I was a grad student spending a semester at Hebrew University in Jerusalem.  I was struck by the kindness he showed a 21-year-old nobody, who wasn’t even a physics student, coming to bother him.  Not only did he listen patiently to my blather about applying computational complexity to physics, he said that of course physics should ultimately aim to understand everything as the output of some computer program, that he too was thinking in terms of computation when he studied black-hole entropy.  I remember pondering the fact that the greatest reductionist I’d ever met was wearing a yarmulke—and then scolding myself for wasting precious brain-cycles on such a trivial thought when there was science to discuss.  I met Bekenstein maybe four or five more times on visits to Israel, most recently a year and a half ago, when we shared walks to and from the hotel at a firewall workshop at the Weizmann Institute.  He was unfailingly warm, modest, and generous—totally devoid of the egotism that I’ve heard can occasionally afflict people of his stature.  Now, much like with the qubits hitting the event horizon, the information that comprised Jacob Bekenstein might seem to be gone, but it remains woven into the cosmos.