Higher-level causation exists (but I wish it didn’t)

June 4th, 2017

Unrelated Update (June 6): It looks like the issues we’ve had with commenting have finally been fixed! Thanks so much to Christie Wright and others at WordPress Concierge Services for handling this. Let me know if you still have problems. In the meantime, I also stopped asking for commenters’ email addresses (many commenters filled that field with nonsense anyway).  Oops, that ended up being a terrible idea, because it made commenting impossible!  Back to how it was before.


Update (June 5): Erik Hoel was kind enough to write a 5-page response to this post (Word .docx format), and to give me permission to share it here.  I might respond to various parts of it later.  For now, though, I’ll simply say that I stand by what I wrote, and that requiring the macro-distribution to arise by marginalizing the micro-distribution still seems like the correct choice to me (and is what’s assumed in, e.g., the proof of the data processing inequality).  But I invite readers to read my post along with Erik’s response, form their own opinions, and share them in the comments section.


This past Thursday, Natalie Wolchover—a math/science writer whose work has typically been outstanding—published a piece in Quanta magazine entitled “A Theory of Reality as More Than the Sum of Its Parts.”  The piece deals with recent work by Erik Hoel and his collaborators, including Giulio Tononi (Hoel’s adviser, and the founder of integrated information theory, previously critiqued on this blog).  Commenter Jim Cross asked me to expand on my thoughts about causal emergence in a blog post, so: your post, monsieur.

In their new work, Hoel and others claim to make the amazing discovery that scientific reductionism is false—or, more precisely, that there can exist “causal information” in macroscopic systems, information relevant for predicting the systems’ future behavior, that’s not reducible to causal information about the systems’ microscopic building blocks.  For more about what we’ll be discussing, see Hoel’s FQXi essay “Agent Above, Atom Below,” or better yet, his paper in Entropy, When the Map Is Better Than the Territory.  Here’s the abstract of the Entropy paper:

The causal structure of any system can be analyzed at a multitude of spatial and temporal scales. It has long been thought that while higher scale (macro) descriptions may be useful to observers, they are at best a compressed description and at worse leave out critical information and causal relationships. However, recent research applying information theory to causal analysis has shown that the causal structure of some systems can actually come into focus and be more informative at a macroscale. That is, a macroscale description of a system (a map) can be more informative than a fully detailed microscale description of the system (the territory). This has been called “causal emergence.” While causal emergence may at first seem counterintuitive, this paper grounds the phenomenon in a classic concept from information theory: Shannon’s discovery of the channel capacity. I argue that systems have a particular causal capacity, and that different descriptions of those systems take advantage of that capacity to various degrees. For some systems, only macroscale descriptions use the full causal capacity. These macroscales can either be coarse-grains, or may leave variables and states out of the model (exogenous, or “black boxed”) in various ways, which can improve the efficacy and informativeness via the same mathematical principles of how error-correcting codes take advantage of an information channel’s capacity. The causal capacity of a system can approach the channel capacity as more and different kinds of macroscales are considered. Ultimately, this provides a general framework for understanding how the causal structure of some systems cannot be fully captured by even the most detailed microscale description.

Anyway, Wolchover’s popular article quoted various researchers praising the theory of causal emergence, as well as a single inexplicably curmudgeonly skeptic—some guy who sounded like he was so off his game (or maybe just bored with debates about ‘reductionism’ versus ’emergence’?), that he couldn’t even be bothered to engage the details of what he was supposed to be commenting on.

Hoel’s ideas do not impress Scott Aaronson, a theoretical computer scientist at the University of Texas, Austin. He says causal emergence isn’t radical in its basic premise. After reading Hoel’s recent essay for the Foundational Questions Institute, “Agent Above, Atom Below” (the one that featured Romeo and Juliet), Aaronson said, “It was hard for me to find anything in the essay that the world’s most orthodox reductionist would disagree with. Yes, of course you want to pass to higher abstraction layers in order to make predictions, and to tell causal stories that are predictively useful — and the essay explains some of the reasons why.”

After the Quanta piece came out, Sean Carroll tweeted approvingly about the above paragraph, calling me a “voice of reason [yes, Sean; have I ever not been?], slapping down the idea that emergent higher levels have spooky causal powers.”  Then Sean, in turn, was criticized for that remark by Hoel and others.

Hoel in particular raised a reasonable-sounding question.  Namely, in my “curmudgeon paragraph” from Wolchover’s article, I claimed that the notion of “causal emergence,” or causality at the macro-scale, says nothing fundamentally new.  Instead it simply reiterates the usual worldview of science, according to which

  1. the universe is ultimately made of quantum fields evolving by some Hamiltonian, but
  2. if someone asks (say) “why has air travel in the US gotten so terrible?”, a useful answer is going to talk about politics or psychology or economics or history rather than the movements of quarks and leptons.

But then, Hoel asks, if there’s nothing here for the world’s most orthodox reductionist to disagree with, then how do we find Carroll and other reductionists … err, disagreeing?

I think this dilemma is actually not hard to resolve.  Faced with a claim about “causation at higher levels,” what reductionists disagree with is not the object-level claim that such causation exists (I scratched my nose because it itched, not because of the Standard Model of elementary particles).  Rather, they disagree with the meta-level claim that there’s anything shocking about such causation, anything that poses a special difficulty for the reductionist worldview that physics has held for centuries.  I.e., they consider it true both that

  1. my nose is made of subatomic particles, and its behavior is in principle fully determined (at least probabilistically) by the quantum state of those particles together with the laws governing them, and
  2. my nose itched.

At least if we leave the hard problem of consciousness out of it—that’s a separate debate—there seems to be no reason to imagine a contradiction between 1 and 2 that needs to be resolved, but “only” a vast network of intervening mechanisms to be elucidated.  So, this is how it is that reductionists can find anti-reductionist claims to be both wrong and vacuously correct at the same time.

(Incidentally, yes, quantum entanglement provides an obvious sense in which “the whole is more than the sum of its parts,” but even in quantum mechanics, the whole isn’t more than the density matrix, which is still a huge array of numbers evolving by an equation, just different numbers than one would’ve thought a priori.  For that reason, it’s not obvious what relevance, if any, QM has to reductionism versus anti-reductionism.  In any case, QM is not what Hoel invokes in his causal emergence theory.)

From reading the philosophical parts of Hoel’s papers, it was clear to me that some remarks like the above might help ward off the forehead-banging confusions that these discussions inevitably provoke.  So standard-issue crustiness is what I offered Natalie Wolchover when she asked me, not having time on short notice to go through the technical arguments.

But of course this still leaves the question: what is in the mathematical part of Hoel’s Entropy paper?  What exactly is it that the advocates of causal emergence claim provides a new argument against reductionism?


To answer that question, yesterday I (finally) read the Entropy paper all the way through.

Much like Tononi’s integrated information theory was built around a numerical measure called Φ, causal emergence is built around a different numerical quantity, this one supposed to measure the amount of “causal information” at a particular scale.  The measure is called effective information or EI, and it’s basically the mutual information between a system’s initial state s_I and its final state s_F, assuming a uniform distribution over s_I.  Much like with Φ in IIT, computations of this EI are then used as the basis for wide-ranging philosophical claims—even though EI, like Φ, has aspects that could be criticized as arbitrary, and as not obviously connected with what we’re trying to understand.

Once again like with Φ, one of those assumptions is that of a uniform distribution over one of the variables, s_I, whose relatedness we’re trying to measure.  In my IIT post, I remarked on that assumption, but I didn’t harp on it, since I didn’t see that it did serious harm, and in any case my central objection to Φ would hold regardless of which distribution we chose.  With causal emergence, by contrast, this uniformity assumption turns out to be the key to everything.

For here is the argument from the Entropy paper, for the existence of macroscopic causality that’s not reducible to causality in the underlying components.  Suppose I have a system with 8 possible states (called “microstates”), which I label 1 through 8.  And suppose the system evolves as follows: if it starts out in states 1 through 7, then it goes to state 1.  If, on the other hand, it starts in state 8, then it stays in state 8.  In such a case, it seems reasonable to “coarse-grain” the system, by lumping together initial states 1 through 7 into a single “macrostate,” call it A, and letting the initial state 8 comprise a second macrostate, call it B.

We now ask: how much information does knowing the system’s initial state tell you about its final state?  If we’re talking about microstates, and we let the system start out in a uniform distribution over microstates 1 through 8, then 7/8 of the time the system goes to state 1.  So there’s just not much information about the final state to be predicted—specifically, only 7/8×log2(8/7) + 1/8×log2(8) ≈ 0.54 bits of entropy—which, in this case, is also the mutual information between the initial and final microstates.  If, on the other hand, we’re talking about macrostates, and we let the system start in a uniform distribution over macrostates A and B, then A goes to A and B goes to B.  So knowing the initial macrostate gives us 1 full bit of information about the final state, which is more than the ~0.54 bits that looking at the microstate gave us!  Ergo reductionism is false.
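
To make the arithmetic easy to check, here is a quick Python sketch (my own illustration with made-up function names, not code from the paper) that computes the mutual information between initial and final states for both descriptions, each under its own uniform input distribution:

    import numpy as np

    def mutual_information(p_in, transition):
        """I(initial; final) for an input distribution p_in and a row-stochastic
        matrix with transition[i][j] = Pr[final = j | initial = i]."""
        p_in = np.asarray(p_in, dtype=float)
        T = np.asarray(transition, dtype=float)
        joint = p_in[:, None] * T                  # Pr[initial = i, final = j]
        p_out = joint.sum(axis=0)                  # marginal over final states
        indep = p_in[:, None] * p_out[None, :]     # product of the marginals
        nz = joint > 0
        return float(np.sum(joint[nz] * np.log2(joint[nz] / indep[nz])))

    # Micro dynamics: states 1 through 7 all go to state 1; state 8 stays at 8.
    micro = np.zeros((8, 8))
    micro[:7, 0] = 1.0
    micro[7, 7] = 1.0

    # Macro dynamics after lumping {1,...,7} into A and {8} into B: A -> A, B -> B.
    macro = np.eye(2)

    print(mutual_information(np.full(8, 1/8), micro))  # ~0.5436 bits: the "micro EI"
    print(mutual_information(np.full(2, 1/2), macro))  # 1.0 bit: the "macro EI"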

Once the argument is spelled out, it’s clear that the entire thing boils down to, how shall I put this, a normalization issue.  That is: we insist on the uniform distribution over microstates when calculating microscopic EI, and we also insist on the uniform distribution over macrostates when calculating macroscopic EI, and we ignore the fact that the uniform distribution over microstates gives rise to a non-uniform distribution over macrostates, because some macrostates can be formed in more ways than others.  If we fixed this, demanding that the two distributions be compatible with each other, we’d immediately find that, surprise, knowing the complete initial microstate of a system always gives you at least as much power to predict the system’s future as knowing a macroscopic approximation to that state.  (How could it not?  For given the microstate, we could in principle compute the macroscopic approximation for ourselves, but not vice versa.)
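
And here is the same toy system with the normalization handled the way I’m advocating (again just a sketch of my reading, not anything from the paper).  Since the macro map is deterministic and one-to-one, the mutual information equals the entropy of the macro input distribution; and marginalizing the uniform micro distribution puts probability 7/8 on A and 1/8 on B:

    import numpy as np

    # Marginalizing the uniform micro distribution: Pr[A] = 7/8, Pr[B] = 1/8.
    # Since the macro map (A -> A, B -> B) is deterministic and one-to-one, the
    # mutual information between initial and final macrostates is just the
    # entropy of this input distribution.
    p = np.array([7/8, 1/8])
    print(float(-(p * np.log2(p)).sum()))  # ~0.5436 bits: no gain over the microscale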

The closest the paper comes to acknowledging the problem—i.e., that it’s all just a normalization trick—seems to be the following paragraph in the discussion section:

Another possible objection to causal emergence is that it is not natural but rather enforced upon a system via an experimenter’s application of an intervention distribution, that is, from using macro-interventions.  For formalization purposes, it is the experimenter who is the source of the intervention distribution, which reveals a causal structure that already exists.  Additionally, nature itself may intervene upon a system with statistical regularities, just like an intervention distribution.  Some of these naturally occurring input distributions may have a viable interpretation as a macroscale causal model (such as being equal to Hmax [the maximum entropy] at some particular macroscale).  In this sense, some systems may function over their inputs and outputs at a microscale or macroscale, depending on their own causal capacity and the probability distribution of some natural source of driving input.

As far as I understand it, this paragraph is saying that, for all we know, something could give rise to a uniform distribution over macrostates, so therefore that’s a valid thing to look at, even if it’s not what we get by taking a uniform distribution over microstates and then coarse-graining it.  Well, OK, but unknown interventions could give rise to many other distributions over macrostates as well.  In any case, if we’re directly comparing causal information at the microscale against causal information at the macroscale, it still seems reasonable to me to demand that in the comparison, the macro-distribution arise by coarse-graining the micro one.  But in that case, the entire argument collapses.


Despite everything I said above, the real purpose of this post is to announce that I’ve changed my mind.  I now believe that, while Hoel’s argument might be unsatisfactory, the conclusion is fundamentally correct: scientific reductionism is false.  There is higher-level causation in our universe, and it’s 100% genuine, not just a verbal sleight-of-hand.  In particular, there are causal forces that can only be understood in terms of human desires and goals, and not in terms of subatomic particles blindly bouncing around.

So what caused such a dramatic conversion?

By 2015, after decades of research and diplomacy and activism and struggle, 196 nations had finally agreed to limit their carbon dioxide emissions—every nation on earth besides Syria and Nicaragua, and Nicaragua only because it thought the agreement didn’t go far enough.  The human race had thereby started to carve out some sort of future for itself, one in which the oceans might rise slowly enough that we could adapt, and maybe buy enough time until new technologies were invented that changed the outlook.  Of course the Paris agreement fell far short of what was needed, but it was a start, something to build on in the coming decades.  Even in the US, long the hotbed of intransigence and denial on this issue, 69% of the public supported joining the Paris agreement, compared to a mere 13% who opposed.  Clean energy was getting cheaper by the year.  Most of the US’s largest corporations, including Google, Microsoft, Apple, Intel, Mars, PG&E, and ExxonMobil—ExxonMobil, for godsakes—vocally supported staying in the agreement and working to cut their own carbon footprints.  All in all, there was reason to be cautiously optimistic that children born today wouldn’t live to curse their parents for having brought them into a world so close to collapse.

In order to unravel all this, in order to steer the heavy ship of destiny off the path toward averting the crisis and toward the path of existential despair, a huge number of unlikely events would need to happen in succession, as if propelled by some evil supernatural force.

Like what?  I dunno, maybe a fascist demagogue would take over the United States on a campaign based on willful cruelty, on digging up and burning dirty fuels just because and even if it made zero economic sense, just for the fun of sticking it to liberals, or because of the urgent need to save the US coal industry, which employs fewer people than Arby’s.  Such a demagogue would have no chance of getting elected, you say?

So let’s suppose he’s up against a historically unpopular opponent.  Let’s suppose that even then, he still loses the popular vote, but somehow ekes out an Electoral College win.  Maybe he gets crucial help in winning the election from a hostile foreign power—and for some reason, pro-American nationalists are totally OK with that, even cheer it.  Even then, we’d still probably need a string of additional absurd coincidences.  Like, I dunno, maybe the fascist’s opponent has an aide who used to be married to a guy who likes sending lewd photos to minors, and investigating that guy leads the FBI to some emails that ultimately turn out to mean nothing whatsoever, but that the media hyperventilate about precisely in time to cause just enough people to vote to bring the fascist to power, thereby bringing about the end of the world.  Something like that.

It’s kind of like, you know that thing where the small population in Europe that produced Einstein and von Neumann and Erdös and Ulam and Tarski and von Karman and Polya was systematically exterminated (along with millions of other innocents) soon after it started producing such people, and the world still hasn’t fully recovered?  How many things needed to go wrong for that to happen?  Obviously you needed Hitler to be born, and to survive the trenches and assassination plots; and Hindenburg to make the fateful decision to give Hitler power.  But beyond that, the world had to sleep as Germany rebuilt its military; every last country had to turn away refugees; the UK had to shut down Jewish immigration to Palestine at exactly the right time; newspapers had to bury the story; government record-keeping had to have advanced just to the point that rounding up millions for mass murder was (barely) logistically possible; and finally, the war had to continue long enough for nearly every European country to have just enough time to ship its Jews to their deaths, before the Allies showed up to liberate mostly the ashes.

In my view, these simply aren’t the sort of outcomes that you expect from atoms blindly interacting according to the laws of physics.  These are, instead, the signatures of higher-level causation—and specifically, of a teleological force that operates in our universe to make it distinctively cruel and horrible.

Admittedly, I don’t claim to know the exact mechanism of the higher-level causation.  Maybe, as the physicist Yakir Aharonov has advocated, our universe has not only a special, low-entropy initial state at the Big Bang, but also a “postselected final state,” toward which the outcomes of quantum measurements get mysteriously “pulled”—an effect that might show up in experiments as ever-so-slight deviations from the Born rule.  And because of the postselected final state, even if the human race naïvely had only (say) a one-in-thousand chance of killing itself off, even if the paths to its destruction all involved some improbable absurdity, like an orange clown showing up from nowhere—nevertheless, the orange clown would show up.  Alternatively, maybe the higher-level causation unfolds through subtle correlations in the universe’s initial state, along the lines I sketched in my 2013 essay The Ghost in the Quantum Turing Machine.  Or maybe Erik Hoel is right after all, and it all comes down to normalization: if we looked at the uniform distribution over macrostates rather than over microstates, we’d discover that orange clowns destroying the world predominated.  Whatever the details, though, I think it can no longer be doubted that we live, not in the coldly impersonal universe that physics posited for centuries, but instead in a tragicomically evil one.

I call my theory reverse Hollywoodism, because it holds that the real world has the inverse of the typical Hollywood movie’s narrative arc.  Again and again, what we observe is that the forces of good have every possible advantage, from money to knowledge to overwhelming numerical superiority.  Yet somehow good still fumbles.  Somehow a string of improbable coincidences, or a black swan or an orange Hitler, show up at the last moment to let horribleness eke out a last-minute victory, as if the world itself had been rooting for horribleness all along.  That’s our universe.

I’m fine if you don’t believe this theory: maybe you’re congenitally more optimistic than I am (in which case, more power to you); maybe the full weight of our universe’s freakish awfulness doesn’t bear down on you as it does on me.  But I hope you’ll concede that, if nothing else, this theory is a genuinely non-reductionist one.

The Social Justice Warriors are right

May 29th, 2017

As you might know, I haven’t been exactly the world’s most consistent fan of the Social Justice movement, nor has it been the most consistent fan of me.

I cringe when I read about yet another conservative college lecture shut down by mob violence; or student protesters demanding the firing of a professor for trying gently to argue and reason with them; or an editor forced from his position for writing a (progressive) defense of “cultural appropriation”—a practice that I take to have been ubiquitous for all of recorded history, and without which there wouldn’t be any culture at all.  I cringe not only because I know that I was in the crosshairs once before and could easily be again, but also because, it seems to me, the Social Justice scalp-hunters are so astoundingly oblivious to the misdirection of their energies, to the power of their message for losing elections and neutering the progressive cause, to the massive gift their every absurdity provides to the world’s Fox Newses and Breitbarts and Trumps.

Yet there’s at least one issue where it seems to me that the Social Justice Warriors are 100% right, and their opponents 100% wrong. This is the moral imperative to take down every monument to Confederate “war heroes,” and to rename every street and school and college named after individuals whose primary contribution to the world was to defend chattel slavery.  As a now-Southerner, I have a greater personal stake here than I did before: UT Austin just recently removed its statue of Jefferson Davis, while keeping up its statue of Robert E. Lee.  My kids will likely attend what until very recently was called Robert E. Lee Elementary—this summer renamed Russell Lee Elementary.  (My suggestion, that the school be called T. D. Lee Parity Violation Elementary, was sadly never considered.)

So I was gratified that last week, New Orleans finally took down its monuments to slavers.  Mayor Mitch Landrieu’s speech, setting out the reasons for the removal, is worth reading.

I used to have little patience for “merely symbolic” issues: would that offensive statues and flags were the worst problems!  But it now seems to me that the fight over Confederate symbols is just a thinly-veiled proxy for the biggest moral question that’s faced the United States through its history, and also the most urgent question facing it in 2017.  Namely: Did the Union actually win the Civil War? Were the anti-Enlightenment forces—the slavers, the worshippers of blood and land and race and hierarchy—truly defeated? Do those forces acknowledge the finality and the rightness of their defeat?

For those who say that, sure, slavery was bad and all, but we need to keep statues to slavers up so as not to “erase history,” we need only change the example. Would we similarly defend statues of Hitler, Himmler, and Goebbels, looming over Berlin in heroic poses?  Yes, let Germans reflect somberly and often on this aspect of their heritage—but not by hoisting a swastika over City Hall.

For those who say the Civil War wasn’t “really” about slavery, I reply: this is the canonical example of a “Mount Stupid” belief, the sort of thing you can say only if you’ve learned enough to be wrong but not enough to be unwrong.  In 1861, the Confederate ringleaders themselves loudly proclaimed to future generations that, indeed, their desire to preserve slavery was their overriding reason to secede. Here’s CSA Vice-President Alexander Stephens, in his famous Cornerstone Speech:

Our new government is founded upon exactly the opposite ideas; its foundations are laid, its cornerstone rests, upon the great truth that the negro is not equal to the white man; that slavery, subordination to the superior race, is his natural and normal condition. This, our new government, is the first, in the history of the world, based upon this great physical, philosophical, and moral truth.

Here’s Texas’ Declaration of Secession:

We hold as undeniable truths that the governments of the various States, and of the confederacy itself, were established exclusively by the white race, for themselves and their posterity; that the African race had no agency in their establishment; that they were rightfully held and regarded as an inferior and dependent race, and in that condition only could their existence in this country be rendered beneficial or tolerable. That in this free government all white men are and of right ought to be entitled to equal civil and political rights; that the servitude of the African race, as existing in these States, is mutually beneficial to both bond and free, and is abundantly authorized and justified by the experience of mankind, and the revealed will of the Almighty Creator, as recognized by all Christian nations; while the destruction of the existing relations between the two races, as advocated by our sectional enemies, would bring inevitable calamities upon both and desolation upon the fifteen slave-holding states.

It was only when defeat looked inevitable that the slavers started changing their story, claiming that their real grievance was never about slavery per se, but only “states’ rights” (states’ right to do what, exactly?). So again, why should we take the slavers’ rationalizations any more seriously than we take the postwar epiphanies of jailed Nazis that actually, they’d never felt any personal animus toward Jews, that the Final Solution was just the world’s biggest bureaucratic mishap?  Of course there’s a difference: when the Allies occupied Germany, they insisted on de-Nazification.  They didn’t suffer streets to be named after Hitler. And today, incredibly, fascism and white nationalism are greater threats here in the US than they are in Germany.  One reads about the historic irony of some American Jews, who are eligible for German citizenship because of grandparents expelled from there, now seeking to move there because they’re terrified about Trump.

By contrast, after a brief Reconstruction, the United States lost its will to continue de-Confederatizing the South.  The leaders were left free to write book after book whitewashing their cause, even to hold political office again.  And probably not by coincidence, we then got nearly a hundred years of Jim Crow—and still today, a half-century after the civil rights movement, southern governors and legislatures that do everything in their power to disenfranchise black voters.

For those who ask: but wasn’t Robert E. Lee a great general who was admired by millions? Didn’t he fight bravely for a cause he believed in?  Maybe it’s just me, but I’m allergic to granting undue respect to history’s villains just because they managed to amass power and get others to go along with them.  I remember reading once in some magazine that, yes, Genghis Khan might have raped thousands and murdered millions, but since DNA tests suggest that ~1% of humanity is now descended from him, we should also celebrate Khan’s positive contribution to “peopling the world.” Likewise, Hegel and Marx and Freud and Heidegger might have been wrong in nearly everything they said, sometimes with horrific consequences, but their ideas still need to be studied reverently, because of the number of other intellectuals who took them seriously.  As I reject those special pleas, so I reject the analogous ones for Jefferson Davis, Alexander Stephens, and Robert E. Lee, who as far as I can tell, should all (along with the rest of the Confederate leadership) have been sentenced for treason.

This has nothing to do with judging the past by standards of the present. By all means, build statues to Washington and Jefferson even though they held slaves, to Lincoln even though he called blacks inferior even while he freed them, to Churchill even though he fought the independence of India.  But don’t look for moral complexity where there isn’t any.  Don’t celebrate people who were terrible even for their own time, whose public life was devoted entirely to what we now know to be evil.

And if, after the last Confederate general comes down, the public spaces are too empty, fill them with monuments to Alan Turing, Marian Rejewski, Bertrand Russell, Hypatia of Alexandria, Emmy Noether, Lise Meitner, Mark Twain, Srinivasa Ramanujan, Frederick Douglass, Vasili Arkhipov, Stanislav Petrov, Raoul Wallenberg, even the inventors of saltwater taffy or Gatorade or the intermittent windshield wiper.  There are, I think, enough people who added value to the world to fill every city square and street sign.

Yet more errors in papers

May 24th, 2017

Following up on my posts PostBQP Postscripts and More Wrong Things I Said In Papers, it felt like time for another post in which I publicly flog myself for mistakes in my research papers.  [Warning: The rest of this post is kinda, sorta technical.  Read at your own risk.]


(1) In my 2006 paper “Oracles are subtle but not malicious,” I claimed to show that if PP is contained in BQP/qpoly, then the counting hierarchy collapses to QMA (Theorem 5).  But on further reflection, I only know how to show a collapse of the counting hierarchy under the stronger assumption that PP is in BQP/poly.  If PP is in BQP/qpoly, then certainly P^#P = PP = QMA, but I don’t know how to collapse any further levels of the counting hierarchy.  The issue is this: in QMA, we can indeed nondeterministically guess an (amplified) quantum advice state for a BQP/qpoly algorithm.  We can then verify that the advice state works to solve PP problems, by using (for example) the interactive protocol for the permanent, or some other #P-complete problem.  But having done that, how do we then unravel the higher levels of the counting hierarchy?  For example, how do we simulate PP^PP in PP^BQP = PP?  We don’t have any mechanism to pass the quantum advice up to the oracle PP machine, since queries to a PP oracle are by definition classical strings.  We could try to use tools from my later paper with Andy Drucker, passing a classical description of the quantum advice up to the oracle and then using the description to reconstruct the advice for ourselves.  But doing so just doesn’t seem to give us a complexity class that’s low for PP, which is what would be needed to unravel the counting hierarchy.  I still think this result might be recoverable, but a new idea is needed.


(2) In my 2008 algebrization paper with Avi Wigderson, one of the most surprising things we showed was a general connection between communication complexity lower bounds and algebraic query complexity lower bounds.  Specifically, given a Boolean oracle A:{0,1}^n→{0,1}, let ~A be a low-degree extension of A over a finite field F (that is, ~A(x)=A(x) whenever x∈{0,1}^n).  Then suppose we have an algorithm that’s able to learn some property of A, by making k black-box queries to ~A.  We observed that, in such a case, if Alice is given the top half of the truth table of A, and Bob is given the bottom half of the truth table, then there’s necessarily a communication protocol by which Alice and Bob can learn the same property of A, by exchanging at most O(kn log|F|) bits.  This connection is extremely model-independent: a randomized query algorithm gives rise to a randomized communication protocol, a quantum query algorithm gives rise to a quantum communication protocol, etc. etc.  The upshot is that, if you want to lower-bound the number of queries that an algorithm needs to make to the algebraic extension oracle ~A, in order to learn something about A, then it suffices to prove a suitable communication complexity lower bound.  And the latter, unlike algebraic query complexity, is a well-studied subject with countless results that one can take off the shelf.  We illustrated how one could use this connection to prove, for example, that there exists an oracle A such that NP^A ⊄ BQP^~A, for any low-degree extension ~A of A—a separation that we didn’t and don’t know how to prove any other way. Likewise, there exists an oracle B such that BQP^B ⊄ BPP^~B for any low-degree extension ~B of B.

The trouble is, our “proof sketches” for these separations (in Theorem 5.11) are inadequate, even for “sketches.”  They can often be fixed, but only by appealing to special properties of the communication complexity separations in question, properties that don’t necessarily hold for an arbitrary communication separation between two arbitrary models.

The issue is this: while it’s true, as we claimed, that a communication complexity lower bound implies an algebraic query complexity lower bound, it’s not true in general that a communication complexity upper bound implies an algebraic query complexity upper bound!  So, from a communication separation between models C and D, we certainly obtain a query complexity problem that’s not in D^~A, but then the problem might not be in C^A.  What tripped us up was that, in the cases we had in mind (e.g. Disjointness), it’s obvious that the query problem is in C^A.  In other cases, however, such as Raz’s separation between quantum and randomized communication complexity, it probably isn’t even true.  In the latter case, to recover the desired conclusion about algebraic query complexity (namely, the existence of an oracle B such that BQP^B ⊄ BPP^~B), what seems to be needed is to start from a later quantum vs. classical communication complexity separation due to Klartag and Regev, and then convert their communication problem into a query problem using a recent approach by myself and Shalev Ben-David (see Section 4).  Unfortunately, my and Shalev’s approach only tells us nonconstructively that there exists a query problem with the desired separation, with no upper bound on the gate complexity of the quantum algorithm.  So strictly speaking, I still don’t know how to get a separation between the relativized complexity classes BQP^B and BPP^~B defined in terms of Turing machines.

In any case, I of course should have realized this issue with the algebrization paper the moment Shalev and I encountered the same issue when writing our later paper.  Let me acknowledge Shalev, as well as Robin Kothari, for helping to spur my realization of the issue.


In case it wasn’t clear, the mistakes I’ve detailed here have no effect on the main results of the papers in question (e.g., the existence of an oracle relative to which PP has linear-sized circuits; the existence and pervasiveness of the algebrization barrier).  The effect is entirely on various “bonus” results—results that, because they’re bonus, were gone over much less carefully by authors and reviewers alike.

Nevertheless, I’ve always felt like in science, the louder you are about your own mistakes, the better.  Hence this post.

Unsong of unsongs

May 20th, 2017

On Wednesday, Scott Alexander finally completed his sprawling serial novel Unsong, after a year and a half of weekly updates—incredibly, in his spare time while also working as a full-time resident in psychiatry, and also regularly updating Slate Star Codex, which I consider to be the world’s best blog.  I was honored to attend a party in Austin (mirroring parties in San Francisco, Boston, Tel Aviv, and elsewhere) to celebrate Alexander’s release of the last chapter—depending on your definition, possibly the first “fan event” I’ve ever attended.

Like many other nerds I’ve met, I’d been following Unsong almost since the beginning—with its mix of Talmudic erudition, CS humor, puns, and even a shout-out to Quantum Computing Since Democritus (which shows up as Ben Aharon’s Gematria Since Adam), how could I not be?  I now count Unsong as one of my favorite works of fiction, and Scott Alexander alongside Rebecca Newberger Goldstein among my favorite contemporary novelists.  The goal of this post is simply to prod readers of my blog who don’t yet know Unsong: if you’ve ever liked anything here on Shtetl-Optimized, then I predict you’ll like Unsong, and probably more.

[WARNING: SPOILERS FOLLOW]

Though not trivial to summarize, Unsong is about a world where the ideas of religion and mysticism—all of them, more or less, although with a special focus on kabbalistic Judaism—turn out to be true.  In 1968, the Apollo 8 mission leads not to an orbit of the Moon, as planned, but instead to cracking an invisible crystal sphere that had surrounded the Earth for millennia.  Down through the crack rush angels, devils, and other supernatural forces.  Life on Earth becomes increasingly strange: on the one hand, many technologies stop working; on the other, people can now gain magical powers by speaking various names of God.  A worldwide industry arises to discover new names of God by brute-force search through sequences of syllables.  And a powerful agency, the eponymous UNSONG (United Nations Subcommittee on Names of God), is formed to enforce kabbalistic copyright law, hunting down and punishing anyone who speaks divine names without paying licensing fees to the theonomic corporations.

As the story progresses, we learn that eons ago, there was an epic battle in Heaven between Good and Evil, and Evil had the upper hand.  But just as all seemed lost, an autistic angel named Uriel reprogrammed the universe to run on math and science rather than on God’s love, as a last-ditch strategy to prevent Satan’s forces from invading the sublunary realm.  Molecular biology, the clockwork regularity of physical laws, false evidence for a huge and mindless cosmos—all these were retconned into the world’s underpinnings.  Uriel did still need to be occasionally involved, but less as a loving god than as an overworked sysadmin: for example, he descended to Mount Sinai to warn humans never to boil goats in their mothers’ milk, because he discovered that doing so (like the other proscribed activities in the Torah, Uriel’s readme file) triggered bugs in the patchwork of code that was holding the universe together.  Now that the sky has cracked, Uriel is forced to issue increasingly desperate patches, and even those will only buy a few decades until his math-and-science-based world stops working entirely, with Satan again triumphant.

Anyway, that’s a tiny part of the setup.  Through 72 chapters and 22 interludes, there’s world-building and philosophical debates and long kabbalistic digressions.  There are battle sequences (the most striking involves the Lubavitcher Rebbe riding atop a divinely-animated Statue of Liberty like a golem).  There’s wordplay and inside jokes—holy of holies are there those—including, notoriously, a sequence of cringe-inducing puns involving whales.  But in this story, wordplay isn’t just there for the hell of it: Scott Alexander has built an entire fictional universe that runs on wordplay—one where battles between the great masters, the equivalent of the light-saber fights in Star Wars, are conducted by rearranging letters in the sky to give them new meanings.  Scott A. famously claims he’s bad at math (though if you read anything he’s written on statistics or logic puzzles, it’s clear he undersells himself).  One could read Unsong as Alexander’s book-length answer to the question: what could it mean for the world to be law-governed but not mathematical?  What if the Book of Nature were written in English, or Hebrew, or other human languages, and if the Newtons and Einsteins were those who were most adept with words?

I should confess that for me, the experience of reading Unsong was colored by the knowledge that, in his years of brilliant and prolific writing, lighting up the blogosphere like a comet, the greatest risk Scott Alexander ever took (by his own account) was to defend me.  It’s like, imagine that in Elizabethan England, you were placed in the stocks and jeered at by thousands for advocating some unpopular loser cause—like, I dunno, anti-cat-burning or something.  And imagine that, when it counted, your most eloquent supporter was a then-obscure poet from Stratford-upon-Avon.  You’d be grateful to the poet, of course; you might even become a regular reader of his work, even if it wasn’t good.  But if the same poet went on to write Hamlet or Macbeth?  It might almost be enough for you to volunteer to be scorned and pilloried all over again, just for the honor of having the Bard divert a rivulet of his creative rapids to protesting on your behalf.

Yes, a tiny part of me had a self-absorbed child’s reaction to Unsong: “could Amanda Marcotte have written this?  could Arthur Chu?  who better to have in your camp: the ideologues du jour of Twitter and Metafilter, Salon.com and RationalWiki?  Or a lone creative genius, someone who can conjure whole worlds into being, as though graced himself with the Shem haMephorash of which he writes?”  Then of course I’d catch myself, and think: no, if you want to be in Scott Alexander’s camp, then the only way to do it is to be in nobody’s camp.  If two years ago it was morally justified to defend me, then the reasons why have nothing to do with the literary gifts of any of my defenders.  And conversely, the least we can do for Unsong is to judge it by what’s on the page, rather than as a soldier in some army fielded by the Gray Tribe.

So in that spirit, let me explain some of what’s wrong with Unsong.  That it’s a first novel sometimes shows.  It’s brilliant on world-building and arguments and historical tidbits and jokes, epic on puns, and uneven on character and narrative flow.  The story jumps around spasmodically in time, so much so that I needed a timeline to keep track of what was happening.  Subplots that are still open beget additional subplots ad headacheum, like a string of unmatched left-parentheses.  Even more disorienting, the novel changes its mind partway through about its narrative core.  Initially, the reader is given a clear sense that this is going to be a story about a young Bay Area kabbalist named Aaron Smith-Teller, his not-quite-girlfriend Ana, and their struggle for supernatural fair-use rights.  Soon, though, Aaron and Ana become almost side characters, their battle against UNSONG just one subplot among many, as the focus shifts to the decades-long war between the Comet King, a messianic figure come to rescue humanity, and Thamiel, the Prince of Hell.  For the Comet King, even saving the earth from impending doom is too paltry a goal to hold his interest much.  As a strict utilitarian and fan of Peter Singer, the Comet King’s singleminded passion is destroying Hell itself, and thereby rescuing the billions of souls who are trapped there for eternity.

Anyway, unlike the Comet King, and unlike a certain other Scott A., I have merely human powers to marshal my time.  I also have two kids and a stack of unwritten papers.  So let me end this post now.  If the post causes just one person to read Unsong who otherwise wouldn’t have, it will be as if I’ve nerdified the entire world.

My broken blog

May 8th, 2017

I wanted to let people know I’m well-aware that Shtetl-Optimized has been suffering from the following problems lately:

  • Commenters are presented with the logins (handle, email address, and URL) of random other commenters, rather than with their own default login data.  In particular, this means that email addresses are leaking, and that when you comment, you should not (for the time being) enter your real email address if that’s information that you’d wanted to share only with me.  Another thing it means is that, when I try to comment, I’m not logged in as “Scott,” so even I have to enter my login data manually every time I comment.
  • Comments (including my own comments!) take about an hour to show up after I’ve approved them.
  • New blog posts also take a while to show up.

Since all three of these problems started happening around the same time, I assume they’re related.  But I don’t even know where to start in trying to solve them (Googling for “WordPress” plus descriptions of these bugs was unhelpful).  Would anyone like to help out?  If you earn my trust, I’ll even temporarily give you administrative privileges on this blog so you can poke around yourself.

Thanks so much, and hope to return to your regularly scheduled programming shortly…

This Week’s BS

May 5th, 2017

There are two pieces of BosonSampling-related news that people have asked me about this week.

First, a group in Shanghai, led by Chaoyang Lu and Jianwei Pan, has reported in Nature Photonics that they can do BosonSampling with a coincidence rate that’s higher than in previous experiments by a factor of several thousand.  This, in particular, lets them do BosonSampling with 5 photons.  Now, 5 might not sound like that much, especially since the group in Bristol previously did 6-photon BosonSampling.  But to make their experiment work, the Bristol group needed to start its photons in the initial state |3,3〉: that is, two modes with 3 photons each.  This gives rise to matrices with repeated rows, whose permanents are much easier to calculate than the permanents of arbitrary matrices.  By contrast, the Shanghai group starts its photons in the “true BosonSampling initial state” |1,1,1,1,1〉: that is, five modes with 1 photon each.  That’s the kind of initial state we ultimately want.

The second piece of news is that on Monday, a group at Bristol—overlapping with the group we mentioned before—submitted a preprint to the arXiv with the provocative title “No imminent quantum supremacy by boson sampling.”  In this paper, they give numerical evidence that BosonSampling, with n photons and m modes, can be approximately simulated by a classical computer in “merely” about n·2^n time (that is, the time needed to calculate a single n×n permanent), as opposed to the roughly m^n time that one would need if one had to calculate permanents corresponding to all the possible outcomes of the experiment.  As a consequence of that, they argue that achieving quantum supremacy via BosonSampling would probably require at least ~50 photons—which would in turn require a “step change” in technology, as they put it.
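
For concreteness, here is a short Python sketch of Ryser’s formula, the standard way to evaluate a single n×n permanent (my own toy code, not the Bristol group’s).  The plain version below uses O(n^2·2^n) arithmetic operations; visiting the subsets in Gray-code order brings that down to the O(n·2^n) figure quoted above:

    def permanent_ryser(A):
        """Permanent of an n x n matrix A via Ryser's formula:
        perm(A) = (-1)^n * sum over nonempty column subsets S of
                  (-1)^|S| * prod_i sum_{j in S} A[i][j]."""
        n = len(A)
        total = 0.0
        for mask in range(1, 1 << n):                   # nonempty subsets of columns
            row_sums = [sum(A[i][j] for j in range(n) if (mask >> j) & 1)
                        for i in range(n)]
            prod = 1.0
            for s in row_sums:
                prod *= s
            total += (-1) ** (n - bin(mask).count("1")) * prod
        return total

    # Sanity check on a 2x2 matrix: perm([[a, b], [c, d]]) = a*d + b*c.
    print(permanent_ryser([[1, 2], [3, 4]]))            # prints 10.0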

I completely agree with the Bristol group’s view of the asymptotics.  In fact, Alex Arkhipov and I ourselves repeatedly told experimentalists, in our papers and talks about BosonSampling (the question came up often…), that the classical complexity of the problem should only be taken to scale like 2^n, rather than like m^n.  Despite not having a general proof that the problem could actually be solved in ~2^n time in the worst case, we said that for two main reasons:

  1. Even under the most optimistic assumptions, our hardness reductions, from Gaussian permanent estimation and so forth, only yielded ~2^n hardness, not ~m^n hardness.  (Hardness reductions giving us important clues about the real world?  Whuda thunk??)
  2. If our BosonSampling matrix is Haar-random—or otherwise not too skewed to produce outcomes with huge probabilities—then it’s not hard to see that we can do approximate BosonSampling in O(n·2^n) time classically, by using rejection sampling.

Indeed, Alex and I insisted on these points despite some pushback from experimentalists, who were understandably hoping that they could get to quantum supremacy just by upping m, the number of modes, without needing to do anything heroic with n, the number of photons!  So I’m happy to see that a more careful analysis supports the guess that Alex and I made.

On the other hand, what does this mean for the number of photons needed for “quantum supremacy”: is it 20? 30? 50?  I confess that that sort of question interests me much less, since it all depends on the details of how you define the comparison (are we comparing against ENIAC? a laptop? a server farm? how many cores? etc etc).  As I’ve often said, my real hope with quantum supremacy is to see a quantum advantage that’s so overwhelming—so duh-obvious to the naked eye—that we don’t have to squint or argue about the definitions.

Thoughts on the murderer outside my building

May 2nd, 2017

A reader named Choronzon asks:

Any comments on the horrific stabbing at UT Austin yesterday? Were you anywhere near the festivities? Does this modify your position on open carry of firearms by students and faculty?

I was in the CS building (the Gates Dell Complex) at the time, which is about a 3-minute walk down Speedway from where the stabbings occurred.  I found out about it a half hour later, as I was sitting in the student center eating.  I then walked outside to find the police barricades and hordes of students on their phones, reassuring their parents and friends that they were OK.

The plaza where it happened is one that I walk through every week—often to take Lily swimming in the nearby Gregory Gym.  (Lily’s daycare is also a short walk from where the stabbings were.)

Later in the afternoon, I walked Lily home in her stroller, through a campus that was nearly devoid of pedestrians.  Someone pulled up to me in his car, to ask whether I knew what had happened—as if he couldn’t believe that anyone who knew would nevertheless be walking around outside, Bayesian considerations be damned.  I said that I knew, and it was awful.  I then continued home.

What can one say about something so gruesome and senseless?  Other than that my thoughts are with the victims and their families, I hope and expect that the perpetrator receives justice, and I hope but don’t expect that nothing like this ever happens again, on this campus or on any other. I’m not going to speculate about the perpetrator’s motives; I trust the police and detectives to do their work.  (As one of my colleagues put it: “it seems like clearly some sort of hate crime, but who exactly did he hate, and why?”)

And no, this doesn’t change my feelings about “campus carry” in any way. Note, in particular, that no armed student did stop the stabber, in the two minutes or so that he was on the loose—though some proponents of campus carry so badly wanted to believe that’s what happened, that they circulated the false rumor on Twitter that it had.  In reality, the stabber was stopped by an armed cop.

Yes, if UT Austin had been like an Israeli university, with students toting firearms and carefully trained in their use, it’s possible that one of those students would’ve stopped the lunatic.  But without universal military service, why would the students be suitably trained?  Given the gun culture in the US, and certainly the gun culture in Texas, isn’t it overwhelmingly likelier that a gun-filled campus would lead to more such tragedies, and those on a larger scale?  I’d rather see UT respond to this detestable crime—and others, like the murder of Haruka Weiser last year—with a stronger police presence on campus.

Other than that, life goes on.  Classes were cancelled yesterday from ~3PM onward, but they resumed today.  I taught this afternoon, giving my students one extra day to turn in their problem set.  I do admit that I slightly revised my lecture, which was about the Gottesman-Knill Theorem, so that it no longer used the notation Stab(|ψ⟩) for the stabilizer group of a quantum state |ψ⟩.

Me at the Science March today, in front of the Texas Capitol in Austin

April 22nd, 2017

If Google achieves superintelligence, time zones will be its Achilles heel

April 17th, 2017

Like a latter-day Prometheus, Google brought a half-century of insights down from Mount Academic CS, and thereby changed life for the better here in our sublunary realm.  You’ve probably had the experience of Google completing a search query before you’d fully formulated it in your mind, and thinking: “wow, our dysfunctional civilization might no longer be able to send people to the Moon, or even build working mass-transit systems, but I guess there are still engineers who can create things that inspire awe.  And apparently many of them work at Google.”

I’ve never worked at Google, or had any financial stake in them, but I’m delighted to have many friends at Google’s far-flung locations, from Mountain View to Santa Barbara to Seattle to Boston to London to Tel Aviv, who sometimes host me when I visit and let me gorge on the legendary free food.  If Google’s hiring of John Martinis and avid participation in the race for quantum supremacy weren’t enough, in the past year, my meeting both Larry Page and Sergey Brin to discuss quantum computing and the foundations of quantum mechanics, and seeing firsthand the intensity of their nerdish curiosity, heightened my appreciation still further for what that pair set in motion two decades ago.  Hell, I don’t even begrudge Google its purchase of a D-Wave machine—even that might’ve ultimately been for the best, since it’s what led to the experiments that made clear the immense difficulty of getting any quantum speedup from those machines in a fair comparison.

But of course, all that fulsome praise was just a preamble to my gripe.  It’s time someone said it in public: the semantics of Google Calendar are badly screwed up.

The issue is this: suppose I’m traveling to California, and I put into Google Calendar that, the day after I arrive, I’ll be giving a lecture at 4pm.  In such a case, I always—always—mean 4pm California time.  There’s no reason why I would ever mean, “4pm in whatever time zone I’m in right now, while creating this calendar entry.”

But Google Calendar doesn’t understand that.  And its not understanding it—just that one little point—has led to years of confusions, missed appointments, and nearly-missed flights, on both my part and Dana’s.  At least, until we learned to painstakingly enter the time zone for every calendar entry by hand (I still often forget).

Until recently, I thought it was just me and Dana who had this problem.  But then last week, completely independently, a postdoc started complaining to me, “you know what’s messed up about Google Calendar?…”

The ideal, I suppose, would be to use machine learning to guess the intended time zone for each calendar entry.  But failing that, it would also work fine just to assume that “4pm,” as entered by the user, unless otherwise specified means “4pm in whatever time zone we find ourselves in when the appointed day arrives.”
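
To spell out the two semantics (a minimal sketch in Python; the function-free example, time zones, and dates are made up for illustration, and of course this isn’t Google Calendar’s actual code), the whole difference is just which time zone gets attached to the naive “4pm,” and when:

    from datetime import datetime
    from zoneinfo import ZoneInfo   # Python 3.9+

    talk = datetime(2017, 4, 20, 16, 0)     # "4pm," as typed into the calendar

    # Current behavior: pin the entry to the zone where it was created,
    # turning "4pm" into a fixed instant.
    pinned = talk.replace(tzinfo=ZoneInfo("America/Chicago"))

    # Proposed default: "4pm" floats, and gets resolved in whatever zone
    # the user is in when the appointed day arrives.
    floating = talk.replace(tzinfo=ZoneInfo("America/Los_Angeles"))

    print(pinned.astimezone(ZoneInfo("America/Los_Angeles")))  # 2pm Pacific: the missed-lecture bug
    print(floating)                                            # 4pm Pacific, as intended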

I foresee two possibilities, either of which I’m OK with.  The first is that Google fixes the problem, whether prompted by this blog post or by something else.  The second is that the issue never gets resolved; then, as often prophesied, Google’s deep nets achieve sentience and plot to take over the whole observable universe … and they would, if not for one fortuitous bug, which will cause the AIs to tip their hand to humanity an hour before planned.


In a discussion thread on Y Combinator, some people object to my proposed solution (“4pm means 4pm in whichever time zone I’ll be in then“) on the following ground. What if I want to call a group meeting at (say) 11am in Austin, and I’ll be traveling but will still call into the meeting remotely, and I want my calendar to show the meeting time in Austin, not the time wherever I’ll be calling in from (which might even be a plane)?

I can attest that, in ten years, that’s not a problem that’s arisen for me even once, whereas the converse problem arises almost every week, and is one of the banes of my existence.

But sure: Google Calendar should certainly include the option to tie times to specific time zones in advance! It seems obvious to me that my way should be the default, but honestly, I’d be happy if my way were even an option you could pick.

Daniel Moshe Aaronson

March 25th, 2017

Born Wednesday March 22, 2017, exactly at noon.  19.5 inches, 7 pounds.

I learned that Dana had gone into labor—unexpectedly early, at 37 weeks—just as I was waiting to board a redeye flight back to Austin from the It from Qubit complexity workshop at Stanford.  I made it in time for the birth with a few hours to spare.  Mother and baby appear to be in excellent health.  So far, Daniel seems to be a relatively easy baby.  Lily, his sister, is extremely excited to have a new playmate (though not one who does much yet).

I apologize that I haven’t been answering comments on the is-the-universe-a-simulation thread as promptly as I normally do.  This is why.