Can we reverse time to before this hypefest started?

March 15th, 2019

The purpose of this post is mostly just to signal-boost Konstantin Kakaes’s article in MIT Technology Review, entitled “No, scientists didn’t just ‘reverse time’ with a quantum computer.” The title pretty much says it all—but if you want more, you should read the piece, which includes the following droll quote from some guy calling himself “Director of the Quantum Information Center at the University of Texas at Austin”:

If you’re simulating a time-reversible process on your computer, then you can ‘reverse the direction of time’ by simply reversing the direction of your simulation. From a quick look at the paper, I confess that I didn’t understand how this becomes more profound if the simulation is being done on IBM’s quantum computer.

Incredibly, the time-reversal claim has now gotten uncritical attention in Newsweek, Discover, Cosmopolitan, my Facebook feed, and elsewhere—hence this blog post, which has basically no content except “the claim to have ‘reversed time,’ by running a simulation backwards, is exactly as true and as earth-shattering as a layperson might think it is.”
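To make the "running a simulation backwards" point concrete, here is a minimal sketch on an ordinary classical computer. It is only an illustration: the toy dynamics, the `force` law, and every function name are hypothetical, and none of it has anything to do with the experiment in the paper. Each step is an exactly invertible integer update, so undoing the steps in reverse order recovers the starting state bit for bit.

```python
# Illustrative toy model only: the "force" law and all names here are
# hypothetical, chosen so that every step is exactly invertible on integers.

def force(x):
    """A made-up integer force pulling x back toward 0."""
    return -(x // 8)

def step_forward(x, v):
    """One reversible update: kick the velocity, then move the position."""
    v = v + force(x)
    x = x + v
    return x, v

def step_backward(x, v):
    """Exact inverse of step_forward: undo the move, then undo the kick."""
    x = x - v
    v = v - force(x)
    return x, v

def run(step, state, n_steps):
    """Apply `step` to `state` n_steps times and return the final state."""
    for _ in range(n_steps):
        state = step(*state)
    return state

initial = (1000, 0)
later = run(step_forward, initial, 500)
recovered = run(step_backward, later, 500)
print(later, recovered == initial)
```

Nothing about the reversal depends on what `force` is; it works because each forward step has an explicit inverse.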

If there’s anything interesting here, I suppose it’s just that “scientists use a quantum computer to reverse time” is one of the purest examples I’ve ever seen of a scientific claim that basically amounts to a mind-virus or meme optimized for sharing on social media—discarding all nontrivial “science payload” as irrelevant to its propagation.

“Quantum Computing and the Meaning of Life”

March 13th, 2019

Manolis Kellis is a computational biologist at MIT, known as one of the leaders in applying big data to genomics and gene regulatory networks. Throughout my 9 years at MIT, Manolis was one of my best friends there, even though our research styles and interests might seem distant. He and I were in the same PECASE class; see if you can spot us both in this photo (in the rows behind America’s last sentient president). My and Manolis’s families also became close after we both got married and had kids. We still keep in touch.

Today Manolis will be celebrating his 42nd birthday, with a symposium on the meaning of life (!). He asked his friends and colleagues to contribute talks and videos reflecting on that weighty topic.

Here’s a 15-minute video interview that Manolis and I recorded last night, where he asks me to pontificate about the implications of quantum mechanics for consciousness and free will and whether the universe is a computer simulation—and also about, uh, how to balance blogging with work and family.

Also, here’s a 2-minute birthday video that I made for Manolis before I really understood what he wanted. Unlike the first video, this one has no academic content, but it does involve me wearing a cowboy hat and swinging a makeshift “lasso.”

Happy birthday Manolis!

Death of proof greatly exaggerated

March 7th, 2019

In 1993, the science writer John Horgan—who’s best known for his book The End of Science, and (of course) for interviewing me in 2016—wrote a now-(in)famous cover article for Scientific American entitled “The Death of Proof.” Mashing together a large number of (what I’d consider) basically separate trends and ideas, Horgan argued that math was undergoing a fundamental change, with traditional deductive proofs being replaced by a combination of non-rigorous numerical simulations, machine-generated proofs, probabilistic and probabilistically-checkable proofs, and proofs using graphics and video. Horgan also suggested that Andrew Wiles’s then-brand-new proof of Fermat’s Last Theorem—which might have looked, at first glance, like a spectacular counterexample to the “death of proof” thesis—could be the “last gasp of a dying culture” and a “splendid anachronism.” Apparently, “The Death of Proof” garnered one of the largest volumes of angry mail in Scientific American’s history, with mathematician after mathematician arguing that Horgan had strung together half-digested quotes and vignettes to manufacture a non-story.

Now Horgan—who you could variously describe as a wonderful sport, or a ham, or a sucker for punishment—has written a 26-year retrospective on his “death of proof” article. The prompt for this was Horgan’s recent discovery that, back in the 90s, David Hoffman and Hermann Karcher, two mathematicians annoyed by the “death of proof” article, had named a nonexistent mathematical object after its author. The so-called Horgan surface is a minimal surface that numerical computations strongly suggested should exist, but that can be rigorously proven not to exist after all. “The term was intended as an insult, but I’m honored anyway,” Horgan writes.

As a followup to his blog post, Horgan then decided to solicit commentary from various people he knew, including yours truly, about “how proofs are faring in an era of increasing computerization.” He wrote, “I’d love to get a paragraph or two from you.” Alas, I didn’t have the time to do as requested, but only to write eight paragraphs. So Horgan suggested that I make the result into a post on my own blog, which he’d then link to. Without further ado, then:

John, I like you so I hate to say it, but the last quarter century has not been kind to your thesis about “the death of proof”!  Those mathematicians sending you the irate letters had a point: there’s been no fundamental change to mathematics that deserves such a dramatic title.  Proof-based math remains quite healthy, with (e.g.) a solution to the Poincaré conjecture since your article came out, as well as to the Erdős discrepancy problem, the Kadison-Singer conjecture, Catalan’s conjecture, bounded gaps in primes, testing primality in deterministic polynomial time, etc. — just to pick a few examples from the tiny subset of areas that I know anything about.

There are evolutionary changes to mathematical practice, as there always have been.  Since 2009, the website MathOverflow has let mathematicians query the global hive-mind about an obscure reference or a recalcitrant step in a proof, and get near-instant answers.  Meanwhile “polymath” projects have, with moderate success, tried to harness blogs and other social media to make advances on long-standing open math problems using massive collaborations.

While humans remain in the driver’s seat, there are persistent efforts to increase the role of computers, with some notable successes.  These include Thomas Hales’s 1998 computer-assisted proof of the Kepler Conjecture (about the densest possible way to pack oranges) — now fully machine-verified from start to finish, after the Annals of Mathematics refused to publish a mixture of traditional mathematics and computer code (seems this is not exactly what happened; see the comment section for more).  It also includes William McCune’s 1996 solution to the Robbins Conjecture in algebra (the computer-generated proof was only half a page, but involved substitutions so strange that for 60 years no human had found them); and at the “opposite extreme,” the 2016 solution to the Pythagorean triples problem by Marijn Heule and collaborators, which weighed in at 200 terabytes (at that time, “the longest proof in the history of mathematics”).

It’s conceivable that someday, computers will replace humans at all aspects of mathematical research — but it’s also conceivable that, by the time they can do that, they’ll be able to replace humans at music and science journalism and everything else!

New notions of proof — including probabilistic, interactive, zero-knowledge, and even quantum proofs — have seen further development by theoretical computer scientists since 1993.  So far, though, these new types of proof remain either entirely theoretical (as with quantum proofs), or else they’re used for cryptographic protocols but not for mathematical research.  (For example, zero-knowledge proofs now play a major role in certain cryptocurrencies, such as Zcash.)

In many areas of math (including my own, theoretical computer science), proofs have continued to get longer and harder for any one person to absorb.  This has led some to advocate a split approach, wherein human mathematicians would talk to each other only about the handwavy intuitions and high-level concepts, while the tedious verification of details would be left to computers.  So far, though, the huge investment of time needed to write proofs in machine-checkable format — for almost no return in new insight — has prevented this approach’s wide adoption.

Yes, there are non-rigorous approaches to math, which continue to be widely used in physics and engineering and other fields, as they always have been.  But none of these approaches have displaced proof as the gold standard whenever it’s available.  If I had to speculate about why, I’d say: if you use non-rigorous approaches, then even if it’s clear to you under what conditions your results can be trusted, it’s probably much less clear to others.  Also, even if only one segment of a research community cares about rigor, whatever earlier work that segment builds on will need to be rigorous as well — thereby exerting constant pressure in that direction.  Thus, the more collaborative a given research area becomes, the more important is rigor.

For my money, the elucidation of the foundations of mathematics a century ago, by Cantor, Frege, Peano, Hilbert, Russell, Zermelo, Gödel, Turing, and others, still stands as one of the greatest triumphs of human thought, up there with evolution or quantum mechanics or anything else.  It’s true that the ideal set by these luminaries remains mostly aspirational.  When mathematicians say that a theorem has been “proved,” they still mean, as they always have, something more like: “we’ve reached a social consensus that all the ideas are now in place for a strictly formal proof that could be verified by a machine … with the only task remaining being massive rote coding work that none of us has any intention of ever doing!”  It’s also true that mathematicians, being human, are subject to the full panoply of foibles you might expect: claiming to have proved things they haven’t, squabbling over who proved what, accusing others of lack of rigor while hypocritically taking liberties themselves.  But just like love and honesty remain fine ideals no matter how often they’re flouted, so too does mathematical rigor.

Update: Here’s Horgan’s new post (entitled “Okay, Maybe Proofs Aren’t Dying After All”), which also includes a contribution from Peter Woit.

De-sneering my life

February 27th, 2019

If I’m being honest, the most exciting recent development in my life is this: a little over a month ago, I stopped checking “SneerClub” (a place I’d previously resolved not even to name here, but I think an exception is warranted now). Permanently, cold turkey. I won’t even visit to read their sneers about this post. I’ve made progress cutting down on other self-destructive social media fixations as well. Many friends suggested this course to me, and I thank them all, though I ultimately had to follow my own path to the obvious.

Ironically, the SneerClubbers themselves begged me to stop reading them (!), so presumably for once they’ll be okay with something I did (but if not, I don’t care). If any of them still have something to say to me, they can come to this blog, or email me, or if they pass through Austin, set up a time to hash it out over chips and queso (my treat). What I’ll no longer do is spend hours every week binge-reading a forum of people who’ve adopted nastiness and bad faith as their explicit principles. I’ll no longer toss and turn at night wondering how it came about that two thousand Redditors hate Scott Aaronson so much, and what I could say or do (short of total self-abnegation) that would make them hate me less. I plan to spend the freed-up time being Scott Aaronson.

Resolving to ignore one particular online hate pit—and then sticking to the resolution, as so far I have—has been a pure, unmitigated improvement to my quality of life. If you don’t believe me, ask my wife and kids. I recommend this course to anyone.

You could sensibly ask: why did I ever spend time worrying about an anti-nerds-like-me forum that’s so poisonous for its targets and participants alike? After long introspection, I think the answer is: there’s a part of me, perhaps a gift from the childhood bullies, that’s so obsessed with “society’s hatred of STEM nerds,” that it constantly seeks out evidence to confirm that its fears are justified—evidence that it can then wave in front of the rest of my brain to say “you see?? what did I always tell you?” And alas, whenever that part of my brain seeks such evidence, the world dutifully supplies mountains of it. It’s never once disappointed.

Now the SneerClubbers—who are perceptive and talented in their cruelty, if in nothing else—notice this about me, and gleefully ridicule me for it. But they’re oblivious to the central irony: that unlike the vast majority of humankind, or even the vast majority of social justice activists, they (the SneerClubbers) really do hate everyone like me. They’re precisely what the paranoid part of my brain wrongly fears that everyone else I meet is secretly like. They’re like someone who lectures you about your hilariously overblown fear of muggers, while simultaneously mugging you.

But at least they’re not the contented and self-confident bullies of my childhood nightmares, kicking dirt down at nerds from atop their pinnacle of wokeness and social adeptness. If you spend enough time studying them, they themselves come across as angry, depressed, pathetic. So for example: here’s one of my most persistent attackers, popping up on a math thread commemorating Michael Atiyah (one of the great mathematicians of the 20th century), just to insult Atiyah—randomly, gratuitously, and a few days after Atiyah had died. Almost everything posted all over Reddit by this individual—who uses the accurate tagline “unpleasantly radical”—has the same flavor. Somehow seeing this made it click for me: wait a second, these are the folks who are lecturing me about my self-centeredness and arrogance and terrible social skills? Like, at least I try to be nice.

Scott Alexander, who writes the world’s best blog and is a more central target of SneerClub than I’ve been, recently announced that he asked the moderators of r/ssc to close its notorious “Culture War” thread, and they’ve done so—moving the thread to a new home on Reddit called “TheMotte.”

For those who don’t know: r/ssc is the place on Reddit to discuss Scott’s SlateStarCodex blog, though Scott himself was never too involved as more than a figurehead.  The Culture War thread was the place within r/ssc to discuss race, gender, immigration, and other hot-button topics.  The thread, which bewilderingly filled up with thousands of comments per week (!), attracted the, err … full range of political views, including leftists, libertarians, and moderates but also alt-righters, neoreactionaries, and white nationalists. Predictably, SneerClub treated the thread as a gift from heaven: a constant source of inflammatory material that they could use to smear Scott personally (even if most of the time, Scott hadn’t even seen the offending content, let alone endorsed it).

Four months ago, I was one of the apparently many friends who told Scott that I felt he should dissociate the Culture War thread from his brand. So I congratulate him on his decision, which (despite his eloquently-expressed misgivings) I feel confident was the right one. Think about it this way: nobody’s freedom of speech has been curtailed—the thread continues full steam at TheMotte, for anyone who enjoys it—but meanwhile, the sneerers have been deprived of a golden weapon with which to slime Scott. And while the sneerers themselves might never change their minds about anything, Scott has demonstrated to third parties that he’s open and reasonable and ready to compromise, like the debater who happily switches to his opponent’s terminology. What’s not to like?

A couple weeks ago, while in Albuquerque for the SQuInT conference, I visited the excellent National Museum of Nuclear Science and History.  It was depressing, as it should have been, to tour the detailed exhibits about the murderous events surrounding the birth of the nuclear era: the Holocaust, the Rape of Nanking, the bombings of Hiroshima and Nagasaki. It was depressing in a different way to tour the exhibits about the early Atomic Age, and see the boundless optimism that ‘unleashing the power of the atom’ would finally usher in a near-utopia of space travel and clean energy—and then to compare that vision to where we are now, with climate change ravaging the planet and (in a world-historic irony) the people who care most about the environment having denounced and marginalized the most reliable source of carbon-free energy, the one that probably had the best chance to avert our planet’s terrifying future.

But on the bright side: how wonderful to have been born into a time and place when, for the most part, those who hate you have only the power to destroy your life that you yourself grant them. How wonderful when one can blunt their knives by simply refusing to open a browser tab.

Four updates

February 12th, 2019

A few weeks ago, I was at QIP’2019 in Boulder, CO. This week I was at SQuInT’2019 in Albuquerque, NM. There were lots of amazing talks—feel free to ask in the comments section.

There’s an interview with me at the website “GigaOm,” conducted by Byron Reese and entitled Quantum Computing: Capabilities and Limits. I didn’t proofread the transcript and it has some errors in it, but hopefully the meaning comes through. In other interview news, if you were interested in my podcast with Adam Ford in Melbourne but don’t like YouTube, Adam has helpfully prepared transcripts of the two longest segments: The Ghost in the Quantum Turing Machine and The Winding Road to Quantum Supremacy.

The New York Times ran an article entitled The Hard Part of Computer Science? Getting Into Class, about the surge in computer science majors all over the US, and the shortage of professors to teach them. The article’s go-to example of a university where this is happening is UT Austin, and there’s extensive commentary from my department chair, Don Fussell.

The STOC’2019 accepted papers list is finally out. Lots of cool stuff!


February 4th, 2019

I’ve of course been following the recent public debate about whether to build a circular collider to succeed the LHC—notably including Sabine Hossenfelder’s New York Times column arguing that we shouldn’t.  (See also the responses by Jeremy Bernstein and Lisa Randall, and the discussion on Peter Woit’s blog, and Daniel Harlow’s Facebook thread, and this Vox piece by Kelsey Piper.)  Let me blog about this as a way of cracking my knuckles or tuning my violin, just getting back into blog-shape after a long hiatus for travel and family and the beginning of the semester.

Regardless of whether this opinion is widely shared among my colleagues, I like Sabine.  I’ve often found her blogging funny and insightful, and I wish more non-Lubos physicists would articulate their thoughts for the public the way she does, rather than just standing on the sidelines and criticizing the ones who do. I find it unfortunate that some of the replies to Sabine’s arguments dwelled on her competence and “standing” in physics (even if we set aside—as we should—Lubos’s misogynistic rants, whose predictability could be used to calibrate atomic clocks). It’s like this: if high-energy physics had reached a pathological state of building bigger and bigger colliders for no good reason, then we’d expect that it would take a semi-outsider to say so in public, so then it wouldn’t be a further surprise to find precisely such a person doing it.

Not for the first time, though, I find myself coming down on the opposite side from Sabine. Basically, if civilization could get its act together and find the money, I think it would be pretty awesome to build a new collider to push forward the energy frontier in our understanding of the universe.

Note that I’m not making the much stronger claim that this is the best possible use of $20 billion for science. Plausibly a thousand $20-million projects could be found that would advance our understanding of reality by more than a new collider would. But it’s also important to realize that that’s not the question at stake here. When, for example, the US Congress cancelled the Superconducting Supercollider midway through construction—partly, it’s believed, on the basis of opposition from eminent physicists in other subfields, who argued that they could do equally important science for much cheaper—none of the SSC budget, as in 0% of it, ever did end up redirected to those other subfields. In practice, then, the question of “whether a new collider is worth it” is probably best considered in absolute terms, rather than relative to other science projects.

What I found most puzzling, in Sabine’s writings on this subject, was the leap in logic from

  1. many theorists expected that superpartners, or other new particles besides the Higgs boson, had a good chance of being discovered at the LHC, based on statistical arguments about “natural” parameter values, and
  2. the basic soundness of naturalness arguments was always open to doubt, and indeed the LHC results to date offer zero support for them, and
  3. many of the same theorists now want an even bigger collider, and continue to expect new particles to be found, and haven’t sufficiently reckoned with their previous failed predictions, to …
  4. therefore we shouldn’t build the bigger collider.

How do we get from 1-3 to 4: is the idea that we should punish the errant theorists, by withholding an experiment that they want, in order to deter future wrong predictions? After step 3, it seems to me that Sabine could equally well have gone to: and therefore it’s all the more important that we do build a new collider, in order to establish all the more conclusively that there’s just an energy desert up there—and that I, Sabine, was right to emphasize that possibility, and those other theorists were wrong to downplay it!

Like, I gather that there are independently motivated scenarios where there would be only the Higgs at the LHC scale, and then new stuff at the next energy scale beyond it. And as an unqualified outsider who enjoys talking to friends in particle physics and binge-reading about it, I’d find it hard to assign the totality of those scenarios less than ~20% credence or more than ~80%—certainly if the actual experts don’t either.

And crucially, it’s not as if raising the collision energy is just one arbitrary direction in which to look for new fundamental physics, among a hundred a-priori equally promising directions. Basically, there’s raising the collision energy and then there’s everything else. By raising the energy, you’re not testing one specific idea for physics beyond the Standard Model, but a hundred or a thousand ideas in one swoop.

The situation reminds me a little of the quantum computing skeptics who say: scalable QC can never work, in practice and probably even in principle; the mainstream physics community only thinks it can work because of groupthink and hype; therefore, we shouldn’t waste more funds trying to make it work. With the sole, very interesting exception of Gil Kalai, none of the skeptics ever seem to draw what strikes me as an equally logical conclusion: whoa, let’s go full speed ahead with trying to build a scalable QC, because there’s an epochal revolution in physics to be had here—once the experimenters finally see that I was right and the mainstream was wrong, and they start to unravel the reasons why!

Of course, $20 billion is a significant chunk of change, by the standards of science even if not by the standards of random government wastages (like our recent $11 billion shutdown). And ultimately, decisions do need to be made about which experiments are most interesting to pursue with limited resources. And if a future circular collider were built, and if it indeed just found a desert, I think the balance would tilt pretty strongly toward Sabine’s position—that is, toward declining to build an even bigger and more expensive collider after that. If the Patriots drearily won every Superbowl 13-3, year after year after year, eventually no one would watch anymore and the Superbowl would get cancelled (well, maybe that will happen for other reasons…).

But it’s worth remembering that—correct me if I’m wrong—so far there have been no cases in the history of particle physics of massively expanding the energy frontier and finding absolutely nothing new there (i.e., nothing that at least conveyed multiple bits of information, as the Higgs mass did). And while my opinion should count for less than a neutrino mass, just thinking it over a-priori, I keep coming back to the question: before we close the energy frontier for good, shouldn’t there have been at least one unmitigated null result, rather than zero?

The Winding Road to Quantum Supremacy

January 15th, 2019

Greetings from QIP’2019 in Boulder, Colorado! Obvious highlights of the conference include Urmila Mahadev’s opening plenary talk on her verification protocol for quantum computation (which I blogged about here), and Avishay Tal’s upcoming plenary on his and Ran Raz’s oracle separation between BQP and PH (which I blogged about here). If you care, here are the slides for the talk I just gave, on the paper “Online Learning of Quantum States” by me, Xinyi Chen, Elad Hazan, Satyen Kale, and Ashwin Nayak. Feel free to ask in the comments about what else is going on.

I returned a few days ago from my whirlwind Australia tour, which included Melbourne and Sydney; a Persian wedding that happened to be held next to a pirate ship (the Steve Irwin, used to harass whalers and adorned with a huge Jolly Roger); meetings and lectures graciously arranged by friends at UTS; a quantum computing lab tour personally conducted by 2018 “Australian of the Year” Michelle Simmons; three meetups with readers of this blog (or more often, readers of the other Scott A’s blog who graciously settled for the discount Scott A); and an excursion to Grampians National Park to see wild kangaroos, wallabies, koalas, and emus.

But the thing that happened in Australia that provided the actual occasion for this post is this: I was interviewed by Adam Ford in Carlton Gardens in Melbourne, about quantum supremacy, AI risk, Integrated Information Theory, whether the universe is discrete or continuous, and to be honest I don’t remember what else. You can watch the first segment, the one about the prospects for quantum supremacy, here on YouTube. My only complaint is that Adam’s video camera somehow made me look like an out-of-shape slob who needs to hit the gym or something.

Update (Jan. 16): Adam has now posted a second video on YouTube, wherein I talk about my “Ghost in the Quantum Turing Machine” paper, my critique of Integrated Information Theory, and more.

And now Adam has posted yet a third segment, in which I talk about small, lighthearted things like existential threats to civilization and the prospects for superintelligent AI.

And a fourth, in which I talk about whether reality is discrete or continuous.

Related to the “free will / consciousness” segment of the interview: the biologist Jerry Coyne, whose blog “Why Evolution Is True” I’ve intermittently enjoyed over the years, yesterday announced my existence to his readers, with a post that mostly criticizes my views about free will and predictability, as I expressed them years ago in a clip that’s on YouTube (at the time, Coyne hadn’t seen GIQTM or my other writings on the subject). Coyne also took the opportunity to poke fun at this weird character he just came across whose “life is devoted to computing” and who even mistakes tips for change at airport smoothie stands. Some friends here at QIP had a good laugh over the fact that, for the world beyond theoretical computer science and quantum information, this is what 23 years of research, teaching, and writing apparently boil down to: an 8.5-minute video clip where I spouted about free will, and also my having been arrested once in a comic mix-up at Philadelphia airport. Anyway, since then I had a very pleasant email exchange with Coyne—someone with whom I find myself in agreement much more often than not, and who I’d love to have an extended conversation with sometime despite the odd way our interaction started.

Incompleteness ex machina

December 30th, 2018

I have a treat with which to impress your friends at New Year’s Eve parties tomorrow night: a rollicking essay graciously contributed by a reader named Sebastian Oberhoff, about a unified and simplified way to prove all of Gödel’s Incompleteness Theorems, as well as Rosser’s Theorem, directly in terms of computer programs. In particular, this improves over my treatments in Quantum Computing Since Democritus and my Rosser’s Theorem via Turing machines post. While there won’t be anything new here for the experts, I loved the style—indeed, it brings back wistful memories of how I used to write, before I accumulated too many imaginary (and non-imaginary) readers tut-tutting at crass jokes over my shoulder. May 2019 bring us all the time and the courage to express ourselves authentically, even in ways that might be sneered at as incomplete, inconsistent, or unsound.


December 27th, 2018

I’m planning to be in Australia soon—in Melbourne January 4-10 for a friend’s wedding, then in Sydney January 10-11 to meet colleagues and give a talk. It will be my first trip down under for 12 years (and Dana’s first ever). If there’s interest, I might be able to do a Shtetl-Optimized meetup in Melbourne the evening of Friday the 4th (or the morning of Saturday the 5th), and/or another one in Sydney the evening of Thursday the 10th. Email me if you’d go, and then we’ll figure out details.

The National Quantum Initiative Act is now law. Seeing the photos of Trump signing it, I felt … well, whatever emotions you might imagine I felt.

Frank Verstraete asked me to announce that the University of Vienna is seeking a full professor in quantum algorithms; see here for details.

Why are amplitudes complex?

December 17th, 2018

[By prior agreement, this post will be cross-posted on Microsoft’s Q# blog, even though it has nothing to do with the Q# programming language.  It does, however, contain many examples that might be fun to implement in Q#!]

Why should Nature have been quantum-mechanical?  It’s totally unclear what would count as an answer to such a question, and also totally clear that people will never stop asking it.

Short of an ultimate answer, we can at least try to explain why, if you want this or that piece of quantum mechanics, then the rest of the structure is inevitable: why quantum mechanics is an “island in theoryspace,” as I put it in 2003.

In this post, I’d like to focus on a question that any “explanation” for QM at some point needs to address, in a non-question-begging way: why should amplitudes have been complex numbers?  When I was a grad student, it was his relentless focus on that question, and on others in its vicinity, that made me a lifelong fan of Chris Fuchs (see for example his samizdat), despite my philosophical differences with him.

It’s not that complex numbers are a bad choice for the foundation of the deepest known description of the physical universe—far from it!  (They’re a field, they’re algebraically closed, they’ve got a norm, how much more could you want?)  It’s just that they seem like a specific choice, and not the only possible one.  There are also the real numbers, for starters, and in the other direction, the quaternions.

Quantum mechanics over the reals or the quaternions still has constructive and destructive interference among amplitudes, and unitary transformations, and probabilities that are absolute squares of amplitudes.  Moreover, these variants turn out to lead to precisely the same power for quantum computers—namely, the class BQP—as “standard” quantum mechanics, the one over the complex numbers.  So none of those are relevant differences.
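For the real-number case, one standard way to see the equivalence is to store the real and imaginary parts of each amplitude separately, at the cost of one extra two-level system. The sketch below illustrates that textbook encoding and is not code from the post; all function names are hypothetical. A complex matrix U = A + iB is replaced by the real block matrix [[A, -B], [B, A]], and a state x + iy by the real vector (x, y).

```python
import cmath
import math

def apply(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]

def real_encoding_of_matrix(u):
    """Map a complex n x n matrix U = A + iB to the real 2n x 2n matrix [[A, -B], [B, A]]."""
    n = len(u)
    top = [[u[i][j].real for j in range(n)] + [-u[i][j].imag for j in range(n)]
           for i in range(n)]
    bottom = [[u[i][j].imag for j in range(n)] + [u[i][j].real for j in range(n)]
              for i in range(n)]
    return top + bottom

def real_encoding_of_state(psi):
    """Map a complex state x + iy to the real vector (x, y)."""
    return [a.real for a in psi] + [a.imag for a in psi]

# A one-qubit circuit that genuinely uses a complex phase: H, then T, then H.
S2 = 1 / math.sqrt(2)
H = [[S2 + 0j, S2 + 0j], [S2 + 0j, -S2 + 0j]]
T = [[1 + 0j, 0j], [0j, cmath.exp(1j * math.pi / 4)]]
circuit = [H, T, H]

psi = [1 + 0j, 0j]                  # complex simulation, starting in |0>
r = real_encoding_of_state(psi)     # real-only simulation of the same state
for gate in circuit:
    psi = apply(gate, psi)
    r = apply(real_encoding_of_matrix(gate), r)

n = len(psi)
complex_probs = [abs(a) ** 2 for a in psi]
# In the real picture, outcome k collects both its "real" and "imaginary" slots.
real_probs = [r[k] ** 2 + r[n + k] ** 2 for k in range(n)]
print(complex_probs)
print(real_probs)
```

The two printed lists should agree up to floating-point error, even though the second simulation never touches a complex number.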

Indeed, having just finished teaching an undergrad Intro to Quantum Information course, I can attest that the complex nature of amplitudes is needed only rarely (shockingly rarely, one might say) in quantum computing and information.  Real amplitudes typically suffice.  Teleportation, superdense coding, the Bell inequality, quantum money, quantum key distribution, the Deutsch-Jozsa and Bernstein-Vazirani and Simon and Grover algorithms, quantum error-correction: all of those and more can be fully explained without using a single i that’s not a summation index.  (Shor’s factoring algorithm is an exception; it’s much more natural with complex amplitudes.  But as the previous paragraph implied, their use is removable even there.)

It’s true that, if you look at even the simplest “real” examples of quantum systems—or as a software engineer might put it, at the application layers built on top of the quantum OS—then complex numbers are everywhere, in a way that seems impossible to remove.  The Schrödinger equation, energy eigenstates, the position/momentum commutation relation, the state space of a spin-1/2 particle in 3-dimensional space: none of these make much sense without complex numbers (though it can be fun to try).

But from a sufficiently Olympian remove, it feels circular to use any of this as a “reason” for why quantum mechanics should’ve involved complex amplitudes in the first place.  It’s like, once your OS provides a certain core functionality (in this case, complex numbers), it’d be surprising if the application layer didn’t exploit that functionality to the hilt—especially if we’re talking about fundamental physics, where we’d like to imagine that nothing is wasted or superfluous (hence Rabi’s famous question about the muon: “who ordered that?”).

But why should the quantum OS have provided complex-number functionality at all?  Is it possible to answer that question purely in terms of the OS’s internal logic (i.e., abstract quantum information), making minimal reference to how the OS will eventually get used?  Maybe not—but if so, then that itself would seem worthwhile to know.

If we stick to abstract quantum information language, then the most “obvious, elementary” argument for why amplitudes should be complex numbers is one that I spelled out in Quantum Computing Since Democritus, as well as my Is quantum mechanics an island in theoryspace? paper.  Namely, it seems desirable to be able to implement a “fraction” of any unitary operation U: for example, some V such that V^2=U, or V^3=U.  With complex numbers, this is trivial: we can simply diagonalize U, or use the Hamiltonian picture (i.e., take e^{-iH/2} where U=e^{-iH}), both of which ultimately depend on the complex numbers being algebraically closed.  Over the reals, by contrast, a 2×2 orthogonal matrix like $$ U = \left(\begin{array}[c]{cc}1 & 0\\0 & -1\end{array}\right)$$

has no 2×2 orthogonal square root, as follows immediately from its determinant being -1.  If we want a square root of U (or rather, of something that acts like U on a subspace) while sticking to real numbers only, then we need to add another dimension, like so: $$ \left(\begin{array}[c]{ccc}1 & 0 & 0\\0 & -1 & 0\\0 & 0&-1\end{array}\right)=\left(\begin{array}[c]{ccc}1 & 0 & 0\\0 & 0 & 1\\0 & -1 & 0\end{array}\right) ^{2} $$

This is directly related to the fact that there’s no way for a Flatlander to “reflect herself” (i.e., switch her left and right sides while leaving everything else unchanged) by any continuous motion, unless she can lift off the plane and rotate herself through the third dimension.  Similarly, for us to reflect ourselves would require rotating through a fourth dimension.
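For anyone who wants to check the two square roots by machine, here is a minimal Python sketch using plain lists of rows. The helper name `matmul` and the variable names are my own; the matrices are the ones discussed above, plus the complex square root diag(1, i) that the diagonalization argument produces.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Real 3x3 square root: a quarter-turn in the plane of the last two axes.
V3 = [[1, 0, 0],
      [0, 0, 1],
      [0, -1, 0]]
V3_squared = matmul(V3, V3)
assert V3_squared == [[1, 0, 0], [0, -1, 0], [0, 0, -1]]

# Complex 2x2 square root of U = diag(1, -1): no extra dimension needed.
V2 = [[1, 0],
      [0, 1j]]
V2_squared = matmul(V2, V2)
assert V2_squared == [[1, 0], [0, -1]]

print("both square roots check out")
```

The real case cannot be rescued within two dimensions: any real orthogonal V has determinant ±1, so V^2 has determinant +1, never -1.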

One could reasonably ask: is that it?  Aren’t there any “deeper” reasons in quantum information for why amplitudes should be complex numbers?

Indeed, there are certain phenomena in quantum information that, slightly mysteriously, work out more elegantly if amplitudes are complex than if they’re real.  (By “mysteriously,” I mean not that these phenomena can’t be 100% verified by explicit calculations, but simply that I don’t know of any deep principle by which the results of those calculations could’ve been predicted in advance.)

One famous example of such a phenomenon is due to Bill Wootters: if you take a uniformly random pure state in d dimensions, and then you measure it in an orthonormal basis, what will the probability distribution (p_1,…,p_d) over the d possible measurement outcomes look like?  The answer, amazingly, is that you’ll get a uniformly random probability distribution: that is, a uniformly random point on the simplex defined by p_i≥0 and p_1+…+p_d=1.  This fact, which I’ve used in several papers, is closely related to Archimedes’ Hat-Box Theorem, beloved by friend-of-the-blog Greg Kuperberg.  But here’s the kicker: it only works if amplitudes are complex numbers.  If amplitudes are real, then the resulting distribution over distributions will be too bunched up near the corners of the probability simplex; if they’re quaternions, it will be too bunched up near the middle.
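Here is a rough Monte Carlo sketch of the Wootters phenomenon in Python, for the simplest case d=2. It rests on one standard fact that I'm assuming rather than proving: a vector of independent Gaussians, once normalized, is uniformly distributed on the sphere. The function names, sample count, and seed are arbitrary choices of mine.

```python
import random

def first_outcome_probability(reals_per_amplitude, d, rng):
    """Sample a uniformly random unit vector in d dimensions whose amplitudes
    each have `reals_per_amplitude` real components (1 = real, 2 = complex,
    4 = quaternion), and return the Born probability of the first outcome."""
    weights = [sum(rng.gauss(0, 1) ** 2 for _ in range(reals_per_amplitude))
               for _ in range(d)]
    return weights[0] / sum(weights)

def variance_of_p1(reals_per_amplitude, d=2, samples=20000, seed=0):
    """Empirical variance of the first outcome's probability."""
    rng = random.Random(seed)
    ps = [first_outcome_probability(reals_per_amplitude, d, rng)
          for _ in range(samples)]
    mean = sum(ps) / samples
    return sum((p - mean) ** 2 for p in ps) / samples

for name, k in [("real", 1), ("complex", 2), ("quaternion", 4)]:
    print(name, round(variance_of_p1(k), 3))
```

For d=2 the simplex is just the interval [0,1], and a uniform point on it has variance 1/12 ≈ 0.083. The complex estimate should land near that value, the real one noticeably above it (mass piled up at the endpoints), and the quaternionic one noticeably below it (mass piled up in the middle).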

There’s an even more famous example of such a Goldilocks coincidence—one that’s been elevated, over the past two decades, to exalted titles like “the Axiom of Local Tomography.”  Namely: suppose we have an unknown finite-dimensional mixed state ρ, shared by two players Alice and Bob.  For example, ρ might be an EPR pair, or a correlated classical bit, or simply two qubits both in the state |0⟩.  We imagine that Alice and Bob share many identical copies of ρ, so that they can learn more and more about it by measuring this copy in this basis, that copy in that basis, and so on.

We then ask: can ρ be fully determined from the joint statistics of product measurements—that is, measurements that Alice and Bob can apply separately and locally to their respective subsystems, with no communication between them needed?  A good example here would be the set of measurements that arise in a Bell experiment—measurements that, despite being local, certify that Alice and Bob must share an entangled state.

If we asked the analogous question for classical probability distributions, the answer is clearly “yes.”  That is, once you’ve specified the individual marginals, and you’ve also specified all the possible correlations among the players, you’ve fixed your distribution; there’s nothing further to specify.

For quantum mixed states, the answer again turns out to be yes, but only because amplitudes are complex numbers!  In quantum mechanics over the reals, you could have a 2-qubit state like $$ \rho=\frac{1}{4}\left(\begin{array}[c]{cccc}1 & 0 & 0 & -1\\0 & 1 & 1 & 0\\0 & 1 & 1 & 0\\-1& 0 & 0 & 1\end{array}\right) ,$$

which clearly isn’t the maximally mixed state, yet which is indistinguishable from the maximally mixed state by any local measurement that can be specified using real numbers only.  (Proof: exercise!)
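If you'd rather spot-check the exercise than do it, the following Python sketch tests the claim on random real symmetric observables, using integer arithmetic so that the comparison is exact. It is a numerical check, not a proof. The helper names are mine, and I work with 4ρ to avoid fractions.

```python
import random

# 4 * rho from the text, kept as integers so every check is exact.
RHO4 = [[1, 0, 0, -1],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [-1, 0, 0, 1]]

def kron2(a, b):
    """Tensor (Kronecker) product of two 2x2 matrices, as a 4x4 list of rows."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def trace_product(m, n):
    """Tr(m n) for square matrices of equal size."""
    size = len(m)
    return sum(m[r][c] * n[c][r] for r in range(size) for c in range(size))

def random_real_symmetric(rng):
    a, b, c = (rng.randint(-9, 9) for _ in range(3))
    return [[a, b], [b, c]]

rng = random.Random(1)
for _ in range(1000):
    A, B = random_real_symmetric(rng), random_real_symmetric(rng)
    # The maximally mixed state I/4 gives Tr(A)Tr(B)/4; rho must give the same.
    expected = (A[0][0] + A[1][1]) * (B[0][0] + B[1][1])
    assert trace_product(RHO4, kron2(A, B)) == expected

# A complex local observable breaks the tie: Pauli Y on both sides.
Y = [[0, -1j], [1j, 0]]
print(trace_product(RHO4, kron2(Y, Y)) / 4)
```

The last line evaluates Tr(ρ·Y⊗Y), which comes out to 1, whereas the maximally mixed state would give 0. So the state is perfectly distinguishable from the maximally mixed one as soon as Alice and Bob are allowed a measurement involving imaginary numbers.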

In quantum mechanics over the quaternions, something even “worse” happens: namely, the tensor product of two Hermitian matrices need not be Hermitian.  Alice’s measurement results might be described by the 2×2 quaternionic density matrix $$ \rho_{A}=\frac{1}{2}\left(\begin{array}[c]{cc}1 & -i\\i & 1\end{array}\right), $$

and Bob’s results might be described by the 2×2 quaternionic density matrix $$ \rho_{B}=\frac{1}{2}\left(\begin{array}[c]{cc}1 & -j\\j & 1\end{array}\right), $$

and yet there might not be (and in this case, isn’t) any 4×4 quaternionic density matrix corresponding to ρ_A⊗ρ_B, which would explain both results separately.
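To see the failure concretely, here is a small Python sketch with a hand-rolled quaternion product. One assumption of mine needs flagging: I form the "tensor product" entrywise, with Alice's entry multiplied on the left of Bob's, which is one natural convention but not the only possible one. Quaternions are stored as tuples (w, x, y, z), and I drop the overall factors of 1/2.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) = w + xi + yj + zk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    """Quaternionic conjugate."""
    w, x, y, z = q
    return (w, -x, -y, -z)

ONE, I, J = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)
NEG_I, NEG_J = (0, -1, 0, 0), (0, 0, -1, 0)

# 2 * rho_A and 2 * rho_B from the text (the overall factor is irrelevant here).
RHO_A = [[ONE, NEG_I], [I, ONE]]
RHO_B = [[ONE, NEG_J], [J, ONE]]

def kron2(a, b):
    """Entrywise tensor product, Alice's entry on the left (an assumed convention)."""
    return [[qmul(a[i][j], b[k][l]) for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def is_hermitian(m):
    n = len(m)
    return all(m[r][c] == qconj(m[c][r]) for r in range(n) for c in range(n))

product = kron2(RHO_A, RHO_B)
print(is_hermitian(RHO_A), is_hermitian(RHO_B), is_hermitian(product))
print(product[0][3], product[3][0])
```

Both corner entries come out equal to k, since (-i)(-j) = ij = k. Hermiticity would require them to be conjugates of each other, k and -k, so the product matrix is not Hermitian even though both factors are.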

What’s going on here?  Why do the local measurement statistics underdetermine the global quantum state with real amplitudes, and overdetermine it with quaternionic amplitudes, being in one-to-one correspondence with it only when amplitudes are complex?

We can get some insight by looking at the number of independent real parameters needed to specify a d-dimensional Hermitian matrix.  Over the complex numbers, the number is exactly d^2: we need 1 parameter for each of the d diagonal entries, and 2 (a real part and an imaginary part) for each of the d(d-1)/2 upper off-diagonal entries (the lower off-diagonal entries being determined by the upper ones).  Over the real numbers, by contrast, “Hermitian matrices” are just real symmetric matrices, so the number of independent real parameters is only d(d+1)/2.  And over the quaternions, the number is d+4[d(d-1)/2] = d(2d-1).

Now, it turns out that the Goldilocks phenomenon that we saw above (with local measurement statistics determining a unique global quantum state when and only when amplitudes are complex numbers) ultimately boils down to the simple fact that $$ (d_A d_B)^2 = d_A^2 d_B^2, $$

but $$\frac{d_A d_B (d_A d_B + 1)}{2} > \frac{d_A (d_A + 1)}{2} \cdot \frac{d_B (d_B + 1)}{2},$$

and conversely $$ d_A d_B (2 d_A d_B - 1) < d_A (2 d_A - 1) \cdot d_B (2 d_B - 1).$$

In other words, only with complex numbers does the number of real parameters needed to specify a “global” Hermitian operator exactly match the product of the number of parameters needed to specify an operator on Alice’s subsystem and the number needed to specify an operator on Bob’s.  With real numbers the global count exceeds that product, and with quaternions it falls short.
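The counting is easy to automate. The Python sketch below computes the number of real parameters directly from the recipe given earlier (d real diagonal entries, plus 1, 2, or 4 real numbers per upper off-diagonal entry), then compares the global count against the product of the local counts over a range of dimensions. The function names and the range of dimensions tried are my own choices.

```python
def hermitian_params(d, reals_per_entry):
    """Real parameters in a d x d Hermitian matrix: d real diagonal entries plus
    `reals_per_entry` for each of the d(d-1)/2 upper off-diagonal entries."""
    return d + reals_per_entry * (d * (d - 1) // 2)

def compare(d_a, d_b, reals_per_entry):
    """Return -1, 0, or +1 as the global count is below, equal to, or above
    the product of the two local counts."""
    global_count = hermitian_params(d_a * d_b, reals_per_entry)
    local_product = (hermitian_params(d_a, reals_per_entry)
                     * hermitian_params(d_b, reals_per_entry))
    return (global_count > local_product) - (global_count < local_product)

for name, k in [("real", 1), ("complex", 2), ("quaternion", 4)]:
    signs = {compare(a, b, k) for a in range(2, 8) for b in range(2, 8)}
    print(name, signs)
```

For every pair of dimensions from 2 through 7, the real case gives +1 (too many global parameters), the complex case gives 0 (an exact match), and the quaternionic case gives -1 (too few).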

A major research goal in quantum foundations, since at least the early 2000s, has been to “derive” the formalism of QM purely from “intuitive-sounding, information-theoretic” postulates—analogous to how, in 1905, some guy whose name I forget derived the otherwise strange-looking Lorentz transformations purely from the assumption that the laws of physics (including a fixed, finite value for the speed of light) take the same form in every inertial frame.  There have been some nontrivial successes of this program: most notably, the “axiomatic derivations” of QM due to Lucien Hardy and (more recently) Chiribella et al.  Starting from axioms that sound suitably general and nontechnical (if sometimes unmotivated and weird), these derivations perform the impressive magic trick of deriving the full mathematical structure of QM: complex amplitudes, unitary transformations, tensor products, the Born rule, everything.

However, in every such derivation that I know of, some axiom needs to get introduced to capture “local tomography”: i.e., the “principle” that composite systems must be uniquely determined by the statistics of local measurements.  And while this principle might sound vague and unobjectionable, to those in the business, it’s obvious what it’s going to be used for the second it’s introduced.  Namely, it’s going to be used to rule out quantum mechanics over the real numbers, which would otherwise be a model for the axioms, and thus to “explain” why amplitudes have to be complex.

I confess that I was always dissatisfied with this.  For I kept asking myself: would I have ever formulated the “Principle of Local Tomography” in the first place—or if someone else had proposed it, would I have ever accepted it as intuitive or natural—if I didn’t already know that QM over the complex numbers just happens to satisfy it?  And I could never honestly answer “yes.”  It always felt to me like a textbook example of drawing the target around where the arrow landed—i.e., of handpicking your axioms so that they yield a predetermined conclusion, which is then no more “explained” than it was at the beginning.

Two months ago, something changed for me: namely, I smacked into the “Principle of Local Tomography,” and its reliance on complex numbers, in my own research, when I hadn’t in any sense set out to look for it.  This still doesn’t convince me that the principle is any sort of a-priori necessity.  But it at least convinces me that it’s, you know, the sort of thing you can smack into when you’re not looking for it.

The aforementioned smacking occurred while I was writing up a small part of a huge paper with Guy Rothblum, about a new connection between so-called “gentle measurements” of quantum states (that is, measurements that don’t damage the states much), and the subfield of classical CS called differential privacy.  That connection is a story in itself; let me know if you’d like me to blog about it separately.  Our paper should be on the arXiv any day now; in the meantime, here are some PowerPoint slides.

Anyway, for the paper with Guy, it was of interest to know the following: suppose we have a two-outcome measurement E (let’s say, on n qubits), and suppose it accepts every product state with the same probability p.  Must E then accept every entangled state with probability p as well?  Or, a closely-related question: suppose we know E’s acceptance probabilities on every product state.  Is that enough to determine its acceptance probabilities on all n-qubit states?

I’m embarrassed to admit that I dithered around with these questions, finding complicated proofs for special cases, before I finally stumbled on the one-paragraph, obvious-in-retrospect “Proof from the Book” that slays them in complete generality.

Here it is: if E accepts every product state with probability p, then clearly it accepts every separable mixed state (i.e., every convex combination of product states) with the same probability p.  Now, a well-known result of Braunstein et al., from 1998, states that (surprisingly enough) the separable mixed states have nonzero density within the set of all mixed states, in any given finite dimension.  Also, the probability that E accepts ρ can be written as f(ρ)=Tr(Eρ), which is linear in the entries of ρ.  OK, but a linear function that’s determined on a subset of nonzero density is determined everywhere.  And in particular, if f is constant on that subset then it’s constant everywhere, QED.

But what does any of this have to do with why amplitudes are complex numbers?  Well, it turns out that the 1998 Braunstein et al. result, which was the linchpin of the above argument, only works in complex QM, not in real QM.  We can see its failure in real QM by simply counting parameters, similarly to what we did before.  An n-qubit density matrix requires 4^n real parameters to specify (OK, 4^n-1, if we demand that the trace is 1).  Even if we restrict to n-qubit density matrices with real entries only, we still need 2^n(2^n+1)/2 parameters.  By contrast, it’s not hard to show that an n-qubit real separable density matrix can be specified using only 3^n real parameters, and indeed, that any such density matrix lies in a 3^n-dimensional subspace of the full 2^n(2^n+1)/2-dimensional space of 2^n×2^n symmetric matrices.  (This is simply the subspace spanned by all possible tensor products of n Pauli I, X, and Z matrices, excluding the Y matrix, which is the one that involves imaginary numbers.)
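One way to make the gap tangible is to enumerate Pauli strings. The Python sketch below relies on a fact I'm adding here: a tensor product of Paulis is a real symmetric matrix exactly when it contains an even number of Y's, so those strings span the real symmetric matrices. The strings with no Y at all are the products of real single-qubit matrices. The function name is mine.

```python
from itertools import product

def count_pauli_strings(n):
    """Count n-qubit Pauli strings by type.

    Returns (real_symmetric, product_real): strings with an even number of Y's
    are real symmetric matrices; strings with no Y at all are tensor products
    of real symmetric single-qubit matrices."""
    real_symmetric = product_real = 0
    for string in product("IXYZ", repeat=n):
        ys = string.count("Y")
        if ys % 2 == 0:
            real_symmetric += 1
        if ys == 0:
            product_real += 1
    return real_symmetric, product_real

for n in range(1, 5):
    real_symmetric, product_real = count_pauli_strings(n)
    assert real_symmetric == 2**n * (2**n + 1) // 2
    assert product_real == 3**n
    print(n, 4**n, real_symmetric, product_real)
```

For a single qubit the two counts coincide (3 and 3), but already at n=2 there are 10 real symmetric strings and only 9 Y-free ones. The odd one out is Y⊗Y, which is exactly the term that appeared in the earlier 2-qubit counterexample.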

But it’s not only the Braunstein et al. result that fails in real QM: the fact that I wanted for my paper with Guy fails as well.  As a counterexample, consider the 2-qubit measurement that accepts the state ρ with probability Tr(Eρ), where $$ E=\frac{1}{2}\left(\begin{array}[c]{cccc}1 & 0 & 0 & -1\\0 & 1 & 1 & 0\\0 & 1 & 1 & 0\\-1 & 0 & 0 & 1\end{array}\right).$$

I invite you to check that this measurement, which we specified using a real matrix, accepts every product state (a|0⟩+b|1⟩)(c|0⟩+d|1⟩), where a,b,c,d are real, with the same probability, namely 1/2—just like the “measurement” that simply returns a coin flip without even looking at the state at all.  And yet the measurement can clearly be nontrivial on entangled states: for example, it always rejects $$\frac{\left|00\right\rangle+\left|11\right\rangle}{\sqrt{2}},$$ and it always accepts $$ \frac{\left|00\right\rangle-\left|11\right\rangle}{\sqrt{2}}.$$
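Taking up that invitation numerically, here is a Python sketch that samples random real product states and evaluates the acceptance probability ⟨ψ|E|ψ⟩. I parametrize each real qubit by an angle, which covers all normalized real single-qubit states; the names and the tolerance are mine, and I store 2E to keep the matrix integral.

```python
import math
import random

# 2 * E from the text, kept as integers.
E2 = [[1, 0, 0, -1],
      [0, 1, 1, 0],
      [0, 1, 1, 0],
      [-1, 0, 0, 1]]

def accept_probability(state):
    """<state| E |state> for a real 4-component state vector."""
    return sum(state[r] * E2[r][c] * state[c]
               for r in range(4) for c in range(4)) / 2

def real_product_state(theta, phi):
    """(a|0> + b|1>)(c|0> + d|1>) with a, b, c, d real and both factors normalized."""
    a, b = math.cos(theta), math.sin(theta)
    c, d = math.cos(phi), math.sin(phi)
    return [a * c, a * d, b * c, b * d]

rng = random.Random(2)
for _ in range(1000):
    psi = real_product_state(rng.uniform(0, 2 * math.pi),
                             rng.uniform(0, 2 * math.pi))
    assert abs(accept_probability(psi) - 0.5) < 1e-12

s = 1 / math.sqrt(2)
print(accept_probability([s, 0, 0, s]))   # (|00> + |11>)/sqrt(2)
print(accept_probability([s, 0, 0, -s]))  # (|00> - |11>)/sqrt(2)
```

Up to floating-point rounding, the two printed values should be 0 and 1, matching the "always rejects" and "always accepts" claims, while every sampled real product state sits at 1/2.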

Is it a coincidence that we used exactly the same 4×4 matrix (up to scaling) to produce a counterexample to the real-QM version of Local Tomography, and also to the real-QM version of the property I wanted for the paper with Guy?  Is anything ever a coincidence in this sort of discussion?

I claim that, looked at the right way, Local Tomography and the property I wanted are the same property, their truth in complex QM is the same truth, and their falsehood in real QM is the same falsehood.  Why?  Simply because Tr(Eρ), the probability that the measurement E accepts the mixed state ρ, is a function of two Hermitian matrices E and ρ (both of which can be either “product” or “entangled”), and—crucially—is symmetric under the interchange of E and ρ.

Now it’s time for another confession.  We’ve identified an elegant property of quantum mechanics that’s true but only because amplitudes are complex numbers: namely, if you know the probability that your quantum circuit accepts every product state, then you also know the probability that it accepts an arbitrary state.  Yet, despite its elegance, this property turns out to be nearly useless for “real-world applications” in quantum information and computing.  The reason for the uselessness is that, for the property to kick in, you really do need to know the probabilities on product states almost exactly—meaning (say) to 1/exp(n) accuracy for an n-qubit state.

Once again a simple example illustrates the point.  Suppose n is even, and suppose our measurement simply projects the n-qubit state onto a tensor product of n/2 Bell pairs.  Clearly, this measurement accepts every n-qubit product state with exponentially small probability, even as it accepts the entangled state 
$$\left(\frac{\left|00\right\rangle+\left|11\right\rangle}{\sqrt{2}}\right)^{\otimes n/2}$$

with probability 1.  But this implies that noticing the nontriviality on entangled states would require knowing the acceptance probabilities on product states to exponential accuracy.
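To put numbers on "exponentially small," here is a Python sketch of the Bell-pair measurement restricted to product states. Because both the measurement and the state factorize across pairs, the acceptance probability is a product of per-pair overlaps, each of which is at most 1/2 by Cauchy-Schwarz. The pairing of qubits (0,1), (2,3), and so on, the helper names, and the choice n=20 are illustrative assumptions of mine.

```python
import cmath
import math
import random

def random_qubit(rng):
    """A random normalized single-qubit state (alpha, beta), complex amplitudes."""
    theta = rng.uniform(0, math.pi / 2)
    phase = cmath.exp(1j * rng.uniform(0, 2 * math.pi))
    return (math.cos(theta), math.sin(theta) * phase)

def bell_overlap(q1, q2):
    """|<Phi+| q1 (x) q2>|^2 where Phi+ = (|00> + |11>)/sqrt(2)."""
    return abs(q1[0] * q2[0] + q1[1] * q2[1]) ** 2 / 2

def accept_probability(qubits):
    """Acceptance probability of a product state under projection onto n/2
    Bell pairs, with qubits (0,1), (2,3), ... paired up."""
    p = 1.0
    for i in range(0, len(qubits), 2):
        p *= bell_overlap(qubits[i], qubits[i + 1])
    return p

n = 20
rng = random.Random(3)
bound = 0.5 ** (n // 2)
best = max(accept_probability([random_qubit(rng) for _ in range(n)])
           for _ in range(2000))
all_zeros = accept_probability([(1, 0)] * n)
print(bound, best <= bound, all_zeros == bound)
```

The all-zeros state saturates the bound of 2^(-n/2), about 0.001 for n=20, and no sampled product state beats it. Meanwhile the entangled state in question is accepted with certainty.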

In a sense, then, I come back full circle to my original puzzlement: why should Local Tomography, or (alternatively) the-determination-of-a-circuit’s-behavior-on-arbitrary-states-from-its-behavior-on-product-states, have been important principles for Nature’s laws to satisfy?  Especially given that, in practice, the exponential accuracy required makes it difficult or impossible to exploit these principles anyway?  How could we have known a-priori that these principles would be important—if indeed they are important, and are not just mathematical spandrels?

But, while I remain less than 100% satisfied about “why the complex numbers? why not just the reals?,” there’s one conclusion that my recent circling-back to these questions has made me fully confident about.  Namely: quantum mechanics over the quaternions is a flaming garbage fire, which would’ve been rejected at an extremely early stage of God and the angels’ deliberations about how to construct our universe.

In the literature, when the question of “why not quaternionic amplitudes?” is discussed at all, you’ll typically read things about how the parameter-counting doesn’t quite work out (just like it doesn’t for real QM), or how the tensor product of quaternionic Hermitian matrices need not be Hermitian.  In this paper by McKague, you’ll read that the CHSH game is winnable with probability 1 in quaternionic QM, while in this paper by Fernandez and Schneeberger, you’ll read that the non-commutativity of the quaternions introduces an order-dependence even for spacelike-separated operations.

But none of that does justice to the enormity of the problem.  To put it bluntly: unless something clever is done to fix it, quaternionic QM allows superluminal signaling.  This is easy to demonstrate: suppose Alice holds a qubit in the state |1⟩, while Bob holds a qubit in the state |+⟩ (yes, this will work even for unentangled states!)  Also, let $$U=\left(\begin{array}[c]{cc}1 & 0\\0 & j\end{array}\right) ,~~~V=\left(\begin{array}[c]{cc}1 & 0\\0& i\end{array}\right).$$

We can calculate that, if Alice applies U to her qubit and then Bob applies V to his qubit, Bob will be left with the state $$ \frac{j \left|0\right\rangle + k \left|1\right\rangle}{\sqrt{2}}.$$

By contrast, if Alice decided to apply U only after Bob applied V, Bob would be left with the state 
$$ \frac{j \left|0\right\rangle - k \left|1\right\rangle}{\sqrt{2}}.$$

But Bob can distinguish these two states with certainty, for example by applying the unitary $$ \frac{1}{\sqrt{2}}\left(\begin{array}[c]{cc}j & k\\k & j\end{array}\right). $$

Therefore Alice communicated a bit to Bob.
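The whole signaling calculation fits in a few lines of Python. In this sketch I assume that operators act on amplitudes by left-multiplication, which is the convention that reproduces the states above. Since Alice's qubit stays in |1⟩ throughout, I only track Bob's two amplitudes, and I drop the factors of 1/√2 until the final probabilities. Quaternions are tuples (w, x, y, z), and all names are my own.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) = w + xi + yj + zk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qadd(p, q):
    return tuple(a + b for a, b in zip(p, q))

def norm_sq(q):
    return sum(c * c for c in q)

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

def apply_u(bob):
    """Alice applies U = diag(1, j); her qubit is |1>, so every term picks up j."""
    return [qmul(J, amp) for amp in bob]

def apply_v(bob):
    """Bob applies V = diag(1, i) to his own qubit."""
    return [bob[0], qmul(I, bob[1])]

def bob_outcome_probabilities(bob):
    """Bob applies (1/sqrt 2) [[j, k], [k, j]] and measures in the standard basis."""
    m = [[J, K], [K, J]]
    out = [qadd(qmul(m[r][0], bob[0]), qmul(m[r][1], bob[1])) for r in range(2)]
    # State and matrix each omit a factor 1/sqrt(2), hence the division by 4.
    return [norm_sq(amp) / 4 for amp in out]

plus = [ONE, ONE]                      # sqrt(2) * |+>
u_then_v = apply_v(apply_u(plus))      # j|0> + k|1>
v_then_u = apply_u(apply_v(plus))      # j|0> - k|1>
print(bob_outcome_probabilities(u_then_v))
print(bob_outcome_probabilities(v_then_u))
```

The first line prints `[1.0, 0.0]` and the second `[0.0, 1.0]`: Bob's measurement outcome reveals, with certainty, whether Alice acted before or after him.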

I’m aware that there’s a whole literature on quaternionic QM, including for example a book by Adler.  Would anyone who knows that literature be kind enough to enlighten us on how it proposes to escape the signaling problem?  Regardless of the answer, though, it seems worth knowing that the “naïve” version of quaternionic QM—i.e., the version that gets invoked in quantum information discussions like the ones I mentioned above—is just immediately blasted to smithereens by the signaling problem, without the need for any subtle considerations like the ones that differentiate real from complex QM.

Update (Dec. 20): In response to this post, Stephen Adler was kind enough to email me with further details about his quaternionic QM proposal, and to allow me to share them here. Briefly, Adler completely agrees that quaternionic QM inevitably leads to superluminal signaling—but in his proposal, the surprising and nontrivial part is that quaternionic QM would reduce to standard, complex QM at large distances. In particular, the strength of a superluminal signal would fall off exponentially with distance, quickly becoming negligible beyond the Planck or grand unification scales. Despite this, Adler says that he eventually abandoned his proposal for quaternionic QM, since he was unable to make specific particle physics ideas work out (but the quaternionic QM proposal then influenced his later work).

Unrelated Update (Dec. 18): Probably many of you have already seen it, and/or already know what it covers, but the NYT profile of Donald Knuth (entitled “The Yoda of Silicon Valley”) is enjoyable and nicely written.