## What happens when an unstoppable PR force hits an NP-hard problem? The answer’s getting clearer

January 16th, 2014

Update (Jan. 23): Daniel Lidar, one of the authors of the “Defining and detecting…” paper, was kind enough to email me his reactions to this post.  While he thought the post was generally a “very nice summary” of their paper, he pointed out one important oversight in my discussion.  Ironically, this oversight arose from my desire to bend over backwards to be generous to D-Wave!  Specifically, I claimed that there were maybe ~10% of randomly-chosen 512-qubit problem instances on which the D-Wave Two slightly outperformed the simulated annealing solver (compared to ~75% where simulated annealing outperformed the D-Wave Two), while also listing several reasons (such as the minimum annealing time, and the lack of any characterization of the “good” instances) why that “speedup” is likely to be entirely an artifact.  I obtained the ~10% and ~75% figures by eyeballing Figure 7 in the paper, and looking at which quantiles were just above and just below the 10^0 line when N=512.

However, I neglected to mention that even the slight “speedup” on ~10% of instances only appears when one looks at the “quantiles of ratio”: in other words, when one plots the probability distribution of [Simulated annealing time / D-Wave time] over all instances, and then looks at (say) the ~10% of the distribution that’s best for the D-Wave machine.  The slight speedup disappears when one looks at the “ratio of quantiles”: that is, when one (say) divides the amount of time that simulated annealing needs to solve its best 10% of instances, by the amount of time that the D-Wave machine needs to solve its best 10%.  And Rønnow et al. give arguments in their paper that ratio of quantiles is probably the more relevant performance comparison than quantiles of ratio.  (Incidentally, the slight speedup on a few instances also only appears for certain values of the parameter r, which controls how many possible settings there are for each coupling.  Apparently it appears for r=1, but disappears for r=3 and r=7—thereby heightening one’s suspicion that we’re dealing with an artifact of the minimum annealing time or something like that, rather than a genuine speedup.)
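
To make the distinction between the two statistics concrete, here is a minimal Python sketch.  The timings are invented for illustration (they are not data from the paper), and the helper names `quantile`, `quantile_of_ratio`, and `ratio_of_quantiles` are my own.  The toy data is rigged so that the one instance where the D-Wave-like solver wins is hard for simulated annealing but easy for the other solver.

```python
# Toy illustration only: all timings below are invented, not taken from the paper.

def quantile(values, q):
    """Return the q-th quantile (0 <= q <= 1) using linear interpolation."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac

def quantile_of_ratio(sa, dw, q):
    """Form the per-instance ratio SA time / D-Wave time, then take a quantile."""
    ratios = [s / d for s, d in zip(sa, dw)]
    return quantile(ratios, q)

def ratio_of_quantiles(sa, dw, q):
    """Take the same quantile of each solver's times separately, then divide."""
    return quantile(sa, q) / quantile(dw, q)

# Hypothetical solve times (arbitrary units) on the same ten instances.
sa_times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 40]            # simulated annealing
dw_times = [20, 18, 16, 14, 12, 10, 9, 8.5, 8.2, 8]   # D-Wave-like solver

# The slice of the ratio distribution that is best for the D-Wave-like solver.
best_slice_ratio = quantile_of_ratio(sa_times, dw_times, 0.9)

# Each solver's own easiest 10% of instances, compared against each other.
easiest_ratio = ratio_of_quantiles(sa_times, dw_times, 0.1)

print("quantile of ratio:", best_slice_ratio)
print("ratio of quantiles:", easiest_ratio)
```

With these made-up numbers the first statistic comes out above 1 (an apparent "speedup" on the best slice of instances) while the second comes out below 1, which is the qualitative pattern described above.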

There’s one other important point in the paper that I didn’t mention: namely, all the ratios of simulated annealing time to D-Wave time are normalized by 512/N, where N is the number of spins in the instance being tested.  In this way, one eliminates the advantages of the D-Wave machine that come purely from its parallelism (which has nothing whatsoever to do with “quantumness,” and which could easily skew things in D-Wave’s favor if not controlled for), while still not penalizing the D-Wave machine in absolute terms.
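
As a small sketch of that normalization, assuming "normalized by 512/N" means multiplying the raw time ratio by 512/N (the function and parameter names here are hypothetical, not from the paper):

```python
def normalized_ratio(sa_time, dw_time, n_spins, device_qubits=512):
    """Raw (SA time / D-Wave time) ratio multiplied by device_qubits / n_spins.

    Assumption: 'normalized by 512/N' is read as a multiplicative factor.
    """
    return (sa_time / dw_time) * (device_qubits / n_spins)

# At the full device size the factor is 1, so the ratio is left unchanged.
full_size = normalized_ratio(2.0, 1.0, n_spins=512)

# For a smaller instance, the same raw ratio is scaled up by 512/128 = 4.
quarter_size = normalized_ratio(2.0, 1.0, n_spins=128)

print(full_size, quarter_size)
```

Under this reading, the correction changes how the ratio scales with N while leaving the comparison at N=512 untouched.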

A few days ago, a group of nine authors (Troels Rønnow, Zhihui Wang, Joshua Job, Sergio Boixo, Sergei Isakov, David Wecker, John Martinis, Daniel Lidar, and Matthias Troyer) released their long-awaited arXiv preprint Defining and detecting quantum speedup, which contains the most thorough performance analysis of the D-Wave devices to date, and which seems to me to set a new standard of care for any future analyses along these lines.  Notable aspects of the paper: it uses data from the 512-qubit machine (a previous comparison had been dismissed by D-Wave’s supporters because it studied the 128-qubit model only); it concentrates explicitly from the beginning on comparisons of scaling behavior between the D-Wave devices and comparable classical algorithms, rather than getting “sidetracked” by other issues; and it includes authors from both USC and Google’s Quantum AI Lab, two places that have made large investments in D-Wave’s machines and have every reason to want to see them succeed.

Let me quote the abstract in full:

The development of small-scale digital and analog quantum devices raises the question of how to fairly assess and compare the computational power of classical and quantum devices, and of how to detect quantum speedup. Here we show how to define and measure quantum speedup in various scenarios, and how to avoid pitfalls that might mask or fake quantum speedup. We illustrate our discussion with data from a randomized benchmark test on a D-Wave Two device with up to 503 qubits. Comparing the performance of the device on random spin glass instances with limited precision to simulated classical and quantum annealers, we find no evidence of quantum speedup when the entire data set is considered, and obtain inconclusive results when comparing subsets of instances on an instance-by-instance basis. Our results for one particular benchmark do not rule out the possibility of speedup for other classes of problems and illustrate that quantum speedup is elusive and can depend on the question posed.

Since the paper is exceedingly well-written, and since I have maybe an hour before I’m called back to baby duty, my inclination is simply to ask people to RTFP rather than writing yet another long blog post.  But maybe there are four points worth calling attention to:

1. The paper finds, empirically, that the time needed to solve random size-N instances of the quadratic unconstrained binary optimization (QUBO) problem on D-Wave’s Chimera constraint graph seems to scale like exp(c√N) for some constant c—and that this is true regardless of whether one attacks the problem using the D-Wave Two, quantum Monte Carlo (i.e., a classical algorithm that tries to mimic the native physics of the machine), or an optimized classical simulated annealing code.  Notably, exp(c√N) is just what one would have predicted from theoretical arguments based on treewidth; and the constant c doesn’t appear to be better for the D-Wave Two than for simulated annealing.
2. The last sentence of the abstract (“Our results … do not rule out the possibility of speedup for other classes of problems”) is, of course, the reed on which D-Wave’s supporters will now have to hang their hopes.  But note that it’s unclear what experimental results could ever “rule out the possibility of speedup for other classes of problems.”  (No matter how many wrong predictions a psychic has made, the possibility remains that she’d be flawless at predicting the results of Croatian ping-pong tournaments…)  Furthermore, as with previous experiments, the instances tested all involved finding ground states for random coupling configurations of the D-Wave machine’s own architecture.  In other words, this was a set of instances where one might have thought, a priori, that the D-Wave machine would have an immense home-field advantage.  Thus, one really needs to look more closely, to see whether there’s any positive evidence for an asymptotic speedup by the D-Wave machine.
3. Here, for D-Wave supporters, the biggest crumb the paper throws is that, if one considers only the ~10% of instances on which the D-Wave machine does best, then the machine does do slightly better on those instances than simulated annealing does.  (Conversely, simulated annealing does better than the D-Wave machine on the ~75% of instances on which it does best.)  Unfortunately, no one seems to know how to characterize the instances on which the D-Wave machine will do best: one just has to try it and see what happens!  And of course, it’s extremely rare that two heuristic algorithms will succeed or fail on exactly the same set of instances: it’s much more likely that their performances will be correlated, but imperfectly.  So it’s unclear, at least to me, whether this finding represents anything other than the “noise” that would inevitably occur even if one classical algorithm were pitted against another one.
4. As the paper points out, there’s also a systematic effect that biases results in the D-Wave Two’s favor, if one isn’t careful.  Namely, the D-Wave Two has a minimum annealing time of 20 microseconds, which is often greater than the optimum annealing time, particularly for small instance sizes.  The effect of that is artificially to increase the D-Wave Two’s running time for small instances, and thereby make its scaling behavior look better than it really is.  The authors say they don’t know whether even the D-Wave Two’s apparent advantage for its “top 10% of instances” will persist after this effect is fully accounted for.
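
To see why inflating the running time on small instances can flatter the apparent scaling (point 4), here is a crude toy model in Python.  Everything in it is an assumption for illustration: the "true" time is taken to be exactly exp(c√N) with a made-up c, and the minimum annealing time is modeled as a simple floor on the running time, which is much cruder than the paper's actual analysis.

```python
import math

def fit_sqrt_exponent(sizes, times):
    """Least-squares slope c in the model log(T) = a + c * sqrt(N)."""
    xs = [math.sqrt(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

TRUE_C = 0.3      # hypothetical scaling constant
MIN_TIME = 20.0   # hypothetical floor, standing in for a minimum annealing time

sizes = [8, 32, 72, 128, 200, 288, 392, 512]   # illustrative instance sizes

# Idealized running times that scale exactly like exp(c * sqrt(N)).
true_times = [math.exp(TRUE_C * math.sqrt(n)) for n in sizes]

# The same times, but never allowed to drop below the floor.
floored_times = [max(t, MIN_TIME) for t in true_times]

c_true = fit_sqrt_exponent(sizes, true_times)
c_floored = fit_sqrt_exponent(sizes, floored_times)

print("fitted c without floor:", c_true)
print("fitted c with floor:   ", c_floored)
```

In this toy, the floor raises only the small-N times, so the fitted exponent comes out smaller than the true one: the scaling looks better than it really is, even though nothing about the large-N behavior has changed.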

Those seeking something less technical might want to check out an excellent recent article in Inc. by Will Bourne, entitled “D-Wave’s dream machine” (“D-Wave thinks it has built the first commercial quantum computer.  Mother Nature has other ideas”).  Wisely, Bourne chose not to mention me at all in this piece.  Instead, he gradually builds a skeptical case almost entirely on quotes from people like Seth Lloyd and Daniel Lidar, who one might have thought would be more open to D-Wave’s claims.  Bourne’s piece illustrates that it is possible for the mainstream press to get the D-Wave story pretty much right, and that you don’t even need a physics background to do so: all you need is a willingness to commit journalism.

Oh.  I’d be remiss not to mention that, in the few days between the appearance of this paper and my having a chance to write this post, two other preprints of likely interest to the Shtetl-Optimized commentariat showed up on quant-ph.  The first, by a large list of authors mostly from D-Wave, is called Entanglement in a quantum annealing processor.  This paper presents evidence for a point that many skeptics (including me) had been willing to grant for some time: namely, that the states generated by the D-Wave machines contain some nonzero amount of entanglement.  (Note that, because of a technical property called “stoquasticity,” such entanglement is entirely compatible with the machines continuing to be efficiently simulable on a classical computer using Quantum Monte Carlo.)  While it doesn’t address the performance question at all, this paper seems like a perfectly fine piece of science.

From the opposite side of the (eigen)spectrum comes the latest preprint by QC skeptic Michel Dyakonov, entitled Prospects for quantum computing: Extremely doubtful.  Ironically, Dyakonov and D-Wave seem to agree completely about the irrelevance of fault-tolerance and other insights from quantum computing theory.  It’s just that D-Wave thinks QC can work even without the theoretical insights, whereas Dyakonov thinks that QC can’t work even with the insights.  Unless I missed it, there’s no new scientific content in Dyakonov’s article.  It’s basically a summary of some simple facts about QC and quantum fault-tolerance, accompanied by sneering asides about how complicated and implausible it all sounds, and how detached from reality the theorists are.

And as for the obvious comparisons to previous “complicated and implausible” technologies, like (say) classical computing, or heavier-than-air flight, or controlled nuclear fission?  Dyakonov says that such comparisons are invalid, because they ignore the many technologies proposed in previous eras that didn’t work.  What’s striking is how little he seems to care about why the previous technologies failed: was it because they violated clearly-articulated laws of physics?  Or because there turned out to be better ways to do the same things?  Or because the technologies were simply too hard, too expensive, or too far ahead of their time?  Supposing QC to be impossible, which of those is the reason for the impossibility?  Since we’re not asking about something “arbitrary” here (like teaching a donkey to read), but rather about the computational power of Nature itself, isn’t it of immense scientific interest to know the reason for QC’s impossibility?  How does Dyakonov propose to learn the reason, assuming he concedes that he doesn’t already know it?

(As I’ve said many times, I’d support even the experiments that D-Wave was doing, if D-Wave and its supporters would only call them for what they were: experiments.  Forays into the unknown.  Attempts to find out what happens when a particular speculative approach is thrown at NP-hard optimization problems.  It’s only when people obfuscate the results of those experiments, in order to claim something as “commercially useful” that quite obviously isn’t yet, that they leave the realm of science, and indeed walk straight into the eager jaws of skeptics like Dyakonov.)

Anyway, since we seem to have circled back to D-Wave, I’d like to end this post by announcing my second retirement as Chief D-Wave Skeptic.  The first time I retired, it was because I mistakenly thought that D-Wave had fundamentally changed, and would put science ahead of PR from that point forward.  (The truth seems to be that there were, and are, individuals at D-Wave committed to science, but others who remain PR-focused.)  This time, I’m retiring for a different reason: because scientists like the authors of the “Defining and detecting” preprint, and journalists like Will Bourne, are doing my job better than I ever did it.  If the D-Wave debate were the American Civil War, then my role would be that of the frothy-mouthed abolitionist pamphleteer: someone who repeats over and over points that are fundamentally true, but in a strident manner that serves only to alienate fence-sitters and allies.  As I played my ineffective broken record, the Wave Power simply moved from one triumph to another, expanding its reach to Google, NASA, Lockheed Martin, and beyond.  I must have looked like a lonely loon on the wrong side of history.

But today the situation is different.  Today Honest Abe and his generals (Honest Matthias and his coauthors?) are meeting the Wave Power on the battlefield of careful performance comparisons against Quantum Monte Carlo and simulated annealing.  And while the battles might continue all the way to 2000 qubits or beyond, the results so far are not looking great for the Wave Power.  The intractability of NP-complete problems—that which we useless, ivory-tower theorists had prophesied years ago, to much derision and laughter—would seem to be rearing its head.  So, now that the bombs are bursting and the spins decohering in midair, what is there for a gun-shy pamphleteer like myself to do but sit back and watch it all play out?

Well, and maybe blog about it occasionally.  But not as “Chief Skeptic,” just as another interested observer.

## BosonSampling Lecture Notes from Rio

December 28th, 2013

Update (January 3): There’s now a long interview with me about quantum computing in the Washington Post (or at least, on their website).  The interview accompanies their lead article about quantum computing and the NSA, which also quotes me (among many others), and which reports—unsurprisingly—that the NSA is indeed interested in building scalable quantum computers but, based on the Snowden documents, appears to be quite far from that goal.

(Warning: The interview contains a large number of typos and other errors, which might have arisen from my infelicities in speaking or the poor quality of the phone connection.  Some were corrected but others remain.)

The week before last, I was in Rio de Janeiro to give a mini-course on “Complexity Theory and Quantum Optics” at the Instituto de Física of the Universidade Federal Fluminense.  Next week I’ll be giving a similar course at the Jerusalem Winter School on Quantum Information.

In the meantime, my host in Rio, Ernesto Galvão, and others were kind enough to make detailed, excellent notes for my five lectures in Rio.  You can click the link in the last sentence to get them, or here are links for the five lectures individually:

If you have questions or comments about the lectures, leave them here (since I might not check the quantumrio blog).

One other thing: I can heartily recommend a trip to Rio to anyone interested in quantum information—or, for that matter, to anyone interested in sunshine, giant Jesus statues, or (especially) fruit juices you’ve never tasted before.  My favorite from among the latter was acerola.  Also worth a try are caja, mangaba, guarana, umbu, seriguela, amora, and fruta do conde juices—as well as caju and cacao, even though they taste almost nothing like the more commercially exportable products from the same plants (cashews and chocolate respectively).  I didn’t like cupuaçu or graviola juices.  Thanks so much to Ernesto and everyone else for inviting me (not just because of the juice).

Update (January 2): You can now watch videos of my mini-course at the Jerusalem Winter School here.

Videos of the other talks at the Jerusalem Winter School are available from the same site (just scroll through them on the right).

## Merry Christmas! My quantum computing research explained, using only the 1000 most common English words

December 24th, 2013

[With special thanks to the Up-Goer Five Text Editor, which was inspired by this xkcd]

I study computers that would work in a different way than any computer that we have today.  These computers would be very small, and they would use facts about the world that are not well known to us from day to day life.  No one has built one of these computers yet—at least, we don’t think they have!—but we can still reason about what they could do for us if we did build them.

How would these new computers work? Well, when you go small enough, you find that, in order to figure out what the chance is that something will happen, you need to both add and take away a whole lot of numbers—one number for each possible way that the thing could happen, in fact. What’s interesting is, this means that the different ways a thing could happen can “kill each other out,” so that the thing never happens at all! I know it sounds weird, but the world of very small things has been known to work that way for almost a hundred years.

So, with the new kind of computer, the idea is to make the different ways each wrong answer could be reached kill each other out (with some of them “pointing” in one direction, some “pointing” in another direction), while the different ways that the right answer could be reached all point in more or less the same direction. If you can get that to happen, then when you finally look at the computer, you’ll find that there’s a very good chance that you’ll see the right answer. And if you don’t see the right answer, then you can just run the computer again until you do.

For some problems—like breaking a big number into its smallest parts (say, 43259 = 181 × 239)—we’ve learned that the new computers would be much, much faster than we think any of today’s computers could ever be. For other problems, however, the new computers don’t look like they’d be faster at all. So a big part of my work is trying to figure out for which problems the new computers would be faster, and for which problems they wouldn’t be.

You might wonder, why is it so hard to build these new computers? Why don’t we have them already? This part is a little hard to explain using the words I’m allowed, but let me try. It turns out that the new computers would very easily break. In fact, if the bits in such a computer were to “get out” in any way—that is, to work themselves into the air in the surrounding room, or whatever—then you could quickly lose everything about the new computer that makes it faster than today’s computers. For this reason, if you’re building the new kind of computer, you have to keep it very, very carefully away from anything that could cause it to lose its state—but then at the same time, you do have to touch the computer, to make it do the steps that will eventually give you the right answer. And no one knows how to do all of this yet. So far, people have only been able to use the new computers for very small checks, like breaking 15 into 3 × 5. But people are working very hard today on figuring out how to do bigger things with the new kind of computer.

In fact, building the new kind of computer is so hard, that some people even believe it won’t be possible! But my answer to them is simple. If it’s not possible, then that’s even more interesting to me than if it is possible! And either way, the only way I know to find out the truth is to try it and see what happens.

Sometimes, people pretend that they already built one of these computers even though they didn’t. Or they say things about what the computers could do that aren’t true. I have to admit that, even though I don’t really enjoy it, I do spend a lot of my time these days writing about why those people are wrong.

Oh, one other thing. Not long from now, it might be possible to build computers that don’t do everything that the new computers could eventually do, but that at least do some of it. Like, maybe we could use nothing but light and mirrors to answer questions that, while not important in and of themselves, are still hard to answer using today’s computers. That would at least show that we can do something that’s hard for today’s computers, and it could be a step along the way to the new computers. Anyway, that’s what a lot of my own work has been about for the past four years or so.

Besides the new kind of computers, I’m also interested in understanding what today’s computers can and can’t do. The biggest open problem about today’s computers could be put this way: if a computer can check an answer to a problem in a short time, then can a computer also find an answer in a short time? Almost all of us think that the answer is no, but no one knows how to show it. Six years ago, another guy and I figured out one of the reasons why this question is so hard to answer: that is, why the ideas that we already know don’t work.

Anyway, I have to go to dinner now. I hope you enjoyed this little piece about the kind of stuff that I work on.

## Luke Muehlhauser interviews me about philosophical progress

December 14th, 2013

I’m shipping out today to sunny Rio de Janeiro, where I’ll be giving a weeklong course about BosonSampling, at the invitation of Ernesto Galvão.  Then it’s on to Pennsylvania (where I’ll celebrate Christmas Eve with old family friends), Israel (where I’ll drop off Dana and Lily with Dana’s family in Tel Aviv, then lecture at the Jerusalem Winter School in Theoretical Physics), Puerto Rico (where I’ll speak at the FQXi conference on Physics of Information), back to Israel, and then New York before returning to Boston at the beginning of February.  Given this travel schedule, it’s possible that blogging will be even lighter than usual for the next month and a half (or not—we’ll see).

In the meantime, however, I’ve got the equivalent of at least five new blog posts to tide over Shtetl-Optimized fans.  Luke Muehlhauser, the Executive Director of the Machine Intelligence Research Institute (formerly the Singularity Institute for Artificial Intelligence), did an in-depth interview with me about “philosophical progress,” in which he prodded me to expand on certain comments in Why Philosophers Should Care About Computational Complexity and The Ghost in the Quantum Turing Machine.  Here are (abridged versions of) Luke’s five questions:

1. Why are you so interested in philosophy? And what is the social value of philosophy, from your perspective?

2. What are some of your favorite examples of illuminating Q-primes [i.e., scientifically-addressable pieces of big philosophical questions] that were solved within your own field, theoretical computer science?

3. Do you wish philosophy-the-field would be reformed in certain ways? Would you like to see more crosstalk between disciplines about philosophical issues? Do you think that, as Clark Glymour suggested, philosophy departments should be defunded unless they produce work that is directly useful to other fields … ?

4. Suppose a mathematically and analytically skilled student wanted to make progress, in roughly the way you describe, on the Big Questions of philosophy. What would you recommend they study? What should they read to be inspired? What skills should they develop? Where should they go to study?

5. Which object-level thinking tactics … do you use in your own theoretical (especially philosophical) research?  Are there tactics you suspect might be helpful, which you haven’t yet used much yourself?

PS. In case you missed it before, Quantum Computing Since Democritus was chosen by Scientific American blogger Jennifer Ouellette (via the “Time Lord,” Sean Carroll) as the top physics book of 2013.  Woohoo!!

## 23, Me, and the Right to Misinterpret Probabilities

December 11th, 2013

If you’re the sort of person who reads this blog, you may have heard that 23andMe—the company that (until recently) let anyone spit into a capsule, send it away to a DNA lab, and then learn basic information about their ancestry, disease risks, etc.—has suspended much of its service, on orders from the US Food and Drug Administration.  As I understand it, on Nov. 25, the FDA ordered 23andMe to stop marketing to new customers (though it can still serve existing customers), and on Dec. 5, the company stopped offering new health-related information to any customers (though you can still access the health information you had before, and ancestry and other non-health information is unaffected).

Of course, the impact of these developments is broader: within a couple weeks, “do-it-yourself genomics” has gone from an industry whose explosive growth lots of commentators took as a given, to one whose future looks severely in doubt (at least in the US).

The FDA gave the reasons for its order in a letter to Anne Wojcicki, 23andMe’s CEO.  Excerpts:

For instance, if the BRCA-related risk assessment for breast or ovarian cancer reports a false positive, it could lead a patient to undergo prophylactic surgery, chemoprevention, intensive screening, or other morbidity-inducing actions, while a false negative could result in a failure to recognize an actual risk that may exist.  Assessments for drug responses carry the risks that patients relying on such tests may begin to self-manage their treatments through dose changes or even abandon certain therapies depending on the outcome of the assessment.  For example, false genotype results for your warfarin drug response test could have significant unreasonable risk of illness, injury, or death to the patient due to thrombosis or bleeding events that occur from treatment with a drug at a dose that does not provide the appropriately calibrated anticoagulant effect …  The risk of serious injury or death is known to be high when patients are either non-compliant or not properly dosed; combined with the risk that a direct-to-consumer test result may be used by a patient to self-manage, serious concerns are raised if test results are not adequately understood by patients or if incorrect test results are reported.

To clarify, the DNA labs that 23andMe uses are already government-regulated.  Thus, the question at issue here is not whether, if 23andMe claims (say) that you have CG instead of CC at some particular locus, the information is reliable.  Rather, the question is whether 23andMe should be allowed to tell you that fact, while also telling you that a recent research paper found that people with CG have a 10.4% probability of developing Alzheimer’s disease, as compared to a 7.2% base rate.  More bluntly, the question is whether ordinary schmoes ought to be trusted to learn such facts about themselves, without a doctor as an intermediary to interpret the results for them, or perhaps to decide that there’s no good reason for the patient to know at all.

Among medical experts, a common attitude seems to be something like this: sure, getting access to your own genetic data is harmless fun, as long as you’re an overeducated nerd who just wants to satisfy his or her intellectual curiosity (or perhaps narcissism).  But 23andMe crossed a crucial line when it started marketing its service to the hoi polloi, as something that could genuinely tell them about health risks.  Most people don’t understand probability, and are incapable of parsing “based on certain gene variants we found, your chances of developing diabetes are about 6 times higher than the baseline” as anything other than “you will develop diabetes.”  Nor, just as worryingly, are they able to parse “your chances are lower than the baseline” as anything other than “you won’t develop diabetes.”
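
For concreteness, here is the arithmetic behind statements like these, as a small Python sketch.  The 10.4% and 7.2% figures are the illustrative Alzheimer’s numbers used earlier in this post; the 2% diabetes baseline is purely hypothetical, chosen only to show what "6 times higher" could look like in absolute terms.

```python
def describe_risk(baseline, conditional):
    """Return (relative risk, absolute increase) for two probabilities."""
    relative = conditional / baseline
    absolute_increase = conditional - baseline
    return relative, absolute_increase

# Illustrative figures from the post: 7.2% base rate vs. 10.4% with the variant.
rel, abs_inc = describe_risk(0.072, 0.104)
print(f"relative risk: {rel:.2f}x, absolute increase: {abs_inc * 100:.1f} points")

# Hypothetical baseline (an assumption, not a real statistic): 2% risk.
hypothetical_baseline = 0.02
six_fold_risk = 6 * hypothetical_baseline
chance_of_not_developing = 1 - six_fold_risk
print(f"6x a {hypothetical_baseline:.0%} baseline is {six_fold_risk:.0%}, "
      f"so {chance_of_not_developing:.0%} would still not develop the condition")
```

The point of the sketch is only that a large relative risk can coexist with a modest absolute probability, which is exactly the distinction that "you will develop diabetes" erases.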

I understand this argument.  Nevertheless, I find it completely inconsistent with a free society.  Moreover, I predict that in the future, the FDA’s current stance will be looked back upon as an outrage, with the subtleties in the FDA’s position mattering about as much as the subtleties in the Church’s position toward Galileo (“look, Mr. G., it’s fine to discuss heliocentrism among your fellow astronomers, as a hypothesis or a calculational tool—just don’t write books telling the general public that heliocentrism is literally true, and that they should change their worldviews as a result!”).  That’s why I signed this petition asking the FDA to reconsider its decision, and I encourage you to sign it too.

Here are some comments that might help clarify my views:

(1) I signed up for 23andMe a few years ago, as did the rest of my family.  The information I gained from it wasn’t exactly earth-shattering: I learned, for example, that my eyes are probably blue, that my ancestry is mostly Ashkenazi, that there’s a risk my eyesight will further deteriorate as I age (the same thing a succession of ophthalmologists told me), that I can’t taste the bitter flavor in brussels sprouts, and that I’m an “unlikely sprinter.”  On the other hand, seeing exactly which gene variants correlate with these things, and how they compare to the variants my parents and brother have, was … cool.  It felt like I imagine it must have felt to buy a personal computer in 1975.  In addition, I found nothing the slightest bit dishonest about the way the results were reported.  Each result was stated explicitly in terms of probabilities—giving both the baseline rate for each condition, and the rate conditioned on having such-and-such gene variant—and there were even links to the original research papers if I wanted to read them myself.  I only wish that I got half as much context and detail from conventional doctor visits—or for that matter, from most materials I’ve read from the FDA itself.  (When Dana was pregnant, I was pleasantly surprised when some of the tests she underwent came back with explicit probabilities and base rates.  I remember wishing doctors would give me that kind of information more often.)

(2) From my limited reading and experience, I think it’s entirely possible that do-it-yourself genetic testing is overhyped; that it won’t live up to its most fervent advocates’ promises; that for most interesting traits there are just too many genes involved, via too many labyrinthine pathways, to make terribly useful predictions about individuals, etc.  So it’s important to me that, in deciding whether what 23andMe does should be legal, we’re not being asked to decide any of these complicated questions!  We’re only being asked whether the FDA should get to decide the answers in advance.

(3) As regular readers will know, I’m far from a doctrinaire libertarian.  Thus, my opposition to shutting down 23andMe is not at all a corollary of reflexive opposition to any government regulation of anything.  In fact, I’d be fine if the FDA wanted to insert a warning message on 23andMe (in addition to the warnings 23andMe already provides), emphasizing that genetic tests only provide crude statistical information, that they need to be interpreted with care, that you should consult your doctor before doing anything based on these results, etc.  But when it comes to banning access to the results, I have trouble with some of the obvious slippery slopes.  E.g., what happens when some Chinese or Russian company launches a competing service?  Do we ban Americans from mailing their saliva overseas?  What happens when individuals become able just to sequence their entire genomes, and store and analyze them on their laptops?  Do we ban the sequencing technology?  Or do we just ban software that makes it easy enough to analyze the results?  If the software is hard enough to use, so only professional biologists use it, does that make it OK again?  Also, if the FDA will be in the business of banning genomic data analysis tools, then what about medical books?  For that matter, what about any books or websites, of any kind, that might cause someone to make a poor medical decision?  What would such a policy, if applied consistently, do to the multibillion-dollar alternative medicine industry?

(4) I don’t understand the history of 23andMe’s interactions with the FDA.  From what I’ve read, though, they have been communicating for five years, with everything 23andMe has said in public sounding conciliatory rather than defiant (though the FDA has accused 23andMe of being tardy with its responses).  Apparently, the key problem is simply that the FDA hasn’t yet developed a regulatory policy specifically for direct-to-consumer genetic tests.  It’s been considering such a policy for years—but in the meantime, it believes no one should be marketing such tests for health purposes before a policy exists.  Alas, there are very few cases where I’d feel inclined to support a government in saying: “X is a new technology that lots of people are excited about.  However, our regulatory policies haven’t yet caught up to X.  Therefore, our decision is that X is banned, until and unless we figure out how to regulate it.”  Maybe I could support such a policy, if X had the potential to level cities and kill millions.  But when it comes to consumer DNA tests, this sort of preemptive banning seems purposefully designed to give wet dreams to Ayn Rand fans.

(5) I confess that, despite everything I’ve said, my moral intuitions might be different if dead bodies were piling up because of terrible 23andMe-inspired medical decisions.  But as far as I know, there’s no evidence so far that even a single person was harmed.  Which isn’t so surprising: after all, people might run to their doctor terrified about something they learned on 23andMe, but no sane doctor would ever make a decision solely on that basis, without ordering further tests.

## Twenty Reasons to Believe Oswald Acted Alone

December 2nd, 2013

As the world marked the 50th anniversary of the JFK assassination, I have to confess … no, no, not that I was in on the plot.  I wasn’t even born then, silly.  I have to confess that, in between struggling to make a paper deadline, attending a workshop in Princeton, celebrating Thanksgivukkah, teaching Lily how to pat her head and clap her hands, and not blogging, I also started dipping, for the first time in my life, into a tiny fraction of the vast literature about the JFK assassination.  The trigger (so to speak) for me was this article by David Talbot, the founder of Salon.com.  I figured, if the founder of Salon is a JFK conspiracy buff—if, for crying out loud, my skeptical heroes Bertrand Russell and Carl Sagan were both JFK conspiracy buffs—then maybe it’s at least worth familiarizing myself with the basic facts and arguments.

So, what happened when I did?  Were the scales peeled from my eyes?

In a sense, yes, they were.  Given how much has been written about this subject, and how many intelligent people take seriously the possibility of a conspiracy, I was shocked by how compelling I found the evidence to be that there were exactly three shots, all fired by Lee Harvey Oswald with a Carcano rifle from the sixth floor of the Texas School Book Depository, just as the Warren Commission said in 1964.  And as for Oswald’s motives, I think I understand them as well and as poorly as I understand the motives of the people who send me ramblings every week about P vs. NP and the secrets of the universe.

Before I started reading, if someone forced me to guess, maybe I would’ve assigned a ~10% probability to some sort of conspiracy.  Now, though, I’d place the JFK conspiracy hypothesis firmly in Moon-landings-were-faked, Twin-Towers-collapsed-from-the-inside territory.  Or to put it differently, “Oswald as lone, crazed assassin” has been added to my large class of “sanity-complete” propositions: propositions defined by the property that if I doubt any one of them, then there’s scarcely any part of the historical record that I shouldn’t doubt.  (And while one can’t exclude the possibility that Oswald confided in someone else before the act—his wife or a friend, for example—and that other person kept it a secret for 50 years, what’s known about Oswald strongly suggests that he didn’t.)

So, what convinced me?  In this post, I’ll give twenty reasons for believing that Oswald acted alone.  Notably, my reasons will have less to do with the minutiae of bullet angles and autopsy reports, than with general principles for deciding what’s true and what isn’t.  Of course, part of the reason for this focus is that the minutiae are debated in unbelievable detail elsewhere, and I have nothing further to contribute to those debates.  But another reason is that I’m skeptical that anyone actually comes to believe the JFK conspiracy hypothesis because they don’t see how the second bullet came in at the appropriate angle to pass through JFK’s neck and shoulder and then hit Governor Connally.  Clear up some technical point (or ten or fifty of them)—as has been done over and over—and the believers will simply claim that the data you used was altered by the CIA, or they’ll switch to other “anomalies” without batting an eye.  Instead, people start with certain general beliefs about how the world works, “who’s really in charge,” what sorts of explanations to look for, etc., and then use their general beliefs to decide which claims to accept about JFK’s head wounds or the foliage in Dealey Plaza—not vice versa.  That being so, one might as well just discuss the general beliefs from the outset.  So without further ado, here are my twenty reasons:

1. Conspiracy theorizing represents a known bug in the human nervous system.  Given that, I think our prior should be overwhelmingly against anything that even looks like a conspiracy theory.  (This is not to say conspiracies never happen.  Of course they do: Watergate, the Tobacco Institute, and the Nazi Final Solution were three well-known examples.  But the difference between conspiracy theorists’ fantasies and actual known conspiracies is this: in a conspiracy theory, some powerful organization’s public face hides a dark and terrible secret; its true mission is the opposite of its stated one.  By contrast, in every real conspiracy I can think of, the facade was already 90% as terrible as the reality!  And the “dark secret” was that the organization was doing precisely what you’d expect it to do, if its members genuinely held the beliefs that they claimed to hold.)

2. The shooting of Oswald by Jack Ruby created the perfect conditions for conspiracy theorizing to fester.  Conditioned on that happening, it would be astonishing if a conspiracy industry hadn’t arisen, with its hundreds of books and labyrinthine arguments, even under the assumption that Oswald and Ruby both really acted alone.

3. Other high-profile assassinations to which we might compare this one—for example, those of Lincoln, Garfield, McKinley, RFK, Martin Luther King Jr., Gandhi, Yitzchak Rabin…—appear to have been the work of “lone nuts,” or at most “conspiracies” of small numbers of lowlifes.  So why not this one?

4. Oswald seems to have perfectly fit the profile of a psychopathic killer (see, for example, Case Closed by Gerald Posner).  From very early in his life, Oswald exhibited grandiosity, resentment, lack of remorse, doctrinaire ideological fixations, and obsession with how he’d be remembered by history.

5. A half-century of investigation has failed to link any individual besides Oswald to the crime.  Conspiracy theorists love to throw around large, complicated entities like the CIA or the Mafia as potential “conspirators”—but in the rare cases when they’ve tried to go further, and implicate an actual human being other than Oswald or Ruby (or distant power figures like LBJ), the results have been pathetic and tragic.

6. Oswald had previously tried to assassinate General Walker—a fact that was confirmed by his widow Marina Oswald, but that, incredibly, is barely even discussed in the reams of conspiracy literature.

7. There’s clear evidence that Oswald murdered Officer Tippit an hour after shooting JFK—a fact that seems perfectly consistent with the state of mind of someone who’d just murdered the President, but that, again, seems to get remarkably little discussion in the conspiracy literature.

8. Besides being a violent nut, Oswald was also a known pathological liar.  He lied on his employment applications, he lied about having established a thriving New Orleans branch of Fair Play for Cuba, he lied and lied and lied.  Because of this tendency—as well as his persecution complex—Oswald’s loud protestations after his arrest that he was just a “patsy” count for almost nothing.

9. According to police accounts, Oswald acted snide and proud of himself after being taken into custody: for example, when asked whether he had killed the President, he replied “you find out for yourself.”  He certainly didn’t act like an innocent “patsy” arrested on such a grave charge would plausibly act.

10. Almost all JFK conspiracy theories must be false, simply because they’re mutually inconsistent.  Once you realize that, and start judging the competing conspiracy theories by the standards you’d have to judge them by if at most one could be true, enlightenment may dawn as you find there’s nothing in the way of just rejecting all of them.  (Of course, some people have gone through an analogous process with religions.)

11. The case for Oswald as lone assassin seems to become stronger, the more you focus on the physical evidence and stuff that happened right around the time and place of the event.  To an astonishing degree, the case for a conspiracy seems to rely on verbal testimony years or decades afterward—often by people who are known confabulators, who were nowhere near Dealey Plaza at the time, who have financial or revenge reasons to invent stories, and who “remembered” seeing Oswald and Ruby with CIA agents, etc. only under drugs or hypnosis.  This is precisely the pattern we would expect if conspiracy theorizing reflected the reality of the human nervous system rather than the reality of the assassination.

12. If the conspiracy is so powerful, why didn’t it do something more impressive than just assassinate JFK? Why didn’t it rig the election to prevent JFK from becoming President in the first place?  (In math, very often the way you discover a bug in your argument is by realizing that the argument gives you more than you originally intended—vastly, implausibly more.  Yet every pro-conspiracy argument I’ve read seems to suffer from the same problem.  For example, after successfully killing JFK, did the conspiracy simply disband?  Or did it go on to mastermind other assassinations?  If it didn’t, why not?  Isn’t pulling the puppet-strings of the world sort of an ongoing proposition?  What, if any, are the limits to this conspiracy’s power?)

13. Pretty much all the conspiracy writers I encountered exude total, 100% confidence, not only in the existence of additional shooters, but in the guilt of their favored villains (they might profess ignorance, but then in the very next sentence they’d talk about how JFK’s murder was “a triumph for the national security establishment”).  For me, their confidence had the effect of weakening my own confidence in their intellectual honesty, and in any aspects of their arguments that I had to take on faith.  The conspiracy camp would of course reply that the “Oswald acted alone” camp also exudes too much confidence in its position.  But the two cases are not symmetric: for one thing, because there are so many different conspiracy theories, but only one Oswald.  If I were a conspiracy believer I’d be racked with doubts, if nothing else then about whether my conspiracy was the right one.

14. Every conspiracy theory I’ve encountered seems to require “uncontrolled growth” in size and complexity: that is, the numbers of additional shooters, alterations of medical records, murders of inconvenient witnesses, coverups, coverups of the coverups, etc. that need to be postulated all seem to multiply without bound.  To some conspiracy believers, this uncontrolled growth might actually be a feature: the more nefarious and far-reaching the conspiracy’s tentacles, the better.  It should go without saying that I regard it as a bug.

15. JFK was not a liberal Messiah.  He moved slowly on civil rights for fear of a conservative backlash, invested heavily in building nukes, signed off on the botched plans to kill Fidel Castro, and helped lay the groundwork for the US’s later involvement in Vietnam.  Yes, it’s possible that he would’ve made wiser decisions about Vietnam than LBJ ended up making; that’s part of what makes his assassination (like RFK’s later assassination) a tragedy.  But many conspiracy theorists’ view of JFK as an implacable enemy of the military-industrial complex is preposterous.

16. By the same token, LBJ was not exactly a right-wing conspirator’s dream candidate.  He was, if anything, more aggressive on poverty and civil rights than JFK was.  And even if he did end up being better for certain military contractors, that’s not something that would’ve been easy to predict in 1963, when the US’s involvement in Vietnam had barely started.

17. Lots of politically-powerful figures have gone on the record as believers in a conspiracy, including John Kerry, numerous members of Congress, and even frequently-accused conspirator LBJ himself.  Some people would say that this lends credibility to the conspiracy cause.  To me, however, it indicates just the opposite: that there’s no secret cabal running the world, and that those in power are just as prone to bugs in the human nervous system as anyone else is.

18. As far as I can tell, the conspiracy theorists are absolutely correct that JFK’s security in Dallas was unbelievably poor; that the Warren Commission was as interested in reassuring the nation and preventing a war with the USSR or Cuba as it was in reaching the truth (the fact that it did reach the truth is almost incidental); and that agencies like the CIA and FBI kept records related to the assassination classified for way longer than there was any legitimate reason to (though note that most records finally were declassified in the 1990s, and they provided zero evidence for any conspiracy).  As you might guess, I ascribe all of these things to bureaucratic incompetence rather than to conspiratorial ultra-competence.  But once again, these government screwups help us understand how so many intelligent people could come to believe in a conspiracy even in the total absence of one.

19. In the context of the time, the belief that JFK was killed by a conspiracy filled a particular need: namely, the need to believe that the confusing, turbulent events of the 1960s had an understandable guiding motive behind them, and that a great man like JFK could only be brought down by an equally-great evil, rather than by a chronically-unemployed loser who happened to see on a map that JFK’s motorcade would be passing by his workplace.  Ironically, I think that Roger Ebert got it exactly right when he praised Oliver Stone’s JFK movie for its “emotional truth.”  In much the same way, one could say that Birth of a Nation was “emotionally true” for Southern racists, or that Ben Stein’s Expelled was “emotionally true” for creationists.  Again, I’d say that the “emotional truth” of the conspiracy hypothesis is further evidence for its factual falsehood: for it explains how so many people could come to believe in a conspiracy even if the evidence for one were dirt-poor.

20. At its core, every conspiracy argument seems to be built out of “holes”: “the details that don’t add up in the official account,” “the questions that haven’t been answered,” etc.  What I’ve never found is a truly coherent alternative scenario: just one “hole” after another.  This pattern is the single most important red flag for me, because it suggests that the JFK conspiracy theorists view themselves as basically defense attorneys: people who only need to sow enough doubts, rather than establish the reality of what happened.  Crucially, creationism, 9/11 trutherism, and every other elaborate-yet-totally-wrong intellectual edifice I’ve ever encountered has operated on precisely the same “defense attorney principle”: “if we can just raise enough doubts about the other side’s case, we win!”  But that’s a terrible approach to knowledge, once you’ve seen firsthand how a skilled arguer can raise unlimited doubts even about the nonexistence of a monster under your bed.  Such arguers are hoping, of course, that you’ll find their monster hypothesis so much more fun, exciting, and ironically comforting than the “random sounds in the night hypothesis,” that it won’t even occur to you to demand they show you their monster.

## Scattershot BosonSampling: A new approach to scalable BosonSampling experiments

November 8th, 2013

Update (12/2): Jeremy Hsu has written a fantastic piece for IEEE Spectrum, entitled “D-Wave’s Year of Computing Dangerously.”

Update (11/13): See here for video of a fantastic talk that Matthias Troyer gave at Stanford, entitled “Quantum annealing and the D-Wave devices.” The talk includes the results of experiments on the 512-qubit machine. (Thanks to commenter jim for the pointer. I attended the talk when Matthias gave it last week at Harvard, but I don’t think that one was videotaped.)

Update (11/11): A commenter named RaulGPS has offered yet another great observation that, while forehead-slappingly obvious in retrospect, somehow hadn’t occurred to us.  Namely, Raul points out that the argument given in this post, for the hardness of Scattershot BosonSampling, can also be applied to answer open question #4 from my and Alex’s paper: namely, how hard is BosonSampling with Gaussian inputs and number-resolving detectors?  Raul points out that the latter, in general, is certainly at least as hard as Scattershot BS.  For we can embed Scattershot BS into “ordinary” BS with Gaussian inputs, by first generating a bunch of entangled 2-mode Gaussian states (which are highly attenuated, so that with high probability none of them have 2 or more photons per mode), and then applying a Haar-random unitary U to the “right halves” of these Gaussian states while doing nothing to the left halves.  Then we can measure the left halves to find out which of the input states contained a photon before we applied U.  This is precisely equivalent to Scattershot BS, except for the unimportant detail that our measurement of the “herald” photons has been deferred till the end of the experiment instead of happening at the beginning.  And therefore, since (as I explain in the post) a fast classical algorithm for approximate Scattershot BosonSampling would let us estimate the permanents of i.i.d. Gaussian matrices in BPP^NP, we deduce that a fast classical algorithm for approximate Gaussian BosonSampling would have the same consequence.  In short, approximate Gaussian BS can be argued to be hard under precisely the same complexity assumption as can approximate ordinary BS (and approximate Scattershot BS).  Thus, in the table in Section 1.4 of our paper, the entries “Gaussian states / Adaptive, demolition” and “Gaussian states / Adaptive, nondemolition” should be “upgraded” from “Exact sampling hard” to “Apx. sampling hard?”

One other announcement: following a suggestion by commenter Rahul, I hereby invite guest posts on Shtetl-Optimized by experimentalists working on BosonSampling, offering your personal views about the prospects and difficulties of scaling up.  Send me email if you’re interested.  (Or if you don’t feel like writing a full post, of course you can also just leave a comment on this one.)

[Those impatient for a cool, obvious-in-retrospect new idea about BosonSampling, which I learned from the quantum optics group at Oxford, should scroll to the end of this post.  Those who don't even know what BosonSampling is, let alone Scattershot BosonSampling, should start at the beginning.]

BosonSampling is a proposal by me and Alex Arkhipov for a rudimentary kind of quantum computer: one that would be based entirely on generating single photons, sending them through a network of beamsplitters and phaseshifters, and then measuring where they ended up.  BosonSampling devices are not thought to be capable of universal quantum computing, or even universal classical computing for that matter.  And while they might be a stepping-stone toward universal optical quantum computers, they themselves have a grand total of zero known practical applications.  However, even if the task performed by BosonSamplers is useless, the task is of some scientific interest, by virtue of apparently being hard!  In particular, Alex and I showed that, if a BosonSampler can be simulated exactly in polynomial time by a classical computer, then P^#P = BPP^NP, and hence the polynomial hierarchy collapses to the third level.  Even if a BosonSampler can only be approximately simulated in classical polynomial time, the polynomial hierarchy would still collapse, if a reasonable-looking conjecture in classical complexity theory is true.  For these reasons, BosonSampling might provide an experimental path to testing the Extended Church-Turing Thesis—i.e., the thesis that all natural processes can be simulated with polynomial overhead by a classical computer—that’s more “direct” than building a universal quantum computer.  (As an asymptotic claim, obviously the ECT can never be decisively proved or refuted by a finite number of experiments.  However, if one could build a BosonSampler with, let’s say, 30 photons, then while it would still be feasible to verify the results with a classical computer, it would be fair to say that the BosonSampler was working “faster” than any known algorithm running on existing digital computers.)

In arguing for the hardness of BosonSampling, the crucial fact Alex and I exploited is that the amplitudes for n-photon processes are given by the permanents of n×n matrices of complex numbers, and Leslie Valiant proved in 1979 that the permanent is #P-complete (i.e., as hard as any combinatorial counting problem, and probably even “harder” than NP-complete).  To clarify, this doesn’t mean that a BosonSampler lets you calculate the permanent of a given matrix—that would be too good to be true!  (See the tagline of this blog.)  What you could do with a BosonSampler is weirder: you could sample from a probability distribution over matrices, in which matrices with large permanents are more likely to show up than matrices with small permanents.  So, what Alex and I had to do was to argue that even that sampling task is still probably intractable classically—in the sense that, if it weren’t, then there would also be unlikely classical algorithms for more “conventional” problems.
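To get a feel for why permanents are the sticking point, here is a minimal Python sketch of Ryser's formula, a standard exact method for computing the permanent. It is an illustration only, not code from the BosonSampling papers; the function name `permanent` is made up for this example. The thing to notice is the outer loop over every subset of columns, which is where the roughly 2^n running time comes from.

```python
from itertools import combinations

def permanent(matrix):
    """Permanent of a square matrix (list of rows) via Ryser's formula.

    Works for real or complex entries. Time grows like 2^n * n^2,
    so this is only practical for small n.
    """
    n = len(matrix)
    if n == 0:
        return 1  # convention: the empty product
    total = 0
    # Ryser: per(A) = sum over nonempty column subsets S of
    #        (-1)^(n - |S|) * prod_i ( sum_{j in S} A[i][j] )
    for size in range(1, n + 1):
        sign = (-1) ** (n - size)
        for cols in combinations(range(n), size):
            prod = 1
            for row in matrix:
                prod *= sum(row[j] for j in cols)
            total += sign * prod
    return total

# Small sanity check: per([[1, 2], [3, 4]]) = 1*4 + 2*3
print(permanent([[1, 2], [3, 4]]))  # prints 10
```

For an n-photon experiment, the squared magnitude of such a permanent gives the (unnormalized) probability of one particular outcome, so even evaluating a single outcome this way takes time exponential in n.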

Anyway, that’s my attempt at a 2-paragraph summary of something we’ve been thinking about on and off for four years.  See here for my and Alex’s original paper on BosonSampling, here for a recent followup paper, here for PowerPoint slides, here and here for MIT News articles by Larry Hardesty, and here for my blog post about the first (very small, 3- or 4-photon) demonstrations of BosonSampling by quantum optics groups last year, with links to the four experimental papers that came out then.

In general, we’ve been thrilled by the enthusiastic reaction to BosonSampling by quantum optics people—especially given that the idea started out as pure complexity theory, with the connection to optics coming as an “unexpected bonus.”  But not surprisingly, BosonSampling has also come in for its share of criticism: e.g., that it’s impractical, unscalable, trivial, useless, oversold, impossible to verify, and probably some other things.  A few people have even claimed that, in expressing support and cautious optimism about the recent BosonSampling experiments, I’m guilty of the same sort of quantum computing hype that I complain about in others.  (I’ll let you be the judge of that.  Reread the paragraphs above, or anything else I’ve ever written about this topic, and then compare to, let’s say, this video.)

By far the most important criticism of BosonSampling—one that Alex and I have openly acknowledged and worried a lot about almost from the beginning—concerns the proposal’s scalability.  The basic problem is this: in BosonSampling, your goal is to measure a pattern of quantum interference among n identical, non-interacting photons, where n is as large as possible.  (The special case n=2 is called the Hong-Ou-Mandel dip; conversely, BosonSampling can be seen as just “Hong-Ou-Mandel on steroids.”)  The bigger n gets, the harder the experiment ought to be to simulate using a classical computer (with the difficulty increasing at least like ~2^n).  The trouble is that, to detect interference among n photons, the various quantum-mechanical paths that your photons could take, from the sources, through the beamsplitter network, and finally to the detectors, have to get them there at exactly the same time—or at any rate, close enough to “the same time” that the wavepackets overlap.  Yet, while that ought to be possible in theory, the photon sources that actually exist today, and that will exist for the foreseeable future, just don’t seem good enough to make it happen, for anything more than a few photons.

The reason—well-known for decades as a bane to quantum information experiments—is that there’s no known process in nature that can serve as a deterministic single-photon source.  What you get from an attenuated laser is what’s called a coherent state: a particular kind of superposition of 0 photons, 1 photon, 2 photons, 3 photons, etc., rather than just 1 photon with certainty (the latter is called a Fock state).  Alas, coherent states behave essentially like classical light, which makes them pretty much useless for BosonSampling, and for many other quantum information tasks besides.  For that reason, a large fraction of modern quantum optics research relies on a process called Spontaneous Parametric Down-Conversion (SPDC).  In SPDC, a laser (called the “pump”) is used to stimulate a crystal to produce further photons.  The process is inefficient: most of the time, no photon comes out.  But crucially, any time a photon does come out, its arrival is “heralded” by a partner photon flying out in the opposite direction.  Once in a while, 2 photons come out simultaneously, in which case they’re heralded by 2 partner photons—and even more rarely, 3 photons come out, heralded by 3 partner photons, and so on.  Furthermore, there exists something called a number-resolving detector, which can tell you (today, sometimes, with as good as ~95% reliability) when one or more partner photons have arrived, and how many of them there are.  The result is that SPDC lets us build what’s called a nondeterministic single-photon source.  I.e., you can’t control exactly when a photon comes out—that’s random—but eventually one (and only one) photon will come out, and when that happens, you’ll know it happened, without even having to measure and destroy the precious photon.  The reason you’ll know is that the partner photon heralds its presence.

Alas, while SPDC sources have enabled demonstrations of a large number of cool quantum effects, there’s a fundamental problem with using them for BosonSampling.  The problem comes from the requirement that n—the number of single photons fired off simultaneously into your beamsplitter network—should be big (say, 20 or 30).  Suppose that, in a given instant, the probability that your SPDC source succeeds in generating a photon is p.  Then what’s the probability that two SPDC sources will both succeed in generating a photon at that instant?  p^2.  And the probability that three sources will succeed is p^3, etc.  In general, with n sources, the probability that they’ll succeed simultaneously falls off exponentially with n, and the amount of time you’ll need to sit in the lab waiting for the lucky event increases exponentially with n.  Sure, when it finally does happen, it will be “heralded.”  But if you need to wait exponential time for it to happen, then there would seem to be no advantage over classical computation.  This is the reason why so far, BosonSampling has only been demonstrated with 3-4 photons.
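The waiting-time arithmetic can be spelled out in a few lines of Python. This is a back-of-the-envelope sketch under simplifying assumptions that are not in the text above: the sources are treated as independent, each pulse as an independent trial, and the per-pulse success probability of 0.1 is a made-up number chosen only for illustration.

```python
def expected_pulses(p, n):
    """Expected number of pulses until n independent sources,
    each firing with probability p per pulse, all fire at once.

    The coincidence probability per pulse is p**n, and the mean of a
    geometric distribution with success probability q is 1/q.
    """
    return 1.0 / (p ** n)

if __name__ == "__main__":
    p = 0.1  # hypothetical per-pulse success probability
    for n in (3, 4, 10, 20):
        print(n, expected_pulses(p, n))
```

Each additional photon multiplies the expected wait by 1/p, which is the exponential blowup described above.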

At least three solutions to the scaling problem suggest themselves, but each one has problems of its own.  The first solution is simply to use general methods for quantum fault-tolerance: it’s not hard to see that, if you had a fault-tolerant universal quantum computer, then you could simulate BosonSampling with as many photons as you wanted.  The trouble is that this requires a fault-tolerant universal quantum computer!  And if you had that, then you’d probably just skip BosonSampling and use Shor’s algorithm to factor some 10,000-digit numbers.  The second solution is to invent some specialized fault-tolerance method that would apply directly to quantum optics.  Unfortunately, we don’t know how to do that.  The third solution—until recently, the one that interested me and Alex the most—would be to argue that, even if your sources are so cruddy that you have no idea which ones generated a photon and which didn’t in any particular run, the BosonSampling distribution is still intractable to simulate classically.  After all, the great advantage of BosonSampling is that, unlike with (say) factoring or quantum simulation, we don’t actually care which problem we’re solving!  All we care about is that we’re doing something that we can argue is hard for classical computers.  And we have enormous leeway to change what that “something” is, to match the capabilities of current technology.  Alas, yet again, we don’t know how to argue that BosonSampling is hard to simulate approximately in the presence of realistic amounts of noise—at best, we can argue that it’s hard to simulate approximately in the presence of tiny amounts of noise, and hard to simulate super-accurately in the presence of realistic noise.

When faced with these problems, until recently, all we could do was

1. shrug our shoulders,
2. point out that none of the difficulties added up to a principled argument that scalable BosonSampling was not possible,
3. stress, again, that all we were asking for was to scale to 20 or 30 photons, not 100 or 1000 photons, and
4. express hope that technologies for single-photon generation currently on the drawing board—most notably, something called “optical multiplexing”—could be used to get up to the 20 or 30 photons we wanted.

Well, I’m pleased to announce, with this post, that there’s now a better idea for how to scale BosonSampling to interesting numbers of photons.  The idea, which I’ve taken to calling Scattershot BosonSampling, is not mine or Alex’s.  I learned of it from Ian Walmsley’s group at Oxford, where it’s been championed in particular by Steve Kolthammer.  (Update: A commenter has pointed me to a preprint by Lund, Rahimi-Keshari, and Ralph from May of this year, which I hadn’t seen before, and which contains substantially the same idea, albeit with an unsatisfactory argument for computational hardness.  In any case, as you’ll see, it’s not surprising that this idea would’ve occurred to multiple groups of experimentalists independently; what’s surprising is that we didn’t think of it!)  The minute I heard about Scattershot BS, I kicked myself for failing to think of it, and for getting sidetracked by much more complicated ideas.  Steve and others are working on a paper about Scattershot BS, but in the meantime, Steve has generously given me permission to share the idea on this blog.  I suggested a blog post for two reasons: first, as you’ll see, this idea really is “blog-sized.”  Once you make the observation, there’s barely any theoretical analysis that needs to be done!  And second, I was impatient to get out to the “experimental BosonSampling community”—not to mention to the critics!—that there’s now a better way to BosonSample, and one that’s incredibly simple to boot.

OK, so what is the idea?  Well, recall from above what an SPDC source does: it produces a photon with only a small probability, but whenever it does, it “heralds” the event with a second photon.  So, let’s imagine that you have an array of 200 SPDC sources.  And imagine that, these sources being unpredictable, only (say) 10 of them, on average, produce a photon at any given time.  Then what can you do?  Simple: just define those 10 sources to be the inputs to your experiment!  Or to say it more carefully: instead of sampling only from a probability distribution over output configurations of your n photons, now you’ll sample from a joint distribution over inputs and outputs: one where the input is uniformly random, and the output depends on the input (and also, of course, on the beamsplitter network).  So, this idea could also be called “Double BosonSampling”: now, not only do you not control which output will be observed (but only the probability distribution over outputs), you don’t control which input either—yet this lack of control is not a problem!  There are two key reasons why it isn’t:

1. As I said before, SPDC sources have the crucial property that they herald a photon when they produce one.  So, even though you can’t control which 10 or so of your 200 SPDC sources will produce a photon in any given run, you know which 10 they were.
2. In my and Alex’s original paper, the “hardest” case of BosonSampling that we were able to find—the case we used for our hardness reductions—is simply the one where the mxn “scattering matrix,” which describes the map between the n input modes and the m>>n output modes, is a Haar-random matrix whose columns are orthonormal vectors.  But now suppose we have m input modes and m output modes, and the mxm unitary matrix U mapping inputs to outputs is Haar-random.  Then any mxn submatrix of U will simply be an instance of the “original” hard case that Alex and I studied!
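To make the heralding point concrete, here is a minimal sketch of what a run's "input" looks like.  It assumes, purely for illustration, that each of the 200 sources fires independently with the same probability 0.05 (so 10 fire on average), and it ignores multi-photon emission and detector loss.  The function name `scattershot_run` and its parameters are hypothetical, not taken from any paper or library.

```python
import random

def scattershot_run(num_sources=200, fire_prob=0.05, rng=None):
    """Return the sorted indices of the sources whose herald detector clicked.

    Assumption: every source fires independently with probability fire_prob
    and emits at most one photon pair.
    """
    rng = rng or random.Random()
    return [i for i in range(num_sources) if rng.random() < fire_prob]

rng = random.Random(2013)
runs = [scattershot_run(rng=rng) for _ in range(2000)]
mean_photons = sum(len(r) for r in runs) / len(runs)

print("heralded inputs in the first run:", runs[0])
print("average photons per run:", round(mean_photons, 2))
```

The experimenter never chooses the list that `scattershot_run` returns, but does get to read it off after the fact, and that is all the scheme needs.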

More formally, what can we  say about the computational complexity of Scattershot BS?  Admittedly, I don’t know of a reduction from ordinary BS to Scattershot BS (though it’s easy to give a reduction in the other direction).  However, under exactly the same assumption that Alex and I used to argue that ordinary BosonSampling was hard—our so-called Permanent of Gaussians Conjecture (PGC)—one can show that Scattershot BS is hard also, and by essentially the same proof.  The only difference is that, instead of talking about the permanents of nxn submatrices of an mxn Haar-random, column-orthonormal matrix, now we talk about the permanents of nxn submatrices of an mxm Haar-random unitary matrix.  Or to put it differently: where before we fixed the columns that defined our nxn submatrix and only varied the rows, now we vary both the rows and the columns.  But the resulting nxn submatrix is still close in variation distance to a matrix of i.i.d. Gaussians, for exactly the same reasons it was before.  And we can still check whether submatrices with large permanents are more likely to be sampled than submatrices with small permanents, in the way predicted by quantum mechanics.
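As a toy illustration of "varying both the rows and the columns," the sketch below samples a small Haar-random unitary, picks a random set of input modes (columns) and output modes (rows), and evaluates the squared permanent of the resulting nxn submatrix.  Everything here is an assumption made for the example: the helper names are hypothetical, the unitary is built by Gram-Schmidt on i.i.d. complex Gaussian columns, the permanent is a brute-force sum over permutations, and m=8, n=3 is far too small to be in the m>>n² regime.

```python
import itertools
import math
import random

def haar_unitary(m, rng):
    """Sample an m x m unitary by Gram-Schmidt on i.i.d. complex Gaussian columns."""
    cols = []
    for _ in range(m):
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(m)]
        for u in cols:
            # Remove the component of v along each earlier orthonormal column.
            overlap = sum(a.conjugate() * b for a, b in zip(u, v))
            v = [b - overlap * a for a, b in zip(u, v)]
        norm = math.sqrt(sum(abs(b) ** 2 for b in v))
        cols.append([b / norm for b in v])
    # cols[j][i] is entry (row i, column j); return the matrix row by row.
    return [[cols[j][i] for j in range(m)] for i in range(m)]

def permanent(a):
    """Permanent as a direct sum over permutations; only sensible for small n."""
    n = len(a)
    total = 0
    for perm in itertools.permutations(range(n)):
        term = 1
        for i in range(n):
            term *= a[i][perm[i]]
        total += term
    return total

def outcome_probability(u, inputs, outputs):
    """|Per(U[outputs, inputs])|^2 for collision-free input and output modes."""
    sub = [[u[r][c] for c in inputs] for r in outputs]
    return abs(permanent(sub)) ** 2

rng = random.Random(0)
m, n = 8, 3
u = haar_unitary(m, rng)
inputs = sorted(rng.sample(range(m), n))   # heralded sources (columns)
outputs = sorted(rng.sample(range(m), n))  # detectors that clicked (rows)
print(inputs, outputs, outcome_probability(u, inputs, outputs))
```

In ordinary BosonSampling `inputs` would be fixed once and for all; here it changes from run to run, and only the submatrix being "permanented" changes with it.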

Now, everything above assumed that each SPDC source produces either 0 or 1 photon.  But what happens when the SPDC sources produce 2 or more photons, as they sometimes do?  It turns out that there are two good ways to deal with these “higher-order terms” in the context of Scattershot BS.  The first way is by using number-resolving detectors to count how many herald photons each SPDC source produces.  That way, at least you’ll know exactly which sources produced extra photons, and how many extra photons each one produced.  And, as is often the case in BosonSampling, a devil you know is a devil you can deal with.  In particular, a few known sources producing extra photons, just means that the amplitudes of the output configurations will now be permanents of matrices with a few repeated rows in them.  But the permanent of an otherwise-random matrix with a few repeated rows should still be hard to compute!  Granted, we don’t know how to derive that as a consequence of our original hardness assumption, but this seems like a case where one is perfectly justified to stick one’s neck out and make a new assumption.

But there’s also a more elegant way to deal with higher-order terms.  Namely, suppose m>>n² (i.e., the number of input modes is at least quadratically greater than the average number of photons).  That’s an assumption that Alex and I typically made anyway in our original BosonSampling paper, because of our desire to avoid what we called the “Bosonic Birthday Paradox” (i.e., the situation where two or more photons congregate in the same output mode).  What’s wonderful is that exactly the same assumption also implies that, in Scattershot BS, two or more photons will almost never be found in the same input mode!  That is, when you do the calculation, you find that, once you’ve attenuated your SPDC sources enough to avoid the Bosonic Birthday Paradox at the output modes, you’ve also attenuated them enough to avoid higher-order terms at the input modes.  Cool, huh?
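Here is one way the calculation can go, as a back-of-the-envelope sketch rather than a rigorous derivation.  It assumes that each of the m sources independently emits k photon pairs with probability (1-p)p^k; that idealized model, and the symbol p for the per-source attenuation parameter, are assumptions introduced only for this estimate.

```latex
% Assumed model: source i emits k pairs with probability (1-p) p^k, independently.
\begin{align*}
  \mathbb{E}[\text{photons per source}] &= \sum_{k \ge 0} k\,(1-p)\,p^k = \frac{p}{1-p} \approx p, \\
  n \approx m p \quad &\Longrightarrow \quad p \approx \frac{n}{m}, \\
  \Pr[\text{a given source emits} \ge 2] &= \sum_{k \ge 2} (1-p)\,p^k = p^2, \\
  \mathbb{E}[\#\,\text{sources emitting} \ge 2] &= m\,p^2 \approx \frac{n^2}{m}
    \;\longrightarrow\; 0 \quad \text{when } m \gg n^2 .
\end{align*}
```

Under these assumptions the expected number of doubly-firing sources scales like n²/m, which is the same kind of quantity that has to be small to avoid collisions at the output modes.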

Are there any drawbacks to Scattershot BS?  Well, Scattershot BS certainly requires more SPDC sources than ordinary BosonSampling does, for the same average number of photons.  A little less obviously, Scattershot BS also requires a larger-depth beamsplitter network.  In our original paper, Alex and I showed that for ordinary BosonSampling, it suffices to use a beamsplitter network of depth O(n log m), where n is the number of photons and m is the number of output modes (or equivalently detectors).  However, our construction took advantage of the fact that we knew exactly which n<<m sources the photons were going to come from, and could therefore optimize for those.  For Scattershot BS, the depth bound increases to O(m log m): since the n photons could come from any possible subset of the m input modes, we no longer get the savings based on knowing where they originate.  But this seems like a relatively minor issue.

I don’t want to give the impression that Scattershot BS is a silver bullet that will immediately let us BosonSample with 30 photons.  The most obvious limiting factor that remains is the efficiency of the photon detectors—both those used to detect the photons that have passed through the beamsplitter network, and those used to detect the herald photons.  Because of detector inefficiencies, I’m told that, without further technological improvements (or theoretical ideas), it will still be quite hard to push Scattershot BS beyond about 10 photons.  Still, as you might have noticed, 10 is greater than 4 (the current record)!  And certainly, Scattershot BS itself—a simple, obvious-in-retrospect idea that was under our noses for years, and that immediately pushes forward the number of photons a BosonSampler can handle—should make us exceedingly reluctant to declare there can’t be any more such ideas, and that our current ignorance amounts to a proof of impossibility.

## Three things that I should’ve gotten around to years ago

October 15th, 2013

Updates (11/8): Alas, video of Eliezer’s talk will not be available after all. The nincompoops who we paid to record the talk wrote down November instead of October for the date, didn’t show up, then stalled for a month before finally admitting what had happened. So my written summary will have to suffice (and maybe Eliezer can put his slides up as well).

In other news, Shachar Lovett has asked me to announce a workshop on complexity and coding theory, which will be held at UC San Diego, January 8-10, 2014.

Update (10/21): Some readers might be interested in my defense of LessWrongism against a surprisingly-common type of ad-hominem attack (i.e., “the LW ideas must be wrong because so many of their advocates are economically-privileged but socially-awkward white male nerds, the same sorts of people who might also be drawn to Ayn Rand or other stuff I dislike”). By all means debate the ideas—I’ve been doing it for years—but please give beyond-kindergarten arguments when you do so!

Update (10/18): I just posted a long summary and review of Eliezer Yudkowsky’s talk at MIT yesterday.

Update (10/15): Leonard Schulman sent me the news that, according to an article by Victoria Woollaston in the Daily Mail, Google hopes to use its D-Wave quantum computer to “solve global warming,” “develop sophisticated artificial life,” and “find aliens.”  (No, I’m not making any of this up: just quoting stuff other people made up.)  The article also repeats the debunked canard that the D-Wave machine is “3600 times faster,” and soberly explains that D-Wave’s 512 qubits compare favorably to the mere 32 or 64 bits found in home PCs (exercise for those of you who aren’t already rolling on the floor: think about that until you are).  It contains not a shadow of a hint of skepticism anywhere, not one token sentence.  I would say that, even in an extremely crowded field, Woollaston’s piece takes the cake as the single most irresponsible article about D-Wave I’ve seen.  And I’d feel terrible for my many friends at Google, whose company comes out of this looking like a laughingstock.  But that’s assuming that this isn’t some sort of elaborate, Sokal-style prank, designed simply to prove that media outlets will publish anything whatsoever, no matter how forehead-bangingly absurd, as long as it contains the words “D-Wave,” “Google,” “NASA,” and “quantum”—and thereby, to prove the truth of what I’ve been saying on this blog since 2007.

1. I’ve added MathJax support to the comments section!  If you want to insert an inline LaTeX equation, surround it with \( \), while if you want to insert a displayed equation, surround it with \[ \].  Thanks very much to Michael Dixon for prodding me to do this and telling me how.

2. I’ve also added upvoting and downvoting to the comments section!  OK, in the first significant use of comment voting, the readers have voted overwhelmingly, by 41 – 13, that they want the comment voting to disappear.  So disappear it has!

3. Most importantly, I’ve invited Eliezer Yudkowsky to MIT to give a talk!  He’s here all week, and will be speaking on “Recursion in Rational Agents: Foundations for Self-Modifying AI” this Thursday at 4PM in 32-123 in the MIT Stata Center.  Refreshments at 3:45.  See here for the abstract.  Anyone in the area who’s interested in AI, rationalism, or other such nerdy things is strongly encouraged to attend; it should be interesting.  Just don’t call Eliezer a “Singularitarian”: I’m woefully out of the loop, but I learned yesterday that they’ve dropped that term entirely, and now prefer to talk about the intelligence explosion.

(In addition, Paul Christiano—former MIT undergrad, and my collaborator on quantum money—will be speaking today at 4:30 at the Harvard Science Center, on “Probabilistic metamathematics and the definability of truth.”  His talk will be related to Eliezer’s but somewhat more technical.  See here for details.)

Update (10/15): Alistair Sinclair asked me to post the following announcement.

The Simons Institute for the Theory of Computing at UC Berkeley invites applications for Research Fellowships for academic year 2014-15.

Simons-Berkeley Research Fellowships are an opportunity for outstanding junior scientists (up to 6 years from PhD by Fall 2014) to spend one or two semesters at the Institute in connection with one or more of its programs. The programs for 2014-15 are as follows:

* Algorithmic Spectral Graph Theory (Fall 2014)
* Algorithms and Complexity in Algebraic Geometry (Fall 2014)
* Information Theory (Spring 2015)

Applicants who already hold junior faculty or postdoctoral positions are welcome to apply. In particular, applicants who hold, or expect to hold, postdoctoral appointments at other institutions are encouraged to apply to spend one semester as a Simons-Berkeley Fellow subject to the approval of the postdoctoral institution.

Further details and application instructions can be found at http://simons.berkeley.edu/fellows2014. Information about the Institute and the above programs can be found at http://simons.berkeley.edu.

Deadline for applications: 15 December, 2013.

## Five announcements

October 1st, 2013

Update (Oct. 3): OK, a sixth announcement.  I just posted a question on CS Theory StackExchange, entitled Overarching reasons why problems are in P or BPP.  If you have suggested additions or improvements to my rough list of “overarching reasons,” please post them over there — thanks!

1. I’m in Oxford right now, for a Clay Institute workshop on New Insights into Computational Intractability.  The workshop is concurrent with three others, including one on Number Theory and Physics that includes an amplituhedron-related talk by Andrew Hodges.  (Speaking of which, see here for a small but non-parodic observation about expressing amplitudes as volumes of polytopes.)

2. I was hoping to stay in the UK one more week, to attend the Newton Institute’s special semester on Mathematical Challenges in Quantum Information over in Cambridge.  But alas I had to cancel, since my diaper-changing services are needed in the other Cambridge.  So, if anyone in Cambridge (or anywhere else in the United Kingdom) really wants to talk to me, come to Oxford this week!

3. Back in June, Jens Eisert and three others posted a preprint claiming that the output of a BosonSampling device would be “indistinguishable from the uniform distribution” in various senses.  Ever since then, people have been emailing me, leaving comments on this blog, and cornering me at conferences to ask whether Alex Arkhipov and I had any response to these claims.  OK, so just this weekend, we posted our own 41-page preprint, entitled “BosonSampling Is Far From Uniform.”  I hope it suffices by way of reply!  (Incidentally, this is also the paper I hinted at in a previous post: the one where π²/6 and the Euler-Mascheroni constant make cameo appearances.)  To clarify, if we just wanted to answer the claims of the Eisert group, then I think a couple paragraphs would suffice for that (see, for example, these PowerPoint slides).  In our new paper, however, Alex and I take the opportunity to go further: we study lots of interesting questions about the statistical properties of Haar-random BosonSampling distributions, and about how one might test efficiently whether a claimed BosonSampling device worked, even with hundreds or thousands of photons.

4. Also on the arXiv last night, there was a phenomenal survey about the quantum PCP conjecture by Dorit Aharonov, Itai Arad, and my former postdoc Thomas Vidick (soon to be a professor at Caltech).  I recommend reading it in the strongest possible terms, if you’d like to see how far people have come with this problem (but also, how far they still have to go) since my “Quantum PCP Manifesto” seven years ago.

5. Christos Papadimitriou asked me to publicize that the deadline for early registration and hotel reservations for the upcoming FOCS in Berkeley is fast approaching!  Indeed, it’s October 4 (three days from now).  See here for details, and here for information about student travel support.  (The links were down when I just tried them, but hopefully the server will be back up soon.)

## The Unitarihedron: The Jewel at the Heart of Quantum Computing

September 20th, 2013

Update (9/24): This parody post was a little like a belch: I felt it build up in me as I read about the topic, I let it out, it was easy and amusing, I don’t feel any profound guilt over it—but on the other hand, not one of the crowning achievements of my career.  As several commenters correctly pointed out, it may be true that, mostly because of the name and other superficialities, and because of ill-founded speculations about “the death of locality and unitarity,” the amplituhedron work is currently inspiring a flood of cringe-inducing misstatements on the web.  But, even if true, still the much more interesting questions are what’s actually going on, and whether or not there are nontrivial connections to computational complexity.

Here I have good news: if nothing else, my “belch” of a post at least attracted some knowledgeable commenters to contribute excellent questions and insights, which have increased my own understanding of the subject from ε² to ε.  See especially this superb comment by David Speyer—which, among other things, pointed me to a phenomenal quasi-textbook on this subject by Elvang and Huang.  My most immediate thoughts:

1. The “amplituhedron” is only the latest in a long line of research over the last decade—Witten, Turing biographer Andrew Hodges, and many others have been important players—on how to compute scattering amplitudes more efficiently than by summing zillions of Feynman diagrams.  One of the key ideas is to find combinatorial formulas that express complicated scattering amplitudes recursively in terms of simpler ones.
2. This subject seems to be begging for a computational complexity perspective.  When I read Elvang and Huang, I felt like they were working hard not to say anything about complexity: discussing the gains in efficiency from the various techniques they consider in informal language, or in terms of concrete numbers of terms that need to be summed for 1 loop, 2 loops, etc., but never in terms of asymptotics.  So if it hasn’t been done already, it looks like it could be a wonderful project for someone just to translate what’s already known in this subject into complexity language.
3. On reading about all these “modern” approaches to scattering amplitudes, one of my first reactions was to feel slightly less guilty about never having learned how to calculate Feynman diagrams!  For, optimistically, it looks like some of that headache-inducing machinery (ghosts, off-shell particles, etc.) might be getting less relevant anyway—there being ways to calculate some of the same things that are not only more conceptually satisfying but also faster.

Many readers of this blog probably already saw Natalie Wolchover’s Quanta article “A Jewel at the Heart of Quantum Physics,” which discusses the “amplituhedron”: a mathematical structure that IAS physicist Nima Arkani-Hamed and his collaborators have recently been investigating.  (See also here for Slashdot commentary, here for Lubos’s take, here for Peter Woit’s, here for a Physics StackExchange thread, here for Q&A with Pacific Standard, and here for an earlier but closely-related 154-page paper.)

At first glance, the amplituhedron appears to be a way to calculate scattering amplitudes, in the planar limit of a certain mathematically-interesting (but, so far, physically-unrealistic) supersymmetric quantum field theory, much more efficiently than by summing thousands of Feynman diagrams.  In which case, you might say: “wow, this sounds like a genuinely-important advance for certain parts of mathematical physics!  I’d love to understand it better.  But, given the restricted class of theories it currently applies to, it does seem a bit premature to declare this to be a ‘jewel’ that unlocks all of physics, or a death-knell for spacetime, locality, and unitarity, etc. etc.”

Yet you’d be wrong: it isn’t premature at all.  If anything, the popular articles have understated the revolutionary importance of the amplituhedron.  And the reason I can tell you that with such certainty is that, for several years, my colleagues and I have been investigating a mathematical structure that contains the amplituhedron, yet is even richer and more remarkable.  I call this structure the “unitarihedron.”

The unitarihedron encompasses, within a single abstract “jewel,” all the computations that can ever be feasibly performed by means of unitary transformations, the central operation in quantum mechanics (hence the name).  Mathematically, the unitarihedron is an infinite discrete space: more precisely, it’s an infinite collection of infinite sets, which collection can be organized (as can every set that it contains!) in a recursive, fractal structure.  Remarkably, each and every specific problem that quantum computers can solve—such as factoring large integers, discrete logarithms, and more—occurs as just a single element, or “facet” if you will, of this vast infinite jewel.  By studying these facets, my colleagues and I have slowly pieced together a tentative picture of the elusive unitarihedron itself.

One of our greatest discoveries has been that the unitarihedron exhibits an astonishing degree of uniqueness.  At first glance, different ways of building quantum computers—such as gate-based QC, adiabatic QC, topological QC, and measurement-based QC—might seem totally disconnected from each other.  But today we know that all of those ways, and many others, are merely different “projections” of the same mysterious unitarihedron.

In fact, the longer I’ve spent studying the unitarihedron, the more awestruck I’ve been by its mathematical elegance and power.  In some way that’s not yet fully understood, the unitarihedron “knows” so much that it’s even given us new insights about classical computing.  For example, in 1991 Beigel, Reingold, and Spielman gave a 20-page proof of a certain property of unbounded-error probabilistic polynomial-time.  Yet, by recasting things in terms of the unitarihedron, I was able to give a direct, half-page proof of the same theorem.  If you have any experience with mathematics, then you’ll know that that sort of thing never happens: if it does, it’s a sure sign that cosmic or even divine forces are at work.

But I haven’t even told you the most spectacular part of the story yet.  While, to my knowledge, this hasn’t yet been rigorously proved, many lines of evidence support the hypothesis that the unitarihedron must encompass the amplituhedron as a special case.  If so, then the amplituhedron could be seen as just a single sparkle on an infinitely greater jewel.

Now, in the interest of full disclosure, I should tell you that the unitarihedron is what used to be known as the complexity class BQP (Bounded-Error Quantum Polynomial-Time).  However, just like the Chinese gooseberry was successfully rebranded in the 1950s as the kiwifruit, and the Patagonian toothfish as the Chilean sea bass, so with this post, I’m hereby rebranding BQP as the unitarihedron.  For I’ve realized that, when it comes to bowling over laypeople, inscrutable complexity class acronyms are death—but the suffix “-hedron” is golden.

So, journalists and funders: if you’re interested in the unitarihedron, awesome!  But be sure to also ask about my other research on the bosonsamplinghedron and the quantum-money-hedron.  (Though, in recent months, my research has focused even more on the diaperhedron: a multidimensional, topologically-nontrivial manifold rich enough to encompass all wastes that an 8-month-old human could possibly emit.  Well, at least to first-order approximation.)