## Archive for the ‘Speaking Truth to Parallelism’ Category

### Umesh Vazirani responds to Geordie Rose

Thursday, February 6th, 2014

You might recall that Shin, Smith, Smolin, and Vazirani posted a widely-discussed preprint a week ago, questioning the evidence for large-scale quantum behavior in the D-Wave machine.  Geordie Rose responded here.   Tonight, in a Shtetl-Optimized exclusive scoop, I bring you Umesh Vazirani’s response to Geordie’s comments. Without further ado:

Even a cursory reading of our paper will reveal that Geordie Rose is attacking a straw man. Let me quickly outline the main point of our paper and the irrelevance of Rose’s comments:

To date the Boixo et al paper was the only serious evidence in favor of large scale quantum behavior by the D-Wave machine. We investigated their claims and showed that there are serious problems with their conclusions. Their conclusions were based on the close agreement between the input-output data from D-Wave and quantum simulated annealing, and their inability despite considerable effort to find any classical model that agreed with the input-output data. In our paper, we gave a very simple classical model of interacting magnets that closely agreed with the input-output data. We stated that our results implied that “it is premature to conclude that D-Wave machine exhibits large scale quantum behavior”.

Rose attacks our paper for claiming that “D-Wave processors are inherently classical, and can be described by a classical model with no need to invoke quantum mechanics.”  A reading of our paper will make it perfectly clear that this is not a claim that we make.  We state explicitly “It is worth emphasizing that the goal of this paper is not to provide a classical model for the D-Wave machine, … The classical model introduced here is useful for the purposes of studying the large-scale algorithmic features of the D-Wave machine. The task of finding an accurate model for the D-Wave machine (classical, quantum or otherwise), would be better pursued with direct access, not only to programming the D-Wave machine, but also to its actual hardware.”

Rose goes on to point to a large number of experiments conducted by D-Wave to prove small scale entanglement over 2-8 qubits and criticizes our paper for not trying to model those aspects of D-Wave. But such small scale entanglement properties are not directly relevant to prospects for a quantum speedup. Therefore we were specifically interested in claims about the large scale quantum behavior of D-Wave. There was exactly one such claim, which we duly investigated, and it did not stand up to scrutiny.

### TIME’s cover story on D-Wave: A case study in the conventions of modern journalism

Thursday, February 6th, 2014

This morning, commenter rrtucci pointed me to TIME Magazine’s cover story about D-Wave (yes, in today’s digital media environment, I need Shtetl-Optimized readers to tell me what’s on the cover of TIME…).  rrtucci predicted that, soon after reading the article, I’d be hospitalized with a severe stress-induced bleeding ulcer.  Undeterred, I grit my teeth, paid the $5 to go behind the paywall, and read the article. The article, by Lev Grossman, could certainly be a lot worse. If you get to the end, it discusses the experiments by Matthias Troyer’s group, and it makes clear the lack of any practically-relevant speedup today from the D-Wave devices. It also includes a few skeptical quotes: “In quantum computing, we have to be careful what we mean by ‘utilizing quantum effects,’” says Monroe, the University of Maryland scientist, who’s among the doubters. “This generally means that we are able to store superpositions of information in such a way that the system retains its ‘fuzziness,’ or quantum coherence, so that it can perform tasks that are impossible otherwise. And by that token there is no evidence that the D-Wave machine is utilizing quantum effects.” One of the closest observers of the controversy has been Scott Aaronson, an associate professor at MIT and the author of a highly influential quantum-computing blog [aww, shucks --SA]. He remains, at best, cautious. “I’m convinced … that interesting quantum effects are probably present in D-Wave’s devices,” he wrote in an email. “But I’m not convinced that those effects, right now, are playing any causal role in solving any problems faster than we could solve them with a classical computer. Nor do I think there’s any good argument that D-Wave’s current approach, scaled up, will lead to such a speedup in the future. It might, but there’s currently no good reason to think so.” Happily, the quote from me is something that I actually agreed with at the time I said it! Today, having read the Shin et al. paper—which hadn’t yet come out when Grossman emailed me—I might tone down the statement “I’m convinced … that interesting quantum effects are probably present” to something like: “there’s pretty good evidence for quantum effects like entanglement at a ‘local’ level, but at the ‘global’ level we really have no idea.” Alas, ultimately I regard this article as another victim (through no fault of the writer, possibly) of the strange conventions of modern journalism. Maybe I can best explain those conventions with a quickie illustration: MAGIC 8-BALL: THE RENEGADE MATH WHIZ WHO COULD CHANGE NUMBERS FOREVER An eccentric billionaire, whose fascinating hobbies include nude skydiving and shark-taming, has been shaking up the scientific world lately with his controversial claim that 8+0 equals 17 [... six more pages about the billionaire redacted ...] It must be said that mathematicians, who we reached for comment because we’re diligent reporters, have tended to be miffed, skeptical, and sometimes even sarcastic about the billionaire’s claims. Not surprisingly, though, the billionaire and his supporters have had some dismissive comments of their own about the mathematicians. So, which side is right? Or is the truth somewhere in the middle? At this early stage, it’s hard for an outsider to say. In the meantime, the raging controversy itself is reason enough for us to be covering this story using this story template. Stay tuned for more! As shown (for example) by Will Bourne’s story in Inc. magazine, it’s possible for a popular magazine to break out of the above template when covering D-Wave, or at least bend it more toward reality. But it’s not easy. More detailed comments: • The article gets off on a weird foot in the very first paragraph, describing the insides of D-Wave’s devices as “the coldest place in the universe.” Err, 20mK is pretty cold, but colder temperatures are routinely achieved in many other physics experiments. (Are D-Wave’s the coldest current, continuously-operating experiments, or something like that? I dunno: counterexamples, anyone? I’ve learned from experts that they’re not, not even close. I heard from someone who had a bunch of dilution fridges running at 10mK in the lab he was emailing me from…) • The article jumps enthusiastically into the standard Quantum Computing = Exponential Parallelism Fallacy (the QC=EPF), which is so common to QC journalism that I don’t know if it’s even worth pointing it out anymore (but here I am doing so). • Commendably, the article states clearly that QCs would offer speedups only for certain specific problems, not others; that D-Wave’s devices are designed only for adiabatic optimization, and wouldn’t be useful (e.g.) for codebreaking; and that even for optimization, “D-Wave’s hardware isn’t powerful enough or well enough understood to show serious quantum speedup yet.” But there’s a crucial further point that the article doesn’t make: namely, that we have no idea yet whether adiabatic optimization is something where quantum computers can give any practically-important speedup. In other words, even if you could implement adiabatic optimization perfectly—at zero temperature, with zero decoherence—we still don’t know whether there’s any quantum speedup to be had that way, for any of the nifty applications that the article mentions: “software design, tumor treatments, logistical planning, the stock market, airlines schedules, the search for Earth-like planets in other solar systems, and in particular machine learning.” In that respect, adiabatic optimization is extremely different from (e.g.) Shor’s factoring algorithm or quantum simulation: things where we know how much speedup we could get, at least compared to the best currently-known classical algorithms. But I better stop now, since I feel myself entering an infinite loop (and I didn’t even need the adiabatic algorithm to detect it). ### What happens when an unstoppable PR force hits an NP-hard problem? The answer’s getting clearer Thursday, January 16th, 2014 Update (Jan. 23): Daniel Lidar, one of the authors of the “Defining and detecting…” paper, was kind enough to email me his reactions to this post. While he thought the post was generally a “very nice summary” of their paper, he pointed out one important oversight in my discussion. Ironically, this oversight arose from my desire to bend over backwards to be generous to D-Wave! Specifically, I claimed that there were maybe ~10% of randomly-chosen 512-qubit problem instances on which the D-Wave Two slightly outperformed the simulated annealing solver (compared to ~75% where simulated annealing outperformed the D-Wave Two), while also listing several reasons (such as the minimum annealing time, and the lack of any characterization of the “good” instances) why that “speedup” is likely to be entirely an artifact. I obtained the ~10% and ~75% figures by eyeballing Figure 7 in the paper, and looking at which quantiles were just above and just below the 100 line when N=512. However, I neglected to mention that even the slight “speedup” on ~10% of instances, only appears when one looks at the “quantiles of ratio”: in other words, when one plots the probability distribution of [Simulated annealing time / D-Wave time] over all instances, and then looks at (say) the ~10% of the distribution that’s best for the D-Wave machine. The slight speedup disappears when one looks at the “ratio of quantiles”: that is, when one (say) divides the amount of time that simulated annealing needs to solve its best 10% of instances, by the amount of time that the D-Wave machine needs to solve its best 10%. And Rønnow et al. give arguments in their paper that ratio of quantiles is probably the more relevant performance comparison than quantiles of ratio. (Incidentally, the slight speedup on a few instances also only appears for certain values of the parameter r, which controls how many possible settings there are for each coupling. Apparently it appears for r=1, but disappears for r=3 and r=7—thereby heightening one’s suspicion that we’re dealing with an artifact of the minimum annealing time or something like that, rather than a genuine speedup.) There’s one other important point in the paper that I didn’t mention: namely, all the ratios of simulated annealing time to D-Wave time are normalized by 512/N, where N is the number of spins in the instance being tested. In this way, one eliminates the advantages of the D-Wave machine that come purely from its parallelism (which has nothing whatsoever to do with “quantumness,” and which could easily skew things in D-Wave’s favor if not controlled for), while still not penalizing the D-Wave machine in absolute terms. A few days ago, a group of nine authors (Troels Rønnow, Zhihui Wang, Joshua Job, Sergio Boixo, Sergei Isakov, David Wecker, John Martinis, Daniel Lidar, and Matthias Troyer) released their long-awaited arXiv preprint Defining and detecting quantum speedup, which contains the most thorough performance analysis of the D-Wave devices to date, and which seems to me to set a new standard of care for any future analyses along these lines. Notable aspects of the paper: it uses data from the 512-qubit machine (a previous comparison had been dismissed by D-Wave’s supporters because it studied the 128-qubit model only); it concentrates explicitly from the beginning on comparisons of scaling behavior between the D-Wave devices and comparable classical algorithms, rather than getting “sidetracked” by other issues; and it includes authors from both USC and Google’s Quantum AI Lab, two places that have made large investments in D-Wave’s machines and have every reason to want to see them succeed. Let me quote the abstract in full: The development of small-scale digital and analog quantum devices raises the question of how to fairly assess and compare the computational power of classical and quantum devices, and of how to detect quantum speedup. Here we show how to define and measure quantum speedup in various scenarios, and how to avoid pitfalls that might mask or fake quantum speedup. We illustrate our discussion with data from a randomized benchmark test on a D-Wave Two device with up to 503 qubits. Comparing the performance of the device on random spin glass instances with limited precision to simulated classical and quantum annealers, we find no evidence of quantum speedup when the entire data set is considered, and obtain inconclusive results when comparing subsets of instances on an instance-by-instance basis. Our results for one particular benchmark do not rule out the possibility of speedup for other classes of problems and illustrate that quantum speedup is elusive and can depend on the question posed. Since the paper is exceedingly well-written, and since I have maybe an hour before I’m called back to baby duty, my inclination is simply to ask people to RTFP rather than writing yet another long blog post. But maybe there are four points worth calling attention to: 1. The paper finds, empirically, that the time needed to solve random size-N instances of the quadratic binary optimization (QUBO) problem on D-Wave’s Chimera constraint graph seems to scale like exp(c√N) for some constant c—and that this is true regardless of whether one attacks the problem using the D-Wave Two, quantum Monte Carlo (i.e., a classical algorithm that tries to mimic the native physics of the machine), or an optimized classical simulated annealing code. Notably, exp(c√N) is just what one would have predicted from theoretical arguments based on treewidth; and the constant c doesn’t appear to be better for the D-Wave Two than for simulated annealing. 2. The last sentence of the abstract (“Our results … do not rule out the possibility of speedup for other classes of problems”) is, of course, the reed on which D-Wave’s supporters will now have to hang their hopes. But note that it’s unclear what experimental results could ever ”rule out the possibility of speedup for other classes of problems.” (No matter how many wrong predictions a psychic has made, the possibility remains that she’d be flawless at predicting the results of Croatian ping-pong tournaments…) Furthermore, like with previous experiments, the instances tested all involved finding ground states for random coupling configurations of the D-Wave machine’s own architecture. In other words, this was a set of instances where one might have thought, a priori, that the D-Wave machine would have an immense home-field advantage. Thus, one really needs to look more closely, to see whether there’s any positive evidence for an asymptotic speedup by the D-Wave machine. 3. Here, for D-Wave supporters, the biggest crumb the paper throws is that, if one considers only the ~10% of instances on which the D-Wave machine does best, then the machine does do slightly better on those instances than simulated annealing does. (Conversely, simulated annealing does better than the D-Wave machine on the ~75% of instances on which it does best.) Unfortunately, no one seems to know how to characterize the instances on which the D-Wave machine will do best: one just has to try it and see what happens! And of course, it’s extremely rare that two heuristic algorithms will succeed or fail on exactly the same set of instances: it’s much more likely that their performances will be correlated, but imperfectly. So it’s unclear, at least to me, whether this finding represents anything other than the “noise” that would inevitably occur even if one classical algorithm were pitted against another one. 4. As the paper points out, there’s also a systematic effect that biases results in the D-Wave Two’s favor, if one isn’t careful. Namely, the D-Wave Two has a minimum annealing time of 20 microseconds, which is often greater than the optimum annealing time, particularly for small instance sizes. The effect of that is artificially to increase the D-Wave Two’s running time for small instances, and thereby make its scaling behavior look better than it really is. The authors say they don’t know whether even the D-Wave Two’s apparent advantage for its “top 10% of instances” will persist after this effect is fully accounted for. Those seeking something less technical might want to check out an excellent recent article in Inc. by Will Bourne, entitled “D-Wave’s dream machine” (“D-Wave thinks it has built the first commercial quantum computer. Mother Nature has other ideas”). Wisely, Bourne chose not to mention me at all in this piece. Instead, he gradually builds a skeptical case almost entirely on quotes from people like Seth Lloyd and Daniel Lidar, who one might have thought would be more open to D-Wave’s claims. Bourne’s piece illustrates that it is possible for the mainstream press to get the D-Wave story pretty much right, and that you don’t even need a physics background to do so: all you need is a willingness to commit journalism. Oh. I’d be remiss not to mention that, in the few days between the appearance of this paper and my having a chance to write this post, two other preprints of likely interest to the Shtetl-Optimized commentariat showed up on quant-ph. The first, by a large list of authors mostly from D-Wave, is called Entanglement in a quantum annealing processor. This paper presents evidence for a point that many skeptics (including me) had been willing to grant for some time: namely, that the states generated by the D-Wave machines contain some nonzero amount of entanglement. (Note that, because of a technical property called “stoquasticity,” such entanglement is entirely compatible with the machines continuing to be efficiently simulable on a classical computer using Quantum Monte Carlo.) While it doesn’t address the performance question at all, this paper seems like a perfectly fine piece of science. From the opposite side of the (eigen)spectrum comes the latest preprint by QC skeptic Michel Dyakonov, entitled Prospects for quantum computing: Extremely doubtful. Ironically, Dyakonov and D-Wave seem to agree completely about the irrelevance of fault-tolerance and other insights from quantum computing theory. It’s just that D-Wave thinks QC can work even without the theoretical insights, whereas Dyakonov thinks that QC can’t work even with the insights. Unless I missed it, there’s no new scientific content in Dyakonov’s article. It’s basically a summary of some simple facts about QC and quantum fault-tolerance, accompanied by sneering asides about how complicated and implausible it all sounds, and how detached from reality the theorists are. And as for the obvious comparisons to previous “complicated and implausible” technologies, like (say) classical computing, or heavier-than-air flight, or controlled nuclear fission? Dyakonov says that such comparisons are invalid, because they ignore the many technologies proposed in previous eras that didn’t work. What’s striking is how little he seems to care about why the previous technologies failed: was it because they violated clearly-articulated laws of physics? Or because there turned out to be better ways to do the same things? Or because the technologies were simply too hard, too expensive, or too far ahead of their time? Supposing QC to be impossible, which of those is the reason for the impossibility? Since we’re not asking about something “arbitrary” here (like teaching a donkey to read), but rather about the computational power of Nature itself, isn’t it of immense scientific interest to know the reason for QC’s impossibility? How does Dyakonov propose to learn the reason, assuming he concedes that he doesn’t already know it? (As I’ve said many times, I’d support even the experiments that D-Wave was doing, if D-Wave and its supporters would only call them for what they were: experiments. Forays into the unknown. Attempts to find out what happens when a particular speculative approach is thrown at NP-hard optimization problems. It’s only when people obfuscate the results of those experiments, in order to claim something as “commercially useful” that quite obviously isn’t yet, that they leave the realm of science, and indeed walk straight into the eager jaws of skeptics like Dyakonov.) Anyway, since we seem to have circled back to D-Wave, I’d like to end this post by announcing my second retirement as Chief D-Wave Skeptic. The first time I retired, it was because I mistakenly thought that D-Wave had fundamentally changed, and would put science ahead of PR from that point forward. (The truth seems to be that there were, and are, individuals at D-Wave committed to science, but others who remain PR-focused.) This time, I’m retiring for a different reason: because scientists like the authors of the “Defining and detecting” preprint, and journalists like Will Bourne, are doing my job better than I ever did it. If the D-Wave debate were the American Civil War, then my role would be that of the frothy-mouthed abolitionist pamphleteer: someone who repeats over and over points that are fundamentally true, but in a strident manner that serves only to alienate fence-sitters and allies. As I played my ineffective broken record, the Wave Power simply moved from one triumph to another, expanding its reach to Google, NASA, Lockheed Martin, and beyond. I must have looked like a lonely loon on the wrong side of history. But today the situation is different. Today Honest Abe and his generals (Honest Matthias and his coauthors?) are meeting the Wave Power on the battlefield of careful performance comparisons against Quantum Monte Carlo and simulated annealing. And while the battles might continue all the way to 2000 qubits or beyond, the results so far are not looking great for the Wave Power. The intractability of NP-complete problems—that which we useless, ivory-tower theorists had prophesied years ago, to much derision and laughter—would seem to be rearing its head. So, now that the bombs are bursting and the spins decohering in midair, what is there for a gun-shy pampleteer like myself to do but sit back and watch it all play out? Well, and maybe blog about it occasionally. But not as “Chief Skeptic,” just as another interested observer. ### Merry Christmas! My quantum computing research explained, using only the 1000 most common English words Tuesday, December 24th, 2013 [With special thanks to the Up-Goer Five Text Editor, which was inspired by this xkcd] I study computers that would work in a different way than any computer that we have today. These computers would be very small, and they would use facts about the world that are not well known to us from day to day life. No one has built one of these computers yet—at least, we don’t think they have!—but we can still reason about what they could do for us if we did build them. How would these new computers work? Well, when you go small enough, you find that, in order to figure out what the chance is that something will happen, you need to both add and take away a whole lot of numbers—one number for each possible way that the thing could happen, in fact. What’s interesting is, this means that the different ways a thing could happen can “kill each other out,” so that the thing never happens at all! I know it sounds weird, but the world of very small things has been known to work that way for almost a hundred years. So, with the new kind of computer, the idea is to make the different ways each wrong answer could be reached kill each other out (with some of them “pointing” in one direction, some “pointing” in another direction), while the different ways that the right answer could be reached all point in more or less the same direction. If you can get that to happen, then when you finally look at the computer, you’ll find that there’s a very good chance that you’ll see the right answer. And if you don’t see the right answer, then you can just run the computer again until you do. For some problems—like breaking a big number into its smallest parts (say, 43259 = 181 × 239)—we’ve learned that the new computers would be much, much faster than we think any of today’s computers could ever be. For other problems, however, the new computers don’t look like they’d be faster at all. So a big part of my work is trying to figure out for which problems the new computers would be faster, and for which problems they wouldn’t be. You might wonder, why is it so hard to build these new computers? Why don’t we have them already? This part is a little hard to explain using the words I’m allowed, but let me try. It turns out that the new computers would very easily break. In fact, if the bits in such a computer were to “get out” in any way—that is, to work themselves into the air in the surrounding room, or whatever—then you could quickly lose everything about the new computer that makes it faster than today’s computers. For this reason, if you’re building the new kind of computer, you have to keep it very, very carefully away from anything that could cause it to lose its state—but then at the same time, you do have to touch the computer, to make it do the steps that will eventually give you the right answer. And no one knows how to do all of this yet. So far, people have only been able to use the new computers for very small checks, like breaking 15 into 3 × 5. But people are working very hard today on figuring out how to do bigger things with the new kind of computer. In fact, building the new kind of computer is so hard, that some people even believe it won’t be possible! But my answer to them is simple. If it’s not possible, then that’s even more interesting to me than if it is possible! And either way, the only way I know to find out the truth is to try it and see what happens. Sometimes, people pretend that they already built one of these computers even though they didn’t. Or they say things about what the computers could do that aren’t true. I have to admit that, even though I don’t really enjoy it, I do spend a lot of my time these days writing about why those people are wrong. Oh, one other thing. Not long from now, it might be possible to build computers that don’t do everything that the new computers could eventually do, but that at least do some of it. Like, maybe we could use nothing but light and mirrors to answer questions that, while not important in and of themselves, are still hard to answer using today’s computers. That would at least show that we can do something that’s hard for today’s computers, and it could be a step along the way to the new computers. Anyway, that’s what a lot of my own work has been about for the past four years or so. Besides the new kind of computers, I’m also interested in understanding what today’s computers can and can’t do. The biggest open problem about today’s computers could be put this way: if a computer can check an answer to a problem in a short time, then can a computer also find an answer in a short time? Almost all of us think that the answer is no, but no one knows how to show it. Six years ago, another guy and I figured out one of the reasons why this question is so hard to answer: that is, why the ideas that we already know don’t work. Anyway, I have to go to dinner now. I hope you enjoyed this little piece about the kind of stuff that I work on. ### D-Wave: Truth finally starts to emerge Thursday, May 16th, 2013 Wrap-Up (June 5): This will be my final update on this post (really!!), since the discussion seems to have reached a point where not much progress is being made, and since I’d like to oblige the commenters who’ve asked me to change the subject. Let me try to summarize the main point I’ve been trying to get across this whole time. I’ll call the point (*). (*) D-Wave founder Geordie Rose claims that D-Wave has now accomplished its goal of building a quantum computer that, in his words, is “better at something than any other option available.” This claim has been widely and uncritically repeated in the press, so that much of the nerd world now accepts it as fact. However, the claim is not supported by the evidence currently available. It appears that, while the D-Wave machine does outperform certain off-the-shelf solvers, simulated annealing codes have been written that outperform the D-Wave machine on its own native problem when run on a standard laptop. More research is needed to clarify the issue, but in the meantime, it seems worth knowing that this is where things currently stand. In the comments, many people tried repeatedly to change the subject from (*) to various subsidiary questions. For example: isn’t it possible that D-Wave’s current device will be found to provide a speedup on some other distribution of instances, besides the one that was tested? Even if not, isn’t it possible that D-Wave will achieve a genuine speedup with some future generation of machines? Did it make business sense for Google to buy a D-Wave machine? What were Google’s likely reasons? What’s D-Wave’s current value as a company? Should Cathy McGeoch have acted differently, in the type of comparison she agreed to do, or in how she communicated about its results? Should I have acted differently, in my interaction with McGeoch? And, I’m afraid to say, I jumped in to the discussion of all of those questions—because, let’s face it, there are very few subjects about which I don’t have an opinion, or at least a list of qualified observations to make. In retrospect, I now think that was a mistake. It would have been better to sidestep all the other questions—not one of which I really know the answer to, and each of which admits multiple valid perspectives—and just focus relentlessly on the truth of assertion (*). Here’s an analogy: imagine that a biotech startup claimed that, by using an expensive and controversial new gene therapy, it could cure patients at a higher rate than with the best available conventional drugs—basing its claim on a single clinical trial. Imagine that this claim was widely repeated in the press as an established fact. Now imagine that closer examination of the clinical trial revealed that it showed nothing of the kind: it compared against the wrong drugs. And imagine that a more relevant clinical trial—mostly unmentioned in the press—had also been done, and discovered that when you compare to the right drugs, the drugs do better. Imagine that someone wrote a blog post bringing all of this to public attention. And now imagine that the response to that blogger was the following: “aha, but isn’t it possible that some future clinical trial will show an advantage for the gene therapy—maybe with some other group of patients? Even if not, isn’t it possible that the startup will manage to develop an effective gene therapy sometime in the future? Betcha didn’t consider that, did you? And anyway, at least they’re out there trying to make gene therapy work! So we should all support them, rather than relentlessly criticizing. And as for the startup’s misleading claims to the public? Oh, don’t be so naïve: that’s just PR. If you can’t tune out the PR and concentrate on the science, that’s your own damn problem. In summary, the real issue isn’t what some clinical trial did or didn’t show; it’s you and your hostile attitude.” In a different context, these sorts of responses would be considered strange, and the need to resort to them revealing. But the rules for D-Wave are different. (Interestingly, in excusing D-Wave’s statements, some commenters explicitly defended standards of intellectual discourse so relaxed that, as far as I could tell, just about anything anyone could possibly say would be OK with them—except of course for what I say on this blog, which is not OK! It reminds me of the central tenet of cultural relativism: that there exist no universal standards by which any culture could ever be judged “good” or “bad,” except that Western culture is irredeemably evil.) Update (June 4): Matthias Troyer (who, unfortunately, still can’t comment here for embargo reasons) has asked me to clarify that it’s not he, but rather his postdoc Sergei Isakov, who deserves the credit for actually writing the simulated annealing code that outperformed the D-Wave machine on the latter’s own “home turf” (i.e., random QUBO instances with the D-Wave constraint graph). The quantum Monte Carlo code, which also did quite well at simulating the D-Wave machine, was written by Isakov together with another of Matthias’s postdocs, Troels Rønnow. Update (June 3): See Cathy McGeoch’s response (here and here), and my response to her response. Yet More Updates (June 2): Alex Selby has a detailed new post summarizing his comparisons between the D-Wave device (as reported by McGeoch and Wang) and his own solver—finding that his solver can handily outperform the device and speculating about the reasons why. In other news, Catherine McGeoch spoke on Friday in the MIT quantum group meeting. Incredibly, she spoke for more than an hour, without once mentioning the USC results that found that simulated annealing on a standard laptop (when competently implemented) handily outperformed the D-Wave machine, or making any attempt to reconcile those results with hers and Wang’s. Instead, McGeogh used the time to enlighten the assembled experts about what quantum annealing was, what an exact solver was, etc. etc., then repeated the speedup claims as if the more informative comparisons simply didn’t exist. I left without asking questions, not wanting to be the one to instigate an unpleasant confrontation, and—I’ll admit—questioning my own sanity as a result of no one else asking about the gigantic elephant in the room. More Updates (May 21): Happy 25th birthday to me! Among the many interesting comments below, see especially this one by Alex Selby, who says he’s written his own specialist solver for one class of the McGeoch and Wang benchmarks that significantly outperforms the software (and D-Wave machine) tested by McGeoch and Wang on those benchmarks—and who provides the Python code so you can try it yourself. Also, Igor Vernik asked me to announce that on July 8th, D-Wave will be giving a technical presentation at the International Superconducting Electronics Conference in Cambridge. See here for more info; I’ll be traveling then and won’t be able to make it. I don’t know whether the performance comparisons to Matthias Troyer’s and Alex Selby’s code will be among the topics discussed, or if there will be an opportunity to ask questions about such things. In another exciting update, John Smolin and Graeme Smith posted a paper to the arXiv tonight questioning even the “signature of quantumness” part of the latest D-Wave claims—the part that I’d been ~98% willing to accept, even as I relayed evidence that cast enormous doubt on the “speedup” part. Specifically, Smolin and Smith propose a classical model that they say can explain the “bimodal” pattern of success probabilities observed by the USC group as well as quantum annealing can. I haven’t yet had time to read their paper or form an opinion about it, but I’d be very interested if others wanted to weigh in. Update (May 26): The USC group has put out a new preprint responding to Smolin and Smith, offering additional evidence for quantum behavior in the D-Wave device that they say can’t be explained using Smolin and Smith’s model. Update (May 17): Daniel Lidar emailed me to clarify his views about error-correction and the viability of D-Wave’s approach. He invited me to share his clarification with others—something that I’m delighted to do, since I agree with him wholeheartedly. Without further ado, here’s what Lidar says: I don’t believe D-Wave’s approach is scalable without error correction. I believe that the incorporation of error correction is a necessary condition in order to ever achieve a speedup with D-Wave’s machines, and I don’t believe D-Wave’s machines are any different from other types of quantum information processing in this regard. I have repeatedly made this point to D-Wave over several years, and I hope that in the future their designs will allow more flexibility in the incorporation of error correction. Lidar also clarified that he not only doesn’t dispute what Matthias Troyer told me about the lack of speedup of the D-Wave device compared to classical simulated annealing in their experiments, but “fully agrees, endorses, and approves” of it—and indeed, that he himself was part of the team that did the comparison. In other news, this Hacker News thread, which features clear, comprehending discussions of this blog post and the backstory that led up to it, has helped to restore my faith in humanity. Two years ago almost to the day, I announced my retirement as Chief D-Wave Skeptic. But—as many readers predicted at the time—recent events (and the contents of my inbox!) have given me no choice except to resume my post. In an all-too-familiar pattern, multiple rounds of D-Wave-related hype have made it all over the world before the truth has had time to put its pants on and drop its daughter off in daycare. And the current hype is particularly a shame, because once one slices through all the layers of ugh—the rigged comparisons, the “dramatic announcements” that mean nothing, the lazy journalists cherry-picking what they want to hear and ignoring the inconvenient bits—there really has been a huge scientific advance this past month in characterizing the D-Wave devices. I’m speaking about the experiments on the D-Wave One installed at USC, the main results of which finally appeared in April. Two of the coauthors of this new work—Matthias Troyer and Daniel Lidar—were at MIT recently to speak about their results, Troyer last week and Lidar this Tuesday. Intriguingly, despite being coauthors on the same paper, Troyer and Lidar have very different interpretations of what their results mean, but we’ll get to that later. For now, let me summarize what I think their work has established. Evidence for Quantum Annealing Behavior For the first time, we have evidence that the D-Wave One is doing what should be described as “quantum annealing” rather than “classical annealing” on more than 100 qubits. (Note that D-Wave itself now speaks about “quantum annealing” rather than “quantum adiabatic optimization.” The difference between the two is that the adiabatic algorithm runs coherently, at zero temperature, while quantum annealing is a “messier” version in which the qubits are strongly coupled to their environment throughout, but still maintain some quantum coherence.) The evidence for quantum annealing behavior is still extremely indirect, but despite my “Chief Skeptic” role, I’m ready to accept what the evidence indicates with essentially no hesitation. So what is the evidence? Basically, the USC group ran the D-Wave One on a large number of randomly generated instances of what I’ll call the “D-Wave problem”: namely, the problem of finding the lowest-energy configuration of an Ising spin glass, with nearest-neighbor interactions that correspond to the D-Wave chip’s particular topology. Of course, restricting attention to this “D-Wave problem” tilts the tables heavily in D-Wave’s favor, but no matter: scientifically, it makes a lot more sense than trying to encode Sudoku puzzles or something like that. Anyway, the group then looked at the distribution of success probabilities when each instance was repeatedly fed to the D-Wave machine. For example, would the randomly-generated instances fall into one giant clump, with a few outlying instances that were especially easy or especially hard for the machine? Surprisingly, they found that the answer was no: the pattern was strongly bimodal, with most instances either extremely easy or extremely hard, and few instances in between. Next, the group fed the same instances to Quantum Monte Carlo: a standard classical algorithm that uses Wick rotation to find the ground states of “stoquastic Hamiltonians,” the particular type of quantum evolution that the D-Wave machine is claimed to implement. When they did that, they found exactly the same bimodal pattern that they found with the D-Wave machine. Finally they fed the instances to a classical simulated annealing program—but there they found a “unimodal” distribution, not a bimodal one. So, their conclusion is that whatever the D-Wave machine is doing, it’s more similar to Quantum Monte Carlo than it is to classical simulated annealing. Curiously, we don’t yet have any hint of a theoretical explanation for why Quantum Monte Carlo should give rise to a bimodal distribution, while classical simulating annealing should give rise to a unimodal one. The USC group simply observed the pattern empirically (as far as I know, they’re the first to do so), then took advantage of it to characterize the D-Wave machine. I regard explaining this pattern as an outstanding open problem raised by their work. In any case, if we accept that the D-Wave One is doing “quantum annealing,” then despite the absence of a Bell-inequality violation or other direct evidence, it’s reasonably safe to infer that there should be large-scale entanglement in the device. I.e., the true quantum state is no doubt extremely mixed, but there’s no particular reason to believe we could decompose that state into a mixture of product states. For years, I tirelessly repeated that D-Wave hadn’t even provided evidence that its qubits were entangled—and that, while you can have entanglement with no quantum speedup, you can’t possibly have a quantum speedup without at least the capacity to generate entanglement. Now, I’d say, D-Wave finally has cleared the evidence-for-entanglement bar—and, while they’re not the first to do so with superconducting qubits, they’re certainly the first to do so with so many superconducting qubits. So I congratulate D-Wave on this accomplishment. If this had been advertised from the start as a scientific research project—”of course we’re a long way from QC being practical; no one would ever claim otherwise; but as a first step, we’ve shown experimentally that we can entangle 100 superconducting qubits with controllable couplings”—my reaction would’ve been, “cool!” (Similar to my reaction to any number of other steps toward scalable QC being reported by research groups all over the world.) No Speedup Compared to Classical Simulated Annealing But of course, D-Wave’s claims—and the claims being made on its behalf by the Hype-Industrial Complex—are far more aggressive than that. And so we come to the part of this post that has not been pre-approved by the International D-Wave Hype Repeaters Association. Namely, the same USC paper that reported the quantum annealing behavior of the D-Wave One, also showed no speed advantage whatsoever for quantum annealing over classical simulated annealing. In more detail, Matthias Troyer’s group spent a few months carefully studying the D-Wave problem—after which, they were able to write optimized simulated annealing code that solves the D-Wave problem on a normal, off-the-shelf classical computer, about 15 times faster than the D-Wave machine itself solves the D-Wave problem! Of course, if you wanted even more classical speedup than that, then you could simply add more processors to your classical computer, for only a tiny fraction of the ~$10 million that a D-Wave One would set you back.

Some people might claim it’s “unfair” to optimize the classical simulated annealing code to take advantage of the quirks of the D-Wave problem.  But think about it this way: D-Wave has spent ~$100 million, and hundreds of person-years, optimizing the hell out of a special-purpose annealing device, with the sole aim of solving this one problem that D-Wave itself defined. So if we’re serious about comparing the results to a classical computer, isn’t it reasonable to have one professor and a few postdocs spend a few months optimizing the classical code as well? As I said, besides simulated annealing, the USC group also compared the D-Wave One’s performance against a classical implementation of Quantum Monte Carlo. And maybe not surprisingly, the D-Wave machine was faster than a “direct classical simulation of itself” (I can’t remember how many times faster, and couldn’t find that information in the paper). But even here, there’s a delicious irony. The only reason the USC group was able to compare the D-Wave one against QMC at all, is that QMC is efficiently implementable on a classical computer! (Albeit probably with a large constant overhead compared to running the D-Wave annealer itself—hence the superior performance of classical simulated annealing over QMC.) This means that, if the D-Wave machine can be understood as reaching essentially the same results as QMC (technically, “QMC with no sign problem”), then there’s no real hope for using the D-Wave machine to get an asymptotic speedup over a classical computer. The race between the D-Wave machine and classical simulations of the machine would then necessarily be a cat-and-mouse game, a battle of constant factors with no clear asymptotic victor. (Some people might conjecture that it will also be a “Tom & Jerry game,” the kind where the classical mouse always gets the better of the quantum cat.) At this point, it’s important to give a hearing to three possible counterarguments to what I’ve written above. The first counterargument is that, if you plot both the runtime of simulated annealing and the runtime of the D-Wave machine as functions of the instance size n, you find that, while simulated annealing is faster in absolute terms, it can look like the curve for the D-Wave machine is less steep. Over on the blog “nextbigfuture”, an apparent trend of this kind has been fearlessly extrapolated to predict that with 512 qubits, the D-Wave machine will be 10 billion times faster than a classical computer. But there’s a tiny fly in the ointment. As Troyer carefully explained to me last week, the “slow growth rate” of the D-Wave machine’s runtime is, ironically, basically an artifact of the machine being run too slowly on small values of n. Run the D-Wave machine as fast as it can run for small n, and the difference in the slopes disappears, with only the constant-factor advantage for simulated annealing remaining. In short, there seems to be no evidence, at present, that the D-Wave machine is going to overtake simulated annealing for any instance size. The second counterargument is that the correlation between the two “bimodal distributions”—that for the D-Wave machine and that for the Quantum Monte Carlo simulation—is not perfect. In other words, there are a few instances (not many) that QMC solves faster than the D-Wave machine, and likewise a few instances that the D-Wave machine solves faster than QMC. Not surprisingly, the latter fact has been eagerly seized on by the D-Wave boosters (“hey, sometimes the machine does better!”). But Troyer has a simple and hilarious response to that. Namely, he found that his group’s QMC code did a better job of correlating with the D-Wave machine, than the D-Wave machine did of correlating with itself! In other words, calibration errors seem entirely sufficient to explain the variation in performance, with no need to posit any special class of instances (however small) on which the D-Wave machine dramatically outperforms QMC. The third counterargument is just the banal one: the USC experiment was only one experiment with one set of instances (albeit, a set one might have thought would be heavily biased toward D-Wave). There’s no proof that, in the future, it won’t be discovered that the D-Wave machine does something more than QMC, and that there’s some (perhaps specially-designed) set of instances on which the D-Wave machine asymptotically outperforms both QMC and Troyer’s simulated annealing code. (Indeed, I gather that folks at D-Wave are now assiduously looking for such instances.) Well, I concede that almost anything is possible in the future—but “these experiments, while not supporting D-Wave’s claims about the usefulness of its devices, also don’t conclusively disprove those claims” is a very different message than what’s currently making it into the press. Comparison to CPLEX is Rigged Unfortunately, the USC paper is not the one that’s gotten the most press attention—perhaps because half of it inconveniently told the hypesters something they didn’t want to hear (“no speedup”). Instead, journalists have preferred a paper released this week by Catherine McGeoch and Cong Wang, which reports that quantum annealing running on the D-Wave machine outperformed the CPLEX optimization package running on a classical computer by a factor of ~3600, on Ising spin problems involving 439 bits. Wow! That sounds awesome! But before rushing to press, let’s pause to ask ourselves: how can we reconcile this with the USC group’s result of no speedup? The answer turns out to be painfully simple. CPLEX is a general-purpose, off-the-shelf exact optimization package. Of course an exact solver can’t compete against quantum annealing—or for that matter, against classical annealing or other classical heuristics! Noticing this problem, McGeoch and Wang do also compare the D-Wave machine against tabu search, a classical heuristic algorithm. When they do so, they find that an advantage for the D-Wave machine persists, but it becomes much, much smaller (they didn’t report the exact time comparison). Amusingly, they write in their “Conclusions and Future Work” section: It would of course be interesting to see if highly tuned implementations of, say, tabu search or simulated annealing could compete with Blackbox or even QA [i.e., the D-Wave machines] on QUBO [quadratic binary optimization] problems; some preliminary work on this question is underway. As I said above, at the time McGeoch and Wang’s paper was released to the media (though maybe not at the time it was written?), the “highly tuned implementation” of simulated annealing that they ask for had already been written and tested, and the result was that it outperformed the D-Wave machine on all instance sizes tested. In other words, their comparison to CPLEX had already been superseded by a much more informative comparison—one that gave the “opposite” result—before it ever became public. For obvious reasons, most press reports have simply ignored this fact. Troyer, Lidar, and Stone Soup Much of what I’ve written in this post, I learned by talking to Matthias Troyer—the man who carefully experimented with the D-Wave machine and figured out how to beat it using simulated annealing, and who I regard as probably the world’s #1 expert right now on what exactly the machine does. Troyer wasn’t shy about sharing his opinions, and while couched with qualifications, they tended toward extremely skeptical. For example, Troyer conjectured that, if D-Wave ultimately succeeds in getting a speedup over classical computers in a fair comparison, then it will probably be by improving coherence and calibration, incorporating error-correction, and doing other things that “traditional,” “academic” quantum computing researchers had said all along would need to be done. As I said, Daniel Lidar is another coauthor on the USC paper, and also recently visited MIT to speak. Lidar and Troyer agree on the basic facts—yet Lidar noticeably differed from Troyer, in trying to give each fact the most “pro-D-Wave spin” it could possibly support. Lidar spoke at our quantum group meeting, not about the D-Wave vs. simulated annealing performance comparison (which he agrees with), but about a proposal of his for incorporating quantum error-correction into the D-Wave device, together with some experimental results. He presented his proposal, not as a reductio ad absurdum of D-Wave’s entire philosophy, but rather as a positive opportunity to get a quantum speedup using D-Wave’s approach. So, to summarize my current assessment of the situation: yes, absolutely, D-Wave might someday succeed—ironically, by adapting the very ideas from “the gate model” that its entire business plan has been based on avoiding, and that D-Wave founder Geordie Rose has loudly denigrated for D-Wave’s entire history! If that’s what happens, then I predict that science writers, and blogs like “nextbigfuture,” will announce from megaphones that D-Wave has been vindicated at last, while its narrow-minded, theorem-obsessed, ivory-tower academic naysayers now have egg all over their faces. No one will care that the path to success—through quantum error-correction and so on—actually proved the academic critics right, and that D-Wave’s “vindication” was precisely like that of the deliciousness of stone soup in the old folktale. As for myself, I’ll probably bang my head on my desk until I sustain so much brain damage that I no longer care either. But at least I’ll still have tenure, and the world will have quantum computers. The Messiah’s Quantum Annealer Over the past few days, I’ve explained the above to at least six different journalists who asked. And I’ve repeatedly gotten a striking response: “What you say makes sense—but then why are all these prestigious people and companies investing in D-Wave? Why did Bo Ewald, a prominent Silicon Valley insider, recently join D-Wave as president of its US operations? Why the deal with Lockheed Martin? Why the huge deal with NASA and Google, just announced today? What’s your reaction to all this news?” My reaction, I confess, is simple. I don’t care—I actually told them this—if the former Pope Benedict has ended his retirement to become D-Wave’s new marketing director. I don’t care if the Messiah has come to Earth on a flaming chariot, not to usher in an age of peace but simply to spend$10 million on D-Wave’s new Vesuvius chip.  And if you imagine that I’ll ever care about such things, then you obviously don’t know much about me.  I’ll tell you what: if peer pressure is where it’s at, then come to me with the news that Umesh Vazirani, or Greg Kuperberg, or Matthias Troyer is now convinced, based on the latest evidence, that D-Wave’s chip asymptotically outperforms simulated annealing in a fair comparison, and does so because of quantum effects.  Any one such scientist’s considered opinion would mean more to me than 500,000 business deals.

The Argument from Consequences

Let me end this post with an argument that several of my friends in physics have explicitly made to me—not in the exact words below but in similar ones.

“Look, Scott, let the investors, government bureaucrats, and gullible laypeople believe whatever they want—and let D-Wave keep telling them whatever’s necessary to stay in business.  It’s unsportsmanlike and uncollegial of you to hold D-Wave’s scientists accountable for whatever wild claims their company’s PR department might make.  After all, we’re in this game too!  Our universities put out all sorts of overhyped press releases, but we don’t complain because we know that it’s done for our benefit.  Besides, you’d doubtless be trumpeting the same misleading claims, if you were in D-Wave’s shoes and needed the cash infusions to survive.  Anyway, who really cares whether there’s a quantum speedup yet or no quantum speedup?  At least D-Wave is out there trying to build a scalable quantum computer, and getting millions of dollars from Jeff Bezos, Lockheed, Google, the CIA, etc. etc. to do so—resources more of which would be directed our way if we showed a more cooperative attitude!  If we care about scalable QCs ever getting built, then the wise course is to celebrate what D-Wave has done—they just demonstrated quantum annealing on 100 qubits, for crying out loud!  So let’s all be grownups here, focus on the science, and ignore the marketing buzz as so much meaningless noise—just like a tennis player might ignore his opponent’s trash-talking (‘your mother is a whore,’ etc.) and focus on the game.”

I get this argument: really, I do.  I even concede that there’s something to be said for it.  But let me now offer a contrary argument for the reader’s consideration.

Suppose that, unlike in the “stone soup” scenario I outlined above, it eventually becomes clear that quantum annealing can be made to work on thousands of qubits, but that it’s a dead end as far as getting a quantum speedup is concerned.  Suppose the evidence piles up that simulated annealing on a conventional computer will continue to beat quantum annealing, if even the slightest effort is put into optimizing the classical annealing code.  If that happens, then I predict that the very same people now hyping D-Wave will turn around and—without the slightest acknowledgment of error on their part—declare that the entire field of quantum computing has now been unmasked as a mirage, a scam, and a chimera.  The same pointy-haired bosses who now flock toward quantum computing, will flock away from it just as quickly and as uncomprehendingly.  Academic QC programs will be decimated, despite the slow but genuine progress that they’d been making the entire time in a “parallel universe” from D-Wave.  People’s contempt for academia is such that, while a D-Wave success would be trumpeted as its alone, a D-Wave failure would be blamed on the entire QC community.

When it comes down to it, that’s the reason why I care about this matter enough to have served as “Chief D-Wave Skeptic” from 2007 to 2011, and enough to resume my post today.  As I’ve said many times, I really, genuinely hope that D-Wave succeeds at building a QC that achieves an unambiguous speedup!  I even hope the academic QC community will contribute to D-Wave’s success, by doing careful independent studies like the USC group did, and by coming up with proposals like Lidar’s for how D-Wave could move forward.  On the other hand, in the strange, unlikely event that D-Wave doesn’t succeed, I’d like people to know that many of us in the QC community were doing what academics are supposed to do, which is to be skeptical and not leave obvious questions unasked.  I’d like them to know that some of us simply tried to understand and describe what we saw in front of us—changing our opinions repeatedly as new evidence came in, but disregarding “meta-arguments” like my physicist friends’ above.  The reason I can joke about how easy it is to bribe me is that it’s actually kind of hard.

### My New York Times essay on quantum computing

Monday, December 5th, 2011

I have a special treat for those commenters who consider me an incorrigible publicity-hound: an essay I was invited to write for the New York Times Science section, entitled Quantum Computing Promises New Insights, Not Just Supermachines.  (My original title was “The Real Reasons to Study Quantum Computing.”)  This piece is part of a collection of essays on “the future of computing,” which include one on self-driving cars by Sebastian Thrun, one on online learning by Daphne Koller, and other interesting stuff (the full list is here).

In writing my essay, the basic constraints were:

(a) I’d been given a rare opportunity to challenge at least ten popular misconceptions about quantum computing, and would kick myself for years if I didn’t hit all of them,

(b) I couldn’t presuppose the reader had heard of quantum computing, and

Satisfying these constraints was harder than it looked, and I benefited greatly from the feedback of friends and colleagues, as well as the enormously helpful Times staff.  I did get one request that floored me: namely, to remove all the material about “interference” and “amplitudes” (too technical), and replace it by something ordinary people could better relate to—like, say, a description of how a quantum computer would work by trying every possible answer in parallel.  Eventually, though, the Gray Lady and I found a compromise that everyone liked (and that actually improved the piece): namely, I’d first summarize the usual “try all answers in parallel” view, and then explain why it was wrong, bringing in the minus signs and Speaking Truth to Parallelism.

To accompany the essay, I also did a short podcast interview about quantum computing with the Times‘ David Corcoran.  (My part starts around 8:20.)  Overall, I’m happy with the interview, but be warned: when Corcoran asks me what quantum computers’ potential is, I start talking about the “try all answers in parallel” misconception—and then they cut to the next question before I get to the part about its being a misconception!  I need to get better at delivering soundbites…

One final comment: in case you’re wondering, those black spots on the Times‘ cartoon of me seem to be artifacts of whatever photo-editing software they used.  They’re not shrapnel wounds or disfiguring acne.

### Better late than never

Tuesday, May 3rd, 2011

No, I’m not talking about Osama, but about my reactions below to a New Yorker article about quantum computing—reactions whose writing was rudely interrupted by last night’s news.  Of all the possible times in the past decade to get him, they had to pick one that would overshadow an important Shtetl-Optimized discussion about complexity theory, the Many-Worlds Interpretation, and the popularization of science?  Well, I guess I’ll let it slide.

As already discussed on Peter Woit’s blog, this week’s New Yorker has a long piece about quantum computing by the novelist Rivka Galchen (unfortunately the article is behind a paywall).  Most of the article is about the quantum computing pioneer David Deutsch: his genius, his eccentricity, his certainty that parallel universes exist, his insistence on rational explanations for everything, his disdain for “intellectual obfuscators” (of whom Niels Bohr is a favorite example), his indifference to most of the problems that occupy other quantum computing researchers, the messiness of his house, his reluctance to leave his house, and his love of the TV show House.

Having spent a wonderful, mind-expanding day with Deutsch in 2002—at his house in Oxford, of course—I can personally vouch for all of the above (except the part about House, which hadn’t yet debuted then).  On the one hand, Deutsch is one of the most brilliant conversationalists I’ve ever encountered; on the other hand, I was astonished to find myself, as a second-year graduate student, explaining to the father of quantum computing what BQP was.  So basically, David Deutsch is someone who merits a New Yorker profile if anyone does.  And I was pleased to see Galchen skillfully leveraging Deutsch’s highly-profilable personality to expose a lay audience (well, OK, a chardonnay-sipping Manhattan socialite audience) to some of the great questions of science and philosophy.

However, reading this article also depressed me, as it dawned on me that the entire thing could have been written fifteen years ago, with only minor changes to the parts about experiment and zero change to the theoretical parts.  I thought: “has there really been that little progress in quantum computing theory the past decade and a half—at least progress that a New Yorker reader would care about?”  Even the sociological observations are dated: Galchen writes about interest in quantum computing as the “Oxford flu,” rather than the “Waterloo flu” or “Caltech flu” that it’s been since 2000 or so (the latter two capitals of the field aren’t even mentioned!).  A good analogy would be an article about the Web, published today, that described the strange and exciting new world of Netscape, HotBot, and AltaVista.

A more serious issue is that the article falls victim to almost every misleading pop-science trope about quantum computing that some of us have trying to correct for the past decade.  For example:

With one millionth of the hardware of an ordinary laptop, a quantum computer could store as many bits of information as there are particles in the universe.

Noooooo!  That’s only for an extremely strange definition of “store”…

Oxford’s eight-qubit quantum computer has significantly less computational power than an abacus, but fifty to a hundred qubits could make something as powerful as any laptop.

Noooooo!  Fifty to a hundred qubits could maybe replace your laptop, if the only thing you wanted to use your laptop for was simulating a system of fifty to a hundred qubits…

In a 1985 paper, Deutsch pointed out that, because Turing was working with classical physics, his universal computer could imitate only a subset of possible computers.  Turing’s theory needed to account for quantum mechanics if its logic was to hold.  Deutsch proposed a universal quantum computer based on quantum physics, which would have calculating powers that Turing’s computer (even in theory) could not simulate.

There are at least three problems here.  The first is conflating simulation with efficient simulation.  At the risk of going hoarse, a classical Turing machine can calculate absolutely everything that a quantum computer can calculate! It might “merely” need exponentially more time.  Second, no one has proved that a classical Turing machine really does need exponentially more time, i.e., that it can’t efficiently simulate a quantum computer.  That remains a (deep, plausible, and widely-believed) conjecture, which will take enormous mathematical advances to resolve.  And third, Deutsch’s landmark paper wasn’t among the ones to give evidence for that conjecture.  The first such evidence only came later, with the work of Bernstein-Vazirani, Simon, and Shor.

To be fair to Galchen, Deutsch himself has often been inaccurate on these points, even though he ought to (and does!) know better.  Specifically, he conflates the original Church-Turing Thesis (which isn’t challenged in the slightest by quantum computing) with its modern, polynomial-time version (which is), and he neglects to mention the conjectural status of quantum computers’ speedup.  Here are two examples out of many, from The Fabric of Reality:

“quantum computers can perform computations of which no (human) mathematician will ever, even in principle, be capable.”
“if the visible universe were the extent of physical reality, physical reality would not even remotely contain the resources required to factorize such a large number.”

Am I just harping over technicalities here?  In my view, the issue goes deeper.  All of the above oversights can be understood as symptoms of complexophobia: the fear of acknowledging that one is actually making statements about computational complexity theory.  Again and again, I’ve seen science writers go through strange verbal contortions to avoid the question of how anyone could know that a computation inherently requires a huge amount of time—as if the reader must be prevented, at all costs, from seeing such a claim as anything other than obvious.  It can be fascinating to watch, in the same way it’s fascinating to watch a politician discuss (say) Confederate History Month without mentioning slavery.  How long can you poke and prod the P versus NP beast without rousting it?

On the other hand, complexity theory does show up in Galchen’s article, and in an extremely interesting context: that of explaining where Deutsch got the idea for quantum computing.

According to Deutsch, the insight for [his famous 1985 paper] came from a conversation in the early eighties with the physicist Charles Bennett, of I.B.M., about computational-complexity theory, at the time a sexy new field that investigated the difficulty of a computational task.

Is “at the time” meant to imply complexity theory is no longer sexy, or merely that it’s no longer new?  Leaving that aside…

Mass, for instance, is a fundamental property, because it remains the same in any setting; weight is a relative property, because an object’s weight depends on the strength of gravity acting on it … If computational complexity was like mass—if it was a relative property—then complexity was quite profound; if not, then not.

“I was just sounding off,” Deutsch said.  “I said they make too much of this”—meaning complexity theory—”because there’s no standard computer with respect to which you should be calculating the complexity of the task.”  Just as an object’s weight depends on the force of gravity in which it’s measured, the degree of computational complexity depended on the computer on which it was measured.  One could find out how complex a task was to perform on a particular computer, but that didn’t say how complex a task was fundamentally, in reference to the universe … Complexity theorists, Deutsch reasoned, were wasting their time.

The tale continues with Bennett pointing out that the universe itself could be taken to be the “fundamental computer,” which leads Deutsch to the shocking realization that the complexity theorists weren’t complete morons. Sure, they had a silly theory where all the answers depended on which computer you chose (which somehow none of them ever noticed), but luckily, it could be fixed by the simple addition of quantum mechanics!

Over the anguished howls of my classical complexity-theorist friends, I should point out that this story isn’t completely false.  There’s no denying that merging quantum mechanics with theoretical computer science was a major advance in human knowledge, and that the people who first had the idea to merge the two were not computer scientists, but physicists like Deutsch and Feynman (the latter’s role is completely left out of Galchen’s story).

But complexity theory wasn’t so much a flawed early attempt at quantum computing as an essential prerequisite to it: the thing that made it possible to articulate how quantum computers might differ from classical computers in the first place.  Indeed, it occurs to me that Deutsch and Bennett’s conversation provides the key to resolving a puzzle discussed in the article:

“Quantum computers should have been invented in the nineteen-thirties,” [Deutsch] observed near the end of our conversation.  “The stuff that I did in the late nineteen-seventies and early nineteen-eighties didn’t use any innovation that hadn’t been known in the thirties.”  That is straightforwardly true.  Deutsch went on, “The question is why.”

I used to go around saying the same thing: “someone like John von Neumann could have easily invented quantum computing in the 1930s, had he just put the pieces together!”  But I now suspect this view is a mistake, the result of projecting what’s obvious today onto a much earlier era.  For there’s at least one essential ingredient for quantum computing that wouldn’t enter scientific consciousness until the 1970s or so: complexity theory, and particularly the distinction between polynomial and exponential time.

Over the years, I’ve developed what I call the Minus-Sign Test, a reliable way to rate popularizations of quantum mechanics.  To pass the Minus-Sign Test, all a popularization needs to do is mention the minus signs: i.e., interference between positive and negative amplitudes, the defining feature of quantum mechanics, the thing that makes it different from classical probability theory, the reason why we can’t say Schrödinger’s cat is “really either dead or alive,” and we simply don’t know which one, the reason why the entangled particles can’t have just agreed in advance that one would spin up and the other would spin down.  Another name for the Minus-Sign Test is the High-School Student Test, since it’s the thing that determines whether a bright high-school student, meeting quantum mechanics for the first time through the popularization, would come away thinking of superposition as

(b) a synonym used by some famous authority figures for ignorance.

Despite the low bar set by the Minus-Sign Test, I’m afraid almost every popular article about quantum mechanics ever written has failed it, the present piece included.

Reading Not Even Wrong, I was surprised at first that the discussion centered around Deutsch’s argument that quantum computing proves the existence of Many Worlds.  (More precisely, Deutsch’s position is that Many Worlds is an established fact with or without quantum computing, but that for those who are too dense or stubborn to see it, a working quantum computer will be useful for hitting them over the head.)

As others pointed out: yes, the state of the universe as described by quantum mechanics is a vastly, exponentially bigger thing than anything dreamt of in classical physics; and a scalable quantum computer would be dramatic evidence that this exponentiality is really “out there,” that it’s not just an artifact of our best current theory.  These are not merely truths, but truths worth shouting from the rooftops.

However, there’s then the further question of whether it’s useful to talk about one quantum-mechanical universe as an exponential number of parallel semi-classical universes.  After all, to whatever extent the branches of a superposition successfully contribute to a quantum computation, to that extent they’re not so much “parallel universes” as one giant, fault-tolerantly-encoded, self-interfering blob; and to whatever extent those branches do look like parallel universes, to that extent they’re now forever out of causal contact with each other—the branches other than our own figuring into our explanations for observable events only in the way that classical counterfactuals figure in.

Anyway, I thought: does anyone still care about these issues?  Wasn’t every possible argument and counterargument explored to death years ago?

But this reaction just reveals my personal bias.  Sometime in graduate school, I realized that I was less interested in winning philosophical debates than in discovering new phenomena for philosophers to debate about.  Why brood over the true meaning of (say) Gödel’s Theorem or the Bell Inequality, when there are probably other such worldview-changing results still to be found, and those results might render the brooding irrelevant anyway?  Because of this attitude, I confess to being less interested in whether Many-Worlds is true than in whether it’s scientifically fruitful.  As Peter Shor once memorably put it on this blog: why not be a Many-Worlder on Monday, a Bohmian on Tuesday, and a Copenhagenist on Wednesday, if that’s what helps you prove new theorems?

Ironically, this attitude seems to me to mesh well with Deutsch’s own emphasis on explanation as the goal of science.  Ask not whether the parallel universes are “really there,” or whether they should really be called “parallel universes”—ask what explanatory work they do for you!  (That is, over and above the explanatory work that QM itself already does for you, assuming you accept it and know how to use it.)

So for me, the single strongest argument in favor of Many-Worlds is what I call the “Deutsch argument”:

Many-Worlds is scientifically fruitful, because it led David Deutsch to think of quantum computing.

This argument carries considerable force for me.  On the other hand, if we accept it, then it seems we should also accept the following argument:

Bohmian mechanics is scientifically fruitful, because it led John Bell to think of the Bell inequality.

Furthermore, consider the following facts:

David Deutsch is a brilliant, iconoclastic theoretical physicist, who thought deeply about quantum foundations at a time when it was unfashionable to do so.  His extraordinary (and not wholly-unjustified!) self-confidence in his own powers of reasoning has led to his defending not one but many heterodox ideas.

Is it possible that these facts provide a common explanation for Deutsch’s certainty about Many-Worlds and his pioneering role in quantum computing, without our needing to invoke the former to explain the latter?

Let me end with a few miscellaneous reactions to Galchen’s article.

Physics advances by accepting absurdities.  Its history is one of unbelievable ideas proving to be true.

I’d prefer to say the history of physics is one of a vast number of unbelievable ideas proving to be false, and a few specific unbelievable ideas proving to be true—especially ideas having to do with the use of negative numbers where one might have thought only positive numbers made sense.

[Robert Schoelkopf and his group at Yale] have configured their computer to run what is known as a Grover’s algorithm, one that deals with a four-card-monte type of question: Which hidden card is the queen?  It’s a sort of Shor’s algorithm for beginners, something that a small quantum computer can take on.

No, small quantum computers can and have taken on both Shor’s and Grover’s algorithms, solving tiny instances in each case.  The real difference between Shor’s and Grover’s algorithms is one that complexophobia might prevent Galchen from mentioning: Shor gives you a (conjectured) exponential speedup for some highly-specific problems (factoring and discrete log), while Grover gives you “merely” a quadratic speedup, but for a much wider class of problems.

“Look,” [Deutsch] went on, “I can’t stop you from writing an article about a weird English guy who thinks there are parallel universes.  But I think that style of thinking is kind of a put-down to the reader.  It’s almost like saying, If you’re not weird in these ways, you’ve got no hope as a creative thinker.  That’s not true.  The weirdness is only superficial.”

This was my favorite passage in the article.

### Hopefully my last D-Wave post ever

Thursday, December 17th, 2009

Several people asked me to comment on an entry by Hartmut Neven in the Google Research Blog, about using D-Wave’s “quantum” computers for image recognition.

I said nothing: what is there to say?  Didn’t I already spend enough time on this subject for 10400 lifetimes?  I want to create, explore, discover things that no one expected—not be some talking-head playing his assigned role in a script, a blogger-pundit who journalists know they can rely on to say “f(X)” whenever X happens.  Even if f(X) is true.  Why can’t I just tell the world what f is and be done with it?

Then more people asked me to comment.

I set the matter aside.  I worked on the complexity problem that’s currently obsessing me.  I met with students, sent recommendation letters, answered emails, went ice-skating with my girlfriend.

Then more people asked me to comment.

And I thought: yes, I believe it’s vital for scientists to communicate with the broader public, not just a few colleagues.  And yes, it’s important for scientists to offer a skeptical perspective on the news—since otherwise, they implicitly cede the field to those making dubious and unsubstantiated claims.  And yes, blogging is a wonderful tool for scientists to connect directly with anyone in the world who’s curious about their work.  But isn’t there some statute of limitations on a given story?  When does it end?  And why me?

Then more people asked me to comment—so I wrote the following only-slightly-fictionalized exchange.

Skeptic: Let me see if I understand correctly.  After three years, you still haven’t demonstrated two-qubit entanglement in a superconducting device (as the group at Yale appears to have done recently)?  You still haven’t explained how your “quantum computer” demos actually exploit any quantum effects?  While some of your employees are authoring or coauthoring perfectly-reasonable papers on various QC topics, those papers still bear essentially zero relation to your marketing hype?  The academic physicists working on superconducting QC—who have no interest in being scooped—still pay almost no attention to you?  So, what exactly has changed since the last ten iterations?  Why are we still talking?

D-Wave: Then you must not have read our latest press release!  Your questions are all obsolete, because now we’re recruiting thousands of volunteers over the Internet to study the power of adiabatic quantum computing!

Onlooker: Hmm, an interesting counterargument!  D-Wave might not be using quantum mechanics, but they are using the Internet!  And their new project even has a cool code-name: “AQUA@home”!  So, skeptic, how do you respond to that?

Skeptic (distractedly): You know, when I was eight years old, and dreamed of building starships and artificial intelligences in my basement, my first order of business was always to invent code-names—not just for the projects themselves, but for every little subcomponent of them.  The second order of business was to think through the marketing aspects.  What should the robot look like?  What recreational facilities should be available on the starship, and what color should it be painted?  It really, genuinely felt like I was making concrete progress toward realizing my plans.  Sure, the engine and control system still needed to be built, but at least I had code-names and “design specs”!  How many others had even gotten that far?

D-Wave: Who cares?  This isn’t some children’s game.  Keep in mind that we’re delivering a product—serving our customers, by solving the 4-by-4 Sudoku puzzles they rely on to keep their businesses running.

Skeptic: We’ve been through this how many times?  A pigeon can probably be trained to solve 4-by-4 Sudokus.  So the only relevant questions concern the details of how you solve them.  For example, how do you encode a problem instance?  How much of the work is done in the encoding procedure itself?  What evidence do you have for quantum coherence at intermediate points of the computation?  Can you measure an entanglement witness, to give people confidence that you’re doing something other than classical simulated annealing?

Onlooker: Hmm, those do seem like important questions…

D-Wave: But they’re based on outdated premises!  Today, we’re pleased to announce that, using what might be a quantum computer, and might also be a noisy, probabilistic classical computer, we can solve 5-by-5 Sudoku puzzles!

Onlooker: Whoa, awesome!  So we’re back to square one then.  As long as D-Wave’s demos only involved 4-by-4 Sudokus, the skeptic’s arguments almost had me persuaded.  But 5-by-5?  I don’t know what to think anymore.  Skeptic, where are you?  What’s your reaction to this latest development?

Skeptic:

D-Wave: That silence you hear is the sound of the skeptic’s worldview crashing all around him!  But we haven’t even played our top card yet.  Today, we’re positively ecstatic to announce that we’ve entered into an official-sounding partnership with GOOGLE, Inc. (or anyway, with someone who works at Google Research).  Together, we’re harnessing the power of quantum adiabatic optimization to create the next generation of car-recognition systems!

Onlooker: WOW!  This debate is over, then.  I confess: D-Wave on its own did seem a bit flaky to me.  But Google is the company born without sin.  Everything they do, have done, and will ever do is perfect by definition—from building the search engine that changed the world, to running mail servers that only fail for an insignificant 0.001% of users, to keeping the Chinese people safe from lies.  And, as Google is infallible, so too its 20,000 diverse employees—who are encouraged to spend 20% of their time on high-risk, exploratory projects—have nevertheless failed to come up with a single idea that didn’t pan out.  Skeptic, show your face!  Will you admit that, through grit, moxie, old-fashioned Canadian inventiveness, and the transformative power of the Internet, D-Wave has finally achieved what the naysayers said was impossible—namely, getting someone from Google Research to coauthor a paper with them?

Skeptic: Yes.  I concede!  D-Wave wins, and I hereby retire as skeptic.  In particular, the next time D-Wave announces something, there’s no need to ask me for my reaction.  I’ll be busy tending to my own project, codenamed ARGHH@home, which consists of banging my head against a brick wall.

### Quantum Computing Since Democritus Lecture 13: How Big Are Quantum States?

Sunday, June 8th, 2008

A year and a half after the actual course, the remaining Democritus lecture notes are finally being transcribed and posted — thanks to my new summer student, Chris Granade. In Lecture 13, you can read about why QMA, QCMA, and BQP/qpoly are not merely complexity classes but battlefields, where competing visions of what quantum states really are face each other off in conflicts that we as theorists intentionally provoked. (Work with me here.)

### Scientific American article is out!

Sunday, February 17th, 2008

After three years of procrastination and delays, my 8-page feature article on “The Limits of Quantum Computers” has finally appeared in the March issue of Scientific American. Once I get permission, I’ll post a plain-text version on my website. In the meantime, you can buy the online issue for US\$5.00 from SciAm‘s website, in which case you get colorful sidebars and graphics (including a bearded, white-lab-coated cartoon scientist holding quantum computers and complexity class inclusion diagrams), as well as an interview with Jeff Kimble about the unfortunate movie “Jumper”, and other articles about “the end of cosmology”, prediction markets (Robin Hanson and his “futarchy” get a mention), and the disastrous overfishing of the bluefin tuna (the kind used for toro sushi).

Update (2/18): By popular demand, I’m posting a rough early draft (PDF) of my article online. Read at your own risk!

So, what was it like to write for Scientific American? Exhausting, excruciating, and ultimately worth it. As a general rule, SciAm (probably like all large-circulation magazines) rewrites articles so extensively that the person listed in the byline is less the “writer” than the “content consultant.” Almost every sentence in my article bears the scars of battle (some that I won, more that I didn’t). Yet I have to concede that, when something was really cringe-inducingly wrong, SciAm was willing to listen and make changes — and besides, they did a great job with the cartoons. I’m happy with the end result. Thanks to George Musser, the editor who solicited the article and gave me lots of feedback in the early stages, and to Graham Collins, who finally saw it into print.

• In an earlier draft, the cartoons adorning my article were “all balding white guy, all the time” (supposedly, because of the need to keep a “consistent character” throughout the article). I demanded some sort of cartoon-diversity. After a heated discussion among the editors — in which, I’m told, the name of Larry Summers was invoked — they finally agreed to add a cartoon black woman. To those who think I’m a male chauvinist pig: how many brownie points do I get?
• No, the crystal ball with floating ψ’s and φ’s, mounted atop a keyboard, is not an accurate depiction of what a quantum computer would look like. Having toured some actual QC labs, though, I had to admit it worked better graphically than a lab table piled high with tinfoil, lasers, and assorted pipes.
• The topic label of the article is “Information Technology.” I pleaded with them to change the topic to “Computer Science,” but to no avail. Apparently the problem was that in the table of contents, the previous two articles were labeled “Prediction Science” and “Brain Science.”
• The complexity class inclusion diagram on page 67 was a key concession I did win. (Apparently some editors felt a Venn diagram with P, NP, BQP, and PSPACE would be way too complicated, even for readers who regularly gobble down heaping helpings of M-theory.) As you can imagine, exposing people to this stuff seemed pretty important to me: this is apparently the first time P, NP, and NP-completeness have been explained at any length in Scientific American since articles by Knuth and by Lewis and Papadimitriou in the 1970′s.
• In the author bio on page 67, the description of me as a “high school dropout” is a slight exaggeration, but there’s no other short term for what I am (see here for more).
• I had nothing to do with the sidebar on page 68, about Vernor Vinge’s novel A Fire Upon the Deep. I’ve never read that (or anything else by Vinge for that matter).
• My original draft included explanations of both the polynomial and adversary methods for quantum lower bounds, with references to BBCMdW and Ambainis. Shockingly, all of that was cut, while the part about time machines was greatly expanded.

During the hairiest parts of editing process, I was reminded of a passage in Anita and Solomon Feferman’s biography of the great logician Alfred Tarski, which described Tarski’s writing of an article for Scientific American (the only popular article he ever wrote).

Usually the Scientific American articles are heavily edited; many are rewriteen and some even ghostwritten, but Wisnovsky [Tarski's editor] knew better than to tamper with Tarski’s work and did not — except for his usage of ‘which’ and ‘that’. It seemed to him that Tarski did a 180-degree reversal of these words, so he changed every ‘which’ to ‘that’ and every ‘that’ to ‘which’ and sent the proofs to Tarski, who changed everything back to the way it had been. Wisnovsky got out Fowler’s Dictionary of Modern English Usage, the house bible, and called Tarski on the telephone. “I asked if I could read him the relevant passage on ‘that’ and ‘which’ and he said, ‘yes’. It goes on for pages, but he listened very patiently until I finished. Then he said, ‘Well, you see, that is Fowler. I am Tarski.’ The minute he said that I caved in. I felt cut off at the knees and I gave up trying to make any changes at all.”

Yet, while the “Tarski approach” to magazine writing is a tempting one, here’s the final irony. I looked up Tarski’s actual article from 1969, and it badly needed an editor.