## Google, D-Wave, and the case of the factor-10^8 speedup for WHAT?

Update (Dec. 16):  If you’re still following this, please check out an important comment by Alex Selby, the discoverer of Selby’s algorithm, which I discussed in the post.  Selby queries a few points in the Google paper: among other things, he disagrees with their explanation of why his classical algorithm works so well on D-Wave’s Chimera graph (and with their prediction that it should stop working for larger graphs), and he explains that Karmarkar-Karp is not the best known classical algorithm for the Number Partitioning problem.  He also questions whether simulated annealing is the benchmark against which everything should be compared (on the grounds that “everything else requires fine-tuning”), pointing out that SA itself typically requires lots of tuning to get it to work well.

Update (Dec. 11): MIT News now has a Q&A with me about the new Google paper. I’m really happy with how the Q&A turned out; people who had trouble understanding this blog post might find the Q&A easier. Thanks very much to Larry Hardesty for arranging it.

Meanwhile, I feel good that there seems to have been actual progress in the D-Wave debate! In previous rounds, I had disagreed vehemently with some of my MIT colleagues (like Ed Farhi and Peter Shor) about the best way to respond to D-Wave’s announcements. Today, though, at our weekly group meeting, there was almost no daylight between any of us. Partly, I’m sure, it’s that I’ve learned to express myself better; partly it’s that the “trigger” this time was a serious research paper by a group separate from D-Wave, rather than some trash-talking statement from Geordie Rose. But mostly it’s that, thanks to the Google group’s careful investigations, this time pretty much anyone who knows anything agrees about all the basic facts, as I laid them out in this blog post and in the Q&A. All that remains are some small differences in emotional attitude: e.g., how much of your time do you want to spend on a speculative, “dirty” approach to quantum computing (which is far ahead of everyone else in terms of engineering and systems integration, but which still shows no signs of an asymptotic speedup over the best classical algorithms, which is pretty unsurprising given theoretical expectations), at a time when the “clean” approaches might finally be closing in on the long-sought asymptotic quantum speedup?

Another Update: Daniel Lidar was nice enough to email me an important observation, and to give me permission to share it here.  Namely, the D-Wave 2X has a minimum annealing time of 20 microseconds.  Because of this, the observed running times for small instance sizes are artificially forced upward, making the growth rate in the machine’s running time look milder than it really is.  (Regular readers might remember that exactly the same issue plagued previous D-Wave vs. classical performance comparisons.)  Correcting this would certainly decrease the D-Wave 2X’s predicted speedup over simulated annealing, in extrapolations to larger numbers of qubits than have been tested so far (although Daniel doesn’t know by how much).  Daniel stresses that he’s not criticizing the Google paper, which explicitly mentions the minimum annealing time—just calling attention to something that deserves emphasis.

In retrospect, I should’ve been suspicious, when more than a year went by with no major D-Wave announcements that everyone wanted me to react to immediately. Could it really be that this debate was over—or not “over,” but where it always should’ve been, in the hands of experts who might disagree vehemently but are always careful to qualify speedup claims—thereby freeing up the erstwhile Chief D-Wave Skeptic for more “””rewarding””” projects, like charting a middle path through the Internet’s endless social justice wars?

Nope.

As many of you will have seen by now, on Monday a team at Google put out a major paper reporting new experiments on the D-Wave 2X machine.  (See also Hartmut Neven’s blog post about this.)  The predictable popularized version of the results—see for example here and here—is that the D-Wave 2X has now demonstrated a factor-of-100-million speedup over standard classical chips, thereby conclusively putting to rest the question of whether the device is “truly a quantum computer.”  In the comment sections of one my previous posts, D-Wave investor Steve Jurvetson even tried to erect a victory stele, by quoting Karl Popper about falsification.

In situations like this, the first thing I do is turn to Matthias Troyer, who’s arguably the planet’s most balanced, knowledgeable, trustworthy interpreter of quantum annealing experiments. Happily, in collaboration with Ilia Zintchenko and Ethan Brown, Matthias was generous enough to write a clear 3-page document putting the new results into context, and to give me permission to share it on this blog. From a purely scientific standpoint, my post could end right here, with a link to their document.

Then again, from a purely scientific standpoint, the post could’ve ended even earlier, with the link to the Google paper itself!  For this is not a case where the paper hides some crucial issue that the skeptics then need to ferret out.  On the contrary, the paper’s authors include some of the most careful people in the business, and the paper explains the caveats as clearly as one could ask.  In some sense, then, all that’s left for me or Matthias to do is to tell you what you’d learn if you read the paper!

So, OK, has the D-Wave 2X demonstrated a factor-108 speedup or not?  Here’s the shortest answer that I think is non-misleading:

Yes, there’s a factor-108 speedup that looks clearly asymptotic in nature, and there’s also a factor-108 speedup over Quantum Monte Carlo. But the asymptotic speedup is only if you compare against simulated annealing, while the speedup over Quantum Monte Carlo is only constant-factor, not asymptotic. And in any case, both speedups disappear if you compare against other classical algorithms, like that of Alex Selby. Also, the constant-factor speedup probably has less to do with quantum mechanics than with the fact that D-Wave built extremely specialized hardware, which was then compared against a classical chip on the problem of simulating the specialized hardware itself (i.e., on Ising spin minimization instances with the topology of D-Wave’s Chimera graph). Thus, while there’s been genuine, interesting progress, it remains uncertain whether D-Wave’s approach will lead to speedups over the best known classical algorithms, let alone to speedups over the best known classical algorithms that are also asymptotic or also of practical importance. Indeed, all of these points also remain uncertain for quantum annealing as a whole.

To expand a bit, there are really three separate results in the Google paper:

1. The authors create Chimera instances with tall, thin energy barriers blocking the way to the global minimum, by exploiting the 8-qubit “clusters” that play such a central role in the Chimera graph.  In line with a 2002 theoretical prediction by Farhi, Goldstone, and Gutmann (a prediction we’ve often discussed on this blog), they then find that on these special instances, quantum annealing reaches the global minimum exponentially faster than classical simulated annealing, and that the D-Wave machine realizes this advantage.  As far as I’m concerned, this completely nails down the case for computationally-relevant collective quantum tunneling in the D-Wave machine, at least within the 8-qubit clusters.  On the other hand, the authors point out that there are other classical algorithms, like that of Selby (building on Hamze and de Freitas), which group together the 8-bit clusters into 256-valued mega-variables, and thereby get rid of the energy barrier that kills simulated annealing.  These classical algorithms are found empirically to outperform the D-Wave machine.  The authors also match the D-Wave machine’s asymptotic performance (though not the leading constant) using Quantum Monte Carlo, which (despite its name) is a classical algorithm often used to find quantum-mechanical ground states.
2. The authors make a case that the ability to tunnel past tall, thin energy barriers—i.e., the central advantage that quantum annealing has been shown to have over classical annealing—might be relevant to at least some real-world optimization problems.  They do this by studying a classic NP-hard problem called Number Partitioning, where you’re given a list of N positive integers, and your goal is to partition the integers into two subsets whose sums differ from each other by as little as possible.  Through numerical studies on classical computers, they find that quantum annealing (in the ideal case) and Quantum Monte Carlo should both outperform simulated annealing, by roughly equal amounts, on random instances of Number Partitioning.  Note that this part of the paper doesn’t involve any experiments on the D-Wave machine itself, so we don’t know whether calibration errors, encoding loss, etc. will kill the theoretical advantage over simulated annealing.  But even if not, this still wouldn’t yield a “true quantum speedup,” since (again) Quantum Monte Carlo is a perfectly-good classical algorithm, whose asymptotics match those of quantum annealing on these instances.
3. Finally, on the special Chimera instances with the tall, thin energy barriers, the authors find that the D-Wave 2X reaches the global optimum about 108 times faster than Quantum Monte Carlo running on a single-core classical computer.  But, extremely interestingly, they also find that this speedup does not grow with problem size; instead it simply saturates at ~108.  In other words, this is a constant-factor speedup rather than an asymptotic one.  Now, obviously, solving a problem “only” 100 million times faster (rather than asymptotically faster) can still have practical value!  But it’s crucial to remember that this constant-factor speedup is only observed for the Chimera instances—or in essence, for “the problem of simulating the D-Wave machine itself”!  If you wanted to solve something of practical importance, you’d first need to embed it into the Chimera graph, and it remains unclear whether any of the constant-factor speedup would survive that embedding.  In any case, while the paper isn’t explicit about this, I gather that the constant-factor speedup disappears when one compares against (e.g.) the Selby algorithm, rather than against QMC.

So then, what do I say to Steve Jurvetson?  I say—happily, not grudgingly!—that the new Google paper provides the clearest demonstration so far of a D-Wave device’s capabilities.  But then I remind him of all the worries the QC researchers had from the beginning about D-Wave’s whole approach: the absence of error-correction; the restriction to finite-temperature quantum annealing (moreover, using “stoquastic Hamiltonians”), for which we lack clear evidence for a quantum speedup; the rush for more qubits rather than better qubits.  And I say: not only do all these worries remain in force, they’ve been thrown into sharper relief than ever, now that many of the side issues have been dealt with.  The D-Wave 2X is a remarkable piece of engineering.  If it’s still not showing an asymptotic speedup over the best known classical algorithms—as the new Google paper clearly explains that it isn’t—then the reasons are not boring or trivial ones.  Rather, they seem related to fundamental design choices that D-Wave made over a decade ago.

The obvious question now is: can D-Wave improve its design, in order to get a speedup that’s asymptotic, and that holds against all classical algorithms (including QMC and Selby’s algorithm), and that survives the encoding of a “real-world” problem into the Chimera graph?  Well, maybe or maybe not.  The Google paper returns again and again to the subject of planned future improvements to the machine, and how they might clear the path to a “true” quantum speedup. Roughly speaking, if we rule out radical alterations to D-Wave’s approach, there are four main things one would want to try, to see if they helped:

1. Lower temperatures (and thus, longer qubit lifetimes, and smaller spectral gaps that can be safely gotten across without jumping up to an excited state).
2. Better calibration of the qubits and couplings (and thus, ability to encode a problem of interest, like the Number Partitioning problem mentioned earlier, to greater precision).
3. The ability to apply “non-stoquastic” Hamiltonians.  (D-Wave’s existing machines are all limited to stoquastic Hamiltonians, defined as Hamiltonians all of whose off-diagonal entries are real and non-positive.  While stoquastic Hamiltonians are easier from an engineering standpoint, they’re also the easiest kind to simulate classically, using algorithms like QMC—so much so that there’s no consensus on whether it’s even theoretically possible to get a true quantum speedup using stoquastic quantum annealing.  This is a subject of active research.)
4. Better connectivity among the qubits (thereby reducing the huge loss that comes from taking problems of practical interest, and encoding them in the Chimera graph).

(Note that “more qubits” is not on this list: if a “true quantum speedup” is possible at all with D-Wave’s approach, then the 1000+ qubits that they already have seem like more than enough to notice it.)

Anyway, these are all, of course, things D-Wave knows about and will be working on in the near future. As well they should! But to repeat: even if D-Wave makes all four of these improvements, we still have no idea whether they’ll see a true, asymptotic, Selby-resistant, encoding-resistant quantum speedup. We just can’t say for sure that they won’t see one.

In the meantime, while it’s sometimes easy to forget during blog-discussions, the field of experimental quantum computing is a proper superset of D-Wave, and things have gotten tremendously more exciting on many fronts within the last year or two.  In particular, the group of John Martinis at Google (Martinis is one of the coauthors of the Google paper) now has superconducting qubits with orders of magnitude better coherence times than D-Wave’s qubits, and has demonstrated rudimentary quantum error-correction on 9 of them.  They’re now talking about scaling up to ~40 super-high-quality qubits with controllable couplings—not in the remote future, but in, like, the next few years.  If and when they achieve that, I’m extremely optimistic that they’ll be able to show a clear quantum advantage for something (e.g., some BosonSampling-like sampling task), if not necessarily something of practical importance.  IBM Yorktown Heights, which I visited last week, is also working (with IARPA funding) on integrating superconducting qubits with many-microsecond coherence times.  Meanwhile, some of the top ion-trap groups, like Chris Monroe’s at the University of Maryland, are talking similarly big about what they expect to be able to do soon. The “academic approach” to QC—which one could summarize as “understand the qubits, control them, keep them alive, and only then try to scale them up”—is finally bearing some juicy fruit.

(At last week’s IBM conference, there was plenty of D-Wave discussion; how could there not be? But the physicists in attendance—I was almost the only computer scientist there—seemed much more interested in approaches that aim for longer-laster qubits, fault-tolerance, and a clear asymptotic speedup.)

I still have no idea when and if we’ll have a practical, universal, fault-tolerant QC, capable of factoring 10,000-digit numbers and so on.  But it’s now looking like only a matter of years until Gil Kalai, and the other quantum computing skeptics, will be forced to admit they were wrong—which was always the main application I cared about anyway!

So yeah, it’s a heady time for QC, with many things coming together faster than I’d expected (then again, it was always my personal rule to err on the side of caution, and thereby avoid contributing to runaway spirals of hype).  As we stagger ahead into this new world of computing—bravely, coherently, hopefully non-stoquastically, possibly fault-tolerantly—my goal on this blog will remain what it’s been for a decade: not to prognosticate, not to pick winners, but merely to try to understand and explain what has and hasn’t already been shown.

Update (Dec. 10): Some readers might be interested in an economic analysis of the D-Wave speedup by commenter Carl Shulman.

Another Update: Since apparently some people didn’t understand this post, here are some comments from a Y-Combinator thread about the post that might be helpful:

(1) [T]he conclusion of the Google paper is that we have probable evidence that with enough qubits and a big enough problem it will be faster for a very specific problem compared to a non-optimal classical algorithm (we have ones that are for sure better).

This probably sounds like a somewhat useless result (quantum computer beats B-team classical algorithm), but it is in fact interesting because D-Wave’s computers are designed to perform quantum annealing and they are comparing it to simulated annealing (the somewhat analogous classical algorithm). However they only found evidence of a constant (i.e. one that 4000 qubits wouldn’t help with) speed up (though a large one) compared to a somewhat better algorithm (Quantum Monte Carlo, which is ironically not a quantum algorithm), and they still can’t beat an even better classical algorithm (Selby’s) at all, even in a way that won’t scale.

Scott’s central thesis is that although it is possible there could be a turning point past 2000 qubits where the D-Wave will beat our best classical alternative, none of the data collected so far suggests that. So it’s possible that a 4000 qubit D-Wave machine will exhibit this trend, but there is no evidence of it (yet) from examining a 2000 qubit machine. Scott’s central gripe with D-Wave’s approach is that they don’t have any even pie-in-the-sky theoretical reason to expect this to happen, and scaling up quantum computers without breaking the entire process is much harder than for classical computers so making them even bigger doesn’t seem like a solution.

(2) DWave machines are NOT gate quantum computers; they call their machine quantum annealing machines. It is not known the complexity class of problems that can be solved efficiently by quantum annealing machines, or if that class is equivalent to classical machines.

The result shows that the DWave machine is asymptotically faster than the Simulated Annealing algorithm (yay!), which suggests that it is executing the Quantum Annealing algorithm. However, the paper also explicitly states that this does not mean that the Dwave machine is exhibiting a ‘quantum speedup’. To do this, they would need to show it to outperform the best known classical algorithm, which as the paper acknowledges, it does not.

What the paper does seem to be showing is that the machine in question is actually fundamentally quantum in nature; it’s just not clear yet that that the type of quantum computer it is is an improvement over classical ones.

(3) [I]t isn’t called out in the linked blog since by now Scott probably considers it basic background information, but D-Wave only solves a very particular problem, and it is both not entirely clear that it has a superior solution to that problem than a classical algorithm can obtain and it is not clear that encoding real problems into that problem will not end up costing you all of the gains itself. Really pragmatic applications are still a ways into the future. It’s hard to imagine what they might be when we’re still so early in the process, and still have no good idea what either the practical or theoretical limits are.

(4) The popular perception of quantum computers as “doing things in parallel” is very misleading. A quantum computer lets you perform computation on a superposed state while maintaining that superposition. But that only helps if the structure of the problem lets you somehow “cancel out” the incorrect results leaving you with the single correct one. [There’s hope for the world! –SA]

### 222 Responses to “Google, D-Wave, and the case of the factor-10^8 speedup for WHAT?”

1. fred Says:

Bleh…

2. Jordan Says:

As long as we have no proof one way or the other about the relationship between P and BQP, we can’t say anything definitively about the delineation of power among these machines (Dwave, Martinis’, Intel or otherwise).

3. Ash Williams Says:

They say science proceeds one funeral at a time, but sometimes it seems more like it proceeds one settled bet at a time 🙂

4. fred Says:

To me, the *only* interesting aspect of D-Wave is the fact that it forces everyone to come up with good science to compare theoretical asymptotic speedups between wildly different hardware and algorithms.
Hopefully all this work will be useful once we’re dealing with a real QC.

5. Michael Nielsen Says:

Just wanted to say thanks for writing this, Scott – it must have been a lot of work, and it’s tremendously informative.

6. Dov Says:

Well Scott, it looks like your skepticism has become a severe bias. I hope you realize that biases are a liability for scientists.

7. Scott Says:

Michael #5: Thanks!!!

Dov #6: Instead of posting a content-free comment like that one, why don’t you help us by explaining what my biases caused me to get wrong?

8. Rhenium Says:

I’m a research scientist, albeit in chemistry. I frequent Scott’s site to hear from someone much cleverer than myself on this sort of stuff. I am aware of Scott’s previous skepticism, does this mean he is won over?

Can someone use the equivalent of XKCD’s “thing explainer” to tell me outcome of the current paper?

9. Ian Says:

I think the outcome is basically that:

If you fix the problem space (i.e. only draw from a set of problems that meet a specific set of constraints)

and

limit yourself to comparisons with one “classic” flavor of a non-quantum algorithm (classic simulated annealing)

then

D-Wave’s computer wins by a humungous amount.

One challenge to date has been to show a clear advantage of the D-Wave hardware. Here they’ve done it by artificially limiting the problem space and the possible competing solutions. Whether their “victory” means anything (I personally think nothing at all) is up to interpretation.

10. Gelectrode Says:

Rhenium, please read Scott’s bolded text carefully. The D-Wave has not shown an asymptotic improvement (yet) over the best classical algorithm, even for an artificially-constructed problem designed to best highlight such a difference.

11. Daniel Freeman Says:

A lot of computer doctors bought a special machine which fixes number problems (and maybe some day real problems). They still have not fixed any number problems faster than normal machines can, but they’re still trying. As far as anyone can tell, even if we had a bigger special machine, it still doesn’t seem like it would be any better, which is a problem that not many news people understand very well. A big *very* special machine *could* fix number problems faster than normal machines (and faster than special machines), but they’re hard to build, so for some reason some people have thrown money at special machines.

(courtesy http://xkcd.com/simplewriter/)

12. Scott Says:

Rhenium #8: LOL, I should’ve expected that, after I spent all day writing a post spelling out exactly what I thought, someone would write in to ask what I thought! 🙂

Look, it’s not a matter of “skeptical” versus “won over”; you really have to separate it out into the individual claims. I’ve always been willing to credit D-Wave when the evidence was presented that they did something, and I’ve always gotten annoyed (and tried to correct the record) when people wildly misunderstood or exaggerated where they are—which has happened, like clockwork, pretty much every single time they or Google/NASA or USC/Lockheed have announced anything. Both of these traditions continue today.

The broader context here is that D-Wave has been doggedly following what the vast majority in the field, from the beginning, has considered a “dirty” approach to QC. The “clean” approach—the one that’s ultimately better—is to build high-quality qubits, incorporate fault-tolerance, and make it clear to everyone that you’re seeing a real quantum speedup, by targeting one of the problems where we know theoretically that a speedup should exist. There are groups all over the world working toward this, and some of them might even be close to succeeding. The D-Wave approach, by contrast, is to throw together lots of qubits that are of much lower quality, use them for optimization problems where we have no idea whether a quantum speedup should even exist or not, and then just see what happens, interpreting any ambiguous results in the most optimistic light imaginable.

Now, of course it’s possible that the “dirty / anti-theoretical” approach will get lucky, and beat the “clean” approach to showing a genuine asymptotic quantum speedup over the best known classical algorithms (even if only by a year or two)—particularly if the “dirty” approach has more funding and manpower. The history of technology provides plenty of precedents for that. But it hasn’t happened yet, might not even be close to happening. The race continues. That was the point of this post.

13. Scott Says:

Gelectrode #10 and Daniel #11: LOL, thanks, comments crossed!

14. Joshua Zelinsky Says:

> In particular, the group of John Martinis at Google (Martinis is one of the coauthors of the Google paper) now has superconducting qubits with orders of magnitude better coherence times than D-Wave’s qubits, and has demonstrated rudimentary quantum error-correction on 9 of them.

If one wanted to actually implement Shor’s for a reasonably small number but that would still pretty small, say 3 or 4 digits, how many qubits would one actually need including error correction?

Also, how much room is there to actually improve the theoretical end of the quantum error correction, in terms of reducing how many qubits one actually need, and is there any possibility that for specific things we care about like Shore’s algorithm we may be able to use less error correction than we would need for an actually universal implementation?

15. Scott Says:

Joshua #14: In general, if you want to do fault-tolerance at all and have it be a net win, you’ll probably need at least tens of physical qubits for every logical one. And as you need to cope with higher decoherence rates (say, 1% per qubit per gate time rather than 0.01%), the number of physical qubits will go up pretty dramatically. (But people continue to design better fault-tolerance schemes, and it’s possible that they’re still several orders of magnitude away from the theoretical limits.)

On the other hand, if you had (say) 40 qubits that supported several thousand nearest-neighbor operations before decoherence—as some of the groups say they’re aiming for in the near future—it seems plausible that you could just run Shor’s algorithm “bare” (with no error-correction) to factor numbers with at least a few digits. But I don’t know the actual estimates of number of gates, etc. so I can’t say for sure—would anyone who’s looked into this like to enlighten us?

I’m not aware of any fault-tolerance method that’s “specialized” to Shor’s algorithm. On the other hand, as shown by Cleve and Watrous in 2000, Shor’s algorithm does have the important property that the “quantum” part of it can be implemented using only log-depth circuits. So, that fact can reduce the fault-tolerance overhead to below what one would have thought naïvely.

16. noobermin Says:

Hi, thoughts on this? It’s in Nature: The undecidability of the spectral gap problem

http://www.nature.com/nature/journal/v528/n7581/full/nature16059.html

17. Jordan Says:

re ‘fault-tolerance method that’s “specialized” to Shor’s algorithm.’

http://arxiv.org/abs/0909.2188

18. Scott Says:

noobermin #16: That paper is a year or two old by now, but it’s extremely nice! To clarify, it doesn’t show an actual finite physical system whose behavior is undecidable—rather, what it shows (and what’s a technical tour de force) is that you can encode the halting problem into the problem of deciding whether a translationally-invariant local Hamiltonian is gapped or gapless, as the number of particles is increased to infinity. Perhaps the most interesting open problem arising from this work (which I understand is still open), is whether the classical analogue of the problem is also undecidable. Almost all of Cubitt et al.’s construction is classical, but there’s one part that uses Kitaev’s phase estimation.

19. John Smolin Says:

How can you say that “this completely nails down the case for computationally-relevant collective quantum tunneling in the D-Wave machine, at least within the 8-qubit clusters”? Although “computationally relevant” may be ill-defined, if it means anything it should mean a speed-up compared to either the best possible classical algorithm. Comparing only to classical simulated annealing is a straw-man created by the confusion that quantum annealing is sort of like simulated annealing.

Also, I would say that it shows that some sort of collective behavior is going on within the 8-“qubit” clusters, but there is no reason it need be quantum.

20. Trevor Says:

Heya, Scott.

Thanks for the great article. Small correction that I wanted to toss out there: Number partitioning as you describe it in point 2 of your Google Paper results (partitioning a set of numbers into two subsets where the difference between their sums is minimized) is most certainly NP-Hard. An NP-Complete version of the problem would be partitioning the set into two sets “having sums with a difference less than or equal to X”, or “having the same sums” (the special case of X=0). In the problem as you originally defined it, verifying any given answer requires entirely recomputing it, which puts it entirely in the realm of NP-Hard.

21. Scott Says:

Trevor #20: Oh, c’mon. Do you think I didn’t pause while writing that, ponder whether to say “NP-hard” instead of “NP-complete,” and then decide that anyone who knew enough to care about the difference would also know that, to get the NP-complete version, you need to convert the optimization problem into a decision problem in the usual way?

God dammit. I’ll change it. 🙂

22. Scott Says:

John #19: If you and Graeme and Umesh and Seung-Woo (or whoever) can come up with another classical model that explains what’s going on, I’ll be extremely interested and impressed! But the facts that QMC and the DW2X track each other so closely in the new paper’s results seemed to me like pretty strong evidence that the physics is well-described by QMC, which would mean tunneling was present. Also, if tunneling wasn’t present, then I find it hard to understand why there would be such a big advantage, not just for any instances, but specifically for the Farhi-Goldstone-Gutmann-like ones with the long thin spikes. Can you explain this?

On a different note … I’ve been getting some nasty comments (which I haven’t let through the moderation filter) about how I’m a fanatical D-Wave hater, “flinging feces” at their historic achievement for no reason. How ironic that, to many of my QC colleagues, I look instead like someone who bends over backwards to give D-Wave the benefit of the doubt whenever possible!

23. Trevor Says:

Hah! That’s fair. I almost didn’t bother mentioning it, but then I considered that, back in the day when I was learning the material in the first place, series of “people who care will already know” tended to contribute to misunderstandings of people who were on the verge of groking something. The number of times I’ve seen the pure Traveling Salesman optimization problem referred to as a classic NP-Complete problem, Shirley, must contribute to enthusiasts’ confusion.

24. Rahul Says:

“the D-Wave 2X reaches the global optimum about 10^8 times faster than Quantum Monte Carlo running on a single-core classical computer.”

Why is this a good comparison? It’s like building a new, custom-modified F1 racing car & bench-marking it against an Amish buggy.

Couldn’t D-Wave have chosen something even slower to benchmark against? I’m sure the exponent would look even better.

25. Shehab Says:

Since Selby’s algorithm, I don’t see why would anyone publish a paper on D-Wave’s performance without comparing it against Selby’s algorithm!

26. Henning Dekant Says:

Very happy to see that my best case scenario seems to pan out, i.e. rather than hampering the QC development D-Wave seems to draw in more attention and money to the field. And just in case the “dirty” approach to QC gets lucky, the “clean” folks hurry to get their act together.

INHO Google (Neven) deserves some credit in bringing Martinis group and D-Wave together, as the former will certainly benefit from some of D-Wave’s engineering (e.g. fridge, shielding, lithography etc.).

27. Joscha Says:

Great Scott, I am probably going to embarrass myself here, but could it be that we will discover that quantum computation is not intrinsically more powerful than classical computation, but that superpositional features are simply underdefined with respect to spacetime? Here is a sketch of the argument: http://palmstroem.blogspot.ca/2015/11/rethinking-quantum-mechanics-and.html

In that case, it will still be as expensive as ever to make a classical model of the spacetime properties of a quantum system, but we will never get more than classical computational power out of a quantum system, because it intrinsically does not do more than that.

28. John Smolin Says:

Scott #22: The classical model is that the clusters tend to rotate together–exactly like in the cluster algorithms that outperform the D-Wave machine.

I admit I haven’t studied the Farhi-Goldstone-Guttmann paper carefully, but it is clear that simulated annealing will be bad when there are clusters that stick together. The algorithm tries to flip one bit of a cluster and naturally its neighbors outvote it and flip it back. Only if the whole cluster flips together do they have a chance to change–but classical correlation of flips is sufficient.

I see the fact that cluster-updating classical algorithms are fast as decent evidence that there exists a classical theory that explains the speedup. In one sense the classical algorithm themselves are such theories.

29. Helmut Says:

My main issue is that the poor performance of simulated annealing (SA) vs DW2 is by design. The instances are built from strongly coupled clusters with fields. If you were to exploit the intrinsic symmetry of the clusters by performing “tailored” cluster updates, SA would do better. Also, is is well known that any Monte Carlo method that has glassiness and fields performs very poorly. Even the state of the art parallel tempering Monte Carlo used in the field of spin glasses. While the scaling might suggest an advantage for DW2X, if I may be devils advocate, they are just seeing a poor performance of SA, especially because the scaling towards QMC is constant. So what did we learn? That SA sucks. Nothing else.

Also: FWIW, it is the HFS (Hamze-de Freitas-Selby) algorithm…

[…] 及其量子計算機持批判態度的 MIT 理論計算機科學傢 Scott Aaronson 發表文章回應瞭最新的報道。Google 的研究人員在最新論文中稱，D-Wave […]

31. Jennifer Says:

Trevor #20 and Scott #21: I’ve learnt something from this exchange – possibly something I ought to have already known, but forgotten. So thank you!

32. Tommy Says:

Scott, thank you very much for the detailed-but-still-condensed explanation, I was waiting for this post and it arrived really quick! Basically, it seems to boil down to the D-Wave machine being very efficient at simulating a D-Wave machine 🙂 But at least a confirmation that the machine is an interesting piece of engineering nonetheless.

I also wanted to ask you something else a bit OT, feel free to remove this second part of my comment if you think it might lead to unwanted discussion. What is your opinion about what happened during and after the talk by Maryam Namazie at the Goldsmith University? In particular, what do you think about the reactions of Goldsmith’s Feminist and LGBQT+ societies? I’m asking you because I think that you have been quite involved in these kind of mechanics lately, and you might have a better viewpoint.

http://www.independent.co.uk/student/news/muslim-students-from-goldsmiths-university-s-islamic-society-heckle-and-aggressively-interrupt-a6760306.html

33. Google、D-Wave、及10^8倍加速案例 _ HPJ's Personal Website Says:

[…] 及其量子计算机持批判态度的 MIT 理论计算机科学家 Scott Aaronson 发表文章回应了最新的报道。Google 的研究人员在最新论文中称，D-Wave […]

34. Rahul Says:

Shehab #25:

To make the thing compared (DWave) look good, of course!

35. Rahul Says:

Henning Dekant #26:

“Very happy to see that my best case scenario seems to pan out, i.e. rather than hampering the QC development D-Wave seems to draw in more attention and money to the field.”

Only for so long. Depends on how long DWave can ride the Hype Wave. Even cold fusion got attention & money for half a decade.

At some point, people will start asking the pesky questions: e.g. What problem did that 15 million dollar box help us solve exactly?

I’d rather have QC plod along with realistic goals & truthful achievements than have a wave of funding & attention predicated on hype & exaggeration.

When you go around touting 10^8 speedups, be sure that the disappointment is necessarily coming up ahead at some point.

36. Scott Says:

Jordan #17: Thanks!

37. Scott Says:

Joscha #27:

I am probably going to embarrass myself here, but could it be that we will discover that quantum computation is not intrinsically more powerful than classical computation, but that superpositional features are simply underdefined with respect to spacetime?

I don’t really understand what that means, but I can tell you that, if the attempt to build QCs discovered anything along those lines, it would be a scientific revolution (the overthrow of conventional QM), and a thousand times more exciting than a mere “success” in building QCs that behaved as theory predicted.

38. Peter Love Says:

Dear Scott,

This is a very nice post! Thanks for taking the time to summarize so clearly and concisely. I’m looking forward to Graham and Umesh’s next paper reproducing all these effects with a nice classical spin model :-).

Cheers,

Peterr

39. jonas Says:

Scott: I hope you’ll write in more detail about the results of those three other research groups you mentioned.

Rahul: your analogy with cold fusion is great, because research eventually leading to practical nuclear fusion reactors gets very little funding these days, instead everyone wants more efficient fission reactors, like those salt thingy ones, because they seem to promise a shorter term gain. Scott has explained a few times that when such a backlash happens with DWave, then the other quantum computing groups will get a bad reputation and no funding too.

40. Bram Cohen Says:

Much congratulations to the D-Wave folks for finally having compelling evidence that their machine is exhibiting quantum phenomena. (I mean this without sarcasm, but with much snark.)

41. Scott Says:

jonas #39: Actually, I don’t really agree with that analogy. With the salt-thingy fission reactors, we’re pretty damn sure they would work on a relatively short timescale, whereas with fusion reactors, no one really has any idea how long they’ll take to become practical. And then there’s the small matter that the future of civilization might hang on having cheap, widespread carbon-free energy ASAP.

With D-Wave, by contrast, the problem since the beginning has been that even if you build it, we didn’t know and still don’t know whether there’s any genuine quantum-mechanical speedup to be had by their approach (stoquastic quantum annealing at finite temperature). If your goal is to prove to skeptics that a quantum speedup is possible at all, then it’s the “clean” approach (nuclear fusion in your analogy), rather than the D-Wave approach, that provides the safer option.

42. jjirwin Says:

Thanks for the article Scott. I have to wonder whether in the end the factor that will decide if D-wave is revolutionary or a flop is the market. If some company finds a way to use a quantum annealer to churn through some practical problem waaaay faster than anyone else in the world can do it, and if the company is able to monetize their results (QAaaS: Quantum Annealing as a Service…heh), then we will know for sure that the computer is doing something special. If it weren’t special some other company would just use a classical computer to undercut their prices.

I understand QAaaS is by no means around the corner, but I think to D-Wave, a business, the market in the end is the most important for their success.

43. Carl Shulman Says:

This post hits most of the elements of hype here clearly, but it seems to be a bit shy about drawing out the economics.

One can get big constant factors from specialized hardware for many problems, but mostly we don’t use super-specialized hardware because of costs of design and other tradeoffs. So we can ask how much of a constant factor we could get over a single Xeon core, by using the D-Wave budget.

D-Wave has raised over $138 million in US dollars from investors, plus its$20 million from Google and Lockheed Martin.

http://venturebeat.com/2015/01/29/d-wave-keeps-dreaming-of-quantum-computers-everywhere-takes-23-1m-more/

A development budget of $100,000,000 would be plenty to develop conventional ASICs specialized for a given problem like Quantum Monte Carlo, and thereafter one’s bang per buck would be vastly higher. For bitcoin mining ASICs are 300,000-1,000,000 times faster than CPUs. Second, the per system costs are much higher for D-Wave. The D-Wave device sells for$10,000,000, and was compared with one core of a six-core Intel system that sells for $600: http://www.newegg.com/Product/Product.aspx?Item=N82E16819117499 That’s a factor of 100,000 right there. And as you discuss in the post, the best classical algorithms can beat the D-Wave device on a single PC on every problem it has been tested on, despite those problems being rigged to advantage the D-Wave system (with huge additional overhead for translating real-world problems to the device). That means the D-Wave device is at least millions of times more expensive for the same computational performance on real world problems than a conventional ASIC approach (with some adjustment for power consumption, I don’t know what the D-Wave device’s running costs are). And since the D-Wave device shows no asymptotic speedup, there’s little reason from the current data to expect that price-performance gap to be crossed without major changes. In the past you’ve joked that Google could pay you$10,000,000 to run Troyer’s algorithm on a laptop instead of a D-Wave One. The example would be even more ludicrous after adding all of the orders of magnitude attainable through spending D-Wave’s $138MM+ funding on developing relevant ASICs and testing on a real-world problem. I realize that in principle all this can be inferred from the paper, a bit of background knowledge, and your post. But one of the lessons of the D-Wave saga seems to be that you really need to spell it out for journalists, and the fact that D-Wave is so many orders of magnitude more expensive than systems that outperform it is in the native tongue of business reporters (moreso than the computer science analysis). 44. Scott Says: Carl #43: Thanks so much for the economic angle! 45. Lisa Adam Says: Hey Scott – what would be real interesting to know is why a lot of really smart, well informed people/organizations support D-Wave’s approach. Perhaps they know something others don’t know. Los Alamos National Labs Google NASA Lockheed Martin USC US Government Canadian Government Goldman Sachs Jeff Bezos 46. Rahul Says: Carl Shulman: Bravo! This is exactly the sort of hard number based hype debunking we need. I hope we can disabuse at least of of the many journalists lapping away the 10^8 speedup metric fed them by D-Wave’s hype machine. 47. Rahul Says: @Lisa Adam asking “why a lot of really smart, well informed people/organizations support D-Wave’s approach.” Good point. Always puzzled me. Some thoughts: (a) Principal Agent Problem: The guys taking these decisions are doing what’s best for themselves which often is NOT what’s best for the organizations / firms they represent (b) Some variant of Pascals Wager: i.e. They know DWave is probably bogus but hell what about that small chance that DWave’s for real? That would be a low probability event but with a huge payoff. So why risk it. What Google etc. is paying to buy in in this lottery is peanuts anyways. (c) One might know a thing is a dud but also know that the public perception of that thing is not that it is a dud. So it makes sense to play along and grab whatever publicity benefits accrue. (d) There’s a sucker born every minute. 48. Scott Says: Tommy #32: I also wanted to ask you something else a bit OT, feel free to remove this second part of my comment if you think it might lead to unwanted discussion. What is your opinion about what happened during and after the talk by Maryam Namazie at the Goldsmith University? In particular, what do you think about the reactions of Goldsmith’s Feminist and LGBQT+ societies? I’m asking you because I think that you have been quite involved in these kind of mechanics lately, and you might have a better viewpoint. That’s actually completely off-topic, but what the hell, I’ll bite. I personally think that Maryam Namazie is a human rights (and specifically, women’s rights) heroine. But regardless of whether you agree or disagree with any particular thing she says, it seems obvious that she’s within the norms of academic discourse and should be welcome to speak at universities. That she would be shouted down, barred from speaking at Warwick University, and issued death threats, and that “liberal” people would defend this state of affairs, is a moral outrage—though alas, not a surprising one, to anyone who’s been paying attention to anything since around the time of the fatwa against Rushdie, which much of what we’d now call the “Social Justice Warrior left” also made excuses for. Furthermore, that student feminist and LGBTQ organizations (!) would choose to side with the groups who want women to be stoned for adultery, covered in burkas, married off by their parents at age 15, etc., and gays to be executed, rather than with a fellow feminist who’s risked her life to criticize those practices, is an irony that would be considered too over-the-top for a satirical novel, were it not the sort of thing that now regularly happens in our world. I should add that I find the tactic of disrupting a talk at a university, by shouting down the speaker, scuffling with the audience members, etc., always to be disgraceful and a discredit to anyone who does it. Even if, hypothetically, some MIT group invited (say) a neo-Nazi to speak, I would try to find some better way to express my opposition than by disrupting the talk (hosting a parallel talk, and making it more popular than the original, is better). And Namazie is about as far as you can get from a neo-Nazi. Indeed, regarding the charge that she’s an “Islamophobe,” maybe the best response is: there are thousands of former Jews, or Jewish atheists, who speak at least as harshly against the tenets of Judaism as Namazie does against those of Islam. And they get invited to speak at synagogues and Hillel centers. They certainly don’t get death threats. 49. Marnie Dunsmore Says: @Scott “The broader context here is that D-Wave has been doggedly following what the vast majority in the field, from the beginning, has considered a “dirty” approach to QC. The “clean” approach—the one that’s ultimately better—is to build high-quality qubits, incorporate fault-tolerance, and make it clear to everyone that you’re seeing a real quantum speedup, by targeting one of the problems where we know theoretically that a speedup should exist. There are groups all over the world working toward this, and some of them might even be close to succeeding. The D-Wave approach, by contrast, is to throw together lots of qubits that are of much lower quality, use them for optimization problems where we have no idea whether a quantum speedup should even exist or not, and then just see what happens, interpreting any ambiguous results in the most optimistic light imaginable.” I looked at D-Waves website and in particular this document: This D-Wave document describes how they build their qubits: http://www.dwavesys.com/tutorials/background-reading-series/introduction-d-wave-quantum-hardware They say this: “The techniques learned from the semiconductor industry have resulted in the construction of a Large-Scale Integration (LSI) fabrication capability owned by D-Wave. This fabrication capability is unique. Shown in Figure 5 is a cross section of one of the processors fabricated at D-Wave’s superconducting electronics foundry. The fabrication process that has been developed is able to yield LSI (128,000+ Josephson junctions for the 1000 qubit processor in the D-Wave 2X) superconducting circuits. It is the only superconducting fabrication facility capable of yielding superconducting processors of this complexity. Fabrication yield is critical to improving processor performance and requires on-going significant investment, and in order to maintain historical qubit doubling rates, investments are being made to improve the capability into VLSI (1,000,000+ Josephson junction per processor) territory over the next five years.” So it appears that the quality of qubits is a process design and fabrication challenge. Are you saying that we need to focus more on the qubits process design and fabrication problem? Or are you saying something else? 50. Scott Says: Lisa #45: Let me answer your question with another question. Why would anyone think it was a good idea to respond to a scientific post about experiments on the D-Wave 2X, not with evidence or arguments but with a name-dropping list of organizations and companies who support D-Wave? At best, that’s weirdly off-topic; at worst, it’s an attempt to circumvent normal scientific questioning by a foot-stomping appeal to money and power. In some sense, my rage at the possibility of the latter is the main reason why I’ve been motivated to blog about this topic for the past decade. But to take your question seriously … first of all, given what I wrote in this very post, it’s not obvious that the D-Wave approach shouldn’t be supported! Sure, it’s a “dirty” form of QC, one with no strong theoretical reason for expecting a speedup, but maybe it’s worth a try, and at least we could learn some interesting physics and engineering from it (arguably we already have). Personally, I’m more excited about the “clean” approaches to QC, which of course Google and the US and Canadian governments are also very much invested in, but I don’t really begrudge them for making a “side bet” on the dirty approach—as long as everyone communicates honestly about what that approach is and what it has and hasn’t accomplished. Secondly, I think a large part of the answer to your question is just the size of the entities you listed (with the sole exception of Jeff Bezos 🙂 ). I like the way Greg Kuperberg put it to me recently: if some professor at UC Davis supports creationism, that doesn’t mean “UC Davis supports creationism,” as much as many journalists would yearn to put it that way. It just means that UC Davis is heterogeneous and has many factions. This is particularly obvious in the case of the US Government (!). As far as I know, the US Government has no official position on the viability of D-Wave-style quantum annealing, in the sense that it could be said to have an official position on Cuba. It’s true that In-Q-Tel, a CIA-connected venture capital firm, made an investment in D-Wave. But then again, other parts of the US Government (the NSF, NSA, DoD, DoE…) have made much larger investments in “clean” QC approaches, so what are we to conclude? It’s almost like we can’t adjudicate these questions based on whose side has the more powerful authorities, and have to look at the evidence… 51. Marnie Dunsmore Says: @Carl Shulman #43 “A development budget of$100,000,000 would be plenty to develop conventional ASICs specialized for a given problem like Quantum Monte Carlo, and thereafter one’s bang per buck would be vastly higher. For bitcoin mining ASICs are 300,000-1,000,000 times faster than CPUs.”

Thank you, Carl, for this much needed post.

52. Henning Dekant Says:

Carl #43: Apples and oranges. If you compare the R&D costs of D-Wave you need to tally this up to all the R&D cost that went into conventional semiconductor tech.

53. Scott Says:

Marnie #49:

Are you saying that we need to focus more on the qubits process design and fabrication problem? Or are you saying something else?

There are experimental groups all over the world who are focusing on getting better control over qubits, especially when those qubits are integrated into larger systems. If you like, I’m saying that I’m more excited about that approach than I am about the D-Wave one, which up till now has focused much more heavily on the number of qubits than on their quality. But I’m also saying that, if D-Wave wants the best possible shot at showing a genuine speedup with their “dirty” approach, they’ll need to focus more on quality as well (and I think they know that, and the new Google paper supports that assessment).

54. Rahul Says:

Scott #50:

I disagree.

I did not think Lisa’s question was “weirdly off topic” nor an “attempt to circumvent normal scientific questioning by a foot-stomping appeal to money and power”.

I think it is a perfectly reasonable question. e.g. Take Deolalikar’s proof P=NP. One way for someone confronted with that claim is to actually go deep into the evidence or arguments in his paper. Utilize the “normal scientific questioning” needing time, effort & relevant training.

But another dirty, yet often utilized short cut is for me to say Hmmm obviously (say) Scott Aaronson, Gil Kalai, Greg Kuperberg, Umesh Vazirani all think it is flawed so there must be something wrong in it.

This has nothing to do with foot stomping appeals to money etc. but simply a way to trust credible sources when one doesn’t have the inclination or skill to go through an argument. No doubt, the only fundamental way of testing an argument / proof is to go through the logical steps painstakingly but very often the best one can do is to decide based on what one considers as credible sources.

Ergo, no doubt I think you are right about D-Wave (& Google / Lockheed / NASA etc. are wrong) but yet I don’t think Lisa’s question was fundamentally absurd or deserving of outrage.

It seems perfectly reasonable to wonder why all these organizations / firms did act the way they did.

Rahul: Comment #47

For any single D-Wave supporter, your comments have merit. Your comments don’t explain why so many smart people/organizations support D-Wave. These people/organizations would each have done their own diligence —> they all came to the same conclusion!

It seems much more likely that D-Wave’s supporters know something the rest of us don’t know (Occam’s Razor principle).

56. Rahul Says:

“But it’s now looking like only a matter of years until Gil Kalai, and the other quantum computing skeptics, will be forced to admit they were wrong”

Scott, are there any specific, recent breakthroughs / scaleups / experiments that make you say this?

From the strategies of Boson Sampling as well as conventional qubit systems we still seem quite a way from coming anywhere near the point where a quantum computing system might beat conventional hardware at any well defined task, no matter how artificial or practically useless?

Isn’t that the milestone to force Gil to eat his words? Or what other milestone do you have in mind?

57. Marnie Dunsmore Says:

@Scott

Regarding your original post you state this:

“Roughly speaking, if we rule out radical alterations to D-Wave’s approach, there are four main things one would want to try, to see if they helped:”

And then you state four things that would help:

1.Lower temperatures
2.Better calibration of the qubits and couplings
3.The ability to apply “non-stoquastic” Hamiltonians.
4.Better connectivity among the qubits.

Lower temps is a no brainer.

2 and 4 are interesting: they both will probably be, in whole or in part, achieved at the physical level, through physical design and fabrication improvements. I’m not sure about 3, but 1, 2 and 4 put the ball in the court of physical design and fabrication, as well as advances in cooling.

The ETH paper that you posted (Zintchenko, Brown, Troyer) mentions the additional challenge of “embedding of problems into the native hardware architecture with limited connectivity” which in their opinion is “expected to significantly slow down the annealing dynamics. More complex optimization problems may incur even higher mapping and embedding penalties.”

In your initial discussion, you touched on embedding, but I would be interested in your further thoughts on embedding and its penalties.

58. Marnie Dunsmore Says:

@Scott #53

“they’ll need to focus more on quality as well (and I think they know that, and the new Google paper supports that assessment).”

Just had a look at the paper.

http://arxiv.org/abs/1512.02206

Section C and the remarks on scaling are interesting. I’m still contemplating them.

59. Mike Says:

Henning@52

“If you compare the R&D costs of D-Wave you need to tally this up to all the R&D cost that went into conventional semiconductor tech.”

Not all of the costs apparently: As D-Wave itself says: “The techniques learned from the semiconductor industry have resulted in the construction of a Large-Scale Integration (LSI) fabrication capability owned by D-Wave.”

😉

60. fred Says:

#45 Lisa

“well informed people/organizations support D-Wave’s approach […] Google”

Google investing 10M$or so in D-Wave, that’s literally a *drop* in the bucket for them. They have very little to lose compared to what they can gain, even if the tech is unlikely to succeed. And to keep things in perspective: Google also invested 542M$ in “Magic Leap”, a startup working on a very secretive/revolutionary augmented reality hardware.
Just like for D-Wave, a lot of people are skeptical about Magic Leap’s claims (although the “results” should be directly tangible…). Magic Leap also just secured an extra 872M$of investment. 61. Nick Read Says: Re Scott #50: Contra Rahul, I think you have the response to Lisa #45 just right. On another note, I had no idea Greg K was a creationist! 62. Jeremy Stanson Says: I think Scott’s distinction between “clean” and “dirty” QC is smart but his bias towards “clean” QC is not. D-Wave identified something early on: it doesn’t matter how “clean” the parts are, when you scale a system up the system gets “dirty.” This is due to a lot of factors: environment, physical discrepancies between devices, material discrepancies between devices, areal density, cross talk, control circuitry, magnetic field gradients, temperature gradients, etc. D-Wave is learning how to work with a “dirty” system because that is the only kind of system (at least, of the superconducting variety) that can be built at scale. What a waste to plug years +$s into perfecting a handful of qubits that are not at all representative of the “same” qubits within a larger system!

63. fred Says:

Jeremy #62
“it doesn’t matter how “clean” the parts are, when you scale a system up the system gets “dirty.””

I disagree.
Classical computers are an example of this. A digital computer made of one billion transistors is just as clean (as a logical circuit) as a digital computer made from one dozen transistors. The elementary parts are designed this way, bottom up.
Even when considering analog electronics (like radios or HIFI equipment, where transistors function in the linear regime), building blocks can be assembled in a way that keeps the overall design “clean” regardless of its size: by matching input and output impedances, etc.

Now, maybe there’s something specific about QM that makes it impossible to apply the same principle to qbits, but that’s the whole point of trying to build a QC!

64. Henning Dekant Says:

Mike #59, fair enough, breaking out the part that they could leverage from established lithography processes makes sense, as long as you don’t make the mistake of comparing R&D dollars with the price tag for the finished product. (Reminds me of the fallacy in reporting about the “multi-billion dollar” blimp that got lost over Pennsylvania),

65. Mike Says:

Jeremy@62

“it doesn’t matter how “clean” the parts are, when you scale a system up the system gets “dirty.”

So, it’s better to start with inferior qubits in the first place?

Hmmmmm.

66. Henning Dekant Says:

#45 Lisa, I think much of the disconnect is that theoretical CS folks like Scott only care about proven quantum speed-up, whereas VC folks also care about any working alternative to silicon (given that Moore’s law no longer holds). So anything new that can be scaled up will attract investments, quantum speed-up is considered a bonus, but any practical speed-up will do. The fact that there is build-in QC hype for marketing purposes is just considered another goody.

For some reason lots of commenters here seem to think that pointy haired bosses will only see the hype and never read the fine print.

67. Jeremy Stanson Says:

fred 63

I think your last statement sums it up the best. You can’t project the robustness of semiconducting transistors onto superconducting qubits.

Part of my point is that D-Wave did not set out to build 1000 “dirty” qubits. They built 1000 qubits and, in doing so, the qubits necessarily became “dirty.” What is more, they became “dirty” in ways you might not even expect when you’re only looking at <10 "clean" ones.

68. Jeremy Stanson Says:

@Mike #64

See my #66.

Also, I challenge you to identify what it is about D-Wave’s qubit design that makes them “inferior.” You will see that any design element that has a negative impact on the qubit’s performance (in terms of noise, coherence, etc.) is included because it is necessary in order to enable that qubit to: i) be controlled, and/or ii) interact with other qubits.

A simple example: these qubits couple to signals (either control signals or to each other) via inductive coupling. A single qubit can be a nice tight loop with great performance metrics, but as you want it to couple to more and more things it needs to get bigger and bigger…

69. Henning Dekant Says:

@Jeremy Stanson, I think you underestimate the robustness that can be achieved with quantum error correction codes. Obviously in terms of scale up the proof is in the pudding, and hopefully we won’t have to wait too long until either Martinis/Google or IBM will unveil a chip at some scale.

BTW Daniel Lidar recently gave a Google Talk in which he summarized some of the software error correction schemes that he ran on a D-Wave machine, it seems that this indeed improves the scaling characteristics. (I summarized the talk at this link)

70. Greg Kuperberg Says:

You will see that any design element that has a negative impact on the qubit’s performance (in terms of noise, coherence, etc.) is included because it is necessary in order to enable that qubit to: i) be controlled, and/or ii) interact with other qubits.

If you have a violin that can’t hold a tune, you can’t logically justify it with the argument that you need an orchestra. This is exactly why quantum computation is hard.

71. Marnie Dunsmore Says:

@Henning Dekant

“I think much of the disconnect is that theoretical CS folks like Scott only care about proven quantum speed-up, whereas VC folks also care about any working alternative to silicon (given that Moore’s law no longer holds). ”

In my experience, Henning, many VCs do not in fact understand Moore’s Law. For example, for many years, many VCs continued to assert that Moore’s Law applied to solar and wind power. Obviously, 2 dimensional lithography scaling does not apply to wind or solar, but somehow, it took a long time for the investment community to catch on to that.

“So anything new that can be scaled up will attract investments, quantum speed-up is considered a bonus, but any practical speed-up will do. ”

Any practical speed-up will do?

The thing is that the smart phone has already been scaled enough to be versatile, powerful and have reasonable battery life. If it’s battery life that you want to go after for a cell phone, most of the power is consumed in the PA (power amplifier), which is implemented in high voltage CMOS, SiGe or GaN. Conventional Moore’s Law Scaling won’t help you here. So it can’t be the smart phone that is driving the need for speed and density scale ups, and improvements in power efficiency.

It is the super computing market, and the desire to lower server farm power consumption that is driving governments and VCs to fund finer line/higher integration/higher speed/ultra low power projects.

Unfortunately, on this basis, the US has given away, “commodified” or offshored most of its more conventional fine line CMOS processes, which in my opinion, will turn out badly.

“The fact that there is build-in QC hype for marketing purposes is just considered another goody. ”

If we had an infinity of research dollars, this would be the case. But we won’t. We’ve scuttled or underfunded more conventional, proven, high reliability approaches that allow high speed/low power/high integration design, in order to go after quantum computing. We are putting too many eggs in one basket.

“For some reason lots of commenters here seem to think that pointy haired bosses will only see the hype and never read the fine print.”

Henning, I like your paper. No, I don’t think it is all pointy haired bosses out there, but it is true that many VCs and NSF decision makers, and some engineers, seem not to be able to understand even basic principles like Moore’s Law.

Or maybe they don’t care, and the hype is all, and that has become the normative way of operating in Silicon Valley.

And I would also remind you that Google is an advertising company (that’s where most of its revenue comes from, ultimately), and doesn’t not have to earn its income on manufacturing. So it is quite shielded from the reality of making an actual manufactured product (unlike Intel, Motorola, Texas Instruments, Analog Devices, Infineon, Analog Devices, Raytheon, CREE, TSMC and other companies with their feet on the chip manufacturing ground.)

72. Scott Says:

Lisa #55:

It seems much more likely that D-Wave’s supporters know something the rest of us don’t know (Occam’s Razor principle).

My own understanding of Occam’s Razor includes the rule: “never attribute to James-Bond-like secrets that which is adequately explained by Dilbert-like herd behavior.” 🙂

I’m sure part of this, is that I’ve personally spoken to people from at least half a dozen companies and government agencies that were seriously considering buying a D-Wave machine, or investing in D-Wave. In these cases, it quickly became clear that, far from being privy to novel information about D-Wave that would blow the open research community’s collective mind, most potential buyers didn’t even understand the most basic points about QC, like the difference between NP-complete problems and factoring. In some cases, they were only a step or two beyond the level, “QC is new and shiny and tries all the answers in different parallel universes and will show everyone how cool our organization is.”

73. Greg Kuperberg Says:

QC is new and shiny and tries all the answers in different parallel universes and will show everyone how cool our organization is.

Indeed, at the pop science level, that has a lot of truth to it. Thus justifying the interpretation of the D-Wave device as a corporate vanity product.

74. Scott Says:

Jeremy Stanson:

D-Wave is learning how to work with a “dirty” system because that is the only kind of system (at least, of the superconducting variety) that can be built at scale.

Your comments on this thread are the first thing that ever caused me to reflect that D-Wave’s biggest problem might be insufficient technological optimism! 🙂 Encouraged by my experimentalist friends, I see absolutely no reason to assume, from the outset, that one can’t build a clean system at scale. In particular, as you’re probably aware, in the near future the Martinis group is planning to integrate ~40 superconducting qubits, which are orders-of-magnitude “cleaner” than D-Wave’s, with arbitrary time-varying controllable couplings. If they do that, it should already be enough to demonstrate a clear quantum/classical separation for something, and will directly contradict your statement.

But it’s not only that it seems possible: it’s that, as far as theorists know today, building a “clean” system is necessary to achieve a genuine quantum speedup. Or rather: if you have a “clean” system at scale—as Martinis, Schoelkopf, Monroe, Wineland, and others are working toward—then we’re almost certain you’ll see a true quantum speedup (“almost” only because we can’t prove stuff like BPP≠BQP). With a “dirty” system, by contrast (e.g., stoquastic annealing at a fixed nonzero temperature), about the best we can say is that we can’t prove you won’t see a true asymptotic speedup. But so far, there’s no compelling reason from either theory or experiment (including from the new Google results) to expect such a speedup to be there.

75. Jeremy Stanson Says:

Henning @ 69

I’m talking about qubit performance, not QC performance. The assertion is that D-Wave’s qubits are “dirty” in terms of their individual qubit properties.

Greg #70,

Analogy never works. If the “violin” is a qubit, the problem here is that the very design, structure, size, shape, geometry, … every aspect of the qubit, changes when you go from a single device to a large system (orchestra?) of interconnected devices all being controlled by a finite (and low number, significantly less than the number of qubits) control lines that you can fit into a cryogenic environment. If you’re interested in the sound of the orchestra, it doesn’t tell you much to have one violin that holds a tune if that violin needs to completely change in order to fit into the orchestra.

76. Jeremy Stanson Says:

Scott #74

I don’t have the benefit of expert “experimentalist friends” at hand (an invocation that nearly contradicts your “name-dropping” criticism of Lisa @ #50), but I think what most bothers me about your “clean” >> “dirty” argument is that you seem to think that D-Wave WANTS dirty qubits. Of course, D-Wave wants and strives for the best possible qubits they can work with. They are (maybe?) the best-funded SC QC effort to date (before Martinis joined Google, of course). They have the most sophisticated SC fab capability known to the public. Your experimentalist friends are probably very smart, but nobody other than D-Wave has built SC devices as sophisticated as D-Wave’s. It is just possible that D-Wave knows a bit more about it than your friends do (and remember, D-Wave WANTS the cleanest qubits they can manage).

D-Wave’s strategy hasn’t been to design their qubits to be dirty and just see how many they can build; rather, the strategy is to build large systems of the best possible qubits (best at scale, that is) and then see what they can learn in order to improve it for the next generation. The Martinis effort will only get to 40 qubits by doing the same thing. They’ll build X qubits, they’ll be a bit messy, they’ll clean them up, then they’ll scale that to Y qubits, they’ll be messy again, so they’ll clean them up, and so on.

77. Greg Kuperberg Says:

D-Wave’s strategy hasn’t been to design their qubits to be dirty and just see how many they can build

Actually, that has exactly been their strategy. Making good qubits is hard; they never bothered.

78. Job Says:

D-Wave’s strategy hasn’t been to design their qubits to be dirty and just see how many they can build; rather, the strategy is to build large systems of the best possible qubits (best at scale, that is) and then see what they can learn in order to improve it for the next generation.

OK, so they’re building quasi-quantum, nearly-quantum, maybe-quantum computers and selling them for 10 million. Nothing wrong with that.

79. Scott Says:

Jeremy: Of course all else being equal, D-Wave would take higher-quality qubits over lower-quality ones. That I think they wouldn’t is an absurd strawman that doesn’t advance the discussion. But they settled for lower-quality qubits, while the rest of the field refused to settle. And the main question at issue here—a question that your comments haven’t even touched on yet—is whether, having settled for the dirtier qubits, it’s then possible even in principle to achieve a true, asymptotic quantum speedup over the best known classical algorithms. I don’t know whether it is or isn’t—and neither, I think, does anyone else. What’s your evidence that it’s possible?

Incidentally, I’d better explain once and for all (also for Rahul) why I’m completely unmoved by what Google and Lockheed and Goldman Sachs do as companies, but extremely moved by what Troyer and Martinis and Monroe believe as individuals, and don’t see the slightest inconsistency there. I love the way Carl Sagan once put it: “In science, there are no authorities. At most, there are experts.” Matthias Troyer is an expert. John Martinis is an expert. Chris Monroe is an expert. When they speak, I listen to them—not because of how much money or how many people they manage or how many awards they’ve won, but because I can interact with them directly and see for myself the agility of their reasoning and their responsiveness to evidence.

As a general rule, I’ll weight the opinion of a single individual whose judgment I admire over than the collective decision of an organization of thousands—even if the organization is itself full of individuals whose judgments I admire. I’ve disagreed vehemently with Leonid Levin about whether scalable QC is possible at all, but one thing Leonid said that’s always stuck with me is that “the whole is stupider than the sum of its parts”—so that even in (say) a funding panel composed of brilliant scientists, the tiniest grain of stupidity in each one can get exponentially amplified by the others. This matches my experience.

80. Henning Dekant Says:

Jeremy #76, I think looking at the qubit quality in isolation is a bit of a mute point, as you said, D-Wave will strive for the best that they can build. In the end surface codes to stabilize qubits are going to be the decisive difference between what Martinis and IBM are doing versus D-Wave.

81. Henning Dekant Says:

@Marnie Dunsmore #72, lots of food for thought. Certainly sympathize with your stance on VC guys. With some of them it is hard to tell if they just throw out some marketing yarn, or if they really believe the nonsense they sometimes spout. Their technological acumen is sometimes shockingly low, but then again they aren’t the ones performing the due diligence. That’s when they bring in the experts.

But the good thing is that their money augments government funding. I.e. I think on balance D-Wave probably drew in more money from private sources into the field then it deprived other research initiatives of governmental funding.

82. Joshua Zelinsky Says:

Marnie Dunsmore,@ #71

> In my experience, Henning, many VCs do not in fact understand Moore’s Law. For example, for many years, many VCs continued to assert that Moore’s Law applied to solar and wind power. Obviously, 2 dimensional lithography scaling does not apply to wind or solar, but somehow, it took a long time for the investment community to catch on to that.

You may be interpreting what they were saying in an overly uncharitable fashion. When people make such comments it seems like they don’t need Moore’s Law in the sense you are using it but more as rough short-hand for “about exponential growth in some metric with a doubling time measured in years.”

83. Greg Kuperberg Says:

I’m completely unmoved by what Google and Lockheed and Goldman Sachs do as companies, but extremely moved by what Troyer and Martinis and Monroe believe as individuals, and don’t see the slightest inconsistency there.

I would explain my epistemology a little differently. Of course, Google, Lockheed, and Goldman Sachs are composed of people, and those people fall into different factions. Notwithstanding that, I would very likely trust many of these people, and often by extension their companies, on those topics on which they are actually expert. I might well trust Google before I trust Troyer on street mapping, Lockheed before Martinis on fighter jets, and Goldman Sachs before Monroe on stock portfolios. However, none of these three entities has any track record of expertise in quantum computation, while Toryer, Martinis, and Monroe all do.

For that matter, so does IBM; or rather, a group at IBM that IBM lets speak for it on quantum computation. Guess what, they say the same thing as the academics.

84. Tim May Says:

Marnie #71, you are exactly correct.

“In my experience, Henning, many VCs do not in fact understand Moore’s Law. For example, for many years, many VCs continued to assert that Moore’s Law applied to solar and wind power. Obviously, 2 dimensional lithography scaling does not apply to wind or solar, but somehow, it took a long time for the investment community to catch on to that.”

Moore’s law just an observation, made during the 1960s and extended into the 70s. And it continues to generally apply.

But, as I like to tell my friends, it was printing t-shirts. That is, photolithograply, Print transistors at half the size, get four times as many per unit area. Rinse and repeat.

It has worked so long as customers paid enough for manunfacture camera systems which could print 10 micron features, then 5 micron features, then one micron features, then 500 nm, then 120 nm, then 60 nm, then 28 nm, then 22 nm, then 16 nm, and now some 14 nm and even some 10 nm features.

And each iteration generally raised the price of fabs from a miliion bucks to $10 m, to$50 m, and so on up to the current tab of about $3 billion for a state of the art fab. This stuff is all only paid for if markets and customers exist, not as some kind of inexorable law of the world. In fact, that we have gone from several dozen IC companies in 1975 to about 3 right now (Samsung, TSMC, Intel), possibly another tier comprised of Global Foundries, Texas Instruments, a few others) says a lot about the economics. My point to a lot my futurist, singularitarian friends is that it’s silly to take Moore’s Law as some roadmap to the future. How it relates to quantum stuff, I don’t know. Which is why I read blogs like this. But I sure do know that Moore’s Law is pretty much a side issue. (Caveat: I worked in the same building with Gordon and consulted with him on some of the alpha particle and cosmic ray stuff, circa 1977-80. A great guy. But he didn’t take his own law too seriously and would reject any arguments about what Intel should be doing based on any reasoning along the lines of “But Moore’s Law says such and such.”) BTW, I expect there are some new such “laws” related to de-coherence times, physical dimensions, temperatures, etc. But I think we are a long ways away from when a guy like Gordon Moore could plot parts already for sale and noticing a trend line (which is all Gordon ever claimed to be doing). –Tim 85. Claude Crépeau Says: Hey Scott, great writing of this very readable analysis of the D-Wave/Google claims. I certainly don’t have the patience to read carefully and debunk one by one the excessive claims. I congratulate your patience in responding to the numerous commenta, including the most hostile ones… 86. Marnie Dunsmore Says: @Henning Decant #81 Of course, it’s nice to see a Vancouver based company be well funded and working on an interesting problem. I’m looking forward to the next paper and the “better control precision and richer connectivity graphs.” 87. Marnie Dunsmore Says: @Joshua Zelinsky #82 “You may be interpreting what they [VCs] were saying in an overly uncharitable fashion. When people make such comments it seems like they don’t need Moore’s Law in the sense you are using it but more as rough short-hand for “about exponential growth in some metric with a doubling time measured in years.” Moore’s Law: In 1965, Gordon Moore, a founder of Intel, said that the number of components on an integrated circuit would double every year and projected this rate of growth would continue for at least another decade. He revised this to doubling every two years, in 1975. What is interesting here is that Moore made a precise statement: components per chip proportional to 2^N, where N is the number of two year periods. He bounded the time period and stated the exponent specifically. Moore’s law depended upon scaling down the CMOS gate length, an investment that required several trillion dollars over a half century. By comparison, the statement “exponential growth in some metric with a doubling time measured in years” often used loosely be the investment community, is not a precise statement. The exponent defining the exponential growth is almost never stated, and neither is the time period. And sometimes, it’s just plain wrong, as with wind energy. There’s this idea among some people that these scaling improvements in hardware just pop up out of a jack-in-a-box at the flick of switch. Real exponential year over year hardware improvements occur only with massive investment of capital (something that no VC wants to hear.) 88. jonas Says: Marnie Dunsmore: Thank you for that good explanation of how Moore’s law works. It certainly makes much more sense that way than those popular versions that tried to predict that by now we’d have cpus with clock cycle speeds of several 10**10 hertz in a chip so large that it takes 10**-10 seconds to run around it at light speed, and chips with logic gates smaller than an atom. 89. Joshua Zelinsky Says: @Marnie Dunsmore #87, I agree that that’s a more precise statement of Moore’s Law. And I agree that the VCs are definitely missing that exponential growth of that sort does require massive investment. My comment was intended to focus on your statement that “For example, for many years, many VCs continued to assert that Moore’s Law applied to solar and wind power. Obviously, 2 dimensional lithography scaling does not apply to wind or solar, but somehow, it took a long time for the investment community to catch on to that.” In that context, what the VCs are saying is misguided, but it is a mistake to interpret their statements as purely failing to understand that Moore’s Law is a a statement about lithography, since that’s not what they meant. (It is incidentally worth noting that by at least some metrics, solar power has undergone an exponential improvement with a small doubling time. See for example https://en.wikipedia.org/wiki/Solar_cell_efficiency#/media/File:PVeff(rev150806).jpg ). 90. Jeremy Stanson Says: Greg #77 It seems you are stubborn on this point. You make assertions without any real argument or support. As I argued @ #75, D-Wave’s qubits are “dirty” as a consequence of having all the necessary attributes of a scalable system. Don’t just tell me “D-Wave’s qubits are dirty and they never bothered to make clean ones.” Instead, look at D-Wave’s qubits and tell me WHY they are dirty. What makes them dirty? It isn’t laziness or sloppiness on D-Wave’s part, that’s just absurd. There is no motivation for D-Wave to act that way. The qubits are dirty because they have a complete on-chip programming infrastructure that is the only way to get the necessary programming signals into the confines of a cryogenic environment when you have that many qubits – a problem no other group has faced so far because no other group has built so many qubits before. The qubits are dirty because each one is controllably coupleable to many neighboring qubits – at a scale that no other group has achieved before. The qubits are dirty because they have connections to an elaborate compensation infrastructure that accommodates discrepancies between devices that inevitably arise in the fabrication process and enable 100s-1000s of qubits all to operate in the same regime and appear nominally identical to one another – another problem no other group has needed to address yet. All of these challenges, and many more, are engineering challenges that D-Wave has identified and started to address far in advance of any other group. As Scott and I have both acknowledged, D-Wave OBVIOUSLY wants their qubits to be as clean as possible. Scott goes on @ 77 to say that acknowledging this is a “strawman” point, but I don’t think so. Both you and Scott appear to be somehow slighted by D-Wave and it looks like this colors your judgment. Scott claims that he appreciates D-Wave wants clean qubits but his perspective never seems to allow for that. It’s always disparaging of the “dirty” approach without ever seeming to really consider the value it has to offer. As an exercise, consider that D-Wave might be operating rationally and with the goal of building a legitimate quantum “machine” as quickly as possible. D-Wave hasn’t “settled” for low quality qubits. D-Wave have acknowledged that the qubits of a large-scale heterogeneous system will necessarily be lower quality than the qubits of a smaller purer system. They’ve acknowledged that the environment and infrastructure, the whole architecture, or a large-scale quantum processor is nothing like that of a small toy system. I think there is a lot of merit in building the bigger processor, seeing what the effects are on the qubits, and working on how to clean things up within the reality of a fully-integrated system. I have nothing against the “clean” approach, but I just can’t understand why otherwise perfectly rationale people cannot acknowledge the merit of the “dirty” approach. Maybe it’s because math, physics, computer science, all tend to be clean while real-world engineering (i.e., building real programmable quantum systems) is dirty. 91. Jeremy Stanson Says: Another point and then I should wrap this up. Scott emphasizes the fact that D-Wave is putting a lot of effort towards an uncertain goal. It is not known that a “dirty” system at D-Wave’s scale will ever be able to offer any quantum speedup. There is no evidence to support or even really suggest this. Fair enough. This is true. But the following is also true: it is not known that other groups following the “clean” approach will be able to scale their systems up and maintain their desired level of “cleanliness.” There is no evidence to support or even really suggest that such scaling will be possible. (In fact, the only concrete example of such scaling that we can point to is D-Wave, who have dirty qubits despite OBVIOUSLY wanting to have clean ones). Consequently, it is not known that other groups following the “clean” approach will ever be able to offer any quantum speedup at scale either. 92. Blog - physicsworld.com Says: […] are some important caveats, as MIT quantum-computer expert Scott Aaronson points out on his blog Shtetl-Optimized. It seems, for example, that there is another classical algorithm that can solve these problems […] 93. Job Says: The problem is not that D-Wave is merely scaling their system to lots of inferior qubits, they’re also scaling the unsubstantiated and over-hyped claims that go along with it. That seems to be the only attribute of their system that exhibits a definite asymptotic growth. 94. Greg Kuperberg Says: Instead, look at D-Wave’s qubits and tell me WHY they are dirty. What makes them dirty? Because they haven’t been patient enough for the later advances in qubit coherence times. In other words, they’re following the strategy, “you go to war with the army you have”. They’ve scaled all right, but their qubit methodology is stuck back in 2005, or earlier. The consensus of the field is that even today, superconducting qubits haven’t fully been invented yet, even though they’re much better than they were 10 years ago. That’s what is really meant by “clean” vs “dirty”. It’s not like a “dirty” kitchen that works almost well as a “clean” kitchen. It’s more like a “dirty” violin that can’t carry a tune. It’s not considered fruitful to scale up to thousands of qubits when you can hardly get two of them next to each other to work very well. 95. Marnie Dunsmore Says: @jonas #88 Under the square law model for the CMOS transistor, as the transistor gate length, L, is scaled down, the speed of the device scales as 1/(L^2). So, for instance, in going from a 0.35um gate length to a 0.18um gate length, we could expect the device to speed up by a factor of 4. As it turns out, the square law model is an approximation, which ignores the effect of velocity saturation, so we do not really get a quadratic speed up as we scale the gate length down. Below about 100nm gate lengths, in addition to velocity saturation, many other parasitic effects come into play, making transistor speed up advances harder to achieve. Perhaps some of the “popular versions that tried to predict that by now we’d have cpus with clock cycle speeds” in the 10s of GHz used only the square law 1/(L^2) to estimate clock speed, and ignored the secondary effects. Also, the challenge to clock speed on chips is not limited only by transistor speed, but also by clock distribution issues. CMOS is not done though. As of July, 2015, TSMC is in full production with 16nm, and scaling advantages and speedups are still happening. These do not come for free, and have happened only with massive investment. A great book about CMOS transistor modeling: Operation and Modeling of the CMOS Transistor, by Yannis Tsividis and Colin McAndrew http://www.amazon.com/Operation-Modeling-Transistor-Electrical-Engineering/dp/0195170156 96. Greg Kuperberg Says: As an exercise, consider that D-Wave might be operating rationally and with the goal of building a legitimate quantum “machine” as quickly as possible. Oh, I missed this. They clearly are acting with the goal of building a quantum machine as quickly as possible — but the results have never been particularly legitimate. 97. Jeremy Stanson Says: Job 93 I don’t see it. Can you point to examples of this asymptotic growth? The most recent news was from Google (not D-Wave) and I think everybody agrees it was well-tempered. It isn’t D-Wave’s fault if the media misconstrues / sensationalizes the QC developments. Anybody who has been involved in anything newsworthy knows that this is unavoidable. Greg 94 I agree with everything that you say, except for D-Wave’s qubit methodology being stuck in 2005. By being the first to really examine the issues that arise during scaling, D-Wave has a significant role in defining the qubit methodology of today and tomorrow. Of the “dirty” approach, you say: “It’s not considered fruitful to scale up to thousands of qubits when you can hardly get two of them next to each other to work very well.” That makes perfect sense. Yes, I agree. However, of the “clean” approach, I counter: It’s not considered fruitful to painstakingly develop an intricate 2-qubit interaction if that system is instrinsically non-scalable. This is why both the clean approach and the dirty approach have merit. All I am trying to argue is that the dirty approach does in fact have merit and I can’t see why you guys won’t acknowledge that. 98. Marnie Dunsmore Says: @Joshua Zelinsky #89 ” In that context, what the VCs are saying is misguided, but it is a mistake to interpret their statements as purely failing to understand that Moore’s Law is a a statement about lithography, since that’s not what they meant. ” I will be the first to admit that I do not understand the motivation or thought process of many VCs. I did look at the graph you posted for the solar industry. It’s good to know this kind of data is out there. In general, the growth does not appear to be governed by an exponential relationship. Don’t get me wrong: I’m a huge fan of the need to invest in alternative energy. I just think we should be able to say, with a straight face, to an investor, “It’s going to be 10 years before we are cash flow positive” and “This is not software. I can’t develop solar on a budget that looks like the initial investments in Airbnb.” 99. Job Says: The most recent news was from Google (not D-Wave) and I think everybody agrees it was well-tempered. Exactly, it took efforts by other teams and researchers to bring the D-Wave hype under control. According to D-Wave, their system “uses quantum mechanics to massively accelerate computation”, almost to the point of beating classical algorithms for specific instances of optimization problems. 100. Jeremy Stanson Says: Job 99 So… that’s a “no” regarding examples of “asymptotic growth” in D-Wave’s over-hyped claims, then? 101. Greg Kuperberg Says: By being the first to really examine the issues that arise during scaling, D-Wave has a significant role in defining the qubit methodology of today and tomorrow. That’s just not true. The are anything but the first or the main people to examine scaling to many qubits. They are the main people who scale regardless of issues, which is not the same thing. I also haven’t heard of any significant role that they play in anyone else’s methodology. It’s not considered fruitful to painstakingly develop an intricate 2-qubit interaction if that system is instrinsically non-scalable. That’s just not true either. The whole point of everyone else’s work on superconducting qubits is that it is intrinsically scalable. The only difference is that other groups recognize that it isn’t yet good enough to be worth scaling. 102. Scott Says: Greg #101: I also haven’t heard of any significant role that they play in anyone else’s methodology. In fairness, John Martinis shares my and your skepticism about the possibility of a true asymptotic speedup with D-Wave’s qubits, but he told me that some of the engineering D-Wave had done (e.g., just figuring out how to integrate hundreds of superconducting qubits while still having lines into them to control the couplings) would be useful to his group. That’s one of the main things that caused me to moderate a bit (while remaining as intolerant as ever of hype). 103. Greg Kuperberg Says: Scott — All right. After spending$100 million, it’s a relief that something that they have done has had some influence somewhere.

104. Job Says:

Jeremy #100

So the latest D-Wave system was not preceded by projections that it would be “faster than the universe” by now?

Do you want a better example of asymptotic hype than the graph posted by Jurvetson that shows D-Wave’s systems projected to be faster than all computers combined by sometime last year?

D-Wave’s approach to quantum computing is analogous to marketing a new drug before it has been scientifically verified to be more than a placebo, leaving everyone else to show when it does, and does not, work.

That’s what makes D-Waves approach “dirty”, it does not refer merely to the low quality of their system, which is what you’re failing to understand.

105. Joshua Zelinsky Says:

Marnie Dunsmore,

In that case, I think we’re essentially on the same page.

106. Jeremy Stanson Says:

Greg 101

“The whole point of everyone else’s work on superconducting qubits is that it is intrinsically scalable.”

You betray your own argument. Now whose qubit methodology is stuck in 2005! 😛 The notion that SC qubits are “intrinsically scalable” is precisely what motivated D-Wave to work with SC qubits in the first place… back in ~2005. In principle, SC qubits are totally scalable. You use lithography and pattern out many copies of the same thing. No problem, right? There’ll be lots of tweaks and learning along the way but it’s totally do-able. That’s the 2005 mentality.

Now, in 2015, if only we had some example of how that intrinsic scaling actually starts to take shape when you put it into practice… where could we look for such an example… have no large-scale SC qubit systems been built in the last 10 years…?

And @ 103

D-Wave has had “some influence.” Get over yourself, man. D-Wave is driving this industry now. It may not be D-Wave who does it, but the role (I’m hesitant to say “progress” because I know you’ll get upset) of D-Wave has spurred on interest from Google and NASA and lit the fire under IBM and Microsoft. If real QCs are going to exist, D-Wave will have caused them to exist sooner than they would have existed otherwise.

107. Jeremy Stanson Says:

Job

You mean the graph he posted in 2012? Where’s the growth in 2015?

“D-Wave’s approach to quantum computing is analogous to marketing a new drug before it has been scientifically verified to be more than a placebo, leaving everyone else to show when it does, and does not, work.”

D-Wave is marketing a concept and they aren’t tricking anybody into buying. Nobody is being deceived out of their $10 million. The organizations that buy D-Wave machines are investing in research. They are exploring something new and learning how to use it and make it better. It’s collaborative. I’d understand your rage if D-Wave WAS selling a drug and tricking the general public but that’s just not happening here. Don’t get so worked up about an analogy that isn’t accurate in the first place. Rather, get excited about a unique effort to build something that we all want to see get built. 108. fred Says: Jeremy #107 “D-Wave is marketing a concept and they aren’t tricking anybody into buying. Nobody is being deceived out of their$10 million.”

But they sure haven’t seemed to go out of their way to correct/clarify all the false hype generated by their claims in the press, etc.
How many ppl who could be potential investors are even able to parse their murky claims? None, besides a handful of experts.
That’s why Scott is saying:
“it’s that the “trigger” this time was a serious research paper by a group separate from D-Wave, rather than some trash-talking statement from Geordie Rose.”

109. Job Says:

You mean the graph he posted in 2012? Where’s the growth in 2015?

FYI, he has updated the post in 2015 to indicate that D-Wave is “still on track”.

Don’t get so worked up about an analogy that isn’t accurate in the first place. Rather, get excited about a unique effort to build something that we all want to see get built.

Huh? My posts are meant to address a misunderstanding you seem to have relative to Scott’s usage of “dirty” and which caused you to go on a tangent about how “dirty” qubits are a consequence of scaling.

Personally, I get excited about Quantum Computing in the form of stuff like Simon’s or Shor’s algorithms, the only thing i want from D-Wave and Google is convincing evidence of a definitive quantum speedup which D-Wave’s “dirty” approach has not yet delivered.

110. Greg Kuperberg Says:

D-Wave is marketing a concept and they aren’t tricking anybody into buying. Nobody is being deceived out of their $10 million. It is true that the three buyers of the D-Wave device are at some level fully informed, and that their purchases also have a major aspect of deliberate institutional vanity. However, D-Wave is also selling a story to investors. It is true that I have no inside knowledge at all of their discussions with their investors either. What I can say is that the ultimate decisions make no sense to me. The two parts of the story that I can see are the one told in the popular press, and the one told in the scholarly literature. The story in the press is a silly misrepresentation of quantum computation. The story in the scholarly literature is at some technical level remarkably transparent. But what is really there is not good news for D-Wave, other than a layer of upbeat gloss. 111. Greg Kuperberg Says: the only thing i want from D-Wave and Google is convincing evidence of a definitive quantum speedup which D-Wave’s “dirty” approach has not yet delivered. Given that neither theory nor experiment supports its existence, don’t hold your breath. 112. Jeremy Stanson Says: Job Strange for you to claim that I have “a misunderstanding relative to Scott’s usage of “dirty”… which caused [me] to go on a tangent about how “dirty” qubits are a consequence of scaling” when Scott himself identified no such misunderstanding back when he was engaged in this discussion. Fred/Greg I think it is bizarre how concerned you are for D-Wave’s poor and naive potential investors! I’m never heard so much sympathy for the 1%. 113. Job Says: Given that neither theory nor experiment supports its existence, don’t hold your breath. I assume you’re referring specifically to D-Wave’s QC. If it had been shown that D-Wave’s system does exhibit a definite quantum speedup, i would expect a follow up – perhaps larger – discussion of how that could possibly be, given the absence of any (?) significant quantum error correction. Scott has often said that the infeasibility of quantum computing would have serious implications for QM. Can we make a similar claim here that, the viability of a theoretically incomplete QC like D-Wave’s would also have implications for QM? 114. Lisa Adam Says: How about this …. Please raise your hand if you’ve signed a non-disclosure agreement with any of the following relating to QC or AQC: D-Wave, Google, NASA, Lockheed Martin, USC, LANL, US Government, In-Q-Tel? Seems logical that if you haven’t been given confidential information that you don’t have deep (or inside) knowledge of how a D-Wave 2X system operates. 115. fred Says: I’m always puzzled by the motivations of the D-Wave “defense force” folks in this blog… Are they seeing D-Wave as some kind of computational breakthrough that Scott isn’t seeing? The proof that QC are feasible? But Scott does believe that QC is a feasibility too, right? Are they working in a business making money based on how fast they can run quantum annealing? Are they feeling bad for the D-Wave folks? They perceive Scott’s skepticism as being unfair? 116. Scott Says: Lisa #114: Funny you should bring that up—as it happens, I recently discovered an airtight proof that D-Wave’s current approach can’t possibly yield an asymptotic speedup for anything. But since you’ve neither paid my standard consulting fee nor signed my nondisclosure agreement, I see no reason why I should share that proof with you. Just accept that, as an acknowledged expert on quantum computing, I know more about this subject than I can tell you. That’s how science works. 117. Rahul Says: @Lisa Adams: “Seems logical that if you haven’t been given confidential information that you don’t have deep (or inside) knowledge of how a D-Wave 2X system operates.” Oh boy. By that yardstick, I think I’m forced to believe the efficacy of every astrologer, homeopath, quack doctor, alchemist, tarot reader, witch doctor that comes along. The Oh-believe-me-this-works-but-I-am-just-not-allowed-to-tell-you-how argument is unbelievably crappy. Hell, I’d OK believing someone’s claim to black box magic if only they’d demonstrate a fantastic achievement at least. Here’s a test I propose: Let D-Wave propose a hard problem of their choice with six months notice. At an assigned time we throw a random instance of that problem, simultaneously at D-Wave and$10-Million dollars worth of optimized classical computing hardware (clusters, GPUs, FPGAs whatever) running the best algorithm D-Wave skeptics can come up with.

If D-Wave manages to solve it faster we concede that they really have something fantastic up their sleeve, even if the workings remain secret. At that point, I’m even willing to concede that D-Wave might have a true, scalable QC.

118. 373737111 Says:

119. 373737111 Says:

Scott #116: Also, what is your standard consulting fee?

120. anon Says:

From a reader in the finance industry:

Scott, you’re now mentioned in Bloomberg, in what seems like a nice marketing article for D-Wave.

http://www.bloomberg.com/news/articles/2015-12-09/quantum-supercomputers-entice-wall-street-vowing-higher-returns

Could you comment on the paper

http://arxiv.org/pdf/1508.06182v2.pdf

that the financial math researchers claim is a demonstration of D-Wave’s potential in solving real-worl problems that classical computers may not be (in practically, while maybe not provably asymptotically, better time)

121. Job Says:

Jeremy #112,

I’m not going to bother dissecting Scott’s comment describing what the “clean” approach entails in order to show you that the inverse is not as you understand it to be.

By your argument, a company is justified in marketing a 512 qubit QC which in fact contains only 10.

Those are 10 perfect qubits. Of course they want to have 512 qubits, but hey it’s difficult to scale and preserve qubit quality right?

Maybe Martinis’ team can follow this plan to “catch up” to D-Wave.

122. Shehab Says:

Rahul #34:

Is it possible that Google compared their result against simulated annealing instead of Selby’s algorithm to draw the attention of the huge number of AI practitioners working at Google?

123. Henning Dekant Says:

Job #105 untested drugs get people killed. Untested quantum computers do not. Who’s hyping now?

124. Marnie Dunsmore Says:

@Joshua Zelinsky #105

“In that case, I think we’re essentially on the same page.”

🙂

I’m repasting your solar efficiency graph here and hope to have a closer look at it over the weekend.

Thanks for sharing this.

125. Job Says:

Job #105 untested drugs get people killed. Untested quantum computers do not. Who’s hyping now?

You are. Not all untested drugs kill people – those that turn out to be placebos for example.

126. Scott Says:

373737111 #118, #119: My usual consulting fee is $1000/hour, rounded up to the nearest hour. And I thought the point was obvious. In comment after comment, Lisa repeats one of the weakest arguments known to humankind (“they do too have a real speedup over the best classical algorithms, but they’re not gonna explain it to you, because it’s super-secret!”)—an argument that the Google folks wouldn’t be caught dead making, and that D-Wave itself hasn’t made since 2008 or so. I pointed out that, once we allow that “argument,” she can’t rule out that the skeptics have super-secret reasons why there’s no speedup, which they’re not telling her. If you still don’t get it, I’ll be happy to explain further for my standard fee… 127. Scott Says: anon #120: That Bloomberg article is depressingly credulous, and quotes me with amusing selectivity (they cherry-pick something I said about how D-Wave is “in it for the long haul,” while ignoring everything about how there’s no evidence for any speedup over the best classical algorithms). As for the paper you asked me to comment on: unless I missed something, it has no performance comparison whatsoever between D-Wave and classical solvers. All it says is that you can use a D-Wave machine, which is obvious and not in doubt. In some sense, the fact that anyone would consider that evidence for anything is the reason why I have to keep blogging about this. 128. Job Says: Is it possible that Google compared their result against simulated annealing instead of Selby’s algorithm to draw the attention of the huge number of AI practitioners working at Google? From their paper, referring to Selby’s algorithm: However, we believe that such solvers will become ineffective for the next generation of annealers currently being designed. The primary motivation for this optimism is that the connectivity graphs of future annealers will have higher degree. For example, we believe that a 10-dimensional cubic lattice is an engineerable architecture. For such dense graphs, we expect that cluster finding will become too costly. It sounds like they already know that Selby’s is faster than quantum annealing running on D-Wave, and already understand why that is, so they’re not looking any further. 129. jonas Says: Re #120 that article also quotes a financial expert saying “Quantum technology can evaluate all possible scenarios in order to produce an optimized portfolio that can deliver the best risk-adjusted return.” Is that saying that quantum computers try all possible solutions in parallel, or that it can solve NP-complete problems, or neither? 130. Jeremy Stanson Says: Job 121 “I’m not going to bother dissecting Scott’s comment describing what the “clean” approach entails in order to show you that the inverse is not as you understand it to be.” I understand your hesitation. To me that does sound like an utterly futile exercise. But maybe… if you try to work it through, and then you review your results against my posts, you might learn something. I think that would make it worthwhile. 131. Darrell Burgan Says: I know I’m speaking as a mere engineer here, but I would be very interested if the cost of the solution was included in the analysis. My perception is that classical approaches to optimization problems are still vastly more cost effective than D-Wave’s system in order to produce systems that perform at equivalent speeds. My perception is also that the human process of expressing a real-world problem to the system is far quicker and less costly for a classical system than a D-Wave system. I know these are engineering concerns not basic-research concerns, but I point out that D-Wave holds itself out as an engineering company. They are producing a product, not a theory or experiments. My challenge to D-Wave would be: give me a solution that is compelling from a price/performance standpoint. I think they still have a long way to go on that front. 132. Jeremy Stanson Says: Darrell, You are correct on all fronts. D-Wave is not (yet?) at the point where there is any cost benefit to installing and using one of their machines. “Customers” who have purchased a machine to date are investing in the technology as early adopters and using it for their own research purposes. Figuring out what to use it for and how to do it are not straightforward, but these customers are trying to get that learning out of the way with the smaller, less powerful versions of the processor so that they are ready when (if?) really high-performing procesors become available. Another interesting thing from a cost perspective is the fact that the system has essentially constant power consumption regardless of the size of the processor (i.e., regardless of the number of qubits). If a D-Wave machine ever competes with a semiconducting supercomputer in terms of computational performance, then the D-Wave machine will vastly outperform the semiconducting supercomputer in terms of power consumption. 133. Scott Says: Jeremy #132: Err, the reason why D-Wave’s power consumption is “constant,” rather than scaling with the number of qubits, is simply that there’s such a massive fixed overhead to cool everything down to 15mK or whatever! It takes chutzpah to present the need for this massive cooling as an “advantage” for D-Wave! Having said that, of course if you got a huge asymptotic runtime speedup, that would ultimately translate into power savings as well. Whether such a speedup is or isn’t possible with a D-Wave-like architecture is, I repeat, the central question at issue here, on which everything else turns. 134. 373737111 Says: Scott #116 & #126: Your response implies that D-Wave has an obligation to disclose it’s inner workings to non-customers. Paying customers deserve greater access to confidential information (via agreements) than skeptics. In fact, your post suggests that you have the inside track on D-Wave, but only to people who will pay you$1000/hour.

135. Jeremy Stanson Says:

Scott # 133: Err, relative to the power consumption of a conventional semiconducting supercomputer, the power required to cool the D-Wave machine is miniscule. I haven’t related this to an asymptotic runtime speedup. If a D-Wave machine is ever faster in performance than a semiconducting supercomputer, then power consumption may not be of much relevance. I agree with you re: the central question but that wasn’t the question that Darrel asked.

What I said was that if a D-Wave machine is ever competitive with a semiconducting supercomputer (independent of the scaling of the system’s performance and even if it is actually slower than the semiconducting supercomputer), it would actually cost considerably less to operate the D-Wave machine. I’m not sure how much of this relates to reversibility of the QM processes at play (or if that’s a factor at all), but it is a certain characteristic of the superconducting nature of the system whether the algorithm it performs is quantum or classical in nature.

136. Henning Dekant Says:

Job #126, there is a reason drug research is tightly regulated by the FDA and experimental computers are not. Your comparison is an inaccurate exaggeration, about as meaningful as Jurvetson’s extrapolations that D-Wave machines will in short order be more powerful than the universe.

137. Scott Says:

373737111 #134: Of course, they don’t have the obligation to reveal anything, but then no one else has the obligation to give them the benefit of the doubt if they don’t. Openness has been the central operating principle of science since the late 1600s, and it’s been working pretty well! And until someone demonstrates a clear quantum speedup for something, we’re clearly still in the domain of science.

Incidentally, on the few occasions when investment firms, etc. have wanted to pay me $1000/hour to talk about D-Wave, I’ve told them exactly the same things that I tell everyone else on this blog for free. But who am I to object if they want it over the phone rather than in a comment section? 🙂 138. Henning Dekant Says: Scott #134, correct me if I am wrong, but until read-out the annealing process is essentially adiabatic on the D-Wave chip, there is no heat loss, so conceptually you could stuff more than one into the same fridge. To me this shows a potential path to more energy efficient scaling, so Jeremy’s argument is not without merit (I certainly made it in the past, with regards to the potential of the technology). My understanding is that at this time the overhead for shielding and control logic (i.e. the inverse “Christman tree” hanging from the top of the D-Wave boxes) prevents them from operating more than one chip in parallel. 139. Scott Says: Henning #138: OK, fine, it’s just that “power usage is constant” strikes me as a really strange way to describe a situation that’s really, “there’s this large additive constant in front of the power usage, from the cooling system.” It would be like if someone tried to hire you for a job where you had to commute for two hours each way, but they said that’s actually an advantage, since you’ll get paid by the hour, and if you decide to work longer and longer hours, you won’t have to commute any more than before. 140. Henning Dekant Says: Scott #140, there’s one aspect where that analogy falls down, there are unfortunately only 24 hours in a day, I am always acutely reminded of this when enjoying myself over here 🙂 141. Scott Says: Henning #140: Who says you’d have to go home at night? 🙂 142. Jeremy Stanson Says: Scott # 139 I totally get that the speedup issue is foremost, especially for someone in your position. But from a non-computer scientist point-of-view, the power consumption of a D-Wave system is a real consideration. D-Wave has all along been operating on a roadmap of ever-increasing processor complexity and it has always been part of their pitch that the power consumption of the system remains essentially constant. They want to compete with semiconducting supercomputers one day, and the power used to cool and operate those systems is way higher than the “large additive constant” you are referencing. You and others here often claim not to understand the motivations of investors and that’s an area where other, “non-speedup” considerations come into play. The intrinsic low-power consumption (no-power consumption in principle but of course that doesn’t happen) of a superconducting processor is probably a bit of a safety-net for investors. Apart from all the “quantum” stuff, D-Wave is building the most sophisticated superconducting fab capability in the world. Even a strictly classical superconducting processor deliberately designed to perform strictly classical algorithms may be expected to offer a constant-factor speedup and significant power savings over a semiconducting counterpart. I believe the NSA (or TRW, or Northrop Grummann, or whoever) tried to build systems like this several years back, and I believe some of the current D-Wave processor folks were part of that effort. This is nice security for investors in the event that the quantum side of things doesn’t pan out. 143. indolering Says: > On a different note … I’ve been getting some nasty comments (which I haven’t let through the moderation filter) about how I’m a fanatical D-Wave hater, “flinging feces” at their historic achievement for no reason. I’ve played the role of chief debunker myself and it’s an incredibly frustrating experience. The only thing you are asking of DWAVE is to be honest about their achievements and their capabilities. What people don’t understand is that amping up those achievements using marketing bullshit has real consequences. Thanks for fighting the good fight. 144. fred Says: Jeremy #142 “D-Wave is building the most sophisticated superconducting fab capability in the world. ” Building a sophisticated fab requires billions of dollars. 145. Marnie Dunsmore Says: @ Henning Dekant #138 “Scott #134, correct me if I am wrong, but until read-out the annealing process is essentially adiabatic on the D-Wave chip, there is no heat loss, so conceptually you could stuff more than one into the same fridge. To me this shows a potential path to more energy efficient scaling, so Jeremy’s argument is not without merit (I certainly made it in the past, with regards to the potential of the technology). ” “My understanding is that at this time the overhead for shielding and control logic (i.e. the inverse “Christman tree” hanging from the top of the D-Wave boxes) prevents them from operating more than one chip in parallel.” Henning, please discuss the interconnect challenges in “stuffing more than one D-wave” machine into the same system and cooling system. It’s obvious that in the long term, spatially, you’re going to have trouble maintaining quantum coherence over a large spatial chip area. What to do? One solution: build lots of smaller quantum computing machines and connect them together? Ah, hah! Then you have a clock distribution problem, which is the same problem that one encounters on a classical CMOS processor. On most high density ASICs, as you go to higher clock speeds (5GHz+) it’s the clock distribution network where power increases exponentially with clock speed. I’m really having trouble figuring out how you are going to get around this. One solution I can think of is to use the quantum computer just for the hottest computations on small islands of the chip, and use classical ASIC design for the rest. But this is not going to free you from the clock distribution problem. I strongly doubt that we are ever going to see a quantum only computer. I think its going to be big finfet CMOS, and small quantum computing, with huge efforts expended to address the interconnect and clock distribution challenges. (Yes, those unsexy problems.) I think the quantum computing community should come to terms with this, and stop selling quantum computing as a replacement to existing CMOS processor approaches. Also, conventional CMOS processors would also work a lot better if it was supercooled. That should be the metric that quantum computing solutions are measured against. 146. Job Says: Job #126, there is a reason drug research is tightly regulated by the FDA and experimental computers are not. Your comparison is an inaccurate exaggeration, about as meaningful as Jurvetson’s extrapolations that D-Wave machines will in short order be more powerful than the universe Uh huh, when you put it like that then… yeah i guess D-Wave really is unlike a drug company who chooses to market a likely placebo as a cure. It’s because of the absence of a computational FDA, and it’s also because Quantum Computers don’t kill people. That’s why D-Wave’s product is guaranteed not to be a quantum placebo right? See, i missed that perspective when i framed that analogy earlier. Sorry! 147. Marnie Dunsmore Says: Guys, you’d get a big improvement in clock speed just by taking conventional CMOS down to liquid nitrogen temperatures. Many of the electromigration problems, which force wide interconnect widths, and slower chip clocking speeds, would be mitigated. You could design narrower interconnects, thus improving power and clock distribution networks. 148. Scott Says: Marnie #145: FWIW, I’ve always regarded it as obvious that in practice, as you said, a “quantum computer” will look like a classical computer that calls a quantum coprocessor for the few crucial things where you actually get an advantage that way. That’s what it already looks like if you walk into a QC lab—grad students hunched in front of classical computers that show where the ions in the trap are or whatever—and it even matches how the complexity class BQP is usually formally defined. It would be silly to use the quantum part for the numerous things a classical computer can do as well or better—as I like to say, like using the Space Shuttle to drive down the freeway. 149. Greg Kuperberg Says: It would be silly to use the quantum part for the numerous things a classical computer can do as well or better—as I like to say, like using the Space Shuttle to drive down the freeway. Scott, you need to remind people that you’re a theorist. BQP is surely larger than P, and all else is just constant factors. 150. Scott Says: Greg #148: Well, or polynomial factors. 151. Marnie Dunsmore Says: Scott #147 “I’ve always regarded it as obvious that in practice, as you said, a “quantum computer” will look like a classical computer that calls a quantum coprocessor for the few crucial things where you actually get an advantage that way. ” I view it that way too. One thing I’m wondering about though, is the fundamental temperature incompatibility between D-Waves quantum qubits, which only work below 1 degree Kelvin, and conventional CMOS, which only works above liquid nitrogen temperatures, due to carrier freeze out. Carrier freeze out paper: http://www.sciencedirect.com/science/article/pii/001122759090208T I remember back in the 90s, when I was a grad student at UBC, some of my buddies were doing research on qubits that would work at liquid nitrogen temps. I don’t know what happened with this, but in order to put qubits on the same IC with conventional CMOS, the qubits have to work at higher temps that what D-waves is doing. You could try to create different temperature islands on an IC, but that is a very, very hard, problem. You could put the qubits on one IC at temps less than 1 degree K, and CMOS finfets on another IC at liquid nitrogen temperatures, but then you are back to the nasty interconnect problem. Looks complicated. 152. Job Says: Apart from all the “quantum” stuff, D-Wave is building the most sophisticated superconducting fab capability in the world. I think most people acknowledge that D-Wave’s work in the supporting technologies for Quantum Computing – the membrane around their water pill, if you will – has value. It’s unfortunate that D-Wave receives more attention for what they have yet to achieve than what they have already produced. 153. Greg Kuperberg Says: Greg #148: Well, or polynomial factors. Even more so! Not even just polynomial factors, polynomial expansions. So who cares whether the lab laptops use bits or qubits? 🙂 154. Jeremy Stanson Says: fred #144 “Building a sophisticated fab requires billions of dollars.” If you’re building a foundry from scratch, yes. If you’re building a capability within an existing foundry, I think millions will probably do: http://www.cypress.com/news/cypress-and-d-wave-systems-announce-successful-transfer-d-wave-process-technology-cypress 155. Henning Dekant Says: Marnie #145, since you addressed me I guess I should add my two cents, although Scott already addressed the most important point, and there is really not much to add to this other than maybe pointing out that D-Wave always positioned their machine as a special purpose co-processor i.e. you’d only execute problems on it that are amenable to it. When it comes to optimizations you typically already distribute these to a dedicated cluster. While not exactly a drop it replacement that’s where a quantum computer would go once it reaches an attractive price/performance point. A real quantum speed-up will get us there for sure, but polynomial may do once conventional CMOS has reached the end of feasible transistor shrinkage. Of course the latter wouldn’t be of much scientific interest from a CS perspective but still make for good business. 156. anon Says: The tone of this post and many of the comments have been complimentary of the careful work done by the expert Google team. While the Google team’s expertise is beyond doubt and their paper is carefully written, it is also clear that they have engaged in serious (and highly successful) overselling of their result, including some unprecedented hype. Consider the abstract, which is all 99% of readers of their paper will ever read: “Here, we demonstrate how finite range tunneling can provide considerable computational advantage. For a crafted problem designed to have tall and narrow energy barriers separating local minima, the D-Wave 2X quantum annealer achieves significant runtime advantages relative to Simulated Annealing (SA). For instances with 945 variables this results in a time-to-99%- success-probability that is ~10^8 times faster than SA running on a single processor core. We also compared physical QA with Quantum Monte Carlo (QMC), an algorithm that emulates quantum tunneling on classical processors. We observe a substantial constant overhead against physical QA: D-Wave 2X runs up to ~10^8 times faster than an optimized implementation of QMC on a single core.” The media briefing at NASA on Dec.8 apparently consisted of similar, if not even more misleading statements (see http://www.technologyreview.com/news/544276/google-says-it-has-proved-its-controversial-quantum-computer-really-works/). No wonder that the press was under the impression that something substantial has been achieved. The abstract lists none of the many caveats regarding the supposed 10^8 speedup, which is a number that will not survive more careful implementations of SA on faster classical processors, and is essentially meaningless since it doesn’t inform us about asymptotic speedup. The abstract says nothing about other much more powerful algorithms such as Selby’s. Nor does it explain that SA with cluster updates will drastically reduce the runtime advantage. In short, the abstract deliberately hides the many caveats revealed only much later in the paper, which will be missed by the vast majority of readers. The same and worse can be said of the media briefing. One must wonder why the Google team chose this unbefitting strategy of obfuscation. The media certainly fell for it, so some great short-term publicity was a nice outcome. More speculative is that the real reason is that the team’s funding is under internal pressures, and this was a way for its leaders to get the higher-ups at Google to pay attention. No matter what the reason, the fact that the Google team hasn’t come out with statements debunking the numerous erroneous media reports claiming that they proved that quantum computing has arrived is quite unfortunate. Yes, D-Wave has engaged in plenty of hype, but apparently the Google team isn’t immune to this either. Damage to the global quantum computing effort has been done and the backtracking from meaningless 10^8 speedup factor claims will have to start soon. 157. Henning Dekant Says: anon #157, there is hype in the computer industry? Who’s going to protect the children? Seriously, I think with Google glasses and Apple watches just being the latest examples, the public is quite accustomed to that kind of hype and factors it in accordingly. The abstract makes quite clear that this is a “crafted problem” and not a universal speed-up. IMHO the damage to the QC field exists solely in your head and is counter-factual. News like this will attract more funding and investments. 158. Aula Says: Scott #137: “they don’t have the obligation to reveal anything” Actually they do, to a certain extent. If (as Lisa Adam repeatedly suggests) Google and the other publicly traded corporate investors of D-Wave have information that will have significant implications for the value of the D-Wave company, and therefore the market values of these corporate investors thmselves, then it is illegal for them not to disclose that information. Sure, technical details of the D-Wave machine are trade secrets, but financial consequences of trade secrets must be made public. 159. Greg Kuperberg Says: As bemused as I am with the hype-fueled trajectory of D-Wave, I have to give them credit for transparency at the technical level in their publications in the arXiv and in journals. They reveal a great deal there. I don’t think that they do it mainly to placate investors, who could in principle be entertained with inside information instead. I think that the larger reason is that their real market (so far) is to sell research devices. In order to have research devices, you have to have research. Thus, this type of transparency serves the customer. 160. Google's Quantum Computer: Don't Ditch Your Servers Yet – The Data Center Journal – Bodrag NEWS Says: […] Scott Aaronson, an MIT professor and theoretical computer scientist, notes that the paper citing these seemingly astounding results for the D-Wave 2X include caveats that make the claims all but inconsequential for skeptics. For instance, the paper says in its summary section, “More work is needed to turn quantum enhanced optimization into a practical technology.” The average Joe might well ask, “So what good is this contraption?” Some good may come out of it, but interesting abstract properties without any real application are insufficient. The plight of graphene is a similar example in the materials domain. […] 161. 373737111 Says: Henning #157 & Greg #159: Your comments give me confidence that this blog is converging on what’s important here. Thanks! 162. jonas Says: Oh dear. This is on radio news here too, with sensationalist statements. It says the computer is “hundred million times faster than an ordinary computer”. They mentioned Google and near absolute zero temperature and quantum, but not the name “D-Wave” nor trying all solutions in parallel. That doesn’t make it sound so unbelievable really, because we already have large clusters of powerful computers that are at least a hundred thousand times faster as my old home computer, at least for well parallelizable tasks. 163. David Says: “D-Wave has all along been operating on a roadmap of ever-increasing processor complexity and it has always been part of their pitch that the power consumption of the system remains essentially constant.” I think the idea of a ‘constant cooling overhead’ is mistaken. A cryostat’s size (and thus it’s cost) is measured as a function of its cooling power at base temperature. Each control line or mechanical support you bring down from room temperature to the chip will add heat load. Thus you would expect something like a linear scaling in fridge size with number of qubits. We presumably haven’t seen this so far as they just designed a cryostat with enough overhead to handle their first few generations of processor. It’s hard to put specific numbers on the cost scaling without knowing a detailed breakdown of all the heat loads. Also, whilst the cost and efficiency scaling of the first two cryostat stages to large scales is well know [1], I don’t believe this is the case for the final dilution refrigerator stage. Either way you cut it though sub-K cryogenics is hellishly expensive. This is one of the big selling points of quantum computing technologies like ions that can run at room temp or 4K. [1] THE COST OF HELIUM REFRIGERATORS AND COOLERS FOR SUPERCONDUCTING DEVICES AS A FUNCTION OF COOLING AT 4 K, M. A. Green, AIP Conf. Proc. 985, 872 (2008) 164. Jeremy Stanson Says: David #163 I think you are correct, but my understanding is D-Wave has already addressed the cryostat scaling issue. It’s too much for me to call out specific references, but it’s all here: http://www.dwavesys.com/resources/publications Basically, they’ve built their system to accommodate the maximum number of control lines that they could properly filter and cool with the maximum cooling power they could achieve. Then, they designed an on-chip programming system (superconducting DACs, XY-addressable arrays, demultiplexers, etc.) so that they can use that fixed number of control lines to generate any number of control signals. Every time they build a new and bigger processor, they just swap the old processor out and put the new processor in – connected to the same electrical/mechanical infrastructure. So there really is no added heat load as the processor scales up in size. Sure, there are more qubits on each successive processor, but those are microscale loops of thin niobium and are negligible compared to the mass of all the wiring, filtering, and magnetic shielding. Given all of that, I think the idea of “constant cooling overhead” is spot on. And just to relate back to the “dirty” vs “clean” discussion, all of this on-chip programming infrastructure is absolutely necessary for a scalable system for the reasons David points out (and because there are real limits on how many control lines you can engineer to fit into a fridge and still cool them to mK). This infrastructure has a negative impact on the quantum performance of D-Wave’s qubits, but this is an issue that will affect Martinis et al. and any other SC QC effort that grows to a large number of qubits. 165. Henning Dekant Says: David #164, your wider point is well taken. I think that the sub-K technology will only play into the first wave of QC devices until we have something like topological qubits. On the other hand there is still plenty of opportunity to do innovative things to cut down on the heat load (e.g. develop optical interconnects to replace some of the control lines). D-Wave has been pushing the engineering limits on that end, and I think this will be a real competitive edge for them as long as sub-K is the only QC game in town. Off the cuff I’d estimate there may be be a commercial window of about one to two decades for sub-K QC, and this goes for Google’s and IBM’s clean approach as well, so this will be additional impetus to commercialize this as quickly as possible. 166. David Says: Jeremy #164 It looks like they have sensibly put in enough control overhead that they don’t have to redesign their fridge for each processor iteration. What I am talking about though is the asymptotic scaling to very large numbers of qubits/processors, not the next generation or two. You say: “they can use that fixed number of control lines to generate any number of control signals”. If we assume you don’t want the time taken to program and read out the processor to increase, then for this statement to hold you will need to increase the bandwidth of the control lines as you increase the number of qubits. Increasing the bandwidth will increase the heat load as you can’t filter the thermal noise coming down the lines as heavily and the signals themselves will dissipate more. Eventually you will hit a bandwidth limit (at the max speed of the demultiplexers for example) and have to add more lines. Thus you are back to a linear scaling of the heat load from control lines with qubit number. A few control lines might not seem like much but at these temperatures you are paying hardware costs upwards of$100k per milliwatt of cooling power!

At this early stage it’s practically impossible to know how this will effect the economics of superconducting quantum devices but it’s certainly a big disadvantage of the technology.

167. Job Says:

Seriously, I think with Google glasses and Apple watches just being the latest examples, the public is quite accustomed to that kind of hype and factors it in accordingly.

Evidence of a quantum speedup by D-Wave’s system is closer to confirmation of the existence of the Higgs boson than it is to any popular technology trend.

In fact, D-Wave in this context is closer to CERN than it is to Google or Apple. They’re putting out a large hadron collider and claiming that the Higgs Boson exists before their particle accelerator has even detected it.

IMO the eventual confirmation of a quantum speedup is really a turning point and should not be claimed so casually – I can see how you would have a different opinion if you don’t appreciate this.

168. Scott Says:

Job #167: In a single comment, you’ve captured what I’ve been trying to say in these D-Wave threads for the past nine years. I wish I’d written that comment.

169. Mark Says:

The cost analysis is misleading. It is like saying that it is a waste of money to develop ENIAC because you can outperform what it does easily with the amount of money that was spent on it. It doesn’t take potential future opportunities into account. The bet is that after it gets to work we can focus on cutting costs and make it available for masses.

170. Mark Says:

It is also good to point out that we all agree that the D-Wave machine is not classical. By the way, more applied CS researchers don’t care as much about asymptotic as you think. It is already huge if we can compute usefully large instances 100x faster. That means cutting the costs by ~100x and increasing the profits by ~100x. In practice solutions seldom turn out to be as clean as theoreticians like.

171. Mark Says:

One final thing: people are already building special purpose machines. If you have a machine that can save considerable time on a task that is performed as Google’s scale it makes sense to make one. Think about GPUs.

172. Scott Says:

Mark #170, #171: OK, but we already addressed this. Once you take into account the cost of encoding onto the Chimera graph, and (especially) the existence of classical algorithms like Selby’s, it’s totally unclear that one should expect any constant-factor speedup either.

And yes, I completely agree that special-purpose machines can be extremely useful. But among other things, that means that the right comparison going forward—if people want to talk about practicality at all, rather than about basic science—is D-Wave vs. what you could’ve gotten using classical GPUs.

173. D-Wave – Fast Enough to Win my Bet? | Observations on Quantum Computing & Physics Says:

[…] provided this excellent write-up, and the former D-Wave critic-in-chief remains in retirement. Scott Aaronson's blog entry on the matter strikes a (comparatively) conciliatory tone. One of his comments explains one of the reason for […]

174. Marnie Dunsmore Says:

“But among other things, that means that the right comparison going forward—if people want to talk about practicality at all, rather than about basic science—is D-Wave vs. what you could’ve gotten using classical GPUs.”

yeah, and not GPUs optimized for laptops and cell phones, but the latest sub 20nm gate length GPUs with all the bells and whistles, optimized for server farms, with cooling.

175. Job Says:

If you have a machine that can save considerable time on a task that is performed as Google’s scale it makes sense to make one. Think about GPUs.

That’s not a particularly good example since Google’s infrastructure is known to run on “cheap and fast” commodity hardware.

Unspecialized hardware scales better because it’s cheaper to replace.

Also, tasks that can be performed ahead of time, such as indexing, don’t benefit as much from specialized hardware unless it translates to what is essentially an asymptotic speedup, or something close to that.

176. Greg Kuperberg Says:

OK, but we already addressed this. Once you take into account the cost of encoding onto the Chimera graph, and (especially) the existence of classical algorithms like Selby’s, it’s totally unclear that one should expect any constant-factor speedup either.

On the contrary, if you count the overhead of encoding into the D-Wave’s connectivity graph (the so-called “Chimera” graph, not otherwise known to be an important class of graphs), then you can expect a big slowdown compared to classical computation for a lot of problems, by well worse than a constant factor.

For instance there is the paper on using D-Wave to calculate Ramsey numbers, in particular to calculate R(3,3) and R(2,8). Maybe they could use this new device to calculate R(2,10). No one here should get too excited though, because there is a classical algorithm to calculate R(2,n) for any n that’s tough to beat. (Scott already knows this algorithm, so this is not mainly addressed to him.)

177. Google and Quantum Computing : Stephen E. Arnold @ Beyond Search Says:

[…] concerns a numerical recipe running on Google spiffy new D-Wave quantum computer. I also read “Google, D-Wave, and the Case of the Factor-10^8 Speedup for WHAT?” This paper points out that Google is making progress with the D-Wave quantum computer. How much […]

178. asdf Says:

Popular press messes things up again:

Sheesh.

179. Marnie Dunsmore Says:

Job #175

“That’s not a particularly good example since Google’s infrastructure is known to run on “cheap and fast” commodity hardware.”

A question for you: “Why do you refer to the GPUs that Google currently uses, which are optimized for servers, with state-of-the-art caching, sub 20nm implementation, and massive use of parallelism, with decades of R&D behind these advances, as “”cheap and fast” commodity hardware”?

180. Job Says:

Marnie #175,

Because i’m not referring to the relatively tiny fraction of Google’s infrastructure that’s using advanced GPUs to tackle specific problems.

181. Marnie Dunsmore Says:

Job #180

While I admit I don’t have *inside* knowledge about how Google implements its servers, if you look at the market growth of server GPUs made by companies like AMD and Intel, it’s obvious that this is a huge and growing market.

While it might be a tiny fraction of Google’s infrastructure now (again, I don’t know), in the coming years, these high end server GPUs will dominate in server farms.

Why? Heat and Speed.

Heat: Generating a lot of heat costs energy, which the server farm owners pay for. They also have to cool these server farms, which is another costly undertaking. And then there are the environmental issues of server farm heat generation.

Speed: For most servers these days, response time really matters, so query look ups have to be fast. So if Google is running on “cheap commodity” GPUs (I’m wondering about this) it must be because of legacy and replacement cost issues.

One thing is for sure: the cost per bit of a quantum computer (cost defined as investment , speed and heat) won’t be cheaper than conventional server GPUs such as those in the Intel Xeon GPU family any time soon. It is conventional GPUs in this class against which quantum computers should be benchmarked.

182. Job Says:

if you look at the market growth of server GPUs made by companies like AMD and Intel, it’s obvious that this is a huge and growing market.

I imagine that’s because GPUs have become mainstream and there is demand for hardware-accelerated rendering on the cloud, among other reasons. If Google had tons of GPUs they would have a GPU offering in their cloud services (e.g. to match Amazon’s) which AFAIK they don’t.

Also can we refer to GPUs as “specialized hardware” in the context of machine learning or bitcoin mining when 1) they’re so widely available and 2) they’re being used for a purpose for which they were not originally intended?

I admit in 2) that i don’t know how GPUs are used in this context, but i’m guessing that it’s not to render graphics.

In that sense, a hypothetical data-center filled with printers for the single purpose of PostScript processing is also using “specialized hardware”. It’s a bit misleading.

Is there a better example of truly specialized hardware operating at a large scale?

183. D-Wave van Google nog ver weg van kwantum'wonder' - Geleerd uitschotGeleerd uitschot Says:

[…] computer is maar een rekenmachine, een snelle, maar toch een heel beperkte machine. Dat stellen Scott Aaronson van het befaamde MIT in de VS en Matthias Troyer van de niet minder befaamde ETH in Zürich in […]

184. Mark Says:

Google does use lots of GPUs and is going to rely on them even more as they use more and more ML in their products.

It is probably sufficient for Google that there is a 2x speedup over their current hardware on ML optimization jobs to make it a reasonable investment.

Google employs top economists and they surely know what is the cost-benefit for these kinds of investments. These are of course bets, sometimes risky ones, but their potential is also huge.

From Scott’s Q&A and his post I get the impression that their issue is with “dirtiness” of the way D-Wave tries to do quantum computation. Anyone with some experience in the industry knows that it is not uncommon to sacrifice the theoretical cleanness to get something working even if it is full of engineering hacks.

185. Scott Says:

Mark #184: Then you haven’t understood the point at all. The issue we care about is not “theoretical cleanliness” for its own sake. It’s that, if your qubits aren’t much cleaner than D-Wave’s are now, then as far as we know today you will not see a speedup over the best classical algorithms! And that expectation is consistent with all the data we’ve seen so far from experiments on the D-Wave machine. And if there’s no speedup, then there’s no practical reason to use a QC at all rather than a classical computer. There might be scientific reasons, but to go that way, you’d have to concede that current QC experiments should be evaluated as basic science rather than as business propositions—which was the skeptics’ point all along!

186. Henning Dekant Says:

Scott #186:

“And if there’s no speedup, then there’s no practical reason to use a QC at all rather than a classical computer.”

That is unless a QA machines could establish and grow a conventional polynominal speed-up by outracing Moore’s law -which after all is getting a bit long in the tooth.

Boring in comparison to real quantum speed-up, but potentially still profitable.

187. Wojciech Burkot Says:

Scott,
Great job with the blog.

It digresses towards the end and switches to Polish for last few posts but first of all I do not believe it adds much more than what you and people in comments say, so I am okay if you moderate it away.

188. On DWave’s latest progress | Edinburgh Mostly Quantum Lab Says:

[…] Google’s joint announcement with DWave has hit the airwaves and just as with the three previous generations DWave is yet again claiming fantastic speedups of their quantum annealing machine over classical algorithms. The good news here is that the scientific manuscript that underpins the press release is far more upfront about the caveats this time around, and just as before the caveat is that the speedup all but disappears when the DWave machine is compared to algorithms optimised for the specific problem that the DWave machine solves. All of this is explained in much detail by retired DWave chief critic Scott Aaronson in a blog post and a Q&A here. […]

189. Richard Cleve Says:

Hi Scott. In your Dec 11 Q&A, you wrote that “In quantum mechanics, we know that particles can tunnel through barriers. (This is the language that the physicists use, which is a little bit misleading.)”. Can you explain why?

190. Scott Says:

Richard #189: Sure. I find it misleading because no one ever describes a classical thermal process like simulated annealing as mysteriously “tunneling through” a barrier—they just describe it as jiggling around randomly, which of course sometimes lets you get up and over the barrier. And I see quantum annealing as not fundamentally different from this—it’s just that, because of destructive interference between the different paths that lead “back down” the barrier, a quantum particle can sometimes get “up and over” a barrier exponentially faster than a classical stochastic particle would. Or not; it all depends on the height and width of the barrier, as Farhi, Goldstone, and Gutmann showed. But the language of “tunneling” makes it sounds as if the barrier width wouldn’t even be relevant—like the quantum particle, being magical and spooky, would just instantly “tunnel through” to the global minimum! Like, obviously the physicists know better, and I even use the word “tunneling” myself when talking to physicists. But I also have the blessing and the curse of being hypersensitive to how journalists and laypeople misinterpret these things.

191. Henning Dekant Says:

Scott #191, you must just *love* an article on tunneling like that one 🙂

192. Marnie Dunsmore Says:

@Scott #190

“a quantum particle can sometimes get “up and over” a barrier exponentially faster than a classical stochastic particle would. Or not; it all depends on the height and width of the barrier, as Farhi, Goldstone, and Gutmann showed. ”

Doesn’t quantum mechanics say that with small *probability*, that particles can cross a potential barrier, which is not predicted by classical physics?

Or is there some other aspect of quantum tunneling (described by Farhi, Goldstone, and Gutmann ?) that says that quantum particles also travel *faster* across a potential barrier, than predicted by classical physics?

193. Scott Says:

Marnie #192: I’m sure an actual physicist could weigh in and provide more insight, but my understanding is that the notion that it’s impossible for a classical particle to make it over the barrier is an artifact of comparing quantum mechanics only against deterministic classical theories (and not just any deterministic theories, but specific ones, with nothing to privilege them other than the fact that people believed them before 1925!). For me, one of the great lessons of quantum computing is that the better comparison is always quantum mechanics versus classical probabilistic theories—which, in this case, would naturally correspond to a particle at nonzero temperature. And such a particle has some nonzero probability of making it up over an arbitrarily large barrier, just because of thermal fluctuations! This means that the only differences between quantum tunneling and “classical tunneling” are going to be quantitative: they’ll have to do with (e.g.) the probability of tunneling vs. reflection, and with the tunneling time. And indeed all those quantitative things can be different in the quantum case, as the Farhi et al. example shows (but physicists had many examples before that, not for optimization algorithms but for the infinitely more confusing world of actual physical systems).

194. Marnie Dunsmore Says:

@Scott #193

OK. Thank you. That’s a very good explanation. I agree: Comparison should be against classical probabilistic physics. I suppose a term such as probabilistic barrier penetration could be used instead of “tunneling”.

I’ll have to read the Farhi paper.

195. Wojciech Burkot Says:

@Marnie 192 @Scott #193
Splitting the hair further:

“..better comparison is always quantum mechanics versus classical probabilistic theories—which, in this case, would naturally correspond to a particle at nonzero temperature..”
For a physicist, it is difficult to speak about a temperature of a single particle.
Yet, tunneling, understood as a particular form of a solution to Schoredinger equation in a presence of a potential barrier higher than available energy of a particle, can be seen in this case, too, despite the fluctuations of energy being by no means thermal.

Whether we say it is tunneling through or we say that because of Heisnenberg uncertainity principle (this one subject to recursive hair splitting, if needed) we get or get not sufficient \delta E to climb over the barrier for some \delta t the particle spends in the vincinity, is just a matter of personal preference in choosing words to describe that picture:
https://en.wikipedia.org/wiki/Quantum_tunnelling#/media/File:Quantum_Tunnelling_animation.gif (note that periodicity is an artefact of a gif animation, as infinite barriers at -10 and 10, directing the particle back to the center would result in discrete energy spectrum)

196. Chris Says:

Hi Scott #102,

“[John Martinis] told me that some of the engineering D-Wave had done (e.g., just figuring out how to integrate hundreds of superconducting qubits while still having lines into them to control the couplings) would be useful to his group. That’s one of the main things that caused me to moderate a bit […]”

I would like to add a bit of a physics perspective on this (largely for readers with a more computer science based background). Working on the scalability of physical systems with the view to making them into fully fledged quantum computing devices has been a major research direction ever since DiVincenzo laid out his famous set of criteria [1]. His first criterion is “A scalable physical system with well-defined qubits” and there are only five* criteria in total. Important quantum computing roadmaps were devised with his criteria being extremely central to them [2].

Different physical systems initially met the different criteria to different extents. I think a good example with regards to the original post is cold atoms and ion traps. The former aren’t as clean as the latter but it has been argued that they could scale up more easily. Irregardless, both research communities worked on both fronts: some people in the ion trap community have focused on making their qubits even better and cleaner; others have focused on scaling an ordinary ion trap into a larger, multi-zone ion trap; meanwhile some cold atom researchers have been making their atomic qubits better and others have been working on aspects related to how to address individual atomic qubits when they are trapped in huge arrays.

I think this shows that D-wave aren’t unique in working on scalability and hopefully, it also shows that quantum computer researchers have had for a long time a clear idea of what they want to accomplish.

————————-
* = Sometimes people say “five plus two” criteria because the first five criteria about getting a single device to perform a quantum computation and the next two are about linking these devices together.

197. Aula Says:

Scott #193: I think the crucial difference (at least for most physicists) is that a classical particle with constant total energy can’t possibly enter the region where it would have to have negative kinetic energy to offset the large potential energy, while a quantum particle can easily do exactly that (I think it’s even possible to have a situation where a quantum particle trapped in a finite potential well has an almost 50% probability of being in the “forbidden” region). Of course, if you allow the particle to change its total energy through interactions, then there’s hardly any qualitative difference left between the classical and quantum worlds, but then again, in that case it’s not really correct to speak of “tunneling” even in the quantum world.

198. Daniel Says:

Scott #193, Aula #197

Just to expand a little on Aula’s comment, even if you’re comparing with a classical *probabilistic* theory, where you have some ensemble of particles with some distribution in energy, I think you would still only observe particles making it to the other side if the distribution had nonzero support on particles having energy that high. If you had a Gaussian distribution, then it would only be a matter of which fraction of the particles has enough energy to go over. OTOH, if you had e.g. a square distribution, or some sort of cutoff on energies higher than the barrier, then none of particles would get through (even if you prefere to talk about probabilistic classical mechanics without reference to an ensemble, I guess this would remain true).

However, I admit that off the top of my head I’m not sure what wavepacket for a quantum-mechanical particle would make for the closest analogue to that, and what would happen with its tunneling properties, but my intuition would be that that would lead to tunneling as a more authentic quantum-mechanical process.

199. Chris Says:

Aula #197:

It is then interesting to think about what it means to have a negative kinetic energy and what happens when we try to measure the particle in the classically forbidden region.

To be fairly sure that we are measuring the particle in the classically forbidden region, we need to determine its position such that the uncertainty in the position is small (i.e. less than the size of the forbidden zone). Heisenberg’s uncertainty principle tells us that the uncertainty in momentum is therefore large and because kinetic energy is proportional to momentum squared, the uncertainty in kinetic energy is also large. I hope this sort of gives an insight into what is going on.

200. Scott Says:

Wojciech, Aula, Chris, Daniel: Thanks for the comments.

Firstly, in the classical probabilistic case, rather than thinking about the “temperature” of an individual particle, we should either think of a particle immersed in a bath (so, getting randomly jostled around at some rate), or else we should remember that, when we talk about optimization, we’re really interested in a “particle” in a high-dimensional configuration space—so, say, n coupled spins in real space.

Secondly, wouldn’t the natural classical probabilistic analogue of a wavepacket, be a “particle” (or, again, really n particles) with a certain average energy, but whose energy can sometimes become much larger because of thermal fluctuations, thereby letting it get over the barrier?

201. Daniel Says:

Chris #199

Well, but you can talk about tunnelling without having to ‘measure the particle in the forbidden zone’, right? You can just do it in a scattering formalism, where your particle is sent from far away, with a momentum distribution very peaked around some value, and just ask what is the probability that it was transmitted vs. reflected by the barrier, without having to worry about localizing inside the forbidden region.

Scott #200

Well, if you’re really describing your classical theory by a thermal bath, with some distribution (say, Gaussian) around an average value, then there’s always some tiny probability that a few of the particles will have high enough energy to go over the barrier. But suppose that your classical energy distribution had some hard cutoff value – there’s no probability that your particles can have energy larger than a given value, maybe because that’s higher than the total energy of the N particles in the ensemble combined, or something relativistic consideration. That’s a well-defined physical situation where, I believe, classical mechanics wouldn’t have tunnelling but quantum mechanics would (although if your cutoff is due to relativistic constraints, it might make sense to say that such a high barrier is also unphysical for some reason).

Maybe that’s just a comparison with the “wrong” classical probabilistic theory, though. And I certainly wasn’t trying to connect it to what happens in optimization. I’m just trying, myself, to understand how much of tunnelling actually comes from “probabilistic” and how much comes from “quantum”.

202. Wojciech Burkot Says:

@Scott #200 So recursive hair splitting is on :). In the scenarios you describe, for one reason or another the energy E_0 of the “particle” traversing the barrier is higher than the barrier potential V. Then, in the barier region, if it is wide enough to ignore edge effects, the “particle” is quasi free with E = E_0-V. As a consequence, the transmission probability in such a case could not possibly depend on the barrier width, unless I am missing something obvious.

In tunneling case, the particle transmission through a barrier is exponentially attenuated, with dampening constant proportional to (V-E_0)^(-1/2). This dampening brings an observable dependency of a transmission on the barrier width. Kinda like “get out of jail free” another, non-thermal way of getting out of a potential well.

203. Marnie Dunsmore Says:

A cool paper:

Quantum mechanical versus semiclassical tunneling and decay times
Shegelski, Kavka, and Hnybida

“There are many physical systems in condensed matter physics, atomic physics, and physical chemistry with finite lifetimes. In some cases, the lifetime calculated using the WKB approximation is in good agreement with experiment. Familiar examples are alpha decay, electron emission from
the surface of cold metals induced by external electric fields,
and ionization of atoms in strong electric fields. The latter is similar to alpha decay in its formulation.”

“The semiclassical approach regards the particle as bouncing
back and forth inside a potential well and attempting to
escape each time it encounters the barrier. We show in this
paper that a fully quantum mechanical treatment is accessible at the undergraduate level. The pedagogical advantage of this treatment is that students see the striking differences between the actual behavior and the picture given by the semiclassical approach. In particular, the presence of bound states within the potential well is included in the quantum mechanical treatment, but is ignored in the semiclassical approach, leading to much different results.”

204. Alex Selby Says:

A few comments on the Google paper (apologies for the long post):

It is acknowledged in the paper (section IIC) that D-Wave operating on the weak-strong problem class does not outperform the best classical algorithms. Out of interest, I tried my program on the weak-strong instances from fig. 3 and found a gentle scaling in terms of problem size and problem difficulty. At 300 qubits it was about 20x slower than D-Wave, and by 950 qubits it was about 3x faster for 50% hardness problems and about 600x faster for 85% hardness problems.

For what it’s worth, this is not the best classical performance possible. The weak-strong problem class semi-collapses with the couplings chosen, because it is easy to show that the minimum energy is always attained at a state with spins pointing the same way within each cell. That means you can treat each cell as a single binary variable, and a classical solver that actually exploited this would be a lot faster, able to solve the hardest case of the biggest instance given in comfortably less than a millisecond. The corresponding scaling behaviour in fig. 4 would be very gentle – not much more than linear over the range shown. It would take maybe 100,000 qubits to generate a hardish (requiring seconds to solve) problem of this form.

But this is not a surprise to anyone (though it puts some newspaper reports of D-Wave being 10^8 times faster than your PC into some sort of context). Their point was to compare with Simulated Annealing, not the best classical solver.

Section IIIA on Number Partitioning is more problematic, I think, because of the claim (in the abstract and Table 1 note) that for random instances of the number partitioning problem, QA scales better than the best known classical algorithm. For the best known classical algorithm, they are taking the complete Karmarkar-Karp (CKK) algorithm which they say (from experiment) has a running time of 2^(0.87n). This is compared with 2^(0.8n) for QA et al.

There is a problem here because CKK is a long way from being the best classical algorithm. For example, the Schroeppel-Shamir algorithm has a running time of essentially 2^(0.5n). It also uses 2^(0.25n) units of memory, but this space cost is always going to be negligible when comparing with a 2^(0.8n) algorithm. (Schroeppel-Shamir is itself not the state-of-the-art, but that’s another story.)

Of course in both cases (weak-strong clusters and number partitioning) it is perfectly valid to compare with Simulated Annealing for the purposes of probing to what extent D-Wave devices are using tunnelling, and so on, and the results here are very interesting in this respect and I think basic research like this is to be applauded, but comparison with the best known classical algorithms, on the evidence here, seems very much to favour the classical in both cases.

I’m not sure about the characterisation in section VI of Simulated Annealing as the benchmark generic classical algorithm, their point being that better classical schemes require individual crafting. For one thing, SA often has lots of parameters to tune to get it to behave well, which is a kind of crafting. Secondly, is it so wrong to design an algorithm for a class of problems? I think that’s what makes the comparison of quantum with classical algorithms interesting. Quantum computing potentially gives an enormous speed advantage but only in very restricted settings, which forces you to compute in certain ways. Classical algorithms plod along, but allow you the great flexibility of doing anything you want. If it is felt necessary to use something like SA as a benchmark algorithm, then why not use Parallel Tempering? It’s going to be applicable in similar settings, it is physically meaningful and as far as I know it always performs better, sometimes much better.

I don’t completely agree with section IIC. They say that my program replaces 8 qubit clusters with a single large spin and it is this that gives it its advantage. But it only uses 4 qubit clusters, not 8, so can’t immediately set the state of a whole cell to the correct value, and in any case I don’t think this is really point of the method. I believe its benefit comes from using low treewidth subgraphs, which allow large (and “thick”) neighbourhood searches, in combination with suitable perturbations of the state. At the moment it typically uses treewidth 1 or 2 subgraphs of the quotient graph of 4-qubit clusters. If it used single qubits instead of clusters of 4 qubits, then it could instead use treewidth 4-8 subgraphs, which should end up being similar. (I’m not sure there is a well-defined distinction.)

This section also says they don’t expect this method to work well with the 10^3 or 2^10 cubic lattices. It’s possible this is correct, but I don’t see a reason to believe that it wouldn’t continue to be effective. For example in the 2^10 lattice, there are very large low-treewidth subgraphs. The classic snake-in-a-box gives a pathwidth 1 subgraph of size something like 0.3*2^10. So moving up from pathwidth 1 to treewidth 1 and then to treewidth 6 or so, which is a reasonable size to exhaust, would give something even bigger and even more thickly connected. Of course it is true that this doesn’t itself prove anything – in the end it would just have to be tried. The moral may be that you need surprisingly high connectivity to kill off useful low-treewidth subgraphs. (The method certainly wouldn’t work well in a complete graph, for example.)

Incidentally, you can also use the same treewidth method to do QMC which is quite an efficient way of implementing it.

205. Chris Bowen Says:

Fun fact regarding tunneling. Wherever metal meets semiconductor in the CMOS device enabling you to read this comment, the classical barrier is about 10 kT. Without tunneling, none of this would be possible.

206. Paul Says:

@Chris Bowen #205:

Finally, a comment which caught my attention on this thread!

Do you really mean 10 kilotons of TNT energy equivalent for a barrier hight? Can not possibly be right…

207. Qal Says:

Isn’t the main question still whether or not there is entanglement among qubits across the chip rather than within only an 8-qubit cell cluster? If so, wouldn’t good experiments be to reproduce similar results of whatever proof-of-entanglement experiments were done on the 8-qubit cell cluster some years ago (thinking of the MRT one for example) instead with a 4×4 cluster chip (1024 qubits, suitable for D-Wave 2), where the clusters would have strong intraconnectivity and their 4×4 interconnectivity would be K4,4? (simulate a D-wave chip cell’s quantum behavior with the entire chip)

208. Qal Says:

Sorry that’s 512 qubits not 1024. Actually, the 4×4 super(cluster)qubit/K4,4 simulation experiments could even have been done on the D-Wave 1 chips (if they had sufficient qubit yields) but much more likely so on the 512-qubit D-Wave 2 chips.

Paul #206: no, Chris #205 means k (i.e. k_B, Boltzmann’s constant) times T, absolute temperature. Since the devices he mentions are at room temperature, 10kT is about a quarter of an electron volt of energy.

210. Marnie Dunsmore Says:

@Nick and @Chris,

Thanks for the reminder! : )

And this metal meets Si “tunneling” occurs at all the ohmic contacts in the CMOS device: gate, source, drain and bulk.

I still have this metal image of being taught tunneling, and remember thinking about the renegade electrons getting out of their quantum well and sneaking over the energy barrier.

211. Rubbernecker Says:

Thanks, Scott, for providing a forum for such an informed discussion of the DWave device.

Some relatively uninformed comments to which I welcome correction:

(1) “Tunnelling” of a particle ??? Bah, humbug, there are only waves (see QFT).

(2) Thanks Jeremy for pointing out the obvious engineering advantages to DWave’s approach. Let someone else design a perfect qubit, and then you can license it.

(3) Semiconductor process scaling will also inevitably reduce the qubit energies, and decrease the physical dimensions and perhaps the energy barriers. More useful entanglement?

(4) Any fabricated device will be imperfect, and the physical geometry will never be perfectly symmetrical. Presumably, quantum probability waves will be reflected imperfectly. Does that have any impact on implementing Shor’s algorithm which relies on cancellation to suppress non-solutions?

Happy holidays everyone!

212. Scott Says:

Rubbernecker #211:

Regarding (2)—at risk of repeating myself hoarse, no one is hoping for a “perfect qubit,” just a qubit that’s good enough to support an honest asymptotic speedup. The issue with D-Wave’s qubits is not that they’re imperfect; it’s that they might not be good enough to support any speedup at all over the best known classical algorithms.

Regarding (4), the answer to your question is the entire theory of quantum fault-tolerance! It was proven around 1996 that, in principle, you can do an arbitrarily reliable universal quantum computation even with imperfect components, just as von Neumann proved in the 1950s that you can do reliable classical computation even with imperfect components. All the experimental work in quantum computing since then has been working toward the goal of realizing that discovery (since again, absolutely no one expects perfect components).

213. Charles H. Bennett Says:

Selby’s algorithm avoids the bottleneck of individual spin flips in a manner reminiscent of the classical Swendsen-Wang and Wolff algorithms, which flip clusters of spins instead of individual spins, thereby reducing critical slowing down in Monte Carlo simulations of Ising and Potts models.

214. gentzen Says:

Just wanted to say how impressed I was after reading your Q&A in MIT news. I was impressed by your honest and immediate support of the work of Alex Selby, and I was even more impressed by your work with Alex Arkhipov on the computational complexity of linear optics. I heard about your BossonSampling fame before, but I didn’t realize that it included the description of a realistic experiment that could convincingly prove that quantum effects allow to perform otherwise intractable calculations.

215. Links for December 2015 - foreXiv Says:

[…] Aronson resumes his role as chief explainer of D-wave on his blog and the MIT news […]

216. Post-quan-tum key agree-ment-IT大道 Says:

[…] large quantum computers can be built ( and not of the D-Wave variety ) then they’ll make quite a mess of public-key cryptography. RSA and systems based on […]

217. Nameof Thefrog Says:

A D-Wave I would not call a dirty approach, but something, and may it be “Quantum Anneahealer” and beats simulated, and beats that it compared unfair, and beats to that it kind of cheats, but that doesn’t make it a Quantum Computer, and should not define if it is something less or bad. It’s cheating because it fills a gap, and take the lead with something else. It could be just what it is, and still would not be less.

218. Blasts From the Past | Gödel's Lost Letter and P=NP Says:

[…] claiming a speedup over classical methods. See also this interview with Scott Aaronson and a comment with remarks on the paper in Scott’s own post on it. We have been intending for a long time […]

219. Commercial quantum computer works, sort of – MyPhysNet Says:

[…] are some important caveats, as MIT quantum-computer expert Scott Aaronson points out on his blog Shtetl-Optimized. It seems, for example, that there is another classical algorithm that can solve these problems […]

220. Blasts From the Past | TRIFORCE STUDIO Says:

[…] claiming a speedup over classical methods. See also this interview with Scott Aaronson and a comment with remarks on the paper in Scott’s own post on it. We have been intending for a long time to […]

221. The State of Quantum Computing Today | wrLapinsky's Blog Says:

[…] the availability of the D-Wave 2X system with more than 1,000 qubits. On the other hand, there are lots of skeptics about whether what D-Wave is creating is really a quantum computer. It clearly uses […]

222. Buzz : Google est encore loin de l'ordinateur quantique miracle (MAJ) – Sciences et technologies Says:

[…] Surtout, comme le rappellent les experts des ordinateurs et des calculs quantiques que sont Scott Aaronson et Matthias Troyer, et comme ne le cachent d’ailleurs pas les chercheurs de Google, la […]