Turn down the quantum volume

Several people asked me to comment on the recent announcement by Honeywell that they’ll soon have what they call “the most powerful” quantum computer (see here for press release, here for Forbes article, here for paper).

I’m glad that Honeywell, which many people might know as an air-conditioner manufacturer, has entered the race for trapped-ion QC. I wish them success. I’ve known about what they were doing in part because Drew Potter, my friend and colleague in UT Austin’s physics department, took a one-year leave from UT to contribute to their effort.

Here I wanted to comment about one detail in Honeywell’s announcement: namely, the huge emphasis on “quantum volume” as the central metric for judging quantum computing progress, and the basis for calling their own planned device the “most powerful.” One journalist asked me to explain why quantum volume is such an important measure. I had to give her an honest answer: I don’t know whether it is.

Quantum volume was invented a few years ago by a group at IBM. According to one of their papers, it can be defined roughly as 2^k, where k is the largest number such that you can run a k-qubit random quantum circuit, with depth k and with any-to-any connectivity, and have at least (say) 2/3 probability of measuring an answer that passes some statistical test. (In the paper, they use what Lijie Chen and I named Heavy Output Generation, though Google’s Linear Cross-Entropy Benchmark is similar.)
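
For concreteness, here’s a quick toy calculation of the ideal heavy-output probability for such circuits, via brute-force statevector simulation (my own throwaway sketch, not IBM’s benchmarking code; the real protocol also runs the circuits on hardware and checks how often the measured bitstrings land in the ideal heavy set):

```python
# Toy version of the ideal heavy-output calculation behind quantum volume.
# Build a depth-k random circuit on k qubits with any-to-any connectivity,
# then measure what fraction of the ideal output probability mass is "heavy",
# i.e. lands on bitstrings whose probability exceeds the median.
import numpy as np

def haar_2q_unitary(rng):
    """Haar-random 4x4 unitary (QR of a complex Gaussian, with phase correction)."""
    z = (rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

def apply_2q_gate(state, gate, q1, q2, k):
    """Apply a 2-qubit gate to qubits q1, q2 of a k-qubit statevector."""
    psi = np.moveaxis(state.reshape([2] * k), (q1, q2), (0, 1)).reshape(4, -1)
    psi = (gate @ psi).reshape([2, 2] + [2] * (k - 2))
    return np.moveaxis(psi, (0, 1), (q1, q2)).reshape(-1)

def heavy_output_probability(k, rng):
    """Ideal heavy-output probability of one k-by-k random circuit."""
    state = np.zeros(2 ** k, dtype=complex)
    state[0] = 1.0
    for _ in range(k):                        # k layers...
        perm = rng.permutation(k)             # ...each pairing up qubits arbitrarily
        for i in range(0, k - 1, 2):
            state = apply_2q_gate(state, haar_2q_unitary(rng), perm[i], perm[i + 1], k)
    probs = np.abs(state) ** 2
    return probs[probs > np.median(probs)].sum()

rng = np.random.default_rng(0)
k = 6
hop = np.mean([heavy_output_probability(k, rng) for _ in range(20)])
print(f"ideal heavy-output probability for k={k}: {hop:.3f}")   # roughly 0.85; the pass threshold is 2/3
```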

I don’t know why IBM takes the “volume” to be 2^k rather than k itself. Leaving that aside, though, the idea was to invent a single “goodness measure” for quantum computers that can’t be gamed either by building a huge number of qubits that don’t maintain nearly enough coherence (what one might call “the D-Wave approach”), or by building just one perfect qubit, or by building qubits that behave well in isolation but don’t interact easily. Note that the any-to-any connectivity requirement makes things harder for architectures with nearest-neighbor interactions only, like the 2D superconducting chips being built by Google, Rigetti, or IBM itself.

You know the notion of a researcher’s h-index—defined as the largest h such that she’s published h papers that garnered h citations each? Quantum volume is basically an h-index for quantum computers. It’s an attempt to take several different yardsticks of experimental progress, none terribly useful in isolation, and combine them into one “consumer index.”
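
In code, the h-index definition fits in a few lines (a throwaway illustration, nothing more):

```python
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    counts = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(counts, start=1) if c >= rank)

print(h_index([10, 8, 5, 4, 3]))  # 4: four papers each cited at least 4 times
```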

Certainly I sympathize with the goal of broadening people’s focus beyond the “but how many qubits does it have?” question—since the answer to that question is meaningless without further information about what the qubits can do. From that standpoint, quantum volume seems like a clear step in the right direction.

Alas, Goodhart’s Law states that “as soon as a measure becomes a target, it ceases to be a good measure.” That happened years ago with the h-index, which now regularly pollutes academic hiring and promotion decisions, to the point where its inventor expressed regrets. Quantum volume is now looking to me like another example of Goodhart’s Law at work.

The position of Honeywell’s PR seems to be that, if they can build a device that can apply 6 layers of gates to 6 qubits, with full connectivity and good fidelity, that will then count as “the world’s most powerful quantum computer,” since it will have the largest volume. One problem here is that such a device could be simulated by maintaining a vector of only 2^6=64 amplitudes. This is nowhere near quantum supremacy (i.e., beating classical computers at some well-defined task), which is a necessary though not sufficient condition for doing anything useful.
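
To make the “64 amplitudes” point concrete, here’s the back-of-the-envelope arithmetic for brute-force statevector simulation (just arithmetic, nothing device-specific):

```python
# Memory cost of brute-force statevector simulation:
# n qubits need 2**n complex amplitudes, at 16 bytes each in double precision.
for n in (6, 20, 40, 50):
    n_amplitudes = 2 ** n
    gib = n_amplitudes * 16 / 2 ** 30
    print(f"{n:2d} qubits: {n_amplitudes} amplitudes, about {gib:.3g} GiB")
```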

Think of a university that achieves an average faculty-to-student ratio of infinity by holding one class with zero students. It gets the “best score” only by exploiting an obvious defect in the scoring system.

So what’s the alternative? The policy I prefer is simply to tell the world all your system specs, as clearly as you can, with no attempts made to bury the lede. How many qubits do you have? With what coherence times? With what connectivity? What are the 1- and 2-qubit gate fidelities? What depth of circuit can you do? What resources do the standard classical algorithms need to simulate your system? Most importantly: what’s the main drawback of your system, the spec that’s the worst, the one you most need to improve? What prevents you from having a scalable quantum computer right now? And are you going to tell me, or will you make me scour Appendix III.B in your paper, or worse yet, ask one of your competitors?

I confess that the answers to the above questions are hard to summarize in a single number (unless you, like, concatenated binary encodings of them or something). But they can be ineffably combined, to produce a progress metric that one of my postdocs suggested calling “quantum scottness,” and which roughly equals the number of expressions of wide-eyed surprise minus the number of groans.

37 Responses to “Turn down the quantum volume”

  1. squinky Says:

    Thanks for that. I think the import is whether a trapped-ion device can do QC and be commercial at all. That would add weight to the possibility that there will be more distinct pathways in the main sequence of QC developments. So even considering all you say, for me the story is still a 1 and not a zero in that regard.

    As long as I’m on the horn I expect you’ll be seeing a lot of this next item soon. https://ia.acs.org.au/article/2020/today-s-encryption-is-already-broken.html
    It is not about a computation-theoretic breakthrough (no one is claiming it is, so there is nothing to rebut) but rather it points to the possibility that Nvidia and the like can get a large advantage from quantum-inspired algorithms. Supposedly much larger than I would have thought, though more than one person I know has been warning over the last year that this could happen. Caveat – I can’t really verify the story is true, but it’s an ACS-affiliated news source.

  2. New top story on Hacker News: Turn Down the Quantum Volume – News about world Says:

    […] Turn Down the Quantum Volume 3 by furcyd | 0 comments on Hacker News. […]

  3. Anonimouse Says:

    Quantum Scottness? Is this in reference to a particular person and/or incident?

  4. asdf Says:

    Scott did you just add a lot more social media tracking beacons to your blog? It might also be because of my own recent browser swap, but I just had to spend around 5 minutes updating my adblock rules to block out the beacons (facebook, twitter, google one+, plus the specificfeed button that wasn’t an actual transclude, and maybe some other stuff). I feel like I need a shower now ;-P.

  5. asdf Says:

    By the way, does anyone know if there is any news about vetting MIP*=RE? Is the paper even submitted anywhere? Last time I looked, the arxiv page didn’t say anything about that.

  6. Job Says:

    The position of Honeywell’s PR seems to be that, if they can build a device that can apply 6 layers of gates to 6 qubits, with full connectivity and good fidelity, that will then count as “the world’s most powerful quantum computer,” since it will have the largest volume. One problem here is that such a device could be simulated by maintaining a vector of only 2^6=64 amplitudes. This is nowhere near quantum supremacy (i.e., beating classical computers at some well-defined task), which is a necessary though not sufficient condition for doing anything useful.

    I’m not sure i agree.

    I think connectivity, depth and fidelity are far more important than any supremacy result that’s associated with the hardware.

    The reason is that you can get a supremacy result even with a specialized and highly-constrained non-computer.

    In fact, it’s easier. The more constrained it is, the easier it is to establish supremacy over classical machines, in practical terms – imagine having to repeat Google’s experiment with full connectivity.

    Plus, even a 40 qubit machine with 40 layers, full connectivity and perfect fidelity falls short of a supremacy result, and it would undoubtedly be the world’s most powerful QC.

    IMO the quantum volume metric is actually the one that’s least susceptible to hype and exaggerated claims.

  7. Scott Says:

    Anonimouse #3: I assumed the person to whom it was in reference was I… 🙂

  8. Scott Says:

    asdf #4: I didn’t change a thing, but the little tracking buttons aren’t loading for me anymore! So clearly something happened. Could this blog be under attack? If it is, does it have any loyal troops or defenders? 🙂

  9. Scott Says:

    asdf #5: I know that it’s in the process of being vetted, including by reading groups that formed for that purpose. I wouldn’t expect a verdict anytime soon. (I’m now wondering whether the coronavirus will make it go faster or slower…)

  10. Scott Says:

    Job #6: OK, I hope you’re right! The fact that I like what it’s trying to capture is exactly why I didn’t come down unequivocally against QV in this post. It’s just that, mindful of Goodhart’s Law, I refuse to cede my judgment about what is or isn’t interesting to any invented metric.

  11. Will Says:

    Scott #8 I think you can see my email? I took a look at a few things, send me an email, I will offer any help to keep your blog safe!

  12. Jim Says:

    Don’t we already have the tools to take a stab at coming up with reasonable metrics for connectivity, noise, and other factors that we know are important for quantum computers?

    With a large dose of handwaving, what about a metric like:

    [# of qubits] * [Tree width of the qubit graph] * depth ?

    Tree width, because we don’t want to ‘score’ two N-qubit computers strung together with a single wire the same as a 2N-qubit machine.
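
    A rough sketch of what I mean, using networkx’s treewidth heuristic (exact treewidth is NP-hard to compute; the coupling graphs below are made up just for illustration):

    ```python
    # Rough sketch of the proposed score: qubits * (approximate) treewidth * depth.
    import networkx as nx
    from networkx.algorithms.approximation import treewidth_min_degree

    def handwavy_score(coupling_graph, depth):
        n_qubits = coupling_graph.number_of_nodes()
        tw, _decomposition = treewidth_min_degree(coupling_graph)
        return n_qubits * tw * depth

    grid = nx.grid_2d_graph(6, 6)          # 36 qubits, nearest-neighbor grid
    full = nx.complete_graph(36)           # 36 qubits, any-to-any connectivity
    two_blobs = nx.union(nx.complete_graph(range(18)),
                         nx.complete_graph(range(18, 36)))
    two_blobs.add_edge(0, 18)              # two 18-qubit blocks joined by one wire

    for name, g in [("2D grid", grid), ("fully connected", full), ("two blobs", two_blobs)]:
        print(name, handwavy_score(g, depth=36))
    ```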

  13. Gerard Says:

    squinky #1

    That article makes very little sense to me. Surely a classical computer (which a GPU is, just with different architectural choices) running a quantum simulator can’t factor more efficiently than the current best factoring algorithms. If someone is making that claim I’d expect to see information on how large the numbers are they are factoring and in how much time, and of course ideally their code, not talk about hacking wifi networks.

  14. Scott Says:

    Will #11: Thanks so much for the offer! Fortunately, it looks like it’s fixed now. I had to disable the WordPress plugin (which, of course, had suddenly been discontinued with zero warning), and replace it with a successor plugin.

  15. paul Says:

    As the author of the Honeywell Forbes piece referenced by Scott, and as a long time reader of his blog, I thought I would weigh in on this thread.
    First, I always appreciate Scott’s thoughtful insights as he explains the technical aspects of issues from different points of view before rendering his opinion. In that process, he usually uncovers things I haven’t even thought about.

    For the most part, I agree with him on quantum volume, but I also agree with Rob. I would love to see a spec sheet on every QC so we could examine and determine their areas of deficiency and strength for ourselves. As an analyst, that would be golden. There is some truth to Goodhart’s Law, particularly since quantum volume has been given a lot of public exposure. I just hope it doesn’t become the point of a public relations leapfrog game among companies.

    But in one respect, too much of a good thing might not be a bad thing. Quantum volume is really the only thing about quantum that the average person can understand. It may not be perfect, but it is a single number and makes for easy comparison between very complicated systems. We have a long haul in front of us to get to our goal in quantum computing. We need the general public to stay interested and to stand behind the efforts and financial resources we’ll need to get to that goal, and the concept of quantum volume might help.

  16. Ian Hincks Says:

    In many ways I like QV, though I agree that no metric can be perfect. To my taste, it conflates pre-processing compilation tricks a bit too much with raw performance, especially in the current era where different groups have different focuses, which makes that distinction all the more important to see.

    Like the quantum supremacy experiment, it is not a scalable protocol; it requires knowing whether each of the measured bitstrings from a device is heavy or not, and therefore requires classical simulation of the random circuits. One could make the slight adjustment of sampling the 2-qubit gates from the Clifford group rather than from U(4) to make the protocol scalable. It is a safe assumption that current devices are not intrinsically better at running Clifford circuits than they are at running general circuits (aside from the unrelated overall factor of ~2 arising because one can synthesize a random 2-qubit Clifford with 1.5 CNOTs on average instead of the 3 needed for a random U(4) gate). Indeed, the quantum supremacy paper made an analogous argument with its elided circuits. However, I think switching to Cliffords causes the nice Porter-Thomas distribution to become flat or bi-valued, and so it’s not clear to me whether heavy output would still be appropriate.

    As an aside, I think QV vs logQV must have been a marketing thing. Multiplying your QV by 10 every year sounds much more compelling to the public/investors than adding 3.2 usable qubits per year.

  17. Scott Says:

    paul #15: Thanks for explaining your position. But why do you say that quantum volume is “the only thing about quantum that the average person can understand”?? In order to understand it, they’d first need to understand the various concepts that go into it—qubits, circuit depth, qubit connectivity patterns, Heavy Output Generation—and then they’d need to understand why those concepts are being combined in this way and not some other way. I don’t fully understand the latter.

    Isn’t it easier, relatively speaking, to explain to an average person more primitive concepts like “how many qubits are there,” “how many things can you do to the qubits before it all falls apart,” and “how hard would it be for a conventional computer to do the same thing”? And if you wanted to know how exciting some particular advance on these metrics was, there’d always be the time-tested method of talking to experts not involved in the work, and seeing how excited they were. This method has a built-in safeguard of being able to account for any caveats that the headline measures don’t capture.

    Or when you wrote “understand,” did you mean only in the sense that the average person “understands” an IQ score: i.e., not at all how it’s calculated or why it’s calculated that way rather than another way, but just that various experts agree that “more means you’re smarter”?

    If so, then with QV like with IQ, one next faces the question: and what if the experts don’t agree?

  18. Zalman Stern Says:

    (Disclaimer: I am a Google employee. I do not speak for my employer nor do I work on Quantum Computing.)

    The main gist of quantum volume presentations I’ve seen heavily implies that the problems one wants to solve have their own volume property and as the volume of the computing systems increases, problems of a given volume will become productively solvable using those systems. One can argue about whether a single volume number applies or if we need to at least account for all the dimensions. (If one is transporting sand, a single volume metric for a truck is probably fine. If one is moving large blocks of solid tungsten, one wants to be sure every dimension of the truck is larger than every dimension of the block, etc.) Even then, the idea is usually presented more as an abstraction rather than a quantitative metric. (It is a pretty picture on a slide, not a table of things and their quantum volumes.)

    It also seems fairly clear we’ll have to relax the strict “fully connected” measure to something that accounts for a degree of connectedness.

    However, the big question is: do we have any reason to accept this model at all? If we look at quantum algorithms, particularly those that are candidates for running in the NISQ domain, do we see a volume-like requirement in what it takes to usefully run them?

    -Z-

  19. Carey Underwood Says:

    > If so, then with QV like with IQ, one next faces the question: and what if the experts don’t agree?

    I fear that the human-default heuristic resembles “If there are a few that are using the metric, then anyone poking holes is clearly just being persnickety about it. And can therefore be safely disregarded for the purposes of a high-level view.”

    Would the exceptions be better served by careful bite-sized explanations of the subtleties? I guess that’s basically a blog. Fighting the tendency head-on feels… like an impedance mismatch, or trying to change the course of a river by paddling from a pier: you can move a few gallons for a moment and even make some waves, but…

  20. Scott Says:

    Carey #19: With IQ, at least there’s an attempt to calculate it in a principled way: namely, administer every different intelligence test that you can think of, look at the correlations between the scores, do linear algebra on the correlation matrix and extract the principal component, and then normalize the resulting “g-factor” onto a bell curve, with a mean of 100 and a standard deviation of 15.
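
    On made-up test scores, that whole pipeline is only a few lines of toy code (purely illustrative; real psychometrics involves far more care about the test battery and the norming sample):

    ```python
    # Toy version of the g-factor pipeline described above, on made-up data:
    # correlate scores across several tests, take the top principal component,
    # then rescale each person's projection to mean 100, std 15.
    import numpy as np

    rng = np.random.default_rng(0)
    n_people, n_tests = 1000, 8
    g_true = rng.normal(size=n_people)                        # hidden common factor
    scores = g_true[:, None] + 0.8 * rng.normal(size=(n_people, n_tests))  # noisy tests

    z = (scores - scores.mean(axis=0)) / scores.std(axis=0)   # standardize each test
    corr = np.corrcoef(z, rowvar=False)                       # test-by-test correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)
    g_loading = eigvecs[:, -1]                                # principal component (largest eigenvalue)
    g_loading = g_loading * np.sign(g_loading.sum())          # fix the arbitrary eigenvector sign
    g_raw = z @ g_loading                                     # each person's projection onto it
    iq = 100 + 15 * (g_raw - g_raw.mean()) / g_raw.std()      # normalize to mean 100, std 15

    print(np.corrcoef(iq, g_true)[0, 1])                      # the toy "IQ" tracks the hidden factor
    ```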

    By contrast, the definition of quantum volume seems purely ad hoc. As I said, in designing a totally ad hoc “goodness measure” for QC experiments, I think one could’ve done a lot worse! But it would be greatly preferable if the goodness measure could be derived from some principled starting point, rather than simply made up.

  21. fred Says:

    No matter how you slice it, the problem is that we’re nowhere near to doing anything remotely interesting with whatever currently tries to pass as a QC. And “quantum supremacy” right now is about having a big bundle of quantum “stuff” that can’t be simulated efficiently… okay.

    But at the same time the people in the field are super eager to take credit for the really hard work they’ve done.

    Contrast this with the very first primitive classical computer, the “Colossus Mark 1” – there was no need to come up with various oddball metrics to characterize the thing because it solved an actual real life problem that mattered: breaking the Nazi Enigma machine!

  22. Scott Says:

    fred #21: The Colossus machines were used for the Lorenz cipher, not for Enigma.

    More importantly: if classical computers didn’t already exist, if they weren’t the beneficiaries of 60 years of Moore’s Law (one of the most astonishing virtuous cycles in human history), and if QCs didn’t need to beat them in order to be interesting, then we’d be having a completely different conversation. Existing QCs all look like weaklings precisely because they’re growing up in the shadow of King Kong, and because they’ll only become useful after they overtake and defeat King Kong, at least in some battles. And King Kong is also rapidly growing.

    While this is obvious once it’s spelled out explicitly, it’s easy to generate misleading analogies by forgetting it.

  23. Neticis Says:

    What about multiplying volume by fidelity? Such metric would also show possibilities for computations, where precision is more important than in other areas.

  24. fred Says:

    Scott #22

    Forget Moore’s law. My point is that nothing currently labeled a QC can even be used to do something equivalent to breaking a WW2 level encryption.

    you wrote
    “The position of Honeywell’s PR seems to be that, if they can build a device that can apply 6 layers of gates to 6 qubits, […]. One problem here is that such a device could be simulated by maintaining a vector of only 2^6=64 amplitudes. This is nowhere near quantum supremacy ”

    Obviously you’re very focused on proving supremacy, I get it.

    But I think the error is trying to simultaneously beat classical computers at “something” while at the same time expecting something that’s programmable in a general sense.
    Now that QS is out of the way, it’s time to forget about it and focus on an architecture that can do a proper general clean quantum computation, regardless of performance/size, and can be scaled up progressively.
    You gotta learn to walk before you can run.

  25. Paul Says:

    When I used “understand” I meant it like how the general public uses the term “gas mileage” for vehicle fuel consumption.

    Almost everyone, from plumbers to neurosurgeons, uses gas mileage as a judgment factor when evaluating cars, estimating how far they can drive, budgeting, and a hundred other reasons. In truth, they know nothing about the elements that go into determining gas mileage – such as aerodynamics, coefficient of friction, thermodynamics of hydrocarbons, effects of variable velocity, and many other things. Even without this detailed knowledge, it is still a widely used and valuable metric. If you asked five experts to explain gas mileage you would probably get five dissimilar answers. Put those experts in a group discussion about it and they would likely argue about certain points. Do I care that they don’t agree? Not me. I still want to know what my gas mileage is when I drive from Austin to Phoenix.

    By the way, your postdoc has my vote for Quantum Scottness

  26. Scott Says:

    Paul #25: I don’t think that analogy works. Yes, most laypeople are surely ignorant of all the factors that go into making gas mileage better or worse. But they still know that it means (or is supposed to mean) how many miles you can expect to get per gallon. But how many laypeople could be expected to know (in a similar sense) what quantum volume means? I had to look it up again when I wrote this post.

    By contrast, if someone claimed their QC could solve some problem in 10 seconds that would take 2 years for the fastest known code running on the world’s currently largest classical supercomputer—well, the claim might be true or false, but I don’t think anyone would be terribly confused about what it meant.

  27. Job Says:

    With QV, the connectivity requirement is kind of important because k is both the width and the depth.

    In practice, depth is more important than width, for the same reason that qubit quality is more important than qubit count. And higher connectivity corresponds to increased depth.

    If you relax that requirement, then depth would need to be something like k^2 or more.

    I’m usually just interested in the size of the circuits that I would be able to move from an emulator to actual hardware.

    In a sense, that’s the effective size of the QC. And I would use the H/Toffoli gateset here since it has the least platform affinity and is a nice reality check given how expensive AND gates are on most implementations.

    If your hardware doesn’t run all 20×20 H/AND circuits with fidelity 2/3 or more then you don’t have a 20×20 QC. Fair? 🙂

  28. Lou Scheffer Says:

    Scott #20: Your idea on how to measure IQ makes sense. Has anyone actually done this? If so, I’d be very interested in the *second* principal component. Work ethic? Socioeconomic status? Curiosity? Lots of possibilities…

  29. Scott Says:

    Lou Scheffer #28: It’s not “my idea”; it’s simply how it’s been done (at least in theory) for the past century. The psychometrics literature (try googling) will have more info than you can possibly imagine about the other PCA components and so forth.

  30. Tamás V Says:

    they’ll soon have what they call “the most powerful” quantum computer

    Does anybody take such announcements seriously after Rigetti’s 128-qubit-QC-in-one-year announcement? (Does Honeywell have a crystal ball to know that nobody else will make any strides soon?)

  31. gentzen Says:

    Linkage post by David Eppstein on the inventor of the h-index expressing regrets:

    Jorge Hirsch repudiates the h-index he invented (M): “I have now come to believe that it can also fail spectacularly and have severe unintended negative consequences. I can understand how the sorcerer’s apprentice must have felt.” (This is an aside; the actual linked article is about Hirsch’s difficulty in breaching the orthodox consensus on superconductivity.)

    Now “breaching the orthodox consensus” sounds interesting, so I read it. He writes:

    I expected back then that ‘this mechanism’ would be quickly accepted … Alternatively, that somebody would prove me wrong, so I could move on. So where are we 30 years later?

    I have not moved on. I have since then published well over 100 papers on hole superconductivity …, the papers have been by and large ignored and the community is as unconvinced as it was 30 years ago (or even more) that this has anything to do with real world superconductivity.

    The part “somebody would prove me wrong, so I could move on” touched me. Not because I have an opinion on hole superconductivity, or even the ability to have an opinion (even if I were to spend time on it). But because of the cases where I could prove somebody wrong:

    I have read pages 1-6 and browsed the rest of the draft. From the results, my guess is that Theorem 11, Lemma 13, and Theorem 14 will hold water. I am skeptical about the XXX result, as you may have guessed. However, let me make it clear that I will study your draft because I expect to learn interesting ideas and concepts while studying it, not because I want to disprove your XXX result.

    I am not even sure whether I would be able to make somebody accept that his proof of some lemma or theorem is flawed, if the sole purpose of my reading his paper had been to prove him wrong. But this is only part of the truth. Often I know exactly why some proof is wrong and have already written something like “The difficulty is that “iff” instead of “if” in Corollary 11 is not justified (the proof of Lemma 10 is wrong). What happens is that …” to the author. But I am too lazy to follow up if my explanations have not yet convinced the author. And then, there are the realities and constraints of my actual life. So I have to answer:

    sadly, I didn’t find time to do anything related to logic since I last answered you. …

    However, you will probably be able to find issues yourself in the less expected parts of your paper, if you try seriously.

    But in the end, I don’t really know whether an author would benefit from being proven wrong. Russell did prove Frege wrong, and Frege accepted it. But subsequently, especially since his wife died soon after, he became depressed for many years.

  32. asdf Says:

    I didn’t realize that’s what the h-index was. I like that Yitang Zhang got a MacArthur award, various other awards, multiple tenure offers, with an h-index of 1. He has only published one paper of note, but it apparently went over pretty well ;-).

  33. Honeywell a-t-il créé l’ordinateur quantique le plus puissant du monde ? Says:

    […] complexity theory and quantum algorithms specialist Scott Aaronson in Turn down the quantum volume, published just after Honeywell’s announcement. He points out that the quantum volume that Honeywell […]

  34. Joe Shipman Says:

    There’s something a little strange here.
    Why make up a measure? Why not take a well-known algorithmic problem which QCs, while they have not achieved supremacy over classical computers, can at least DO, and ask how big an instance of it they can do?

    At this time I still get dismissive contemptuous “it’s not that simple” responses whenever I ask someone who knows quantum computation “how big a number can QCs currently factor”, but certainly that WILL be a good measure eventually. OK, I get it. Shor’s algorithm isn’t far too complicated to be implementable on QCs we can build today. But I’m not asking that. I’m just asking for ANY algorithmic problem that QCs have nontrivial power to do even if they’re far short of supremacy to classical computers or even to humans working with pencil and paper. It doesn’t have to be Shor factoring or Grover searching, just something that is framed as an algorithmic problem and which QCs can be built to tackle with nontrivial performance.

    Of course we know “generate random bits” is such a thing, so this criterion is already achieved. Call that level 0. For level 1, it has to solve a problem nontrivially harder than that.

  35. Joe Shipman Says:

    Sorry for typo above, meant to say Shor’s algorithm “is” too complicated, not “isn’t” too complicated.

  36. Andy Says:

    Dear Scott,

    I believe that the current containment measures (in particular implemented as severely as in Italy, Spain, etc.) will in the long run do more harm than the virus. For example, optimistic projections by The Guardian for Ireland indicate that strict containment measures could force as much as 25% of the entire population out of a job, at which point it is no longer a question of giving people financial support but rather of how to restart an economy that has come to a grinding halt. With such ramifications in plain sight, I conjecture that it is only a question of time before (for better or worse) we return to our usual modus operandi even if doing so will overstretch hospital capacity for a (hopefully only short) while.

    Do you agree with my assessment / prediction? If not, why?

  37. Wojtek Says:

    Joe Shipman #34: I hope you will like it – we did just what you proposed. https://arxiv.org/abs/2007.06539 (v2 in prep.; please ignore the errors for now, the numbers and graphs are unlikely to change). Since programming QCs is an art, not a random process, the correlation between the platform’s QV and the results was not 100% – the best results were on the QV 16 Vigo, rather than on the machines upgraded to QV=32. Caveat: improvements of the hardware were responsible for just a small part of the change. Implementation details matter (https://arxiv.org/abs/2006.05828). All in all, how about benchmarking the hardware on its ability to run unstructured search in a given search space with an expected number of oracle calls better than what is classically needed? Even if it can be classically modelled for some time to come.
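
    For concreteness, here is the classical-vs-Grover oracle-call budget such a benchmark would be judged against (my own back-of-the-envelope numbers, assuming a single marked item and no structure to exploit):

    ```python
    # Expected classical oracle calls vs. Grover's ~(pi/4)*sqrt(N) for unstructured
    # search with a single marked item among N = 2**n_qubits candidates.
    import math

    for n_qubits in (4, 8, 12, 16):
        N = 2 ** n_qubits
        classical_expected = (N + 1) / 2              # random checking without repetition
        grover_calls = math.floor(math.pi / 4 * math.sqrt(N))
        print(f"{n_qubits:2d} qubits (N={N:6d}): classical ~{classical_expected:8.1f}, Grover ~{grover_calls}")
    ```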