## Chinese BosonSampling experiment: the gloves are off

Two weeks ago, I blogged about the striking claim, by the group headed by Chaoyang Lu and Jianwei Pan at USTC in China, to have achieved quantum supremacy via BosonSampling with 50-70 detected photons. I also did a four-part interview on the subject with Jonathan Tennenbaum at Asia Times, and other interviews elsewhere. None of that stopped some people, who I guess didn’t google, from writing to tell me how disappointed they were by my silence!

The reality, though, is that a lot has happened since the original announcement, so it’s way past time for an update.

**I. The Quest to Spoof**

Most importantly, other groups almost immediately went to work trying to refute the quantum supremacy claim, by finding some efficient classical algorithm to spoof the reported results. It’s important to understand that this is exactly how the process is *supposed* to work: as I’ve often stressed, a quantum supremacy claim is credible only if it’s open to the community to refute and if no one can. It’s also important to understand that, for reasons we’ll go into, there’s a decent chance that people *will* succeed in simulating the new experiment classically, although they haven’t yet. All parties to the discussion agree that the new experiment is, far and away, the closest any BosonSampling experiment has ever gotten to the quantum supremacy regime; the hard part is to figure out if it’s already there.

Part of me feels guilty that, as one of reviewers on the *Science* paper—albeit, one stressed and harried by kids and covid—it’s now clear that I didn’t exercise the amount of diligence that I could have, in searching for ways to kill the new supremacy claim. But another part of me feels that, with quantum supremacy claims, much like with proposals for new cryptographic codes, vetting *can’t be the responsibility of one or two reviewers*. Instead, provided the claim is serious—as this one obviously is—the only thing to do is to get the paper out, so that the *entire community* can then work to knock it down. Communication between authors and skeptics is also a hell of a lot faster when it doesn’t need to go through a journal’s editorial system.

Not surprisingly, one skeptic of the new quantum supremacy claim is Gil Kalai, who (despite Google’s result last year, which Gil still believes must be in error) rejects the entire possibility of quantum supremacy on quasi-metaphysical grounds. But other skeptics are current and former members of the Google team, including Sergio Boixo and John Martinis! And—pause to enjoy the irony—*Gil has effectively teamed up with the Google folks* on questioning the new claim. Another central figure in the vetting effort—one from whom I’ve learned much of what I know about the relevant issues over the last week—is Dutch quantum optics professor and frequent *Shtetl-Optimized* commenter Jelmer Renema.

Without further ado, why might the new experiment, impressive though it was, be efficiently simulable classically? A central reason for concern is photon loss: as Chaoyang Lu has now explicitly confirmed (it was implicit in the paper), up to ~70% of the photons get lost on their way through the beamsplitter network, leaving only ~30% to be detected. At least with “Fock state” BosonSampling—i.e., the original kind, the kind with single-photon inputs that Alex Arkhipov and I proposed in 2011—it seems likely to me that such a loss rate would be fatal for quantum supremacy; see for example this 2019 paper by Renema, Shchesnovich, and Garcia-Patron.

Incidentally, if anything’s become clear over the last two weeks, it’s that I, the co-inventor of BosonSampling, am no longer any sort of expert on the subject’s literature!

Anyway, one source of uncertainty regarding the photon loss issue is that, as I said in my last post, the USTC experiment implemented a 2016 variant of BosonSampling called Gaussian BosonSampling (GBS)—and Jelmer tells me that the computational complexity of GBS in the presence of losses hasn’t yet been analyzed in the relevant regime, though there’s been work aiming in that direction. A second source of uncertainty is simply that the classical simulations work in a certain limit—namely, fixing the rate of noise and then letting the numbers of photons and modes go to infinity—but any real experiment has a fixed number of photons and modes (in USTC’s case, they’re ~50 and ~100 respectively). It wouldn’t do to reject USTC’s claim via a theoretical asymptotic argument that would equally well apply to *any* non-error-corrected quantum supremacy demonstration!

OK, but if an efficient classical simulation of lossy GBS experiments exists, then *what is it?* How does it work? It turns out that we have a plausible candidate for the answer to that, originating with a 2014 paper by Gil Kalai and Guy Kindler. Given a beamsplitter network, Kalai and Kindler considered an infinite hierarchy of better and better approximations to the BosonSampling distribution for that network. Roughly speaking, at the first level (k=1), one pretends that the photons are just classical distinguishable particles. At the second level (k=2), one correctly models quantum interference involving *pairs* of photons, but none of the higher-order interference. At the third level (k=3), one correctly models three-photon interference, and so on until k=n (where n is the total number of photons), when one has reproduced the original BosonSampling distribution. At least when k is small, the time needed to spoof outputs at the k^{th} level of the hierarchy should grow like n^{k}. As theoretical computer scientists, Kalai and Kindler didn’t care whether their hierarchy produced any physically realistic kind of noise, but later work, by Shchesnovich, Renema, and others, showed that (as it happens) it does.

In its original paper, the USTC team ruled out the possibility that the first, k=1 level of this hierarchy could explain its experimental results. More recently, in response to inquiries by Sergio, Gil, Jelmer, and others, Chaoyang tells me they’ve ruled out the possibility that the k=2 level can explain their results either. We’re now eagerly awaiting the answer for larger values of k.

Let me add that I owe Gil Kalai the following public *mea culpa*. While his objections to QC have often struck me as unmotivated and weird, in the case at hand, Gil’s 2014 work with Kindler is clearly helping drive the scientific discussion forward. In other words, at least with BosonSampling, it turns out that Gil put his finger precisely on a key issue. He did exactly what every QC skeptic should do, and what I’ve always implored the skeptics to do.

**II. BosonSampling vs. Random Circuit Sampling: A Tale of HOG and CHOG and LXEB**

There’s a broader question: why should skeptics of a BosonSampling experiment even have to *think* about messy details like the rate of photon losses? Why shouldn’t that be solely the *experimenters’* job?

To understand what I mean, consider the situation with Random Circuit Sampling, the task Google demonstrated last year with 53 qubits. There, the Google team simply collected the output samples and fed them into a benchmark that they called “Linear Cross-Entropy” (LXEB), closely related to what Lijie Chen and I called “Heavy Output Generation” (HOG) in a 2017 paper. With suitable normalization, an ideal quantum computer would achieve an LXEB score of 2, while classical random guessing would achieve an LXEB score of 1. Crucially, according to a 2019 result by me and Sam Gunn, under a plausible (albeit strong) complexity assumption, *no* subexponential-time classical spoofing algorithm should be able to achieve an LXEB score that’s even slightly higher than 1. In its experiment, Google reported an LXEB score of about 1.002, with a confidence interval *much* smaller than 0.002. Hence: quantum supremacy (subject to our computational assumption), with no further need to know anything about the sources of noise in Google’s chip! (More explicitly, Boixo, Smelyansky, and Neven did a calculation in 2017 to show that the Kalai-Kindler type of spoofing strategy *definitely* isn’t going to work against RCS and Linear XEB, with no computational assumption needed.)

So then why couldn’t the USTC team do something analogous with BosonSampling? Well, they tried to. They defined a measure that they called “HOG,” although it’s different from my and Lijie Chen’s HOG, more similar to a cross-entropy. Following Jelmer, let me call their measure CHOG, where the C could stand for Chinese, Chaoyang’s, or Changed. They calculated the CHOG for their experimental samples, and showed that it exceeds the CHOG that you’d get from the k=1 and k=2 levels of the Kalai-Kindler hierarchy, as well as from various other spoofing strategies, thereby ruling those out as classical explanations for their results.

The trouble is this: *unlike *with Random Circuit Sampling and LXEB, with BosonSampling and CHOG, we *know* that there are fast classical algorithms that achieve better scores than the trivial algorithm, the algorithm that just picks samples at random. That follows from Kalai and Kindler’s work, and it even more simply follows from a 2013 paper by me and Arkhipov, entitled “BosonSampling Is Far From Uniform.” Worse yet, with BosonSampling, we currently have no analogue of my 2019 result with Sam Gunn: that is, a result that would tell us (under suitable complexity assumptions) the highest possible CHOG score that we expect any efficient classical algorithm to be able to get. And since we don’t know exactly where that ceiling is, we can’t tell the experimentalists exactly what target they need to surpass in order to claim quantum supremacy. Absent such definitive guidance from us, the experimentalists are left playing whac-a-mole against *this* possible classical spoofing strategy, and *that* one, and *that* one.

This is an issue that I and others were aware of for years, although the new experiment has certainly underscored it. Had I understood *just how serious* the USTC group was about scaling up BosonSampling, and fast, I might’ve given the issue some more attention!

**III. Fock vs. Gaussian BosonSampling**

Above, I mentioned another complication in understanding the USTC experiment: namely, their reliance on Gaussian BosonSampling (GBS) rather than Fock BosonSampling (FBS), sometimes also called Aaronson-Arkhipov BosonSampling (AABS). Since I gave this issue short shrift in my previous post, let me make up for it now.

In FBS, the initial state consists of either 0 or 1 photons in each input mode, like so: |1,…,1,0,…,0⟩. We then pass the photons through our beamsplitter network, and measure the number of photons in each output mode. The result is that the amplitude of each possible output configuration can be expressed as the permanent of some n×n matrix, where n is the total number of photons. It was interest in the permanent, which plays a central role in classical computational complexity, that led me and Arkhipov to study BosonSampling in the first place.

The trouble is, preparing initial states like |1,…,1,0,…,0⟩ turns out to be really hard. No one has yet build a source that reliably outputs *one and only one photon* at exactly a specified time. This led two experimental groups to propose an idea that, in a 2013 post on this blog, I named Scattershot BosonSampling (SBS). In SBS, you get to use the more readily available “Spontaneous Parametric Down-Conversion” (SPDC) photon sources, which output superpositions over different numbers of photons, of the form $$\sum_{n=0}^{\infty} \alpha_n |n \rangle |n \rangle, $$ where α_{n} decreases exponentially with n. You then measure the left half of each entangled pair, *hope* to see exactly one photon, and are guaranteed that if you do, then there’s also exactly one photon in the right half. Crucially, one can show that, if Fock BosonSampling is hard to simulate approximately using a classical computer, then the Scattershot kind must be as well.

OK, so what’s *Gaussian* BosonSampling? It’s simply the generalization of SBS where, instead of SPDC states, our input can be an arbitrary “Gaussian state”: for those in the know, a state that’s exponential in some quadratic polynomial in the creation operators. If there are m modes, then such a state requires ~m^{2} independent parameters to specify. The quantum optics people have a much easier time creating these Gaussian states than they do creating single-photon Fock states.

While the amplitudes in FBS are given by permanents of matrices (and thus, the probabilities by the absolute squares of permanents), the probabilities in GBS are given by a more complicated matrix function called the Hafnian. Roughly speaking, while the permanent counts the number of perfect matchings in a bipartite graph, the Hafnian counts the number of perfect matchings in an *arbitrary* graph. The permanent and the Hafnian are both #P-complete. In the USTC paper, they talk about yet another matrix function called the “Torontonian,” which was invented two years ago. I gather that the Torontonian is just the modification of the Hafnian for the situation where you only have “threshold detectors” (which decide whether one or more photons are present in a given mode), rather than “number-resolving detectors” (which *count* how many photons are present).

If Gaussian BosonSampling includes Scattershot BosonSampling as a special case, and if Scattershot BosonSampling is at least as hard to simulate classically as the original BosonSampling, then you might hope that GBS would *also* be at least as hard to simulate classically as the original BosonSampling. Alas, this doesn’t follow. Why not? Because for all we know, a *random* GBS instance might be a lot easier than a *random* SBS instance. Just because permanents can be expressed using Hafnians, doesn’t mean that a random Hafnian is as hard as a random permanent.

Nevertheless, I think it’s very likely that the sort of analysis Arkhipov and I did back in 2011 could be mirrored in the Gaussian case. I.e., instead of starting with reasonable assumptions about the distribution and hardness of random permanents, and then concluding the classical hardness of approximate BosonSampling, one would start with reasonable assumptions about the distribution and hardness of random Hafnians (or “Torontonians”), and conclude the classical hardness of approximate GBS. But this is theoretical work that remains to be done!

**IV. Application to Molecular Vibronic Spectra?**

In 2014, Alan Aspuru-Guzik and collaborators put out a paper that made an amazing claim: namely that, contrary to what I and others had said, BosonSampling was *not* an intrinsically useless model of computation, good only for refuting QC skeptics like Gil Kalai! Instead, they said, a BosonSampling device (specifically, what would later be called a GBS device) could be directly applied to solve a practical problem in quantum chemistry. This is the computation of “molecular vibronic spectra,” also known as “Franck-Condon profiles,” whatever those are.

I never understood nearly enough about chemistry to evaluate this striking proposal, but I was always a bit skeptical of it, for the following reason. Nothing in the proposal seemed to take seriously that BosonSampling is a *sampling* task! A chemist would typically have some *specific numbers* that she wants to estimate, of which these “vibronic spectra” seemed to be an example. But while it’s often convenient to estimate physical quantities via Monte Carlo sampling over simulated observations of the physical system you care about, that’s not the *only* way to estimate physical quantities! And worryingly, in all the other examples we’d seen where BosonSampling could be used to estimate a number, the same number could *also* be estimated using one of several polynomial-time classical algorithms invented by Leonid Gurvits. So why should vibronic spectra be an exception?

After an email exchange with Alex Arkhipov, Juan Miguel Arrazola, Leonardo Novo, and Raul Garcia-Patron, I believe we finally got to the bottom of it, and the answer is: vibronic spectra are *not* an exception.

In terms of BosonSampling, the vibronic spectra task is simply to estimate the probability histogram of some weighted sum like $$ w_1 s_1 + \cdots + w_ m s_m, $$ where w_{1},…,w_{m} are fixed real numbers, and (s_{1},…,s_{m}) is a possible outcome of the BosonSampling experiment, s_{i} representing the number of photons observed in mode i. Alas, while it takes some work, it turns out that Gurvits’s classical algorithms can be adapted to estimate these histograms. Granted, running the actual BosonSampling experiment would provide *slightly* more detailed information—namely, some exact sampled values of $$ w_1 s_1 + \cdots + w_ m s_m, $$ rather than merely additive approximations to the values—but since we’d still need to sort those sampled values into coarse “bins” in order to compute a histogram, it’s not clear why that additional precision would ever be of chemical interest.

This is a pity, since if the vibronic spectra application *had* beaten what was doable classically, then it would’ve provided not merely a first practical use for BosonSampling, but also a lovely way to *verify* that a BosonSampling device was working as intended.

**V. Application to Finding Dense Subgraphs?**

A different potential application of Gaussian BosonSampling, first suggested by the Toronto-based startup Xanadu, is finding dense subgraphs in a graph. (Or at least, providing an initial seed to classical optimization methods that search for dense subgraphs.)

This is an NP-hard problem, so to say that I was skeptical of the proposal would be a gross understatement. Nevertheless, it turns out that there *is* a striking observation by the Xanadu team at the core of their proposal: namely that, given a graph G and a positive even integer k, a GBS device can be used to sample a random subgraph of G of size k, with probability *proportional to the square of the number of perfect matchings in that subgraph*. Cool, right? And potentially even useful, especially if the number of perfect matchings could serve as a rough indicator of the subgraph’s density! Alas, Xanadu’s Juan Miguel Arrazola himself recently told me that there’s a cubic-time classical algorithm for the same sampling task, so that the possible quantum speedup that one could get from GBS in this way is at most polynomial. The search for a useful application of BosonSampling continues!

And that’s all for now! I’m grateful to all the colleagues I talked to over the last couple weeks, including Alex Arkhipov, Juan Miguel Arrazola, Sergio Boixo, Raul Garcia-Patron, Leonid Gurvits, Gil Kalai, Chaoyang Lu, John Martinis, and Jelmer Renema, while obviously taking sole responsibility for any errors in the above. I look forward to a spirited discussion in the comments, and of course I’ll post updates as I learn more!

Comment #1 December 16th, 2020 at 10:05 am

The typical steps of Quantum Supremacy seem to be:

1) celebrate that Gil Kalai’s crackpot claims are finally refuted!

2) various confusing corrections and fine points that only Gil Kalai can follow

Comment #2 December 16th, 2020 at 10:09 am

A question about boson sampling:

Years ago it seemed that the difficulty was firing many photons simultaneously, the experiment had to do an exponentially growing amount of generations to have a successful event.

Has there been some major breakthrough in photonics?

Comment #3 December 16th, 2020 at 10:13 am

fred #1: In case this wasn’t clear from the post—Gil will lose the Quantum Supremacy War, indeed he probably already lost it with Google’s experiment. What we’re trying to figure out is whether, even while losing the wider war, he’ll make a better-than-expected showing in the Battle for BosonSampling. 😀

Comment #4 December 16th, 2020 at 10:20 am

fred #2: Part III of the post talked about exactly that. Yes, Scattershot BosonSampling (which I blogged about back in 2014) was a major conceptual advance, which lets you use SPDC sources with no loss in classical computational hardness compared to single-photon sources. Gaussian BosonSampling, what the new experiment used, gives a further efficiency gain compared to SBS, but possibly

witha loss in classical computational hardness—we’re not sure yet!Comment #5 December 16th, 2020 at 12:18 pm

I don’t think I was the one who came up with the abbreviation CHOG. My first suggestion was ‘Feral HOG’ but I confess the only reason for that was so that we could make jokes for all time to come about how boson sampling needs k = 30-50 Feral HOGs [1].

But seriously: I think it would be good if someone (and as Scott suggested, I think Chaoyang is the natural person to do it) came up with a different name than HOG altogether, since I’ve seen lots of people be confused by how dissimilar the two are. Maybe ‘cross entropy difference’?

[1] https://www.vox.com/future-perfect/2019/8/6/20756162/30-to-50-feral-hogs-meme-assault-weapons-guns-kids for the people who don’t waste all their time on twitter.

Comment #6 December 16th, 2020 at 12:31 pm

Jelmer #5: I love “Feral HOG”! You did call it “Chinese HOG” in emails to me; the contraction to “CHOG” was mine. In any case, yes, I agree that Chaoyang gets renaming rights if he wants them!

Comment #7 December 16th, 2020 at 7:38 pm

Hi Scott #6 Jelmer #5

Haha, I think CHOG is good, and here I would name the “C” after “C”hinese or “C”hen Ming-Cheng the co-first author of this paper and a brilliant young mind who proposed this analysis and did a lot of hard work behind it.

Comment #8 December 16th, 2020 at 8:30 pm

“But another part of me feels that, with quantum supremacy claims, much like with proposals for new cryptographic codes, vetting can’t be the responsibility of one or two reviewers.”

Absolutely.

Comment #9 December 16th, 2020 at 9:15 pm

Is that algorithm described somewhere?

I can see sampling with probability proportional to the number of perfect matchings being easy classically (just pick k/2 disjoint edges at random?), but i’m wondering how they get the square.

Comment #10 December 16th, 2020 at 11:29 pm

I’m glad that you admit that in this case Gil’s 2014 work seems to be spot on.

If you really mean that you want QC skeptics to keep doing their good work I think it would help if your tone towards QC skeptics changes a bit. Most of your posts come out as mocking them.

Comment #11 December 16th, 2020 at 11:34 pm

Regarding the reviewers role: I feel Nature should choose reviewers more carefully. You have been a champion of boson sampling so don’t you think that there’s bound to be at least a subconscious bias? You were rooting for this to work out right?

Maybe Science could have chosen someone more skeptical since they are bound to question the results more mercilessly. Or at least someone more agnostic to the outcome.

I mean clearly Scott is one of the biggest cheerleaders for QC and quantum supremacy.

Comment #12 December 17th, 2020 at 3:13 am

>Alas, Xanadu’s Juan Miguel Arrazola himself recently told me that there’s a cubic-time >classical algorithm for the same sampling task

Juan Miguel Arrazola and colleagues kindly informed us of their nice results [which was later put on arXiv.2010.15595], after we submitted our manuscript. We had communicated over this and concluded that because in our work, collision modes dominate where the actual number of photons is approximate twice the number of clicked detectors, and thus the new algorithm cannot be readily used to speed up simulating our current experiment. We added a note at the end of our arXiv version of our paper available https://arxiv.org/abs/2012.01625 which was unedited (>2000 words cut due to editorial request) and much longer than the Science version.

Comment #13 December 17th, 2020 at 6:12 am

Dear Scott,

It is nice to to see that this time you took under the account advances in the classical simulation of Boson Samling and Gaussian Boson sampling in recent years.

Two comments about simulation papers that you refer to.

Renema,Shchesnovich, and Garcia-Patron https://arxiv.org/abs/1809.01953 – as far as I remember that nice work deals implicitelly with the standard m>>n^2 regime as the authors replace averages over unitary group by averages over GUE ensemble. Hence, it is not clear if their bounds bound hold if n~m.

Kalai and Kinder paper https://arxiv.org/abs/1409.3093 – that work also studies permanents of GUE ensemble. Therefore, it is not clear for me if it can be used if n~m.

And this is the relevant regime here as USTC paper had m=100 modes with n<=70 photons.

Comment #14 December 17th, 2020 at 7:33 am

A clarification: In my previous post I wrongly coined GUE ensemble. The proper matrices ensemble that is relevant to both borks is of course the ensemble of complex Gaussian matrixes.

Comment #15 December 17th, 2020 at 10:29 am

Hi Michal,

Yes, that is an excellent point! Our results are indeed for #modes >> #photons, which is violated in the case of the experiment. The only one who has seriously studied the case #modes \(\approx \) #photons is Shchesnovich. I tried to find the refence but couldn’t.

Off the top of my head, he showed that for small k, in the high-collision regime there is a lower bound to the trace distance between the kth order approximation and the actual distribution, which arises from the fact that you begin to ‘feel’ the unitary constraint. No idea how that translates to the experiment though.

Comment #16 December 17th, 2020 at 10:30 am

correction: #modes >> #photons^2

Comment #17 December 17th, 2020 at 12:19 pm

Rahul #11: When I sent in my review, I clearly explained that I was only qualified to comment as a theoretical scientist, and that they needed to find others to vet the experimental optics aspects of the paper. Alas, I fear that the editors largely disregarded that advice, and relied on me a lot more than they should have.

Having said that, I 100% stand by the decision to publish the work in

Science. Indeed, even the skeptics who I’ve spoken to about it—the people who are doing the most right now to try to undermine the supermacy claim—agree that publication was more than warranted.If they’d gotten people like Jelmer Renema, Valery Shchesnovich, Daniel Brod, Christine Silberhorn, Sergio Boixo, etc. to review the paper prior to publication, the difference is simply that some of the technical back-and-forth that you’re seeing happen right now, could’ve instead happened behind the scenes. I don’t know if that would’ve been better or worse.

Regarding Gil, and regarding my tone: in my defense, I’ve given Gil’s ideas more attention and more airtime than almost anyone else in the “pro-QC” camp, and Gil has repeatedly written that he’s flattered by that! Having said that, the fundamental problem is that Gil has staked out so many positions on QC over the years that we

knowto be ludicrously off-base, that he makes it really, really hard to separate the wheat from the chaff. Still, now that we’ve seen the direct relevance of Kalai-Kindler noise to an actual experiment—well, regardless of whether the supremacists or the anti-supremacists end up winning this particular skirmish (or whether it devolves into a complicated stalemate), I take public responsibility for not having engaged more carefully with the Kalai-Kindler paper before, and I’ll ease up on the ribbing. 🙂Comment #18 December 18th, 2020 at 2:02 am

>>> Not surprisingly, one skeptic of the new quantum supremacy claim is Gil Kalai,

>>> who (despite Google’s result last year, which Gil still believes must be in error)

Can you maybe comment on Gil Kalais (and ofc. Rinott, Shoman) approach in “Statistical Aspects of the Quantum Supremacy Demonstration”? (https://arxiv.org/abs/2008.05177)

I am not deep enough in the matter and simply not good enough in the maths to understand this. If my basic understanding is correct, one cannot directly proof quantum advantage, as the classical part cannot practically be executed. So the argument must be scaled from “smaller” quantum circuits to the larger ones.

For this scaling, a noise and fidelity model has to be found (via estimators) and the problem lies in garantuing, that this model is actually valid. Looking at Fig. 8 in the arxiv preprint, it doesn’t look like the estimator results in the expected data.

Looking around but I have not seen a discussion of the findings and would be happy to hear some arguments, why Gil Kalai should be wrong about this here. Can anyone shed some light on where I get it wrong?

Comment #19 December 18th, 2020 at 2:36 am

BosonSampling kind of reminds me the classical (reversible) billard-computation model. Is there a more formal analogy, or not al all?

On another note: is it fully rejected by the QC community the fact that there might be a limit in the number of qbits which can be operated coherently? I mean: this limit might exist due to unavoidable sources of noise (I’m a dark matter physicist and we see cosmic rays even 2km underground), but the question is..are these limits so far away that we can safely build an useful QC (which means running useful algorithms + the error-correction qbits) ? Is this question fully answered by a theorem or it is a sort of open question? Very informative post: thanks Scott. When I read your blog I wonder why I did not went into TCS…during my physics undergrad after reading a CS book I learned just by chance about P,NP and the rest of the zoo and I was so impressed that I was about to change field. It is a wonderful perspective on the physical world.

Comment #20 December 18th, 2020 at 4:54 pm

Here is a comment in a different direction that I think is also important. This boson sampling experiment is often described as simply — as in the title of this post — “Chinese boson sampling” or similar. It’s very good up to a point to credit Chinese scientists for first-rate research. But it’s also obviously simplistic to credit “China”, a nation of 1.4 billion people, with achieving quantum supremacy. No one calls Google Sycamore “American quantum supremacy” or “American quantum circuit sampling”. It was reasonably common to credit “Google”, which is better than “American”, but still not all that great. “Sycamore” is a good name, and “Google Sycamore” is fair.

The name that the Shanghai research used themselves is Jiǔzhāng, which literally means Nine Chapters. It is refers an ancient Chinese math book, The Nine Chapters on the Mathematical Art. I think that “Nine Chapters” is also a great name, at least as good as “Sycamore”. I think it would be better then to call it something like the Chinese Nine Chapters boson sampling or quantum supremacy experiment.

Comment #21 December 18th, 2020 at 7:58 pm

+1 to Greg Kuperberg’s comment. “Jiuzhang” is not hard to say.

Comment #22 December 18th, 2020 at 11:03 pm

Greg and Ryan: I admit that I paused for a minute when writing “Chinese BosonSampling.” But then I reflected that, for better or worse, neither I nor most people would’ve had the slightest hesitation in writing about an “Argentinian experiment” or “Australian experiment” or whatever. And the alternatives—e.g., “USTC experiment,” “Hefei experiment,” “Jiuzhang experiment,” “Pan group experiment,” “Chaoyang Lu’s group’s experiment”—all had the problem either that they singled out one member of the team, or that most of my readers, glancing at the title, wouldn’t understand what thing I was referring to.

Comment #23 December 18th, 2020 at 11:55 pm

For me at least, it distinctly helps to translate “Jiǔzhāng” to “Nine Chapters”, even if (heretofore?) it is nonstandard. Some foreign names and terms benefit from translation. E.g., it seems fine that the Pearl River is called that in English, and not “Zhū Jiāng”. On the flip side, it seems dubious to call porcelain “china”, even if the name was meant to give credit. I guess I take the point that it is non-standard, so at least at first it could be “Chinese Nine Chapters BosonSampling”.

Comment #24 December 19th, 2020 at 8:19 am

Scott #22

I agree with the others that your title seems a bit off. I think what’s missing is simply the word “experiment”. If the title were something like “Chinese Boson Sampling Experiment” or “Chinese Boson Sampling Results” it would be fine. But “Chinese Boson Sampling” sounds like you’re talking about some specific Chinese kind of Boson Sampling that is essentially unique to China and different from any kind of Boson Sampling that might be done in Australia or Germany, for example.

Anyway that’s just my two cents.

Comment #25 December 19th, 2020 at 8:49 am

I always get a good laugh of any mention of Gil on the blog, outside of him who would you have as your top 3 arch nemesis’ in academia?

Comment #26 December 19th, 2020 at 10:13 am

Gerard #24: Ok, edited!

Comment #27 December 20th, 2020 at 1:23 am

Scott:

Any comments on this :

https://www.insidehighered.com/admissions/article/2020/12/14/u-texas-will-stop-using-controversial-algorithm-evaluate-phd

Maybe a blog post request?

The Death and Life of an Admissions Algorithm

U of Texas at Austin has stopped using a machine-learning system to evaluate applicants for its Ph.D. in computer science. Critics say the system exacerbates existing inequality in the field.

Comment #28 December 20th, 2020 at 2:00 am

Rahul #27: As you can imagine, there was plenty of discussion about this among the CS faculty, but I have less to say about it than you might think. Based on a few years’ experience, I actually thought the machine-learning system was pretty impressively calibrated—sure, it’s biased, but (the part people keep forgetting)

the only relevant question is whether humans are even more biased.On the other hand, mostly the system just confirmed what anyone could see for themselves with two minutes’ perusal of an application. So there might be a little more work, but we’ll manage fine without it.Comment #29 December 20th, 2020 at 3:23 am

Thanks Scott.

Beyond this particular UT case, what’s your general opinion on using AI for things like recruiting, prison parole, admissions etc.

The question you raised whether humans may be even more biased, is it empirically testable?

Comment #30 December 20th, 2020 at 9:25 am

Regarding the first quantum supremacy claim from Arute et al.: Physicists are really pulling the wool over non-experts’ eyes if they claim quantum supremacy from solutions to a “computational task” that can’t be posed before the “computing” device is built. One could not have posed the “computational task” (which includes exact specification of gate operations) before experimental work began, because of the gates were non-ideal and rotation angles were calibrated in-situ.

Comment #31 December 20th, 2020 at 10:28 am

Till #30: Nope, you’re just flatly misinformed about that. The mathematical specification of the Linear XEB task includes the specific pattern of 1- and 2-qubit gates (obviously), but it does

notinclude noise, decoherence, or any other imperfections in the device.Comment #32 December 20th, 2020 at 11:57 am

Scott #30: I think they’re referring to the unitary refitting used in Google’s supremacy paper. There were limitations in how exactly the parameters of the two qubit gate that was used could be hit, but it was possible to run experiments to figure out pretty accurately which nearby two qubit gate was being performed. So the latter was used when computing fidelities.

Basically, in an ideal experiment, the circuit to execute would be generated by a referee and then run by the quantum machine as well as the classical machine. Instead, the quantum machine was saying “here are the operations I can do the best right now for each pair of qubits” and then the referee was picking a circuit that used those operations. This means that if you wanted to run the circuits again at a later date, the quantum hardware would underperform since its calibrated gates would differ slightly.

The thing that convinced me that this caveat wasn’t such a big deal (I was worried about it at the time) is that it’s not a fundamental obstacle to performing arbitrary computations. Because the angle deviations are small, and because the “ideal” angles weren’t divisors of a full turn, the overhead of decomposing desired gates into the two qubit gates that happen to be calibrated best is the same as the overhead of decomposing into a common two qubit gate. Similarly, it would still be possible to create client-certified random numbers under this limitation.

Comment #33 December 21st, 2020 at 8:38 am

Craig Gidney #32: Ok, thanks for clarifying what was meant! While I agree with you that this issue is minor, it’s definitely one that would be good to fix in future quantum supremacy experiments.

Comment #34 December 23rd, 2020 at 8:47 pm

Ain’t it a bit racialist to say “Chinese bosonSampling”? It sounds a bit like “the China virus”. Does you agree?

Comment #35 December 23rd, 2020 at 8:51 pm

Ali G #34: See above; I already changed it to “Chinese BosonSampling

experiment.” Do people no longer read before commenting?Comment #36 December 29th, 2020 at 7:10 am

[…] For more details see my post about the recent HQCA Boson Sampling experiment, and also Scott Aaronson’s recent post. […]

Comment #37 December 30th, 2020 at 12:46 am

Scott #31: The gates themselves are systematically incorrect rotations. That is not noise and it is a mistake to conflate systematics and noise. The authors themselves wrote that these errors can be corrected with single-qubit rotations. But they didn’t correct them, because doing so would have reduced the gate depth or some such.

My statement stands: One could not have posed the “computational task” (which includes exact specification of gate operations) before experimental work began.

Comment #38 December 30th, 2020 at 1:29 am

Scott #31: Arute et al. write “For the full experiment, we generate quantum circuits using the two-qubit unitaries measured for each pair during simultaneous operation, rather than a standard gate for all pairs. The typical two-qubit gate is a full iSWAP with 1/6th of a full CZ. Using individually calibrated gates in no way limits the universality of the demonstration.”

The calibrations may not impact “the universality of the demonstration” (whatever that means), but they certainly impact the ability to perform predefined computational tasks. What their “quantum processor” “computes” does not exist independently of the experiment. How can anyone call it a “computational task”?

Comment #39 December 30th, 2020 at 1:12 pm

Till #38: The best way to run experiments consists of two phases: 1) calibration and 2) running the desired circuits for a particular experiment. Note that the circuits used in the calibration are different from the circuits in the experiment, and in our case they do not involve more than two qubits. With newer calibration methods we discover small systematic deviations between the “native gates” and some standard gate. Experiments work better if the circuit design includes native gates. One reason is that we are often interested in composite gates, which can be implemented equally well with native gates (see Appendix A of arXiv:2010.07965) The specific values of the native gates are known after calibration, and before running the circuits.

In the specific case of the 2019 experiment using “Sycamore” standard gates instead of native gates halfs the fidelity, see S30 in the supplementary material. Also the native are universal, see Sec. VII F.

Comment #40 December 30th, 2020 at 1:38 pm

Sergio #39: Thanks!!

So, my summary would be that yes, there’s a calibration phase, and the 2-qubit gates used depend on the outcome of that phase, but there’s still a clear separation enforced between the calibration phase and the actual running of the QC.

Of course, it would be great in future sampling-based quantum supremacy experiments to be able to specify all the gates strictly from the outset, with no need for a calibration phase. One step at a time, though!

Comment #41 December 30th, 2020 at 3:35 pm

Sergio #39: Being an experimentalist myself, I think I understand approximately what was done for calibration in the paper and also that the gates are universal. But your explanation doesn’t address my objection: The paper does not describe a “computational” task because the task, due to the calibration issue, is defined only for that specific experiment.

Computation is supposed to be more general than that or is this a new model of computation where, before you give it a problem, the machine first tells you what (non-standard) unitaries it can perform?

Scott #40: If QC were not riding a hype cycle, it might be reasonable to cut the experiments more slack. But less rigor now is just going lead to more disillusionment in the future.

Comment #42 December 31st, 2020 at 1:08 am

Till, in case you’re interested, John Martinis sent me the following message:

Thanks for the nice discussion. We did not cheat by recalibrating via the large RCS circuit: we only calibrated using one and two qubit gates.

Scott, this brings up a rather interesting concept that might be interesting for your readers. The essence of Till’s idea was actually done as a test in figure S42 of the supplement (see here), which is actually one of the most important and interesting checks of the paper. Here, the numerical simulation puts in varying amounts of phase shift in a single qubit gate in the middle of the circuit, showing that for a pi phase shift you get zero fidelity, showing explicitly that one error kills XEB. However, if you look closely at the figure, you see the cosine curve is shifted slightly to the right from optimal, showing that the gate’s calibration was not quite perfect. If we would have shifted the calibration of the gate, the fidelity would have been a tiny bit higher. (But of course we did not do that in our analysis.) You can imagine by doing this to each gate, all these tiny improvements might add up to something bigger, but probably only a 10% improvement since the measured fidelity matched so well the computed one. This matching is one of the most important findings of the experiment.

It would be interesting to do this analysis of Figure S42 for each qubit and each time step. (The data is archived so it can be done). If you found a constant phase shift over time, this would indicate an overall calibration error. If it depends on time, then there may be some kind of tails in the time-trace of waveform control. In general, I thought this data could be used to build some kind of crosstalk model to understand the control errors to the qubit, especially since the crosstalk is small and you could maybe solve the general problem with a perturbation analysis.

This is something I wanted to do after the QS experiment, but never got around to it. If Till or someone else is interested, I would be happy to talk and potentially collaborate on a paper to understand this better. I think it could be an important tool for probing the inner workings of a real quantum computer system: what happens to the 1- and 2-qubit calibrations when you run a complex circuit?

Scott, if you wish to post this I would be happy to discuss with interested readers.

John

Comment #43 December 31st, 2020 at 12:03 pm

Scott #42: Thanks for sharing. John’s message explains how much calibration was done, which is about as expected (but I’m no expert on superconducting QC). I guess the authors view this tayloring of the problem to the specific machine as reasonable, but it conflicts with the ideal (since at least Turing) that computation is not tied to any one computing device.

Comment #44 December 31st, 2020 at 12:39 pm

Till #43: I’m with you in wanting a better experiment where calibration is less of an issue! I guess knowing some of the people involved, knowing the years of toil it took to get this far makes me less willing to knock “quantum supremacy modulo some initial calibration.”

Comment #45 January 5th, 2021 at 2:25 pm

[…] have not abated. They have just recently been summarized by Gil Kalai here, and also here by Scott Aaronson in regard to an independent claim by researchers from USTC in […]

Comment #46 January 17th, 2021 at 11:23 am

A comment on Arute et al. (Sycamore): Random quantum circuit (RQS) sampling that leads to chaotic behavior: Two years ago, Google proposed a random “quantum chaos” task, which they argued is hard and would prove quantum supremacy (no need to approximate the output distribution). Chaotic behavior and sensitivity to initial conditions is a deterministic non-linear phenomenon which is closely related to errors and noise. It seems that one cannot differentiate between the so-called “random quantum chaos” result and classical chaotic behavior (being highly sensitive to noise). I have an inkling suspicion that chaotic behavior also inflicts the new Sycamore processor because Arute et al. write: “Our processor uses transmon qubits, which can be thought of as nonlinear superconducting resonators at 5–7 GHz” (p. 506)”.