## Archive for the ‘Complexity’ Category

### Two papers

Tuesday, April 21st, 2015

Just to get myself back into the habit of blogging:

For those of you who don’t read Lance’s and Bill’s blog, there was a pretty significant breakthrough in complexity theory announced last week.  (And yes, I’m now spending one of the two or so uses of the word “breakthrough” that I allow myself per year. Wait, did I just spend the second one with this sentence?)  Ben Rossman (a former MIT PhD student whose thesis committee I was honored to serve on), Rocco Servedio, and Li-Yang Tan have now shown that the polynomial hierarchy is infinite relative to a random oracle, thereby solving the main open problem from Johan Håstad’s 1986 PhD thesis.  While it feels silly even to mention it, the best previous result in this direction was to separate $$P^{NP}$$ from $$\Sigma_2^P$$ relative to a random oracle, which I did in my Counterexample to the Generalized Linial-Nisan Conjecture paper.  In some sense Rossman et al. infinitely improve on that (using completely different techniques).  Proving their result boils down to proving a new lower bound on the sizes of constant-depth circuits.  Basically, they need to show that, for every k, there are problems that can be solved by small circuits with k layers of AND, OR, and NOT gates, but for which the answer can’t even be guessed noticeably better than chance by any small circuit with only k-1 layers of AND, OR, and NOT gates.  They achieve that using a new generalization of the method of random restrictions.  Congratulations to Ben, Rocco, and Li-Yang!

Meanwhile, if you want to know what I’ve been doing for the last couple months, one answer is contained in this 68-page labor of love preprint by me and my superb PhD students Daniel Grier and Luke Schaeffer.  There we give a full classification of all possible sets of classical reversible gates acting on bits (like the Fredkin, Toffoli, and CNOT gates), as well as a linear-time algorithm to decide whether one reversible gate generates another one (previously, that problem wasn’t even known to be decidable).  We thereby completely answer a question that basically no one was asking, although I don’t understand why not.

### Memrefuting

Wednesday, February 11th, 2015

(in which I bring this blog back to the “safe, uncontroversial” territory of arguing with people who think they can solve NP-complete problems in polynomial time)

A few people have asked my opinion about “memcomputing”: a computing paradigm that’s being advertised, by its developers, as a way to solve NP-complete problems in polynomial time.  According to the paper Memcomputing NP-complete problems in polynomial time using polynomial resources and collective states, memcomputing “is based on the brain-like notion that one can process and store information within the same units (memprocessors) by means of their mutual interactions.”  The authors are explicit that, in their view, this idea allows the Subset Sum problem to be solved with polynomial resources, by exploring all $$2^n$$ possible subsets in parallel, and that this refutes the Extended Church-Turing Thesis.  They’ve actually built ‘memcomputers’ that solve small instances of Subset Sum, and they hope to scale them up, though they mention hardware limitations that have made doing so difficult (more about that later).

A bunch of people (on Hacker News, Reddit, and elsewhere) tried to explain the problems with the Subset Sum claim when the above preprint was posted to the arXiv last year.  However, an overlapping set of authors has now simply repeated the claim, unmodified, in a feature article in this month’s Scientific American.  Unfortunately the SciAm article is behind a paywall, but here’s the relevant passage:

Memcomputing really shows advantages when applied to one of the most difficult types of problems we know of in computer science: calculating all the properties of a large series of integers. This is the kind of challenge a computer faces when trying to decipher complex codes. For instance, give the computer 100 integers and then ask it to find at least one subset that adds up to zero. The computer would have to check all possible subsets and then sum all numbers in each subset. It would plow through each possible combination, one by one, which is an exponentially huge increase in processing time. If checking 10 integers took one second, 100 integers would take $$10^{27}$$ seconds, millions of trillions of years … [in contrast,] a memcomputer can calculate all subsets and sums in just one step, in true parallel fashion, because it does not have to shuttle them back and forth to a processor (or several processors) in a series of sequential steps. The single-step approach would take just a single second.

For those tuning in from home: in the Subset Sum problem, we’re given n integers $$a_1,\ldots,a_n$$, and we want to know whether there exists a subset of them that sums to a target integer k.  (To avoid trivializing the problem, either k should be nonzero or else the subset should be required to be nonempty; the passage quoted above overlooks this, since the empty subset trivially sums to zero.)
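To make the problem statement concrete, here is a minimal brute-force sketch in Python. The function names `subset_sum_bruteforce` and `seconds_for` are my own, and the second function simply rescales the quoted "10 integers take one second" figure under the assumption that running time doubles with each extra integer.

```python
from itertools import combinations


def subset_sum_bruteforce(values, target):
    """Return a nonempty subset of `values` summing to `target`, or None.

    Tries subsets in order of increasing size, so it may examine all
    2^n - 1 nonempty subsets before giving up.
    """
    for size in range(1, len(values) + 1):  # size 0 (the empty subset) is excluded
        for subset in combinations(values, size):
            if sum(subset) == target:
                return subset
    return None


def seconds_for(n, base_n=10, base_seconds=1.0):
    """Rescale 'base_n integers take base_seconds', assuming 2^n scaling."""
    return base_seconds * 2 ** (n - base_n)


if __name__ == "__main__":
    print(subset_sum_bruteforce([3, -7, 12, 5, -8], 0))
    print(f"{seconds_for(100):.3e} seconds for 100 integers")
```

The scaling estimate comes out on the order of $$10^{27}$$ seconds for 100 integers, consistent with the figure in the quoted passage.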

To solve Subset Sum in polynomial time, the basic idea of “memcomputing” is to generate waves at frequencies that encode the sums of all possible subsets of the $$a_i$$’s, and then measure the resulting signal to see if there’s a frequency there that corresponds to k.

Alas, there’s a clear scalability problem that seems to me to completely kill this proposal, as a practical way of solving NP-complete problems.  The problem is that the signal being measured is (in principle!) a sum of waves of exponentially many different frequencies.  By measuring this wave and taking a Fourier transform, one will not be able to make out the individual frequencies until one has monitored the signal for an exponential amount of time.  There are actually two issues here:

(1) Even if there were just a single frequency, measuring the frequency to exponential precision will take exponential time. This can be easily seen by contemplating even a moderately large n.  Thus, suppose n=1000.  Then we would need to measure a frequency to a precision of one part in ~$$2^{1000}$$. If the lowest frequency were (say) 1Hz, then we would be trying to distinguish frequencies that differ by far less than the Planck scale.  But distinguishing frequencies that close would require so much energy that one would exceed the Schwarzschild limit and create a black hole!  The alternative is to make the lowest frequency slower than the lifetime of the universe, causing an exponential blowup in the amount of time we need to run the experiment.

(2) Because there are exponentially many frequencies, the amplitude of each frequency will get attenuated by an exponential amount.  Again, suppose that n=1000, so that we’re talking about attenuation by a ~$$2^{-1000}$$ factor.  Then given any amount of input radiation that could be gathered in the physical universe, the expected amount of amplitude on each frequency would correspond to a microscopically small fraction of 1 photon, so again, it would take exponential time for us to notice any radiation at all on the frequency that interests us (unless we used an insensitive test that was liable to confuse that frequency with many other nearby frequencies).
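The orders of magnitude in points (1) and (2) can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the physical constants are rounded values I am supplying, and the "observation time is roughly one over the frequency spacing" rule is the usual Fourier rule of thumb rather than anything specific to the memcomputing paper.

```python
import math

PLANCK_TIME_S = 5.39e-44      # approximate Planck time, in seconds
AGE_OF_UNIVERSE_S = 4.35e17   # approximate age of the universe, in seconds


def log10_two_to(n):
    """Return log10(2^n) without forming the huge number as a float."""
    return n * math.log10(2)


n = 1000
log10_levels = log10_two_to(n)  # about 301, i.e. 2^1000 is about 10^301

# Option A: lowest frequency 1 Hz, so the highest is about 2^n Hz.
# Its period, in seconds, is then about 10^(-301).
log10_shortest_period = -log10_levels

# Option B: cap the highest frequency at about 1/Planck time instead.
# The spacing is then f_max / 2^n, and resolving it takes about 1/spacing seconds.
log10_run_time = log10_levels + math.log10(PLANCK_TIME_S)

if __name__ == "__main__":
    print(f"2^{n} is about 10^{log10_levels:.1f}")
    print(f"Option A: shortest period about 10^{log10_shortest_period:.1f} s, "
          f"versus Planck time about 10^{math.log10(PLANCK_TIME_S):.1f} s")
    print(f"Option B: run time about 10^{log10_run_time:.1f} s, "
          f"versus age of universe about 10^{math.log10(AGE_OF_UNIVERSE_S):.1f} s")
    print(f"Attenuation per frequency: about 10^{-log10_levels:.1f}")
```

Either way one of the two quantities ends up hundreds of orders of magnitude outside anything physical, which is the dilemma described above.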

What do the authors have to say about these issues?  Here are the key passages from the above-mentioned paper:

all frequencies involved in the collective state (1) are dampened by the factor $$2^{-n}$$.  In the case of the ideal machine, i.e., a noiseless machine, this would not represent an issue because no information is lost.  On the contrary, when noise is accounted for, the exponential factor represents the hardest limitation of the experimentally fabricated machine, which we reiterate is a technological limit for this particular realization of a memcomputing machine but not for all of them …

In conclusion we have demonstrated experimentally a deterministic memcomputing machine that is able to solve an NP-complete problem in polynomial time (actually in one step) using only polynomial resources.  The actual machine we built clearly suffers from technological limitations due to unavoidable noise that impair [sic] the scalability.  This issue can, however, be overcome in other UMMs [universal memcomputing machines] using other ways to encode such information.

The trouble is that no other way to encode such information is ever mentioned.  And that’s not an accident: as explained above, when n becomes even moderately large, this is no longer a hardware issue; it’s a fundamental physics issue.

It’s important to realize that the idea of solving NP-complete problems in polynomial time using an analog device is far from new: computer scientists discussed such ideas extensively in the 1960s and 1970s.  Indeed, the whole point of my NP-complete Problems and Physical Reality paper was to survey the history of such attempts, and (hopefully!) to serve as a prophylactic against people making more such attempts without understanding the history.  For computer scientists ultimately came to realize that all proposals along these lines simply “smuggle the exponentiality” somewhere that isn’t being explicitly considered, exactly like all proposals for perpetual-motion machines smuggle the entropy increase somewhere that isn’t being explicitly considered.  The problem isn’t a practical one; it’s one of principle.  And I find it unfortunate that the recent memcomputing papers show no awareness of this story.

(Incidentally, quantum computing is interesting precisely because, out of all “post-Extended-Church-Turing” computing proposals, it’s the only one for which we can’t articulate a clear physical reason why it won’t scale, analogous to the reasons given above for memcomputing.  With quantum computing the tables are turned, with the skeptics forced to handwave about present-day practicalities, while the proponents wield the sharp steel of accepted physical law.  But as readers of this blog well know, quantum computing doesn’t seem to promise the polynomial-time solution of NP-complete problems, only of more specialized problems.)

### Quantum Machine Learning Algorithms: Read the Fine Print

Monday, February 2nd, 2015

So, I’ve written a 4-page essay of that title, which examines the recent spate of quantum algorithms for clustering, classification, support vector machines, and other “Big Data” problems that grew out of a 2008 breakthrough on solving linear systems by Harrow, Hassidim, and Lloyd, as well as the challenges in applying these algorithms to get genuine exponential speedups over the best classical algorithms.  An edited version of the essay will be published as a Commentary in Nature Physics.  Thanks so much to Iulia Georgescu at Nature for suggesting that I write this.

Update (April 4, 2015): The piece has now been published.

### Quantum computing news items (by reader request)

Monday, January 12th, 2015

Within the last couple months, there was a major milestone in the quest to build a scalable quantum computer, and also a major milestone in the quest to figure out what you would do with a quantum computer if you had one.  As I’ve admitted many times, neither of those two quests is really the reason why I got into quantum computing—I’m one of the people who would still want to study this field, even if there were no serious prospect either of building a quantum computer or of doing anything useful with it for a thousand years—but for some reason that I don’t fully understand, both of those goals do seem to excite other people.

So, OK, the experimental breakthrough was the Martinis group’s use of quantum error-correction with superconducting qubits, to preserve a logical bit for several times longer than the underlying physical qubits survived for.  Shortly before this came out, I heard Krysta Svore give a talk at Yale in which she argued that preserving a logical qubit for longer than the physical qubits was the next experimental milestone (the fourth, out of seven she listed) along the way to a scalable, fault-tolerant quantum computer.  Well, it looks like that milestone may have been crossed.  (Update: I’ve since learned from Graeme Smith, in the comments section, that the milestone crossed should really be considered the “3.5th,” since even though quantum error-correction was used, the information that was being protected was classical.  I also learned from commenter Jacob that the seven milestones Krysta listed came from a Science paper by Schoelkopf and Devoret.  She cited the paper; the forgetfulness was entirely mine.)

In more detail, the Martinis group used a linear array of 9 qubits: 5 data qubits interleaved with 4 measurement qubits. The authors describe this setup as a “precursor” to Kitaev’s surface code (which would involve a 2-dimensional array).  They report that, after 8 cycles of error detection and correction, they were able to suppress the effective error rate compared to the physical qubits by a factor of 8.5.  They also use quantum state tomography to verify that their qubits were indeed in entangled states as they did this.

Of course, this is not yet a demonstration of any nontrivial fault-tolerant computation, let alone of scaling such a computation up to where it’s hard to simulate with a classical computer.  But it pretty clearly lies along the “critical path” to that.

As I blogged back in September, Google recently hired Martinis’s group away from UC Santa Barbara; at Google, they’ll work on superconducting quantum annealing, as a step along the way to full universal QC.  As I mentioned then, the Martinis group’s “Xmon” qubits have maybe 10,000 times the coherence times of D-Wave’s qubits, at least when you measure coherence in the usual ways.  The fact that Martinis et al. are carefully doing quantum state tomography and demonstrating beneficial error-correction before scaling up is a further indication of the differences between their approach and D-Wave’s.  Of course, even if you do everything right, there’s still no guarantee that you’ll outperform a classical computer anytime soon: it might simply be that the things you can do in the near future (e.g., quantum annealing for NP-complete problems) are not things where you’re going to outperform the best classical algorithms.  But it’s certainly worth watching closely.

Meanwhile, the quantum algorithms breakthrough came in a paper last month by an extremely well-known trio down the Infinite Corridor from me: Farhi, Goldstone, and Gutmann.  In slightly earlier work, Farhi et al. proposed a new quantum algorithm for NP-hard optimization problems.  Their algorithm badly needs a name; right now they’re just calling it the “QAOA,” or Quantum Approximate Optimization Algorithm.  But here’s what you need to know: their new algorithm is different from their famous adiabatic algorithm, although it does become equivalent to the adiabatic algorithm in a certain infinite limit.  Rather than staying in the ground state of some Hamiltonian, the QAOA simply

1. starts with a uniform superposition over all n-bit strings,
2. applies a set of unitary transformations, one for each variable and constraint of the NP-hard instance,
3. repeats the set some number of times p (the case p=1 is already interesting), and then
4. measures the state in the computational basis to see what solution was obtained.

The unitary transformations have adjustable real parameters, and a big part of the game is figuring out how to set the parameters to get a good solution.
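The four steps above can be sketched as a tiny state-vector simulation. This is a minimal illustration with p=1 on MAX-CUT, not Farhi et al.’s own code: the function names, the phase convention exp(-iγC) for constraints and exp(-iβX) for variables, and the coarse grid search over the two parameters are all my assumptions.

```python
import cmath
import math


def cut_value(z, edges):
    """Number of edges whose endpoints get different bits in assignment z."""
    return sum(((z >> u) & 1) != ((z >> v) & 1) for u, v in edges)


def qaoa_p1_expected_cut(n, edges, gamma, beta):
    """Expected cut size after one QAOA round on an n-vertex graph."""
    dim = 1 << n
    # Step 1: uniform superposition over all n-bit strings.
    state = [complex(1 / math.sqrt(dim))] * dim
    # Step 2a: one diagonal unitary per constraint (edge). They commute,
    # so their product is a single phase exp(-i*gamma*cut) per basis state.
    state = [amp * cmath.exp(-1j * gamma * cut_value(z, edges))
             for z, amp in enumerate(state)]
    # Step 2b: one rotation exp(-i*beta*X) per variable (qubit).
    c, s = math.cos(beta), -1j * math.sin(beta)
    for q in range(n):
        bit = 1 << q
        new_state = state[:]
        for z in range(dim):
            if not z & bit:
                a0, a1 = state[z], state[z | bit]
                new_state[z] = c * a0 + s * a1
                new_state[z | bit] = s * a0 + c * a1
        state = new_state
    # Step 4: measuring yields z with probability |amp|^2; average the cut.
    return sum(abs(amp) ** 2 * cut_value(z, edges)
               for z, amp in enumerate(state))


def best_p1_parameters(n, edges, steps=16):
    """Coarse grid search over (gamma, beta) in [0, pi) x [0, pi)."""
    best = (-1.0, 0.0, 0.0)
    for i in range(steps):
        for j in range(steps):
            gamma, beta = math.pi * i / steps, math.pi * j / steps
            value = qaoa_p1_expected_cut(n, edges, gamma, beta)
            best = max(best, (value, gamma, beta))
    return best


if __name__ == "__main__":
    ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
    print(best_p1_parameters(4, ring))
```

With γ=0 the state stays uniform, so the expected cut is just half the edges, which is the random-guessing baseline. Step 3 (repeating the round p times) would wrap steps 2a and 2b in a loop with separate parameters per round; the simulation cost grows as 2^n, so this is only usable on toy instances.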

The original, hyper-ambitious goal of the QAOA was to solve the Unique Games problem in quantum polynomial time—thereby disproving the Unique Games Conjecture (which I previously blogged about here), unless NP⊆BQP.  It hasn’t yet succeeded at that goal.  In their earlier work, Farhi et al. managed to show that the QAOA solves the MAX-CUT problem on 3-regular graphs with approximation ratio 0.6924, which is better than random guessing, but not as good as the best-known classical algorithms (Goemans-Williamson, or for the degree-3 case, Halperin-Livnat-Zwick), let alone better than those algorithms (which is what would be needed to refute the UGC).

In their new work, Farhi et al. apply the QAOA to a different problem: the poetically-named MAX E3LIN2.  Here you’re given a collection of linear equations mod 2 in n Boolean variables, where each equation involves exactly 3 variables, and each variable appears in at most D equations.  The goal is to satisfy as many of the equations as possible, assuming that they’re not all satisfiable (if they were then the problem would be trivial).  If you just guess a solution randomly, you’ll satisfy a 1/2 fraction of the equations.  Håstad gave a polynomial-time classical algorithm that satisfies a 1/2+c/D fraction of the maximum number of satisfiable equations, for some constant c.  This remains the best approximation ratio that we know how to achieve classically.  Meanwhile, Trevisan showed that if there’s a polynomial-time classical algorithm that satisfies a 1/2+c/√D fraction of the max number of satisfiable equations, for a sufficiently large constant c, then P=NP.

OK, so what do Farhi et al. do?  They show that the QAOA, with suitably tuned parameters, is able to satisfy a $$1/2+c/D^{3/4}$$ fraction of the total number of equations in polynomial time, for some constant c.  (In particular, this implies that a $$1/2+c/D^{3/4}$$ fraction of the equations are satisfiable, assuming, as Farhi et al. do, that two equations directly contradicting each other, like x+y+z=0 and x+y+z=1, never appear in the same instance.)

Now, the above is a bigger fraction than the best-known classical algorithm satisfies!  (And not only that, but here the fraction is of the total number of equations, rather than the number of satisfiable equations.)  Farhi et al. also show that, if the constraint hypergraph doesn’t contain any small cycles, then QAOA can satisfy a 1/2+c/√D fraction of the equations in polynomial time, which is essentially the best possible unless NP⊆BQP.
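To get a feel for how these guarantees compare as D grows, here is a small Python sketch. The constant c=0.1 is an arbitrary illustrative choice (the actual constants differ between the results and aren’t given here), and the classical 1/2+c/D guarantee is relative to the maximum number of satisfiable equations while the QAOA ones are relative to the total, so treat this purely as a comparison of scaling.

```python
def guarantees(D, c=0.1):
    """Illustrative satisfied fractions for degree bound D (c is a made-up constant)."""
    return {
        "random guessing": 0.5,
        "classical, 1/2 + c/D": 0.5 + c / D,
        "QAOA, 1/2 + c/D^(3/4)": 0.5 + c / D ** 0.75,
        "no small cycles, 1/2 + c/sqrt(D)": 0.5 + c / D ** 0.5,
    }


if __name__ == "__main__":
    for D in (16, 256, 4096):
        print(D, {name: round(value, 5) for name, value in guarantees(D).items()})
```

For every D > 1 the four values come out in increasing order, and the gaps between them are what the results above are about.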

The importance of this result is not that anyone cares about the MAX E3LIN2 problem for its own sake.  Rather it’s that, as far as I know, this is the first time that a quantum algorithm has been proved to achieve a better approximation ratio for a natural NP-hard optimization problem than the best known classical algorithm achieves.  People have discussed that as a hypothetical possibility for 20 years, but (again, unless I’m missing something) we never had a good example until now.  The big question now is whether the $$1/2+c/D^{3/4}$$ performance can be matched classically, or whether there truly is an NP-intermediate region of this optimization problem where quantum outperforms classical.  (The third possibility, that doing as well as the quantum algorithm is already NP-hard, is one that I won’t even speculate about.  For, as Boaz Barak rightly points out in the comments section, the quantum algorithm is still being analyzed only in the regime where solutions are combinatorially guaranteed to exist, and that regime can’t possibly be NP-hard, unless NP=coNP.)

[Above, I corrected some errors that appeared in the original version of this post—thanks to Ed Farhi and to the commenters for bringing them to my attention.]

Update (Feb. 3, 2015): Boaz Barak has left the following comment:

in a work with Ankur Moitra, Oded Regev, David Steurer and Aravindan Vijayaraghavan we were able to match (in fact exceed) the guarantees of the Farhi et al paper via a classical efficient algorithm. (Namely satisfy 1/2 + C/√D fraction of the equations). p.s. we hope to post this on the arxiv soon

### Quantum Complexity Theory Student Project Showcase 3

Friday, December 26th, 2014

Merry Christmas (belatedly)!  This year Quanta Claus has brought us eight fascinating final project reports from students in my 6.845 Quantum Complexity Theory class, covering everything from interactive proofs to query and communication complexity to quantum algorithms to quantum gates (and one project even includes a web-based demo you can try!).  Continuing in the tradition of the two previous showcases, I’m sharing the reports here; some of these works might also be posted to the arXiv and/or submitted to journals.  Thanks so much to the students who volunteered to participate in the showcase, and to all the students for making this such a great class.

### PostBQP Postscripts: A Confession of Mathematical Errors

Sunday, November 30th, 2014

tl;dr: This post reveals two errors in one of my most-cited papers, and also explains how to fix them.  Thanks to Piotr Achinger, Michael Cohen, Greg Kuperberg, Ciaran Lee, Ryan O’Donnell, Julian Rosen, Will Sawin, Cem Say, and others for their contributions to this post.

If you look at my Wikipedia page, apparently one of the two things in the world that I’m “known for” (along with algebrization) is “quantum Turing with postselection.”  By this, Wikipedia means my 2004 definition of the complexity class PostBQP, that is, the class of decision problems solvable in bounded-error quantum polynomial time, assuming the ability to postselect (or condition) on certain measurement outcomes, together with my proof that PostBQP coincides with the classical complexity class PP (that is, the class of decision problems expressible in terms of whether the number of inputs that cause a given polynomial-time Turing machine to accept does or doesn’t exceed some threshold).

To explain this a bit: even without quantum mechanics, it’s pretty obvious that, if you could “postselect” on exponentially-unlikely events, then you’d get huge, unrealistic amounts of computational power.  For example (and apologies in advance for the macabre imagery), you could “solve” NP-complete problems in polynomial time by simply guessing a random solution, then checking whether the solution is right, and shooting yourself if it happened to be wrong!  Conditioned on still being alive (and if you like, appealing to the “anthropic principle”), you must find yourself having guessed a valid solution—assuming, of course, that there were any valid solutions to be found.  If there weren’t any, then you’d seem to be out of luck!  (Exercise for the reader: generalize this “algorithm,” so that it still works even if you don’t know in advance whether your NP-complete problem instance has any valid solutions.)
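Here is a minimal Python sketch of that “guess, check, and postselect” idea, applied to Subset Sum. Since we can’t actually postselect, the sketch imitates it by rejection sampling, which takes exponentially many retries in expectation on hard instances; that cost is exactly what true postselection would magically remove. The function name and the `max_tries` cutoff are my own, and the sketch deliberately does not solve the exercise: when no solution exists, it just gives up.

```python
import random


def postselected_guess(values, target, rng, max_tries=100_000):
    """Guess random nonempty subsets; keep only a run where the check passes.

    Returns a subset summing to `target`, or None if no surviving
    run was seen within `max_tries` attempts.
    """
    n = len(values)
    for _ in range(max_tries):
        mask = rng.getrandbits(n)  # guess a random subset as a bitmask
        if mask == 0:
            continue               # skip the empty subset
        subset = [v for i, v in enumerate(values) if (mask >> i) & 1]
        if sum(subset) == target:  # the branch where we are "still alive"
            return subset
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    print(postselected_guess([3, -7, 12, 5, -8], 0, rng))
```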

So with the PostBQP=PP theorem, the surprise was not that postselection gives you lots of computational power, but rather that postselection combined with quantum mechanics gives you much more power even than postselection by itself (or quantum mechanics by itself, for that matter).  Since $$P^{PP}=P^{\#P}$$, the class PP basically captures the full difficulty of #P-complete counting problems: that is, not just solving an NP-complete problem, but counting how many solutions it has.  It’s not obvious that a quantum computer with postselection can solve counting problems, but that’s what the theorem shows.  That, in turn, has implications for other things: for example, I showed it can be used to prove classical facts about PP, like the fact that PP is closed under intersection (the Beigel-Reingold-Spielman Theorem), in a straightforward way; and it’s also used to show the hardness of quantum sampling problems, in the work of Bremner-Jozsa-Shepherd as well as my BosonSampling work with Arkhipov.

I’m diffident about being “known for” something so simple; once I had asked the question, the proof of PostBQP=PP took me all of an hour to work out.  Yet PostBQP ended up being a hundred times more influential for quantum computing theory than things on which I expended a thousand times more effort.  So on balance, I guess I’m happy to call PostBQP my own.

That’s why today’s post comes with a special sense of intellectual responsibility.  Within the last month, it’s come to my attention that there are at least two embarrassing oversights in my PostBQP paper from a decade ago, one of them concerning the very definition of PostBQP.  I hasten to clarify: once one fixes up the definition, the PostBQP=PP theorem remains perfectly valid, and all the applications of PostBQP that I mentioned above—for example, to reproving Beigel-Reingold-Spielman, and to the hardness of quantum sampling problems—go through just fine.  But if you think I have nothing to be embarrassed about: well, read on.

The definitional subtlety came clearly to my attention a few weeks ago, when I was lecturing about PostBQP in my 6.845 Quantum Complexity Theory graduate class.  I defined PostBQP as the class of languages L⊆{0,1}* for which there exists a polynomial-time quantum Turing machine M such that, for all inputs x∈{0,1}*,

• M(x) “succeeds” (determined, say, by measuring its first output qubit in the {|0>,|1>} basis) with nonzero probability.
• If x∈L, then conditioned on M(x) succeeding, M(x) “accepts” (determined, say, by measuring its second output qubit in the {|0>,|1>} basis) with probability at least 2/3.
• If x∉L, then conditioned on M(x) succeeding, M(x) accepts with probability at most 1/3.
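In symbols, writing S for the event that M(x) succeeds and A for the event that it accepts (the letters are my shorthand, not notation from the paper), the three conditions read:

$$\Pr[S] > 0, \qquad x \in L \Rightarrow \Pr[A \mid S] \ge \tfrac{2}{3}, \qquad x \notin L \Rightarrow \Pr[A \mid S] \le \tfrac{1}{3}, \qquad \text{where } \Pr[A \mid S] = \frac{\Pr[A \wedge S]}{\Pr[S]}.$$

The division by Pr[S] is the place where the size of the success probability matters: the smaller Pr[S] is allowed to be, the more precisely the amplitudes have to be known for the ratio to be meaningful.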

I then had to reassure the students that PostBQP, so defined, was a “robust” class: that is, that the definition doesn’t depend on stupid things like which set of quantum gates we allow. I argued that, even though we’re postselecting on exponentially-unlikely events, it’s still OK, because the Solovay-Kitaev Theorem lets us approximate any desired unitary to within exponentially-small error, with only a polynomial increase in the size of our quantum circuit. (Here we actually need the full power of the Solovay-Kitaev Theorem, in contrast to ordinary BQP, where we only need part of the power.)

A student in the class, Michael Cohen, immediately jumped in with a difficulty: what if M(x) succeeded, not with exponentially-small probability, but with doubly-exponentially-small probability, say, $$\exp(-2^n)$$?  In that case, one could no longer use the Solovay-Kitaev Theorem to show the irrelevance of the gate set.  It would no longer even be clear that PostBQP⊆PP, since the PP simulation might not be able to keep track of such tiny probabilities.

Thinking on my feet, I replied that we could presumably choose a set of gates—for example, gates involving rational numbers only—for which doubly-exponentially-small probabilities would never arise.  Or if all else failed, we could simply add to the definition of PostBQP that M(x) had to “succeed” with probability at least 1/exp(n): after all, that was the only situation I ever cared about anyway, and the only one that ever arose in the applications of PostBQP.

But the question still gnawed at me: was there a problem with my original, unamended definition of PostBQP?  If we weren’t careful in choosing our gate set, could we have cancellations that produced doubly-exponentially-small probabilities?  I promised I’d think about it more.

By a funny coincidence, just a couple weeks later, Ciaran Lee, a student at Oxford, emailed me the exact same question.  So on a train ride from Princeton to Boston, I decided to think about it for real.  It wasn’t hard to show that, if the gates involved square roots of rational numbers only—for example, if we’re dealing with the Hadamard and Toffoli gates, or the cos(π/8) and CNOT gates, or other standard gate sets—then every measurement outcome has at least 1/exp(n) probability, so there’s no problem with the definition of PostBQP.  But I didn’t know what might happen with stranger gate sets.

As is my wont these days (when parenting, teaching, and so forth leave me with almost no time to concentrate on math), I posted the problem to MathOverflow.  Almost immediately, I got incisive responses.  First, Piotr Achinger pointed out that, if we allow arbitrary gates, then it’s easy to get massive cancellations.  In more detail, let $$\{a_n\}$$ be an extremely rapidly growing sequence of integers, say with $$a_{n+1} > \exp(a_n)$$.  Then define

$$\alpha = \sum_{n=1}^{\infty} 0.1^{a_n}.$$

If we write out α in decimal notation, it will consist of mostly 0’s, but with 1’s spaced further and further apart, like so: 0.1101000000000001000….  Now consider a gate set that involves α as well as 0.1 and -0.1 as matrix entries.  Given n qubits, it’s not hard to see that we can set up an interference experiment in which one of the paths leading to a given outcome E has amplitude α, and the other paths have amplitudes $$-(0.1^{a_1}), -(0.1^{a_2}), \ldots, -(0.1^{a_k}),$$ where k is the largest integer such that $$a_k \le n$$. In that case, the total amplitude of E will be about $$0.1^{a_{k+1}},$$ which for most values of n is doubly-exponentially small in n. Of course, by simply choosing a faster-growing sequence $$\{a_n\}$$, we can cause an even more severe cancellation.
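The cancellation is easy to see with exact rational arithmetic. The Python sketch below uses a short, truncated exponent sequence (1, 2, 4, 16, 256), chosen only to match the digits displayed above; it is a stand-in for the genuinely infinite, faster-growing sequence, and the function name is my own.

```python
from fractions import Fraction


def residual_amplitude(exponents, k):
    """Truncated alpha minus its first k terms, computed exactly.

    `exponents` plays the role of a_1, a_2, ...; the sum over all of them
    stands in for alpha, and subtracting the first k terms models the
    interfering paths with amplitudes -(0.1^{a_i}).
    """
    tenth = Fraction(1, 10)
    alpha = sum(tenth ** e for e in exponents)
    return alpha - sum(tenth ** e for e in exponents[:k])


if __name__ == "__main__":
    a = [1, 2, 4, 16, 256]  # illustrative truncation, not the real sequence
    leftover = residual_amplitude(a, 3)
    print(float(leftover))  # on the order of 0.1 ** 16
```

After cancelling the first three terms, what survives is dominated by the next term in the sequence, so the leftover amplitude is controlled entirely by how fast the exponents grow.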

Furthermore, by modifying the above construction to involve two crazy transcendental numbers α and β, I claim that we can set up a PostBQP computation such that deciding what happens is arbitrarily harder than PP (though still computable)—say, outside of exponential space, or even triple-exponential space. Moreover, we can do this despite the fact that the first n digits of α and β remain computable in O(n) time. The details are left as an exercise for the interested reader.

Yet even though we can engineer massive cancellations with crazy gates, I still conjectured that nothing would go wrong with “normal” gates: for example, gates involving algebraic amplitudes only. More formally, I conjectured that any finite set $$A=(a_1,\ldots,a_k)$$ of algebraic numbers is “tame,” in the sense that, if p is any degree-n polynomial with integer coefficients at most exp(n) in absolute value, then $$p(a_1,\ldots,a_k) \ne 0$$ implies $$|p(a_1,\ldots,a_k)| \ge 1/\exp(n)$$. And indeed, Julian Rosen on MathOverflow found an elegant proof of this fact. I’ll let you read it over there if you’re interested, but briefly, it interprets the amplitude we want as one particular Archimedean valuation of a certain element of a number field, and then lower-bounds the amplitude by considering the product of all Archimedean and non-Archimedean valuations (the latter of which involves the p-adic numbers). Since this was a bit heavy-duty for me, I was grateful when Will Sawin reformulated the proof in linear-algebraic terms that I understood.

And then came the embarrassing part. A few days ago, I was chatting with Greg Kuperberg, the renowned mathematician and author of our climate-change parable. I thought he’d be interested in this PostBQP progress, so I mentioned it to him. Delicately, Greg let me know that he had recently proved the exact same results, for the exact same reason (namely, fixing the definition of PostBQP), for the latest revision of his paper How Hard Is It to Approximate the Jones Polynomial?. Moreover, he actually wrote to me in June to tell me about this! At the time, however, I regarded it as “pointless mathematical hairsplitting” (who cares about these low-level gate-set issues anyway?). So I didn’t pay it any attention—and then I’d completely forgotten about Greg’s work when the question resurfaced a few months later. This is truly a just punishment for looking down on “mathematical hairsplitting,” and not a lesson I’ll soon forget.

Anyway, Greg’s paper provides yet a third proof that the algebraic numbers are tame, this one using Galois conjugates (though it turns out that, from a sufficiently refined perspective, Greg’s proof is equivalent to the other two).
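To see the flavor of the conjugate argument in the simplest possible case (this is my own toy illustration, not a summary of Greg’s proof), take the single algebraic number √2. Any integer polynomial in √2 collapses to u+v√2 for integers u and v. If that value is nonzero, then its product with the conjugate u−v√2 is the nonzero integer u²−2v², which has absolute value at least 1, so

$$|u + v\sqrt{2}| = \frac{|u^2 - 2v^2|}{|u - v\sqrt{2}|} \ge \frac{1}{|u| + \sqrt{2}\,|v|}.$$

So if u and v are at most exp(n) in absolute value, the value can’t be smaller than about 1/(3 exp(n)): a nonzero amplitude can be exponentially small, but no smaller than that.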

There remains one obvious open problem here, one that I noted in the MathOverflow post and in which Greg is also extremely interested. Namely, we now know that it’s possible to screw up PostBQP using gates with amplitudes that are crazy transcendental numbers (closely related to the Liouville numbers). And we also know that, if the gates have algebraic amplitudes, then everything is fine: all events have at least 1/exp(n) probability. But what if the gates have not-so-crazy transcendental amplitudes, like 1/e, or (a bit more realistically) cos(2)?  I conjecture that everything is still fine, but the proof techniques that worked for the algebraic case seem useless here.

Stepping back, how great are the consequences of all this for our understanding of PostBQP? Fortunately, I claim that they’re not that great, for the following reason. As Adleman, DeMarrais, and Huang already noted in 1997—in the same paper that proved BQP⊆PP—we can screw up the definition even of BQP, let alone PostBQP, using a bizarre enough gate set. For example, suppose we had a gate G that mapped |0> to x|0>+y|1>, where y was a real number whose binary expansion encoded the halting problem (for example, y might equal Chaitin’s Ω).  Then by applying G more and more times, we could learn more and more bits of y, and thereby solve an uncomputable problem in the limit n→∞.
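As a rough classical analogue of “applying G more and more times,” the sketch below treats the gate as a coin whose heads probability is a hidden number y and asks how many flips pin y down to a given number of bits. The function names, the Hoeffding-bound sample count, and the 1% failure probability are all assumptions made for illustration; nothing here models the actual quantum gate.

```python
import math
import random


def samples_for_bits(bits: int, failure_prob: float = 0.01) -> int:
    """Flips needed (by Hoeffding's inequality) so that the observed
    frequency is within 2^-(bits+1) of the true bias, except with
    probability at most failure_prob."""
    eps = 2.0 ** -(bits + 1)
    return math.ceil(math.log(2.0 / failure_prob) / (2.0 * eps * eps))


def estimate_bias(p: float, n: int, seed: int = 0) -> float:
    """Flip a coin with heads probability p a total of n times
    and return the observed fraction of heads."""
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n) if rng.random() < p)
    return heads / n


if __name__ == "__main__":
    for bits in (1, 4, 8):
        print(bits, samples_for_bits(bits))
    print(estimate_bias(0.3, 100_000))
```

The sample count grows by roughly a factor of 4 per extra bit, so each additional bit of y is more expensive than the last; that is consistent with the post's phrasing that the uncomputable problem is solved only in the limit.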

Faced with this observation, most quantum computing experts would say something like: “OK, but this is silly! It has no physical relevance, since we’ll never come across a magical gate like G—if only we did! And at any rate, it has nothing to do with quantum computing specifically: even classically, one could imagine a coin that landed heads with probability equal to Chaitin’s Ω. Therefore, the right way to deal with this is simply to define BQP in such a way as to disallow such absurd gates.” And indeed, that is what’s done today—usually without even remarking on it.

Now, it turns out that even gates that are “perfectly safe” for defining BQP can turn “unsafe” when it comes to defining PostBQP. To screw up the definition of PostBQP, it’s not necessary that a gate involve uncomputable (or extremely hard-to-compute) amplitudes: the amplitudes could all be easily computable, but they could still be “unsafe” because of massive cancellations, as in the example above involving α. But one could think of this as a difference of degree, rather than of kind. It’s still true that there’s a large set of gates, including virtually all the gates anyone has ever cared about in practice (Toffoli, Hadamard, π/8, etc. etc.), that are perfectly safe for defining the complexity class; it’s just that the set is slightly smaller than it was for BQP.

The other issue with the PostBQP=PP paper was discovered by Ryan O’Donnell and Cem Say.  In Proposition 3 of the paper, I claim that PostBQP = BQP^PostBQP_{||,classical}, where the latter is the class of problems solvable by a BQP machine that’s allowed to make poly(n) parallel, classical queries to a PostBQP oracle.  As Ryan pointed out to me, nothing in my brief argument for this depended on quantum mechanics, so it would equally well show that PostBPP = BPP^PostBPP_{||}, where PostBPP (also known as BPP_path) is the classical analogue of PostBQP, and BPP^PostBPP_{||} is the class of problems solvable by a BPP machine that can make poly(n) parallel queries to a PostBPP oracle.  But BPP^PostBPP_{||} clearly contains BPP^NP_{||}, which in turn contains AM, so we would get AM in PostBPP, and therefore AM in PostBQP=PP.  But Vereshchagin gave an oracle relative to which AM is not contained in PP.  Since there was no nonrelativizing ingredient anywhere in my argument, the only possible conclusion is that my argument was wrong.  (This, incidentally, provides a nice illustration of the value of oracle results.)

In retrospect, it’s easy to pinpoint what went wrong.  If we try to simulate BPP^PostBPP_{||} in PostBPP, our random bits will be playing a dual role: in choosing the queries to be submitted to the PostBPP oracle, and in providing the “raw material for postselection,” in computing the responses to those queries.  But in PostBPP, we only get to postselect once.  When we do, the two sets of random bits that we’d wanted to keep separate will get hopelessly mixed up, with the postselection acting on the “BPP” random bits, not just on the “PostBPP” ones.
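A toy calculation makes the mixing visible. In the sketch below, one “BPP” bit r picks which of two queries to ask, and three “PostBPP” bits s decide whether postselection succeeds. The two success conditions (probability 1/2 for query 0 and 1/8 for query 1) are hypothetical numbers chosen only so the effect is easy to see; this is exact enumeration of a made-up example, not a simulation of either complexity class.

```python
from fractions import Fraction
from itertools import product


def accepts(r: int, s: tuple) -> bool:
    """Hypothetical postselection flag for query r, given the bits s."""
    if r == 0:
        return s[0] == 0       # succeeds for 4 of the 8 settings of s
    return s == (0, 0, 0)      # succeeds for 1 of the 8 settings of s


def marginal_of_r_after_single_postselection() -> dict:
    """Distribution of the query-choosing bit r after conditioning
    once on the joint event that postselection succeeded."""
    weight = {0: Fraction(0), 1: Fraction(0)}
    for r in (0, 1):
        for s in product((0, 1), repeat=3):
            if accepts(r, s):
                # r is uniform over 2 values, s uniform over 8 values
                weight[r] += Fraction(1, 2) * Fraction(1, 8)
    total = sum(weight.values())
    return {r: w / total for r, w in weight.items()}


if __name__ == "__main__":
    print(marginal_of_r_after_single_postselection())
```

Before postselection, r is a fair coin. After the single joint postselection it equals 0 with probability 4/5, because the easier-to-satisfy query is favored. A real oracle would have answered each query separately and left r untouched, which is exactly the separation the one-shot simulation loses.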

How can we fix this problem?  Well, when defining the class BQP^PostBQP_{||,classical}, suppose we require the queries to the PostBQP oracle to be not only “classical,” but deterministic: that is, they have to be generated in advance by a P machine, and can’t depend on any random bits whatsoever.  And suppose we define BPP^PostBPP_{||,classical} similarly.  In that case, it’s not hard to see that the equalities BQP^PostBQP_{||,classical} = PostBQP and BPP^PostBPP_{||,classical} = PostBPP both go through.  You don’t actually care about this, do you?  But Ryan O’Donnell and Cem Say did, and that’s good enough for me.

I wish I could say that these are the only cases of mistakes recently being found in decade-old papers of mine, but alas, such is not the case.  In the near future, my student Adam Bouland, MIT undergrad Mitchell Lee, and Singapore’s Joe Fitzsimons will post to the arXiv a paper that grew out of an error in my 2005 paper Quantum Computing and Hidden Variables. In that paper, I introduced a hypothetical generalization of the quantum computing model, in which one gets to see the entire trajectory of a hidden variable, rather than just a single measurement outcome. I showed that this generalization would let us solve problems somewhat beyond what we think we can do with a “standard” quantum computer. In particular, we could solve the collision problem in O(1) queries, efficiently solve Graph Isomorphism (and all other problems in the Statistical Zero-Knowledge class), and search an N-element list in only ~N^(1/3) steps, rather than the ~N^(1/2) steps of Grover’s search algorithm. That part of the paper remains fine!
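To get a feel for the ~N^(1/2) versus ~N^(1/3) scales, the sketch below uses the standard closed form for Grover's algorithm with a single marked item: after k iterations the success probability is sin²((2k+1)θ), where sin θ = 1/√N. The function names are made up for this example, and the hidden-variable search algorithm itself is not simulated; N^(1/3) is simply printed alongside for comparison.

```python
import math


def grover_iterations(n_items: int) -> int:
    """Iteration count that brings the success probability closest to 1,
    assuming exactly one marked item among n_items."""
    theta = math.asin(1.0 / math.sqrt(n_items))
    return max(1, round(math.pi / (4.0 * theta) - 0.5))


def grover_success_probability(n_items: int, iterations: int) -> float:
    """Probability of measuring the marked item after the given number
    of Grover iterations (single marked item)."""
    theta = math.asin(1.0 / math.sqrt(n_items))
    return math.sin((2 * iterations + 1) * theta) ** 2


if __name__ == "__main__":
    n = 10 ** 6
    k = grover_iterations(n)
    print("Grover iterations:", k)
    print("success probability:", grover_success_probability(n, k))
    print("N^(1/3) for comparison:", round(n ** (1.0 / 3.0)))
```

For a million-element list this gives several hundred Grover iterations against a cube root of 100, which is the size of the gap the hidden-variable model would buy.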

On the other hand, at the end of the paper, I also gave a brief argument to show that, even in the hidden-variable model, ~N^(1/3) steps are required to search an N-element list. But Mitchell Lee and Adam Bouland discovered that that argument is wrong: it fails to account for all the possible ways that an algorithm could exploit the correlations between the hidden variable’s values at different moments in time. (I’ve previously discussed this error in other blog posts, as well as in the latest edition of Quantum Computing Since Democritus.)

If we suitably restrict the hidden-variable theory, then we can correctly prove a lower bound of ~N^(1/4), or even (with strong enough assumptions) ~N^(1/3); and we do that in the forthcoming paper. Even with no restrictions, as far as we know an ~N^(1/3) lower bound for search with hidden variables remains true. But it now looks like proving it will require a major advance in our understanding of hidden-variable theories: for example, a proof that the “Schrödinger theory” is robust to small perturbations, which I’d given as the main open problem in my 2005 paper.

As if that weren’t enough, in my 2003 paper Quantum Certificate Complexity, I claimed (as a side remark) that one could get a recursive Boolean function f with an asymptotic gap between the block sensitivity bs(f) and the randomized certificate complexity RC(f). However, two and a half years ago, Avishay Tal discovered that this didn’t work, because block sensitivity doesn’t behave nicely under composition.  (In assuming it did, I was propagating an error introduced earlier by Wegener and Zádori.)  More broadly, Avishay showed that there is no recursively-defined Boolean function with an asymptotic gap between bs(f) and RC(f). On the other hand, if we just want some Boolean function with an asymptotic gap between bs(f) and RC(f), then Raghav Kulkarni observed that we can use a non-recursive function introduced by Xiaoming Sun, which yields bs(f)≈N^(3/7) and RC(f)≈N^(4/7). This is actually a larger separation than the one I’d wrongly claimed.

Now that I’ve come clean about all these things, hopefully the healing can begin at last.

### Lens of Computation on the Sciences

Tuesday, November 25th, 2014

This weekend, the Institute for Advanced Study in Princeton hosted a workshop on the “Lens of Computation in the Sciences,” which was organized by Avi Wigderson, and was meant to showcase theoretical computer science’s imperialistic ambitions to transform every other field.  I was proud to speak at the workshop, representing CS theory’s designs on physics.  But videos of all four of the talks are now available, and all are worth checking out.

Unfortunately, the videos were slow to buffer when I last tried them.  While you’re waiting, you could also check my PowerPoint slides, though they overlap considerably with my previous talks.  (As always, if you can’t read PowerPoint, then go ask another reader of this blog to convert the file into a format you like.)

Thanks so much to Avi, and everyone else at IAS, for organizing an awesome workshop!

### What does the NSA think of academic cryptographers? Recently-declassified document provides clues

Sunday, November 16th, 2014

Brighten Godfrey was one of my officemates when we were grad students at Berkeley.  He’s now a highly-successful computer networking professor at the University of Illinois Urbana-Champaign, where he studies the wonderful question of how we could get the latency of the Internet down to the physical limit imposed by the finiteness of the speed of light.  (Right now, we’re away from that limit by a factor of about 50.)

Last week, Brighten brought to my attention a remarkable document: a 1994 issue of CryptoLog, an NSA internal newsletter, which was recently declassified with a few redactions.  The most interesting thing in the newsletter is a trip report (pages 12-19 in the newsletter, 15-22 in the PDF file) by an unnamed NSA cryptographer, who attended the 1992 EuroCrypt conference, and who details his opinions on just about every talk.  If you’re interested in crypto, you really need to read this thing all the way through, but here’s a small sampling of the zingers:

• Three of the last four sessions were of no value whatever, and indeed there was almost nothing at Eurocrypt to interest us (this is good news!). The scholarship was actually extremely good; it’s just that the directions which external cryptologic researchers have taken are remarkably far from our own lines of interest.
• There were no proposals of cryptosystems, no novel cryptanalysis of old designs, even very little on hardware design. I really don’t see how things could have been any better for our purposes. We can hope that the absentee cryptologists stayed away because they had no new ideas, or even that they’ve taken an interest in other areas of research.
• Alfredo DeSantis … spoke on “Graph decompositions and secret-sharing schemes,” a silly topic which brings joy to combinatorists and yawns to everyone else.
• Perhaps it is beneficial to be attacked, for you can easily augment your publication list by offering a modification.
• This result has no cryptanalytic application, but it serves to answer a question which someone with nothing else to think about might have asked.
• I think I have hammered home my point often enough that I shall regard it as proved (by emphatic enunciation): the tendency at IACR meetings is for academic scientists (mathematicians, computer scientists, engineers, and philosophers masquerading as theoretical computer scientists) to present commendable research papers (in their own areas) which might affect cryptology at some future time or (more likely) in some other world. Naturally this is not anathema to us.
• The next four sessions were given over to philosophical matters. Complexity theorists are quite happy to define concepts and then to discuss them even though they have no examples of them.
• Don Beaver (Penn State), in another era, would have been a spellbinding charismatic preacher; young, dashing (he still wears a pony-tail), self-confident and glib, he has captured from Silvio Micali the leadership of the philosophic wing of the U.S. East Coast cryptanalytic community.
• Those of you who know my prejudice against the “zero-knowledge” wing of the philosophical camp will be surprised to hear that I enjoyed the three talks of the session better than any of that ilk that I had previously endured. The reason is simple: I took along some interesting reading material and ignored the speakers. That technique served to advantage again for three more snoozers, Thursday’s “digital signature and electronic cash” session, but the final session, also on complexity theory, provided some sensible listening.
• But it is refreshing to find a complexity theory talk which actually addresses an important problem!
• The other two talks again avoided anything of substance.  [The authors of one paper] thought it worthwhile, in dealing [with] the general discrete logarithm problem, to prove that the problem is contained in the complexity classes NP and co-AM, but is unlikely to be in co-NP.
• And Ueli Maurer, again dazzling us with his brilliance, felt compelled, in “Factoring with an Oracle” to arm himself with an Oracle (essentially an Omniscient Being that complexity theorists like to turn to when they can’t solve a problem) while factoring. He’s calculating the time it would take him (and his Friend) to factor, and would like also to demonstrate his independence by consulting his Partner as seldom as possible. The next time you find yourself similarly equipped, you will perhaps want to refer to his paper.
• The conference again offered an interesting view into the thought processes of the world’s leading “cryptologists.” It is indeed remarkable how far the Agency has strayed from the True Path.

Of course, it would be wise not to read too much into this: it’s not some official NSA policy statement, but the griping of a single, opinionated individual somewhere within the NSA, who was probably bored and trying to amuse his colleagues.  All the same, it’s a fascinating document, not only for its zingers about people who are still very much active on the cryptographic scene, but also for its candid insights into what the NSA cares about and why, and for its look into the subculture within cryptography that would lead, years later, to Neal Koblitz’s widely-discussed anti-provable-security manifestos.

Reading this document drove home for me that the “provable security wars” are a very simple matter of the collision of two communities with different intellectual goals, not of one being right and the other being wrong.  Here’s a fun exercise: try reading this trip report while remembering that, in the 1980s—i.e., the decade immediately preceding the maligned EuroCrypt conference—the “philosophic wing” of cryptography that the writer lampoons actually succeeded in introducing revolutionary concepts (interactive proofs, zero-knowledge, cryptographic pseudorandomness, etc.) that transformed the field, concepts that have now been recognized with no fewer than three Turing Awards (to Yao, Goldwasser, and Micali).  On the other hand, it’s undoubtedly true that this progress was of no immediate interest to the NSA.  On the third hand, the “philosophers” might reply that helping the NSA wasn’t their goal.  The best interests of the NSA don’t necessarily coincide with the best interests of scientific advancement (not to mention the best interests of humanity—but that’s a separate debate).

### Der Quantencomputer

Friday, November 14th, 2014

### Speaking Truth to Parallelism at Cornell

Friday, October 3rd, 2014

This week I was at my alma mater, Cornell, to give a talk at the 50th anniversary celebration of its computer science department.  You can watch the streaming video here; my talk runs from roughly 1:17:30 to 1:56 (though if you’ve seen other complexity/physics/humor shows by me, this one is pretty similar, except for the riff about Cornell at the beginning).

Along with the 50th anniversary celebration, Bill Gates was also on campus to dedicate Bill and Melinda Gates Hall, the new home of Cornell’s CS department.  Click here for streaming video of a Q&A that Gates did with Cornell students, where I thought he acquitted himself quite well, saying many sensible things about education, the developing world, etc. that other smart people could also say, but that have extra gravitas coming from him.  Gates has also become extremely effective at wrapping barbs of fact inside a soft mesh of politically-unthreatening platitudes—but listen carefully and you’ll hear the barbs.  The amount of pomp and preparation around Gates’s visit reminded me of when President Obama visited MIT, befitting the two men’s approximately equal power.  (Obama has nuclear weapons, but then again, he also has Congress.)

And no, I didn’t get to meet Gates or shake his hand, though I did get to stand about ten feet from him at the Gates Hall dedication.  (He apparently spent most of his time at Cornell meeting with plant breeders, and other people doing things relevant to the Gates Foundation’s interests.)

Thanks so much to Bobby and Jon Kleinberg, and everyone else who invited me to this fantastic event and helped make it happen.  May Cornell’s CS department have a great next 50 years.

One last remark before I close this post.  Several readers have expressed disapproval and befuddlement over the proposed title of my next book, “Speaking Truth to Parallelism.”  In the words of commenter TonyK:

That has got to be the worst title in the history of publishing! “Speaking Truth to Parallelism”? It doesn’t even make sense! I count myself as one of your fans, Scott, but you’re going to have to do better than that if you want anybody else to buy your book. I know you can do better — witness “Quantum Computing Since Democritus”.

However, my experiences at Cornell this week helped to convince me that, not only does “Speaking Truth to Parallelism” make perfect sense, it’s an activity that’s needed now more than ever.  What it means, of course, is fighting a certain naïve, long-ago-debunked view of quantum computers—namely, that they would achieve exponential speedups by simply “trying every possible answer in parallel”—that’s become so entrenched in the minds of many journalists, laypeople, and even scientists from other fields that it feels like nothing you say can possibly dislodge it.  The words out of your mouth will literally be ignored, misheard, or even contorted to the opposite of what they mean, if that’s what it takes to preserve the listener’s misconception about quantum computers being able to solve NP-hard optimization problems by sheer magic.  (Much like in the Simpsons-visit-Australia episode, where Marge’s request for “coffee” is misheard over and over as “beer.”)  You probably think I’m exaggerating, and I’d agree with you—if I hadn’t experienced this phenomenon hundreds of times over the last decade.

So, to take one example: after my talk at Cornell, an audience member came up to me to say that it was a wonderful talk, but that what he really wanted to know was whether I thought quantum computers could solve problems in the “NP space” in linear time, by trying all the possible solutions at once.  He didn’t seem to realize that I’d spent the entire previous half hour answering that exact question, explaining why the answer was “no.”  Coincidentally, this week I also got an email from a longtime reader of this blog, saying that he read and loved Quantum Computing Since Democritus, and wanted my feedback on a popular article he’d written about quantum computing.  What was the gist of the article?  You guessed it: “quantum computing = generic exponential speedups for optimization, machine learning, and Big Data problems, by trying all the possible answers at once.”

These people’s enthusiasm for quantum computing tends to be so genuine, so sincere, that I find myself unable to blame them—even when they’ve done the equivalent of going up to Richard Dawkins and thanking him for having taught them that evolution works for the good of the entire species, just as its wise Designer intended.  I do blame the media and other careless or unscrupulous parties for misleading people about quantum computing, but most of all I blame myself, for not making my explanations clear enough.  In the end, then, meeting the “NP space” folks only makes me want to redouble my efforts to Speak Truth to Parallelism: eventually, I feel, the nerd world will get this point.

Update (Oct. 4): I had regarded this (perhaps wrongly) as too obvious to state, but particularly for non-native English speakers, I’d better clarify: “speaking truth to parallelism” is a deliberate pun on the left-wing protester phrase “speaking truth to power.”  So whatever linguistic oddness there is in my phrase, I’d say it simply inherits from the original.

Another Update (Oct. 7): See this comment for my short summary of what’s known about the actual technical question (can quantum computers solve NP-complete problems in polynomial time, or not?).

Another Update (Oct. 8): Many commenters wrote to point out that the video of my talk at Cornell is now password-protected, and no longer publicly available.  I wrote to my contacts at Cornell to ask about this, and they said they’re planning to release lightly-edited versions of the videos soon, but will look into the matter in the meantime.