Archive for the ‘Complexity’ Category

PostBQP Postscripts: A Confession of Mathematical Errors

Sunday, November 30th, 2014

tl;dr: This post reveals two errors in one of my most-cited papers, and also explains how to fix them.  Thanks to Piotr Achinger, Michael Cohen, Greg Kuperberg, Ciaran Lee, Ryan O’Donnell, Julian Rosen, Will Sawin, Cem Say, and others for their contributions to this post.

If you look at my Wikipedia page, apparently one of the two things in the world that I’m “known for” (along with algebrization) is “quantum Turing with postselection.”  By this, Wikipedia means my 2004 definition of the complexity class PostBQP—that is, the class of decision problems solvable in bounded-error quantum polynomial time, assuming the ability to postselect (or condition) on certain measurement outcomes—and my proof that PostBQP coincides with the classical complexity PP (that is, the class of decision problems expressible in terms of whether the number of inputs that cause a given polynomial-time Turing machine to accept does or doesn’t exceed some threshold).

To explain this a bit: even without quantum mechanics, it’s pretty obvious that, if you could “postselect” on exponentially-unlikely events, then you’d get huge, unrealistic amounts of computational power.  For example (and apologies in advance for the macabre imagery), you could “solve” NP-complete problems in polynomial time by simply guessing a random solution, then checking whether the solution is right, and shooting yourself if it happened to be wrong!  Conditioned on still being alive (and if you like, appealing to the “anthropic principle”), you must find yourself having guessed a valid solution—assuming, of course, that there were any valid solutions to be found.  If there weren’t any, then you’d seem to be out of luck!  (Exercise for the reader: generalize this “algorithm,” so that it still works even if you don’t know in advance whether your NP-complete problem instance has any valid solutions.)

So with the PostBQP=PP theorem, the surprise was not that postselection gives you lots of computational power, but rather that postselection combined with quantum mechanics gives you much more power even than postselection by itself (or quantum mechanics by itself, for that matter).  Since PPP=P#P, the class PP basically captures the full difficulty of #P-complete counting problems—that is, not just solving an NP-complete problem, but counting how many solutions it has.  It’s not obvious that a quantum computer with postselection can solve counting problems, but that’s what the theorem shows.  That, in turn, has implications for other things: for example, I showed it can be used to prove classical facts about PP, like the fact that PP is closed under intersection (the Beigel-Reingold-Spielman Theorem), in a straightforward way; and it’s also used to show the hardness of quantum sampling problems, in the work of Bremner-Jozsa-Shepherd as well as my BosonSampling work with Arkhipov.

I’m diffident about being “known for” something so simple; once I had asked the question, the proof of PostBQP=PP took me all of an hour to work out.  Yet PostBQP ended up being a hundred times more influential for quantum computing theory than things on which I expended a thousand times more effort.  So on balance, I guess I’m happy to call PostBQP my own.

That’s why today’s post comes with a special sense of intellectual responsibility.  Within the last month, it’s come to my attention that there are at least two embarrassing oversights in my PostBQP paper from a decade ago, one of them concerning the very definition of PostBQP.  I hasten to clarify: once one fixes up the definition, the PostBQP=PP theorem remains perfectly valid, and all the applications of PostBQP that I mentioned above—for example, to reproving Beigel-Reingold-Spielman, and to the hardness of quantum sampling problems—go through just fine.  But if you think I have nothing to be embarrassed about: well, read on.

The definitional subtlety came clearly to my attention a few weeks ago, when I was lecturing about PostBQP in my 6.845 Quantum Complexity Theory graduate class.  I defined PostBQP as the class of languages L⊆{0,1}* for which there exists a polynomial-time quantum Turing machine M such that, for all inputs x∈{0,1}*,

• M(x) “succeeds” (determined, say, by measuring its first output qubit in the {|0>,|1>} basis) with nonzero probability.
• If x∈L, then conditioned on M(x) succeeding, M(x) “accepts” (determined, say, by measuring its second output qubit in the {|0>,|1>} basis) with probability at least 2/3.
• If x∉L, then conditioned on M(x) succeeding, M(x) accepts with probability at most 1/3.

I then had to reassure the students that PostBQP, so defined, was a “robust” class: that is, that the definition doesn’t depend on stupid things like which set of quantum gates we allow. I argued that, even though we’re postselecting on exponentially-unlikely events, it’s still OK, because the Solovay-Kitaev Theorem lets us approximate any desired unitary to within exponentially-small error, with only a polynomial increase in the size of our quantum circuit. (Here we actually need the full power of the Solovay-Kitaev Theorem, in contrast to ordinary BQP, where we only need part of the power.)

A student in the class, Michael Cohen, immediately jumped in with a difficulty: what if M(x) succeeded, not with exponentially-small probability, but with doubly-exponentially-small probability—say, exp(-2n)?  In that case, one could no longer use the Solovay-Kitaev Theorem to show the irrelevance of the gate set.  It would no longer even be clear that PostBQP⊆PP, since the PP simulation might not be able to keep track of such tiny probabilities.

Thinking on my feet, I replied that we could presumably choose a set of gates—for example, gates involving rational numbers only—for which doubly-exponentially-small probabilities would never arise.  Or if all else failed, we could simply add to the definition of PostBQP that M(x) had to “succeed” with probability at least 1/exp(n): after all, that was the only situation I ever cared about anyway, and the only one that ever arose in the applications of PostBQP.

But the question still gnawed at me: was there a problem with my original, unamended definition of PostBQP?  If we weren’t careful in choosing our gate set, could we have cancellations that produced doubly-exponentially-small probabilities?  I promised I’d think about it more.

By a funny coincidence, just a couple weeks later, Ciaran Lee, a student at Oxford, emailed me the exact same question.  So on a train ride from Princeton to Boston, I decided to think about it for real.  It wasn’t hard to show that, if the gates involved square roots of rational numbers only—for example, if we’re dealing with the Hadamard and Toffoli gates, or the cos(π/8) and CNOT gates, or other standard gate sets—then every measurement outcome has at least 1/exp(n) probability, so there’s no problem with the definition of PostBQP.  But I didn’t know what might happen with stranger gate sets.

As is my wont these days—when parenting, teaching, and so forth leave me with almost no time to concentrate on math—I posted the problem to MathOverflow.  Almost immediately, I got incisive responses.  First, Piotr Achinger pointed out that, if we allow arbitrary gates, then it’s easy to get massive cancellations.  In more detail, let {an} be extremely-rapidly growing sequence of integers, say with an+1 > exp(an).  Then define

$$\alpha = \sum_{n=1}^{\infty} 0.1^{a_n}.$$

If we write out α in decimal notation, it will consist of mostly 0’s, but with 1’s spaced further and further apart, like so: 0.1101000000000001000….  Now consider a gate set that involves α as well as 0.1 and -0.1 as matrix entries.  Given n qubits, it’s not hard to see that we can set up an interference experiment in which one of the paths leading to a given outcome E has amplitude α, and the other paths have amplitudes $$-(0.1^{a_1}), -(0.1^{a_2}), \ldots, -(0.1^{a_k}),$$ where k is the largest integer such that ak≤n. In that case, the total amplitude of E will be about $$0.1^{a_{k+1}},$$ which for most values of n is doubly-exponentially small in n. Of course, by simply choosing a faster-growing sequence {an}, we can cause an even more severe cancellation.

Furthermore, by modifying the above construction to involve two crazy transcendental numbers α and β, I claim that we can set up a PostBQP computation such that deciding what happens is arbitrarily harder than PP (though still computable)—say, outside of exponential space, or even triple-exponential space. Moreover, we can do this despite the fact that the first n digits of α and β remain computable in O(n) time. The details are left as an exercise for the interested reader.

Yet even though we can engineer massive cancellations with crazy gates, I still conjectured that nothing would go wrong with “normal” gates: for example, gates involving algebraic amplitudes only. More formally, I conjectured that any finite set A=(a1,…,ak) of algebraic numbers is “tame,” in the sense that, if p is any degree-n polynomial with integer coefficients at most exp(n) in absolute value, then p(a1,…,ak)≠0 implies |p(a1,…,ak)|≥1/exp(n). And indeed, Julian Rosen on MathOverflow found an elegant proof of this fact. I’ll let you read it over there if you’re interested, but briefly, it interprets the amplitude we want as one particular Archimedean valuation of a certain element of a number field, and then lower-bounds the amplitude by considering the product of all Archimedean and non-Archimedean valuations (the latter of which involves the p-adic numbers). Since this was a bit heavy-duty for me, I was grateful when Will Sawin reformulated the proof in linear-algebraic terms that I understood.

And then came the embarrassing part. A few days ago, I was chatting with Greg Kuperberg, the renowned mathematician and author of our climate-change parable. I thought he’d be interested in this PostBQP progress, so I mentioned it to him. Delicately, Greg let me know that he had recently proved the exact same results, for the exact same reason (namely, fixing the definition of PostBQP), for the latest revision of his paper How Hard Is It to Approximate the Jones Polynomial?. Moreover, he actually wrote to me in June to tell me about this! At the time, however, I regarded it as “pointless mathematical hairsplitting” (who cares about these low-level gate-set issues anyway?). So I didn’t pay it any attention—and then I’d completely forgotten about Greg’s work when the question resurfaced a few months later. This is truly a just punishment for looking down on “mathematical hairsplitting,” and not a lesson I’ll soon forget.

Anyway, Greg’s paper provides yet a third proof that the algebraic numbers are tame, this one using Galois conjugates (though it turns out that, from a sufficiently refined perspective, Greg’s proof is equivalent to the other two).

There remains one obvious open problem here, one that I noted in the MathOverflow post and in which Greg is also extremely interested. Namely, we now know that it’s possible to screw up PostBQP using gates with amplitudes that are crazy transcendental numbers (closely related to the Liouville numbers). And we also know that, if the gates have algebraic amplitudes, then everything is fine: all events have at least 1/exp(n) probability. But what if the gates have not-so-crazy transcendental amplitudes, like 1/e, or (a bit more realistically) cos(2)?  I conjecture that everything is still fine, but the proof techniques that worked for the algebraic case seem useless here.

Stepping back, how great are the consequences of all this for our understanding of PostBQP? Fortunately, I claim that they’re not that great, for the following reason. As Adleman, DeMarrais, and Huang already noted in 1997—in the same paper that proved BQP⊆PP—we can screw up the definition even of BQP, let alone PostBQP, using a bizarre enough gate set. For example, suppose we had a gate G that mapped |0> to x|0>+y|1>, where y was a real number whose binary expansion encoded the halting problem (for example, y might equal Chaitin’s Ω).  Then by applying G more and more times, we could learn more and more bits of y, and thereby solve an uncomputable problem in the limit n→∞.

Faced with this observation, most quantum computing experts would say something like: “OK, but this is silly! It has no physical relevance, since we’ll never come across a magical gate like G—if only we did! And at any rate, it has nothing to do with quantum computing specifically: even classically, one could imagine a coin that landed heads with probability equal to Chaitin’s Ω. Therefore, the right way to deal with this is simply to define BQP in such a way as to disallow such absurd gates.” And indeed, that is what’s done today—usually without even remarking on it.

Now, it turns out that even gates that are “perfectly safe” for defining BQP, can turn “unsafe” when it comes to defining PostBQP. To screw up the definition of PostBQP, it’s not necessary that a gate involve uncomputable (or extremely hard-to-compute) amplitudes: the amplitudes could all be easily computable, but they could still be “unsafe” because of massive cancellations, as in the example above involving α. But one could think of this as a difference of degree, rather than of kind. It’s still true that there’s a large set of gates, including virtually all the gates anyone has ever cared about in practice (Toffoli, Hadamard, π/8, etc. etc.), that are perfectly safe for defining the complexity class; it’s just that the set is slightly smaller than it was for BQP.

The other issue with the PostBQP=PP paper was discovered by Ryan O’Donnell and Cem Say.  In Proposition 3 of the paper, I claim that PostBQP = BQPPostBQP||,classical, where the latter is the class of problems solvable by a BQP machine that’s allowed to make poly(n) parallel, classical queries to a PostBQP oracle.  As Ryan pointed out to me, nothing in my brief argument for this depended on quantum mechanics, so it would equally well show that PostBPP = BPPPostBPP||, where PostBPP (also known as BPPpath) is the classical analogue of PostBQP, and BPPPostBPP|| is the class of problems solvable by a BPP machine that can make poly(n) parallel queries to a PostBPP oracle.  But BPPPostBPP|| clearly contains BPPNP||, which in turn contains AM—so we would get AM in PostBPP, and therefore AM in PostBQP=PP.  But Vereshchagin gave an oracle relative to which AM is not contained in PP.  Since there was no nonrelativizing ingredient anywhere in my argument, the only possible conclusion is that my argument was wrong.  (This, incidentally, provides a nice illustration of the value of oracle results.)

In retrospect, it’s easy to pinpoint what went wrong.  If we try to simulate BPPPostBPP|| in PostBPP, our random bits will be playing a dual role: in choosing the queries to be submitted to the PostBPP oracle, and in providing the “raw material for postselection,” in computing the responses to those queries.  But in PostBPP, we only get to postselect once.  When we do, the two sets of random bits that we’d wanted to keep separate will get hopelessly mixed up, with the postselection acting on the “BPP” random bits, not just on the “PostBPP” ones.

How can we fix this problem?  Well, when defining the class BQPPostBQP||,classical, suppose we require the queries to the PostBQP oracle to be not only “classical,” but deterministic: that is, they have to be generated in advance by a P machine, and can’t depend on any random bits whatsoever.  And suppose we define BPPPostBPP||,classical similarly.  In that case, it’s not hard to see that the equalities BQPPostBQP||,classical = PostBQP and BPPPostBPP||,classical = PostBPP both go through.  You don’t actually care about this, do you?  But Ryan O’Donnell and Cem Say did, and that’s good enough for me.

I wish I could say that these are the only cases of mistakes recently being found in decade-old papers of mine, but alas, such is not the case.  In the near future, my student Adam Bouland, MIT undergrad Mitchell Lee, and Singapore’s Joe Fitzsimons will post to the arXiv a paper that grew out of an error in my 2005 paper Quantum Computing and Hidden Variables. In that paper, I introduced a hypothetical generalization of the quantum computing model, in which one gets to see the entire trajectory of a hidden variable, rather than just a single measurement outcome. I showed that this generalization would let us solve problems somewhat beyond what we think we can do with a “standard” quantum computer. In particular, we could solve the collision problem in O(1) queries, efficiently solve Graph Isomorphism (and all other problems in the Statistical Zero-Knowledge class), and search an N-element list in only ~N1/3 steps, rather than the ~N1/2 steps of Grover’s search algorithm. That part of the paper remains fine!

On the other hand, at the end of the paper, I also gave a brief argument to show that, even in the hidden-variable model, ~N1/3 steps are required to search an N-element list. But Mitchell Lee and Adam Bouland discovered that that argument is wrong: it fails to account for all the possible ways that an algorithm could exploit the correlations between the hidden variable’s values at different moments in time. (I’ve previously discussed this error in other blog posts, as well as in the latest edition of Quantum Computing Since Democritus.)

If we suitably restrict the hidden-variable theory, then we can correctly prove a lower bound of ~N1/4, or even (with strong enough assumptions) ~N1/3; and we do that in the forthcoming paper. Even with no restrictions, as far as we know an ~N1/3 lower bound for search with hidden variables remains true. But it now looks like proving it will require a major advance in our understanding of hidden-variable theories: for example, a proof that the “Schrödinger theory” is robust to small perturbations, which I’d given as the main open problem in my 2005 paper.

As if that weren’t enough, in my 2003 paper Quantum Certificate Complexity, I claimed (as a side remark) that one could get a recursive Boolean function f with an asymptotic gap between the block sensitivity bs(f) and the randomized certificate complexity RC(f). However, two and a half years ago, Avishay Tal discovered that this didn’t work, because block sensitivity doesn’t behave nicely under composition.  (In assuming it did, I was propagating an error introduced earlier by Wegener and Zádori.)  More broadly, Avishay showed that there is no recursively-defined Boolean function with an asymptotic gap between bs(f) and RC(f). On the other hand, if we just want some Boolean function with an asymptotic gap between bs(f) and RC(f), then Raghav Kulkarni observed that we can use a non-recursive function introduced by Xiaoming Sun, which yields bs(f)≈N3/7 and RC(f)≈N4/7. This is actually a larger separation than the one I’d wrongly claimed.

Now that I’ve come clean about all these things, hopefully the healing can begin at last.

Lens of Computation on the Sciences

Tuesday, November 25th, 2014

This weekend, the Institute for Advanced Study in Princeton hosted a workshop on the “Lens of Computation in the Sciences,” which was organized by Avi Wigderson, and was meant to showcase theoretical computer science’s imperialistic ambitions to transform every other field.  I was proud to speak at the workshop, representing CS theory’s designs on physics.  But videos of all four of the talks are now available, and all are worth checking out:

Unfortunately, the videos were slow to buffer when I last tried it.  While you’re waiting, you could also check my PowerPoint slides, though they overlap considerably with my previous talks.  (As always, if you can’t read PowerPoint, then go ask another reader of this blog to convert the file into a format you like.)

Thanks so much to Avi, and everyone else at IAS, for organizing an awesome workshop!

What does the NSA think of academic cryptographers? Recently-declassified document provides clues

Sunday, November 16th, 2014

Brighten Godfrey was one of my officemates when we were grad students at Berkeley.  He’s now a highly-successful computer networking professor at the University of Illinois Urbana-Champaign, where he studies the wonderful question of how we could get the latency of the Internet down to the physical limit imposed by the finiteness of the speed of light.  (Right now, we’re away from that limit by a factor of about 50.)

Last week, Brighten brought to my attention a remarkable document: a 1994 issue of CryptoLog, an NSA internal newsletter, which was recently declassified with a few redactions.  The most interesting thing in the newsletter is a trip report (pages 12-19 in the newsletter, 15-22 in the PDF file) by an unnamed NSA cryptographer, who attended the 1992 EuroCrypt conference, and who details his opinions on just about every talk.  If you’re interested in crypto, you really need to read this thing all the way through, but here’s a small sampling of the zingers:

• Three of the last four sessions were of no value whatever, and indeed there was almost nothing at Eurocrypt to interest us (this is good news!). The scholarship was actually extremely good; it’s just that the directions which external cryptologic researchers have taken are remarkably far from our own lines of interest.
• There were no proposals of cryptosystems, no novel cryptanalysis of old designs, even very little on hardware design. I really don’t see how things could have been any better for our purposes. We can hope that the absentee cryptologists stayed away because they had no new ideas, or even that they’ve taken an interest in other areas of research.
• Alfredo DeSantis … spoke on “Graph decompositions and secret-sharing schemes,” a silly topic which brings joy to combinatorists and yawns to everyone else.
• Perhaps it is beneficial to be attacked, for you can easily augment your publication list by offering a modification.
• This result has no cryptanalytic application, but it serves to answer a question which someone with nothing else to think about might have asked.
• I think I have hammered home my point often enough that I shall regard it as proved (by emphatic enunciation): the tendency at IACR meetings is for academic scientists (mathematicians, computer scientists, engineers, and philosophers masquerading as theoretical computer scientists) to present commendable research papers (in their own areas) which might affect cryptology at some future time or (more likely) in some other world. Naturally this is not anathema to us.
• The next four sessions were given over to philosophical matters. Complexity theorists are quite happy to define concepts and then to discuss them even though they have no examples of them.
• Don Beaver (Penn State), in another era, would have been a spellbinding charismatic preacher; young, dashing (he still wears a pony-tail), self-confident and glib, he has captured from Silvio Micali the leadership of the philosophic wing of the U.S. East Coast cryptanalytic community.
• Those of you who know my prejudice against the “zero-knowledge” wing of the philosophical camp will be surprised to hear that I enjoyed the three talks of the session better than any of that ilk that I had previously endured. The reason is simple: I took along some interesting reading material and ignored the speakers. That technique served to advantage again for three more snoozers, Thursday’s “digital signature and electronic cash” session, but the final session, also on complexity theory, provided some sensible listening.
• But it is refreshing to find a complexity theory talk which actually addresses an important problem!
• The other two talks again avoided anything of substance.  [The authors of one paper] thought it worthwhile, in dealing [with] the general discrete logarithm problem, to prove that the problem is contained in the complexity classes NP and co-AM, but is unlikely to be in co-NP.
• And Ueli Maurer, again dazzling us with his brilliance, felt compelled, in “Factoring with an Oracle” to arm himself with an Oracle (essentially an Omniscient Being that complexity theorists like to turn to when they can’t solve a problem) while factoring. He’s calculating the time it would take him (and his Friend) to factor, and would like also to demonstrate his independence by consulting his Partner as seldom as possible. The next time you find yourself similarly equipped, you will perhaps want to refer to his paper.
• The conference again offered an interesting view into the thought processes of the world’s leading “cryptologists.” It is indeed remarkable how far the Agency has strayed from the True Path.

Of course, it would be wise not to read too much into this: it’s not some official NSA policy statement, but the griping of a single, opinionated individual somewhere within the NSA, who was probably bored and trying to amuse his colleagues.  All the same, it’s a fascinating document, not only for its zingers about people who are still very much active on the cryptographic scene, but also for its candid insights into what the NSA cares about and why, and for its look into the subculture within cryptography that would lead, years later, to Neal Koblitz’s widely-discussed anti-provable-security manifestos.

Reading this document drove home for me that the “provable security wars” are a very simple matter of the collision of two communities with different intellectual goals, not of one being right and the other being wrong.  Here’s a fun exercise: try reading this trip report while remembering that, in the 1980s—i.e., the decade immediately preceding the maligned EuroCrypt conference—the “philosophic wing” of cryptography that the writer lampoons actually succeeded in introducing revolutionary concepts (interactive proofs, zero-knowledge, cryptographic pseudorandomness, etc.) that transformed the field, concepts that have now been recognized with no fewer than three Turing Awards (to Yao, Goldwasser, and Micali).  On the other hand, it’s undoubtedly true that this progress was of no immediate interest to the NSA.  On the third hand, the “philosophers” might reply that helping the NSA wasn’t their goal.  The best interests of the NSA don’t necessarily coincide with the best interests of scientific advancement (not to mention the best interests of humanity—but that’s a separate debate).

Der Quantencomputer

Friday, November 14th, 2014

Speaking Truth to Parallelism at Cornell

Friday, October 3rd, 2014

This week I was at my alma mater, Cornell, to give a talk at the 50th anniversary celebration of its computer science department.  You can watch the streaming video here; my talk runs from roughly 1:17:30 to 1:56 (though if you’ve seen other complexity/physics/humor shows by me, this one is pretty similar, except for the riff about Cornell at the beginning).

Along with the 50th anniversary celebration, Bill Gates was also on campus to dedicate Bill and Melinda Gates Hall, the new home of Cornell’s CS department.  Click here for streaming video of a Q&A that Gates did with Cornell students, where I thought he acquitted himself quite well, saying many sensible things about education, the developing world, etc. that other smart people could also say, but that have extra gravitas coming from him.  Gates has also become extremely effective at wrapping barbs of fact inside a soft mesh of politically-unthreatening platitudes—but listen carefully and you’ll hear the barbs.  The amount of pomp and preparation around Gates’s visit reminded me of when President Obama visited MIT, befitting the two men’s approximately equal power.  (Obama has nuclear weapons, but then again, he also has Congress.)

And no, I didn’t get to meet Gates or shake his hand, though I did get to stand about ten feet from him at the Gates Hall dedication.  (He apparently spent most of his time at Cornell meeting with plant breeders, and other people doing things relevant to the Gates Foundation’s interests.)

Thanks so much to Bobby and Jon Kleinberg, and everyone else who invited me to this fantastic event and helped make it happen.  May Cornell’s CS department have a great next 50 years.

One last remark before I close this post.  Several readers have expressed disapproval and befuddlement over the proposed title of my next book, “Speaking Truth to Parallelism.”  In the words of commenter TonyK:

That has got to be the worst title in the history of publishing! “Speaking Truth to Parallelism”? It doesn’t even make sense! I count myself as one of your fans, Scott, but you’re going to have to do better than that if you want anybody else to buy your book. I know you can do better — witness “Quantum Computing Since Democritus”.

However, my experiences at Cornell this week helped to convince me that, not only does “Speaking Truth to Parallelism” make perfect sense, it’s an activity that’s needed now more than ever.  What it means, of course, is fighting a certain naïve, long-ago-debunked view of quantum computers—namely, that they would achieve exponential speedups by simply “trying every possible answer in parallel”—that’s become so entrenched in the minds of many journalists, laypeople, and even scientists from other fields that it feels like nothing you say can possibly dislodge it.  The words out of your mouth will literally be ignored, misheard, or even contorted to the opposite of what they mean, if that’s what it takes to preserve the listener’s misconception about quantum computers being able to solve NP-hard optimization problems by sheer magic.  (Much like in the Simpsons-visit-Australia episode, where Marge’s request for “coffee” is misheard over and over as “beer.”)  You probably think I’m exaggerating, and I’d agree with you—if I hadn’t experienced this phenomenon hundreds of times over the last decade.

So, to take one example: after my talk at Cornell, an audience member came up to me to say that it was a wonderful talk, but that what he really wanted to know was whether I thought quantum computers could solve problems in the “NP space” in linear time, by trying all the possible solutions at once.  He didn’t seem to realize that I’d spent the entire previous half hour answering that exact question, explaining why the answer was “no.”  Coincidentally, this week I also got an email from a longtime reader of this blog, saying that he read and loved Quantum Computing Since Democritus, and wanted my feedback on a popular article he’d written about quantum computing.  What was the gist of the article?  You guessed it: “quantum computing = generic exponential speedups for optimization, machine learning, and Big Data problems, by trying all the possible answers at once.”

These people’s enthusiasm for quantum computing tends to be so genuine, so sincere, that I find myself unable to blame them—even when they’ve done the equivalent of going up to Richard Dawkins and thanking him for having taught them that evolution works for the good of the entire species, just as its wise Designer intended.  I do blame the media and other careless or unscrupulous parties for misleading people about quantum computing, but most of all I blame myself, for not making my explanations clear enough.  In the end, then, meeting the “NP space” folks only makes me want to redouble my efforts to Speak Truth to Parallelism: eventually, I feel, the nerd world will get this point.

Update (Oct. 4): I had regarded this (perhaps wrongly) as too obvious to state, but particularly for non-native English speakers, I’d better clarify: “speaking truth to parallelism” is a deliberate pun on the left-wing protester phrase “speaking truth to power.”  So whatever linguistic oddness there is in my phrase, I’d say it simply inherits from the original.

Another Update (Oct. 7): See this comment for my short summary of what’s known about the actual technical question (can quantum computers solve NP-complete problems in polynomial time, or not?).

Another Update (Oct. 8): Many commenters wrote to point out that the video of my talk at Cornell is now password-protected, and no longer publicly available.  I wrote to my contacts at Cornell to ask about this, and they said they’re planning to release lightly-edited versions of the videos soon, but will look into the matter in the meantime.

Microsoft SVC

Tuesday, September 23rd, 2014

By now, the news that Microsoft abruptly closed its Silicon Valley research lab—leaving dozens of stellar computer scientists jobless—has already been all over the theoretical computer science blogosphere: see, e.g., Lance, Luca, Omer Reingold, Michael Mitzenmacher.  I never made a real visit to Microsoft SVC (only went there once IIRC, for a workshop, while a grad student at Berkeley); now of course I won’t have the chance.

The theoretical computer science community, in the Bay Area and elsewhere, is now mobilizing to offer visiting positions to the “refugees” from Microsoft SVC, until they’re able to find more permanent employment.  I was happy to learn, this week, that MIT’s theory group will likely play a small part in that effort.

Like many others, I confess to bafflement about Microsoft’s reasons for doing this.  Won’t the severe damage to MSR’s painstakingly-built reputation, to its hiring and retention of the best people, outweigh the comparatively small amount of money Microsoft will save?  Did they at least ask Mr. Gates, to see whether he’d chip in the proverbial change under his couch cushions to keep the lab open?  Most of all, why the suddenness?  Why not wind the lab down over a year, giving the scientists time to apply for new jobs in the academic hiring cycle?  It’s not like Microsoft is in a financial crisis, lacking the cash to keep the lights on.

Yet one could also view this announcement as a lesson in why academia exists and is necessary.  Yes, one should applaud those companies that choose to invest a portion of their revenue in basic research—like IBM, the old AT&T, or Microsoft itself (which continues to operate great research outfits in Redmond, Santa Barbara, both Cambridges, Beijing, Bangalore, Munich, Cairo, and Herzliya).  And yes, one should acknowledge the countless times when academia falls short of its ideals, when it too places the short term above the long.  All the same, it seems essential that our civilization maintain institutions for which the pursuit and dissemination of knowledge are not just accoutrements for when financial times are good and the Board of Directors is sympathetic, but are the institution’s entire reasons for being: those activities that the institution has explicitly committed to support for as long as it exists.

Do theoretical computer scientists despise practitioners? (Answer: no, that’s crazy)

Thursday, August 28th, 2014

A roboticist and Shtetl-Optimized fan named Jon Groff recently emailed me the following suggestion for a blog entry:

I think a great idea for an entry would be the way that in fields like particle physics the theoreticians and experimentalists get along quite well but in computer science and robotics in particular there seems to be a great disdain for the people that actually do things from the people that like to think about them. Just thought I’d toss that out there in case you are looking for some subject matter.

After I replied (among other things, raising my virtual eyebrows over his rosy view of the current state of theoretician/experimentalist interaction in particle physics), Jon elaborated on his concerns in a subsequent email:

[T]here seems to be this attitude in CS that getting your hands dirty is unacceptable. You haven’t seen it because you sit a lofty heights and I tend to think you always have. I have been pounding out code since ferrite cores. Yes, Honeywell 1648A, so I have been looking up the posterior of this issue rather than from the forehead as it were. I guess my challenge would be to find a noteworthy computer theoretician somewhere and ask him:
1) What complete, working, currently functioning systems have you designed?
2) How much of the working code did you contribute?
3) Which of these systems is still operational and in what capacity?
Or say, if the person was a famous robotics professor or something you may ask:
1) Have you ever actually ‘built’ a ‘robot’?
2) Could you, if called upon, design and build an easily tasked robot safe for home use using currently available materials and code?

So I wrote a second reply, which Jon encouraged me to turn into a blog post (kindly giving me permission to quote him).  In case it’s of interest to anyone else, my reply is below.

Dear Jon,

For whatever it’s worth, when I was an undergrad, I spent two years working as a coder for Cornell’s RoboCup robot soccer team, handling things like the goalie.  (That was an extremely valuable experience, one reason being that it taught me how badly I sucked at meeting deadlines, documenting my code, and getting my code to work with other people’s code.)   Even before that, I wrote shareware games with my friend Alex Halderman (now a famous computer security expert at U. of Michigan); we made almost \$30 selling them.  And I spent several summers working on applied projects at Bell Labs, back when that was still a thing.  And by my count, I’ve written four papers that involved code I personally wrote and experiments I did (one on hypertext, one on stylometric clusteringone on Boolean function query properties, one on improved simulation of stabilizer circuits—for the last of these, the code is actually still used by others).  While this is all from the period 1994-2004 (these days, if I need any coding done, I use the extremely high-level programming language called “undergrad”), I don’t think it’s entirely true to say that I “never got my hands dirty.”

But even if I hadn’t had any of those experiences, or other theoretical computer scientists hadn’t had analogous ones, your questions still strike me as unfair.  They’re no more fair than cornering a star coder or other practical person with questions like, “Have you ever proved a theorem?  A nontrivial theorem?  Why is BPP contained in P/poly?  What’s the cardinality of the set of Turing-degrees?”  If the coder can’t easily answer these questions, would you say it means that she has “disdain for theorists”?  (I was expecting some discussion of this converse question in your email, and was amused when I didn’t find any.)

Personally, I’d say “of course not”: maybe the coder is great at coding, doesn’t need theory very much on a day-to-day basis and doesn’t have much free time to learn it, but (all else equal) would be happy to know more.  Maybe the coder likes theory as an outsider, even has friends from her student days who are theorists, and who she’d go to if she ever did need their knowledge for her work.  Or maybe not.  Maybe she’s an asshole who looks down on anyone who doesn’t have the exact same skill-set that she does.  But I certainly couldn’t conclude that from her inability to answer basic theory questions.

I’d say just the same about theorists.  If they don’t have as much experience building robots as they should have, don’t know as much about large software projects as they should know, etc., then those are all defects to add to the long list of their other, unrelated defects.  But it would be a mistake to assume that they failed to acquire this knowledge because of disdain for practical peoplerather than for mundane reasons like busyness or laziness.

Indeed, it’s also possible that they respect practical people all the more, because they tried to do the things the practical people are good at, and discovered for themselves how hard they were.  Maybe they became theorists partly because of that self-discovery—that was certainly true in my case.  Maybe they’d be happy to talk to or learn from a practical roboticist like yourself, but are too shy or too nerdy to initiate the conversation.

Speaking of which: yes, let’s let bloom a thousand collaborations between theorists and practitioners!  Those are the lifeblood of science.  On the other hand, based on personal experience, I’m also sensitive to the effect where, because of pressures from funding agencies, theorists have to try to pretend their work is “practically relevant” when they’re really just trying to discover something cool, while meantime, practitioners have to pretend their work is theoretically novel or deep, when really, they’re just trying to write software that people will want to use.  I’d love to see both groups freed from this distorting influence, so that they can collaborate for real reasons rather than fake ones.

(I’ve also often remarked that, if I hadn’t gravitated to the extreme theoretical end of computer science, I think I might have gone instead to the extreme practical end, rather than to any of the points in between.  That’s because I hate the above-mentioned distorting influence: if I’m going to try to understand the ultimate limits of computation, then I should pursue that wherever it leads, even if it means studying computational models that won’t be practical for a million years.  And conversely, if I’m going to write useful software, I should throw myself 100% into that, even if it means picking an approach that’s well-understood, clunky, and reliable over an approach that’s new, interesting, elegant, and likely to fail.)

Best,
Scott

Subhash Khot’s prizewinning research

Saturday, August 16th, 2014

I already congratulated Subhash Khot in my last post for winning the Nevanlinna Award, but this really deserves a separate post.  Khot won theoretical computer science’s highest award largely for introducing and exploring the Unique Games Conjecture (UGC), which says (in one sentence) that a large number of the approximation problems that no one has been able to prove NP-hard, really are NP-hard.  In particular, if the UGC is true, then for MAX-CUT and dozens of other important optimization problems, no polynomial-time algorithm can always get you closer to the optimal solution than some semidefinite-programming-based algorithm gets you, unless P=NP.  The UGC might or might not be true—unlike with (say) P≠NP itself, there’s no firm consensus around it—but even if it’s false, the effort to prove or disprove it has by now had a huge impact on theoretical computer science research, leading to connections with geometry, tiling, analysis of Boolean functions, quantum entanglement, and more.

There are a few features that make the UGC interesting, compared to most other questions considered in complexity theory.  Firstly, the problem that the UGC asserts is NP-hard—basically, given a list of linear equations in 2 variables each, to satisfy as many of the equations as you can—is a problem with “imperfect completeness.”  This means that, if you just wanted to know whether all the linear equations were simultaneously satisfiable, the question would be trivial to answer, using Gaussian elimination.  So the problem only becomes interesting once you’re told that the equations are not simultaneously satisfiable, but you’d like to know (say) whether it’s possible to satisfy 99% of the equations or only 1%.  A second feature is that, because of the 2010 work of Arora, Barak, and Steurer, we know that there is an algorithm that solves the unique games problem in “subexponential time”: specifically, in time exp(npoly(δ)), where δ is the completeness error (that is, the fraction of linear equations that are unsatisfiable, in the case that most of them are satisfiable).  This doesn’t mean that the unique games problem can’t be NP-hard: it just means that, if there is an NP-hardness proof, then the reduction will need to blow up the instance sizes by an npoly(1/δ) factor.

To be clear, neither of the above features is unique (har, har) to unique games: we’ve long known NP-complete problems, like MAX-2SAT, that have the imperfect completeness feature, and we also know NP-hardness reductions that blow up the instance size by an npoly(1/δ) factor for inherent reasons (for example, for the Set Cover problem).  But perhaps nothing points as clearly as UGC at the directions that researchers in hardness of approximation and probabilistically checkable proofs (PCP) would like to be able to go.  A proof of the Unique Games Conjecture would basically be a PCP theorem on steroids.  (Or, since we already have “PCP theorems on steroids,” maybe a PCP theorem on PCP?)

It’s important to understand that, between the UGC being true and the unique games problem being solvable in polynomial time, there’s a wide range of intermediate possibilities, many of which are being actively investigated.  For example, the unique games problem could be “NP-hard,” but via a reduction that itself takes subexponential time (i.e., it could be hard assuming the Exponential-Time Hypothesis).  It could be solvable much faster than Arora-Barak-Steurer but still not in P.  Or, even if the problem weren’t solvable any faster than is currently known, it could be “hard without being NP-hard,” having a similar status to factoring or graph isomorphism.  Much current research into the UGC is focused on a particular algorithm called the Sum-of-Squares algorithm (i.e., the Laserre hierarchy).  Some researchers suspect that, if any algorithm will solve the unique games problem in polynomial time (or close to that), it will be Sum-of-Squares; conversely, if one could show that Sum-of-Squares failed, one would’ve taken a major step toward proving the UGC.

For more, I recommend this Quanta magazine article, or Luca Trevisan’s survey, or Subhash’s own survey.  Or those pressed for time can simply check out this video interview with Subhash.  If you’d like to try my wife Dana’s puzzle games inspired by PCP, which Subhash uses 2 minutes into the video to explain what he works on, see here.  Online, interactive versions of these puzzle games are currently under development.  Also, if you have questions about the UGC or Subhash’s work, go ahead and ask: I’ll answer if I can, and otherwise rely on in-house expertise.

Congratulations again to Subhash!

Is the P vs. NP problem ill-posed? (Answer: no.)

Wednesday, August 13th, 2014

A couple days ago, a reader wrote to me to ask whether it’s possible that the solution to the P vs. NP problem is simply undefined—and that one should enlarge the space of possible answers using non-classical logics (the reader mentioned something called Catuṣkoṭi logic).  Since other people have emailed me with similar questions in the past, I thought my response might be of more general interest, and decided to post it here.

Thanks for your mail!  I’m afraid I don’t agree with you that there’s a problem in the formulation of P vs. NP.  Let me come at it this way:

Do you also think there might be a problem in the formulation of Goldbach’s Conjecture?  Or the Twin Prime Conjecture?  (I.e., that maybe the definition of “prime number” needs to be modified using Catuṣkoṭi logic?)  Or any other currently-unsolved problem in any other part of math?

If you don’t, then my question would be: why single out P vs. NP?

After all, P vs. NP can be expressed as a Π2-sentence: that is, as a certain relationship among positive integers, which either holds or doesn’t hold.  (In this case, the integers would encode Turing machines, polynomial upper bounds on their running time, and an NP-complete problem like 3SAT — all of which are expressible using the basic primitives of arithmetic.)  In terms of its logical form, then, it’s really no different than the Twin Prime Conjecture and so forth.

So then, do you think that statements of arithmetic, like there being no prime number between 24 and 28, might also be like the Parallel Postulate?  That there might be some other, equally-valid “non-Euclidean arithmetic” where there is a prime between 24 and 28?  What exactly would one mean by that?  I understand exactly what one means by non-Euclidean geometries, but to my mind, geometry is less “fundamental” (at least in a logical sense) than positive integers are.  And of course, even if one believes that non-Euclidean geometries are just as “fundamental” as Euclidean geometry — an argument that seems harder to make for, say, the positive integers versus the Gaussian integers or finite fields or p-adics  — that still doesn’t change the fact that questions about Euclidean geometry have definite right answers.

Let me acknowledge two important caveats to what I said:

First, it’s certainly possible that P vs. NP might be independent of standard formal systems like ZF set theory (i.e., neither provable nor disprovable in them).  That’s a possibility that everyone acknowledges, even if (like me) they consider it rather unlikely.  But note that, even if P vs. NP were independent of our standard formal systems, that still wouldn’t mean that the question was ill-posed!  There would still either be a Turing machine that decided 3SAT in polynomial time, or else there wouldn’t be.  It would “only” mean that the usual axioms of set theory wouldn’t suffice to tell us which.

The second caveat is that P vs. NP, like any other mathematical question, can be generalized and extended in all sorts of interesting ways.  So for example, one can define analogues of P vs. NP over the reals and complex numbers (which are also currently open, but which might be easier than the Boolean version).  Or, even if P≠NP, one can still ask if randomized algorithms, or nonuniform algorithms, or quantum algorithms, might be able to solve NP-complete problems in polynomial time.  Or one can ask whether NP-complete problems are at least efficiently solvable “on average,” if not in the worst case.  Every one of these questions has been actively researched, and you could make a case that some of them are just as interesting as the original P vs. NP question, if not more interesting — if history had turned out a little different, any one of these might have been what we’d taken as our “flagship” question, rather than P vs. NP.  But again, this still doesn’t change the fact that the original P vs. NP question has some definite answer (like, for example, P≠NP…), even if we can’t prove which answer it is, even if we won’t be able to prove it for 500 years.

And please keep in mind that, if P vs. NP were solved after being open for hundreds of years, it would be far from the first such mathematical problem!  Fermat’s Last Theorem stayed open for 350 years, and the impossibility of squaring the circle and trisecting the angle were open for more than 2000 years.  Any time before these problems were solved, one could’ve said that maybe people had failed because the question itself was ill-posed, but one would’ve been mistaken.  People simply hadn’t invented the right ideas yet.

Best regards,
Scott

Unrelated Announcements: As most of you have probably seen, Subhash Khot won the Nevanlinna Prize, while Maryam Mirzakhani, Artur Avila, Manjul Bhargava and Martin Hairer won the Fields Medal. Mirzakhani is the first female Fields Medalist. Congratulations to all!

Also, I join the rest of the world in saying that Robin Williams was a great actor—there was no one better at playing “the Robin Williams role” in any given movie—and his loss is a loss for humanity.

The Power of the Digi-Comp II: My First Conscious Paperlet

Friday, July 4th, 2014

Foreword: Right now, I have a painfully-large stack of unwritten research papers.  Many of these are “paperlets”: cool things I noticed that I want to tell people about, but that would require a lot more development before they became competitive for any major theoretical computer science conference.  And what with the baby, I simply don’t have time anymore for the kind of obsessive, single-minded, all-nighter-filled effort needed to bulk my paperlets up.  So starting today, I’m going to try turning some of my paperlets into blog posts.  I don’t mean advertisements or sneak previews for papers, but replacements for papers: blog posts that constitute the entirety of what I have to say for now about some research topic.  “Peer reviewing” (whether signed or anonymous) can take place in the comments section, and “citation” can be done by URL.  The hope is that, much like with 17th-century scientists who communicated results by letter, this will make it easier to get my paperlets done: after all, I’m not writing Official Papers, just blogging for colleagues and friends.

Of course, I’ve often basically done this before—as have many other academic bloggers—but now I’m going to go about it more consciously.  I’ve thought for years that the Internet would eventually change the norms of scientific publication much more radically than it so far has: that yes, instant-feedback tools like blogs and StackExchange and MathOverflow might have another decade or two at the periphery of progress, but their eventual destiny is at the center.  And now that I have tenure, it hit me that I can do more than prognosticate about such things.  I’ll start small: I won’t go direct-to-blog for big papers, papers that cry out for LaTeX formatting, or joint papers.  I certainly won’t do it for papers with students who need official publications for their professional advancement.  But for things like today’s post—on the power of a wooden mechanical computer now installed in the lobby of the building where I work—I hope you agree that the Science-by-Blog Plan fits well.

Oh, by the way, happy July 4th to American readers!  I hope you find that a paperlet about the logspace-interreducibility of a few not-very-well-known computational models captures everything that the holiday is about.

The Power of the Digi-Comp II

by Scott Aaronson

Abstract

I study the Digi-Comp II, a wooden mechanical computer whose only moving parts are balls, switches, and toggles.  I show that the problem of simulating (a natural abstraction of) the Digi-Comp, with a polynomial number of balls, is complete for CC (Comparator Circuit), a complexity class defined by Subramanian in 1990 that sits between NL and P.  This explains why the Digi-Comp is capable of addition, multiplication, division, and other arithmetical tasks, and also implies new tasks of which the Digi-Comp is capable (and that indeed are complete for it), including the Stable Marriage Problem, finding a lexicographically-first perfect matching, and the simulation of other Digi-Comps.  However, it also suggests that the Digi-Comp is not a universal computer (not even in the circuit sense), making it a very interesting way to fall short of Turing-universality.  I observe that even with an exponential number of balls, simulating the Digi-Comp remains in P, but I leave open the problem of pinning down its complexity more precisely.

Introduction

To celebrate his 60th birthday, my colleague Charles Leiserson (who some of you might know as the “L” in the CLRS algorithms textbook) had a striking contraption installed in the lobby of the MIT Stata Center.  That contraption, pictured below, is a custom-built, supersized version of a wooden mechanical computer from the 1960s called the Digi-Comp II, now manufactured and sold by a company called Evil Mad Scientist.

Click here for a short video showing the Digi-Comp’s operation (and here for the user’s manual).  Basically, the way it works is this: a bunch of balls (little steel balls in the original version, pool balls in the supersized version) start at the top and roll to the bottom, one by one.  On their way down, the balls may encounter black toggles, which route each incoming ball either left or right.  Whenever this happens, the weight of the ball flips the toggle to the opposite setting: so for example, if a ball goes left, then the next ball to encounter the same toggle will go right, and the ball after that will go left, and so on.  The toggles thus maintain a “state” for the computer, with each toggle storing one bit.

Besides the toggles, there are also “switches,” which the user can set at the beginning to route every incoming ball either left or right, and whose settings aren’t changed by the balls.  And then there are various wooden tunnels and ledges, whose function is simply to direct the balls in a desired way as they roll down.  A ball could reach different locations, or even the same location in different ways, depending on the settings of the toggles and switches above that location.  On the other hand, once we fix the toggles and switches, a ball’s motion is completely determined: there’s no random or chaotic element.

“Programming” is done by configuring the toggles and switches in some desired way, then loading a desired number of balls at the top and letting them go.  “Reading the output” can be done by looking at the final configuration of some subset of the toggles.

Whenever a ball reaches the bottom, it hits a lever that causes the next ball to be released from the top.  This ensures that the balls go through the device one at a time, rather than all at once.  As we’ll see, however, this is mainly for aesthetic reasons, and maybe also for the mechanical reason that the toggles wouldn’t work properly if two or more balls hit them at once.  The actual logic of the machine doesn’t care about the timing of the balls; the sheer number of balls that go through is all that matters.

The Digi-Comp II, as sold, contains a few other features: most notably, toggles that can be controlled by other toggles (or switches).  But I’ll defer discussion of that feature to later.  As we’ll see, we already get a quite interesting model of computation without it.

One final note: of course the machine that’s sold has a fixed size and a fixed geometry.  But for theoretical purposes, it’s much more interesting to consider an arbitrary network of toggles and switches (not necessarily even planar!), with arbitrary size, and with an arbitrary number of balls fed into it.  (I’ll give a more formal definition in the next section.)

The Power of the Digi-Comp

So, what exactly can the Digi-Comp do?  As a first exercise, you should convince yourself that, by simply putting a bunch of toggles in a line and initializing them all to “L” (that is, Left), it’s easy to set up a binary counter.  In other words, starting from the configuration, say, LLL (in which three toggles all point left), as successive balls pass through we can enter the configurations RLL, LRL, RRL, etc.  If we interpret L as 0 and R as 1, and treat the first bit as the least significant, then we’re simply counting from 0 to 7 in binary.  With 20 toggles, we could instead count to 1,048,575.

But counting is not the most interesting thing we can do.  As Charles eagerly demonstrated to me, we can also set up the Digi-Comp to perform binary addition, binary multiplication, sorting, and even long division.  (Excruciatingly slowly, of course: the Digi-Comp might need even more work to multiply 3×5, than existing quantum computers need to factor the result!)

To me, these demonstrations served only as proof that, while Charles might call himself a theoretical computer scientist, he’s really a practical person at heart.  Why?  Because a theorist would know that the real question is not what the Digi-Comp can do, but rather what it can’t do!  In particular, do we have a universal computer on our hands here, or not?

If the answer is yes, then it’s amazing that such a simple contraption of balls and toggles could already take us over the threshold of universality.  Universality would immediately explain why the Digi-Comp is capable of multiplication, division, sorting, and so on.  If, on the other hand, we don’t have universality, that too is extremely interesting—for we’d then face the challenge of explaining how the Digi-Comp can do so many things without being universal.

It might be said that the Digi-Comp is certainly not a universal computer, since if nothing else, it’s incapable of infinite loops.  Indeed, the number of steps that a given Digi-Comp can execute is bounded by the number of balls, while the number of bits it can store is bounded by the number of toggles: clearly we don’t have a Turing machine.  This is true, but doesn’t really tell us what we want to know.  For, as discussed in my last post, we can consider not only Turing-machine universality, but also the weaker (but still interesting) notion of circuit-universality.  The latter means the ability to simulate, with reasonable efficiency, any Boolean circuit of AND, OR, and NOT gates—and hence, in particular, to compute any Boolean function on any fixed number of input bits (given enough resources), or to simulate any polynomial-time Turing machine (given polynomial resources).

The formal way to ask whether something is circuit-universal, is to ask whether the problem of simulating the thing is P-complete.  Here P-complete (not to be confused with NP-complete!) basically means the following:

There exists a polynomial p such that any S-step Turing machine computation—or equivalently, any Boolean circuit with at most S gates—can be embedded into our system if we allow the use of poly(S) computing elements (in our case, balls, toggles, and switches).

Of course, I need to tell you what I mean by the weasel phrase “can be embedded into.”  After all, it wouldn’t be too impressive if the Digi-Comp could “solve” linear programming, primality testing, or other highly-nontrivial problems, but only via “embeddings” in which we had to do essentially all the work, just to decide how to configure the toggles and switches!  The standard way to handle this issue is to demand that the embedding be “computationally simple”: that is, we should be able to carry out the embedding in L (logarithmic space), or some other complexity class believed to be much smaller than the class (P, in this case) for which we’re trying to prove completeness.  That way, we’ll be able to say that our device really was “doing something essential”—i.e., something that our embedding procedure couldn’t efficiently do for itself—unless the larger complexity class collapses with the smaller one (i.e., unless L=P).

So then, our question is whether simulating the Digi-Comp II is a P-complete problem under L-reductions, or alternatively, whether the problem is in some complexity class believed to be smaller than P.  The one last thing we need is a formal definition of “the problem of simulating the Digi-Comp II.”  Thus, let DIGICOMP be the following problem:

We’re given as inputs:

• A directed acyclic graph G, with n vertices.  There is a designated vertex with indegree 0 and outdegree 1 called the “source,” and a designated vertex with indegree 1 and outdegree 0 called the “sink.”  Every internal vertex v (that is, every vertex with both incoming and outgoing edges) has exactly two outgoing edges, labeled “L” (left) and “R” (right), as well as one bit of internal state s(v)∈{L,R}.
• For each vertex v, an “initial” value for its internal state s(v).
• A positive integer T (encoded in unary notation), representing the number of balls dropped successively from the source vertex.

Computation proceeds as follows: each time a ball appears at the source vertex, it traverses the path induced by the L and R states of the vertices that it encounters, until it reaches a terminal vertex, which might or might not be the sink.  As the ball traverses the path, it flips s(v) for each vertex v that it encounters: L goes to R and R goes to L.  Then the next ball is dropped in.

The problem is to decide whether any balls reach the sink.

Here the internal vertices represent toggles, and the source represents the chute at the top from which the balls drop.  Switches aren’t included, since (by definition) the reduction can simply fix their values to “L” to “R” and thereby simplify the graph.

Of course we could consider other problems: for example, the problem of deciding whether an odd number of balls reach the sink, or of counting how many balls reach the sink, or of computing the final value of every state-variable s(v).  However, it’s not hard to show that all of these problems are interreducible with the DIGICOMP problem as defined above.

The Class CC

My main result, in this paperlet, is to pin down the complexity of the DIGICOMP problem in terms of a complexity class called CC (Comparator Circuit): a class that’s obscure enough not to be in the Complexity Zoo (!), but that’s been studied in several papers.  CC was defined by Subramanian in his 1990 Stanford PhD thesis; around the same time Mayr and Subramanian showed the inclusion NL ⊆ CC (the inclusion CC ⊆ P is immediate).  Recently Cook, Filmus, and Lê revived interest in CC with their paper The Complexity of the Comparator Circuit Value Problem, which is probably the best current source of information about this class.

OK, so what is CC?  Informally, it’s the class of problems that you can solve using a comparator circuit, which is a circuit that maps n bits of input to n bits of output, and whose only allowed operation is to sort any desired pair of bits.  That is, a comparator circuit can repeatedly apply the transformation (x,y)→(x∧y,x∨y), in which 00, 01, and 11 all get mapped to themselves, while 10 gets mapped to 01.  Note that there’s no facility in the circuit for copying bits (i.e., for fanout), so sorting could irreversibly destroy information about the input.  In the comparator circuit value problem (or CCV), we’re given as input a description of a comparator circuit C, along with an input x∈{0,1}n and an index i∈[n]; then the problem is to determine the final value of the ith bit when C is applied to x.  Then CC is simply the class of all languages that are L-reducible to CCV.

As Cook et al. discuss, there are various other characterizations of CC: for example, rather than using a complete problem, we can define CC directly as the class of languages computed by uniform families of comparator circuits.  More strikingly, Mayr and Subramanian showed that CC has natural complete problems, which include (decision versions of) the famous Stable Marriage Problem, as well as finding the lexicographically first perfect matching in a bipartite graph.  So perhaps the most appealing definition of CC is that it’s “the class of problems that can be easily mapped to the Stable Marriage Problem.”

It’s a wide-open problem whether CC=NL or CC=P: as usual, one can give oracle separations, but as far as anyone knows, either equality could hold without any dramatic implications for “standard” complexity classes.  (Of course, the conjunction of these equalities would have a dramatic implication.)  What got Cook et al. interested was that CC isn’t even known to contain (or be contained in) the class NC of parallelizable problems.  In particular, linear-algebra problems in NC, like determinant, matrix inversion, and iterated matrix multiplication—not to mention other problems in P, like linear programming and greatest common divisor—might all be examples of problems that are efficiently solvable by Boolean circuits, but not by comparator circuits.

One final note about CC.  Cook et al. showed the existence of a universal comparator circuit: that is, a single comparator circuit C able to simulate any other comparator circuit C’ of some fixed size, given a description of C’ as part of its input.

DIGICOMP is CC-Complete

I can now proceed to my result: that, rather surprisingly, the Digi-Comp II can solve exactly the problems in CC, giving us another characterization of that class.

I’ll prove this using yet another model of computation, which I call the pebbles model.  In the pebbles model, you start out with a pile of x pebbles; the positive integer x is the “input” to your computation.  Then you’re allowed to apply a straight-line program that consists entirely of the following two operations:

1. Given any pile of y pebbles, you can split it into two piles consisting of ⌈y/2⌉ and ⌊y/2⌋ pebbles respectively.
2. Given any two piles, consisting of y and z pebbles respectively, you can combine them into a single pile consisting of y+z pebbles.

Your program “accepts” if and only if some designated output pile contains at least one pebble (or, in a variant that can be shown to be equivalent, if it contains an odd number of pebbles).

As suggested by the imagery, you don’t get to make “backup copies” of the piles before splitting or combining them: if, for example, you merge y with z to create y+z, then y isn’t also available to be split into ⌈y/2⌉ and ⌊y/2⌋.

Note that the ceiling and floor functions are the only “nonlinear” elements of the pebbles model: if not for them, we’d simply be applying a sequence of linear transformations.

I can now divide my CC-completeness proof into two parts: first, that DIGICOMP (i.e., the problem of simulating the Digi-Comp II) is equivalent to the pebbles model, and second, that the pebbles model is equivalent to comparator circuits.

Let’s first show the equivalence between DIGICOMP and pebbles.  The reduction is simply this: in a given Digi-Comp, each edge will be associated to a pile, with the number of pebbles in the pile equal to the total number of balls that ever traverse that edge.  Thus, we have T balls dropped in to the edge incident to the source vertex, corresponding to an initial pile with T pebbles.  Multiple edges pointing to the same vertex (i.e., fan-in) can be modeled by combining the associated piles into a single pile.  Meanwhile, a toggle has the effect of splitting a pile: if y balls enter the toggle in total, then ⌈y/2⌉ balls will ultimately exit in whichever direction the toggle was pointing initially (whether left or right), and ⌊y/2⌋ balls will ultimately exit in the other direction.  It’s clear that this equivalence works in both directions: not only does it let us simulate any given Digi-Comp by a pebble program, it also lets us simulate any pebble program by a suitably-designed Digi-Comp.

OK, next let’s show the equivalence between pebbles and comparator circuits.  As a first step, given any comparator circuit, I claim that we can simulate it by a pebble program.  The way to do it is simply to use a pile of 0 pebbles to represent each “0” bit, and a pile of 1 pebble to represent each “1” bit.  Then, any time we want to sort two bits, we simply merge their corresponding piles, then split the result back into two piles.  The result?  00 gets mapped to 00, 11 gets mapped to 11, and 01 and 10 both get mapped to one pebble in the ⌈y/2⌉ pile and zero pebbles in the ⌊y/2⌋ pile.  At the end, a given pile will have a pebble in it if and only if the corresponding output bit in the comparator circuit is 1.

One might worry that the input to a comparator circuit is a sequence of bits, whereas I said before that the input to a pebble program is just a single pile.  However, it’s not hard to see that we can deal with this, without leaving the world of logspace reductions, by breaking up an initial pile of n pebbles into n piles each of zero pebbles or one pebble, corresponding to any desired n-bit string (along with some extra pebbles, which we subsequently ignore).  Alternatively, we could generalize the pebbles model so that the input can consist of multiple piles.  One can show, by a similar “breaking-up” trick, that this wouldn’t affect the pebbles model’s equivalence to the DIGICOMP problem.

Finally, given a pebble program, I need to show how to simulate it by a comparator circuit.  The reduction works as follows: let T be the number of pebbles we’re dealing with (or even just an upper bound on that number).  Then each pile will be represented by its own group of T wires in the comparator circuit.  The Hamming weight of those T wires—i.e., the number of them that contain a ‘1’ bit—will equal the number of pebbles in the corresponding pile.

To merge two piles, we first merge the corresponding groups of T wires.  We then use comparator gates to sort the bits in those 2T wires, until all the ‘1’ bits have been moved into the first T wires.  Finally, we ignore the remaining T wires for the remainder of the computation.

To split a pile, we first use comparator gates to sort the bits in the T wires, until all the ‘1’ bits have been moved to the left.  We then route all the odd-numbered wires into “Pile A” (the one that’s supposed to get ⌈y/2⌉ pebbles), and route all the even-numbered wires into “Pile B” (the one that’s supposed to get ⌊y/2⌋ pebbles).  Finally, we introduce T additional wires with 0’s in them, so that piles A and B have T wires each.

At the end, by examining the leftmost wire in the group of wires corresponding to the output pile, we can decide whether that pile ends up with any pebbles in it.

Since it’s clear that all of the above transformations can be carried out in logspace (or even smaller complexity classes), this completes the proof that DIGICOMP is CC-complete under L-reductions.  As corollaries, the Stable Marriage and lexicographically-first perfect matching problems are L-reducible to DIGICOMP—or informally, are solvable by easily-described, polynomial-size Digi-Comp machines (and indeed, characterize the power of such machines).  Combining my result with the universality result of Cook et al., a second corollary is that there exists a “universal Digi-Comp”: that is, a single Digi-Comp D that can simulate any other Digi-Comp D’ of some polynomially-smaller size, so long as we initialize some subset of the toggles in D to encode a description of D’.

How Does the Digi-Comp Avoid Universality?

Let’s now step back and ask: given that the Digi-Comp is able to do so many things—division, Stable Marriage, bipartite matching—how does it fail to be a universal computer, at least a circuit-universal one?  Is the Digi-Comp a counterexample to the oft-repeated claims of people like Stephen Wolfram, about the ubiquity of universal computation and the difficulty of avoiding it in any sufficiently complex system?  What would need to be added to the Digi-Comp to make it circuit-universal?  Of course, we can ask the same questions about pebble programs and comparator circuits, now that we know that they’re all computationally equivalent.

The reason for the failure of universality is perhaps easiest to see in the case of comparator circuits.  As Steve Cook pointed out in a talk, comparator circuits are “1-Lipschitz“: that is, if you have a comparator circuit acting on n input bits, and you change one of the input bits, your change can affect at most one output bit.  Why?  Well, trace through the circuit and use induction.  So in particular, there’s no amplification of small effects in comparator circuits, no chaos, no sensitive dependence on initial conditions, no whatever you want to call it.  Now, while chaos doesn’t suffice for computational universality, at least naïvely it’s a necessary condition, since there exist computations that are chaotic.  Of course, this simpleminded argument can’t be all there is to it, since otherwise we would’ve proved CCP.  What the argument does show is that, if CC=P, then the encoding of a Boolean circuit into a comparator circuit (or maybe into a collection of such circuits) would need to be subtle and non-obvious: it would need to take computations with the potential for chaos, and reduce them to computations without that potential.

Once we understand this 1-Lipschitz business, we can also see it at work in the pebbles model.  Given a pebble program, suppose someone surreptitiously removed a single pebble from one of the initial piles.  For want of that pebble, could the whole kingdom be lost?  Not really.  Indeed, you can convince yourself that the output will be exactly the same as before, except that one output pile will have one fewer pebble than it would have otherwise.  The reason is again an induction: if you change x by 1, that affects at most one of ⌈x/2⌉ and ⌊x/2⌋ (and likewise, merging two piles affects at most one pile).

We now see the importance of the point I made earlier, about there being no facility in the piles model for “copying” a pile.  If we could copy piles, then the 1-Lipschitz property would fail.  And indeed, it’s not hard to show that in that case, we could implement AND, OR, and NOT gates with arbitrary fanout, and would therefore have a circuit-universal computer.  Likewise, if we could copy bits, then comparator circuits—which, recall, map (x,y) to (x∧y,x∨y)—would implement AND, OR, and NOT with arbitrary fanout, and would be circuit-universal.  (If you’re wondering how to implement NOT: one way to do it is to use what’s known in quantum computing as the “dual-rail representation,” where each bit b is encoded by two bits, one for b and the other for ¬b.  Then a NOT can be accomplished simply by swapping those bits.  And it’s not hard to check that comparator gates in a comparator circuit, and combining and splitting two piles in a pebble program, can achieve the desired updates to both the b rails and the ¬b rails when an AND or OR gate is applied.  However, we could also just omit NOT gates entirely, and use the fact that computing the output of even a monotone Boolean circuit is a P-complete problem under L-reductions.)

In summary, then, the inability to amplify small effects seems like an excellent candidate for the central reason why the power of comparator circuits and pebble programs hits a ceiling at CC, and doesn’t go all the way up to P.  It’s interesting, in this connection, that while transistors (and before them, vacuum tubes) can be used to construct logic gates, the original purpose of both of them was simply to amplify information: to transform a small signal into a large one.  Thus, we might say, comparator circuits and pebble programs fail to be computationally universal because they lack transistors or other amplifiers.

I’d like to apply exactly the same analysis to the Digi-Comp itself: that is, I’d like to say that the reason the Digi-Comp fails to be universal (unless CC=P) is that it, too, lacks the ability to amplify small effects (wherein, for example, the drop of a single ball would unleash a cascade of other balls).  In correspondence, however, David Deutsch pointed out a problem: namely, if we just watch a Digi-Comp in action, then it certainly looks like it has an amplification capability!  Consider, for example, the binary counter discussed earlier.  Suppose a column of ten toggles is in the configuration RRRRRRRRRR, representing the integer 1023.  Then the next ball to fall down will hit all ten toggles in sequence, resetting them to LLLLLLLLLL (and thus, resetting the counter to 0).  Why isn’t this precisely the amplification of a small effect that we were looking for?

Well, maybe it’s amplification, but it’s not of a kind that does what we want computationally.  One way to see the difficulty is that we can’t take all those “L” settings we’ve produced as output, and feed them as inputs to further gates in an arbitrary way.  We could do it if the toggles were arranged in parallel, but they’re arranged serially, so that flipping any one toggle inevitably has the potential also to flip the toggles below it.  Deutsch describes this as a “failure of composition”: in some sense, we do have a fan-out or copying operation, but the design of the Digi-Comp prevents us from composing the fan-out operation with other operations in arbitrary ways, and in particular, in the ways that would be needed to simulate any Boolean circuit.

So, what features could we add to the Digi-Comp to make it universal?  Here’s the simplest possibility I was able to come up with: suppose that, scattered throughout the device, there were balls precariously perched on ledges, in such a way that whenever one was hit by another ball, it would get dislodged, and both balls would continue downward.  We could, of course, chain several of these together, so that the two balls would in turn dislodge four balls, the four would dislodge eight, and so on.  I invite you to check that this would provide the desired fan-out gate, which, when combined with AND, OR, and NOT gates that we know how to implement (e.g., in the dual-rail representation described previously), would allow us to simulate arbitrary Boolean circuits.  In effect, the precariously perched balls would function as “transistors” (of course, painfully slow transistors, and ones that have to be laboriously reloaded with a ball after every use).

As a second possibility, Charles Leiserson points out to me that the Digi-Comp, as sold, has a few switches and toggles that can be controlled by other toggles.  Depending on exactly how one modeled this feature, it’s possible that it, too, could let us implement arbitrary fan-out gates, and thereby boost the Digi-Comp up to circuit-universality.

Open Problems

My personal favorite open problem is this:

What is the complexity of simulating a Digi-Comp II if the total number of balls dropped in is exponential, rather than polynomial?  (In other words, if the positive integer T, representing the number of balls, is encoded in binary rather than in unary?)

From the equivalence between the Digi-Comp and pebble programs, we can already derive a conclusion about the above problem that’s not intuitively obvious: namely, that it’s in P.  Or to say it another way: it’s possible to predict the exact state of a Digi-Comp with n toggles, after T balls have passed through it, using poly(n, log T) computation steps.  The reason is simply that, if there are T balls, then the total number of balls that pass through any given edge (the only variable we need to track) can be specified using log2T bits.  This, incidentally, gives us a second sense in which the Digi-Comp is not a universal computer: namely, even if we let the machine “run for exponential time” (that is, drop exponentially many balls into it), unlike a conventional digital computer it can’t possibly solve all problems in PSPACE, unless P=PSPACE.

However, this situation also presents us with a puzzle: if we let DIGICOMPEXP be the problem of simulating a Digi-Comp with an exponential number of balls, then it’s clear that DIGICOMPEXP is hard for CC and contained in P, but we lack any information about its difficulty more precise than that.  At present, I regard both extremes—that DIGICOMPEXP is in CC (and hence, no harder than ordinary DIGICOMP), and that it’s P-complete—as within the realm of possibility (along with the possibility that DIGICOMPEXP is intermediate between the two).

By analogy, one can also consider comparator circuits where the entities that get compared are integers from 1 to T rather than bits—and one can then consider the power of such circuits, when T is allowed to grow exponentially.  In email correspondence, however, Steve Cook sent me a proof that such circuits have the same power as standard, Boolean comparator circuits.  It’s not clear whether this tells us anything about the power of a Digi-Comp with exponentially many balls.

A second open problem is to formalize the feature of Digi-Comp that Charles mentioned—namely, toggles and switches controlled by other toggles—and see whether, under some reasonable formalization, that feature bumps us up to P-completeness (i.e., to circuit-universality).  Personally, though, I confess I’d be even more interested if there were some feature we could add to the machine that gave us a larger class than CC, but that still wasn’t all of P.

A third problem is to pin down the power of Digi-Comps (or pebble programs, or comparator circuits) that are required to be planar.  While my experience with woodcarving is limited, I imagine that planar or near-planar graphs are a lot easier to carve than arbitrary graphs (even if the latter present no problems of principle).

A fourth problem has to do with the class CC in general, rather than the Digi-Comp in particular, but I can’t resist mentioning it.  Let CCEXP be the complexity class that’s just like CC, but where the comparator circuit (or pebble program, or Digi-Comp) is exponentially large and specified only implicitly (that is, by a Boolean circuit that, given as input a binary encoding of an integer i, tells you the ith bit of the comparator circuit’s description).  Then it’s easy to see that PSPACE ⊆ CCEXP ⊆ EXP.  Do we have CCEXP = PSPACE or CCEXP = EXP?  If not, then CCEXP would be the first example I’ve ever seen of a natural complexity class intermediate between PSPACE and EXP.

Acknowledgments

I thank Charles Leiserson for bringing the Digi-Comp II to MIT, and thereby inspiring this “research.”  I also thank Steve Cook, both for giving a talk that first brought the complexity class CC to my attention, and for helpful correspondence.  Finally I thank David Deutsch for the point about composition.