Archive for the ‘Announcements’ Category

The Turing movie

Tuesday, December 16th, 2014

Last week I finally saw The Imitation Game, the movie with Benedict Cumberbatch as Alan Turing.

OK, so for those who haven’t yet seen it: should you?  Here’s my one-paragraph summary: imagine that you told the story of Alan Turing—one of the greatest triumphs and tragedies of human history, needing no embellishment whatsoever—to someone who only sort-of understood it, and who filled in the gaps with weird fabrications and Hollywood clichés.  And imagine that person retold the story to a second person, who understood even less, and that that person retold it to a third, who understood least of all, but who was charged with making the movie that would bring Turing’s story before the largest audience it’s ever had.  And yet, imagine that enough of the enormity of the original story made it through this noisy channel, that the final product was still pretty good.  (Except, imagine how much better it could’ve been!)

The fabrications were especially frustrating to me, because we know it’s possible to bring Alan Turing’s story to life in a way that fully honors the true science and history.  We know that, because Hugh Whitemore’s 1986 play Breaking the Code did it.  The producers of The Imitation Game would’ve done better just to junk their script, and remake Breaking the Code into a Hollywood blockbuster.  (Note that there is a 1996 BBC adaptation of Breaking the Code, with Derek Jacobi as Turing.)

Anyway, the movie focuses mostly on Turing’s codebreaking work at Bletchley Park, but also jumps around in time to his childhood at Sherborne School, and to his arrest for “homosexual indecency” and its aftermath.  Turing’s two world-changing papers—On Computable Numbers and Computing Machinery and Intelligence—are both mentioned, though strangely, his paper about computing zeroes of the Riemann zeta function is entirely overlooked.

Here are my miscellaneous comments:

  • The boastful, trash-talking, humor-impaired badass-nerd of the movie seems a lot closer to The Big Bang Theory’s Sheldon Cooper, or to some other Hollywood concept of “why smart people are so annoying,” than to the historical Alan Turing.  (At least in Sheldon’s case, the archetype is used for laughs, not drama or veracity.)  As portrayed in the definitive biography (Andrew Hodges’ Alan Turing: The Enigma), Turing was eccentric, sure, and fiercely individualistic (e.g., holding up his pants with pieces of string), but he didn’t get off on insulting the intelligence of the people around him.
  • In the movie, Turing is pretty much singlehandedly responsible for designing, building, and operating the Bombes (the codebreaking machines), which he does over the strenuous objections of his superiors.  This, of course, is absurd: Bletchley employed about 10,000 people at its height.  Turing may have been the single most important cog in the operation, but he was still a cog.  And by November 1942, the operation was already running smoothly enough that Turing could set sail for the US (in waters that were now much safer, thanks to Bletchley!), to consult on other cryptographic projects at Bell Labs.
  • But perhaps the movie’s zaniest conceit is that Turing was also in charge of deciding what to do with Bletchley’s intelligence (!).  In the movie, it falls to him, not the military, to decide which ship convoys will be saved, and which sacrificed to prevent spilling Bletchley’s secret.  If that had any historicity to it, it would surely be the most military and political power ever entrusted to a mathematician (update: see the comments section for potential counterexamples).
  • It’s true that Turing (along with three other codebreakers) wrote a letter directly to Winston Churchill, pleading for more funding for Bletchley Park—and that Churchill saw the letter, and ordered “Action this day! Make sure they have all they want on extreme priority.”  However, the letter was not a power play to elevate Turing over Hugh Alexander and his other colleagues: in fact, Alexander co-signed the letter.  More broadly, the fierce infighting between Turing and everyone else at Bletchley Park, central to the movie’s plot, seems to have been almost entirely invented for dramatic purposes.
  • The movie actually deserves a lot of credit for getting right that the major technical problem of Bletchley Park was how to get the Bombes to search through keys fast enough—and that speeding things up is where Turing made a central contribution.  As a result, The Imitation Game might be the first Hollywood movie ever made whose plot revolves around computational efficiency.  (Counterexamples, anyone?)  Unfortunately, the movie presents Turing’s great insight as being that one can speed up the search by guessing common phrases, like “HEIL HITLER,” that are likely to be in the plaintext.  That was, I believe, obvious to everyone from the beginning.
  • Turing never built a computer in his own home, and he never named a computer “Christopher,” after his childhood crush Christopher Morcom.  (On the other hand, Christopher Morcom existed, and his early death from tuberculosis really did devastate Turing, sending him into morbid-yet-prescient ruminations about whether a mind could exist separately from a brain.)
  • I found it ironic that The Imitation Game, produced in 2014, is far more squeamish about on-screen homosexuality than Breaking the Code, produced in 1986.  Turing talks about being gay (which is an improvement over 2001’s Enigma, which made Turing straight!), but is never shown embracing another man.  However, the more important problem is that the movie botches the story of the burglary of Turing’s house (i.e., the event that led to Turing’s arrest and conviction for homosexual indecency), omitting the role of Turing’s own naiveté in revealing his homosexuality to the police, and substituting some cloak-and-dagger spy stuff.  Once again, Breaking the Code handled this perfectly.
  • In one scene, Euler is pronounced “Yooler.”

For more, see an excellent piece in Slate, How Accurate Is The Imitation Game?.  And for other science bloggers’ reactions, see this review by Christos Papadimitriou (which I thought was extremely kind, though it focuses more on Turing himself than on the movie), this reaction by Peter Woit, which largely echoes mine, and this by Clifford Johnson.

PostBQP Postscripts: A Confession of Mathematical Errors

Sunday, November 30th, 2014

tl;dr: This post reveals two errors in one of my most-cited papers, and also explains how to fix them.  Thanks to Piotr Achinger, Michael Cohen, Greg Kuperberg, Ciaran Lee, Ryan O’Donnell, Julian Rosen, Will Sawin, Cem Say, and others for their contributions to this post.


If you look at my Wikipedia page, apparently one of the two things in the world that I’m “known for” (along with algebrization) is “quantum Turing machines with postselection.”  By this, Wikipedia means my 2004 definition of the complexity class PostBQP—that is, the class of decision problems solvable in bounded-error quantum polynomial time, assuming the ability to postselect (or condition) on certain measurement outcomes—and my proof that PostBQP coincides with the classical complexity class PP (that is, the class of decision problems expressible in terms of whether the number of inputs that cause a given polynomial-time Turing machine to accept does or doesn’t exceed some threshold).

To explain this a bit: even without quantum mechanics, it’s pretty obvious that, if you could “postselect” on exponentially-unlikely events, then you’d get huge, unrealistic amounts of computational power.  For example (and apologies in advance for the macabre imagery), you could “solve” NP-complete problems in polynomial time by simply guessing a random solution, then checking whether the solution is right, and shooting yourself if it happened to be wrong!  Conditioned on still being alive (and if you like, appealing to the “anthropic principle”), you must find yourself having guessed a valid solution—assuming, of course, that there were any valid solutions to be found.  If there weren’t any, then you’d seem to be out of luck!  (Exercise for the reader: generalize this “algorithm,” so that it still works even if you don’t know in advance whether your NP-complete problem instance has any valid solutions.)
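To make the trick concrete, here’s a minimal, entirely classical sketch of “postselection as rejection sampling,” using 3-SAT as the NP-complete problem.  The instance, the function names, and the encoding are all just illustrative choices of mine, not anything from the paper:

    import random

    def satisfies(clauses, assignment):
        # Each clause is a list of literals: literal i means x_i, and -i means NOT x_i.
        for clause in clauses:
            if not any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause):
                return False
        return True

    def postselected_sat(clauses, n_vars):
        # Classically, "postselection" is just rejection sampling: guess a random
        # assignment, check it, and throw away (ahem) the runs where the guess was wrong.
        # If the instance is unsatisfiable, this loop never halts, which is exactly
        # the issue the exercise above asks you to handle.
        while True:
            guess = [random.random() < 0.5 for _ in range(n_vars)]
            if satisfies(clauses, guess):   # condition on "still being alive"
                return guess

    # Toy instance: (x1 OR x2) AND (NOT x1 OR x3) AND (NOT x2 OR NOT x3)
    print(postselected_sat([[1, 2], [-1, 3], [-2, -3]], 3))

Of course, run as ordinary code this takes exponential time on average; the whole point of the thought experiment is that postselection lets you pretend the discarded runs were free.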

So with the PostBQP=PP theorem, the surprise was not that postselection gives you lots of computational power, but rather that postselection combined with quantum mechanics gives you much more power even than postselection by itself (or quantum mechanics by itself, for that matter).  Since P^PP = P^#P, the class PP basically captures the full difficulty of #P-complete counting problems—that is, not just solving an NP-complete problem, but counting how many solutions it has.  It’s not obvious that a quantum computer with postselection can solve counting problems, but that’s what the theorem shows.  That, in turn, has implications for other things: for example, I showed it can be used to prove classical facts about PP, like the fact that PP is closed under intersection (the Beigel-Reingold-Spielman Theorem), in a straightforward way; and it’s also used to show the hardness of quantum sampling problems, in the work of Bremner-Jozsa-Shepherd as well as my BosonSampling work with Arkhipov.

I’m diffident about being “known for” something so simple; once I had asked the question, the proof of PostBQP=PP took me all of an hour to work out.  Yet PostBQP ended up being a hundred times more influential for quantum computing theory than things on which I expended a thousand times more effort.  So on balance, I guess I’m happy to call PostBQP my own.

That’s why today’s post comes with a special sense of intellectual responsibility.  Within the last month, it’s come to my attention that there are at least two embarrassing oversights in my PostBQP paper from a decade ago, one of them concerning the very definition of PostBQP.  I hasten to clarify: once one fixes up the definition, the PostBQP=PP theorem remains perfectly valid, and all the applications of PostBQP that I mentioned above—for example, to reproving Beigel-Reingold-Spielman, and to the hardness of quantum sampling problems—go through just fine.  But if you think I have nothing to be embarrassed about: well, read on.


The definitional subtlety came clearly to my attention a few weeks ago, when I was lecturing about PostBQP in my 6.845 Quantum Complexity Theory graduate class.  I defined PostBQP as the class of languages L⊆{0,1}* for which there exists a polynomial-time quantum Turing machine M such that, for all inputs x∈{0,1}*,

  • M(x) “succeeds” (determined, say, by measuring its first output qubit in the {|0>,|1>} basis) with nonzero probability.
  • If x∈L, then conditioned on M(x) succeeding, M(x) “accepts” (determined, say, by measuring its second output qubit in the {|0>,|1>} basis) with probability at least 2/3.
  • If x∉L, then conditioned on M(x) succeeding, M(x) accepts with probability at most 1/3.

I then had to reassure the students that PostBQP, so defined, was a “robust” class: that is, that the definition doesn’t depend on stupid things like which set of quantum gates we allow. I argued that, even though we’re postselecting on exponentially-unlikely events, it’s still OK, because the Solovay-Kitaev Theorem lets us approximate any desired unitary to within exponentially-small error, with only a polynomial increase in the size of our quantum circuit. (Here we actually need the full power of the Solovay-Kitaev Theorem, in contrast to ordinary BQP, where we only need part of the power.)
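To spell out the error accounting behind that claim (a back-of-the-envelope sketch, suppressing constants): if the ideal circuit has m = poly(n) gates and the success probability we’re conditioning on is at least 2^{-p(n)}, then approximating each gate to within ε in operator norm shifts every outcome probability by at most about 2mε, so we need

$$ 2m\varepsilon \ll 2^{-p(n)} \;\Longrightarrow\; \varepsilon = 2^{-\mathrm{poly}(n)}, \qquad \text{Solovay-Kitaev cost per gate} = O\!\left(\log^{c}(1/\varepsilon)\right) = \mathrm{poly}(n), $$

which is why the exponentially-small target accuracy still costs only a polynomial blowup in circuit size.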

A student in the class, Michael Cohen, immediately jumped in with a difficulty: what if M(x) succeeded, not with exponentially-small probability, but with doubly-exponentially-small probability—say, exp(-2^n)?  In that case, one could no longer use the Solovay-Kitaev Theorem to show the irrelevance of the gate set.  It would no longer even be clear that PostBQP⊆PP, since the PP simulation might not be able to keep track of such tiny probabilities.

Thinking on my feet, I replied that we could presumably choose a set of gates—for example, gates involving rational numbers only—for which doubly-exponentially-small probabilities would never arise.  Or if all else failed, we could simply add to the definition of PostBQP that M(x) had to “succeed” with probability at least 1/exp(n): after all, that was the only situation I ever cared about anyway, and the only one that ever arose in the applications of PostBQP.

But the question still gnawed at me: was there a problem with my original, unamended definition of PostBQP?  If we weren’t careful in choosing our gate set, could we have cancellations that produced doubly-exponentially-small probabilities?  I promised I’d think about it more.

By a funny coincidence, just a couple weeks later, Ciaran Lee, a student at Oxford, emailed me the exact same question.  So on a train ride from Princeton to Boston, I decided to think about it for real.  It wasn’t hard to show that, if the gates involved square roots of rational numbers only—for example, if we’re dealing with the Hadamard and Toffoli gates, or the cos(π/8) and CNOT gates, or other standard gate sets—then every measurement outcome has at least 1/exp(n) probability, so there’s no problem with the definition of PostBQP.  But I didn’t know what might happen with stranger gate sets.

As is my wont these days—when parenting, teaching, and so forth leave me with almost no time to concentrate on math—I posted the problem to MathOverflow.  Almost immediately, I got incisive responses.  First, Piotr Achinger pointed out that, if we allow arbitrary gates, then it’s easy to get massive cancellations.  In more detail, let {a_n} be an extremely-rapidly-growing sequence of integers, say with a_{n+1} > exp(a_n).  Then define

$$ \alpha = \sum_{n=1}^{\infty} 0.1^{a_n}. $$

If we write out α in decimal notation, it will consist of mostly 0’s, but with 1’s spaced further and further apart, like so: 0.1101000000000001000….  Now consider a gate set that involves α as well as 0.1 and -0.1 as matrix entries.  Given n qubits, it’s not hard to see that we can set up an interference experiment in which one of the paths leading to a given outcome E has amplitude α, and the other paths have amplitudes $$ -(0.1^{a_1}), -(0.1^{a_2}), \ldots, -(0.1^{a_k}), $$ where k is the largest integer such that a_k ≤ n. In that case, the total amplitude of E will be about $$0.1^{a_{k+1}},$$ which for most values of n is doubly-exponentially small in n. Of course, by simply choosing a faster-growing sequence {a_n}, we can cause an even more severe cancellation.
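If you’d like to see the cancellation numerically rather than take my word for it, here’s a quick sketch in exact rational arithmetic.  The sequence and the truncation of α are toy stand-ins of my choosing, since the real α is an infinite sum:

    from fractions import Fraction

    # Toy surrogate for the construction: the real sequence satisfies a_{n+1} > exp(a_n);
    # here [1, 3, 30, 300] is a small finite stand-in, and alpha is truncated to these terms.
    a = [1, 3, 30, 300]
    alpha = sum(Fraction(1, 10) ** ai for ai in a)

    n = 40                                    # "number of qubits" in the thought experiment
    k = sum(1 for ai in a if ai <= n)         # largest k with a_k <= n (here k = 3)

    # One path has amplitude alpha; the other k paths have amplitudes
    # -(0.1^{a_1}), ..., -(0.1^{a_k}).  Almost everything cancels:
    total = alpha - sum(Fraction(1, 10) ** ai for ai in a[:k])

    print(total == Fraction(1, 10) ** a[k])   # True (exactly, because alpha was truncated;
                                              # with the real alpha it holds approximately)
    print(float(total))                       # 1e-300, vastly smaller than 1/exp(40)

With a genuinely super-exponentially-growing sequence in place of the toy one, the leftover amplitude 0.1^{a_{k+1}} is doubly-exponentially small in n, as claimed.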

Furthermore, by modifying the above construction to involve two crazy transcendental numbers α and β, I claim that we can set up a PostBQP computation such that deciding what happens is arbitrarily harder than PP (though still computable)—say, outside of exponential space, or even triple-exponential space. Moreover, we can do this despite the fact that the first n digits of α and β remain computable in O(n) time. The details are left as an exercise for the interested reader.

Yet even though we can engineer massive cancellations with crazy gates, I still conjectured that nothing would go wrong with “normal” gates: for example, gates involving algebraic amplitudes only. More formally, I conjectured that any finite set A = (a_1,…,a_k) of algebraic numbers is “tame,” in the sense that, if p is any degree-n polynomial with integer coefficients at most exp(n) in absolute value, then p(a_1,…,a_k) ≠ 0 implies |p(a_1,…,a_k)| ≥ 1/exp(n). And indeed, Julian Rosen on MathOverflow found an elegant proof of this fact. I’ll let you read it over there if you’re interested, but briefly, it interprets the amplitude we want as one particular Archimedean valuation of a certain element of a number field, and then lower-bounds the amplitude by considering the product of all Archimedean and non-Archimedean valuations (the latter of which involves the p-adic numbers). Since this was a bit heavy-duty for me, I was grateful when Will Sawin reformulated the proof in linear-algebraic terms that I understood.

And then came the embarrassing part. A few days ago, I was chatting with Greg Kuperberg, the renowned mathematician and author of our climate-change parable. I thought he’d be interested in this PostBQP progress, so I mentioned it to him. Delicately, Greg let me know that he had recently proved the exact same results, for the exact same reason (namely, fixing the definition of PostBQP), for the latest revision of his paper How Hard Is It to Approximate the Jones Polynomial?. Moreover, he actually wrote to me in June to tell me about this! At the time, however, I regarded it as “pointless mathematical hairsplitting” (who cares about these low-level gate-set issues anyway?). So I didn’t pay it any attention—and then I’d completely forgotten about Greg’s work when the question resurfaced a few months later. This is truly a just punishment for looking down on “mathematical hairsplitting,” and not a lesson I’ll soon forget.

Anyway, Greg’s paper provides yet a third proof that the algebraic numbers are tame, this one using Galois conjugates (though it turns out that, from a sufficiently refined perspective, Greg’s proof is equivalent to the other two).

There remains one obvious open problem here, one that I noted in the MathOverflow post and in which Greg is also extremely interested. Namely, we now know that it’s possible to screw up PostBQP using gates with amplitudes that are crazy transcendental numbers (closely related to the Liouville numbers). And we also know that, if the gates have algebraic amplitudes, then everything is fine: all events have at least 1/exp(n) probability. But what if the gates have not-so-crazy transcendental amplitudes, like 1/e, or (a bit more realistically) cos(2)?  I conjecture that everything is still fine, but the proof techniques that worked for the algebraic case seem useless here.

Stepping back, how great are the consequences of all this for our understanding of PostBQP? Fortunately, I claim that they’re not that great, for the following reason. As Adleman, DeMarrais, and Huang already noted in 1997—in the same paper that proved BQP⊆PP—we can screw up the definition even of BQP, let alone PostBQP, using a bizarre enough gate set. For example, suppose we had a gate G that mapped |0> to x|0>+y|1>, where y was a real number whose binary expansion encoded the halting problem (for example, y might equal Chaitin’s Ω).  Then by applying G more and more times, we could learn more and more bits of y, and thereby solve an uncomputable problem in the limit n→∞.

Faced with this observation, most quantum computing experts would say something like: “OK, but this is silly! It has no physical relevance, since we’ll never come across a magical gate like G—if only we did! And at any rate, it has nothing to do with quantum computing specifically: even classically, one could imagine a coin that landed heads with probability equal to Chaitin’s Ω. Therefore, the right way to deal with this is simply to define BQP in such a way as to disallow such absurd gates.” And indeed, that is what’s done today—usually without even remarking on it.
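Just to make the classical version of the objection concrete, here’s a toy sketch; the “Ω bits” below are faked, which is of course the whole point about physical relevance:

    import random

    # Pretend these are the first bits of Chaitin's Omega (they aren't, obviously;
    # that's exactly what makes the gate G, or this coin, unphysical).
    FAKE_OMEGA_BITS = [1, 0, 1, 1, 0, 0, 1, 0]
    y = sum(b / 2 ** (i + 1) for i, b in enumerate(FAKE_OMEGA_BITS))

    def flip():
        # A coin that lands heads with probability y.
        return random.random() < y

    def estimate_bits(k, samples):
        # Estimate the first k bits of y by brute-force sampling.  Pinning down k bits
        # needs accuracy ~2^{-k}, hence ~4^k flips, so "learning more and more bits"
        # gets exponentially slower, but in the limit of infinitely many flips you'd
        # decode an uncomputable number.
        heads = sum(flip() for _ in range(samples))
        estimate = heads / samples
        return [int(estimate * 2 ** (i + 1)) % 2 for i in range(k)]

    print(estimate_bits(4, 10 ** 6))          # usually recovers [1, 0, 1, 1]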

Now, it turns out that even gates that are “perfectly safe” for defining BQP can turn “unsafe” when it comes to defining PostBQP. To screw up the definition of PostBQP, it’s not necessary that a gate involve uncomputable (or extremely hard-to-compute) amplitudes: the amplitudes could all be easily computable, but they could still be “unsafe” because of massive cancellations, as in the example above involving α. But one could think of this as a difference of degree, rather than of kind. It’s still true that there’s a large set of gates, including virtually all the gates anyone has ever cared about in practice (Toffoli, Hadamard, π/8, etc. etc.), that are perfectly safe for defining the complexity class; it’s just that the set is slightly smaller than it was for BQP.


The other issue with the PostBQP=PP paper was discovered by Ryan O’Donnell and Cem Say.  In Proposition 3 of the paper, I claim that PostBQP = BQP^{PostBQP||,classical}, where the latter is the class of problems solvable by a BQP machine that’s allowed to make poly(n) parallel, classical queries to a PostBQP oracle.  As Ryan pointed out to me, nothing in my brief argument for this depended on quantum mechanics, so it would equally well show that PostBPP = BPP^{PostBPP||}, where PostBPP (also known as BPP_path) is the classical analogue of PostBQP, and BPP^{PostBPP||} is the class of problems solvable by a BPP machine that can make poly(n) parallel queries to a PostBPP oracle.  But BPP^{PostBPP||} clearly contains BPP^{NP||}, which in turn contains AM—so we would get AM in PostBPP, and therefore AM in PostBQP=PP.  But Vereshchagin gave an oracle relative to which AM is not contained in PP.  Since there was no nonrelativizing ingredient anywhere in my argument, the only possible conclusion is that my argument was wrong.  (This, incidentally, provides a nice illustration of the value of oracle results.)
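Restating the contradiction as a chain of inclusions, with the step that my flawed Proposition 3 argument would have supplied marked by the question mark:

$$ \mathrm{AM} \subseteq \mathrm{BPP}^{\mathrm{NP}}_{\parallel} \subseteq \mathrm{BPP}^{\mathrm{PostBPP}}_{\parallel} \overset{?}{=} \mathrm{PostBPP} \subseteq \mathrm{PostBQP} = \mathrm{PP}. $$

Every other link in the chain relativizes, so Vereshchagin’s oracle pins the blame squarely on the marked step.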

In retrospect, it’s easy to pinpoint what went wrong.  If we try to simulate BPP^{PostBPP||} in PostBPP, our random bits will be playing a dual role: in choosing the queries to be submitted to the PostBPP oracle, and in providing the “raw material for postselection,” in computing the responses to those queries.  But in PostBPP, we only get to postselect once.  When we do, the two sets of random bits that we’d wanted to keep separate will get hopelessly mixed up, with the postselection acting on the “BPP” random bits, not just on the “PostBPP” ones.

How can we fix this problem?  Well, when defining the class BQP^{PostBQP||,classical}, suppose we require the queries to the PostBQP oracle to be not only “classical,” but deterministic: that is, they have to be generated in advance by a P machine, and can’t depend on any random bits whatsoever.  And suppose we define BPP^{PostBPP||,classical} similarly.  In that case, it’s not hard to see that the equalities BQP^{PostBQP||,classical} = PostBQP and BPP^{PostBPP||,classical} = PostBPP both go through.  You don’t actually care about this, do you?  But Ryan O’Donnell and Cem Say did, and that’s good enough for me.


I wish I could say that these are the only cases of mistakes recently being found in decade-old papers of mine, but alas, such is not the case.  In the near future, my student Adam Bouland, MIT undergrad Mitchell Lee, and Singapore’s Joe Fitzsimons will post to the arXiv a paper that grew out of an error in my 2005 paper Quantum Computing and Hidden Variables. In that paper, I introduced a hypothetical generalization of the quantum computing model, in which one gets to see the entire trajectory of a hidden variable, rather than just a single measurement outcome. I showed that this generalization would let us solve problems somewhat beyond what we think we can do with a “standard” quantum computer. In particular, we could solve the collision problem in O(1) queries, efficiently solve Graph Isomorphism (and all other problems in the Statistical Zero-Knowledge class), and search an N-element list in only ~N^{1/3} steps, rather than the ~N^{1/2} steps of Grover’s search algorithm. That part of the paper remains fine!

On the other hand, at the end of the paper, I also gave a brief argument to show that, even in the hidden-variable model, ~N^{1/3} steps are required to search an N-element list. But Mitchell Lee and Adam Bouland discovered that that argument is wrong: it fails to account for all the possible ways that an algorithm could exploit the correlations between the hidden variable’s values at different moments in time. (I’ve previously discussed this error in other blog posts, as well as in the latest edition of Quantum Computing Since Democritus.)

If we suitably restrict the hidden-variable theory, then we can correctly prove a lower bound of ~N^{1/4}, or even (with strong enough assumptions) ~N^{1/3}; and we do that in the forthcoming paper. Even with no restrictions, as far as we know an ~N^{1/3} lower bound for search with hidden variables remains true. But it now looks like proving it will require a major advance in our understanding of hidden-variable theories: for example, a proof that the “Schrödinger theory” is robust to small perturbations, which I’d given as the main open problem in my 2005 paper.

As if that weren’t enough, in my 2003 paper Quantum Certificate Complexity, I claimed (as a side remark) that one could get a recursive Boolean function f with an asymptotic gap between the block sensitivity bs(f) and the randomized certificate complexity RC(f). However, two and a half years ago, Avishay Tal discovered that this didn’t work, because block sensitivity doesn’t behave nicely under composition.  (In assuming it did, I was propagating an error introduced earlier by Wegener and Zádori.)  More broadly, Avishay showed that there is no recursively-defined Boolean function with an asymptotic gap between bs(f) and RC(f). On the other hand, if we just want some Boolean function with an asymptotic gap between bs(f) and RC(f), then Raghav Kulkarni observed that we can use a non-recursive function introduced by Xiaoming Sun, which yields bs(f) ≈ N^{3/7} and RC(f) ≈ N^{4/7}. This is actually a larger separation than the one I’d wrongly claimed.

Now that I’ve come clean about all these things, hopefully the healing can begin at last.

A few quick announcements

Tuesday, October 14th, 2014

I gave a new survey talk at Yale, entitled “When Exactly Do Quantum Computers Provide a Speedup?”  Here are the PowerPoint slides.  Thanks so much to Rob Schoelkopf for inviting me, and to everyone else at Yale for an awesome visit.

Aephraim Steinberg asks me to announce that the call for nominations for the 2015 John Stewart Bell Prize is now available.

Ronitt Rubinfeld asks me to remind people that the STOC’2015 submission deadline is November 4.  Here’s the call for papers.

Likewise, Jeff Kinne asks me to remind people that the Complexity’2015 submission deadline is November 26.  Here’s the call for papers.

Microsoft SVC

Tuesday, September 23rd, 2014

By now, the news that Microsoft abruptly closed its Silicon Valley research lab—leaving dozens of stellar computer scientists jobless—has already been all over the theoretical computer science blogosphere: see, e.g., Lance, Luca, Omer Reingold, Michael Mitzenmacher.  I never made a real visit to Microsoft SVC (only went there once IIRC, for a workshop, while a grad student at Berkeley); now of course I won’t have the chance.

The theoretical computer science community, in the Bay Area and elsewhere, is now mobilizing to offer visiting positions to the “refugees” from Microsoft SVC, until they’re able to find more permanent employment.  I was happy to learn, this week, that MIT’s theory group will likely play a small part in that effort.

Like many others, I confess to bafflement about Microsoft’s reasons for doing this.  Won’t the severe damage to MSR’s painstakingly-built reputation, to its hiring and retention of the best people, outweigh the comparatively small amount of money Microsoft will save?  Did they at least ask Mr. Gates, to see whether he’d chip in the proverbial change under his couch cushions to keep the lab open?  Most of all, why the suddenness?  Why not wind the lab down over a year, giving the scientists time to apply for new jobs in the academic hiring cycle?  It’s not like Microsoft is in a financial crisis, lacking the cash to keep the lights on.

Yet one could also view this announcement as a lesson in why academia exists and is necessary.  Yes, one should applaud those companies that choose to invest a portion of their revenue in basic research—like IBM, the old AT&T, or Microsoft itself (which continues to operate great research outfits in Redmond, Santa Barbara, both Cambridges, Beijing, Bangalore, Munich, Cairo, and Herzliya).  And yes, one should acknowledge the countless times when academia falls short of its ideals, when it too places the short term above the long.  All the same, it seems essential that our civilization maintain institutions for which the pursuit and dissemination of knowledge are not just accoutrements for when financial times are good and the Board of Directors is sympathetic, but are the institution’s entire reasons for being: those activities that the institution has explicitly committed to support for as long as it exists.

Speaking Truth to Parallelism: The Book

Monday, September 22nd, 2014

A few months ago, I signed a contract with MIT Press to publish a new book: an edited anthology of selected posts from this blog, along with all-new updates and commentary.  The book’s tentative title (open to better suggestions) is Speaking Truth to Parallelism: Dispatches from the Frontier of Quantum Computing Theory.  The new book should be more broadly accessible than Quantum Computing Since Democritus, although still far from your typical pop-science book.  My goal is to have STTP out by next fall, to coincide with Shtetl-Optimized’s tenth anniversary.

If you’ve been a regular reader, then this book is my way of thanking you for … oops, that doesn’t sound right.  If it were a gift, I should give it away for free, shouldn’t I?  So let me rephrase: buying this reasonably-priced book can be your way of thanking me, if you’ve enjoyed my blog all these years.  But it will also (I hope) be a value-added proposition: not only will you be able to put the book on your coffee table to impress an extremely nerdy subset of your friends, you’ll also get “exclusive content” unavailable on the blog.

To be clear, the posts that make it into the book will be ruthlessly selected: nothing that’s pure procrastination, politics, current events, venting, or travelogue, only the choice fillets that could plausibly be claimed to advance the public understanding of science.  Even for those, I’ll add additional background material, and take out digs unworthy of a book (making exceptions for anything that really cracks me up on a second reading).

If I had to pick a unifying theme for the book, I’d sigh and then say: it’s about a certain attitude toward the so-called “deepest questions,” like the nature of quantum mechanics or the ultimate limits of computation or the mind/body problem or the objectivity of mathematics or whether our universe is a computer simulation.  It’s an attitude that I wish more popular articles managed to get across, and at any rate, that people ought to adopt when reading those articles.  The attitude combines an openness to extraordinary claims, with an unceasing demand for clarity about the nature of those claims, and an impatience whenever that demand is met with evasion, obfuscation, or a “let’s not get into technicalities right now.”  It’s an attitude that constantly asks questions like:

“OK, so what can you actually do that’s different?”
“Why doesn’t that produce an absurd result when applied to simple cases?”
“Why isn’t that just a fancy way of saying what I could’ve said in simpler language?”
“Why couldn’t you have achieved the same thing without your ‘magic ingredient’?”
“So what’s your alternative account for how that happens?”
“Why isn’t that obvious?”
“What’s really at stake here?”
“What’s the catch?”

It’s an attitude that accepts the possibility that such questions might have satisfying answers—in which case, a change in worldview will be in order.  But not before answers are offered, openly debated, and understood by the community of interested people.

Of all the phrases I use on this blog, I felt “Speaking Truth to Parallelism” best captured the attitude in question.  I coined the phrase back in 2007, when D-Wave’s claims to be solving Sudoku puzzles with a quantum computer unleashed a tsunami of journalism about QCs—what they are, how they would work, what they could do—that (in my opinion) perfectly illustrated how not to approach a metaphysically-confusing new technology.  Having said that, the endless debate around D-Wave won’t by any means be the focus of this book: it will surface, of course, but only when it helps to illustrate some broader point.

In planning this book, the trickiest issue was what to do with comments.  Ultimately, I decided that the comments make Shtetl-Optimized what it is—so for each post I include, I’ll include a brief selection of the most interesting comments, together with my responses to them.  My policy will be this: by default, I’ll consider any comments on this blog to be fair game for quoting in the book, in whole or in part, and attributed to whatever handle the commenter used.  However, if you’d like to “opt out” of having your comments quoted, I now offer you a three-month window in which to do so: just email me, or leave a comment (!) on this thread.  You can also request that certain specific comments of yours not be quoted, or that your handle be removed from your comments, or your full name added to them—whatever you want.

Update (9/24): After hearing from several of you, I’ve decided on the following modified policy.  In all cases where I have an email address, I will contact the commenters about any of their comments that I’m thinking of using, to request explicit permission to use them.  In the hopefully-rare cases where I can’t reach a given commenter, but where their comment raised what seems like a crucial point requiring a response in the book, I might quote from the comment anyway—but in those cases, I’ll be careful not to reproduce very long passages, in a way that might run afoul of the fair use exception.

Steven Pinker’s inflammatory proposal: universities should prioritize academics

Thursday, September 11th, 2014

If you haven’t yet, I urge you to read Steven Pinker’s brilliant piece in The New Republic about what’s broken with America’s “elite” colleges and how to fix it.  The piece starts out as an evisceration of an earlier New Republic article on the same subject by William Deresiewicz.  Pinker agrees with Deresiewicz that something is wrong, but finds Deresiewicz’s diagnosis of what’s wrong to be lacking.  The rest of Pinker’s article sets out his own vision, which involves America’s top universities taking the radical step of focusing on academics, and returning extracurricular activities like sports to their rightful place as extras: ways for students to unwind, rather than a university’s primary reason for existing, or a central criterion for undergraduate admissions.  Most controversially, this would mean that the admissions process at US universities would become more like that in virtually every other advanced country: a relatively-straightforward matter of academic performance, rather than an exercise in peering into the applicants’ souls to find out whether they have a special je ne sais quoi, and the students (and their parents) desperately gaming the intentionally-opaque system, by paying consultants tens of thousands of dollars to develop souls for them.

(Incidentally, readers who haven’t experienced it firsthand might not be able to understand, or believe, just how strange the undergraduate admissions process in the US has become, although Pinker’s anecdotes give some idea.  I imagine anthropologists centuries from now studying American elite university admissions, and the parenting practices that have grown up around them, alongside cannibalism, kamikaze piloting, and other historical extremes of the human condition.)

Pinker points out that a way to assess students’ ability to do college coursework—much more quickly and accurately than by relying on the soul-detecting skills of admissions officers—has existed for a century.  It’s called the standardized test.  But unlike in the rest of the world (even in ultraliberal Western Europe), standardized tests are politically toxic in the US, seen as instruments of racism, classism, and oppression.  Pinker reminds us of the immense irony here: standardized tests were invented as a radical democratizing tool, as a way to give kids from poor and immigrant families the chance to attend colleges that had previously only been open to the children of the elite.  They succeeded at that goal—too well for some people’s comfort.

We now know that the Ivies’ current emphasis on sports, “character,” “well-roundedness,” and geographic diversity in undergraduate admissions was consciously designed (read that again) in the 1920s, by the presidents of Harvard, Princeton, and Yale, as a tactic to limit the enrollment of Jews.  Nowadays, of course, the Ivies’ “holistic” admissions process no longer fulfills that original purpose, in part because American Jews learned to play the “well-roundedness” game as well as anyone, shuttling their teenage kids between sports, band practice, and faux charity work, while hiring professionals to ghostwrite application essays that speak searingly from the heart.  Today, a major effect of “holistic” admissions is instead to limit the enrollment of Asian-Americans (especially recent immigrants), who tend disproportionately to have superb SAT scores, but to be deficient in life’s more meaningful dimensions, such as lacrosse, student government, and marching band.  More generally—again, pause to wallow in the irony—our “progressive” admissions process works strongly in favor of the upper-middle-class families who know how to navigate it, and against the poor and working-class families who don’t.

Defenders of the status quo have missed this reality on the ground, it seems to me, because they’re obsessed with the notion that standardized tests are “reductive”: that is, that they reduce a human being to a number.  Aren’t there geniuses who bomb standardized tests, they ask, as well as unimaginative grinds who ace them?  And if you make test scores a major factor in admissions, then won’t students and teachers train for the tests, and won’t that pervert open-ended intellectual curiosity?  The answer to both questions, I think, is clearly “yes.”  But the status-quo-defenders never seem to take the next step, of examining the alternatives to standardized testing, to see whether they’re even worse.

I’d say the truth is this: spots at the top universities are so coveted, and so much rarer than the demand, that no matter what you use as your admissions criterion, that thing will instantly get fetishized and turned into a commodity by students, parents, and companies eager to profit from their anxiety.  If it’s grades, you’ll get a grades fetish; if sports, you’ll get a sports fetish; if community involvement, you’ll get soup kitchens sprouting up for the sole purpose of giving ambitious 17-year-olds something to write about in their application essays.  If Harvard and Princeton announced that from now on, they only wanted the most laid-back, unambitious kids, the ones who spent their summers lazily skipping stones in a lake, rather than organizing their whole lives around getting in to Harvard and Princeton, tens of thousands of parents in the New York metropolitan area would immediately enroll their kids in relaxation and stone-skipping prep courses.  So, given that reality, why not at least make the fetishized criterion one that’s uniform, explicit, predictively valid, relatively hard to game, and relevant to universities’ core intellectual mission?

(Here, I’m ignoring criticisms specific to the SAT: for example, that it fails to differentiate students at the extreme right end of the bell curve, thereby forcing the top schools to use other criteria.  Even if those criticisms are true, they could easily be fixed by switching to other tests.)

I admit that my views on this matter might be colored by my strange (though as I’ve learned, not at all unique) experience, of getting rejected from almost every “top” college in the United States, and then, ten years later, getting recruited for faculty jobs by the very same institutions that had rejected me as a teenager.  Once you understand how undergraduate admissions work, the rejections were unsurprising: I was a 15-year-old with perfect SATs and a published research paper, but not only was I young and immature, with spotty grades and a weird academic trajectory, I had no sports, no music, no diverse leadership experiences.  I was a narrow, linear, A-to-B thinker who lacked depth and emotional intelligence: the exact opposite of what Harvard and Princeton were looking for in every way.  The real miracle is that despite these massive strikes against me, two schools—Cornell and Carnegie Mellon—were nice enough to give me a chance.  (I ended up going to Cornell, where I got a great education.)

Some people would say: so then what’s the big deal?  If Harvard or MIT reject some students that maybe they should have admitted, those students will simply go elsewhere, where—if they’re really that good—they’ll do every bit as well as they would’ve done at the so-called “top” schools.  But to me, that’s uncomfortably close to saying: there are millions of people who go on to succeed in life despite childhoods of neglect and poverty.  Indeed, some of those people succeed partly because of their rough childhoods, which served as the crucibles of their character and resolve.  Ergo, let’s neglect our own children, so that they too can have the privilege of learning from the school of hard knocks just like we did.  The fact that many people turn out fine despite unfairness and adversity doesn’t mean that we should inflict unfairness if we can avoid it.

Let me end with an important clarification.  Am I saying that, if I had dictatorial control over a university (ha!), I would base undergraduate admissions solely on standardized test scores?  Actually, no.  Here’s what I would do: I would admit the majority of students mostly based on test scores.  A minority, I would admit because of something special about them that wasn’t captured by test scores, whether that something was musical or artistic talent, volunteer work in Africa, a bestselling smartphone app they’d written, a childhood as an orphaned war refugee, or membership in an underrepresented minority.  Crucially, though, the special something would need to be special.  What I wouldn’t do is what’s done today: namely, to turn “specialness” and “well-roundedness” into commodities that the great mass of applicants have to manufacture before they can even be considered.

Other than that, I would barely look at high-school grades, regarding them as too variable from one school to another.  And, while conceding it might be impossible, I would try hard to keep my university in good enough financial shape that it didn’t need any legacy or development admits at all.


Update (Sep. 14): For those who feel I’m exaggerating the situation, please read the story of commenter Jon, about a homeschooled 15-year-old doing graduate-level work in math who, three years ago, was refused undergraduate admission to both Berkeley and Caltech, with the math faculty powerless to influence the admissions officers. See also my response.

Raise a martini glass for Google and Martinis!

Saturday, September 6th, 2014

We’ve already been discussing this in the comments section of my previous post, but a few people emailed me to ask when I’d devote a separate blog post to the news.

OK, so for those who haven’t yet heard: this week Google’s Quantum AI Lab announced that it’s teaming up with John Martinis, of the University of California, Santa Barbara, to accelerate the Martinis group’s already-amazing efforts in superconducting quantum computing.  (See here for the MIT Tech’s article, here for Wired’s, and here for the WSJ’s.)  Besides building some of the best (if not the best) superconducting qubits in the world, Martinis, along with Matthias Troyer, was also one of the coauthors of two important papers that found no evidence for any speedup in the D-Wave machines.  (However, in addition to working with the Martinis group, Google says it will also continue its partnership with D-Wave, in an apparent effort to keep reality more soap-operatically interesting than any hypothetical scenario one could make up on a blog.)

I have the great honor of knowing John Martinis, even once sharing the stage with him at a “Physics Cafe” in Aspen.  Like everyone else in our field, I profoundly admire the accomplishments of his group: they’ve achieved coherence times in the tens of microseconds, demonstrated some of the building blocks of quantum error-correction, and gotten tantalizingly close to the fault-tolerance threshold for the surface code.  (When, in D-Wave threads, people have challenged me: “OK Scott, so then which experimental quantum computing groups should be supported more?,” my answer has always been some variant of: “groups like John Martinis’s.”)

So I’m excited about this partnership, and I wish it the very best.

But I know people will ask: apart from the support and well-wishes, do I have any predictions?  Alright, here’s one.  I predict that, regardless of what happens, commenters here will somehow make it out that I was wrong.  So for example, if the Martinis group, supported by Google, ultimately succeeds in building a useful, scalable quantum computer—by emphasizing error-correction, long coherence times (measured in the conventional way), “gate-model” quantum algorithms, universality, and all the other things that D-Wave founder Geordie Rose has pooh-poohed from the beginning—commenters will claim that still most of the credit belongs to D-Wave, for whetting Google’s appetite, and for getting it involved in superconducting QC in the first place.  (The unstated implication being that, even if there were little or no evidence that D-Wave’s approach would ever lead to a genuine speedup, we skeptics still would’ve been wrong to state that truth in public.)  Conversely, if this venture doesn’t live up to the initial hopes, commenters will claim that that just proves Google’s mistake: rather than “selling out to appease the ivory-tower skeptics,” they should’ve doubled down on D-Wave.  Even if something completely different happens—let’s say, Google, UCSB, and D-Wave jointly abandon their quantum computing ambitions, and instead partner with ISIS to establish the world’s first “Qualiphate,” ruling with a niobium fist over California and parts of Oregon—I would’ve been wrong for having failed to foresee that.  (Even if I did sort of foresee it in the last sentence…)

Yet, while I’ll never live to see the blog-commentariat acknowledge the fundamental reasonableness of my views, I might live to see scalable quantum computers become a reality, and that would surely be some consolation.  For that reason, even if for no others, I once again wish the Martinis group and Google’s Quantum AI Lab the best in their new partnership.


Unrelated Announcement: Check out a lovely (very basic) introductory video on quantum computing and information, narrated by John Preskill and Spiros Michalakis, and illustrated by Jorge Cham of PhD Comics.

Subhash Khot’s prizewinning research

Saturday, August 16th, 2014

I already congratulated Subhash Khot in my last post for winning the Nevanlinna Award, but this really deserves a separate post.  Khot won theoretical computer science’s highest award largely for introducing and exploring the Unique Games Conjecture (UGC), which says (in one sentence) that a large number of the approximation problems that no one has been able to prove NP-hard, really are NP-hard.  In particular, if the UGC is true, then for MAX-CUT and dozens of other important optimization problems, no polynomial-time algorithm can always get you closer to the optimal solution than some semidefinite-programming-based algorithm gets you, unless P=NP.  The UGC might or might not be true—unlike with (say) P≠NP itself, there’s no firm consensus around it—but even if it’s false, the effort to prove or disprove it has by now had a huge impact on theoretical computer science research, leading to connections with geometry, tiling, analysis of Boolean functions, quantum entanglement, and more.

There are a few features that make the UGC interesting, compared to most other questions considered in complexity theory.  Firstly, the problem that the UGC asserts is NP-hard—basically, given a list of linear equations in 2 variables each, to satisfy as many of the equations as you can—is a problem with “imperfect completeness.”  This means that, if you just wanted to know whether all the linear equations were simultaneously satisfiable, the question would be trivial to answer, using Gaussian elimination.  So the problem only becomes interesting once you’re told that the equations are not simultaneously satisfiable, but you’d like to know (say) whether it’s possible to satisfy 99% of the equations or only 1%.  A second feature is that, because of the 2010 work of Arora, Barak, and Steurer, we know that there is an algorithm that solves the unique games problem in “subexponential time”: specifically, in time exp(n^{poly(δ)}), where δ is the completeness error (that is, the fraction of linear equations that are unsatisfiable, in the case that most of them are satisfiable).  This doesn’t mean that the unique games problem can’t be NP-hard: it just means that, if there is an NP-hardness proof, then the reduction will need to blow up the instance sizes by an n^{poly(1/δ)} factor.
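To make the object in question concrete, here’s a toy instance of the kind of 2-variable linear system the unique games problem is about, namely constraints of the form x_i - x_j ≡ c (mod k).  The modulus and the equations below are made up, and brute force stands in for the actual algorithms of interest:

    from itertools import product

    # Toy unique-games instance: constraints x_i - x_j = c (mod k) on 4 variables.
    k = 3
    equations = [(0, 1, 1), (1, 2, 1), (2, 0, 1), (0, 3, 2), (3, 1, 0)]

    def num_satisfied(assignment):
        return sum((assignment[i] - assignment[j]) % k == c for i, j, c in equations)

    # Deciding whether ALL the equations can be satisfied is easy (propagate values,
    # a la Gaussian elimination); the interesting question is how many you can satisfy
    # when they can't all be.  Brute force over k^4 assignments works only for a toy:
    best = max(num_satisfied(a) for a in product(range(k), repeat=4))
    print(best, "of", len(equations), "equations satisfiable")   # 4 of 5

Here no assignment satisfies all five equations, but four out of five is achievable; the UGC concerns how hard it is to distinguish instances where nearly all the equations can be satisfied from instances where hardly any can.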

To be clear, neither of the above features is unique (har, har) to unique games: we’ve long known NP-complete problems, like MAX-2SAT, that have the imperfect completeness feature, and we also know NP-hardness reductions that blow up the instance size by an n^{poly(1/δ)} factor for inherent reasons (for example, for the Set Cover problem).  But perhaps nothing points as clearly as UGC at the directions that researchers in hardness of approximation and probabilistically checkable proofs (PCP) would like to be able to go.  A proof of the Unique Games Conjecture would basically be a PCP theorem on steroids.  (Or, since we already have “PCP theorems on steroids,” maybe a PCP theorem on PCP?)

It’s important to understand that, between the UGC being true and the unique games problem being solvable in polynomial time, there’s a wide range of intermediate possibilities, many of which are being actively investigated.  For example, the unique games problem could be “NP-hard,” but via a reduction that itself takes subexponential time (i.e., it could be hard assuming the Exponential-Time Hypothesis).  It could be solvable much faster than Arora-Barak-Steurer but still not in P.  Or, even if the problem weren’t solvable any faster than is currently known, it could be “hard without being NP-hard,” having a similar status to factoring or graph isomorphism.  Much current research into the UGC is focused on a particular algorithm called the Sum-of-Squares algorithm (i.e., the Lasserre hierarchy).  Some researchers suspect that, if any algorithm will solve the unique games problem in polynomial time (or close to that), it will be Sum-of-Squares; conversely, if one could show that Sum-of-Squares failed, one would’ve taken a major step toward proving the UGC.

For more, I recommend this Quanta magazine article, or Luca Trevisan’s survey, or Subhash’s own survey.  Or those pressed for time can simply check out this video interview with Subhash.  If you’d like to try my wife Dana’s puzzle games inspired by PCP, which Subhash uses 2 minutes into the video to explain what he works on, see here.  Online, interactive versions of these puzzle games are currently under development.  Also, if you have questions about the UGC or Subhash’s work, go ahead and ask: I’ll answer if I can, and otherwise rely on in-house expertise.

Congratulations again to Subhash!

Seth Teller (1964-2014)

Friday, July 11th, 2014

Seth Teller

Seth Teller was a colleague of mine in CSAIL and the EECS department, and was one of my favorite people in all of MIT.  He was a brilliant roboticist, who (among many other things) spearheaded MIT’s participation in the DARPA Grand Challenge for self-driving cars, and who just recently returned from a fact-finding trip to Fukushima, Japan, to see how robots could help in investigating the damaged reactor cores there.  I saw Seth twice a week at lab and department lunches, and he often struck up conversations with me about quantum computing, cosmology, and other things.  His curiosity was immense, wide-ranging, and almost childlike (in the best way).  One small indication of his character is that, in the DARPA challenge, Seth opted not to preload MIT’s car with detailed data about the course, because he thought doing so made the challenge scientifically less interesting—even though DARPA’s rules allowed such preloading, the other teams did it, and it almost certainly would have improved MIT’s standing in the competition.

Seth was a phenomenal speaker, whose passion and clarity always won me over even though my research interests were far from his.  I made it a point to show up for lab lunch whenever I knew he’d be speaking.  Seth was also, from what I’ve heard, a superb mentor and teacher, who won an award earlier this year for his undergraduate advising.

Seth died ten days ago, on July 1st.  (See here for MIT News’s detailed obituary, and here for an article in Cambridge Day.)  While no cause of death was given at the time, according to an update yesterday in the MIT Tech, the death has been ruled a suicide.  Seth is survived by his wife, Rachel, and by two daughters.

With his cheerful, can-do disposition, Seth is one of the last people on earth I’d imagine doing this: whatever he was going through, he did an unbelievable job of hiding it.  I’m certain he wouldn’t abandon his family unless he was suffering unimaginable pain.  If there’s a tiny atom of good to come out of this, I hope that at least one other person contemplating suicide will reflect on how much Seth had to live for, and that doing so will inspire that person to get the medical help they need.

Incidentally, outside of his research and teaching, Seth was also an activist for protecting the physical environment and open spaces of East Cambridge.  At the “Wild and Crazy Ideas Session” of one CSAIL retreat, Seth floated a truly wild idea: to replace Memorial Drive, or at least the part of it that separates the MIT campus from the Charles River, by an underground tunnel, so that the land above the tunnel could be turned into a beautiful riverfront park.  In his characteristic fashion, Seth had already done a pretty detailed engineering analysis, estimating the cost at “merely” a few hundred million dollars: a lot, but a worthy investment in MIT’s future.  In any case, I can’t imagine a better way to memorialize Seth than to name some green space in East Cambridge after him, and I hope that happens.

Seth will be sorely missed.  My thoughts go out to his family at this difficult time.

CCC’s Declaration of Independence

Friday, June 6th, 2014

Recently, the participants of the Conference on Computational Complexity (CCC)—the latest iteration of which I’ll be speaking at next week in Vancouver—voted to declare their independence from the IEEE, and to become a solo, researcher-organized conference.  See this open letter for the reasons why (basically, IEEE charged a huge overhead, didn’t allow open access to the proceedings, and increased rather than decreased the administrative burden on the organizers).  As a former member of the CCC Steering Committee, I’m in violent agreement with this move, and only wish we’d managed to do it sooner.

Now, Dieter van Melkebeek (the current Steering Committee chair) is asking complexity theorists to sign a public Letter of Support, to make it crystal-clear that the community is behind the move to independence.  And Jeff Kinne has asked me to advertise the letter on my blog.  So, if you’re a complexity theorist who agrees with the move, please go there and sign (it already has 111 signatures, but could use more).

Meanwhile, I wish to express my profound gratitude to Dieter, Jeff, and the other Steering Committee members for their efforts toward independence.  The only thing I might’ve done differently would be to add a little more … I dunno, pizzazz to the documents explaining the reasons for the move.  Like:

When in the Course of human events, it becomes necessary for a conference to dissolve the organizational bands that have connected it with the IEEE, and to assume among the powers of the earth, the separate and equal station to which the Laws of Mathematics and the CCC Charter entitle it, a decent respect to the opinions of theorist-kind requires that the participants should declare the causes which impel them to the separation.

We hold these truths to be self-evident (but still in need of proof), that P and NP are created unequal, that one-way functions exist, that the polynomial hierarchy is infinite…