## Archive for the ‘Announcements’ Category

### The 8000th Busy Beaver number eludes ZF set theory: new paper by Adam Yedidia and me

Tuesday, May 3rd, 2016

I’ve supervised a lot of great student projects in my nine years at MIT, but my inner nerdy teenager has never been as personally delighted by a project as it is right now.  Today, I’m proud to announce that Adam Yedidia, a PhD student at MIT (but an MEng student when he did most of this work), has explicitly constructed a one-tape, two-symbol Turing machine with 7,918 states, whose behavior (when run on a blank tape) can never be proven from the usual axioms of set theory, under reasonable consistency hypotheses.  Adam has also constructed a 4,888-state Turing machine that halts iff there’s a counterexample to Goldbach’s Conjecture, and a 5,372-state machine that halts iff there’s a counterexample to the Riemann Hypothesis.  In all three cases, this is the first time we’ve had a reasonable explicit upper bound on how many states you need in a Turing machine before you can see the behavior in question.

Here’s our research paper, on which Adam generously included me as a coauthor, even though he did the heavy lifting.  Also, here’s a github repository where you can download all the code Adam used to generate these Turing machines, and even use it to build your own small Turing machines that encode interesting mathematical statements.  Finally, here’s a YouTube video where Adam walks you through how to use his tools.

A more precise statement of our main result is this: we give a 7,918-state Turing machine, called Z (and actually explicitly listed in our paper!), such that:

1. Z runs forever, assuming the consistency of a large-cardinal theory called SRP (Stationary Ramsey Property), but
2. Z can’t be proved to run forever in ZFC (Zermelo-Fraenkel set theory with the Axiom of Choice, the usual foundation for mathematics), assuming that ZFC is consistent.

A bit of background: it follows, as an immediate consequence of Gödel’s Incompleteness Theorem, that there’s some computer program, of some length, that eludes the power of ordinary mathematics to prove what it does, when it’s run with an unlimited amount of memory.  So for example, such a program could simply enumerate all the possible consequences of the ZFC axioms, one after another, and halt if it ever found a contradiction (e.g., a proof of 1+1=3).  Assuming ZFC is consistent, this program must run forever.  But again assuming ZFC is consistent, ZFC can’t prove that the program runs forever, since if it did, then it would prove its own consistency, thereby violating the Second Incompleteness Theorem!

Alas, this argument still leaves us in the dark about where, in space of computer programs, the “Gödelian gremlin” rears its undecidable head.  A program that searches for an inconsistency in ZFC is a fairly complicated animal: it needs to encode not only the ZFC axiom schema, but also the language and inference rules of first-order logic.  Such a program might be thousands of lines long if written in a standard programming language like C, or millions of instructions if compiled down to a bare-bones machine code.  You’d certainly never run across such a program by chance—not even if you had a computer the size of the observable universe, trying one random program after another for billions of years in a “primordial soup”!

So the question stands—a question that strikes me as obviously important, even though as far as I know, only one or two people ever asked the question before us; see here for example.  Namely: do the axioms of set theory suffice to analyze the behavior of every computer program that’s at most, let’s say, 50 machine instructions long?  Or are there super-short programs that already exhibit “Gödelian behavior”?

Theoretical computer scientists might object that this is “merely a question of constants.”  Well yes, OK, but the origin of life in our universe—a not entirely unrelated puzzle—is also “merely a question of constants”!  In more detail, we know that it’s possible with our laws of physics to build a self-replicating machine: say, DNA or RNA and their associated paraphernalia.  We also know that tiny molecules like H2O and CO2 are not self-replicating.  But we don’t know how small the smallest self-replicating molecule can be—and that’s an issue that influences whether we should expect to find ourselves alone in the universe or find it teeming with life.

Some people might also object that what we’re asking about has already been studied, in the half-century quest to design the smallest universal Turing machine (the subject of Stephen Wolfram’s $25,000 prize in 2007, to which I responded with my own$25.00 prize).  But I see that as fundamentally different, for the following reason.  A universal Turing machine—that is, a machine that simulates any other machine that’s described to it on its input tape—has the privilege of offloading almost all of its complexity onto the description format for the input machine.  So indeed, that’s exactly what all known tiny universal machines do!  But a program that checks (say) Goldbach’s Conjecture, or the Riemann Hypothesis, or the consistency of set theory, on an initially blank tape, has no such liberty.  For such machines, the number of states really does seem like an intrinsic measure of complexity, because the complexity can’t be shoehorned anywhere else.

One can also phrase what we’re asking in terms of the infamous Busy Beaver function.  Recall that BB(n), or the nth Busy Beaver number, is defined to be the maximum number of steps that any n-state Turing machine takes when run on an initially blank tape, assuming that the machine eventually halts. The Busy Beaver function was the centerpiece of my 1998 essay Who Can Name the Bigger Number?, which might still attract more readers than anything else I’ve written since. As I stressed there, if you’re in a biggest-number-naming contest, and you write “BB(10000),” you’ll destroy any opponent—however otherwise mathematically literate they are—who’s innocent of computability theory.  For BB(n) grows faster than any computable sequence of integers: indeed, if it didn’t, then one could use that fact to solve the halting problem, contradicting Turing’s theorem.

But the BB function has a second amazing property: namely, it’s a perfectly well-defined integer function, and yet once you fix the axioms of mathematics, only finitely many values of the function can ever be proved, even in principle.  To see why, consider again a Turing machine M that halts if and only if there’s a contradiction in ZF set theory.  Clearly such a machine could be built, with some finite number of states k.  But then ZF set theory can’t possibly determine the value of BB(k) (or BB(k+1), BB(k+2), etc.), unless ZF is inconsistent!  For to do so, ZF would need to prove that M ran forever, and therefore prove its own consistency, and therefore be inconsistent by Gödel’s Theorem.

OK, but we can now ask a quantitative question: how many values of the BB function is it possible for us to know?  Where exactly is the precipice at which this function “departs the realm of mortals and enters the realm of God”: is it closer to n=10 or to n=10,000,000?  In practice, four values of BB have been determined so far:

• BB(1)=1
• BB(2)=6
• BB(3)=21 (Lin and Rado 1965)

We also know some lower bounds:

See Heiner Marxen’s page or the Googology Wiki (which somehow I only learned about today) for more information.

Some Busy Beaver enthusiasts have opined that even BB(6) will never be known exactly.  On the other hand, the abstract argument from before tells us only that, if we confine ourselves to (say) ZF set theory, then there’s some k—possibly in the tens of millions or higher—such that the values of BB(k), BB(k+1), BB(k+2), and so on can never be proven.  So again: is the number of knowable values of the BB function more like 10, or more like a million?

It’s hopeless to design a Turing machine by hand for all but the simplest tasks, so as a first step, Adam created a new programming language, called Laconic, specifically for writing programs that compile down to small Turing machines.  Laconic programs actually compile to an intermediary language called TMD (Turing Machine Descriptor), and from there to Turing machines.

Even then, we estimate that a direct attempt to write a Laconic program that searched for a contradiction in ZFC would lead to a Turing machine with millions of states.  There were three ideas needed to get the state count down to something reasonable.

The first was to take advantage of the work of Harvey Friedman, who’s one of the one or two people I mentioned earlier who’s written about these problems before.  In particular, Friedman has been laboring since the 1960s to find “natural” arithmetical statements that are provably independent of ZFC or other strong set theories.  (See this AMS Notices piece by Martin Davis for a discussion of Friedman’s progress as of 2006.)  Not only does Friedman’s quest continue, but some of his most important progress has come only within the last year.  His statements—typically involving objects called “order-invariant graphs”—strike me as alien, and as far removed from anything I’d personally have independent reasons to think about (but is that just a sign of my limited perspective?).  Be that as it may, Friedman’s statements still seem a lot easier to encode as short computer programs than the full apparatus of first-order logic and set theory!  So that’s what we started with; our work wouldn’t have been possible without Friedman (who we consulted by email throughout the project).

The second idea was something we called “on-tape processing.”  Basically, instead of compiling directly from Laconic down to Turing machine, Adam wrote an interpreter in Turing machine (which took about 4000 states—a single, fixed cost), and then had the final Turing machine first write a higher-level program onto its tape and then interpret that program.  Instead of the compilation process producing a huge multiplicative overhead in the number of Turing machine states (and a repetitive machine), this approach gives us only an additive overhead.  We found that this one idea decreased the number of states by roughly an order of magnitude.

The third idea was first suggested in 2002 by Ben-Amram and Petersen (and refined for us by Luke Schaeffer); we call it “introspective encoding.”  When we write the program to be interpreted onto the Turing machine tape, the naïve approach would use one Turing machine state per bit.  But that’s clearly wasteful, since in an n-state Turing machine, every state contains ~log(n) bits of information (because of the other states it needs to point to).  A better approach tries to exploit as many of those bits as it can; doing that gave us up to a factor-of-5 additional savings in the number of states.

For Goldbach’s Conjecture and the Riemann Hypothesis, we paid the same 4000-state overhead for the interpreter, but then the program to be interpreted was simpler, giving a smaller overall machine.  Incidentally, it’s not intuitively obvious that the Riemann Hypothesis is equivalent to the statement that some particular computer program runs forever, but it is—that follows, for example, from work by Lagarias and by Davis, Matijasevich, and Robinson (we used the latter; an earlier version of this post incorrectly stated that we used the Lagarias result).

To preempt the inevitable question in the comments section: yes, we did run these Turing machines for a while, and no, none of them had halted after a day or so.  But before you interpret that as evidence in favor of Goldbach, Riemann, and the consistency of ZFC, you should probably know that a Turing machine to test whether all perfect squares are less than 5, produced using Laconic, needed to run for more than an hour before it found the first counterexample (namely, 32=9) and halted.  Laconic Turing machines are optimized only for the number of states, not for speed, to put it mildly.

Of course, three orders of magnitude still remain between the largest value of n (namely, 4) for which BB(n) is known to be knowable in ZFC-based mathematics, and the smallest value of n (namely, 7,918) for which BB(n) is known to be unknowable.  I’m optimistic that further improvements are possible to the machine Z—whether that means simplifications to Friedman’s statement, a redesigned interpreter (possibly using lambda calculus?), or a “multi-stage rocket model” where a bare-bones interpreter would be used to unpack a second, richer interpreter which would be used to unpack a third, etc., until you got to the actual program you cared about.  But I’d be shocked if anyone in my lifetime determined the value of BB(10), for example, or proved the value independent of set theory.  Even after the Singularity happens, I imagine that our robot overlords would find the determination of BB(10) quite a challenge.

In an early Shtetl-Optimized post, I described theoretical computer science as “quantitative epistemology.”  Constructing small Turing machines whose behavior eludes set theory is not conventional theoretical computer science by any stretch of the imagination: it’s closer in practice to programming languages or computer architecture, or even the recreational practice known as code-golfing.  On the other hand, I’ve never been involved with any other project that was so clearly, explicitly about pinning down the quantitative boundary between the knowable and the unknowable.

Comments on our paper are welcome.

Addendum: Some people might wonder “why Turing machines,” as opposed to a more reasonable programming language like C or Python.  Well, first of all, we needed a language that could address an unlimited amount of memory.  Also, the BB function is traditionally defined in terms of Turing machines.  But the most important issue is that we wanted there to be no suspicion whatsoever that our choice of programming language was artificially helping to make our machine small.  And hopefully everyone can agree that one-tape, two-symbol Turing machines aren’t designed for anyone’s convenience!

### A postdoc post

Saturday, March 19th, 2016

I apologize that this announcement is late in this year’s hiring season, but here goes.  I’m seeking postdocs in computational complexity and/or quantum information science to join me at UT Austin starting in Fall of 2016.  As I mentioned before, there’s a wonderful CS theory group at UT that you can work with and benefit from, including Adam Klivans, David Zuckerman, Anna Gal, Vijaya Ramachandran, Brent Waters, Eric Price, Greg Plaxton, and of course my wife Dana Moshkovitz, who will be joining UT as well.  If you’re interested, please email me a CV and a short cover letter, and ask your PhD adviser and one or two others to email me recommendation letters.  The postdoc would be for two years by default.

Update (March 26): If you want to be considered for next year, please get your application to me by March 31st.

Another Update: I’m very honored, along with fourteen others, to have received a 2016 US National Security Science and Engineering Faculty Fellowship (NSSEFF), which supports unclassified basic research related in some way to DoD interests. My project is called “Paths to Quantum Supremacy.” Now that my Waterman award has basically been spent down, this is where much of the funding for quantum computing initiatives at UT Austin will come from for the next five years.

### From Boston to Austin

Sunday, February 28th, 2016

I have some big news—well, not for the world, but for me personally.  Starting this summer, I’ll be leaving MIT, and starting a new phase of my life, as David J. Bruton Jr. Centennial Professor of Computer Science at the University of Texas at Austin.  I’ll also be the founding director of UT Austin’s new quantum computing center, whose motto will be “Hook ’em Hadamards“, and whose logo will depict a fierce longhorn bull, whose horns are bra and ket signs enclosing an inner product between two quantum states.  My brilliant and talented wife, Dana Moshkovitz Aaronson, will also be joining UT Austin, as a tenured Associate Professor of Computer Science.  Our current PhD students will remain enrolled at MIT, while also spending as much time as they like in Austin.

I’ll deeply miss MIT and Boston.  More than anything else, I’ll miss the phenomenal students at MIT, who I’ve had the immense privilege to teach and learn from for nine years.  Go Beavers!  I’m grateful as well to my many friends and colleagues who made my years at MIT so rewarding: a time of personal growth, in which I developed from a skinny, insecure 26-year-old nerd, blogging and trying to prove oracle separations, into a pot-bellied, tenured, 34-year-old married-father nerd, still blogging and trying to prove the same oracle separations (but no longer as diligently).

To nip an otherwise-inevitable rumor: I wasn’t forced to leave MIT over anything here on Shtetl-Optimized.  I feel the need to say this because, within the last year, I’ve spent hundreds of miserable hours scrolling through social media threads wherein stranger after stranger proclaimed me basically the world’s worst scum (tied, perhaps, with the other Scott A.), and even called on MIT to fire me.  Given that experience, it was repeatedly jarring for me to reenter reality and discover how irrelevant this all was, either to MIT or to any of the universities that recruited me and Dana.  Bizarre as it sounds, CS departments mostly cared about what actual research we were doing and could bring to them!  So students and faculty afraid to debate anything controversial online under their real names, however politely, should know that even in 2016, the banner of academic freedom yet waves.

Without further ado, let me list ten things that are awesome about Austin and that helped attract me and Dana there.

1. One of the strongest CS departments and theory groups in the world.  From 1984 until his death in 2002, UT Austin was home to Edsger Dijkstra, who not only discovered Dijkstra’s algorithm but also penned the immortal words that might as well be tattooed on my stomach: computer science is no more about computers than astronomy is about telescopes.  Today, Austin’s CS department is rapidly expanding, and just within theory, is home to David Zuckerman, Anna Gal, Vijaya Ramachandran, Brent Waters, Eric Price, and Greg Plaxton.  With me and Dana there as well, I can say with all due modesty that we intend to compete against any CS theory program anywhere in the world.
2. Adam Klivans.  The closest I’ve had to a mentor in the exceedingly narrow field of theoretical computer science humor.
3. An outstanding recent track record with CS theory PhD students.  Since the turn of the century, UT Austin has produced Sasha Sherstov, Anup Rao, Allison Bishop LewkoSeth Pettie, Vladimir Trifonov, Raghu Meka, and other stars of the CS theory world.  That record lets me without the slightest hesitation tell hotshot undergrads who want to do classical and/or quantum complexity theory to apply to Austin for grad school.
4. The opportunity to build—or rather, rebuild—a UT presence in quantum computing.  While I’m excited to help build a new group—and I feel like it’s the right time in my career to do that—I can’t say that this is the first time UT Austin will have a significant presence in quantum computing.  Way back in the late 70s and early 80s, UT was home to most of the (proto) “quantum computing research” that existed on earth.  It’s there that John Archibald Wheeler philosophized about “It from Bit,” that Bryce deWitt popularized the Many-Worlds Interpretation and Hugh Everett gave his only public lecture on the subject, that David Deutsch did a four-year postdoc in which he formed the seeds of the idea of quantum computing, and that Wojciech Zurek, William Wootters, and Benjamin Schumacher (who between them, founded decoherence theory, proved the No-Cloning Theorem, and coined the term “qubit”) did their PhDs.  I’m honored to tread in their footsteps.
5. Money.  Texas, as it turns out, has a lot of it.  Now, the conventional wisdom would be that Texas’ wealth is mostly irrelevant to academic scientists, because it’s controlled by reactionary oilmen for whom curiosity-driven research is not exactly the top priority.  That might have been true about the administrations of George W. Bush or Rick Perry.  But Texas’ current governor, Greg Abbott, while still a right-wing Republican, also pushed through an aggressive \$4-billion measure called the Governor’s University Research Initiative, one of whose central goals is to recruit leading scientists to Texas.
6. Weather.  To a first approximation, Austin is lovely and pleasant during the academic year (even as the planet warms, this should remain true for at least a few more decades)—and while I’d sort of vaguely imagined all of Texas as a giant desert, Austin turns out to be lush and green and full of swimming holes.  The summers, of course, are hot enough to fuse hydrogen.  But for academics like me and Dana, it turns out that there’s an elegant solution to that, one unavailable for dealing with New England winters.  That solution is to leave town, to use June, July, and August for miscellaneous academic travel.
7. Quality of life.  If we’re being honest, I’m not someone likely to spend much time at indie-rock festivals, or whatever quirky cultural stuff it is that’s made Austin the fastest-growing city in the US.  But here’s something I do care about: even though highway traffic in Austin is bad and getting worse, that need not affect my life too much.  Research indicates that, for roughly the price of our current 2-bedroom condo in Cambridge, we could get a lovely 4-bedroom with a yard that’s in walking distance to the UT campus, as well as to stores, restaurants, good schools, and parks.
8. Schools.  I had a pretty miserable experience growing up.  I don’t know if Lily (or any future children Dana and I have) will be anything like I was, but given that she’s in an “at-risk population” for nerdiness, I’d love to live in a place with nerd education options that don’t stink.  Austin, happily, has two renowned math/science magnet schools—Kealing Middle School and LASA High School—which admit based on test scores.  (By contrast, in most parts in the US, such programs either don’t exist or admit purely by lottery.)  Austin also has the only elementary school, anywhere, whose admissions director told me that sure, they’d let a student skip a grade if it made sense academically.
9. Steven Weinberg.  I confess: it probably affected me more than it should that arguably the greatest scientist now walking the earth, a legend of physics who could go wherever the hell he wanted, has chosen to spend the past thirty-plus years at UT Austin.  On our last visit there, Dana, my parents, and I had the honor of having dinner with Weinberg.  After we’d exchanged stories about Telluride House at Cornell, where Weinberg and I both lived as undergrads (and which apparently changed little between the 1950s and the 1990s), Weinberg sung the praises of Austin for hours.  (Admittedly, part of why Weinberg enjoys Austin so much is that there it’s easy to be on a first-name basis with the mayor, tech billionaires, and all the other leaders of the city—an advantage that might be Nobel-laureate-specific!)
10. Adventure.  By background and temperament, I’m probably one of the “least Texan” Americans imaginable: a nerdy east-coast Jewish intellectual who enjoys snow, can’t much tolerate spicy food, is bored by cowboy movies and fears physical confrontation.  Indeed, until I actually visited the place, my only real associations with Texas were tumbleweeds blowing across a parched desert hellscape, oilmen with giant belt buckles guffawing about so-called global warming, crazed football hooligans filling city-sized stadiums, shotguns, rattlesnakes, and George W. Bush.  But then, the contrast between the over-the-top image and the pleasant reality of Austin had the paradoxical effect of making moving to Texas feel like an adventure—an adventure with an acceptable risk profile.  Like, if I’m going to uproot at all, why not to a place that’s strange and different and interesting?

Even given the above, some people will ask about things they’d consider obvious dealbreakers for moving to Texas.  In particular, what about the infamous new law that essentially forces UT Austin to let students carry concealed firearms to class?  Well, I oppose that law.  Indeed, when I haven’t been angering the social-justice left, I’ve been angering the right by (for example) blogging about my strong support for gun control.  To me, it seems like a terrible idea for the Texas state legislature, which provides only 14% of the UT system’s budget, to force on UT a gun policy that its faculty and students overwhelmingly hate.  And I admired Steven Weinberg’s announcement that he intends to defy the law in his classroom, and fight it out in court if necessary.  (Weinberg also gave, as one reason to oppose the law, how much harder it will make it for UT to recruit faculty.)

But at the same time … Dana is Israeli.  For her, it’s perfectly normal to go outside and see 18-year-old girls chatting and laughing with huge-ass machine guns slung over their shoulders.  Having spent a month of each year in Tel Aviv, seeing passersby with guns has become, if not exactly normal to me, then not something I fear 2% as much as I fear crashing my car.  And indeed, if one takes a statistical approach to risk, Austin has a much lower per-capita violent crime rate than Boston does.

And yes, I know, the US and Israel have completely different gun cultures: in Israel, for example, the only people carrying around semiautomatics are trained and monitored conscripts; there’s no concept of a private “right” to such a weapon.  And yes, the principle matters.  But if one is unwilling to move to any place that has any laws one disagrees with, one should probably look into faculty positions on offshore barges or Jupiter.

Austin itself, of course, is only slightly less liberal than Portland, the blueberry in the tomato soup as Rick Perry so memorably put it.  Even so, the maps insist that Austin is in Texas, which means that while there one will probably encounter Texans.  (A friend, on hearing that Dana took a quick liking to Austin when she visited, quipped that it was probably because Austin reminded her of Israel: “hot and surrounded by hostile territory.”)

Now, the actual Texans who I’ve met so far have been frighteningly warm and hospitable.  But the question stands: what will I do if, while living there, I meet (let’s suppose) some sun-calloused cattle ranchers who consider me an arrogant, effete coastal liberal who patronizes them in blog posts like this one?  What if they tell me to scram, head back east, and never mess with Texas again?

Well, I’ve already decided what I’d like to do in this hypothetical situation.  I’d like to invite the ranchers over to my place for some barbecued beers and ice-cold steaks, or whatever it is you eat in Texas, and tell them all about quantum query algorithms, and ask them about cattle feed, and try to find common ground, just like I tried to find common ground with the other end of the political spectrum—with the folks who called me a clueless, patriarchal, entitled white male douchebro who silenced their already-marginalized voices by not agreeing with everything they said.  For I’ve increasingly come to the conviction that, while you might fail to find common ground with someone, you’ve got to try, you’ve got to steelman their argument and learn whatever you can from it.  I once, for example, thought about the Religious Right as purely contemptible, deserving only unthinking snark, and I was completely wrong.  Even when I was right on the underlying issues, I was wrong on the epistemology.  In Texas, hopefully I’ll have a chance to do better.

In summary:

1. Wherever you’re at in your career, if you’d like to do quantum information and/or theoretical computer science research on the wild frontier—if QMA, QCMA, and BQP/qpoly strike you as little more than wayward calves to be lassoed in—then please consider joining us at UT Austin.  To be concrete: we’ll be looking for distinguished senior faculty to hire under the Governor’s University Research Initiative, we’ll be looking for tenure-track junior faculty in both CS and ECE, we’ll be looking for postdocs, we’ll be looking for grad students, and finally we’ll be looking for undergrads who’d like to join the innovative Turing Scholars honors program.
2. If you’d just like to come for a week and give a seminar, we’ll have money for that too.

Totally Unrelated Update (Feb. 29): Michael Mitzenmacher has asked me to announce that nominations are open for the SIGACT Distinguished Service Prize. More information is available here.

### Marvin Minsky

Tuesday, January 26th, 2016

Yesterday brought the sad news that Marvin Minsky passed away at age 88.  I never met Minsky (I wish I had); I just had one email exchange with him back in 2002, about Stephen Wolfram’s book.  But Minsky was my academic great-grandfather (through Manuel Blum and Umesh Vazirani), and he influenced me in many other ways.  For example, in his and Papert’s 1968 book Perceptrons—notorious for “killing neural net research for a decade,” because of its mis- or over-interpreted theorems about the representational limitations of single-layer neural nets—the way Minsky and Papert proved those theorems was by translating questions about computation into questions about the existence or nonexistence of low-degree polynomials with various properties, and then answering the latter questions using MATH.  Their “polynomial method” is now a mainstay of quantum algorithms research (having been brought to the subject by Beals et al.), and in particular, has been a mainstay of my own career.  Hardly Minsky’s best-known contribution to human knowledge, but that even such a relatively minor part of his oeuvre could have legs half a century later is a testament to his impact.

I’m sure readers will have other thoughts to share about Minsky, so please do so in the comments section.  Personal reminiscences are especially welcome.

### Happy Third Birthday Lily!

Thursday, January 21st, 2016

Uri Bram posted a cute little article about whether he was justified, as a child, to tell his parents that he wouldn’t clean up his room because doing so would only increase the universe’s entropy and thereby hasten its demise. The article quotes me, Sean Carroll, and others about that important question.

On Wednesday I gave a TCS+ online seminar about “The Largest Possible Quantum Speedups.” If you’re interested, you can watch the YouTube video here.

(I promised a while ago that I’d upload some examples of Lily’s MOMA-worthy modern artworks.  So, here are two!)

A few quotable quotes:

Daddy, when you were little, you were a girl like me!

I’m feeling a bit juicy [thirsty for juice].

Saba and Safta live in Israel. They’re mommy’s friends! [Actually they’re mommy’s parents.]

Me: You’re getting bigger every day!
Lily: But I’m also getting smaller every day!

Me: Then Goldilocks tasted the third bowl, which was Baby Bear’s, and it was just right!  So she ate it all up.  Then Goldilocks went…
Lily: No, then Goldilocks ate some cherries in the kitchen before she went to the bedroom.  And blueberries.
Me: Fine, so she ate cherries and blueberries.  Then she went to the bedroom, and she saw that there were three beds…
Lily: No, four beds!
Me: Fine, four beds.  So she laid in the first bed, but she said, “this bed is too hard.”
Lily: No, it was too comfortable!
Me: Too comfortable?  Is she some kind of monk?

Me [pointing to a taxidermed black bear in a museum]: What’s that?
Lily: A bear!
Me: Is it Winnie the Pooh?
Lily: No, it’s a different kind of bear.
Me [pointing to a tan bear in the next case]: So what about that one? Is that Winnie?
Lily: Yes! That’s Winnie the Pooh!
[Looking at it more closely] No, it’s a different kind of Winnie.

Lily: Why is it dark outside?
Me: Because it’s night time.
Lily: Why is it night time?
Me: Because the sun went to the other side of the world.
Lily: It went to China!
Me: Yes! It did in fact go to China.
Lily: Why did the sun go to China?
Me: Well, more accurately, it only seemed to go there, because the world that we’re on is spinning.
Lily: Why is the world spinning?
Me: Because of the conservation of angular momentum.
Lily: Why is the … consibation of amomomo?
Me: I suppose because of Noether’s Theorem, and the fact that our laws of physics are symmetric under spatial rotations.
Lily: Why is…
Me: That’s enough for today Lily!

### Talk, be merry, and be rational

Monday, November 23rd, 2015

Yesterday I wrote a statement on behalf of a Scott Alexander SlateStarCodex/rationalist meetup, which happened last night at MIT (in the same room where I teach my graduate class), and which I’d really wanted to attend but couldn’t.  I figured I’d share the statement here:

I had been looking forward to attending tonight’s MIT SlateStarCodex meetup as I hardly ever look forward to anything. Alas, I’m now stuck in Chicago, with my flight cancelled due to snow, and with all flights for the next day booked up. But instead of continuing to be depressed about it, I’ve decided to be happy that this meetup is even happening at all—that there’s a community of people who can read, let’s say, a hypothetical debate moderator questioning Ben Carson about what it’s like to be a severed half-brain, and simply be amused, instead of silently trying to figure out who benefits from the post and which tribe the writer belongs to. (And yes, I know: the answer is the gray tribe.) And you can find this community anywhere—even in Cambridge, Massachusetts! Look, I spend a lot of time online, just getting more and more upset reading social justice debates that are full of people calling each other douchebags without even being able to state anything in the same galactic supercluster as the other side’s case. And then what gives me hope for humanity is to click over to the slatestarcodex tab, and to see all the hundreds of comments (way more than my blog gets) by people who disagree with each other but who all basically get it, who all have minds that don’t make me despair. And to realize that, when Scott Alexander calls an SSC meetup, he can fill a room just about anywhere … well, at least anywhere I would visit. So talk, be merry, and be rational.

I’m now back in town, and told by people who attended the meetup that it was crowded, disorganized, and great.  And now I’m off to Harvard, to attend the other Scott A.’s talk “How To Ruin A Perfectly Good Randomized Controlled Trial.”

Update (Nov. 24) Scott Alexander’s talk at Harvard last night was one of the finest talks I’ve ever attended. He was introduced to rapturous applause as simply “the best blogger on the Internet,” and as finally an important speaker, in a talk series that had previously wasted everyone’s time with the likes of Steven Pinker and Peter Singer. (Scott demurred that his most notable accomplishment in life was giving the talk at Harvard that he was just now giving.) The actual content, as Scott warned from the outset, was “just” a small subset of a basic statistics course, but Scott brought each point alive with numerous recent examples, from psychiatry, pharmacology, and social sciences, where bad statistics or misinterpretations of statistics were accepted by nearly everyone and used to set policy. (E.g., Alcoholics Anonymous groups that claimed an “over 95%” success rate, because the people who relapsed were kicked out partway through and not counted toward the total.) Most impressively, Scott leapt immediately into the meat, ended after 20 minutes, and then spent the next two hours just taking questions. Scott is publicity-shy, but I hope for others’ sake that video of the talk will eventually make its way online.

Then, after the talk, I had the honor of meeting two fellow Boston-area rationalist bloggers, Kate Donovan and Jesse Galef. Yes, I said “fellow”: for almost a decade, I’ve considered myself on the fringes of the “rationalist movement.” I’d hang out a lot with skeptic/effective-altruist/transhumanist/LessWrong/OvercomingBias people (who are increasingly now SlateStarCodex people), read their blogs, listen and respond to their arguments, answer their CS theory questions. But I was always vaguely uncomfortable identifying myself with any group that even seemed to define itself by how rational it was compared to everyone else (even if the rationalists constantly qualified their self-designation with “aspiring”!). Also, my rationalist friends seemed overly interested in questions like how to prevent malevolent AIs from taking over the world, which I tend to think we lack the tools to make much progress on right now (though, like with many other remote possibilities, I’m happy for some people to work on them and see if they find anything interesting).

So, what changed? Well, in the debates about social justice, public shaming, etc. that have swept across the Internet these past few years, it seems to me that my rationalist friends have proven themselves able to weigh opposing arguments, examine their own shortcomings, resist groupthink and hysteria from both sides, and attack ideas rather than people, in a way that the wider society—and most depressingly to me, the “enlightened, liberal” part of society—has often failed. In a real-world test (“real-world,” in this context, meaning social media…), the rationalists have walked the walk and rationaled the rational, and thus they’ve given me no choice but to stand up and be counted as one of them.

Have a great Thanksgiving, those of you in the US!

I’m sure there’s a lesson in here somewhere about what I should spend my time on.

### G. Phi. Fo. Fum.

Wednesday, November 4th, 2015

Update (Dec. 14): The long wait is over!  Here’s Laci’s paper on the arXiv.  So far, I’ve read it only deeply enough to note that it contains the following sentence:

A group G ≤ S(Ω) defines the category of G-isomorphisms of strings on the domain Ω; the natural notation for this category, the central object of study in this paper, would seem to be “G-Strings.”

With that, I believe Laci himself has outshone even reddit’s attempt to mine his breakthrough result for juvenile humor.

See also a nice Quanta article about Laci’s algorithm by Erica Klarreich.  There’s only one statement in the article that I disagree with: namely that, if graph isomorphism were inherently quasipolynomial time, then it would be the first natural example of such a problem.  We know other natural problems, like approximating free games and socially-optimal Nash equilibria, that are solvable in nO(log n) time but that can’t be in P unless 3SAT is solvable in ~exp(√n) time.

Update (Nov. 17): Video of Laci’s first talk is now available.

Breaking News (Nov. 12): Jeremy Kun has written up a phenomenal summary of Babai’s first lecture.  I haven’t carefully studied all of it, and in any case, there are many missing details to be filled in later (Babai told Kun that the preprint will be available “soon, soon!”).  But from the summary, four points stood out to me:

1. Babai actually claims a quasipolynomial-time algorithm for an interestingly more general problem than graph isomorphism, called string isomorphism.  This was already in the abstract, but googling didn’t reveal what string isomorphism was.  So, OK, here’s what it is: you’re given two strings x and y over some finite alphabet, as well as the generators of a group G of permutations of the string indices.  The problem is to determine whether you can transform x to y by applying a permutation in G.  Or even more generally: given a string x, find a full set of generators for the subgroup of G that fixes x.  See Kun’s post for the straightforward reductions from GI to these group-theoretic problems.
2. As was hinted in the abstract, in Babai’s analysis of his algorithm, there’s one step that relies on a statement whose only known proof depends on the Classification of Finite Simple Groups.  (Thus, it’s not the algorithm itself requires iterating through all the sporadic simple groups or anything like that; this only shows up in the correctness proof.)  This is not the first-ever computer-science application of the Classification of Finite Simple Groups (indeed, Babai himself has some previous ones), but it’s certainly the most dramatic.
3. In previous work on GI, the Johnson graph emerged over and over as a forehead-bangingly hard case that caused numerous algorithms to fail.  In the new work, it looks like Babai’s central technical innovation is to show that, in some sense, the Johnson graph is the only obstruction to taking the divide-and-conquer approaches that people that had tried before, and making them run in quasipolynomial time.  I.e., in each step of the recursion, either you can find a Johnson graph on a large fraction of the vertices and handle it specially, or else you can do something that works whenever there’s not a Johnson graph on a large fraction of the vertices.  Babai calls this “split-or-Johnson.”
4. Babai stressed that in some sense, his new algorithm is the culmination of a program laid out by Eugene Luks in 1982.  Now, the Classification of Finite Simple Groups was also more-or-less completed in the early 1980s.  To my mind, this raises a fascinating socio-mathematical question: which aspects of the new work, if any, could not have been done in the early 80s, possibly by Babai or Luks themselves?  what is it that needed another 30 years?  If the answer turns out to be “nothing,” then to me that’s an astounding illustration of the role of the individual in mathematical progress.  As in: Laci was nice enough to take a third-of-a-century break between his and Luks’ work in the early 80s, and the “natural next step” in their program … and still no one managed to use that break to beat him to the next step!

Earlier today, I was tipped off to what might be the theoretical computer science result of the decade.  My source asked me not to break the news on this blog—but since other theory bloggers (and twitterers) are now covering the story, I guess the graph is out of the Babai.

According to the University of Chicago’s theory seminar calendar, on Tuesday of next week (November 10), the legendary Laszlo Babai will be giving a talk about a new algorithm that solves the graph isomorphism problem in quasipolynomial time.  The previous fastest algorithm to decide whether two n-vertex graphs G and H are isomorphic—by Babai and Luks, back in 1983—ran in exp(√(n log n)) time.  If we credit the announcement, Babai has now gotten that down to exp(polylog(n)), putting one of the central problems of computer science “just barely above P.”  (For years, I’ve answered questions on this blog about the status of graph isomorphism—would I bet that it’s in BQP? in coNP? etc.—by saying that, as far as I and many others are concerned, it might as well just be in P.  Of course I’m happy to reaffirm that conjecture tonight.)

Next week, I assume, Laci will lecture to a packed house; then the experts will race to unpack the details.  Until then, we probably need to sit tight; I don’t know any more than what’s in the abstract.  For now, I’m delighted if commenters want to share general thoughts or questions about graph isomorphism (and I’ll try to answer what I can), but I won’t allow uninformed speculations or rumors about the details of the new result—not until Laci has had a chance to speak.

Update (Nov. 5): While we all wait with bated breath for more details, you can amuse yourself with the talk I gave at Laci’s 60th birthday conference five years ago.

Also, a comment of mine that I should probably promote to the main post:

Dana points out to me that non-native English speakers might not get the staggeringly clever pun in this post’s title (hey, it was the best I could do on a deadline).

So, alright, fee fi fo fum is what the approaching giant bellows in Jack and the Beanstalk. It means something big is on the horizon. Also, G is a graph, and Phi is an isomorphism.

Update (Nov. 12): So, Laci gave his talk. Video was made but does not appear to be available yet. However, Gabriel Gaster, who was in attendance, graciously live-tweeted everything. Here’s a Storify of the live-tweets. (What’s a “Storify”?)

### Six announcements

Monday, September 21st, 2015
1. I did a podcast interview with Julia Galef for her series “Rationally Speaking.”  See also here for the transcript (which I read rather than having to listen to myself stutter).  The interview is all about Aumann’s Theorem, and whether rational people can agree to disagree.  It covers a lot of the same ground as my recent post on the same topic, except with less technical detail about agreement theory and more … well, agreement.  At Julia’s suggestion, we’re planning to do a follow-up podcast about the particular intractability of online disagreements.  I feel confident that we’ll solve that problem once and for all.  (Update: Also check out this YouTube video, where Julia offers additional thoughts about what we discussed.)
2. When Julia asked me to recommend a book at the end of the interview, I picked probably my favorite contemporary novel: The Mind-Body Problem by Rebecca Newberger Goldstein.  Embarrassingly, I hadn’t realized that Rebecca had already been on Julia’s show twice as a guest!  Anyway, one of the thrills of my life over the last year has been to get to know Rebecca a little, as well as her husband, who’s some guy named Steve Pinker.  Like, they both live right here in Boston!  You can talk to them!  I was especially pleased two weeks ago to learn that Rebecca won the National Humanities Medal—as I told Julia, Rebecca Goldstein getting a medal at the White House is the sort of thing I imagine happening in my ideal fantasy world, making it a pleasant surprise that it happened in this one.  Huge congratulations to Rebecca!
3. The NSA has released probably its most explicit public statement so far about its plans to move to quantum-resistant cryptography.  For more see Bruce Schneier’s Crypto-Gram.  Hat tip for this item goes to reader Ole Aamot, one of the only people I’ve ever encountered whose name alphabetically precedes mine.
4. Last Tuesday, I got to hear Ayaan Hirsi Ali speak at MIT about her new book, Heretic, and then spend almost an hour talking to students who had come to argue with her.  I found her clear, articulate, and courageous (as I guess one has to be in her line of work, even with armed cops on either side of the lecture hall).  After the shameful decision of Brandeis in caving in to pressure and cancelling Hirsi Ali’s commencement speech, I thought it spoke well of MIT that they let her speak at all.  The bar shouldn’t be that low, but it is.
5. From far away on the political spectrum, I also heard Noam Chomsky talk last week (my first time hearing him live), about the current state of linguistics.  Much of the talk, it struck me, could have been given in the 1950s with essentially zero change (and I suspect Chomsky would agree), though a few parts of it were newer, such as the speculation that human languages have many of the features they do in order to minimize the amount of computation that the speaker needs to perform.  The talk was full of declarations that there had been no useful work whatsoever on various questions (e.g., about the evolutionary function of language), that they were total mysteries and would perhaps remain total mysteries forever.
6. Many of you have surely heard by now that Terry Tao solved the Erdös Discrepancy Problem, by showing that for every infinite sequence of heads and tails and every positive integer C, there’s a positive integer k such that, if you look at the subsequence formed by every kth flip, there comes a point where the heads outnumber tails or vice versa by at least C.  This resolves a problem that’s been open for more than 80 years.  For more details, see this post by Timothy Gowers.  Notably, Tao’s proof builds, in part, on a recent Polymath collaborative online effort.  It was a big deal last year when Konev and Lisitsa used a SAT-solver to prove that there’s always a subsequence with discrepancy at least 3; Tao’s result now improves on that bound by ∞.

### 6-photon BosonSampling

Wednesday, August 19th, 2015

The news is more-or-less what the title says!

In Science, a group led by Anthony Laing at Bristol has now reported BosonSampling with 6 photons, beating their own previous record of 5 photons, as well as the earlier record of 4 photons achieved a few years ago by the Walmsley group at Oxford (as well as the 3-photon experiments done by groups around the world).  I only learned the big news from a commenter on this blog, after the paper was already published (protip: if you’ve pushed forward the BosonSampling frontier, feel free to shoot me an email about it).

As several people explain in the comments, the main advance in the paper is arguably not increasing the number of photons, but rather the fact that the device is completely reconfigurable: you can try hundreds of different unitary transformations with the same chip.  In addition, the 3-photon results have an unprecedentedly high fidelity (about 95%).

The 6-photon results are, of course, consistent with quantum mechanics: the transition amplitudes are indeed given by permanents of 6×6 complex matrices.  Key sentence:

After collecting 15 sixfold coincidence events, a confidence of P = 0.998 was determined that these are drawn from a quantum (not classical) distribution.

No one said scaling BosonSampling would be easy: I’m guessing that it took weeks of data-gathering to get those 15 coincidence events.  Scaling up further will probably require improvements to the sources.

There’s also a caveat: their initial state consisted of 2 modes with 3 photons each, as opposed to what we really want, which is 6 modes with 1 photon each.  (Likewise, in the Walmsley group’s 4-photon experiments, the initial state consisted of 2 modes with 2 photons each.)  If the number of modes stayed 2 forever, then the output distributions would remain easy to sample with a classical computer no matter how many photons we had, since we’d then get permanents of matrices with only 2 distinct rows.  So “scaling up” needs to mean increasing not only the number of photons, but also the number of sources.

Nevertheless, this is an obvious step forward, and it came sooner than I expected.  Huge congratulations to the authors on their accomplishment!

But you might ask: given that 6×6 permanents are still pretty easy for a classical computer (the more so when the matrices have only 2 distinct rows), why should anyone care?  Well, the new result has major implications for what I’ve always regarded as the central goal of quantum computing research, much more important than breaking RSA or Grover search or even quantum simulation: namely, getting Gil Kalai to admit he was wrong.  Gil is on record, repeatedly, on this blog as well as his (see for example here), as saying that he doesn’t think BosonSampling will ever be possible even with 7 or 8 photons.  I don’t know whether the 6-photon result is giving him second thoughts (or sixth thoughts?) about that prediction.

### Jacob Bekenstein (1947-2015)

Tuesday, August 18th, 2015

Today I learned the sad news that Jacob Bekenstein, one of the great theoretical physicists of our time, passed away at the too-early age of 68.

Everyone knows what a big deal it was when Stephen Hawking discovered in 1975 that black holes radiate.  Bekenstein was the guy who, as a grad student in Princeton in the early 1970s, was already raving about black holes having nonzero entropy and temperature, and satisfying the Second Law of Thermodynamics—something just about everyone, including Hawking, considered nuts at the time.  It was, as I understand it, Hawking’s failed attempt to prove Bekenstein wrong that led to Hawking’s discovery of the Hawking radiation, and thence to the modern picture of black holes.

In the decades since, Bekenstein continued to prove ingenious physical inequalities, often using thought experiments involving black holes.  The most famous of these, the Bekenstein bound, says that the number of bits that can be stored in any bounded physical system is finite, and is upper-bounded by ~2.6×1043 MR, where M is the system’s mass in kilograms and R is its radius in meters.  (This bound is saturated by black holes, and only by black holes, which therefore emerge as the most compact possible storage medium—though probably not the best for retrieval!)  Bekenstein’s lectures were models of clarity and rigor: at conferences full of audacious speculations, he stood out to my non-expert eyes as someone who was simply trying to follow chains of logic from accepted physical principles, however mind-bogglingly far those chains led but no further.

I first met Bekenstein in 2003, when I was a grad student spending a semester at Hebrew University in Jerusalem.  I was struck by the kindness he showed a 21-year-old nobody, who wasn’t even a physics student, coming to bother him.  Not only did he listen patiently to my blather about applying computational complexity to physics, he said that of course physics should ultimately aim to understand everything as the output of some computer program, that he too was thinking in terms of computation when he studied black-hole entropy.  I remember pondering the fact that the greatest reductionist I’d ever met was wearing a yarmulke—and then scolding myself for wasting precious brain-cycles on such a trivial thought when there was science to discuss.  I met Bekenstein maybe four or five more times on visits to Israel, most recently a year and a half ago, when we shared walks to and from the hotel at a firewall workshop at the Weizmann Institute.  He was unfailingly warm, modest, and generous—totally devoid of the egotism that I’ve heard can occasionally afflict people of his stature.  Now, much like with the qubits hitting the event horizon, the information that comprised Jacob Bekenstein might seem to be gone, but it remains woven into the cosmos.