Archive for the ‘Nerd Interest’ Category

My biology paper in Science (really)

Friday, July 22nd, 2016

Think I’m pranking you, right?

You can see the paper right here (“Synthetic recombinase-based state machines in living cells,” by Nathaniel Roquet, Ava P. Soleimany, Alyssa C. Ferris, Scott Aaronson, and Timothy K. Lu).  [Update (Aug. 3): The previous link takes you to a paywall, but you can now access the full text of our paper here.  See also the Supplementary Material here.]  You can also read the MIT News article (“Scientists program cells to remember and respond to series of stimuli”).  In any case, my little part of the paper will be fully explained in this post.

A little over a year ago, two MIT synthetic biologists—Timothy Lu and his PhD student Nate Roquet—came to my office saying they had a problem they wanted help with.  Why me? I wondered.  Didn’t they realize I was a quantum complexity theorist, who so hated picking apart owl pellets and memorizing the names of cell parts in junior-high Life Science, that he hadn’t taken a single biology course since that time?  (Not counting computational biology, taught in a CS department by Richard Karp.)

Nevertheless, I listened to my biologist guests—which turned out to be an excellent decision.

Tim and Nate told me about a DNA system with surprisingly clear rules, which led them to a strange but elegant combinatorial problem.  In this post, first I need to spend some time telling you the rules; then I can tell you the problem, and lastly its solution.  There are no mathematical prerequisites for this post, and certainly no biology prerequisites: everything will be completely elementary, like learning a card game.  Pen and paper might be helpful, though.

As we all learn in kindergarten, DNA is a finite string over the 4-symbol alphabet {A,C,G,T}.  We’ll find it more useful, though, to think in terms of entire chunks of DNA bases, which we’ll label arbitrarily with letters like X, Y, and Z.  For example, we might have X=ACT, Y=TAG, and Z=GATTACA.

We can also invert one of these chunks, which means writing it backwards while also swapping the A’s with T’s and the G’s with C’s.  We’ll denote this operation by * (the technical name in biology is “reverse-complement”).  For example:

X*=AGT, Y*=CTA, Z*=TGTAATC.

Note that (X*)*=X.
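
In code, the * operation is a one-liner.  Here’s a tiny illustrative sketch (my own toy code, with a made-up function name, not anything from the paper):

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(chunk):
    # The * operation: write the chunk backwards while swapping A with T and C with G.
    return "".join(COMPLEMENT[base] for base in reversed(chunk))

print(reverse_complement("GATTACA"))  # TGTAATC, i.e. Z* from the example above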

We can then combine our chunks and their inverses into a longer DNA string, like so:

ZYX*Y* = GATTACA TAG AGT CTA.

From now on, we’ll work exclusively with the chunks, and forget completely about the underlying A’s, C’s, G’s, and T’s.

Now, there are also certain special chunks of DNA bases, called recognition sites, which tell the little machines that read the DNA when they should start doing something and when they should stop.  Recognition sites come in pairs, so we’ll label them using various parenthesis symbols like ( ), [ ], { }.  To convert a parenthesis into its partner, you invert it: thus ( = )*, [ = ]*, { = }*, etc.  Crucially, the parentheses in a DNA string don’t need to “face the right ways” relative to each other, and they also don’t need to nest properly.  Thus, both of the following are valid DNA strings:

X ( Y [ Z [ U ) V

X { Y ] Z { U [ V

Let’s refer to X, Y, Z, etc.—the chunks that aren’t recognition sites—as letter-chunks.  Then it will be convenient to make the following simplifying assumptions:

  1. Our DNA string consists of an alternating sequence of recognition sites and letter-chunks, beginning and ending with letter-chunks.  (If this weren’t true, then we could just glom together adjacent recognition sites and adjacent letter-chunks, and/or add new dummy chunks, until it was true.)
  2. Every letter-chunk that appears in the DNA string appears exactly once (either inverted or not), while every recognition site that appears, appears exactly twice.  Thus, if there are n distinct recognition sites, there are 2n+1 distinct letter-chunks.
  3. Our DNA string can be decomposed into its constituent chunks uniquely—i.e., it’s always possible to tell which chunk we’re dealing with, and when one chunk stops and the next one starts.  In particular, the chunks and their reverse-complements are all distinct strings.

The little machines that read the DNA string are called recombinases.  There’s one kind of recombinase for each kind of recognition site: a (-recombinase, a [-recombinase, and so on.  When, let’s say, we let a (-recombinase loose on our DNA string, it searches for (‘s and )’s and ignores everything else.  Here’s what it does:

  • If there are no (‘s or )’s in the string, or only one of them, it does nothing.
  • If there are two (‘s facing the same way—like ( ( or ) )—it deletes everything in between them, including the (‘s themselves.
  • If there are two (‘s facing opposite ways—like ( ) or ) (—it deletes the (‘s, and inverts everything in between them.

Let’s see some examples.  When we apply [-recombinase to the string

A ( B [ C [ D ) E,

we get

A ( B D ) E.

When we apply (-recombinase to the same string, we get

A D* ] C* ] B* E.

When we apply both recombinases (in either order), we get

A D* B* E.

Another example: when we apply {-recombinase to

A { B ] C { D [ E,

we get

A D [ E.

When we apply [-recombinase to the same string, we get

A { B D* } C* E.

When we apply both recombinases—ah, but here the order matters!  If we apply { first and then [, we get

A D [ E,

since the [-recombinase now encounters only a single [, and has nothing to do.  On the other hand, if we apply [ first and then {, we get

A D B* C* E.

Notice that inverting a substring can change the relative orientation of two recognition sites—e.g., it can change { { into { } or vice versa.  It can thereby change what happens (inversion or deletion) when some future recombinase is applied.

One final rule: after we’re done applying recombinases, we remove the remaining recognition sites like so much scaffolding, leaving only the letter-chunks.  Thus, the final output

A D [ E

becomes simply A D E, and so on.  Notice also that, if we happen to delete one recognition site of a given type while leaving its partner, the remaining site will necessarily just bounce around inertly before getting deleted at the end—so we might as well “put it out of its misery,” and delete it right away.
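
Since everything here really is like learning a card game, it’s also easy to simulate.  Below is a toy Python sketch of my own (not the simulator my coauthors actually used), in which a DNA string is just a list of chunk tokens like "A", "A*", "(", and "]"; all the function and variable names are mine.  It reproduces the order-dependent example from above.

SITE_PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">"}
PARTNER = {**SITE_PAIRS, **{v: k for k, v in SITE_PAIRS.items()}}

def is_site(token):
    return token in PARTNER

def invert(token):
    # Reverse-complement a single chunk: a recognition site flips orientation,
    # a letter-chunk gains or loses its trailing *.
    if is_site(token):
        return PARTNER[token]
    return token[:-1] if token.endswith("*") else token + "*"

def invert_segment(tokens):
    # Invert a whole substring: reverse the order and invert each chunk.
    return [invert(t) for t in reversed(tokens)]

def apply_recombinase(tokens, opener):
    # Apply the recombinase for one site type (e.g. opener="(").
    # Returns (new_tokens, acted); acted is False if the recombinase found
    # fewer than two of its recognition sites and therefore did nothing.
    closer = SITE_PAIRS[opener]
    hits = [i for i, t in enumerate(tokens) if t in (opener, closer)]
    if len(hits) < 2:
        return list(tokens), False
    i, j = hits[0], hits[1]
    if tokens[i] == tokens[j]:
        # Sites face the same way: delete everything between them, sites included.
        return tokens[:i] + tokens[j + 1:], True
    # Sites face opposite ways: delete the sites and invert everything between them.
    return tokens[:i] + invert_segment(tokens[i + 1:j]) + tokens[j + 1:], True

def strip_sites(tokens):
    # Final step: remove any leftover recognition sites ("scaffolding").
    return [t for t in tokens if not is_site(t)]

def run(order, initial):
    # Apply a sequence of recombinases, given by their opening symbols, e.g. "[{".
    tokens = list(initial)
    for opener in order:
        tokens, _ = apply_recombinase(tokens, opener)
    return strip_sites(tokens)

start = "A { B ] C { D [ E".split()
print(run("{[", start))  # ['A', 'D', 'E']
print(run("[{", start))  # ['A', 'D', 'B*', 'C*', 'E']  (order matters!)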

My coauthors have actually implemented all of this in a wet lab, which is what most of the Science paper is about (my part is mostly in a technical appendix).  They think of what they’re doing as building a “biological state machine,” which could have applications (for example) to programming cells for medical purposes.

But without further ado, let me tell you the math question they gave me.  For reasons that they can explain better than I can, my coauthors were interested in the information storage capacity of their biological state machine.  That is, they wanted to know the answer to the following:

Suppose we have a fixed initial DNA string, with n pairs of recognition sites and 2n+1 letter-chunks; and we also have a recombinase for each type of recognition site.  Then by choosing which recombinases to apply, as well as which order to apply them in, how many different DNA strings can we generate as output?

It’s easy to construct an example where the answer is as large as 2^n.  Thus, if we consider a starting string like

A ( B ) C [ D ] E { F } G < H > I,

we can clearly make 2^4=16 different output strings by choosing which subset of recombinases to apply and which not.  For example, applying [, {, and < (in any order) yields

A B C D* E F* G H* I.

There are also cases where the number of distinct outputs is less than 2^n.  For example,

A ( B [ C [ D ( E

can produce only 3 outputs—A B C D E, A B D E, and A E—rather than 4.
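
If you want to check these counts yourself, brute force works fine for small examples.  Here’s a sketch that tries every subset of recombinases in every order; it assumes the run, SITE_PAIRS, and PARTNER definitions from the toy simulator sketched earlier.

from itertools import permutations

def distinct_outputs(initial):
    # Try every subset of recombinases in every order, and collect the distinct
    # final strings.  Uses run, SITE_PAIRS, and PARTNER from the sketch above.
    tokens = initial.split()
    sites = sorted({t if t in SITE_PAIRS else PARTNER[t]
                    for t in tokens if t in PARTNER})
    outputs = set()
    for r in range(len(sites) + 1):
        for order in permutations(sites, r):
            outputs.add(tuple(run(order, tokens)))
    return outputs

print(len(distinct_outputs("A ( B [ C [ D ( E")))                  # 3
print(len(distinct_outputs("A ( B ) C [ D ] E { F } G < H > I")))  # 16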

What Tim and Nate wanted to know was: can the number of distinct outputs ever be greater than 2^n?

Intuitively, it seems like the answer “has to be” yes.  After all, we already saw that the order in which recombinases are applied can matter enormously.  And given n recombinases, the number of possible permutations of them is n!, not 2^n.  (Furthermore, if we remember that any subset of the recombinases can be applied in any order, the number of possibilities is even a bit greater—about e·n!.)

Despite this, when my coauthors played around with examples, they found that the number of distinct output strings never exceeded 2^n. In other words, the number of output strings behaved as if the order didn’t matter, even though it does.  The problem they gave me was either to explain this pattern or to find a counterexample.

I found that the pattern holds:

Theorem: Given an initial DNA string with n pairs of recognition sites, we can generate at most 2^n distinct output strings by choosing which recombinases to apply and in which order.

Let a recombinase sequence be an ordered list of recombinases, each occurring at most once: for example, ([{ means to apply (-recombinase, then [-recombinase, then {-recombinase.

The proof of the theorem hinges on one main definition.  Given a recombinase sequence that acts on a given DNA string, let’s call the sequence irreducible if every recombinase in the sequence actually finds two recognition sites (and hence, inverts or deletes a nonempty substring) when it’s applied.  Let’s call the sequence reducible otherwise.  For example, given

A { B ] C { D [ E,

the sequence [{ is irreducible, but {[ is reducible, since the [-recombinase does nothing.

Clearly, for every reducible sequence, there’s a shorter sequence that produces the same output string: just omit the recombinases that don’t do anything!  (On the other hand, I leave it as an exercise to show that the converse is false.  That is, even if a sequence is irreducible, there might be a shorter sequence that produces the same output string.)

Key Lemma: Given an initial DNA string, and given a subset of k recombinases, every irreducible sequence composed of all k of those recombinases produces the same output string.

Assuming the Key Lemma, let’s see why the theorem follows.  Given an initial DNA string, suppose you want to specify one of its possible output strings.  I claim you can do this using only n bits of information.  For you just need to specify which subset of the n recombinases you want to apply, in some irreducible order.  Since every irreducible sequence of those recombinases leads to the same output, you don’t need to specify an order on the subset.  Furthermore, for each possible output string S, there must be some irreducible sequence that leads to S—given a reducible sequence for S, just keep deleting irrelevant recombinases until no more are left—and therefore some subset of recombinases you could pick that uniquely determines S.  OK, but if you can specify each S uniquely using n bits, then there are at most 2^n possible S’s.

Proof of Key Lemma.  Given an initial DNA string, let’s assume for simplicity that we’re going to apply all n of the recombinases, in some irreducible order.  We claim that the final output string doesn’t depend at all on which irreducible order we pick.

If we can prove this claim, then the lemma follows, since given a proper subset of the recombinases, say of size k<n, we can simply glom together everything between one relevant recognition site and the next one, treating them as 2k+1 giant letter-chunks, and then repeat the argument.

Now to prove the claim.  Given two letter-chunks—say A and B—let’s call them soulmates if either A and B or A* and B* will necessarily end up next to each other, whenever all n recombinases are applied in some irreducible order, and whenever A or B appears at all in the output string.  Also, let’s call them anti-soulmates if either A and B* or A* and B will necessarily end up next to each other if either appears at all.

To illustrate, given the initial DNA sequence,

A [ B ( C ] D ( E,

you can check that A and C are anti-soulmates.  Why?  Because if we apply all the recombinases in an irreducible sequence, then at some point, the [-recombinase needs to get applied, and it needs to find both [ recognition sites.  And one of these recognition sites will still be next to A, and the other will still be next to C (for what could have pried them apart?  nothing).  And when that happens, no matter where C has traveled in the interim, C* must get brought next to A.  If the [-recombinase does an inversion, the transformation will look like

A [ … C ] → A C* …,

while if it does a deletion, the transformation will look like

A [ … [ C* → A C*

Note that C’s [ recognition site will be to its left, if and only if C has been flipped to C*.  In this particular example, A never moves, but if it did, we could repeat the analysis for A and its [ recognition site.  The conclusion would be the same: no matter what inversions or deletions we do first, we’ll maintain the invariant that A and C* (or A* and C) will immediately jump next to each other, as soon as the [ recombinase is applied.  And once they’re next to each other, nothing will ever separate them.

Similarly, you can check that C and D are soulmates, connected by the ( recognition sites; D and B are anti-soulmates, connected by the [ sites; and B and E are soulmates, connected by the ( sites.

More generally, let’s consider an arbitrary DNA sequence, with n pairs of recognition sites.  Then we can define a graph, called the soulmate graph, where the 2n+1 letter-chunks are the vertices, and where X and Y are connected by (say) a blue edge if they’re soulmates, and by a red edge if they’re anti-soulmates.

When we construct this graph, we find that every vertex has exactly 2 neighbors, one for each recognition site that borders it—save the first and last vertices, which border only one recognition site each and so have only one neighbor each.  But these facts immediately determine the structure of the graph.  Namely, it must consist of a simple path, starting at the first letter-chunk and ending at the last one, together with possibly a disjoint union of cycles.

But we know that the first and last letter-chunks can never move anywhere.  For that reason, a path of soulmates and anti-soulmates, starting at the first letter-chunk and ending at the last one, uniquely determines the final output string, when the n recombinases are applied in any irreducible order.  We just follow it along, switching between inverted and non-inverted letter-chunks whenever we encounter a red edge.  The cycles contain the letter-chunks that necessarily get deleted along the way to that unique output string.  This completes the proof of the lemma, and hence the theorem.
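
For concreteness, here’s how one might compute the soulmate graph directly from the case analysis above (again a sketch of mine, reusing the SITE_PAIRS table from the toy simulator earlier): a same-orientation pair of sites contributes two soulmate edges, namely the chunk to the left of one site with the chunk to the right of the other, plus the two chunks that meet inside the excised circle; an opposite-orientation pair contributes two anti-soulmate edges.

def soulmate_graph(initial):
    # Build the soulmate graph from the case analysis in the proof.  Same-orientation
    # site pairs give soulmate edges; opposite-orientation pairs give anti-soulmate
    # edges.  (Reuses SITE_PAIRS from the toy simulator above.)
    tokens = initial.split()
    edges = []
    for opener, closer in SITE_PAIRS.items():
        hits = [i for i, t in enumerate(tokens) if t in (opener, closer)]
        if len(hits) != 2:
            continue
        i, j = hits
        li, ri = tokens[i - 1], tokens[i + 1]  # letter-chunks flanking the first site
        lj, rj = tokens[j - 1], tokens[j + 1]  # letter-chunks flanking the second site
        if tokens[i] == tokens[j]:
            # Deletion case: li joins rj, while ri joins lj inside the excised circle.
            edges += [(li, rj, "soulmate"), (ri, lj, "soulmate")]
        else:
            # Inversion case: li ends up next to lj*, and ri* ends up next to rj.
            edges += [(li, lj, "anti-soulmate"), (ri, rj, "anti-soulmate")]
    return edges

print(soulmate_graph("A [ B ( C ] D ( E"))
# [('B', 'E', 'soulmate'), ('C', 'D', 'soulmate'),
#  ('A', 'C', 'anti-soulmate'), ('B', 'D', 'anti-soulmate')]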

 

There are other results in the paper, like a generalization to the case where there can be k pairs of recognition sites of each type, rather than only one. In that case, we can prove that the number of distinct output strings is at most 2^(kn), and that it can be as large as ~(2k/3e)^n. We don’t know the truth between those two bounds.

Why is this interesting?  As I said, my coauthors had their own reasons to care, involving the number of bits one can store using a certain kind of DNA state machine.  I got interested for a different reason: because this is a case where biology threw up a bunch of rules that look like a random mess—the parentheses don’t even need to nest correctly?  inversion can also change the semantics of the recognition sites?  evolution never thought about what happens if you delete one recognition site while leaving the other one?—and yet, on analysis, all the rules work in perfect harmony to produce a certain outcome.  Change a single one of them, and the “at most 2^n distinct DNA sequences” theorem would be false.  Mind you, I’m still not sure what biological purpose it serves for the rules to work in harmony this way, but they do.

But the point goes further.  While working on this problem, I’d repeatedly encounter an aspect of the mathematical model that seemed weird and inexplicable—only to have Tim and Nate explain that the aspect made sense once you brought in additional facts from biology, facts not in the model they gave me.  As an example, we saw that in the soulmate graph, the deleted substrings appear as cycles.  But surely excised DNA fragments don’t literally form loops?  Why yes, apparently, they do.  As a second example, consider the DNA string

A ( B [ C ( D [ E.

When we construct the soulmate graph for this string, we get the path

A–D–C–B–E.

Yet there’s no actual recombinase sequence that leads to A D C B E as an output string!  Thus, we see that it’s possible to have a “phantom output,” which the soulmate graph suggests should be reachable but that isn’t actually reachable.  According to my coauthors, that’s because the “phantom outputs” are reachable, once you know that in real biology (as opposed to the mathematical model), excised DNA fragments can also reintegrate back into the long DNA string.

Many of my favorite open problems about this model concern algorithms and complexity. For example: given as input an initial DNA string, does there exist an irreducible order in which the recombinases can be applied? Is the “utopian string”—the string suggested by the soulmate graph—actually reachable? If it is reachable, then what’s the shortest sequence of recombinases that reaches it? Are these problems solvable in polynomial time? Are they NP-hard? More broadly, if we consider all the subsets of recombinases that can be applied in an irreducible order, or all the irreducible orders themselves, what combinatorial conditions do they satisfy?  I don’t know—if you’d like to take a stab, feel free to share what you find in the comments!

What I do know is this: I’m fortunate that, before they publish your first biology paper, the editors at Science don’t call up your 7th-grade Life Science teacher to ask how you did in the owl pellet unit.


More in the comments:

  • Some notes on the generalization to k pairs of recognition sites of each type
  • My coauthor Nathaniel Roquet’s comments on the biology

Unrelated Announcement from My Friend Julia Wise (July 24): Do you like science and care about humanity’s positive trajectory? July 25 is the final day to apply for Effective Altruism Global 2016. From August 5-7 at UC Berkeley, a network of founders, academics, policy-makers, and more will gather to apply economic and scientific thinking to the world’s most important problems. Last year featured Elon Musk and the head of Google.org. This year will be headlined by Cass Sunstein, the co-author of Nudge. If you apply with this link, the organizers will give you a free copy of Doing Good Better by Will MacAskill. Scholarships are available for those who can’t afford the cost.  Read more here.  Apply here.

ITCS’2017: Special Guest Post by Christos Papadimitriou

Wednesday, July 6th, 2016

The wait is over.

Yes, that’s correct: the Call for Papers for the 2017 Innovations in Theoretical Computer Science (ITCS) conference, to be held in Berkeley this coming January 9-11, is finally up.  I attended ITCS’2015 in Rehovot, Israel and had a blast, and will attend ITCS’2017 if logistics permit.

But that’s not all: in a Shtetl-Optimized exclusive, the legendary Christos Papadimitriou, coauthor of the acclaimed Logicomix and ITCS’2017 program chair, has written us a guest post about what makes ITCS special and why you should come.  Enjoy!  –SA


ITCS:  A hidden treasure of TCS

by Christos Papadimitriou

Conferences, for me, are a bit like demonstrations: they were fun in the 1970s.  There was the Hershey STOC, of course, and that great FOCS in Providence, plus a memorable database theory gathering in Calabria.  Ah, children, you should have been there…

So, even though I was a loyal supporter of the ITCS idea from the beginning – the “I”, you recall, stands for innovation –, I managed to miss essentially all of them – except for those that left me no excuse.  For example, this year the program committee was unreasonably kind to my submissions, and so this January I was in Boston to attend.

I want to tell you about ITCS 2016, because it was a gas.

First, I saw all the talks.  I cannot recall this ever happening to me before.  I reconnected with fields of old, learned a ton, and got a few cool new ideas.

In fact, I believe that there was no talk with fewer than 60 people in the audience – and that’s about 70% of the attendees.  In most talks it was closer to 90%.  When was the last conference where you saw that?

And what is the secret of this enhanced audience attention?  One explanation is that smaller conference means small auditorium.  Listening to the talk no longer feels like watching a concert in a stadium, or an event on TV, it’s more like a story related by a friend.  Another gimmick that works well is that, at ITCS, session chairs start the session with a 10-minute “rant,” providing context and their own view of the papers in the session.

Our field got a fresh breath of cohesion at ITCS 2016: cryptographers listened to game theorists in the presence of folks who do data structures for a living, or circuit complexity – for a moment there, the seventies were back.

Ah, those papers, their cleverness and diversity and freshness!  Here is a sample of a few with a brief comment for each (take a look at the conference website for the papers and the presentations).

  • What is keeping quantum computers from conquering all of NP? It is the problem with destructive measurements, right?  Think again, say Aaronson, Bouland and Fitzsimons.  In their paper (pdf, slides) they consider several deviations from current restrictions, including non-destructive measurements, and the space ‘just above’ BQP turns out to be a fascinating and complex place.
  • Roei Tell (pdf, slides) asks another unexpected question: when is an object far from being far from having a property? On the way to an answer he discovers a rich and productive duality theory of property testing, as well as a very precise and sophisticated framework in which to explore it.
  • If you want to represent the permanent of a matrix as the determinant of another matrix of linear forms in the entries, how large must this second matrix be? – an old question by Les Valiant. The innovation by Landsberg and Ressayre (pdf, slides) is that they make fantastic progress in this important problem through geometric complexity: If certain natural symmetries are to be satisfied, the answer is exponential!

(A parenthesis:  The last two papers make the following important point clear: Innovation in ITCS is not meant to be the antithesis of mathematical sophistication.  Deep math and methodological innovation are key ingredients of the ITCS culture.)

  • When shall we find an explicit function requiring more than 3n gates? In their brave exploration of new territory for circuit complexity, Golovnev and Kulikov (pdf, slides) find one possible answer: “as soon as we have explicit dispersers for quadratic varieties.”
  • The student paper award went to Aviad Rubinstein for his work (pdf) on auctioning multiple items – the hardest nut in algorithmic mechanism design. He gives a PTAS for optimizing over a large – and widely used – class of “partitioning” heuristics.

Even though there were no lively discussions at the lobby during the sessions – too many folks attending, see? – the interaction was intense and enjoyable during the extra long breaks and the social events.

Plus we had the Graduating Bits night, when the youngest among us get 5 minutes each to tell us about themselves and their work.  I would have traveled to Cambridge just for that!

All said, ITCS 2016 was a gem of a meeting.  If you skipped it, you really missed a good one.

But there is no reason to miss ITCS 2017, let me tell you a few things about it:

  • It will be in Berkeley, January 9-11, 2017, the week before the Barcelona SODA.
  • It will take place at the Simons Institute just a few days before the boot camps on Pseudorandomness and Learning.
  • I volunteered to be program chair, and the steering committee has decided to try a few innovations in the submission process:
  • Submission deadline is mid-September, so you have a few more weeks to collect your most innovative thoughts. Notification before the STOC deadline.
  • Authors will post a copy of their paper, and will submit to the committee a statement about it, say 1000 words max. Think of it as your chance to write a favorable referee report for your own paper!  Telling the committee why you think it is interesting and innovative.  If you feel this is self-evident, just tell us that.
  • The committee members will be the judges of the overall worth and innovative nature of the paper. Sub-reviewers are optional, and their opinion is not communicated to the rest of the committee.
  • The committee may invite speakers to present specific recent interesting work. Submitted papers especially liked by the committee may be elevated to “invited.”
  • Plus Graduating Bits, chair rants, social program, not to mention the Simons Institute auditorium and Berkeley in January.

You should come!

Leonard Susskind’s Open Letter on “The Lunatic”

Wednesday, June 22nd, 2016

In my own anti-Trump post two weeks ago, I started out by mentioning that Terry Tao and Stephen Hawking had recently denounced Trump, and jokingly wondered when we’d hear from Ed Witten.  Well, will Leonard Susskind of Stanford University—a creator of string theory, and one of the most legendarily original physicists and physics expositors of our time—do instead?

Over the last decade, it’s been a privilege for me to get to know Lenny, to learn from him, and recently, to collaborate with him on quantum circuit complexity and AdS/CFT.  Today, Lenny wrote to ask whether I’d share his open letter about the US election on this blog.  Of course I said yes.  Better yet, Lenny has agreed to my request to be available here to answer questions and comments.  Lenny’s views, even when close to mine (as they certainly are in this case), are still his, and I’d never want to speak on his behalf.  Better that you should hear it straight from the horse’s mouth—as you now will, without further ado.  –Scott A.


Letter to My Friends, by Leonard Susskind

I’m watching this thing that’s happening with disbelief, dismay, and disgust. There is a lunatic loose—I’m sure we all agree about that—but I keep hearing people say that they can’t vote for Hillary. I heard it at my daughter’s birthday party Sunday. Boy oh boy, will these people be sorry if the lunatic gets his way. Personally I do not find it an excuse that “I live in California, which will go Democrat whatever I do.”

I strongly believe in all things Bernie, but Hillary is not the Anti-Bernie. There is much less difference between Clinton and Sanders than the distortions of the nominating process might lead people to think. She’s for health care, he’s for health care; he’s for increased minimum wage, she’s for increased minimum wage; she’s for immigrant rights, he’s for immigrant rights; and on and on it goes.

The lunatic may be just that—a lunatic—but he is also a master of smear and innuendo.  He is a gigantic liar, and he knows that if you keep saying something over and over, it sticks in people’s minds. It’s called the Big Lie, and it works. Say it enough and it sows confusion and distrust, not only among the know-nothings, but even among those who know better.

The lunatic and his supporters are exceedingly dangerous. Tell your friends: don’t be fooled. The only thing between us and the lunatic is Hillary. Get off your ass and vote in Nov.

Leonard Susskind

Director, Stanford Institute for Theoretical Physics,

Stanford University

 

Three announcements

Monday, May 9th, 2016

(-3) Bonus Announcement of May 30: As a joint effort by Yuri Matiyasevich, Stefan O’Rear, and myself, and using the Not-Quite-Laconic language that Stefan adapted from Adam Yedidia’s Laconic, we now have a 744-state TM that halts iff there’s a counterexample to the Riemann Hypothesis.

(-2) Today’s Bonus Announcement: Stefan O’Rear says that his Turing machine to search for contradictions in ZFC is now down to 1919 states.  If verified, this is an important milestone: our upper bound on the number of Busy Beaver values that are knowable in standard mathematics is now less than the number of years since the birth of Christ (indeed, even since the generally-accepted dates for the writing of the Gospels).

Stefan also says that his Not-Quite-Laconic system has yielded a 1008-state Turing machine to search for counterexamples to the Riemann Hypothesis, improving on our 5372 states.

(-1) Another Bonus Announcement: Great news, everyone!  Using a modified version of Adam Yedidia’s Laconic language (which he calls NQL, for Not Quite Laconic), Stefan O’Rear has now constructed a 5349-state Turing machine that directly searches for contradictions in ZFC (or rather in Metamath, which is known to be equivalent to ZFC), and whose behavior is therefore unprovable in ZFC, assuming ZFC is consistent.  This, of course, improves on my and Adam’s state count by 2569 states—but it also fixes the technical issue with needing to assume a large cardinal axiom (SRP) in order to prove that the TM runs forever.  Stefan promises further state reductions in the near future.

In other news, Adam has now verified the 43-state Turing machine by Jared S that halts iff there’s a counterexample to Goldbach’s Conjecture.  The 27-state machine by code golf addict is still being verified.

(0) Bonus Announcement: I’ve had half a dozen “Ask Me Anything” sessions on this blog, but today I’m trying something different: a Q&A session on Quora.  The way it works is that you vote for your favorite questions; then on Tuesday, I’ll start with the top-voted questions and keep going down the list until I get tired.  Fire away!  (And thanks to Shreyes Seshasai at Quora for suggesting this.)

(1) When you announce a new result, the worst that can happen is that the result turns out to be wrong, trivial, or already known.  The best that can happen is that the result quickly becomes obsolete, as other people race to improve it.  With my and Adam Yedidia’s work on small Turing machines that elude set theory, we seem to be heading for that best case.  Stefan O’Rear wrote a not-quite-Laconic program that just searches directly for contradictions in a system equivalent to ZFC.  If we could get his program to compile, it would likely yield a Turing machine with somewhere around 6,000-7,000 states whose behavior was independent of ZFC, and would also fix the technical problem with my and Adam’s machine Z, where one needed to assume a large-cardinal axiom called SRP to prove that Z runs forever.  While it would require a redesign from the ground up, a 1,000-state machine whose behavior eludes ZFC also seems potentially within reach using Stefan’s ideas.  Meanwhile, our 4,888-state machine for Goldbach’s conjecture seems to have been completely blown out of the water: first, a commenter named Jared S says he’s directly built a 73-state machine for Goldbach (now down to 43 states); second, a commenter named “code golf addict” claims to have improved on that with a mere 31 states (now down to 27 states).  These machines are now publicly posted, but still await detailed verification.

(2) My good friend Jonah Sinick cofounded Signal Data Science, a data-science summer school that will be running for the second time this summer.  They operate on an extremely interesting model, which I’m guessing might spread more widely: tuition is free, but you pay 10% of your first year’s salary after finding a job in the tech sector.  He asked me to advertise them, so—here!

(3) I was sad to read the news that Uber and Lyft will be suspending all service in Austin, because the city passed an ordinance requiring their drivers to get fingerprint background checks, and imposing other regulations that Uber and Lyft argue are incompatible with their model of part-time drivers.  The companies, of course, are also trying to send a clear message to other cities about what will happen if they don’t get the regulatory environment they want.  To me, the truth of the matter is that Uber/Lyft are like the web, Google, or smartphones: clear, once-per-decade quality-of-life advances that you try once, and then no longer understand how you survived without.  So if Austin wants to maintain a reputation as a serious, modern city, it has no choice but to figure out some way to bring these companies back to the negotiating table.  On the other hand, I’d also say to Uber and Lyft that, even if they needed to raise fares to taxi levels to comply with the new regulations, I expect they’d still do a brisk business!

For me, the “value proposition” of Uber has almost nothing to do with the lower fares, even though they’re lower.  For me, it’s simply about being able to get from one place to another without needing to drive and park, and also without needing desperately to explain where you are, over and over, to a taxi dispatcher who sounds angry that you called and who doesn’t understand you because of a combination of language barriers and poor cellphone reception and your own inability to articulate your location.  And then wondering when and if your taxi will ever show up, because the dispatcher couldn’t promise a specific time, or hung up on you before you could ask them.  And then embarking on a second struggle, to explain to the driver where you’re going, or at least convince them to follow the Google Maps directions.  And then dealing with the fact that the driver has no change, you only have twenties and fifties, and their little machine that prints receipts is out of paper so you can’t submit your trip for reimbursement either.

So yes, I really hope Uber, Lyft, and the city of Austin manage to sort this out before Dana and I move there!  On the other hand, I should say that there’s another part of the new ordinance—namely, requiring Uber and Lyft cars to be labeled—that strikes me as an unalloyed good.  For if there’s one way in which Uber is less convenient than taxis, it’s that you can never figure out which car is your Uber, among all the cars stopping or slowing down near you that look vaguely like the one in the app.

The 8000th Busy Beaver number eludes ZF set theory: new paper by Adam Yedidia and me

Tuesday, May 3rd, 2016

I’ve supervised a lot of great student projects in my nine years at MIT, but my inner nerdy teenager has never been as personally delighted by a project as it is right now.  Today, I’m proud to announce that Adam Yedidia, a PhD student at MIT (but an MEng student when he did most of this work), has explicitly constructed a one-tape, two-symbol Turing machine with 7,918 states, whose behavior (when run on a blank tape) can never be proven from the usual axioms of set theory, under reasonable consistency hypotheses.  Adam has also constructed a 4,888-state Turing machine that halts iff there’s a counterexample to Goldbach’s Conjecture, and a 5,372-state machine that halts iff there’s a counterexample to the Riemann Hypothesis.  In all three cases, this is the first time we’ve had a reasonable explicit upper bound on how many states you need in a Turing machine before you can see the behavior in question.

Here’s our research paper, on which Adam generously included me as a coauthor, even though he did the heavy lifting.  Also, here’s a github repository where you can download all the code Adam used to generate these Turing machines, and even use it to build your own small Turing machines that encode interesting mathematical statements.  Finally, here’s a YouTube video where Adam walks you through how to use his tools.

A more precise statement of our main result is this: we give a 7,918-state Turing machine, called Z (and actually explicitly listed in our paper!), such that:

  1. Z runs forever, assuming the consistency of a large-cardinal theory called SRP (Stationary Ramsey Property), but
  2. Z can’t be proved to run forever in ZFC (Zermelo-Fraenkel set theory with the Axiom of Choice, the usual foundation for mathematics), assuming that ZFC is consistent.

A bit of background: it follows, as an immediate consequence of Gödel’s Incompleteness Theorem, that there’s some computer program, of some length, that eludes the power of ordinary mathematics to prove what it does, when it’s run with an unlimited amount of memory.  So for example, such a program could simply enumerate all the possible consequences of the ZFC axioms, one after another, and halt if it ever found a contradiction (e.g., a proof of 1+1=3).  Assuming ZFC is consistent, this program must run forever.  But again assuming ZFC is consistent, ZFC can’t prove that the program runs forever, since if it did, then it would prove its own consistency, thereby violating the Second Incompleteness Theorem!

Alas, this argument still leaves us in the dark about where, in space of computer programs, the “Gödelian gremlin” rears its undecidable head.  A program that searches for an inconsistency in ZFC is a fairly complicated animal: it needs to encode not only the ZFC axiom schema, but also the language and inference rules of first-order logic.  Such a program might be thousands of lines long if written in a standard programming language like C, or millions of instructions if compiled down to a bare-bones machine code.  You’d certainly never run across such a program by chance—not even if you had a computer the size of the observable universe, trying one random program after another for billions of years in a “primordial soup”!

So the question stands—a question that strikes me as obviously important, even though as far as I know, only one or two people ever asked the question before us; see here for example.  Namely: do the axioms of set theory suffice to analyze the behavior of every computer program that’s at most, let’s say, 50 machine instructions long?  Or are there super-short programs that already exhibit “Gödelian behavior”?

Theoretical computer scientists might object that this is “merely a question of constants.”  Well yes, OK, but the origin of life in our universe—a not entirely unrelated puzzle—is also “merely a question of constants”!  In more detail, we know that it’s possible with our laws of physics to build a self-replicating machine: say, DNA or RNA and their associated paraphernalia.  We also know that tiny molecules like H2O and CO2 are not self-replicating.  But we don’t know how small the smallest self-replicating molecule can be—and that’s an issue that influences whether we should expect to find ourselves alone in the universe or find it teeming with life.

Some people might also object that what we’re asking about has already been studied, in the half-century quest to design the smallest universal Turing machine (the subject of Stephen Wolfram’s $25,000 prize in 2007, to which I responded with my own $25.00 prize).  But I see that as fundamentally different, for the following reason.  A universal Turing machine—that is, a machine that simulates any other machine that’s described to it on its input tape—has the privilege of offloading almost all of its complexity onto the description format for the input machine.  So indeed, that’s exactly what all known tiny universal machines do!  But a program that checks (say) Goldbach’s Conjecture, or the Riemann Hypothesis, or the consistency of set theory, on an initially blank tape, has no such liberty.  For such machines, the number of states really does seem like an intrinsic measure of complexity, because the complexity can’t be shoehorned anywhere else.

One can also phrase what we’re asking in terms of the infamous Busy Beaver function.  Recall that BB(n), or the nth Busy Beaver number, is defined to be the maximum number of steps that any n-state Turing machine takes when run on an initially blank tape, assuming that the machine eventually halts. The Busy Beaver function was the centerpiece of my 1998 essay Who Can Name the Bigger Number?, which might still attract more readers than anything else I’ve written since. As I stressed there, if you’re in a biggest-number-naming contest, and you write “BB(10000),” you’ll destroy any opponent—however otherwise mathematically literate they are—who’s innocent of computability theory.  For BB(n) grows faster than any computable sequence of integers: indeed, if it didn’t, then one could use that fact to solve the halting problem, contradicting Turing’s theorem.

But the BB function has a second amazing property: namely, it’s a perfectly well-defined integer function, and yet once you fix the axioms of mathematics, only finitely many values of the function can ever be proved, even in principle.  To see why, consider again a Turing machine M that halts if and only if there’s a contradiction in ZF set theory.  Clearly such a machine could be built, with some finite number of states k.  But then ZF set theory can’t possibly determine the value of BB(k) (or BB(k+1), BB(k+2), etc.), unless ZF is inconsistent!  For to do so, ZF would need to prove that M ran forever, and therefore prove its own consistency, and therefore be inconsistent by Gödel’s Theorem.

OK, but we can now ask a quantitative question: how many values of the BB function is it possible for us to know?  Where exactly is the precipice at which this function “departs the realm of mortals and enters the realm of God”: is it closer to n=10 or to n=10,000,000?  In practice, four values of BB have been determined so far:

  • BB(1)=1
  • BB(2)=6
  • BB(3)=21 (Lin and Rado 1965)
  • BB(4)=107 (Brady 1975)
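
Just for intuition about the definition (and certainly not about how the values above were actually proved), here’s a toy sketch of mine that enumerates every n-state, 2-symbol machine and reports the longest halting run it sees within a fixed step cap.  Because of the cap, it only certifies a lower bound on BB(n) in general; deciding which machines truly run forever is exactly the part that becomes hard, and eventually impossible, as n grows.  Still, for n=1 and n=2 it reproduces the table.

from itertools import product

def run_tm(transitions, max_steps):
    # Run a one-tape, two-symbol Turing machine on an initially blank tape.
    # transitions maps (state, symbol) to (write, move, next_state), where
    # move is -1 or +1 and next_state "H" means halt.  Returns the number of
    # steps taken, or None if the machine hasn't halted within max_steps.
    tape, head, state, steps = {}, 0, 0, 0
    while state != "H" and steps < max_steps:
        write, move, state = transitions[(state, tape.get(head, 0))]
        tape[head] = write
        head += move
        steps += 1
    return steps if state == "H" else None

def busy_beaver_lower_bound(n, max_steps=100):
    # Enumerate every n-state, 2-symbol machine (started in state 0) and return
    # the longest halting run seen.  Because of the step cap, this is only a
    # lower bound on BB(n) in general; for n=1 and n=2 the cap happens to suffice.
    states = list(range(n))
    options = [(w, m, s) for w in (0, 1) for m in (-1, +1) for s in states + ["H"]]
    keys = [(q, b) for q in states for b in (0, 1)]
    best = 0
    for choice in product(options, repeat=len(keys)):
        steps = run_tm(dict(zip(keys, choice)), max_steps)
        if steps is not None:
            best = max(best, steps)
    return best

print(busy_beaver_lower_bound(1))  # 1
print(busy_beaver_lower_bound(2))  # 6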

We also know some lower bounds (for example, BB(5) ≥ 47,176,870); see Heiner Marxen’s page or the Googology Wiki (which somehow I only learned about today) for more information.

Some Busy Beaver enthusiasts have opined that even BB(6) will never be known exactly.  On the other hand, the abstract argument from before tells us only that, if we confine ourselves to (say) ZF set theory, then there’s some k—possibly in the tens of millions or higher—such that the values of BB(k), BB(k+1), BB(k+2), and so on can never be proven.  So again: is the number of knowable values of the BB function more like 10, or more like a million?

This is the question that Adam and I (but mostly Adam) have finally addressed.

It’s hopeless to design a Turing machine by hand for all but the simplest tasks, so as a first step, Adam created a new programming language, called Laconic, specifically for writing programs that compile down to small Turing machines.  Laconic programs actually compile to an intermediary language called TMD (Turing Machine Descriptor), and from there to Turing machines.

Even then, we estimate that a direct attempt to write a Laconic program that searched for a contradiction in ZFC would lead to a Turing machine with millions of states.  There were three ideas needed to get the state count down to something reasonable.

The first was to take advantage of the work of Harvey Friedman, who’s one of the one or two people I mentioned earlier who’s written about these problems before.  In particular, Friedman has been laboring since the 1960s to find “natural” arithmetical statements that are provably independent of ZFC or other strong set theories.  (See this AMS Notices piece by Martin Davis for a discussion of Friedman’s progress as of 2006.)  Not only does Friedman’s quest continue, but some of his most important progress has come only within the last year.  His statements—typically involving objects called “order-invariant graphs”—strike me as alien, and as far removed from anything I’d personally have independent reasons to think about (but is that just a sign of my limited perspective?).  Be that as it may, Friedman’s statements still seem a lot easier to encode as short computer programs than the full apparatus of first-order logic and set theory!  So that’s what we started with; our work wouldn’t have been possible without Friedman (who we consulted by email throughout the project).

The second idea was something we called “on-tape processing.”  Basically, instead of compiling directly from Laconic down to Turing machine, Adam wrote an interpreter in Turing machine (which took about 4000 states—a single, fixed cost), and then had the final Turing machine first write a higher-level program onto its tape and then interpret that program.  Instead of the compilation process producing a huge multiplicative overhead in the number of Turing machine states (and a repetitive machine), this approach gives us only an additive overhead.  We found that this one idea decreased the number of states by roughly an order of magnitude.

The third idea was first suggested in 2002 by Ben-Amram and Petersen (and refined for us by Luke Schaeffer); we call it “introspective encoding.”  When we write the program to be interpreted onto the Turing machine tape, the naïve approach would use one Turing machine state per bit.  But that’s clearly wasteful, since in an n-state Turing machine, every state contains ~log(n) bits of information (because of the other states it needs to point to).  A better approach tries to exploit as many of those bits as it can; doing that gave us up to a factor-of-5 additional savings in the number of states.

For Goldbach’s Conjecture and the Riemann Hypothesis, we paid the same 4000-state overhead for the interpreter, but then the program to be interpreted was simpler, giving a smaller overall machine.  Incidentally, it’s not intuitively obvious that the Riemann Hypothesis is equivalent to the statement that some particular computer program runs forever, but it is—that follows, for example, from work by Lagarias and by Davis, Matijasevich, and Robinson (we used the latter; an earlier version of this post incorrectly stated that we used the Lagarias result).

To preempt the inevitable question in the comments section: yes, we did run these Turing machines for a while, and no, none of them had halted after a day or so.  But before you interpret that as evidence in favor of Goldbach, Riemann, and the consistency of ZFC, you should probably know that a Turing machine to test whether all perfect squares are less than 5, produced using Laconic, needed to run for more than an hour before it found the first counterexample (namely, 3^2=9) and halted.  Laconic Turing machines are optimized only for the number of states, not for speed, to put it mildly.

Of course, three orders of magnitude still remain between the largest value of n (namely, 4) for which BB(n) is known to be knowable in ZFC-based mathematics, and the smallest value of n (namely, 7,918) for which BB(n) is known to be unknowable.  I’m optimistic that further improvements are possible to the machine Z—whether that means simplifications to Friedman’s statement, a redesigned interpreter (possibly using lambda calculus?), or a “multi-stage rocket model” where a bare-bones interpreter would be used to unpack a second, richer interpreter which would be used to unpack a third, etc., until you got to the actual program you cared about.  But I’d be shocked if anyone in my lifetime determined the value of BB(10), for example, or proved the value independent of set theory.  Even after the Singularity happens, I imagine that our robot overlords would find the determination of BB(10) quite a challenge.

In an early Shtetl-Optimized post, I described theoretical computer science as “quantitative epistemology.”  Constructing small Turing machines whose behavior eludes set theory is not conventional theoretical computer science by any stretch of the imagination: it’s closer in practice to programming languages or computer architecture, or even the recreational practice known as code-golfing.  On the other hand, I’ve never been involved with any other project that was so clearly, explicitly about pinning down the quantitative boundary between the knowable and the unknowable.

Comments on our paper are welcome.

Addendum: Some people might wonder “why Turing machines,” as opposed to a more reasonable programming language like C or Python.  Well, first of all, we needed a language that could address an unlimited amount of memory.  Also, the BB function is traditionally defined in terms of Turing machines.  But the most important issue is that we wanted there to be no suspicion whatsoever that our choice of programming language was artificially helping to make our machine small.  And hopefully everyone can agree that one-tape, two-symbol Turing machines aren’t designed for anyone’s convenience!

“Largely just men doing sums”: My review of the excellent Ramanujan film

Sunday, May 1st, 2016

[Warning: This movie review contains spoilers, as well as a continued fraction expansion.]

These days, it takes an extraordinary occasion for me and Dana to arrange the complicated, rocket-launch-like babysitting logistics involved in going out for a night at the movies.  One such occasion was an opening-weekend screening of The Man Who Knew Infinity, the new movie about Srinivasa Ramanujan and his relationship with G. H. Hardy, followed by a Q&A with Matthew Brown (who wrote and directed the film), Robert Kanigel (who wrote the biography on which the film was based), and Fields Medalist Manjul Bhargava (who consulted on the film).

I read Kanigel’s The Man Who Knew Infinity in the early nineties; it was a major influence on my life.  There were equations in that book to stop a nerdy 13-year-old’s pulse, like

$$1+9\left(\frac{1}{4}\right)^{4}+17\left(\frac{1\cdot5}{4\cdot8}\right)^{4}+25\left(\frac{1\cdot5\cdot9}{4\cdot8\cdot12}\right)^{4}+\cdots=\frac{2^{3/2}}{\pi^{1/2}\Gamma\left(3/4\right)^{2}}$$

$$\frac{1}{1+\frac{e^{-2\pi}}{1+\frac{e^{-4\pi}}{1+\frac{e^{-6\pi}}{1+\cdots}}}}=\left(\sqrt{\frac{5+\sqrt{5}}{2}}-\frac{\sqrt{5}+1}{2}\right)\sqrt[5]{e^{2\pi}}$$

A thousand pages of exposition about Ramanujan’s mysterious self-taught mathematical style, the effect his work had on Hardy and Littlewood, his impact on the later development of analysis, etc., could never replace the experience of just staring at these things!  Popularizers are constantly trying to “explain” mathematical beauty by comparing it to art, music, or poetry, but I can best understand art, music, and poetry if I assume other people experience them like the above identities.  Across all the years and cultures and continents, can’t you feel Ramanujan himself leaping off your screen, still trying to make you see this bizarre aspect of the architecture of reality that the goddess Namagiri showed him in a dream?
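
(If you’d like a numerical sanity check to go with the aesthetic experience, the second identity above is easy to test; here’s a quick sketch of mine, truncating the continued fraction after fifty levels.)

import math

def ramanujan_cf(depth=50):
    # Evaluate 1/(1 + e^(-2 pi)/(1 + e^(-4 pi)/(1 + ...))) by truncating the
    # continued fraction after `depth` levels and folding from the inside out.
    value = 0.0
    for k in range(depth, 0, -1):
        value = math.exp(-2 * math.pi * k) / (1 + value)
    return 1 / (1 + value)

lhs = ramanujan_cf()
rhs = (math.sqrt((5 + math.sqrt(5)) / 2) - (math.sqrt(5) + 1) / 2) * math.exp(2 * math.pi / 5)
print(lhs, rhs)  # the two agree to machine precision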

Reading Kanigel’s book, I was also entranced by the culture of early-twentieth-century Cambridge mathematics: the Tripos, Wranglers, High Table.  I asked, why was I here and not there?  And even though I was (and remain) at most 1729^(-1729) of a Ramanujan, I could strongly identify with his story, because I knew that I, too, was about to embark on the journey from total scientific nobody to someone who the experts might at least take seriously enough to try to prove him wrong.

Anyway, a couple years after reading Kanigel’s biography, I went to the wonderful Canada/USA MathCamp, and there met Richard K. Guy, who’d actually known Hardy.  I couldn’t have been more impressed had Guy visited Platonic heaven and met π and e there.  To put it mildly, no one in my high school had known G. H. Hardy.

I often fantasized—this was the nineties—about writing the screenplay myself for a Ramanujan movie, so that millions of moviegoers could experience the story as I did.  Incidentally, I also fantasized about writing screenplays for Alan Turing and John Nash movies.  I do have a few mathematical biopic ideas that haven’t yet been taken, and for which any potential buyers should get in touch with me:

  • Radical: The Story of Évariste Galois
  • Give Me a Place to Stand: Archimedes’ Final Days
  • Mathématicienne: Sophie Germain In Her Prime
  • The Prime Power of Ludwig Sylow
    (OK, this last one would be more of a limited-market release)

But enough digressions; how was the Ramanujan movie?

Just as Ramanujan himself wasn’t an infallible oracle (many of his claims, e.g. his formula for the prime counting function, turned out to be wrong), so The Man Who Knew Infinity isn’t a perfect movie.  Even so, there’s no question that this is one of the best and truest movies ever made about mathematics and mathematicians, if not the best and truest.  If you’re the kind of person who reads this blog, go see it now.  Don’t wait!  As they stressed at the Q&A, the number of tickets sold in the first couple weeks is what determines whether or not the movie will see a wider release.

More than A Beautiful Mind or Good Will Hunting or The Imitation Game, or the play Proof, or the TV series NUMB3RS, the Ramanujan movie seems to me to respect math as a thing-in-itself, rather than just a tool or symbol for something else that interests the director much more.  The background to the opening credits—and what better choice could there be?—is just page after page from Ramanujan’s notebooks.  Later in the film, there’s a correct explanation of what the partition function P(n) is, and of one of Ramanujan’s and Hardy’s central achievements, which was to give an asymptotic formula for P(n), namely $$ P(n) \approx \frac{e^{\pi \sqrt{2n/3}}}{4\sqrt{3}n}, $$ and to prove the formula’s correctness.
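
If you’re curious how good that approximation is, it’s easy to check.  Here’s a quick sketch of mine (not anything from the film): compute P(n) exactly by the standard dynamic program and compare it against the Hardy-Ramanujan formula.

import math

def partition_count(n):
    # Exact P(n) by the standard dynamic program: build up the number of ways
    # to write each total as a sum of parts no larger than `part`.
    p = [1] + [0] * n
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            p[total] += p[total - part]
    return p[n]

def hardy_ramanujan(n):
    # The leading-order Hardy-Ramanujan asymptotic quoted above.
    return math.exp(math.pi * math.sqrt(2 * n / 3)) / (4 * n * math.sqrt(3))

for n in (50, 100, 200):
    exact, approx = partition_count(n), hardy_ramanujan(n)
    print(n, exact, round(approx), round(approx / exact, 3))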

The film also makes crystal-clear that pure mathematicians do what they do not because of applications to physics or anything else, but simply because they feel compelled to: for the devout Ramanujan, math was literally about writing down “the thoughts of God,” while for the atheist Hardy, math was a religion-substitute.  Notably, the movie explores the tension between Ramanujan’s untrained intuition and Hardy’s demands for rigor in a way that does them both justice, resisting the Hollywood urge to make intuition 100% victorious and rigor just a stodgy punching bag to be defeated.

For my taste, the movie could’ve gone even further in the direction of “letting the math speak”: for example, it could’ve explained just one of Ramanujan’s infinite series.  Audiences might even have liked some more T&A (theorems and asymptotic bounds).  During the Q&A that I attended, I was impressed to see moviegoers repeatedly pressing a somewhat-coy Manjul Bhargava to explain Ramanujan’s actual mathematics (e.g., what exactly were the discoveries in his first letter to Hardy?  what was in Ramanujan’s Lost Notebook that turned out to be so important?).  Then again, this was Cambridge, MA, so the possibility should at least be entertained that what I witnessed was unrepresentative of American ticket-buyers.

From what I’ve read, the movie is also true to South Indian dress, music, religion, and culture.  Yes, the Indian characters speak to each other in English rather than Tamil, but Brown explained that as a necessary compromise (not only for the audience’s sake, but also because Dev Patel and the other Indian actors didn’t speak Tamil).

Some reviews have mentioned issues with casting and characterization.  For example, Hardy is portrayed by Jeremy Irons, who’s superb but also decades older than Hardy was at the time he knew Ramanujan.  Meanwhile Ramanujan’s wife, Janaki, is played by a fully-grown Devika Bhise; the real Janaki was nine (!) when she married Ramanujan, and fourteen when Ramanujan left for England.  J. E. Littlewood is played as almost a comic-relief buffoon, so much so that it feels incongruous when, near the end of the film, Irons-as-Hardy utters the following real-life line:

I still say to myself when I am depressed and find myself forced to listen to pompous and tiresome people, “Well, I have done one thing you could never have done, and that is to have collaborated with Littlewood and Ramanujan on something like equal terms.”

Finally, a young, mustachioed Bertrand Russell is a recurring character.  Russell and Hardy really were friends and fellow WWI pacifists, but Hardy seeking out Bertie’s advice about each Ramanujan-related development seems almost certainly to be just an irresistible plot device.

But none of that matters.  What bothered me more were the dramatizations of the prejudice Ramanujan endured in England.  Ramanujan is shown getting knocked to the ground, punched, and kicked by British soldiers barking anti-Indian slurs at him; he then shows up for his next meeting with Hardy covered in bruises, which Hardy (being aloof) neglects to ask about.  Ramanujan is also depicted getting shoved, screamed at, and told never to return by a math professor who he humiliates during a lecture.  I understand why Brown made these cinematic choices: there’s no question that Ramanujan experienced prejudice and snobbery in Cambridge, and that he often felt lonely and unwelcome there.  And it’s surely easier to show Ramanujan literally getting beaten up by racist bigots, than to depict his alienation from Cambridge society as the subtler matter that it most likely was.  To me, though, that’s precisely why the latter choice would’ve been even more impressive, had the film managed to pull it off.

Similarly, during World War I, the film shows not only Trinity College converted into a military hospital, and many promising students marched off to their deaths (all true), but also a shell exploding on campus near Ramanujan, after which Ramanujan gazes in horror at the bleeding dead bodies.  Like, isn’t the truth here dramatic enough?

One other thing: the movie leaves you with the impression that Ramanujan died of tuberculosis.  More recent analysis concluded that it was probably hepatic amoebiasis that he brought with him from India—something that could’ve been cured with the medicine of the time, had anyone correctly diagnosed it.  (Incidentally, the film completely omits Ramanujan’s final year, back in India, when he suffered a relapse of his illness and slowly withered away, yet with Janaki by his side, continued to do world-class research and exchanged letters with Hardy until the very last days.  Everyone I read commented that this was “the right dramatic choice,” but … I dunno, I would’ve shown it!)

But enough!  I fear that to harp on these defects is to hold the film to impossibly-high, Platonic standards, rather than standards that engage with the reality of Hollywood.  An anecdote that Brown related at the end of the Q&A session brought this point home for me.  Apparently, Brown struggled for an entire decade to attract funding for a film about a turn-of-the-century South Indian mathematician visiting Trinity College, Cambridge, whose work had no commercial or military value whatsoever.  At one point, Brown was actually told that he could get the movie funded, if he’d agree to make Ramanujan fall in love with a white nurse, so that a British starlet who would sell tickets could be cast as his love interest.  One can only imagine what a battle it must have been to get a correct explanation of the partition function onto the screen.
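For reference, the headline result about the partition function (presumably the centerpiece of any on-screen explanation) is the Hardy-Ramanujan asymptotic formula

$$p(n) \;\sim\; \frac{1}{4n\sqrt{3}}\, e^{\pi\sqrt{2n/3}} \qquad \text{as } n \to \infty,$$

where p(n) counts the number of ways to write n as a sum of positive integers, ignoring order.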

In the end, though, nothing made me appreciate The Man Who Knew Infinity more than reading negative reviews of it, like this one by Olly Richards:

Watching someone balancing algorithms or messing about with multivariate polynomials just isn’t conducive to urgently shovelling popcorn into your face.  Difficult to dislike, given its unwavering affection for its subject, The Man Who Knew Infinity is nevertheless hamstrung by the dryness of its subject … Sturdy performances and lovely scenery abound, but it’s still largely just men doing sums; important sums as it turns out, but that isn’t conveyed to the audience until the coda [which mentions black holes] tells us of the major scientific advances they aided.

On behalf of mathematics, on behalf of my childhood self, I’m grateful that Brown fought this fight, and that he won as much as he did.  Whether you walk, run, board a steamship, or take taxi #1729, go see this film.

Addendum: See also this review by Peter Woit, and this in Notices of the AMS by Ramanujan expert George Andrews.

Me interviewed by John Horgan (the author of “The End of Science”)

Thursday, April 21st, 2016

You can read it here.

It’s long (~12,000 words).  Rather than listing what this interview covers, it would be easier to list what it doesn’t cover.  (My favorite soda flavors?)

If you read this blog, much of what I say there will be old hat, but some of it will be new.  I predict that you’ll enjoy the interview iff you enjoy the blog.  Comments welcome.

Grading Trudeau on quantum computing

Sunday, April 17th, 2016

Update (4/19): Inspired by Trudeau’s performance (which they clocked at 35 seconds), Maclean’s magazine asked seven quantum computing researchers—me, Krysta Svore, Aephraim Steinberg, Barry Sanders, Davide Venturelli, Martin Laforest, and Murray Thom—to also explain quantum computing in 35 seconds or fewer.  You can see all the results here (here’s the audio from my entry).


The emails started hitting me like … a hail of maple syrup from the icy north.  Had I seen the news?  Justin Trudeau, the dreamy young Prime Minister of Canada, visited the Perimeter Institute for Theoretical Physics in Waterloo, one of my favorite old haunts.  At a news conference at PI, as Trudeau stood in front of a math-filled blackboard, a reporter said to him: “I was going to ask you to explain quantum computing, but — when do you expect Canada’s ISIL mission to begin again, and are we not doing anything in the interim?”

Rather than answering immediately about ISIL, Trudeau took the opportunity to explain quantum computing:

“Okay, very simply, normal computers work, uh, by [laughter, applause] … no no no, don’t interrupt me.  When you walk out of here, you will know more … no, some of you will know far less about quantum computing, but most of you … normal computers work, either there’s power going through a wire, or not.  It’s 1, or a 0, they’re binary systems.  Uh, what quantum states allow for is much more complex information to be encoded into a single bit.  Regular computer bit is either a 1 or a 0, on or off.  A quantum state can be much more complex than that, because as we know [speeding up dramatically] things can be both particle and wave at the same times and the uncertainty around quantum states [laughter] allows us to encode more information into a much smaller computer.  So, that’s what exciting about quantum computing and that’s… [huge applause] don’t get me going on this or we’ll be here all day, trust me.”

What marks does Trudeau get for this?  On the one hand, the widespread praise for this reply surely says more about how low the usual standards for politicians are, and about Trudeau’s fine comic delivery, than about anything intrinsic to what he said.  Trudeau doesn’t really assert much here: basically, he just says that normal computers work using 1’s and 0’s, and that quantum computers are more complicated than that in some hard-to-explain way.  He gestures toward the uncertainty principle and wave/particle duality, but he doesn’t say anything about the aspects of QM most directly relevant to quantum computing—superposition or interference or the exponential size of Hilbert space—nor does he mention what quantum computers would or wouldn’t be used for.

On the other hand, I’d grade Trudeau’s explanation as substantially more accurate than what you’d get from a typical popular article.  For pay close attention to what the Prime Minister never says: he never says that a qubit would be “both 0 and 1 at the same time,” or any equivalent formulation.  (He does say that quantum states would let us “encode more information into a much smaller computer,” but while Holevo’s Theorem says that’s false for a common interpretation of “information,” it’s true for other reasonable interpretations.)  The humorous speeding up as he mentions particle/wave duality and the uncertainty principle clearly suggests that he knows it’s more subtle than just “0 and 1 at the same time,” and he also knows that he doesn’t really get it and that the journalists in the audience don’t either.  When I’m grading exams, I always give generous partial credit for honest admissions of ignorance.  B+.
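To make the "exponential size of Hilbert space" point concrete, here's a minimal Python sketch (mine, not anything from the press conference, and certainly not a proof of Holevo's Theorem): describing a general state of n qubits takes 2^n complex amplitudes, yet a single measurement hands back only n classical bits.

```python
import numpy as np

def random_state(n_qubits, seed=0):
    """Return a random n-qubit pure state as a vector of 2**n complex amplitudes."""
    rng = np.random.default_rng(seed)
    dim = 2 ** n_qubits
    psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return psi / np.linalg.norm(psi)

def measure(psi, seed=1):
    """Computational-basis measurement: the 2**n-amplitude description collapses
    to a single n-bit outcome, sampled with probability |amplitude|**2."""
    rng = np.random.default_rng(seed)
    probs = np.abs(psi) ** 2
    probs /= probs.sum()  # guard against floating-point drift
    outcome = rng.choice(len(psi), p=probs)
    n = int(np.log2(len(psi)))
    return format(int(outcome), f"0{n}b")

n = 10
psi = random_state(n)
print(f"{n} qubits need {len(psi)} complex amplitudes to describe,")
print(f"but one measurement yields only {n} classical bits: {measure(psi)}")
```

The point is just the mismatch in scale: the description grows exponentially while the readout doesn't, and Holevo's bound says that no cleverness in the encoding or measurement lets n qubits reliably carry more than n classical bits.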

Anyway, I’d be curious to know who at PI prepped Trudeau for this, and what they said.  Those with inside info, feel free to share in the comments (anonymously if you want!).

(One could also compare against Obama’s 2008 answer about bubblesort, which was just a mention of a keyword by comparison.)

Update: See also a Motherboard article where Romain Alléaume, Amr Helmy, Michele Mosca, and Aephraim Steinberg rate Trudeau’s answer, giving it 7/10, no score, 9/10, and 7/10 respectively.

Here’s some video of me spouting about Deep Questions

Thursday, February 4th, 2016

In January 2014, I attended an FQXi conference on Vieques island in Puerto Rico.  While there, Robert Lawrence Kuhn interviewed me for his TV program Closer to Truth, which deals with science and religion and philosophy and you get the idea.  Alas, my interview was at the very end of the conference, and we lost track of the time—so unbeknownst to me, a plane full of theorists was literally sitting on the runway waiting for me to finish philosophizing!  This was the second time Kuhn interviewed me for his show; the first time was on a cruise ship near Norway in 2011.  (Thankless hero that I am, there’s nowhere I won’t travel for the sake of truth.)

Anyway, after a two-year wait, the videos from Puerto Rico are finally available online.  While my vignettes cover what, for most readers of this blog, will be very basic stuff, I’m sort of happy with how they turned out: I still stutter and rock back and forth, but not as much as usual.  For your viewing convenience, here are the new videos:

I had one other vignette, about why the universe exists, but they seem to have cut that one.  Alas, if I knew why the universe existed in January 2014, I can’t remember any more.

One embarrassing goof: I referred to the inventor of Newcomb’s Paradox as “Simon Newcomb.”  Actually it was William Newcomb: a distant relative of Simon Newcomb, the 19th-century astronomer who measured the speed of light.

At their website, you can also see my older 2011 videos, and videos from others who might be known to readers of this blog, like Marvin Minsky, Roger Penrose, Rebecca Newberger Goldstein, David Chalmers, Sean Carroll, Max Tegmark, David Deutsch, Raphael Bousso, Freeman Dyson, Nick Bostrom, Ray Kurzweil, Rodney Brooks, Stephen Wolfram, Greg Chaitin, Garrett Lisi, Seth Lloyd, Lenny Susskind, Lee Smolin, Steven Weinberg, Wojciech Zurek, Fotini Markopoulou, Juan Maldacena, Don Page, and David Albert.  (No, I haven’t yet watched most of these, but now that I linked to them, maybe I will!)

Thanks very much to Robert Lawrence Kuhn and Closer to Truth (and my previous self, I guess?) for providing Shtetl-Optimized content so I don’t have to.


Update: Andrew Critch of CFAR asked me to post the following announcement.

We’re seeking a full time salesperson for the Center for Applied Rationality in Berkeley, California. We’ve streamlined operations to handle large volume in workshop admissions, and now we need that volume to pour in. Your role would be to fill our workshops, events, and alumni community with people. Last year we had 167 total new alumni. This year we want 120 per month. Click here to find out more.

Marvin Minsky

Tuesday, January 26th, 2016

Yesterday brought the sad news that Marvin Minsky passed away at age 88.  I never met Minsky (I wish I had); I just had one email exchange with him back in 2002, about Stephen Wolfram’s book.  But Minsky was my academic great-grandfather (through Manuel Blum and Umesh Vazirani), and he influenced me in many other ways.  For example, in his and Papert’s 1969 book Perceptrons—notorious for “killing neural net research for a decade,” because of its mis- or over-interpreted theorems about the representational limitations of single-layer neural nets—the way Minsky and Papert proved those theorems was by translating questions about computation into questions about the existence or nonexistence of low-degree polynomials with various properties, and then answering the latter questions using MATH.  Their “polynomial method” is now a mainstay of quantum algorithms research (having been brought to the subject by Beals et al.), and in particular, has been a mainstay of my own career.  Hardly Minsky’s best-known contribution to human knowledge, but that even such a relatively minor part of his oeuvre could have legs half a century later is a testament to his impact.
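As a tiny illustration of the kind of representational limit Minsky and Papert studied (my toy example, not one of the book's actual theorems, which concern parity and connectivity for bounded-order perceptrons), here's a Python sketch of the classic fact that no single-layer perceptron computes two-bit parity, i.e. XOR:

```python
import itertools
import numpy as np

def perceptron(w, b, x):
    """A single-layer perceptron: output 1 iff w . x + b > 0."""
    return 1 if np.dot(w, x) + b > 0 else 0

def computes_xor(w, b):
    """Does this perceptron agree with 2-bit parity on all four inputs?"""
    return all(perceptron(w, b, x) == (x[0] ^ x[1])
               for x in itertools.product([0, 1], repeat=2))

# Brute-force search over a grid of weights and biases finds nothing.
# (The actual argument is two lines: XOR forces b <= 0, w1 + b > 0,
#  w2 + b > 0, and w1 + w2 + b <= 0, but the middle two give
#  w1 + w2 + 2b > 0, contradicting the other two.)
grid = np.linspace(-5, 5, 41)
found = any(computes_xor((w1, w2), b)
            for w1, w2, b in itertools.product(grid, repeat=3))
print("Grid point computing XOR found:", found)  # prints False
```

The search is only illustrative, of course; the impossibility itself is the short inequality argument in the comment, and that flavor of reasoning is what the book pushed much further.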

I’m sure readers will have other thoughts to share about Minsky, so please do so in the comments section.  Personal reminiscences are especially welcome.