Archive for the ‘Adventures in Meatspace’ Category

Beyond fiction

Wednesday, August 8th, 2018

I now know firsthand what it’s like to be arrested by armed police officers, handcuffed, and sharply interrogated, while one’s wife and children look on helplessly.  This is not a prank post.

It happened in Philadelphia International Airport.  As someone who was born in Philadelphia, and who’s since visited ~40 countries on 6 continents and flies every week or two, I’ve long considered PHL possibly the most depressing airport on the planet (and the competition is fierce).

I’d just eaten dinner with my wife Dana and our two kids in a food court—after a day of travel that had already, before this happened, involved a missed flight and a lost suitcase, owing to a chain of mishaps that I’d (probably melodramatically) been describing to Dana as insane beyond the collective imagination of Homer and Shakespeare and Tolstoy and the world’s other literary giants to invent.  Again, that was before my arrest.

Two large uniformed men with holstered pistols saw me as we were exiting the airport, surrounded and handcuffed me, and demanded that I confess.

“I’m … sorry, officers,” I managed.  “I don’t understand what this is about.”

“Stop the games.  You know exactly what you took.  We have it all on video.  Where is it?”

Me, a thief?  I felt terrified to be at the beginning of a Kafka story.  But if I’m going to be brutally honest about it, I also felt … secretly vindicated in my irrational yet unshakeable beliefs that

  1. the laws of probability are broken, capricious horribleness reigning supreme over the universe,
  2. I’m despised by a large fraction of the world just for being who I am, and
  3. it’s only a matter of time until big, scary armed guys come for me, as they came for so many other nerdy misfits.

I almost wanted to say to the police: where have you been?  I’ve been expecting you my whole life.  And I wanted to say to Dana: you see??  see what I’ve been telling you all these years, about the nature of the universe we were born into?

Dana, for her part, was remonstrating with the officers that there must be some misunderstanding, that her husband was often absentminded but it’s completely impossible that he stole anything.  The officers brushed her away, told her to remove the kids from the situation.

“Are you gonna come clean?” one of the cops barked at me.  “We know you took it.”

“I didn’t take anything.”  Then I thought it over more.  “Or if somehow I did … then I’m certain that it would’ve been an accident, and I’d be more than happy to fix the…”

“Wait, if you did?  It sounds like you just confessed!”

“No, I definitely didn’t steal anything.  I’m just saying it’s possible that I might have mistakenly…”

“Your answers are rambling and all over the place.  Stop making up stories.  We know you did it.”

I’m not proud of myself for the next part, but the officers were so serious, and somehow I had to make them realize the sheer comical absurdity of what was happening.  “Look, I’m a computer science professor,” I said.  “I’ve never stolen a penny in my life, and it’s not something I’d ever…”

“Yeah, well I’m a police officer.  I’ve seen a lot in my thirty years in this job.  This is not about who you are, it’s about what you did.”

But what did I do?  After many more attempts to intimidate me, I was finally informed of the charge: “that smoothie place over there says you stole cash from their tip jar.”  Huh? How much?  One of the officers returned from the smoothie bar, and said, a bit sheepishly: “they say it was $4.”

Now a vague recollection came into sharper focus.  Yes, I had bought a berry smoothie for my daughter and a sparkling grapefruit juice for me.  I’d paid with a debit card, for reasons that I don’t remember, even though I normally pay cash.  My mind was elsewhere: on the missed flight, the lost suitcase, the brazen behavior of American Airlines (about which more later).  Then, completely forgetting I hadn’t paid cash this time, I looked down for my change: $4 in an unmarked plastic change cup.  I collected the change, put it in my wallet, then completely forgot about it.

After a minute, an employee angrily pointed down at a tray that the plastic cup was on (though not clearly at the cup itself), and said “hey, the tips go here!”  So I took a dollar from my wallet and put it on the tray.  I thought: this guy has some chutzpah, to demand a tip, and for an over-the-counter smoothie!  But whatever, he probably needs the dollar more than I do.  So if it will make him stop being angry…

But he was still angry.  He repeated: “this here is for tips!”

I said something to the effect of: “yeah, I know–that’s what you just told me, isn’t it?  So that’s why I just left you a tip!”  Sheesh.

At no point did he ever say, “you accidentally took from the tip jar,” or any other statement that would’ve clarified his meaning.

As I turned and walked away, I thought: yes, this is the strange world I was born into.  A world where people yell at me for not tipping at a smoothie bar–is that expected? I didn’t think it was–and then continue yelling even after I do.  But what did I expect?  Did I expect, as a nerdy outsider, to be able to buy normal people’s toleration with mere money?

As soon as I figured out what had happened, of course I offered to pay back the smoothie bar, not merely the $3 I still owed them, but $40 or whatever other amount would express my goodwill and compensate them for their trouble.  But the smoothie bar returned the $40 that I’d asked Dana to give them—I was unable to bring it myself on account of being handcuffed—and refused to press charges.  (In fact, apparently the employees hadn’t wanted to involve the police at all.  It was the manager, who hadn’t seen what happened, who’d insisted on it.)

So with no case, the police finally had no choice but to let me go–though not before giving me a stern lecture about never again putting my hands on stuff that’s not mine.

A week later, I’m still processing the experience.  In the rest of the post, I’d like to reflect on some lessons I think I learned from it.

First, it’s said that “a conservative is a liberal who’s been mugged; a liberal is a conservative who’s been arrested.”  It’s true: there are aspects of being arrested that are hard to understand until you’ve been through it.  While I’m white (well, insofar as Ashkenazim are), and while both officers who interrogated me happened to be African-Americans, what I went through further increased my sympathy for the many minority victims of aggressive policing.  Sitting in your armchair, it’s easy to think: in a liberal democracy, as long you know you did nothing wrong, even if you got arrested, frisked, detained, there’d probably be no real need to panic.  All you’d need to do is calmly clear up the misunderstanding and be back on your merry way.

But at least in my experience, an actual arrest isn’t like that.  The presumption of innocence, Miranda rights, all the things you might learn about in civics class—none of it seems to play any role.  From the very beginning, there’s an overwhelming presumption of guilt.  Everything you say gets interpreted as if you’re a red-handed criminal trying to fabricate a story, no matter how strained and how ludicrous such an interpretation might become.

And something strange happened: the officers seemed so certain I was guilty, that after only a few minutes I started to feel guilty.  I still had only a hazy sense of my “crime,” but I knew I was going to be punished for it, and I only hoped that the punishment wouldn’t tear me away from my family and previous life forever.

I came away from this incident with a visceral feel for just how easy it would be to procure a false confession from someone, which I didn’t have before but which will now stay with me as long as I live.

Second, it occurred to me that the sight of me, stuttering and potbellied complexity blogger, shackled and interrogated by armed policemen demanding that he confess to the theft of $3 from an airport stand, is a decent visual metaphor for much of my life.  If you doubt this, simply imagine Arthur Chu or Amanda Marcotte in place of the police officers.

It’s like: my accusers arrive on the scene committed to a specific, hostile theory of me: that I’m a petty thief of smoothie bars, let’s say, or a sexual-harassment-loving misogynist.  With all due modesty, people who know me might say that it’s not merely that I don’t fit the theory, that I happen to be innocent of the charge.  Rather, it’s that I’m one of the most astronomically, ridiculously unlikely people to fit the theory you could ever meet.  Not because I’m especially saintly, but simply because I already walk around all day feeling like my right to exist is conditional and might be revoked at any minute.  Breaking the normal people’s rules is the last thing on my agenda!  And yes, I still often feel that way, even as a professor with an endowed chair and awards and whatever.  The only times when I really relax, among strangers, is when everyone’s there to discuss ideas.

But my accusers don’t know any of that, or they refuse to believe it.  Everything I say gets interpreted in the light of the hostile theory, and therefore serves only as further confirmation of it.  Ironically—and this is key—the very unusual personality traits that make me so unlikely to be an offender, are also what throw off my accusers’ detection algorithms, and make them double down on their wrong theory.  When I’m trapped, I tend to fall back on the only tools I know: argument, openness, frank confession of my mistakes and failings, sometimes a little self-deprecating humor.  Unfortunately, I find this often backfires, as my accusers see in my vulnerability a golden opportunity to mount another wretched evildoer above their fireplace.

Or, to go even further out on a psychoanalytic limb: I sometimes get the sense that it gradually does dawn on my accusers that I’m not who they thought I was.  And then, far from prompting an apology, that realization seems to make my accusers even angrier, as if my throwing off their model of reality so badly, was an even worse offense than actually being guilty of whatever they thought!  A thief, a misogynist, they know how to handle.  But a living, breathing adversarial example for their worldview?

Dana, who watched the entire arrest, tells me that the central mistake I made was to try to reason with the police officers: “you say I took $3 that wasn’t mine?  If so, then I’m sure it was an accident, so let’s try to figure out what happened so we can fix it…”  In Dana’s view, what I saw as an earnest desire to get to the bottom of things, came across to grizzled cops only as evasiveness and guilt.  She says it would’ve been far better if I’d categorically denied: “no, I did not steal.  That’s completely absurd.  Please release me immediately.”

I’ve asked myself: how do you live in a world where, again and again, you can choose the hard right path over the easy wrong one, and then see your choice gleefully wielded against you?  Where you can spill your guts out to your accusers, in a desperate attempt to talk with them not as hardened warriors, but one confused and vulnerable human to another–and your reward is (to take one example) your picture in Salon above the headline “The Plight of the Bitter Nerd”?

The only way to live in such a world, as far as I can see, is to remind yourself that sometimes openness and vulnerability work.  In the course of my arrest, the two officers gradually differentiated themselves into a “good cop” and a “bad cop.”  While the “bad cop” treated me till the end like an unrepentant kleptomaniac being freed on a technicality, the “good cop,” who talked to me and Dana much more, became almost apologetic: “look man, when we get a call that someone stole money, we have to treat it like that’s the situation, you understand what I’m saying?  And then if it’s not, well then it’s not.”  Likewise, Arthur Chu recently tweeted that he’s “unhappy about [my] continued existence”–i.e., on a straightforward reading, that he wants me to die.  But I try to remind myself every day that the human race doesn’t consist solely of Arthur Chus (or Amanda Marcottes, or Lubos Motls, or SneerClub posters, or Paul Manaforts or Donald Trumps).  The world contains millions of women and men of every background and ideology who want actual dialogue, many of whom I’m lucky to count as friends, many of whom I met through this blog.  Vulnerability is possible because the world is not uniformly evil.

Third, I emerged from my arrest with a self-help technique that’s probably well-known to somebody, but that was new to me, and that I hope others will find as useful as I’m finding it.  Here it is: when something freakishly bad happens to you, draw a directed graph of all the known causes of the event, and the causes of the causes, and so forth as far back as you can trace them.  Also draw all the known measures that could have blocked the causal path leading to the bad event, and what prevented those measures from working or from being tried.

For example: why did I end up in handcuffs?  Firstly because, earlier in the day, Lily threw a temper tantrum that prevented us from packing and leaving for Logan Airport on time.  Because there was also heavy traffic on the way there.  Because we left from Harvard Square, and failed to factor in the extra 10 minutes to reach the airport, compared to if we’d left from MIT.  Because online check-in didn’t work.  Because when we did arrive, (barely) on time, the contemptuous American Airlines counter staff deliberately refused to check us in, chatting as we stewed impotently, so that we’d no longer be on time and they could legally give our seats away to others, and strand us in an airport with two young kids.  Because the only replacement flight was in a different terminal.  Because, in the stress of switching terminals–everything is stressful with two kids in an airport–I lost our suitcase.  Because the only shuttle to get back to the terminal went around the long way, and was slow as molasses, and by the time I returned our suitcase had been taken by the bomb squad.  Because the stress of such events bears down on me like an iron weight, and makes me unable to concentrate on the reality in front of me.  Because the guy at the smoothie counter and I failed to communicate.  Because the police chose to respond (or were trained to respond), not by politely questioning me to try to understand what had happened, but by handcuffing me and presuming guilt.

I actually drew the graph, filled a notebook page with it–and when I searched it for answers, neither I nor the world got off easily.  Looking over the strange chain of events that led to my arrest, I could find much to support my “default narrative,” that the laws of probability are broken and the universe is grotesquely awful.  But also, my belief in the universe’s grotesque awfulness clearly played a role in the events.  Had I been able maintain a calm demeanor, I would not have made so many mistakes.

Again and again, I screwed up.  Again and again, airport personnel responded to my honest mistakes with a maximum of cold bureaucracy rather than commonsense discussion: the booting from the flight, the bomb squad, the handcuffs.

We tend to think of bureaucracy as a mere nuisance, the person behind the counter at the Department of Motor Vehicles who makes you wait all day and then sends you home to get a different form of ID.  In my view, though, the bureaucratic impulse is one of the worst evils of which the human mind is capable.  It is, after all, the impulse that once sent trainloads of Jewish children to their deaths because that was the policy and there were no documents stating that any exception should be made in this case.  Today it’s the impulse that rounds up and deports people who’ve lived in the US for decades, sometimes served in the army, etc., and that separates screaming children from their parents.  To me, the mindset that willingly carries out such orders is almost more terrifying than the mindset that gives the orders in the first place.  I don’t mean to suggest, of course, that my arrest was even a trillionth as bad as those other things; at most I got a tiny, accidental taste of many less fortunate people’s daily reality.  But it’s worth remembering: every time you exercise official power over another person without even trying to talk it over first, clear up any honest misunderstandings, find out if there’s a reasonable explanation, you’re surrendering to one of the most destructive impulses in the history of civilization.

May we each strive to kill the bureaucrat in us and nurture the human being.

Unrelated Announcements:

I’m in Mexico City this week, to participate in a computer science and philosophy conference at UNAM and then give a broad quantum computing talk at CViCom 2018.  Because of this, responses to this post might be delayed.

(Update: But I’m having a wonderful time in Mexico!  Lots of delicious mole and horchata, and no arrests so far.  Today I gave my survey talk on P vs. NP.  I opened with the following icebreaker: “As a computer scientist speaking in a philosophy institute, I apologize that my talk will contain very little philosophy  Also, as an American speaking in Mexico, I apologize for our president.”)

My friend Elette Boyle asked me to announce that the 2018 CRYPTO conference, to be held in Santa Barbara, will be preceded by exciting workshops, including one that I’ll be speaking at myself entitled Beyond Crypto: A Theory Perspective.  Register now if you’re interested.

Huge congratulations to Costis Daskalakis, my former MIT colleague, for winning the Nevanlinna Prize for his work in algorithmic game theory!  While I don’t pretend to understand their work, congratulations to the four new Fields Medalists as well.

I put a new preprint online: Quantum Lower Bound for Approximate Counting Via Laurent Polynomials.

I’ve added a new blog to my blogroll: The Unit of Caring. I’ve been impressed by the author’s moral adeptness: when she addresses contentious debates among nerds, rationalists, feminists, SJWs, etc. etc., she often seems perfectly balanced on an atom-thin tightrope, even as some of us are plummetting left and right.

I forgot to mention this earlier, but I’m now a donor to the campaign of Beto O’Rourke, as he strives to unseat the quisling Ted Cruz in my adopted home state of Texas.  Americans: please consider donating as well!

Further Thoughts (Aug. 9):

  1. I wholeheartedly endorse an observation that many commenters (on this blog and elsewhere) made independently: that what really happened, is that I was forced to live out an episode of Seinfeld or Curb Your Enthusiasm.  To my detractors, I say the following: try for one minute to imagine how pathological, narcissistic, far outside the human norm, etc. etc. you could make Seinfeld or George or Kramer or Elaine seem, if their misadventures from any given episode were described and analyzed with clinical detachment.  (Or you were never a Seinfeld fan, then I guess this argument fails and we have nothing to say to each other.)
  2. I feel like some commenters are imposing their own after-the-fact knowledge (“c’mon, it was obviously a tip jar, he must be lying!”).  Dana, who’s generally more grounded than I am, saw their whole setup and agreed it was profoundly non-obvious that the tiny, unmarked plastic cup was supposed to be for tips, particularly to someone who was extremely stressed and not concentrating.  And when the employee later talked about tips, he didn’t indicate the cup so I didn’t make a connection.
  3. Most importantly: I wish to clarify that I don’t regard the police officers who handcuffed and interrogated me as having been “evil” in any sense.  I even took a liking to the “good cop,” the one who implicitly acknowledged the situation’s surreal absurdity by the end (although maybe that’s the whole point of a “good cop”?).  Having said that, I’m still rattled by the way the “bad cop” treated me as an unrepentant thief even to the end, even after the situation had been cleared up to everyone else’s satisfaction.  And I stand by my view that there was no need to handcuff me in front of my wife and young children, when I’d shown not a single subatomic particle of resistance.
  4. Speaking of which, let me now relate the most interesting and unexpected part of the reaction to my story.  Again and again, I found that fellow Americans, even nominally left-wing ones, sided with the police, said that I was crazy and guilty as charged and should’ve expected much worse, etc.  And again and again, commenters from Australia and New Zealand sided with me 300%, said that handcuffing someone over such a trivial mishap was a ludicrous overreaction, which would be totally unheard of in their countries and which confirms all the bad things they’ve heard about the US.  So maybe the rational conclusion is that I should be learning to enjoy vegemite in preparation for a move down under?

Summer recapitulates life

Tuesday, July 24th, 2018

Last week, I was back at the IAS in Princeton, to speak at a wonderful PITP summer school entitled “From Qubits to Spacetime,” co-organized by Juan Maldacena and Edward Witten. This week, I’ll be back in Waterloo, to visit old and new friends at the Perimeter Institute and Institute for Quantum Computing and give a couple talks.  Then, over the weekend, I’ll be back in Boston to see old friends, colleagues, and students.  After some other miscellaneous travel, I’ll then return to Austin in late August when the semester begins.  The particular sequence IAS → Waterloo → Boston → Austin is of course one that I’ve followed before, over a longer timescale.

Two quick announcements:

First, at the suggestion of reader Sanketh Menda, I’m thinking of holding a Shtetl-Optimized meetup in Waterloo this week.  Please send me an email if you’re interested, and we’ll figure out a time and place that work for everyone.

Second, many of the videos from the IAS summer school are now available, including mine: Part I and Part II.  I cover some basics of complexity theory, the complexity of quantum states and unitary transformations, the Harlow-Hayden argument about the complexity of turning a black hole event horizon into a firewall (with my refinement), and my and Lenny Susskind’s work on circuit complexity, wormholes, and AdS/CFT.  As a special bonus, check out the super-embarrassing goof at the beginning of my first lecture—claiming a mistaken symmetry of conditional entropy and even attributing it to Edward Witten’s lecture!  (But Witten, who I met for the first time on this visit, was kind enough to call my talk “lots of fun” anyway, and give me other positive comments, which I should put on my CV or something.)

Addendum: Many of the PITP videos are well worth watching!  As one example, I found Witten’s talks to be shockingly accessible.  I’d been to a previous talk of his involving Khovanov homology, but beyond the first few minutes, it went so far over my head that I couldn’t tell you how it was for its intended audience.  I’d also been to a popular talk of Witten’s on string theory, but that’s something he could do with only 3 awake brain cells.  In these talks, by contrast, Witten proves some basic inequalities of classical and quantum information theory, then proves the Reeh-Schlieder Theorem of quantum field theory and the Hawking and Penrose singularity theorems of GR, and finally uses quantum information theory to prove positive energy conditions from quantum field theory that are often needed to make statements about GR.

My Y Combinator podcast

Friday, June 29th, 2018

Here it is, recorded last week at Y Combinator’s office in San Francisco.  For regular readers of this blog, there will be a few things that are new—research projects I’ve been working on this year—and many things that are old.  Hope you enjoy it!  Thanks so much to Craig Cannon of Y Combinator for inviting me.

Associated with the podcast, Hacker News will be doing an AMA with me later today.  I’ll post a link to that when it’s available.  Update: here it is.

I’m at STOC’2018 TheoryFest in Los Angeles right now, where theoretical computer scientists celebrated the 50th anniversary of the conference that in some sense was the birthplace of the P vs. NP problem.  (Two participants in the very first STOC in 1969, Richard Karp and Allan Borodin, were on a panel to share their memories, along with Ronitt Rubinfeld and Avrim Blum, who joined the action in the 1980s.)  There’s been a great program this year—if you’d like to ask me about it, maybe do so in the comments of this post rather than in the AMA.

Five announcements

Tuesday, June 12th, 2018
  1. For the next two weeks, I’m in Berkeley for the Simons program “Challenges in Quantum Computation” (awesome program, by the way).  If you’re in the Bay Area and wanted to meet, feel free to shoot me an email (easiest for me if you come to Berkeley, though I do have a couple planned trips to SF).  If enough people wanted, we could even do a first-ever dedicated Shtetl-Optimized meetup.
  2. More broadly: I’m finally finished my yearlong sabbatical in Israel.  At some point I’ll do a post with my reflections on the experience.  I’ll now be traveling around North America all summer, then returning to UT Austin in the fall.
  3. Longtime friend-of-the-blog Boaz Barak, from a university in Cambridge, MA known as Harvard, asks me to invite readers to check out his new free draft textbook Introduction to Theoretical Computer Science, and to post comments about “typos, bugs, confusing explanations and such” in the book’s GitHub repository.  It looks great!
  4. This is already almost a month old, but if you enjoy the quantum computing content on this blog and wish to see related content from our carefully selected partners, check out John Preskill’s Y Combinator interview.
  5. Here’s the text of Senator Kamala Harris’s bill, currently working its way through the Senate, to create a US Quantum Computing Research Consortium.  Apparently there’s now also a second, competing quantum computing bill (!)—has anyone seen the text of that one?

Update (June 16): Even though I said there wouldn’t be a meetup, enough people eventually emailed wanting to have coffee that we did do the first-ever dedicated Shtetl-Optimized meetup after all—appropriately, given the title of the blog, at Saul’s Delicatessen in Berkeley. It was awesome. I met people working on fascinating and important things, from cheap nuclear energy to data analytics for downballot Democrats, and who I felt very proud to count as readers. Thanks so much to everyone who came; we’ll have to do another one sometime!

PDQP/qpoly = ALL

Saturday, May 19th, 2018

I’ve put up a new paper.  Unusually for me these days, it’s a very short and simple one (8 pages)—I should do more like this!  Here’s the abstract:

    We show that combining two different hypothetical enhancements to quantum computation—namely, quantum advice and non-collapsing measurements—would let a quantum computer solve any decision problem whatsoever in polynomial time, even though neither enhancement yields extravagant power by itself. This complements a related result due to Raz. The proof uses locally decodable codes.

I welcome discussion in the comments.  The real purpose of this post is simply to fulfill a request by James Gallagher, in the comments of my Robin Hanson post:

The probably last chance for humanity involves science progressing, can you apply your efforts to quantum computers, which is your expertise, and stop wasting many hours of you [sic] time with this [expletive deleted]

Indeed, I just returned to Tel Aviv, for the very tail end of my sabbatical, from a weeklong visit to Google’s quantum computing group in LA.  While we mourned tragedies—multiple members of the quantum computing community lost loved ones in recent weeks—it was great to be among so many friends, and great to talk and think for once about actual progress that’s happening in the world, as opposed to people saying mean things on Twitter.  Skipping over its plans to build a 49-qubit chip, Google is now going straight for 72 qubits.  And we now have some viable things that one can do, or try to do, with such a chip, beyond simply proving quantum supremacy—I’ll say more about that in subsequent posts.

Anyway, besides discussing this progress, the other highlight of my trip was going from LA to Santa Barbara on the back of Google physicist Sergio Boixo’s motorcycle—weaving in and out of rush-hour traffic, the tightness of my grip the only thing preventing me from flying out onto the freeway.  I’m glad to have tried it once, and probably won’t be repeating it.

Update: I posted a new version of the PDQP/qpoly=ALL paper, which includes an observation about communication complexity, and which—inspired by the comments section—clarifies that when I say “all languages,” I really do mean “all languages” (even the halting problem).


Monday, December 4th, 2017

Updates (Dec. 5): The US Supreme Court has upheld Trump’s latest travel ban. I’m grateful to all the lawyers who have thrown themselves in front of the train of fascism, desperately trying to slow it down—but I could never, ever have been a lawyer myself. Law is fundamentally a make-believe discipline. Sure, there are times when it involves reason and justice, possibly even resembles mathematics—but then there are times when the only legally correct thing to say is, “I guess that, contrary to what I thought, the Establishment Clause of the First Amendment does let you run for president promising to discriminate against a particular religious group, and then find a pretext under which to do it. The people with the power to decide that question have decided it.” I imagine that I’d last about half a day before tearing up my law-school diploma in disgust, which is surely a personality flaw on my part.

In happier news, many of you may have seen that papers by the groups of Chris Monroe and of Misha Lukin, reporting ~50-qubit experiments with trapped ions and optical lattices respectively, have been published back-to-back in Nature. (See here and here for popular summaries.) As far as I can tell, these papers represent an important step along the road to a clear quantum supremacy demonstration. Ideally, one wants a device to solve a well-defined computational problem (possibly a sampling problem), and also highly-optimized classical algorithms for solving the same problem and for simulating the device, which both let one benchmark the device’s performance and verify that the device is solving the problem correctly. But in a curious convergence, the Monroe group and Lukin group work suggests that this can probably be achieved with trapped ions and/or optical lattices at around the same time that Google and IBM are closing in on the goal with superconducting circuits.

As everyone knows, the flaming garbage fire of a tax bill has passed the Senate, thanks to the spinelessness of John McCain, Lisa Murkowski, Susan Collins, and Jeff Flake.  The fate of American higher education will now be decided behind closed doors, in the technical process of “reconciling” the House bill (which includes the crippling new tax on PhD students) with the Senate bill (which doesn’t—that one merely guts a hundred other things).  It’s hard to imagine that this particular line item will occassion more than about 30 seconds of discussion.  But, I dunno, maybe calling your Senator or Representative could help.  Me, I left a voicemail message with the office of Texas Senator Ted Cruz, one that I’m confident Cruz and his staff will carefully consider.

Here’s talk show host Seth Meyers (scroll to 5:00-5:20):

“By 2027, half of all US households would pay more in taxes [under the new bill].  Oh my god.  Cutting taxes was the one thing Republicans were supposed to be good at.  What’s even the point of voting for a Republican if they’re going to raise your taxes?  That’s like tuning in to The Kardashians only to see Courtney giving a TED talk on quantum computing.”

Speaking of which, you can listen to an interview with me about quantum computing, on a podcast called Data Skeptic. We discuss the basics and then the potential for quantum machine learning algorithms.

I got profoundly annoyed by an article called The Impossibility of Intelligence Explosion by François Chollet.  Citing the “No Free Lunch Theorem”—i.e., the (trivial) statement that you can’t outperform brute-force search on random instances of an optimization problem—to claim anything useful about the limits of AI, is not a promising sign.  In this case, Chollet then goes on to argue that most intelligence doesn’t reside in individuals but rather in culture; that there are hard limits to intelligence and to its usefulness; that we know of those limits because people with stratospheric intelligence don’t achieve correspondingly extraordinary results in life [von Neumann? Newton? Einstein? –ed.]; and finally, that recursively self-improving intelligence is impossible because we, humans, don’t recursively improve ourselves.  Scattered throughout the essay are some valuable critiques, but nothing comes anywhere close to establishing the impossibility advertised in the title.  Like, there’s a standard in CS for what it takes to show something’s impossible, and Chollet doesn’t even reach the same galaxy as that standard.  The certainty that he exudes strikes me as wholly unwarranted, just as much as (say) the near-certainty of a Ray Kurzweil on the other side.

I suppose this is as good a place as any to say that my views on AI risk have evolved.  A decade ago, it was far from obvious that known methods like deep learning and reinforcement learning, merely run with much faster computers and on much bigger datasets, would work as spectacularly well as they’ve turned out to work, on such a wide variety of problems, including beating all humans at Go without needing to be trained on any human game.  But now that we know these things, I think intellectual honesty requires updating on them.  And indeed, when I talk to the AI researchers whose expertise I trust the most, many, though not all, have updated in the direction of “maybe we should start worrying.”  (Related: Eliezer Yudkowsky’s There’s No Fire Alarm for Artificial General Intelligence.)

Who knows how much of the human cognitive fortress might fall to a few more orders of magnitude in processing power?  I don’t—not in the sense of “I basically know but am being coy,” but really in the sense of not knowing.

To be clear, I still think that by far the most urgent challenges facing humanity are things like: resisting Trump and the other forces of authoritarianism, slowing down and responding to climate change and ocean acidification, preventing a nuclear war, preserving what’s left of Enlightenment norms.  But I no longer put AI too far behind that other stuff.  If civilization manages not to destroy itself over the next century—a huge “if”—I now think it’s plausible that we’ll eventually confront questions about intelligences greater than ours: do we want to create them?  Can we even prevent their creation?  If they arise, can we ensure that they’ll show us more regard than we show chimps?  And while I don’t know how much we can say about such questions that’s useful, without way more experience with powerful AI than we have now, I’m glad that a few people are at least trying to say things.

But one more point: given the way civilization seems to be headed, I’m actually mildly in favor of superintelligences coming into being sooner rather than later.  Like, given the choice between a hypothetical paperclip maximizer destroying the galaxy, versus a delusional autocrat burning civilization to the ground while his supporters cheer him on and his opponents fight amongst themselves, I’m just about ready to take my chances with the AI.  Sure, superintelligence is scary, but superstupidity has already been given its chance and been found wanting.

Speaking of superintelligences, I strongly recommend an interview of Ed Witten by Quanta magazine’s Natalie Wolchover: one of the best interviews of Witten I’ve read.  Some of Witten’s prouncements still tend toward the oracular—i.e., we’re uncovering facets of a magnificent new theoretical structure, but it’s almost impossible to say anything definite about it, because we’re still missing too many pieces—but in this interview, Witten does stick his neck out in some interesting ways.  In particular, he speculates (as Einstein also did, late in life) about whether physics should be reformulated without any continuous quantities.  And he reveals that he’s recently been rereading Wheeler’s old “It from Bit” essay, because: “I’m trying to learn about what people are trying to say with the phrase ‘it from qubit.'”

I’m happy to report that a group based mostly in Rome has carried out the first experimental demonstration of PAC-learning of quantum states, applying my 2006 “Quantum Occam’s Razor Theorem” to reconstruct optical states of up to 6 qubits.  Better yet, they insisted on adding me to their paper!

I was at Cornell all of last week to give the Messenger Lectures: six talks in all (!!), if you include the informal talks that I gave at student houses (including Telluride House, where I lived as a Cornell undergrad from 1998 to 2000).  The subjects were my usual beat (quantum computing, quantum supremacy, learnability of quantum states, firewalls and AdS/CFT, big numbers).  Intimidatingly, the Messenger Lectures are the series in which Richard Feynman presented The Character of Physical Law in 1964, and in which many others (Eddington, Oppenheimer, Pauling, Weinberg, …) set a standard that my crass humor couldn’t live up to in a trillion years.  Nevertheless, thanks so much to Paul Ginsparg for hosting my visit, and for making it both intellectually stimulating and a trip down memory lane, with meetings with many of the professors from way back when who helped to shape my thinking, including Bart Selman, Jon Kleinberg, and Lillian Lee.  Cornell is much as I remember it from half a lifetime ago, except that they must’ve made the slopes twice as steep, since I don’t recall so much huffing and puffing on my way to class each morning.

At one of the dinners, my hosts asked me about the challenges of writing a blog when people on social media might vilify you for what you say.  I remarked that it hasn’t been too bad lately—indeed that these days, to whatever extent I write anything ‘controversial,’ mostly it’s just inveighing against Trump.  “But that is scary!” someone remarked.  “You live in Texas now!  What if someone with a gun got angry at you?”  I replied that the prospect of enraging such a person doesn’t really keep me awake at night, because it seems like the worst they could do would be to shoot me.  By contrast, if I write something that angers leftists, they can do something far scarier: they can make me feel guilty!

I’ll be giving a CS colloquium at Georgia Tech today, then attending workshops in Princeton and NYC the rest of the week, so my commenting might be lighter than usual … but yours need not be.

The problem with Uber

Thursday, October 19th, 2017

I just spent a wonderful and exhausting five days in the Bay Area: meeting friends, holding the first-ever combined SlateStarCodex/Shtetl-Optimized meetup, touring quantum computing startups, meeting with Silicon Valley folks about quantum computing, and giving a public lecture for the Simons Institute in Berkeley.  I’ll probably say more about some of these events in future posts, but for now: thanks so much to everyone who helped them happen!

Alas, my experiences getting around the Bay this week convinced me that there’s a real problem with Uber.  And no, I’m not talking about their corporate culture, or the personality of ousted CEO Travis Kalanick, or the hardball lobbying of municipalities to allow ride-sharing, or the taxi companies needing to adapt to survive, or even Uber having an unsustainable business model (they could charge more and I’d still use it…).

The problem is: when you order an Uber, like 2/3 of the time you and the driver can’t find each other without a lot of back and forth.

Firstly, because you can’t specify where you are with enough accuracy.  When you try, the app does this thing where it literally moves the “you are here” pointer to a place where you’re not. And then, even if the little dot correctly indicates your location, for some reason the driver will think you’re somewhere totally different.

Secondly, because Uber cars are typically unmarked.  Yes, the app tells you that it’s a white Ford or whatever—but there’s a lot of white cars, and it’s hard (at least for me) to distinguish models at a distance, so you can then face a stressful “Where’s Waldo?” problem involving hundreds of cars.

Thirdly, because the drivers understandably have their phones mounted on their dashboards—the result being that, when you call to try to figure out where they are, nothing they say can be distinguished from “mmph hrmph mmph.”  And of course they can’t text while driving.

To be clear, these gripes arise only because ride-sharing apps generally work so damn well, and are such an advance over what preceded them, that they’ve changed our expectations about the convenience of getting from place to place.  Because of Uber and Lyft and so on, it’s tempting to plan your life around the assumption that you can be anywhere in a greater metro area, and within 3 minutes a car will magically arrive to take you to wherever else in that area you need to be—while your brain remains uncluttered with transportation logistics, among the most excruciating of all topics.  This is a problem borne of success.

But—good news, everyone!—I have an idea to solve the problem, which I hereby offer free of charge to any ride-sharing service that wants to adopt it.  Namely, when you order a ride, why doesn’t the app—with your explicit permission, of course—use your phone’s camera to send a selfie of you, together with the location where you’re waiting, to the driver?  Is there some obvious reason I’m missing why this wouldn’t work?  Have any ride-sharing companies tried it?  (I only learned today that I can update my Uber profile to include my photo.  Hopefully that will help drivers find me—but a photo of the intersection, or the side of the building where I am, etc. could help even more.)

Coming to Nerd Central

Sunday, October 8th, 2017

While I’m generally on sabbatical in Tel Aviv this year, I’ll be in the Bay Area from Saturday Oct. 14 through Wednesday Oct. 18, where I look forward to seeing many friends new and old.  On Wednesday evening, I’ll be giving a public talk in Berkeley, through the Simons Institute’s “Theoretically Speaking” series, entitled Black Holes, Firewalls, and the Limits of Quantum Computers.  I hope to see at least a few of you there!  (I do have readers in the Bay Area, don’t I?)

But there’s more: on Saturday Oct. 14, I’m thinking of having a first-ever Shtetl-Optimized meetup, somewhere near the Berkeley campus.  Which will also be a Slate Star Codex meetup, because Scott Alexander will be there too.  We haven’t figured out many details yet, except that it will definitively involve getting fruit smoothies from one of the places I remember as a grad student.  Possible discussion topics include what the math, CS, and physics research communities could be doing better; how to advance Enlightenment values in an age of recrudescent totalitarianism; and (if we’re feeling really ambitious) the interpretation of quantum mechanics.  If you’re interested, shoot me an email, let me know if there are times that don’t work; then other Scott and I will figure out a plan and make an announcement.

On an unrelated note, some people might enjoy my answer to a MathOverflow question about why one should’ve expected number theory to be so rife with ridiculously easy-to-state yet hard-to-prove conjectures, like Fermat’s Last Theorem and the Goldbach Conjecture.  As I’ve discussed on this blog before, I’ve been deeply impressed with MathOverflow since the beginning, but never more so than today, when a decision to close the question as “off-topic” was rightfully overruled.  If there’s any idea that unites all theoretical computer scientists, I’d say it’s the idea that what makes a given kind of mathematics “easy” or “hard” is, itself, a proper subject for mathematical inquiry.

My Big Numbers talk at Festivaletteratura

Thursday, September 14th, 2017

Last weekend, I gave a talk on big numbers, as well as a Q&A about quantum computing, at Festivaletteratura: one of the main European literary festivals, held every year in beautiful and historic Mantua, Italy.  (For those who didn’t know, as I didn’t: this is the city where Virgil was born, and where Romeo gets banished in Romeo and Juliet.  Its layout hasn’t substantially changed since the Middle Ages.)

I don’t know how much big numbers or quantum computing have to do with literature, but I relished the challenge of explaining these things to an audience that was not merely “popular” but humanisitically rather than scientifically inclined.  In this case, there was not only a math barrier, but also a language barrier, as the festival was mostly in Italian and only some of the attendees knew English, to varying degrees.  The quantum computing session was live-translated into Italian (the challenge faced by the translator in not mangling this material provided a lot of free humor), but the big numbers talk wasn’t.  What’s more, the talk was held outdoors, on the steps of a cathedral, with tons of background noise, including a bell that loudly chimed halfway through the talk.  So if my own words weren’t simple and clear, forget it.

Anyway, in the rest of this post, I’ll share a writeup of my big numbers talk.  The talk has substantial overlap with my “classic” Who Can Name The Bigger Number? essay from 1999.  While I don’t mean to supersede or displace that essay, the truth is that I think and write somewhat differently than I did as a teenager (whuda thunk?), and I wanted to give Scott2017 a crack at material that Scott1999 has been over already.  If nothing else, the new version is more up-to-date and less self-indulgent, and it includes points (for example, the relation between ordinal generalizations of the Busy Beaver function and the axioms of set theory) that I didn’t understand back in 1999.

For regular readers of this blog, I don’t know how much will be new here.  But if you’re one of those people who keeps introducing themselves at social events by saying “I really love your blog, Scott, even though I don’t understand anything that’s in it”—something that’s always a bit awkward for me, because, uh, thanks, I guess, but what am I supposed to say next?—then this lecture is for you.  I hope you’ll read it and understand it.

Thanks so much to Festivaletteratura organizer Matteo Polettini for inviting me, and to Fabrizio Illuminati for moderating the Q&A.  I had a wonderful time in Mantua, although I confess there’s something about being Italian that I don’t understand.  Namely: how do you derive any pleasure from international travel, if anywhere you go, the pizza, pasta, bread, cheese, ice cream, coffee, architecture, scenery, historical sights, and pretty much everything else all fall short of what you’re used to?

Big Numbers

by Scott Aaronson
Sept. 9, 2017

My four-year-old daughter sometimes comes to me and says something like: “daddy, I think I finally figured out what the biggest number is!  Is it a million million million million million million million million thousand thousand thousand hundred hundred hundred hundred twenty eighty ninety eighty thirty a million?”

So I reply, “I’m not even sure exactly what number you named—but whatever it is, why not that number plus one?”

“Oh yeah,” she says.  “So is that the biggest number?”

Of course there’s no biggest number, but it’s natural to wonder what are the biggest numbers we can name in a reasonable amount of time.  Can I have two volunteers from the audience—ideally, two kids who like math?

[Two kids eventually come up.  I draw a line down the middle of the blackboard, and place one kid on each side of it, each with a piece of chalk.]

So the game is, you each have ten seconds to write down the biggest number you can.  You can’t write anything like “the other person’s number plus 1,” and you also can’t write infinity—it has to be finite.  But other than that, you can write basically anything you want, as long as I’m able to understand exactly what number you’ve named.  [These instructions are translated into Italian for the kids.]

Are you ready?  On your mark, get set, GO!

[The kid on the left writes something like: 999999999

While the kid on the right writes something like: 11111111111111111

Looking at these, I comment:]

9 is bigger than 1, but 1 is a bit faster to write, and as you can see that makes the difference here!  OK, let’s give our volunteers a round of applause.

[I didn’t plant the kids, but if I had, I couldn’t have designed a better jumping-off point.]

I’ve been fascinated by how to name huge numbers since I was a kid myself.  When I was a teenager, I even wrote an essay on the subject, called Who Can Name the Bigger Number?  That essay might still get more views than any of the research I’ve done in all the years since!  I don’t know whether to be happy or sad about that.

I think the reason the essay remains so popular, is that it shows up on Google whenever someone types something like “what is the biggest number?”  Some of you might know that Google itself was named after the huge number called a googol: 10100, or 1 followed by a hundred zeroes.

Of course, a googol isn’t even close to the biggest number we can name.  For starters, there’s a googolplex, which is 1 followed by a googol zeroes.  Then there’s a googolplexplex, which is 1 followed by a googolplex zeroes, and a googolplexplexplex, and so on.  But one of the most basic lessons you’ll learn in this talk is that, when it comes to naming big numbers, whenever you find yourself just repeating the same operation over and over and over, it’s time to step back, and look for something new to do that transcends everything you were doing previously.  (Applications to everyday life left as exercises for the listener.)

One of the first people to think about systems for naming huge numbers was Archimedes, who was Greek but lived in what’s now Italy (specifically Syracuse, Sicily) in the 200s BC.  Archimedes wrote a sort of pop-science article—possibly history’s first pop-science article—called The Sand-Reckoner.  In this remarkable piece, which was addressed to the King of Syracuse, Archimedes sets out to calculate an upper bound on the number of grains of sand needed to fill the entire universe, or at least the universe as known in antiquity.  He thereby seeks to refute people who use “the number of sand grains” as a shorthand for uncountability and unknowability.

Of course, Archimedes was just guessing about the size of the universe, though he did use the best astronomy available in his time—namely, the work of Aristarchus, who anticipated Copernicus.  Besides estimates for the size of the universe and of a sand grain, the other thing Archimedes needed was a way to name arbitrarily large numbers.  Since he didn’t have Arabic numerals or scientific notation, his system was basically just to compose the word “myriad” (which means 10,000) into bigger and bigger chunks: a “myriad myriad” gets its own name, a “myriad myriad myriad” gets another, and so on.  Using this system, Archimedes estimated that ~1063 sand grains would suffice to fill the universe.  Ancient Hindu mathematicians were able to name similarly large numbers using similar notations.  In some sense, the next really fundamental advances in naming big numbers wouldn’t occur until the 20th century.

We’ll come to those advances, but before we do, I’d like to discuss another question that motivated Archimedes’ essay: namely, what are the biggest numbers relevant to the physical world?

For starters, how many atoms are in a human body?  Anyone have a guess?  About 1028.  (If you remember from high-school chemistry that a “mole” is 6×1023, this is not hard to ballpark.)

How many stars are in our galaxy?  Estimates vary, but let’s say a few hundred billion.

How many stars are in the entire observable universe?  Something like 1023.

How many subatomic particles are in the observable universe?  No one knows for sure—for one thing, because we don’t know what the dark matter is made of—but 1090 is a reasonable estimate.

Some of you might be wondering: but for all anyone knows, couldn’t the universe be infinite?  Couldn’t it have infinitely many stars and particles?  The answer to that is interesting: indeed, no one knows whether space goes on forever or curves back on itself, like the surface of the earth.  But because of the dark energy, discovered in 1998, it seems likely that even if space is infinite, we can only ever see a finite part of it.  The dark energy is a force that pushes the galaxies apart.  The further away they are from us, the faster they’re receding—with galaxies far enough away from us receding faster than light.

Right now, we can see the light from galaxies that are up to about 45 billion light-years away.  (Why 45 billion light-years, you ask, if the universe itself is “only” 13.6 billion years old?  Well, when the galaxies emitted the light, they were a lot closer to us than they are now!  The universe expanded in the meantime.)  If, as seems likely, the dark energy has the form of a cosmological constant, then there’s a somewhat further horizon, such that it’s not just that the galaxies beyond that can’t be seen by us right now—it’s that they can never be seen.

In practice, many big numbers come from the phenomenon of exponential growth.  Here’s a graph showing the three functions n, n2, and 2n:

The difference is, n and even n2 grow in a more-or-less manageable way, but 2n just shoots up off the screen.  The shooting-up has real-life consequences—indeed, more important consequences than just about any other mathematical fact one can think of.

The current human population is about 7.5 billion (when I was a kid, it was more like 5 billion).  Right now, the population is doubling about once every 64 years.  If it continues to double at that rate, and humans don’t colonize other worlds, then you can calculate that, less than 3000 years from now, the entire earth, all the way down to the core, will be made of human flesh.  I hope the people use deodorant!

Nuclear chain reactions are a second example of exponential growth: one uranium or plutonium nucleus fissions and emits neutrons that cause, let’s say, two other nuclei to fission, which then cause four nuclei to fission, then 8, 16, 32, and so on, until boom, you’ve got your nuclear weapon (or your nuclear reactor, if you do something to slow the process down).  A third example is compound interest, as with your bank account, or for that matter an entire country’s GDP.  A fourth example is Moore’s Law, which is the thing that said that the number of components in a microprocessor doubled every 18 months (with other metrics, like memory, processing speed, etc., on similar exponential trajectories).  Here at Festivaletteratura, there’s a “Hack Space,” where you can see state-of-the-art Olivetti personal computers from around 1980: huge desk-sized machines with maybe 16K of usable RAM.  Moore’s Law is the thing that took us from those (and the even bigger, weaker computers before them) to the smartphone that’s in your pocket.

However, a general rule is that any time we encounter exponential growth in our observed universe, it can’t last for long.  It will stop, if not before then when it runs out of whatever resource it needs to continue: for example, food or land in the case of people, fuel in the case of a nuclear reaction.  OK, but what about Moore’s Law: what physical constraint will stop it?

By some definitions, Moore’s Law has already stopped: computers aren’t getting that much faster in terms of clock speed; they’re mostly just getting more and more parallel, with more and more cores on a chip.  And it’s easy to see why: the speed of light is finite, which means the speed of a computer will always be limited by the size of its components.  And transistors are now just 15 nanometers across; a couple orders of magnitude smaller and you’ll be dealing with individual atoms.  And unless we leap really far into science fiction, it’s hard to imagine building a transistor smaller than one atom across!

OK, but what if we do leap really far into science fiction?  Forget about engineering difficulties: is there any fundamental principle of physics that prevents us from making components smaller and smaller, and thereby making our computers faster and faster, without limit?

While no one has tested this directly, it appears from current physics that there is a fundamental limit to speed, and that it’s about 1043 operations per second, or one operation per Planck time.  Likewise, it appears that there’s a fundamental limit to the density with which information can be stored, and that it’s about 1069 bits per square meter, or one bit per Planck area. (Surprisingly, the latter limit scales only with the surface area of a region, not with its volume.)

What would happen if you tried to build a faster computer than that, or a denser hard drive?  The answer is: cycling through that many different states per second, or storing that many bits, would involve concentrating so much energy in so small a region, that the region would exceed what’s called its Schwarzschild radius.  If you don’t know what that means, it’s just a fancy way of saying that your computer would collapse to a black hole.  I’ve always liked that as Nature’s way of telling you not to do something!

Note that, on the modern view, a black hole itself is not only the densest possible object allowed by physics, but also the most efficient possible hard drive, storing ~1069 bits per square meter of its event horizon—though the bits are not so easy to retrieve! It’s also, in a certain sense, the fastest possible computer, since it really does cycle through 1043 states per second—though it might not be computing anything that anyone would care about.

We can also combine these fundamental limits on computer speed and storage capacity, with the limits that I mentioned earlier on the size of the observable universe, which come from the cosmological constant.  If we do so, we get an upper bound of ~10122 on the number of bits that can ever be involved in any computation in our world, no matter how large: if we tried to do a bigger computation than that, the far parts of it would be receding away from us faster than the speed of light.  In some sense, this 10122 is the most fundamental number that sets the scale of our universe: on the current conception of physics, everything you’ve ever seen or done, or will see or will do, can be represented by a sequence of at most 10122 ones and zeroes.

Having said that, in math, computer science, and many other fields (including physics itself), many of us meet bigger numbers than 10122 dozens of times before breakfast! How so? Mostly because we choose to ask, not about the number of things that are, but about the number of possible ways they could be—not about the size of ordinary 3-dimensional space, but the sizes of abstract spaces of possible configurations. And the latter are subject to exponential growth, continuing way beyond 10122.

As an example, let’s ask: how many different novels could possibly be written (say, at most 400 pages long, with a normal-size font, yadda yadda)? Well, we could get a lower bound on the number just by walking around here at Festivaletteratura, but the number that could be written certainly far exceeds the number that have been written or ever will be. This was the subject of Jorge Luis Borges’ famous story The Library of Babel, which imagined an immense library containing every book that could possibly be written up to a certain length. Of course, the vast majority of the books are filled with meaningless nonsense, but among their number one can find all the great works of literature, books predicting the future of humanity in perfect detail, books predicting the future except with a single error, etc. etc. etc.

To get more quantitative, let’s simply ask: how many different ways are there to fill the first page of a novel?  Let’s go ahead and assume that the page is filled with intelligible (or at least grammatical) English text, rather than arbitrary sequences of symbols, at a standard font size and page size.  In that case, using standard estimates for the entropy (i.e., compressibility) of English, I estimated this morning that there are maybe ~10700 possibilities.  So, forget about the rest of the novel: there are astronomically more possible first pages than could fit in the observable universe!

We could likewise ask: how many chess games could be played?  I’ve seen estimates from 1040 up to 10120, depending on whether we count only “sensible” games or also “absurd” ones (though in all cases, with a limit on the length of the game as might occur in a real competition). For Go, by contrast, which is played on a larger board (19×19 rather than 8×8) the estimates for the number of possible games seem to start at 10800 and only increase from there. This difference in magnitudes has something to do with why Go is a “harder” game than chess, why computers were able to beat the world chess champion already in 1997, but the world Go champion not until last year.

Or we could ask: given a thousand cities, how many routes are there for a salesman that visit each city exactly once? We write the answer as 1000!, pronounced “1000 factorial,” which just means 1000×999×998×…×2×1: there are 1000 choices for the first city, then 999 for the second city, 998 for the third, and so on.  This number is about 4×102567.  So again, more possible routes than atoms in the visible universe, yadda yadda.

But suppose the salesman is interested only in the shortest route that visits each city, given the distance between every city and every other.  We could then ask: to find that shortest route, would a computer need to search exhaustively through all 1000! possibilities—or, maybe not all 1000!, maybe it could be a bit more clever than that, but at any rate, a number that grew exponentially with the number of cities n?  Or could there be an algorithm that zeroed in on the shortest route dramatically faster: say, using a number of steps that grew only linearly or quadratically with the number of cities?

This, modulo a few details, is one of the most famous unsolved problems in all of math and science.  You may have heard of it; it’s called P versus NP.  P (Polynomial-Time) is the class of problems that an ordinary digital computer can solve in a “reasonable” amount of time, where we define “reasonable” to mean, growing at most like the size of the problem (for example, the number of cities) raised to some fixed power.  NP (Nondeterministic Polynomial-Time) is the class for which a computer can at least recognize a solution in polynomial-time.  If P=NP, it would mean that for every combinatorial problem of this sort, for which a computer could recognize a valid solution—Sudoku puzzles, scheduling airline flights, fitting boxes into the trunk of a car, etc. etc.—there would be an algorithm that cut through the combinatorial explosion of possible solutions, and zeroed in on the best one.  If P≠NP, it would mean that at least some problems of this kind required astronomical time, regardless of how cleverly we programmed our computers.

Most of us believe that P≠NP—indeed, I like to say that if we were physicists, we would’ve simply declared P≠NP a “law of nature,” and given ourselves Nobel Prizes for the discovery of the law!  And if it turned out that P=NP, we’d just give ourselves more Nobel Prizes for the law’s overthrow.  But because we’re mathematicians and computer scientists, we call it a “conjecture.”

Another famous example of an NP problem is: I give you (say) a 2000-digit number, and I ask you to find its prime factors.  Multiplying two thousand-digit numbers is easy, at least for a computer, but factoring the product back into primes seems astronomically hard—at least, with our present-day computers running any known algorithm.  Why does anyone care?  Well, you might know that, any time you order something online—in fact, every time you see a little padlock icon in your web browser—your personal information, like (say) your credit card number, is being protected by a cryptographic code that depends on the belief that factoring huge numbers is hard, or a few closely-related beliefs.  If P=NP, then those beliefs would be false, and indeed all cryptography that depends on hard math problems would be breakable in “reasonable” amounts of time.

In the special case of factoring, though—and of the other number theory problems that underlie modern cryptography—it wouldn’t even take anything as shocking as P=NP for them to fall.  Actually, that provides a good segue into another case where exponentials, and numbers vastly larger than 10122, regularly arise in the real world: quantum mechanics.

Some of you might have heard that quantum mechanics is complicated or hard.  But I can let you in on a secret, which is that it’s incredibly simple once you take the physics out of it!  Indeed, I think of quantum mechanics as not exactly even “physics,” but more like an operating system that the rest of physics runs on as application programs.  It’s a certain generalization of the rules of probability.  In one sentence, the central thing quantum mechanics says is that, to fully describe a physical system, you have to assign a number called an “amplitude” to every possible configuration that the system could be found in.  These amplitudes are used to calculate the probabilities that the system will be found in one configuration or another if you look at it.  But the amplitudes aren’t themselves probabilities: rather than just going from 0 to 1, they can be positive or negative or even complex numbers.

For us, the key point is that, if we have a system with (say) a thousand interacting particles, then the rules of quantum mechanics say we need at least 21000 amplitudes to describe it—which is way more than we could write down on pieces of paper filling the entire observable universe!  In some sense, chemists and physicists knew about this immensity since 1926.  But they knew it mainly as a practical problem: if you’re trying to simulate quantum mechanics on a conventional computer, then as far as we know, the resources needed to do so increase exponentially with the number of particles being simulated.  Only in the 1980s did a few physicists, such as Richard Feynman and David Deutsch, suggest “turning the lemon into lemonade,” and building computers that themselves would exploit the exponential growth of amplitudes.  Supposing we built such a computer, what would it be good for?  At the time, the only obvious application was simulating quantum mechanics itself!  And that’s probably still the most important application today.

In 1994, though, a guy named Peter Shor made a discovery that dramatically increased the level of interest in quantum computers.  That discovery was that a quantum computer, if built, could factor an n-digit number using a number of steps that grows only like about n2, rather than exponentially with n.  The upshot is that, if and when practical quantum computers are built, they’ll be able to break almost all the cryptography that’s currently used to secure the Internet.

(Right now, only small quantum computers have been built; the record for using Shor’s algorithm is still to factor 21 into 3×7 with high statistical confidence!  But Google is planning within the next year or so to build a chip with 49 quantum bits, or qubits, and other groups around the world are pursuing parallel efforts.  Almost certainly, 49 qubits still won’t be enough to do anything useful, including codebreaking, but it might be enough to do something classically hard, in the sense of taking at least ~249 or 563 trillion steps to simulate classically.)

I should stress, though, that for other NP problems—including breaking various other cryptographic codes, and solving the Traveling Salesman Problem, Sudoku, and the other combinatorial problems mentioned earlier—we don’t know any quantum algorithm analogous to Shor’s factoring algorithm.  For these problems, we generally think that a quantum computer could solve them in roughly the square root of the number of steps that would be needed classically, because of another famous quantum algorithm called Grover’s algorithm.  But getting an exponential quantum speedup for these problems would, at the least, require an additional breakthrough.  No one has proved that such a breakthrough in quantum algorithms is impossible: indeed, no one has proved that it’s impossible even for classical algorithms; that’s the P vs. NP question!  But most of us regard it as unlikely.

If we’re right, then the upshot is that quantum computers are not magic bullets: they might yield dramatic speedups for certain special problems (like factoring), but they won’t tame the curse of exponentiality, cut through to the optimal solution, every time we encounter a Library-of-Babel-like profusion of possibilities.  For (say) the Traveling Salesman Problem with a thousand cities, even a quantum computer—which is the most powerful kind of computer rooted in known laws of physics—might, for all we know, take longer than the age of the universe to find the shortest route.

The truth is, though, the biggest numbers that show up in math are way bigger than anything we’ve discussed until now: bigger than 10122, or even

$$ 2^{10^{122}}, $$

which is a rough estimate for the number of quantum-mechanical amplitudes needed to describe our observable universe.

For starters, there’s Skewes’ number, which the mathematician G. H. Hardy once called “the largest number which has ever served any definite purpose in mathematics.”  Let π(x) be the number of prime numbers up to x: for example, π(10)=4, since we have 2, 3, 5, and 7.  Then there’s a certain estimate for π(x) called li(x).  It’s known that li(x) overestimates π(x) for an enormous range of x’s (up to trillions and beyond)—but then at some point, it crosses over and starts underestimating π(x) (then overestimates again, then underestimates, and so on).  Skewes’ number is an upper bound on the location of the first such crossover point.  In 1955, Skewes proved that the first crossover must happen before

$$ x = 10^{10^{10^{964}}}. $$

Note that this bound has since been substantially improved, to 1.4×10316.  But no matter: there are numbers vastly bigger even than Skewes’ original estimate, which have since shown up in Ramsey theory and other parts of logic and combinatorics to take Skewes’ number’s place.

Alas, I won’t have time here to delve into specific (beautiful) examples of such numbers, such as Graham’s number.  So in lieu of that, let me just tell you about the sorts of processes, going far beyond exponentiation, that tend to yield such numbers.

The starting point is to remember a sequence of operations we all learn about in elementary school, and then ask why the sequence suddenly and inexplicably stops.

As long as we’re only talking about positive integers, “multiplication” just means “repeated addition.”  For example, 5×3 means 5 added to itself 3 times, or 5+5+5.

Likewise, “exponentiation” just means “repeated multiplication.”  For example, 53 means 5×5×5.

But what’s repeated exponentiation?  For that we introduce a new operation, which we call tetration, and write like so: 35 means 5 raised to itself 3 times, or

$$ ^{3} 5 = 5^{5^5} = 5^{3125} \approx 1.9 \times 10^{2184}. $$

But we can keep going. Let x pentated to the y, or xPy, mean x tetrated to itself y times.  Let x sextated to the y, or xSy, mean x pentated to itself y times, and so on.

Then we can define the Ackermann function, invented by the mathematician Wilhelm Ackermann in 1928, which cuts across all these operations to get more rapid growth than we could with any one of them alone.  In terms of the operations above, we can give a slightly nonstandard, but perfectly serviceable, definition of the Ackermann function as follows:

A(1) is 1+1=2.

A(2) is 2×2=4.

A(3) is 3 to the 3rd power, or 33=27.

Not very impressive so far!  But wait…

A(4) is 4 tetrated to the 4, or

$$ ^{4}4 = 4^{4^{4^4}} = 4^{4^{256}} = BIG $$

A(5) is 5 pentated to the 5, which I won’t even try to simplify.  A(6) is 6 sextated to the 6.  And so on.

More than just a curiosity, the Ackermann function actually shows up sometimes in math and theoretical computer science.  For example, the inverse Ackermann function—a function α such that α(A(n))=n, which therefore grows as slowly as the Ackermann function grows quickly, and which is at most 4 for any n that would ever arise in the physical universe—sometimes appears in the running times of real-world algorithms.

In the meantime, though, the Ackermann function also has a more immediate application.  Next time you find yourself in a biggest-number contest, like the one with which we opened this talk, you can just write A(1000), or even A(A(1000)) (after specifying that A means the Ackermann function above).  You’ll win—period—unless your opponent has also heard of something Ackermann-like or beyond.

OK, but Ackermann is very far from the end of the story.  If we want to go incomprehensibly beyond it, the starting point is the so-called “Berry Paradox”, which was first described by Bertrand Russell, though he said he learned it from a librarian named Berry.  The Berry Paradox asks us to imagine leaping past exponentials, the Ackermann function, and every other particular system for naming huge numbers.  Instead, why not just go straight for a single gambit that seems to beat everything else:

The biggest number that can be specified using a hundred English words or fewer

Why is this called a paradox?  Well, do any of you see the problem here?

Right: if the above made sense, then we could just as well have written

Twice the biggest number that can be specified using a hundred English words or fewer

But we just specified that number—one that, by definition, takes more than a hundred words to specify—using far fewer than a hundred words!  Whoa.  What gives?

Most logicians would say the resolution of this paradox is simply that the concept of “specifying a number with English words” isn’t precisely defined, so phrases like the ones above don’t actually name definite numbers.  And how do we know that the concept isn’t precisely defined?  Why, because if it was, then it would lead to paradoxes like the Berry Paradox!

So if we want to escape the jaws of logical contradiction, then in this gambit, we ought to replace English by a clear, logical language: one that can be used to specify numbers in a completely unambiguous way.  Like … oh, I know!  Why not write:

The biggest number that can be specified using a computer program that’s at most 1000 bytes long

To make this work, there are just two issues we need to get out of the way.  First, what does it mean to “specify” a number using a computer program?  There are different things it could mean, but for concreteness, let’s say a computer program specifies a number N if, when you run it (with no input), the program runs for exactly N steps and then stops.  A program that runs forever doesn’t specify any number.

The second issue is, which programming language do we have in mind: BASIC? C? Python?  The answer is that it won’t much matter!  The Church-Turing Thesis, one of the foundational ideas of computer science, implies that every “reasonable” programming language can emulate every other one.  So the story here can be repeated with just about any programming language of your choice.  For concreteness, though, we’ll pick one of the first and simplest programming languages, namely “Turing machine”—the language invented by Alan Turing all the way back in 1936!

In the Turing machine language, we imagine a one-dimensional tape divided into squares, extending infinitely in both directions, and with all squares initially containing a “0.”  There’s also a tape head with n “internal states,” moving back and forth on the tape.  Each internal state contains an instruction, and the only allowed instructions are: write a “0” in the current square, write a “1” in the current square, move one square left on the tape, move one square right on the tape, jump to a different internal state, halt, and do any of the previous conditional on whether the current square contains a “0” or a “1.”

Using Turing machines, in 1962 the mathematician Tibor Radó invented the so-called Busy Beaver function, or BB(n), which allowed naming by far the largest numbers anyone had yet named.  BB(n) is defined as follows: consider all Turing machines with n internal states.  Some of those machines run forever, when started on an all-0 input tape.  Discard them.  Among the ones that eventually halt, there must be some machine that runs for a maximum number of steps before halting.  However many steps that is, that’s what we call BB(n), the nth Busy Beaver number.

The first few values of the Busy Beaver function have actually been calculated, so let’s see them.

BB(1) is 1.  For a 1-state Turing machine on an all-0 tape, the choices are limited: either you halt in the very first step, or else you run forever.

BB(2) is 6, as isn’t too hard to verify by trying things out with pen and paper.

BB(3) is 21: that determination was already a research paper.

BB(4) is 107 (another research paper).

Much like with the Ackermann function, not very impressive yet!  But wait:

BB(5) is not yet known, but it’s known to be at least 47,176,870.

BB(6) is at least 7.4×1036,534.

BB(7) is at least

$$ 10^{10^{10^{10^{18,000,000}}}}. $$

Clearly we’re dealing with a monster here, but can we understand just how terrifying of a monster?  Well, call a sequence f(1), f(2), … computable, if there’s some computer program that takes n as input, runs for a finite time, then halts with f(n) as its output.  To illustrate, f(n)=n2, f(n)=2n, and even the Ackermann function that we saw before are all computable.

But I claim that the Busy Beaver function grows faster than any computable function.  Since this talk should have at least some math in it, let’s see a proof of that claim.

Maybe the nicest way to see it is this: suppose, to the contrary, that there were a computable function f that grew at least as fast as the Busy Beaver function.  Then by using that f, we could take the Berry Paradox from before, and turn it into an actual contradiction in mathematics!  So for example, suppose the program to compute f were a thousand bytes long.  Then we could write another program, not much longer than a thousand bytes, to run for (say) 2×f(1000000) steps: that program would just need to include a subroutine for f, plus a little extra code to feed that subroutine the input 1000000, and then to run for 2×f(1000000) steps.  But by assumption, f(1000000) is at least the maximum number of steps that any program up to a million bytes long can run for—even though we just wrote a program, less than a million bytes long, that ran for more steps!  This gives us our contradiction.  The only possible conclusion is that the function f, and the program to compute it, couldn’t have existed in the first place.

(As an alternative, rather than arguing by contradiction, one could simply start with any computable function f, and then build programs that compute f(n) for various “hardwired” values of n, in order to show that BB(n) must grow at least as rapidly as f(n).  Or, for yet a third proof, one can argue that, if any upper bound on the BB function were computable, then one could use that to solve the halting problem, which Turing famously showed to be uncomputable in 1936.)

In some sense, it’s not so surprising that the BB function should grow uncomputably quickly—because if it were computable, then huge swathes of mathematical truth would be laid bare to us.  For example, suppose we wanted to know the truth or falsehood of the Goldbach Conjecture, which says that every even number 4 or greater can be written as a sum of two prime numbers.  Then we’d just need to write a program that checked each even number one by one, and halted if and only if it found one that wasn’t a sum of two primes.  Suppose that program corresponded to a Turing machine with N states.  Then by definition, if it halted at all, it would have to halt after at most BB(N) steps.  But that means that, if we knew BB(N)—or even any upper bound on BB(N)—then we could find out whether our program halts, by simply running it for the requisite number of steps and seeing.  In that way we’d learn the truth or falsehood of Goldbach’s Conjecture—and similarly for the Riemann Hypothesis, and every other famous unproved mathematical conjecture (there are a lot of them) that can be phrased in terms of a computer program never halting.

(Here, admittedly, I’m using “we could find” in an extremely theoretical sense.  Even if someone handed you an N-state Turing machine that ran for BB(N) steps, the number BB(N) would be so hyper-mega-astronomical that, in practice, you could probably never distinguish the machine from one that simply ran forever.  So the aforementioned “strategy” for proving Goldbach’s Conjecture, or the Riemann Hypothesis would probably never yield fruit before the heat death of the universe, even though in principle it would reduce the task to a “mere finite calculation.”)

OK, you wanna know something else wild about the Busy Beaver function?  In 2015, my former student Adam Yedidia and I wrote a paper where we proved that BB(8000)—i.e., the 8000th Busy Beaver number—can’t be determined using the usual axioms for mathematics, which are called Zermelo-Fraenkel (ZF) set theory.  Nor can B(8001) or any larger Busy Beaver number.

To be sure, BB(8000) has some definite value: there are finitely many 8000-state Turing machines, and each one either halts or runs forever, and among the ones that halt, there’s some maximum number of steps that any of them runs for.  What we showed is that math, if it limits itself to the currently-accepted axioms, can never prove the value of BB(8000), even in principle.

The way we did that was by explicitly constructing an 8000-state Turing machine, which (in effect) enumerates all the consequences of the ZF axioms one after the next, and halts if and only if it ever finds a contradiction—that is, a proof of 0=1.  Presumably set theory is actually consistent, and therefore our program runs forever.  But if you proved the program ran forever, you’d also be proving the consistency of set theory.  And has anyone heard of any obstacle to doing that?  Of course, Gödel’s Incompleteness Theorem!  Because of Gödel, if set theory is consistent (well, technically, also arithmetically sound), then it can’t prove our program either halts or runs forever.  But that means set theory can’t determine BB(8000) either—because if it could do that, then it could also determine the behavior of our program.

To be clear, it was long understood that there’s some computer program that halts if and only if set theory is inconsistent—and therefore, that the axioms of set theory can determine at most k values of the Busy Beaver function, for some positive integer k.  “All” Adam and I did was to prove the first explicit upper bound, k≤8000, which required a lot of optimizations and software engineering to get the number of states down to something reasonable (our initial estimate was more like k≤1,000,000).  More recently, Stefan O’Rear has improved our bound—most recently, he says, to k≤1000, meaning that, at least by the lights of ZF set theory, fewer than a thousand values of the BB function can ever be known.

Meanwhile, let me remind you that, at present, only four values of the function are known!  Could the value of BB(100) already be independent of set theory?  What about BB(10)?  BB(5)?  Just how early in the sequence do you leap off into Platonic hyperspace?  I don’t know the answer to that question but would love to.

Ah, you ask, but is there any number sequence that grows so fast, it blows even the Busy Beavers out of the water?  There is!

Imagine a magic box into which you could feed in any positive integer n, and it would instantly spit out BB(n), the nth Busy Beaver number.  Computer scientists call such a box an “oracle.”  Even though the BB function is uncomputable, it still makes mathematical sense to imagine a Turing machine that’s enhanced by the magical ability to access a BB oracle any time it wants: call this a “super Turing machine.”  Then let SBB(n), or the nth super Busy Beaver number, be the maximum number of steps that any n-state super Turing machine makes before halting, if given no input.

By simply repeating the reasoning for the ordinary BB function, one can show that, not only does SBB(n) grow faster than any computable function, it grows faster than any function computable by super Turing machines (for example, BB(n), BB(BB(n)), etc).

Let a super duper Turing machine be a Turing machine with access to an oracle for the super Busy Beaver numbers.  Then you can use super duper Turing machines to define a super duper Busy Beaver function, which you can use in turn to define super duper pooper Turing machines, and so on!

Let “level-1 BB” be the ordinary BB function, let “level-2 BB” be the super BB function, let “level 3 BB” be the super duper BB function, and so on.  Then clearly we can go to “level-k BB,” for any positive integer k.

But we need not stop even there!  We can then go to level-ω BB.  What’s ω?  Mathematicians would say it’s the “first infinite ordinal”—the ordinals being a system where you can pass from any set of numbers you can possibly name (even an infinite set), to the next number larger than all of them.  More concretely, the level-ω Busy Beaver function is simply the Busy Beaver function for Turing machines that are able, whenever they want, to call an oracle to compute the level-k Busy Beaver function, for any positive integer k of their choice.

But why stop there?  We can then go to level-(ω+1) BB, which is just the Busy Beaver function for Turing machines that are able to call the level-ω Busy Beaver function as an oracle.  And thence to level-(ω+2) BB, level-(ω+3) BB, etc., defined analogously.  But then we can transcend that entire sequence and go to level-2ω BB, which involves Turing machines that can call level-(ω+k) BB as an oracle for any positive integer k.  In the same way, we can pass to level-3ω BB, level-4ω BB, etc., until we transcend that entire sequence and pass to level-ω2 BB, which can call any of the previous ones as oracles.  Then we have level-ω3 BB, level-ω4 BB, etc., until we transcend that whole sequence with level-ωω BB.  But we’re still not done!  For why not pass to level

$$ \omega^{\omega^{\omega}} $$,


$$ \omega^{\omega^{\omega^{\omega}}} $$,

etc., until we reach level

$$ \left. \omega^{\omega^{\omega^{.^{.^{.}}}}}\right\} _{\omega\text{ times}} $$?

(This last ordinal is also called ε0.)  And mathematicians know how to keep going even to way, way bigger ordinals than ε0, which give rise to ever more rapidly-growing Busy Beaver sequences.  Ordinals achieve something that on its face seems paradoxical, which is to systematize the concept of transcendence.

So then just how far can you push this?  Alas, ultimately the answer depends on which axioms you assume for mathematics.  The issue is this: once you get to sufficiently enormous ordinals, you need some systematic way to specify them, say by using computer programs.  But then the question becomes which ordinals you can “prove to exist,” by giving a computer program together with a proof that the program does what it’s supposed to do.  The more powerful the axiom system, the bigger the ordinals you can prove to exist in this way—but every axiom system will run out of gas at some point, only to be transcended, in Gödelian fashion, by a yet more powerful system that can name yet larger ordinals.

So for example, if we use Peano arithmetic—invented by the Italian mathematician Giuseppe Peano—then Gentzen proved in the 1930s that we can name any ordinals below ε0, but not ε0 itself or anything beyond it.  If we use ZF set theory, then we can name vastly bigger ordinals, but once again we’ll eventually run out of steam.

(Technical remark: some people have claimed that we can transcend this entire process by passing from first-order to second-order logic.  But I fundamentally disagree, because with second-order logic, which number you’ve named could depend on the model of set theory, and therefore be impossible to pin down.  With the ordinal Busy Beaver numbers, by contrast, the number you’ve named might be breathtakingly hopeless ever to compute—but provided the notations have been fixed, and the ordinals you refer to actually exist, at least we know there is a unique positive integer that you’re talking about.)

Anyway, the upshot of all of this is that, if you try to hold a name-the-biggest-number contest between two actual professionals who are trying to win, it will (alas) degenerate into an argument about the axioms of set theory.  For the stronger the set theory you’re allowed to assume consistent, the bigger the ordinals you can name, therefore the faster-growing the BB functions you can define, therefore the bigger the actual numbers.

So, yes, in the end the biggest-number contest just becomes another Gödelian morass, but one can get surprisingly far before that happens.

In the meantime, our universe seems to limit us to at most 10122 choices that could ever be made, or experiences that could ever be had, by any one observer.  Or fewer, if you believe that you won’t live until the heat death of the universe in some post-Singularity computer cloud, but for at most about 102 years.  In the meantime, the survival of the human race might hinge on people’s ability to understand much smaller numbers than 10122: for example, a billion, a trillion, and other numbers that characterize the exponential growth of our civilization and the limits that we’re now running up against.

On a happier note, though, if our goal is to make math engaging to young people, or to build bridges between the quantitative and literary worlds, the way this festival is doing, it seems to me that it wouldn’t hurt to let people know about the vastness that’s out there.  Thanks for your attention.

HTTPS / Kurtz / eclipse / Charlottesville / Blum / P vs. NP

Friday, August 25th, 2017

This post has a grab bag of topics, unified only by the fact that I can no longer put off blogging about them. So if something doesn’t interest you, just scroll down till you find something that does.

Great news, everyone: following a few reader complaints about the matter, the domain now supports https—and even automatically redirects to it! I’m so proud that Shtetl-Optimized has finally entered the technological universe of 1994. Thanks so much to heroic reader Martin Dehnel-Wild for setting this up for me.

Update 26/08/2017: Comments should now be working again; comments are now coming through to the moderated view in the blog’s control panel, so if they don’t show up immediately it might just be awaiting moderation. Thanks for your patience.

Last weekend, I was in Columbia, South Carolina, for a workshop to honor the 60th birthday of Stuart Kurtz, theoretical computer scientist at the University of Chicago.  I gave a talk about how work Kurtz was involved in from the 1990s—for example, on defining the complexity class GapP, and constructing oracles that satisfy conflicting requirements simultaneously—plays a major role in modern research on quantum computational supremacy: as an example, my recent paper with Lijie Chen.  (Except, what a terrible week to be discussing the paths to supremacy!  I promise there are no tiki torches involved, only much weaker photon sources.)

Coincidentally, I don’t know if you read anything about this on social media, but there was this total solar eclipse that passed right over Columbia at the end of the conference.

I’d always wondered why some people travel to remote corners of the earth to catch these.  So the sky gets dark for two minutes, and then it gets light again, in a way that’s been completely understood and predictable for centuries?

Having seen it, I can now tell you the deal, if you missed it and prefer to read about it here rather than 10500 other places online.  At risk of stating the obvious: it’s not the dark sky; it’s the sun’s corona visible around the moon.  Ironically, it’s only when the sun’s blotted out that you can actually look at the sun, at all the weird stuff going on around its disk.

OK, but totality is “only” to eclipses as orgasms are to sex.  There’s also the whole social experience of standing around outside with friends for an hour as the moon gradually takes a bigger bite out of the sun, staring up from time to time with eclipse-glasses to check its progress—and then everyone breaking into applause as the sky finally goes mostly dark, and you can look at the corona with the naked eye.  And then, if you like, standing around for another hour as the moon gradually exits the other way.  (If you’re outside the path of totality, this standing around and checking with eclipse-glasses is the whole experience.)

One cool thing is that, a little before and after totality, shadows on the ground have little crescents in them, as if the eclipse is imprinting its “logo” all over the earth.

For me, the biggest lesson the eclipse drove home was the logarithmic nature of perceived brightness (see also Scott Alexander’s story).  Like, the sun can be more than 90% occluded, and yet it’s barely a shade darker outside.  And you can still only look up with glasses so dark that they blot out everything except the sliver of sun, which still looks pretty much like the normal sun if you catch it out of the corner of your unaided eye.  Only during totality, and a few minutes before and after, is the darkening obvious.

Another topic at the workshop, unsurprisingly, was the ongoing darkening of the United States.  If it wasn’t obvious from my blog’s name, and if saying so explicitly will make any difference for anything, let the record state:

Shtetl-Optimized condemns Nazis, as well as anyone who knowingly marches with Nazis or defends them as “fine people.”

For a year, this blog has consistently described the now-president as a thug, liar, traitor, bully, sexual predator, madman, racist, and fraud, and has urged decent people everywhere to fight him by every peaceful and legal means available.  But if there’s some form of condemnation that I accidentally missed, then after Charlottesville, and Trump’s unhinged quasi-defenses of violent neo-Nazis, and defenses of his previous defenses, etc.—please consider Shtetl-Optimized to have condemned Trump that way also.

At least Charlottesville seems to have set local decisionmakers on an unstoppable course toward removing the country’s remaining Confederate statues—something I strongly supported back in May, before it had become the fully thermonuclear issue that it is now.  In an overnight operation, UT Austin has taken down its statues of Robert E. Lee, Albert Johnston, John Reagan, and Stephen Hogg.  (I confess, the postmaster general of the Confederacy wouldn’t have been my #1 priority for removal.  And, genuine question: what did Texas governor Stephen Hogg do that was so awful for his time, besides naming his daughter Ima Hogg?)

A final thing to talk about—yeah, we can’t avoid it—is Norbert Blum’s claimed proof of P≠NP.  I suppose I should be gratified that, after my last post, there were commenters who said, “OK, but enough about gender politics—what about P vs. NP?”  Here’s what I wrote on Tuesday the 15th:

To everyone who keeps asking me about the “new” P≠NP proof: I’d again bet $200,000 that the paper won’t stand, except that the last time I tried that, it didn’t achieve its purpose, which was to get people to stop asking me about it. So: please stop asking, and if the thing hasn’t been refuted by the end of the week, you can come back and tell me I was a closed-minded fool.

Many people misunderstood me to be saying that I’d again bet $200,000, even though the sentence said the exact opposite.  Maybe I should’ve said: I’m searching in vain for the right way to teach the nerd world to get less excited about these claims, to have the same reaction that the experts do, which is ‘oh boy, not another one’—which doesn’t mean that you know the error, or even that there is an error, but just means that you know the history.

Speaking of which, some friends and I recently had an awesome idea.  Just today, I registered the domain name  I’d like to set this up with a form that lets you type in the URL of a paper claiming to resolve the P vs. NP problem.  The site will then take 30 seconds or so to process the paper—with a status bar, progress updates, etc.—before finally rendering a verdict about the paper’s correctness.  Do any readers volunteer to help me create this?  Don’t worry, I’ll supply the secret algorithm to decide correctness, and will personally vouch for that algorithm for as long as the site remains live.

I have nothing bad to say about Norbert Blum, who made important contributions including the 3n circuit size lower bound for an explicit Boolean function—something that stood until very recently as the world record—and whose P≠NP paper was lucidly written, passing many of the most obvious checks.  And I received a bit of criticism for my “dismissive” stance.  Apparently, some right-wing former string theorist who I no longer read, whose name rhymes with Mubos Lotl, even accused me of being a conformist left-wing ideologue, driven to ignore Blum’s proof by an irrational conviction that any P≠NP proof will necessarily be so difficult that it will need to “await the Second Coming of Christ.”  Luca Trevisan’s reaction to that is worth quoting:

I agree with [Mubos Lotl] that the second coming of Jesus Christ is not a necessary condition for a correct proof that P is different from NP. I am keeping an open mind as to whether it is a sufficient condition.

On reflection, though, Mubos has a point: all of us, including me, should keep an open mind.  Maybe P≠NP (or P=NP!) is vastly easier to prove than most experts think, and is susceptible to a “fool’s mate.”

That being the case, it’s only intellectual honesty that compels me to report that, by about Friday of last week—i.e., exactly on my predicted schedule—a clear consensus had developed among experts that Blum’s P≠NP proof was irreparably flawed, and the consensus has stood since that time.

I’ve often wished that, even just for an hour or two, I could be free from this terrifying burden that I’ve carried around since childhood: the burden of having the right instincts about virtually everything.  Trust me, this “gift” is a lot less useful than it sounds, especially when reality so often contradicts what’s popular or expedient to say.

The background to Blum’s attempt, the counterexample that shows the proof has to fail somewhere, and the specifics of what appears to go wrong have already been covered at length elsewhere: see especially Luca’s post, Dick Lipton’s post, John Baez’s post, and the CS Theory StackExchange thread.

Very briefly, though: Blum claims to generalize some of the most celebrated complexity results of the 1980s—namely, superpolynomial lower bounds on the sizes of monotone circuits, which consist entirely of Boolean AND and OR gates—so that they also work for general (non-monotone) circuits, consisting of AND, OR, and NOT gates.  Everyone agrees that, if this succeeded, it would imply P≠NP.

Alas, another big discovery from the 1980s was that there are monotone Boolean functions (like Perfect Matching) that require superpolynomial-size monotone circuits, even though they have polynomial-size non-monotone circuits.  Why is that such a bummer?  Because it means our techniques for proving monotone circuit lower bounds can’t possibly work in as much generality as one might’ve naïvely hoped: if they did, they’d imply not merely that P doesn’t contain NP, but also that P doesn’t contain itself.

Blum was aware of all this, and gave arguments as to why his approach evades the Matching counterexample.  The trouble is, there’s another counterexample, which Blum doesn’t address, called Tardos’s function.  This is a weird creature: it’s obtained by starting with a graph invariant called the Lovász theta function, then looking at a polynomial-time approximation scheme for the theta function, and finally rounding the output of that PTAS to get a monotone function.  But whatever: in constructing this function, Tardos achieved her goal, which was to produce a monotone function that all known lower bound techniques for monotone circuits work perfectly fine for, but which is nevertheless in P (i.e., has polynomial-size non-monotone circuits).  In particular, if Blum’s proof worked, then it would also work for Tardos’s function, and that gives us a contradiction.

Of course, this merely tells us that Blum’s proof must have one or more mistakes; it doesn’t pinpoint where they are.  But the latter question has now been addressed as well.  On CS StackExchange, an anonymous commenter who goes variously by “idolvon” and “vloodin” provides a detailed analysis of the proof of Blum’s crucial Theorem 6.  I haven’t gone through every step myself, and there might be more to say about the matter than “vloodin” has, but several experts who are at once smarter, more knowledgeable, more cautious, and more publicity-shy than me have confirmed for me that vloodin correctly identified the erroneous region.

To those who wonder what gave me the confidence to call this immediately, without working through the details: besides the Cassandra-like burden that I was born with, I can explain something that might be helpful.  When Razborov achieved his superpolynomial monotone lower bounds in the 1980s, there was a brief surge of excitement: how far away could a P≠NP proof possibly be?  But then people, including Razborov himself, understood much more deeply what was going on—an understanding that was reflected in the theorems they proved, but also wasn’t completely captured by those theorems.

What was going on was this: monotone circuits are an interesting and nontrivial computational model.  Indeed for certain Boolean functions, such as the “slice functions,” they’re every bit as powerful as general circuits.  However, insofar as it’s possible to prove superpolynomial lower bounds on monotone circuit size, it’s possible only because monotone circuits are ridiculously less expressive than general Boolean circuits for the problems in question.  E.g., it’s possible only because monotone circuits aren’t expressing pseudorandom functions, and therefore aren’t engaging the natural proofs barrier or most of the other terrifying beasts that we’re up against.

So what can we say about the prospect that a minor tweak to the monotone circuit lower bound techniques from the 1980s would yield P≠NP?  If, like Mubos Lotl, you took the view that discrete math and theory of computation are just a mess of disconnected, random statements, then such a prospect would seem as likely to you as not.  But if you’re armed with the understanding above, then this possibility is a lot like the possibility that the OPERA experiment discovered superluminal neutrinos: no, not a logical impossibility, but something that’s safe to bet against at 10,000:1 odds.

During the discussion of Deolalikar’s earlier P≠NP claim, I once compared betting against a proof that all sorts of people are calling “formidable,” “solid,” etc., to standing in front of a huge pendulum—behind the furthest point that it reached the last time—even as it swings toward your face.  Just as certain physics teachers stake their lives on the conservation of energy, so I’m willing to stake my academic reputation, again and again, on the conservation of circuit-lower-bound difficulty.  And here I am, alive to tell the tale.