Archive for the ‘Nerd Interest’ Category

Edging in: the biggest science news of 2015

Sunday, January 3rd, 2016

For years, I was forced to endure life with my nose up against the glass of the Annual Edge Question.  What are you optimistic about?  Ooh! ooh! Call on me!  I’m optimistic about someday being able to prove my pessimistic beliefs (like P≠NP).  How is the Internet changing the way you think?  Ooh, ooh! I know! Google and MathOverflow are saving me from having to think at all!  So then why are they only asking Steven Pinker, Freeman Dyson, Richard Dawkins, David Deutsch, some random other people like that?

But all that has changed.  This year, I was invited to participate in Edge for the first time.  So, OK, here’s the question:

What do you consider the most interesting recent [scientific] news?  What makes it important?

My response is here.  I wasn’t in love with the question, because of what I saw as an inherent ambiguity in it: the news that’s most interesting to me, that I have a comparative advantage in talking about, and that people probably want to hear me talk about (e.g., progress in quantum computing), is not necessarily what I’d regard as the most important in any objective sense (e.g., climate change).  So, I decided to write my answer precisely about my internal tension in what I should consider most interesting: should it be the recent progress by John Martinis and others toward building a quantum computer?  Or should it be the melting glaciers, or something else that I’m confident will affect the future of the world?  Or possibly the mainstream attention now being paid to the AI-risk movement?  But if I really want to nerd out, then why not Babai’s graph isomorphism algorithm?  Or if I actually want to be honest about what excited me, then why not the superquadratic separations between classical and quantum query complexities for a total Boolean function, by Ambainis et al. and my student Shalev Ben-David?  On the other hand, how can I justify even caring about such things while the glaciers are melting?

So, yeah, my response tries to meditate on all those things.  My original title was “How nerdy do you want it?,” but John Brockman of Edge had me change it to something blander (“How widely should we draw the circle?”), and made a bunch of other changes from my usual style.  Initially I chafed at having an editor for what basically amounted to a blog post; on the other hand, I’m sure I would’ve gotten in trouble much less often on this blog had I had someone to filter my words for me.

Anyway, of course I wasn’t the only person to write about the climate crisis.  Robert Trivers, Laurence Smith, and Milford Wolpoff all wrote about it as well (Trivers most chillingly and concisely), while Max Tegmark wrote about the mainstreaming of AI risk.  John Naughton even wrote about Babai’s graph isomorphism breakthrough (though he seems unaware that the existing GI algorithms were already extremely fast in practice, and therefore makes misleading claims about the new algorithm’s practical applications).  Unsurprisingly, no one else wrote about breakthroughs in quantum query complexity: you’ll need to go to my essay for that!  A bit more surprisingly, no one besides me wrote about progress in quantum computing at all (if we don’t count the loophole-free Bell test).

Anyway, on reflection, 2015 actually was a pretty awesome year for science, no matter how nerdy you want it or how widely you draw the circle.  Here are other advances that I easily could’ve written about but didn’t:

I’ve now read all (more or less) of this year’s Edge responses.  Even though some of the respondents pushed personal hobbyhorses like I’d feared, I was impressed by how easy it was to discern themes: advances that kept cropping up in one answer after another and that one might therefore guess are actually important (or at least, are currently perceived to be important).

Probably at the top of the list was a new gene-editing technique called CRISPR: Randolph Nesse, Paul Dolan, Eric Topol, Mark Pagel, and Stuart Firestein among others all wrote about this, and about its implications for creating designer humans.

Also widely discussed was the discovery that most psychology studies fail to replicate (I’d long assumed as much, but apparently this was big news in psychology!): Nicholas Humphrey, Stephen Kosslyn, Jonathan Schooler, Ellen Winner, Judith Rich Harris, and Philip Tetlock all wrote about that.

Then there was the Pluto flyby, which Juan Enriquez, Roger Highfield, and Nicholas Christakis all wrote about.  (As Christakis, Master of Silliman College at Yale, was so recently a victim of a social-justice mob, I found it moving how he simply ignored those baying for his head and turned his attention heavenward in his Edge answer.)

Then there was progress in deep learning, including Google’s Deep Dream (those images of dogs in nebulae that filled your Facebook wall) and DeepMind (the program that taught itself how to play dozens of classic video games).  Steve Omohundro, Andy Clark, Jamshed Bharucha, Kevin Kelly, David Dalrymple, and Alexander Wissner-Gross all wrote about different aspects of this story.

And recent progress in SETI, which Yuri Milner (who’s given $100 million for it) and Mario Livio wrote about.

Unsurprisingly, a bunch of high-energy physicists wrote about high-energy physics at the LHC: how the Higgs boson was found (still news?), how nothing other than the Higgs boson was found (the biggest news?), but how there’s now the slightest hint of a new particle at 750 GeV.  See Lee Smolin, Garrett Lisi, Sean Carroll, and Sarah Demers.

Finally, way out on the Pareto frontier of importance and disgustingness was the recently discovered therapeutic value of transplanting one person’s poop into another person’s intestines, which Joichi Ito, Pamela Rosenkranz, and Alan Alda all wrote about (it also, predictably, featured in a recent South Park episode).

Without further ado, here are 27 other answers that struck me in one way or another:

  • Steven Pinker on happy happy things are getting better (and we can measure it)
  • Freeman Dyson on the Dragonfly astronomical observatory
  • Jonathan Haidt on how prejudice against people of differing political opinions was discovered to have surpassed racial, gender, and religious prejudice
  • S. Abbas Raza on Piketty’s r>g
  • Rebecca Newberger Goldstein, thoughtful as usual, on the recent study that said it’s too simple to say female participation is lower in STEM fields—rather, female participation is lower in all and only those fields, STEM or non-STEM, whose participants believe (rightly or wrongly) that “genius” is required rather than just conscientious effort
  • Bill Joy on recent advances on reducing CO2 emissions
  • Paul Steinhardt on recent observations saying that, not only were the previous “B-modes from inflation” just galactic dust, but there are no real B-modes to within the current detection limits, and this poses a problem for inflation (I hadn’t heard about this last part)
  • Aubrey de Grey on new antibiotics that are grown in the soil rather than in lab cultures
  • John Tooby on the evolutionary rationale for germline engineering
  • W. Tecumseh Fitch on the coming reality of the “Jurassic Park program” (bringing back extinct species through DNA splicing—though probably not dinosaurs, whose DNA is too degraded)
  • Keith Devlin on the new prospect of using massive datasets (from MOOCs, for example) to actually figure out how students learn
  • Richard Muller on how air pollution in China has become one of the world’s worst problems (imagine every child in Beijing being force-fed two packs of cigarettes per day)
  • Ara Norenzayan on the demographic trends in religious belief
  • James Croak on amazing advances in battery technology (which were news to me)
  • Buddhini Samarasinghe on (among other things) the power of aspirin to possibly prevent cancer
  • Todd Sacktor on a new treatment for Parkinson’s
  • Charles Seife on the imminent availability of data about pretty much everything in our lives
  • Susan Blackmore on “that dress” and what it revealed about the human visual system
  • Brian Keating on experiments that should soon tell us the neutrinos’ masses (again, I hadn’t heard about these)
  • Michael McCullough on something called “reproductive religiosity theory,” which posits that the central purpose of religions is to enforce social norms around mating and reproduction (for what it’s worth, I’d always regarded that as obvious; it’s even expounded in the last chapter of Quantum Computing Since Democritus)
  • Greg Cochran on the origin of Europeans
  • David Buss on the “mating crisis among educated women”
  • Ed Regis on how high-fat diets are better (except, isn’t this the principle behind Atkins, and isn’t this pretty old news by now?)
  • Melanie Swan on blockchain-based cryptography, such as Bitcoin (though it wasn’t entirely clear to me what point Swan was making about it)
  • Paul Davies on LIGO getting ready to detect its first gravitational waves
  • Samuel Arbesman on how weather prediction has gotten steadily better (rendering our culture’s jokes about the perpetually-wrong weatherman outdated, with hardly anyone noticing)
  • Alison Gopnik on how the ubiquity of touchscreen devices like the iPad means that toddlers can now master computers, and this is something genuinely new under the sun (I can testify from personal experience that she’s onto something)

Then there were three answers for which the “progress” being celebrated seemed to me to be progress racing faster into WrongVille:

  • Frank Tipler on how one can conclude a priori that there must be a Big Crunch to our future (and hence, the arena for Tiplerian theology) in order to prevent the black hole information paradox from arising, all recent cosmological evidence to the contrary be damned.
  • Ross Anderson on an exciting conference whose participants aim to replace quantum mechanics with local realistic theories.  (Anderson, in particular, is totally wrong that you can get Bell inequality violation from “a combination of local action and global correlation,” unless the global correlation goes as far as a ’t Hooft-like superdeterministic conspiracy.)
  • Gordon Kane on how the big news is that the LHC should soon see superparticles.  (This would actually be fine except that Kane omits the crucial context, that he’s been predicting superparticles just around the corner again and again for the past twenty years and they’ve never shown up.)

Finally, two responses by old friends that amused me.  The science-fiction writer Rudy Rucker just became aware of the discovery of the dark energy back in 1998, and considers that to be exciting scientific news (yes, Rudy, so it was!).  And Michael Vassar—the Kevin Bacon or Paul Erdős of the rationalist world, the guy who everyone’s connected to somehow—writes something about a global breakdown of economic rationality, $20 bills on the sidewalk getting ignored, that I had trouble understanding (though the fault is probably mine).

If I can’t do math, I don’t want to be part of your revolution

Thursday, December 3rd, 2015

1. Emma Goldman, the fiery early-20th-century anarchist, is credited with giving the world the immortal refrain “if I can’t dance, I don’t want to be part of your revolution” (actually it’s not clear that she ever said it so pithily, but she did express such a thought).  Admittedly, no one would mistake me for either a dancer or an anarchist, but I’ve always felt a kinship with Goldman over her terpsichorean line in the sand.  The other day, it occurred to me that there’s a parallel sentence that sums up my entire political philosophy—on the one hand, my default instinct to side with the downtrodden and with the progressive left, but on the other, my dissent from any even vaguely anti-STEM, anti-rationality, or anti-nerd undercurrents, and my refusal to join any popular uprising that seems liable (for example) to delay the discovery of a P≠NP proof, by inconveniencing the people working on one.

So, here’s my sentence, which you should feel free to reprint on t-shirts and coffee mugs as desired:

If I can’t do math, I don’t want to be part of your revolution.

2. Over at Scientific American’s website, John Horgan posted an account of a workshop on Integrated Information Theory, which I attended a couple weeks ago at NYU (along with David Chalmers, Giulio Tononi, Christof Koch, Max Tegmark, and a dozen or so others).  I was the “official skeptic” of the workshop, and gave a talk based on my blog post The Unconscious Expander.  I don’t really agree with what Horgan says about physics and information in general, but I do (of course) join him in his skepticism of IIT, and he gives a pretty accurate summary of what people said at the workshop.  (Alas, my joke about my lunch not being poisoned completely bombed with the IIT crowd … as I should’ve predicted!)  The workshop itself was lots of fun; thanks so much to David, Giulio, and Hedda Hassel Mørch for organizing it.

3. As you might have noticed, I’ve created a new category on this blog: “Obviously I’m Not Defending Aaronson.”  This category—reserved for posts that caused at least a hundred people to hate me—refers to a peculiar phrase I encountered over and over, in the social media threads denouncing me as a horrible person.  The phrase tends to occur in passages like: “look, obviously I’m not defending Aaronson, but it’s worth pointing out that, if you carefully reread everything he wrote, he never actually said that war orphans should be roasted alive and then eaten for fun.  That’s just something we all know that a clueless, arrogant nerd like him would think.”

4. Right now I’m at the “ThinkQ” conference at IBM in Yorktown Heights.  Here are the PowerPoint slides from my talk yesterday, entitled “The Largest Possible Quantum Speedups.”  Regular readers of this blog will find a lot that’s old and a little that’s new.

Talk, be merry, and be rational

Monday, November 23rd, 2015

Yesterday I wrote a statement on behalf of a Scott Alexander SlateStarCodex/rationalist meetup, which happened last night at MIT (in the same room where I teach my graduate class), and which I’d really wanted to attend but couldn’t.  I figured I’d share the statement here:

I had been looking forward to attending tonight’s MIT SlateStarCodex meetup as I hardly ever look forward to anything. Alas, I’m now stuck in Chicago, with my flight cancelled due to snow, and with all flights for the next day booked up. But instead of continuing to be depressed about it, I’ve decided to be happy that this meetup is even happening at all—that there’s a community of people who can read, let’s say, a hypothetical debate moderator questioning Ben Carson about what it’s like to be a severed half-brain, and simply be amused, instead of silently trying to figure out who benefits from the post and which tribe the writer belongs to. (And yes, I know: the answer is the gray tribe.) And you can find this community anywhere—even in Cambridge, Massachusetts! Look, I spend a lot of time online, just getting more and more upset reading social justice debates that are full of people calling each other douchebags without even being able to state anything in the same galactic supercluster as the other side’s case. And then what gives me hope for humanity is to click over to the slatestarcodex tab, and to see all the hundreds of comments (way more than my blog gets) by people who disagree with each other but who all basically get it, who all have minds that don’t make me despair. And to realize that, when Scott Alexander calls an SSC meetup, he can fill a room just about anywhere … well, at least anywhere I would visit. So talk, be merry, and be rational.

I’m now back in town, and told by people who attended the meetup that it was crowded, disorganized, and great.  And now I’m off to Harvard, to attend the other Scott A.’s talk “How To Ruin A Perfectly Good Randomized Controlled Trial.”


Update (Nov. 24) Scott Alexander’s talk at Harvard last night was one of the finest talks I’ve ever attended. He was introduced to rapturous applause as simply “the best blogger on the Internet,” and as finally an important speaker, in a talk series that had previously wasted everyone’s time with the likes of Steven Pinker and Peter Singer. (Scott demurred that his most notable accomplishment in life was giving the talk at Harvard that he was just now giving.) The actual content, as Scott warned from the outset, was “just” a small subset of a basic statistics course, but Scott brought each point alive with numerous recent examples, from psychiatry, pharmacology, and social sciences, where bad statistics or misinterpretations of statistics were accepted by nearly everyone and used to set policy. (E.g., Alcoholics Anonymous groups that claimed an “over 95%” success rate, because the people who relapsed were kicked out partway through and not counted toward the total.) Most impressively, Scott leapt immediately into the meat, ended after 20 minutes, and then spent the next two hours just taking questions. Scott is publicity-shy, but I hope for others’ sake that video of the talk will eventually make its way online.

Then, after the talk, I had the honor of meeting two fellow Boston-area rationalist bloggers, Kate Donovan and Jesse Galef. Yes, I said “fellow”: for almost a decade, I’ve considered myself on the fringes of the “rationalist movement.” I’d hang out a lot with skeptic/effective-altruist/transhumanist/LessWrong/OvercomingBias people (who are increasingly now SlateStarCodex people), read their blogs, listen and respond to their arguments, answer their CS theory questions. But I was always vaguely uncomfortable identifying myself with any group that even seemed to define itself by how rational it was compared to everyone else (even if the rationalists constantly qualified their self-designation with “aspiring”!). Also, my rationalist friends seemed overly interested in questions like how to prevent malevolent AIs from taking over the world, which I tend to think we lack the tools to make much progress on right now (though, like with many other remote possibilities, I’m happy for some people to work on them and see if they find anything interesting).

So, what changed? Well, in the debates about social justice, public shaming, etc. that have swept across the Internet these past few years, it seems to me that my rationalist friends have proven themselves able to weigh opposing arguments, examine their own shortcomings, resist groupthink and hysteria from both sides, and attack ideas rather than people, in a way that the wider society—and most depressingly to me, the “enlightened, liberal” part of society—has often failed. In a real-world test (“real-world,” in this context, meaning social media…), the rationalists have walked the walk and rationaled the rational, and thus they’ve given me no choice but to stand up and be counted as one of them.

Have a great Thanksgiving, those of you in the US!


Another Update: Dana, Lily, and I had the honor of having Scott Alexander over for dinner tonight. I found this genius of human nature, who took so much flak last year for defending me, to be completely uninterested in discussing anything related to social justice or online shaming. Instead, his gaze was fixed on the eternal: he just wanted to grill me all evening about physics and math and epistemology. Having recently read this Nature News article by Ron Cowen, he kept asking me things like: “you say that in quantum gravity, spacetime itself is supposed to dissolve into some sort of network of qubits. Well then, how does each qubit know which other qubits it’s supposed to be connected to? Are there additional qubits to specify the connectivity pattern? If so, then doesn’t that cause an infinite regress?” I handwaved something about AdS/CFT, where a dynamic spacetime is supposed to emerge from an ordinary quantum theory on a fixed background specified in advance. But I added that, in some sense, he had rediscovered the whole problem of quantum gravity that’s confused everyone for almost a century: if quantum mechanics presupposes a causal structure on the qubits or whatever other objects it talks about, then how do you write down a quantum theory of the causal structures themselves?

I’m sure there’s a lesson in here somewhere about what I should spend my time on.

Ordinary Words Will Do

Sunday, October 18th, 2015

Izabella Laba, a noted mathematician at the University of British Columbia, recently posted some tweets that used me as a bad, cautionary example for how “STEM faculty should be less contemptuous of social sciences.”  Here was the offending comment of mine, from the epic Walter Lewin thread last fall:

[W]hy not dispense with the empirically-empty notion of “privilege,” and just talk directly about the actual well-being of actual people, or groups of people?  If men are doing horrific things to women—for example, lashing them for driving cars, like in Saudi Arabia—then surely we can just say so in plain language.  Stipulating that the torturers are “exercising their male privilege” with every lash adds nothing to anyone’s understanding of the evil.  It’s bad writing.  More broadly, it seems to me that the entire apparatus of “privilege,” “delegitimation,” etc. etc. can simply be tossed overboard, to rust on the ocean floor alongside dialectical materialism and other theoretical superstructures that were once pompously insisted upon as preconditions of enlightened social discourse.  This isn’t quantum field theory.  Ordinary words will do.

Prof. Laba derisively commented:

Might as well ask you to explain calculus without using fancy words like “derivative” or “continuous.”  Simple number arithmetic will do.

Prof. Laba’s tweets were favorited by Jordan Ellenberg, a mathematician who wrote the excellent popular book How Not to Be Wrong.  (Ellenberg had also criticized me last year for my strange, naïve idea that human relations can be thought about using logic.)

Given my respect for the critics, I guess I’m honor-bound to respond.

For the record, I tend not to think about the social sciences—or for that matter, the natural sciences—as monolithic entities at all.  I admire any honest attempt to discover the truth about anything.  And not being a postmodern relativist, I believe there are deep truths worth discovering in history, psychology, economics, linguistics, possibly even sociology.  Reading the books of Steven Pinker underscored for me how much is actually understood nowadays about human nature—much of it only figured out within the last half-century.  Likewise, reading the voluminous profundities of Scott Alexander taught me that even in psychiatry, there are truths (and even a few definite cures) to be had for those who seek.

I also believe that the social sciences are harder—way harder—than math or physics or CS.  They’re harder because of the tenuousness of the correlations, because of the complexity of each individual human brain (let alone 7 billion of them on the same planet), but most of all, because politics and ideology and the scientist’s own biases place such powerful thumbs on the scale.  This makes it all the more impressive when a social scientist, like (say) Stanley Milgram or Judith Rich Harris or Napoleon Chagnon, teaches the world something important and new.

I will confess to contempt for anything that I regard as pompous obscurantism—for self-referential systems of jargon whose main purposes are to bar outsiders, to mask a lack of actual understanding, and to confer power on certain favored groups.  And I regard the need to be alert to such systems, to nip them in the bud before they grow into Lysenkoism, as in some sense the problem of intellectual life.  Which brings me to the most fundamental asymmetry between the hard and soft sciences.  Namely, the very fact that it’s so much harder to nurture new truths to maturity in the social sciences than it is in math or physics, means that in the former, the jargon-weeds have an easier time filling the void—and we know they’ve done it again and again, even in the post-Enlightenment West.

Time for a thought experiment.  Suppose you showed up at a university anytime between, let’s say, 1910 and 1970, and went from department to department asking (in so many words): what are you excited about this century?  Where are your new continents, what’s the future of your field?  Who should I read to learn about that future?

In physics, the consensus answer would’ve been something like: Planck, Einstein, Bohr, Schrödinger, Dirac.

In psychology, it would’ve been: Freud and Jung (with another faction for B. F. Skinner).

In politics and social sciences, over an enormous swath of academia (including in the West), it would’ve been: Marx, Engels, Trotsky, Lenin.

With hindsight, we now know that the physics advice would’ve been absolute perfection, the psychology and politics advice an unmitigated disaster.  Yes, physicists today know more than Einstein, can even correct him on some points, but the continents he revealed to us actually existed—indeed, have only become more important since Einstein’s time.

But Marx and Freud?  You would’ve done better to leave the campus, and ask a random person on the street what she or he thought about economics and psychology.  In high school, I remember cringing through a unit on the 1920s, when we learned about how “two European professors upset a war-weary civilization’s established certainties—with Einstein overturning received wisdom about space and time, and Freud doing just the same for the world of the mind.”  It was never thought important to add that Einstein’s theories turned out to be true while Freud’s turned out to be false.  Still, at least Freud’s ideas led “only” to decades of bad psychology and hundreds of innocent people sent to jail because of testimony procured through hypnosis, rather than to tens of millions of dead, as with the other social-scientific theory that reigned supreme among 20th-century academics.

Marx and Freud built impressive intellectual edifices—sufficiently impressive for a large fraction of intellectuals to have accepted those men as gurus on par with Darwin and Einstein for almost a century.  Yet on nearly every topic they wrote about, we now know that Marx and Freud couldn’t have been any more catastrophically wrong.  Moreover, their wrongness was knowable at the time—and was known to many, though the ones who knew were typically the ones who the intellectual leaders sneered at, as deluded reactionaries.

Which raises a question: suppose that, in the 1920s, I’d taken the social experts’ advice to study Marx and Freud, didn’t understand much of what they said (and found nonsensical much of what I did understand), and eventually rejected them as pretentious charlatans.  Then why wouldn’t I have been just like Prof. Laba’s ignorant rube, who dismisses calculus because he doesn’t understand technical terms like “continuous” and “derivative”?

On reflection, I don’t think that the two cases are comparable at all.

The hard sciences need technical vocabularies for a simple reason: because they’re about things that normal people don’t spend their hours obsessively worrying about.  Yes, I’d have a hard time understanding organic chemists or differential geometers, but largely for the same reasons I’d have a hard time understanding football fans or pirates.  It’s not just that I don’t understand the arguments; it’s that the arguments are about a world that’s alien to me (and that, to be honest, I don’t care about as much as I do my world).

Suppose, by contrast, that you’re writing about the topics everyone spends their time obsessively worrying about: politics, society, the human mind, the relations between the races and sexes.  In other words, suppose you’re writing about precisely the topics for which the ordinary English language has been honed over centuries—for which Shakespeare and Twain and Dr. King and so many others deployed the language to such spectacular effect.  In that case, what excuse could you possibly have to write in academese, to pepper your prose with undefined in-group neologisms?

Well, let’s be charitable; maybe you have a reason.  For example, maybe you’re doing a complicated meta-analysis of psychology papers, so you need to talk about r-values and kurtosis and heteroskedasticity.  Or maybe you’re putting people in an fMRI machine while you ask them questions, so you need to talk about the temporal resolution in the anterior cingulate cortex.  Or maybe you’re analyzing sibling rivalries using game theory, so you need Nash equilibria.  Or you’re picking apart sentences using Chomskyan formal grammar.  In all these cases, armchair language doesn’t suffice because you’re not just sitting in your armchair: you’re using a new tool to examine the everyday from a different perspective.  For present purposes, you might as well be doing algebraic geometry.

The Freudians and Marxists would, of course, claim that they’re doing the exact same thing.  Yes, they’d say, you thought you had the words to discuss your own mind or the power structure of society, but really you didn’t, because you lacked the revolutionary theoretical framework that we now possess.  (Trotsky’s writings are suffused with this brand of arrogance in nearly every sentence: for example, when he ridicules the bourgeois liberals who whine about “human rights violations” in the early USSR, yet who are too dense to phrase their objections within the framework of dialectical materialism.)

I submit that, even without the hindsight of 2015, there would’ve been excellent reasons to be skeptical of these claims.  Has it ever happened, you might ask yourself, that someone sat in their study and mused about the same human questions that occupied Plato and Shakespeare and Hume, in the same human way they did, and then came up with a new, scientific conclusion that was as rigorous and secure as relativity or evolution?

Let me know if I missed something, but I can’t think of a single example.  Sure, it seems to me, there have been geniuses of human nature, who enlarged our vision without any recourse to the quantitative methods of science.  But even those geniuses “only” contributed melodies for other geniuses to answer in counterpoint, rather than stones for everyone who came later to build upon.  Also, the geniuses usually wrote well.

Am I claiming that progress is impossible in the social realm?  Not at all.  The emancipation of slaves, the end of dueling and blasphemy laws and the divine right of kings, women’s suffrage and participation in the workforce, gay marriage—all these strike me as crystal-clear examples of moral progress, as advances that will still be considered progress a thousand years from now, if there’s anyone around then to discuss such things.  Evolutionary psychology, heuristics and biases, reciprocal altruism, and countless other developments likewise strike me as intellectual progress within the sciences of human nature.  But none of these advances needed recondite language!  Ordinary words sufficed for Thomas Paine and Frederick Douglass and John Stuart Mill, as they sufficed for Robert Axelrod and for Kahneman and Tversky.  So forgive me for thinking that whatever is true and important in the social world today, should likewise be defensible to every smart person in ordinary words, and that this represents a genuine difference between the social sciences and physics.

Which brings us to the central point that Prof. Laba disputed in that comment of mine.  I believe there are countless moral heroes in our time, as well as social scientists who struggle heroically to get the right answers.  But as far as I can tell, the people who build complex intellectual edifices around words like “privilege” and “delegitimation” and “entitlement” and “marginalized” are very much the same sort of people who, a few generations ago, built similar edifices around “bourgeoisie” and “dialectical” and “false consciousness.”  In both cases, there’s an impressive body of theory that’s held up as the equivalent in its domain of relativity, quantum mechanics, and Darwinism, with any skeptics denounced as science-deniers.  In both cases, enlightened liberals are tempted to side with the theorists, since the theorists believe in so many of the same causes that the enlightened liberals believe in, and hate so many of the same people who the enlightened liberals hate.  But in both cases, the theorists’ language seems to alternate between incomprehensible word-salad and fervid, often profanity-laced denunciations, skipping entirely over calm clarity.  And in both cases, the only thing that the impressive theoretical edifice ever seems to get used for, is to prove over and over that certain favored groups should get more power while disfavored ones should get less.

So I’m led to the view that, if you want to rouse people’s anger about injustice or their pity about preventable suffering, or end arbitrary discrimination codified into law, or give individuals more freedom to pursue their own happiness, or come up with a new insight about human nature, or simply describe the human realities that you see around you—for all these purposes, the words that sufficed for every previous generation’s great humanists will also suffice for you.

On the other hand, to restrict freedom and invent new forms of discrimination—and to do it in the name of equality and justice—that takes theory.  You’ll need a sophisticated framework, for example, to prove that even if two adults both insist they’re consenting to a relationship, really they might not be, because of power structures in the wider society that your superior insight lets you see.  You’ll need advanced discourse to assure you that, even though your gut reaction might be horror at (say) someone who misspoke once and then had their life gleefully destroyed on social media, your gut is not to be trusted, because it’s poisoned by the same imperialist, patriarchal biases as everything else—and because what looks like a cruel lynching needs to be understood in a broader social context (did the victim belong to a dominant group, or to a marginalized one?).  Finally, you’ll need oodles of theory (bring out the Marcuse) to explain why the neoliberal fanaticism about “free speech” and “tolerance” and “due process” and “the presumption of innocence” is too abstract and simplistic—for those concepts, too, fail to distinguish between a marginalized group that deserves society’s protection and a dominant group that doesn’t.

So I concede to Prof. Laba that the complicated discourse of privilege, hegemony, etc. serves a definite purpose for the people who wield it, just as much as the complicated discourse of quantum field theory serves a purpose for physicists.  It’s just that the purposes of the privilege-warriors aren’t my purposes.  For my purposes—which include fighting injustice, advancing every social and natural science as quickly as possible, and helping all well-meaning women and men see each other’s common humanity—I said last year and I say again that ordinary words will do.


Update (Oct. 26): Izabella Laba has written a response to this post, for which I’m extremely grateful. Her reply reveals that she and I have a great deal of common ground, and also a few clear areas of disagreement (e.g., what’s wrong with Steven Pinker?). But my most important objection is simply that, the first time I loaded her blog, the text went directly over the rock image in the background, making it impossible to read without highlighting it.

Six announcements

Monday, September 21st, 2015
  1. I did a podcast interview with Julia Galef for her series “Rationally Speaking.”  See also here for the transcript (which I read rather than having to listen to myself stutter).  The interview is all about Aumann’s Theorem, and whether rational people can agree to disagree.  It covers a lot of the same ground as my recent post on the same topic, except with less technical detail about agreement theory and more … well, agreement.  At Julia’s suggestion, we’re planning to do a follow-up podcast about the particular intractability of online disagreements.  I feel confident that we’ll solve that problem once and for all.  (Update: Also check out this YouTube video, where Julia offers additional thoughts about what we discussed.)
  2. When Julia asked me to recommend a book at the end of the interview, I picked probably my favorite contemporary novel: The Mind-Body Problem by Rebecca Newberger Goldstein.  Embarrassingly, I hadn’t realized that Rebecca had already been on Julia’s show twice as a guest!  Anyway, one of the thrills of my life over the last year has been to get to know Rebecca a little, as well as her husband, who’s some guy named Steve Pinker.  Like, they both live right here in Boston!  You can talk to them!  I was especially pleased two weeks ago to learn that Rebecca won the National Humanities Medal—as I told Julia, Rebecca Goldstein getting a medal at the White House is the sort of thing I imagine happening in my ideal fantasy world, making it a pleasant surprise that it happened in this one.  Huge congratulations to Rebecca!
  3. The NSA has released probably its most explicit public statement so far about its plans to move to quantum-resistant cryptography.  For more see Bruce Schneier’s Crypto-Gram.  Hat tip for this item goes to reader Ole Aamot, one of the only people I’ve ever encountered whose name alphabetically precedes mine.
  4. Last Tuesday, I got to hear Ayaan Hirsi Ali speak at MIT about her new book, Heretic, and then spend almost an hour talking to students who had come to argue with her.  I found her clear, articulate, and courageous (as I guess one has to be in her line of work, even with armed cops on either side of the lecture hall).  After the shameful decision of Brandeis in caving in to pressure and cancelling Hirsi Ali’s commencement speech, I thought it spoke well of MIT that they let her speak at all.  The bar shouldn’t be that low, but it is.
  5. From far away on the political spectrum, I also heard Noam Chomsky talk last week (my first time hearing him live), about the current state of linguistics.  Much of the talk, it struck me, could have been given in the 1950s with essentially zero change (and I suspect Chomsky would agree), though a few parts of it were newer, such as the speculation that human languages have many of the features they do in order to minimize the amount of computation that the speaker needs to perform.  The talk was full of declarations that there had been no useful work whatsoever on various questions (e.g., about the evolutionary function of language), that they were total mysteries and would perhaps remain total mysteries forever.
  6. Many of you have surely heard by now that Terry Tao solved the Erdős Discrepancy Problem, by showing that for every infinite sequence of heads and tails and every positive integer C, there’s a positive integer k such that, if you look at the subsequence formed by every kth flip, there comes a point where the heads outnumber tails or vice versa by at least C.  This resolves a problem that’s been open for more than 80 years.  For more details, see this post by Timothy Gowers.  Notably, Tao’s proof builds, in part, on a recent Polymath collaborative online effort.  It was a big deal last year when Konev and Lisitsa used a SAT-solver to prove that there’s always a subsequence with discrepancy at least 3; Tao’s result now improves on that bound by ∞.

Common Knowledge and Aumann’s Agreement Theorem

Sunday, August 16th, 2015

The following is the prepared version of a talk that I gave at SPARC: a high-school summer program about applied rationality held in Berkeley, CA for the past two weeks.  I had a wonderful time in Berkeley, meeting new friends and old, but I’m now leaving to visit the CQT in Singapore, and then to attend the AQIS conference in Seoul.


Common Knowledge and Aumann’s Agreement Theorem

August 14, 2015

Thank you so much for inviting me here!  I honestly don’t know whether it’s possible to teach applied rationality, the way this camp is trying to do.  What I know is that, if it is possible, then the people running SPARC are some of the awesomest people on earth to figure out how.  I’m incredibly proud that Chelsea Voss and Paul Christiano are both former students of mine, and I’m amazed by the program they and the others have put together here.  I hope you’re all having fun—or maximizing your utility functions, or whatever.

My research is mostly about quantum computing, and more broadly, computation and physics.  But I was asked to talk about something you can actually use in your lives, so I want to tell a different story, involving common knowledge.

I’ll start with the “Muddy Children Puzzle,” which is one of the greatest logic puzzles ever invented.  How many of you have seen this one?

OK, so the way it goes is, there are a hundred children playing in the mud.  Naturally, they all have muddy foreheads.  At some point their teacher comes along and says to them, as they all sit around in a circle: “stand up if you know your forehead is muddy.”  No one stands up.  For how could they know?  Each kid can see all the other 99 kids’ foreheads, so knows that they’re muddy, but can’t see his or her own forehead.  (We’ll assume that there are no mirrors or camera phones nearby, and also that this is mud that you don’t feel when it’s on your forehead.)

So the teacher tries again.  “Knowing that no one stood up the last time, now stand up if you know your forehead is muddy.”  Still no one stands up.  Why would they?  No matter how many times the teacher repeats the request, still no one stands up.

Then the teacher tries something new.  “Look, I hereby announce that at least one of you has a muddy forehead.”  After that announcement, the teacher again says, “stand up if you know your forehead is muddy”—and again no one stands up.  And again and again; it continues 99 times.  But then the hundredth time, all the children suddenly stand up.

(There’s a variant of the puzzle involving blue-eyed islanders who all suddenly commit suicide on the hundredth day, when they all learn that their eyes are blue—but as a blue-eyed person myself, that’s always struck me as needlessly macabre.)

What’s going on here?  Somehow, the teacher’s announcing to the children that at least one of them had a muddy forehead set something dramatic in motion, which would eventually make them all stand up—but how could that announcement possibly have made any difference?  After all, each child already knew that at least 99 children had muddy foreheads!

Like with many puzzles, the way to get intuition is to change the numbers.  So suppose there were two children with muddy foreheads, and the teacher announced to them that at least one had a muddy forehead, and then asked both of them whether their own forehead was muddy.  Neither would know.  But each child could reason as follows: “if my forehead weren’t muddy, then the other child would’ve seen that, and would also have known that at least one of us has a muddy forehead.  Therefore she would’ve known, when asked, that her own forehead was muddy.  Since she didn’t know, that means my forehead is muddy.”  So then both children know their foreheads are muddy, when the teacher asks a second time.

Now, this argument can be generalized to any (finite) number of children.  The crucial concept here is common knowledge.  We call a fact “common knowledge” if, not only does everyone know it, but everyone knows everyone knows it, and everyone knows everyone knows everyone knows it, and so on.  It’s true that in the beginning, each child knew that all the other children had muddy foreheads, but it wasn’t common knowledge that even one of them had a muddy forehead.  For example, if your forehead and mine are both muddy, then I know that at least one of us has a muddy forehead, and you know that too, but you don’t know that I know it (for what if your forehead were clean?), and I don’t know that you know it (for what if my forehead were clean?).

What the teacher’s announcement did, was to make it common knowledge that at least one child has a muddy forehead (since not only did everyone hear the announcement, but everyone witnessed everyone else hearing it, etc.).  And once you understand that point, it’s easy to argue by induction: after the teacher asks and no child stands up (and everyone sees that no one stood up), it becomes common knowledge that at least two children have muddy foreheads (since if only one child had had a muddy forehead, that child would’ve known it and stood up).  Next it becomes common knowledge that at least three children have muddy foreheads, and so on, until after a hundred rounds it’s common knowledge that everyone’s forehead is muddy, so everyone stands up.
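
The induction can be checked by brute force for a small class.  What follows is a minimal Python sketch, assuming the children are perfect reasoners and that a "possible world" is just a tuple of 0/1 forehead states; the function name `muddy_children` and its parameters are made up for illustration.  It keeps the set of worlds that are still commonly considered possible, and prunes that set each time everyone sees that nobody stood up.

```python
from itertools import product

def muddy_children(actual, announce=True, max_rounds=None):
    """Return (round, children who stand) for the first round anyone stands, else None.

    `actual` is a tuple of 0/1 values (1 = muddy) with at least one 1.
    """
    n = len(actual)
    # Worlds that are still commonly considered possible.
    worlds = set(product([0, 1], repeat=n))
    if announce:
        worlds.discard((0,) * n)  # "at least one of you has a muddy forehead"
    max_rounds = max_rounds or n + 1

    def knows_muddy(i, w, worlds):
        # Child i sees every forehead but her own.  She knows she is muddy if
        # every still-possible world matching what she sees has her muddy.
        candidates = [v for v in worlds
                      if all(v[j] == w[j] for j in range(n) if j != i)]
        return all(v[i] == 1 for v in candidates)

    for rnd in range(1, max_rounds + 1):
        standers = {w: [i for i in range(n) if knows_muddy(i, w, worlds)]
                    for w in worlds}
        if standers[actual]:
            return rnd, standers[actual]
        # Everyone saw that nobody stood, so drop worlds where someone would have.
        worlds = {w for w in worlds if not standers[w]}
    return None

print(muddy_children((1, 1, 1)))                  # (3, [0, 1, 2])
print(muddy_children((1, 1, 0)))                  # (2, [0, 1])
print(muddy_children((1, 1, 1), announce=False))  # None
```

With a hundred children there are 2^100 worlds, so this brute force only works for small classes.  The point it illustrates is that without the public announcement, nothing is ever pruned, and so nobody ever stands.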

The moral is that the mere act of saying something publicly can change the world—even if everything you said was already obvious to every last one of your listeners.  For it’s possible that, until your announcement, not everyone knew that everyone knew the thing, or knew everyone knew everyone knew it, etc., and that could have prevented them from acting.

This idea turns out to have huge real-life consequences, to situations way beyond children with muddy foreheads.  I mean, it also applies to children with dots on their foreheads, or “kick me” signs on their backs…

But seriously, let me give you an example I stole from Steven Pinker, from his wonderful book The Stuff of Thought.  Two people of indeterminate gender—let’s not make any assumptions here—go on a date.  Afterward, one of them says to the other: “Would you like to come up to my apartment to see my etchings?”  The other says, “Sure, I’d love to see them.”

This is such a cliché that we might not even notice the deep paradox here.  It’s like with life itself: people knew for thousands of years that every bird has the right kind of beak for its environment, but not until Darwin and Wallace could anyone articulate why (and only a few people before them even recognized there was a question there that called for a non-circular answer).

In our case, the puzzle is this: both people on the date know perfectly well that the reason they’re going up to the apartment has nothing to do with etchings.  They probably even both know the other knows that.  But if that’s the case, then why don’t they just blurt it out: “would you like to come up for some intercourse?”  (Or “fluid transfer,” as the John Nash character put it in the Beautiful Mind movie?)

So here’s Pinker’s answer.  Yes, both people know why they’re going to the apartment, but they also want to avoid their knowledge becoming common knowledge.  They want plausible deniability.  There are several possible reasons: to preserve the romantic fantasy of being “swept off one’s feet.”  To provide a face-saving way to back out later, should one of them change their mind: since nothing was ever openly said, there’s no agreement to abrogate.  In fact, even if only one of the people (say A) might care about such things, if the other person (say B) thinks there’s any chance A cares, B will also have an interest in avoiding common knowledge, for A’s sake.

Put differently, the issue is that, as soon as you say X out loud, the other person doesn’t merely learn X: they learn that you know X, that you know that they know that you know X, that you want them to know you know X, and an infinity of other things that might upset the delicate epistemic balance.  Contrast that with the situation where X is left unstated: yeah, both people are pretty sure that “etchings” are just a pretext, and can even plausibly guess that the other person knows they’re pretty sure about it.  But once you start getting to 3, 4, 5, levels of indirection—who knows?  Maybe you do just want to show me some etchings.

Philosophers like to discuss Sherlock Holmes and Professor Moriarty meeting in a train station, and Moriarty declaring, “I knew you’d be here,” and Holmes replying, “well, I knew that you knew I’d be here,” and Moriarty saying, “I knew you knew I knew you’d be here,” etc.  But real humans tend to be unable to reason reliably past three or four levels in the knowledge hierarchy.  (Related to that, you might have heard of the game where everyone guesses a number between 0 and 100, and the winner is whoever’s number is the closest to 2/3 of the average of all the numbers.  If this game is played by perfectly rational people, who know they’re all perfectly rational, and know they know, etc., then they must all guess 0—exercise for you to see why.  Yet experiments show that, if you actually want to win this game against average people, you should guess about 20.  People seem to start with 50 or so, iterate the operation of multiplying by 2/3 a few times, and then stop.)
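
The "multiply by 2/3 a few times" story is easy to tabulate.  Here is a small Python sketch, assuming a player who starts from a naive guess of 50 and then best-responds to a crowd reasoning one level less deeply than herself; the function name `level_k_guess` and the starting value are illustrative assumptions, not experimental data.

```python
def level_k_guess(k, start=50.0, factor=2 / 3):
    """Guess of a player who iterates 'multiply by 2/3' exactly k times."""
    guess = start
    for _ in range(k):
        guess *= factor
    return guess

for k in range(6):
    print(k, round(level_k_guess(k), 1))
# 0 50.0
# 1 33.3
# 2 22.2
# 3 14.8
# 4 9.9
# 5 6.6
```

Two iterations land near the "about 20" that wins in practice, while the perfectly rational guess of 0 only appears in the limit of infinitely many iterations.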

Incidentally, do you know what I would’ve given for someone to have explained this stuff to me back in high school?  I think that a large fraction of the infamous social difficulties that nerds have, is simply down to nerds spending so much time in domains (like math and science) where the point is to struggle with every last neuron to make everything common knowledge, to make all truths as clear and explicit as possible.  Whereas in social contexts, very often you’re managing a delicate epistemic balance where you need certain things to be known, but not known to be known, and so forth—where you need to prevent common knowledge from arising, at least temporarily.  “Normal” people have an intuitive feel for this; it doesn’t need to be explained to them.  For nerds, by contrast, explaining it—in terms of the muddy children puzzle and so forth—might be exactly what’s needed.  Once they’re told the rules of a game, nerds can try playing it too!  They might even turn out to be good at it.

OK, now for a darker example of common knowledge in action.  If you read accounts of Nazi Germany, or the USSR, or North Korea or other despotic regimes today, you can easily be overwhelmed by this sense of, “so why didn’t all the sane people just rise up and overthrow the totalitarian monsters?  Surely there were more sane people than crazy, evil ones.  And probably the sane people even knew, from experience, that many of their neighbors were sane—so why this cowardice?”  Once again, it could be argued that common knowledge is the key.  Even if everyone knows the emperor is naked; indeed, even if everyone knows everyone knows he’s naked, still, if it’s not common knowledge, then anyone who says the emperor’s naked is knowingly assuming a massive personal risk.  That’s why, in the story, it took a child to shift the equilibrium.  Likewise, even if you know that 90% of the populace will join your democratic revolt provided they themselves know 90% will join it, if you can’t make your revolt’s popularity common knowledge, everyone will be stuck second-guessing each other, worried that if they revolt they’ll be an easily-crushed minority.  And because of that very worry, they’ll be correct!

(My favorite Soviet joke involves a man standing in the Moscow train station, handing out leaflets to everyone who passes by.  Eventually, of course, the KGB arrests him—but they discover to their surprise that the leaflets are just blank pieces of paper.  “What’s the meaning of this?” they demand.  “What is there to write?” replies the man.  “It’s so obvious!”  Note that this is precisely a situation where the man is trying to make common knowledge something he assumes his “readers” already know.)

The kicker is that, to prevent something from becoming common knowledge, all you need to do is censor the common-knowledge-producing mechanisms: the press, the Internet, public meetings.  This nicely explains why despots throughout history have been so obsessed with controlling the press, and also explains how it’s possible for 10% of a population to murder and enslave the other 90% (as has happened again and again in our species’ sorry history), even though the 90% could easily overwhelm the 10% by acting in concert.  Finally, it explains why believers in the Enlightenment project tend to be such fanatical absolutists about free speech—why they refuse to “balance” it against cultural sensitivity or social harmony or any other value, as so many well-meaning people urge these days.

OK, but let me try to tell you something surprising about common knowledge.  Here at SPARC, you’ve learned all about Bayes’ rule: how, if you like, you can treat “probabilities” as just made-up numbers in your head, which are required to obey the probability calculus, and then there’s a very definite rule for how to update those numbers when you gain new information.  And indeed, how an agent that wanders around constantly updating these numbers in its head, and taking whichever action maximizes its expected utility (as calculated using the numbers), is probably the leading modern conception of what it means to be “rational.”

Now imagine that you’ve got two agents, call them Alice and Bob, with common knowledge of each other’s honesty and rationality, and with the same prior probability distribution over some set of possible states of the world.  But now imagine they go out and live their lives, and have totally different experiences that lead to their learning different things, and having different posterior distributions.  But then they meet again, and they realize that their opinions about some topic (say, Hillary’s chances of winning the election) are common knowledge: they both know each other’s opinion, and they both know that they both know, and so on.  Then a striking 1976 result called Aumann’s Theorem states that their opinions must be equal.  Or, as it’s summarized: “rational agents with common priors can never agree to disagree about anything.”

Actually, before going further, let’s prove Aumann’s Theorem—since it’s one of those things that sounds like a mistake when you first hear it, and then becomes a triviality once you see the 3-line proof.  (Albeit, a “triviality” that won Aumann a Nobel in economics.)  The key idea is that knowledge induces a partition on the set of possible states of the world.  Huh?  OK, imagine someone is either an old man, an old woman, a young man, or a young woman.  You and I agree in giving each of these a 25% prior probability.  Now imagine that you find out whether they’re a man or a woman, and I find out whether they’re young or old.  This can be illustrated as follows:

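[Figure: the four possible worlds (old man, old woman, young man, young woman), with your knowledge partition grouping them by sex and mine grouping them by age.]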

The diagram tells us, for example, that if the ground truth is “old woman,” then your knowledge is described by the set {old woman, young woman}, while my knowledge is described by the set {old woman, old man}.  And this different information leads us to different beliefs: for example, if someone asks for the probability that the person is a woman, you’ll say 100% but I’ll say 50%.  OK, but what does it mean for information to be common knowledge?  It means that I know that you know that I know that you know, and so on.  Which means that, if you want to find out what’s common knowledge between us, you need to take the least common coarsening of our knowledge partitions.  I.e., if the ground truth is some given world w, then what do I consider it possible that you consider it possible that I consider possible that … etc.?  Iterate this growth process until it stops, by “zigzagging” between our knowledge partitions, and you get the set S of worlds such that, if we’re in world w, then what’s common knowledge between us is that the world belongs to S.  Repeat for all w’s, and you get the least common coarsening of our partitions.  In the above example, the least common coarsening is trivial, with all four worlds ending up in the same set S, but there are nontrivial examples as well:

[Figure: two knowledge partitions, “you” and “me,” whose least common coarsening is nontrivial.]

Now, if Alice’s expectation of a random variable X is common knowledge between her and Bob, that means that everywhere in S, her expectation must be constant … and hence must equal whatever the expectation is, over all the worlds in S!  Likewise, if Bob’s expectation is common knowledge with Alice, then everywhere in S, it must equal the expectation of X over S.  But that means that Alice’s and Bob’s expectations are the same.
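
The "zigzagging" construction and the conclusion of the proof can both be checked mechanically on small examples.  Below is a minimal Python sketch, assuming finitely many worlds and partitions given as lists of sets; the helper names (`cell_of`, `common_knowledge_set`, `opinions_across`) are invented for illustration.  It uses the four-world example from above, plus a hypothetical six-world example in which the least common coarsening is nontrivial.

```python
from fractions import Fraction

def cell_of(partition, w):
    """The cell of a partition that contains world w."""
    return next(cell for cell in partition if w in cell)

def common_knowledge_set(w, part_a, part_b):
    """Zigzag between the two partitions, starting at w, until nothing new appears."""
    reached, frontier = {w}, [w]
    while frontier:
        v = frontier.pop()
        for part in (part_a, part_b):
            for u in cell_of(part, v):
                if u not in reached:
                    reached.add(u)
                    frontier.append(u)
    return frozenset(reached)

def expectation(X, prior, event):
    """Conditional expectation of X given that the world lies in `event`."""
    total = sum(prior[w] for w in event)
    return sum(prior[w] * X[w] for w in event) / total

def opinions_across(X, prior, partition, S):
    """Every value an agent's expectation of X takes somewhere in S."""
    return {expectation(X, prior, cell_of(partition, w)) for w in S}

worlds = ["old man", "old woman", "young man", "young woman"]
prior = {w: Fraction(1, 4) for w in worlds}
you = [{"old man", "young man"}, {"old woman", "young woman"}]  # you learn the sex
me = [{"old man", "old woman"}, {"young man", "young woman"}]   # I learn the age

S = common_knowledge_set("old woman", you, me)  # all four worlds

# "Is the person a woman?"  Your opinion varies across S, so it is not
# common knowledge, and the theorem places no constraint on us.
is_woman = {w: int("woman" in w) for w in worlds}
print(opinions_across(is_woman, prior, you, S))  # contains 0 and 1
print(opinions_across(is_woman, prior, me, S))   # contains only 1/2

# "Is the person an old man or a young woman?"  Both opinions are constant
# on S, hence common knowledge, hence equal to the expectation over S.
X = {"old man": 1, "old woman": 0, "young man": 0, "young woman": 1}
print(opinions_across(X, prior, you, S), opinions_across(X, prior, me, S),
      expectation(X, prior, S))

# A hypothetical six-world example with a nontrivial least common coarsening.
part_a = [{1, 2}, {3}, {4, 5}, {6}]
part_b = [{1}, {2, 3}, {4}, {5, 6}]
print(sorted(common_knowledge_set(1, part_a, part_b)))  # [1, 2, 3]
print(sorted(common_knowledge_set(4, part_a, part_b)))  # [4, 5, 6]
```

Exact fractions are used so that "the two expectations are equal" is a literal equality rather than a floating-point near miss.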

There are lots of related results.  For example, rational agents with common priors, and common knowledge of each other’s rationality, should never engage in speculative trade (e.g., buying and selling stocks, assuming that they don’t need cash, they’re not earning a commission, etc.).  Why?  Basically because, if I try to sell you a stock for (say) $50, then you should reason that the very fact that I’m offering it means I must have information you don’t that it’s worth less than $50, so then you update accordingly and you don’t want it either.

Or here’s another one: suppose again that we’re Bayesians with common priors, and we’re having a conversation, where I tell you my opinion (say, of the probability Hillary will win the election).  Not any of the reasons or evidence on which the opinion is based—just the opinion itself.  Then you, being Bayesian, update your probabilities to account for what my opinion is.  Then you tell me your opinion (which might have changed after learning mine), I update on that, I tell you my new opinion, then you tell me your new opinion, and so on.  You might think this could go on forever!  But, no, Geanakoplos and Polemarchakis observed that, as long as there are only finitely many possible states of the world in our shared prior, this process must converge after finitely many steps with you and me having the same opinion (and moreover, with it being common knowledge that we have that opinion).  Why?  Because as long as our opinions differ, your telling me your opinion or me telling you mine must induce a nontrivial refinement of one of our knowledge partitions, like so:

[Figure: a knowledge partition being refined after one of us tells the other an opinion.]

I.e., if you learn something new, then at least one of your knowledge sets must get split along the different possible values of the thing you learned.  But since there are only finitely many underlying states, there can only be finitely many such splittings (note that, since Bayesians never forget anything, knowledge sets that are split will never again rejoin).
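
To watch the splitting happen, here is a minimal Python sketch of such a back-and-forth, assuming a finite set of worlds with a strictly positive common prior.  The function names (`refine`, `dialogue`) and the nine-world example are illustrative assumptions.  Each announcement splits the listener's cells according to what the speaker would have said in each world, and the conversation stops once two consecutive announcements split nothing.

```python
from fractions import Fraction

def cell_of(partition, w):
    """The cell of a partition that contains world w."""
    return next(cell for cell in partition if w in cell)

def expectation(X, prior, event):
    """Conditional expectation of X given that the world lies in `event`."""
    total = sum(prior[w] for w in event)
    return sum(prior[w] * X[w] for w in event) / total

def refine(partition, announce):
    """Split each cell by the value the speaker would have announced in each world."""
    new_cells = []
    for cell in partition:
        by_value = {}
        for w in cell:
            by_value.setdefault(announce[w], set()).add(w)
        new_cells.extend(by_value.values())
    return new_cells

def dialogue(true_world, X, prior, part_a, part_b):
    """Alternate announcements (Alice first); return the list of opinions voiced."""
    parts = [list(part_a), list(part_b)]
    history, speaker, quiet_turns = [], 0, 0
    while quiet_turns < 2:
        # What the speaker would say in every world; the listener can compute this too.
        announce = {w: expectation(X, prior, cell_of(parts[speaker], w))
                    for w in prior}
        history.append(announce[true_world])
        listener = 1 - speaker
        refined = refine(parts[listener], announce)
        # A nontrivial refinement strictly increases the number of cells.
        quiet_turns = 0 if len(refined) > len(parts[listener]) else quiet_turns + 1
        parts[listener] = refined
        speaker = listener
    return history

prior = {w: Fraction(1, 9) for w in range(1, 10)}
X = {w: int(w in (3, 4)) for w in prior}       # indicator of the event {3, 4}
alice = [{1, 2, 3}, {4, 5, 6}, {7, 8, 9}]
bob = [{1, 2, 3, 4}, {5, 6, 7, 8}, {9}]

print([str(q) for q in dialogue(1, X, prior, alice, bob)])
# ['1/3', '1/2', '1/3', '1/3', '1/3']
```

The loop must terminate because every nontrivial refinement adds at least one cell, and there can never be more cells than worlds.  In this example Bob starts out disagreeing (1/2 versus 1/3) and is brought around without either party ever stating a reason.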

And something else: suppose your friend tells you a liberal opinion, then you take it into account, but reply with a more conservative opinion.  The friend takes your opinion into account, and replies with a revised opinion.  Question: is your friend’s new opinion likelier to be more liberal than yours, or more conservative?

Obviously, more liberal!  Yes, maybe your friend now sees some of your points and vice versa, maybe you’ve now drawn a bit closer (ideally!), but you’re not going to suddenly switch sides because of one conversation.

Yet, if you and your friend are Bayesians with common priors, one can prove that that’s not what should happen at all.  Indeed, your expectation of your own future opinion should equal your current opinion, and your expectation of your friend’s next opinion should also equal your current opinion—meaning that you shouldn’t be able to predict in which direction your opinion will change next, nor in which direction your friend will next disagree with you.  Why not?  Formally, because all these expectations are just different ways of calculating an expectation over the same set, namely your current knowledge set (i.e., the set of states of the world that you currently consider possible)!  More intuitively, we could say: if you could predict that, all else equal, the next thing you heard would probably shift your opinion in a liberal direction, then as a Bayesian you should already shift your opinion in a liberal direction right now.  (This is related to what’s called the “martingale property”: sure, a random variable X could evolve in many ways in the future, but the average of all those ways must be its current expectation E[X], by the very definition of E[X]…)
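
The formal claim, that your expected future opinion equals your current one, is just the law of total expectation, and can be verified on a toy example.  A minimal Python sketch follows, assuming a made-up prior on six worlds and a made-up way in which the next message might split your knowledge set; the name `expected_next_opinion` is illustrative.

```python
from fractions import Fraction

def expectation(X, prior, event):
    """Conditional expectation of X given that the world lies in `event`."""
    total = sum(prior[w] for w in event)
    return sum(prior[w] * X[w] for w in event) / total

def expected_next_opinion(X, prior, knowledge_set, possible_updates):
    """Average of tomorrow's opinions, weighted by how likely each update looks today."""
    total = sum(prior[w] for w in knowledge_set)
    return sum(
        sum(prior[w] for w in cell) / total * expectation(X, prior, cell)
        for cell in possible_updates
    )

prior = {w: Fraction(w, 21) for w in range(1, 7)}   # deliberately non-uniform
X = {1: 0, 2: 1, 3: 1, 4: 0, 5: 1, 6: 0}
knowledge_set = {1, 2, 3, 4, 5, 6}
possible_updates = [{1, 2}, {3}, {4, 5, 6}]          # what the next message might reveal

today = expectation(X, prior, knowledge_set)
tomorrow_on_average = expected_next_opinion(X, prior, knowledge_set, possible_updates)
print(today, tomorrow_on_average)  # 10/21 10/21
```

Individual updates can move the opinion up (to 2/3 or 1) or down (to 1/3), but the probability-weighted average of those moves is exactly zero.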

So, putting all these results together, we get a clear picture of what rational disagreements should look like: they should follow unbiased random walks, until sooner or later they terminate in common knowledge of complete agreement.  We now face a bit of a puzzle, in that hardly any disagreements in the history of the world have ever looked like that.  So what gives?

There are a few ways out:

(1) You could say that the “failed prediction” of Aumann’s Theorem is no surprise, since virtually all human beings are irrational cretins, or liars (or at least, it’s not common knowledge that they aren’t). Except for you, of course: you’re perfectly rational and honest.  And if you ever met anyone else as rational and honest as you, maybe you and they could have an Aumannian conversation.  But since such a person probably doesn’t exist, you’re totally justified to stand your ground, discount all opinions that differ from yours, etc.  Notice that, even if you genuinely believed that was all there was to it, Aumann’s Theorem would still have an aspirational significance for you: you would still have to say this is the ideal that all rationalists should strive toward when they disagree.  And that would already conflict with a lot of standard rationalist wisdom.  For example, we all know that arguments from authority carry little weight: what should sway you is not the mere fact of some other person stating their opinion, but the actual arguments and evidence that they’re able to bring.  Except that as we’ve seen, for Bayesians with common priors this isn’t true at all!  Instead, merely hearing your friend’s opinion serves as a powerful summary of what your friend knows.  And if you learn that your rational friend disagrees with you, then even without knowing why, you should take that as seriously as if you discovered a contradiction in your own thought processes.  This is related to an even broader point: there’s a normative rule of rationality that you should judge ideas only on their merits—yet if you’re a Bayesian, of course you’re going to take into account where the ideas come from, and how many other people hold them!  
Likewise, if you’re a Bayesian police officer or a Bayesian airport screener or a Bayesian job interviewer, of course you’re going to profile people by their superficial characteristics, however unfair that might be to individuals—so all those studies proving that people evaluate the same resume differently if you change the name at the top are no great surprise.  It seems to me that the tension between these two different views of rationality, the normative and the Bayesian, generates a lot of the most intractable debates of the modern world.

(2) Or—and this is an obvious one—you could reject the assumption of common priors. After all, isn’t a major selling point of Bayesianism supposed to be its subjective aspect, the fact that you pick “whichever prior feels right for you,” and are constrained only in how to update that prior?  If Alice’s and Bob’s priors can be different, then all the reasoning I went through earlier collapses.  So rejecting common priors might seem appealing.  But there’s a paper by Tyler Cowen and Robin Hanson called “Are Disagreements Honest?”—one of the most worldview-destabilizing papers I’ve ever read—that calls that strategy into question.  What it says, basically, is this: if you’re really a thoroughgoing Bayesian rationalist, then your prior ought to allow for the possibility that you are the other person.  Or to put it another way: “you being born as you,” rather than as someone else, should be treated as just one more contingent fact that you observe and then conditionalize on!  And likewise, the other person should condition on the observation that they’re them and not you.  In this way, absolutely everything that makes you different from someone else can be understood as “differing information,” so we’re right back to the situation covered by Aumann’s Theorem.  Imagine, if you like, that we all started out behind some Rawlsian veil of ignorance, as pure reasoning minds that had yet to be assigned specific bodies.  In that original state, there was nothing to differentiate any of us from any other—anything that did would just be information to condition on—so we all should’ve had the same prior.  That might sound fanciful, but in some sense all it’s saying is: what licenses you to privilege an observation just because it’s your eyes that made it, or a thought just because it happened to occur in your head?  
Like, if you’re objectively smarter or more observant than everyone else around you, fine, but to whatever extent you agree that you aren’t, your opinion gets no special epistemic protection just because it’s yours.

(3) If you’re uncomfortable with this tendency of Bayesian reasoning to refuse to be confined anywhere, to want to expand to cosmic or metaphysical scope (“I need to condition on having been born as me and not someone else”)—well then, you could reject the entire framework of Bayesianism, as your notion of rationality. Lest I be cast out from this camp as a heretic, I hasten to say: I include this option only for the sake of completeness!

(4) When I first learned about this stuff 12 years ago, it seemed obvious to me that a lot of it could be dismissed as irrelevant to the real world for reasons of complexity. I.e., sure, it might apply to ideal reasoners with unlimited time and computational power, but as soon as you impose realistic constraints, this whole Aumannian house of cards should collapse.  As an example, if Alice and Bob have common priors, then sure they’ll agree about everything if they effectively share all their information with each other!  But in practice, we don’t have time to “mind-meld,” swapping our entire life experiences with anyone we meet.  So one could conjecture that agreement, in general, requires a lot of communication.  So then I sat down and tried to prove that as a theorem.  And you know what I found?  That my intuition here wasn’t even close to correct!

In more detail, I proved the following theorem.  Suppose Alice and Bob are Bayesians with shared priors, and suppose they’re arguing about (say) the probability of some future event, or more generally, about any random variable X bounded in [0,1].  So, they have a conversation where Alice first announces her expectation of X, then Bob announces his new expectation, and so on.  The theorem says that Alice’s and Bob’s estimates of X will necessarily agree to within ±ε, with probability at least 1-δ over their shared prior, after they’ve exchanged only O(1/(δε^2)) messages.  Note that this bound is completely independent of how much knowledge they have; it depends only on the accuracy with which they want to agree!  Furthermore, the same bound holds even if Alice and Bob only send a few discrete bits about their real-valued expectations with each message, rather than the expectations themselves.

The proof involves the idea that Alice and Bob’s estimates of X, call them X_A and X_B respectively, follow “unbiased random walks” (or more formally, are martingales).  Very roughly, if |X_A-X_B|≥ε with high probability over Alice and Bob’s shared prior, then that fact implies that the next message has a high probability (again, over the shared prior) of causing either X_A or X_B to jump up or down by about ε.  But X_A and X_B, being estimates of X, are bounded between 0 and 1.  So a random walk with a step size of ε can only continue for about 1/ε^2 steps before it hits one of the “absorbing barriers.”

The way to formalize this is to look at the variances, Var[X_A] and Var[X_B], with respect to the shared prior.  Because Alice and Bob’s partitions keep getting refined, the variances are monotonically non-decreasing.  They start out 0 and can never exceed 1 (in fact they can never exceed 1/4, but let’s not worry about constants).  Now, the key lemma is that, if Pr[|X_A-X_B|≥ε]≥δ, then Var[X_B] must increase by at least δε^2 if Alice sends X_A to Bob, and Var[X_A] must increase by at least δε^2 if Bob sends X_B to Alice.  You can see my paper for the proof, or just work it out for yourself.  At any rate, the lemma implies that, after O(1/(δε^2)) rounds of communication, there must be at least a temporary break in the disagreement; there must be some round where Alice and Bob approximately agree with high probability.
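
For concreteness, here is a minimal Python sketch of the kind of back-and-forth described above.  It is only an illustration under simplifying assumptions that are not from the paper: a finite state space with a uniform common prior, exact (not discretized) announcements, and a hand-picked toy example.  The function names (expectations, refine, run_protocol) are made up for this sketch.  Each agent's knowledge is a partition of the states; hearing the other's announced expectation splits one's own cells, and the loop stops once the two expectation functions coincide everywhere.

```python
from collections import defaultdict
from fractions import Fraction

def expectations(partition, x):
    """E[X | agent's cell] at every state, under a uniform common prior."""
    total = defaultdict(Fraction)
    size = defaultdict(int)
    for cell, value in zip(partition, x):
        total[cell] += value
        size[cell] += 1
    return [total[cell] / size[cell] for cell in partition]

def refine(partition, message):
    """Split each cell according to the value the other agent announced."""
    labels = {}
    return [labels.setdefault(key, len(labels)) for key in zip(partition, message)]

def variance(values):
    """Variance of an estimate, taken over the uniform prior on states."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def run_protocol(part_a, part_b, x, true_state, max_rounds=100):
    """Alternate announcements; record (X_A, X_B, Var[X_A], Var[X_B]) per round."""
    history = []
    for _ in range(max_rounds):
        est_a = expectations(part_a, x)   # Alice announces X_A
        part_b = refine(part_b, est_a)    # Bob conditions on what he heard
        est_b = expectations(part_b, x)   # Bob announces X_B
        part_a = refine(part_a, est_b)    # Alice conditions on what she heard
        history.append((est_a[true_state], est_b[true_state],
                        variance(est_a), variance(est_b)))
        if est_a == est_b:                # agreement at every state: nothing left to learn
            break
    return history

# Toy example (hypothetical): nine states, X is the indicator of states {2, 3}.
alice = [0, 0, 0, 1, 1, 1, 2, 2, 2]
bob = [0, 0, 0, 0, 1, 1, 1, 1, 2]
x = [0, 0, 1, 1, 0, 0, 0, 0, 0]
history = run_protocol(alice, bob, x, true_state=0)
for x_a, x_b, var_a, var_b in history:
    print(x_a, x_b, var_a, var_b)
```

In this toy run Alice opens with 1/3, Bob answers 1/2, and in the next round both say 1/3, while the recorded variances only go up, which is the monotonicity the argument above leans on.  The sketch says nothing about the O(1/(δε^2)) bound itself; it just makes the mechanics visible.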

There are lots of other results in my paper, including an upper bound on the number of calls that Alice and Bob need to make to a “sampling oracle” to carry out this sort of protocol approximately, assuming they’re not perfect Bayesians but agents with bounded computational power.  But let me step back and address the broader question: what should we make of all this?  How should we live with the gargantuan chasm between the prediction of Bayesian rationality for how we should disagree, and the actual facts of how we do disagree?

We could simply declare that human beings are not well-modeled as Bayesians with common priors—that we’ve failed in giving a descriptive account of human behavior—and leave it at that.   OK, but that would still leave the question: does this stuff have normative value?  Should it affect how we behave, if we want to consider ourselves honest and rational?  I would argue, possibly yes.

Yes, you should constantly ask yourself the question: “would I still be defending this opinion, if I had been born as someone else?”  (Though you might say this insight predates Aumann by quite a bit, going back at least to Spinoza.)

Yes, if someone you respect as honest and rational disagrees with you, you should take it as seriously as if the disagreement were between two different aspects of yourself.

Finally, yes, we can try to judge epistemic communities by how closely they approach the Aumannian ideal.  In math and science, in my experience, it’s common to see two people furiously arguing with each other at a blackboard.  Come back five minutes later, and they’re arguing even more furiously, but now their positions have switched.  As we’ve seen, that’s precisely what the math says a rational conversation should look like.  In social and political discussions, though, usually the very best you’ll see is that two people start out diametrically opposed, but eventually one of them says “fine, I’ll grant you this,” and the other says “fine, I’ll grant you that.”  We might say, that’s certainly better than the common alternative, of the two people walking away even more polarized than before!  Yet the math tells us that even the first case—even the two people gradually getting closer in their views—is nothing at all like a rational exchange, which would involve the two participants repeatedly leapfrogging each other, completely changing their opinion about the question under discussion (and then changing back, and back again) every time they learned something new.  The first case, you might say, is more like haggling—more like “I’ll grant you that X is true if you grant me that Y is true”—than like our ideal friendly mathematicians arguing at the blackboard, whose acceptance of new truths is never slow or grudging, never conditional on the other person first agreeing with them about something else.

Armed with this understanding, we could try to rank fields by how hard it is to have an Aumannian conversation in them.  At the bottom—the easiest!—is math (or, let’s say, chess, or debugging a program, or fact-heavy fields like lexicography or geography).  Crucially, here I only mean the parts of these subjects with agreed-on rules and definite answers: once the conversation turns to whose theorems are deeper, or whose fault the bug was, things can get arbitrarily non-Aumannian.  Then there’s the type of science that involves messy correlational studies (I just mean, talking about what’s a risk factor for what, not the political implications).  Then there’s politics and aesthetics, with the most radioactive topics like Israel/Palestine higher up.  And then, at the very peak, there’s gender and social justice debates, where everyone brings their formative experiences along, and absolutely no one is a disinterested truth-seeker, and possibly no Aumannian conversation has ever been had in the history of the world.

I would urge that even at the very top, it’s still incumbent on all of us to try to make the Aumannian move, of “what would I think about this issue if I were someone else and not me?  If I were a man, a woman, black, white, gay, straight, a nerd, a jock?  How much of my thinking about this represents pure Spinozist reason, which could be ported to any rational mind, and how much of it would get lost in translation?”

Anyway, I’m sure some people would argue that, in the end, the whole framework of Bayesian agents, common priors, common knowledge, etc. can be chucked from this discussion like so much scaffolding, and the moral lessons I want to draw boil down to trite advice (“try to see the other person’s point of view”) that you all knew already.  Then again, even if you all knew all this, maybe you didn’t know that you all knew it!  So I hope you gained some new information from this talk in any case.  Thanks.


Update: Coincidentally, there’s a moving NYT piece by Oliver Sacks, which (among other things) recounts his experiences with his cousin, the Aumann of Aumann’s theorem.


Another Update: If I ever did attempt an Aumannian conversation with someone, the other Scott A. would be a candidate! Here he is in 2011 making several of the same points I did above, using the same examples (I thank him for pointing me to his post).

Celebrate gay marriage—and its 2065 equivalent

Saturday, June 27th, 2015

Yesterday was a historic day for the United States, and I was as delighted as everyone else I know.  I’ve supported gay marriage since the mid-1990s, when as a teenager, I read Andrew Hodges’ classic biography of Alan Turing, and burned with white-hot rage at Turing’s treatment.  In the world he was born into—our world, until fairly recently—Turing was “free”: free to prove the unsolvability of the halting problem, free to help save civilization from the Nazis, just not free to pursue the sexual and romantic fulfillment that nearly everyone else took for granted.  I resolved then that, if I was against anything in life, I was against the worldview that had hounded Turing to his death, or anything that even vaguely resembled it.

So I’m proud for my country, and I’m thrilled for my gay friends and colleagues and relatives.  At the same time, seeing my Facebook page light up with an endless sea of rainbow flags and jeers at Antonin Scalia, there’s something that gnaws at me.  To stand up for Alan Turing in 1952 would’ve taken genuine courage.  To support gay rights in the 60s, 70s, 80s, even the 90s, took courage.  But celebrating a social change when you know all your friends will upvote you, more than a decade after the tide of history has made the change unstoppable?  It’s fun, it’s righteous, it’s justified, I’m doing it myself.  But let’s not kid ourselves by calling it courageous.

Do you want to impress me with your moral backbone?  Then go and find a group that almost all of your Facebook friends still consider it okay, even praiseworthy, to despise and mock, for moral failings that either aren’t failings at all or are no worse than the rest of humanity’s.  (I promise: once you start looking, it shouldn’t be hard to find.)  Then take a public stand for that group.

Can blog posts nourish the soul? Scott A. (alas, not me) as existence proof

Wednesday, June 3rd, 2015

Reading the essays and speculative fiction of Scott Alexander, as they’ve grown in awesomeness even just within the past half-year, has for me been like witnessing the birth of a new Asimov.  (For more Alexandery goodness, check out Universal Love, Said the Cactus Person.)  That this nerd-bard, this spinner of stupid Internet memes into reflections on eternity, came to my attention by way of his brilliantly defending me, is almost immaterial at this point; I don’t think it plays any role in my continuing admiration for his work.  Whatever you do, just keep writing, other Scott A.

NSA in P/poly: The Power of Precomputation

Friday, May 22nd, 2015

Even after the Snowden revelations, there remained at least one big mystery about what the NSA was doing and how.  The NSA’s classified 2013 budget request mentioned, as a priority item, “groundbreaking cryptanalytic capabilities to defeat adversarial cryptography and exploit internet traffic.”  There was a requested increase, of several hundred million dollars, for “cryptanalytic IT services” and “cryptanalysis and exploitation services program C” (whatever that was).  And a classified presentation slide showed encrypted data being passed to a high-performance computing system called “TURMOIL,” and decrypts coming out.  But whatever was going on inside TURMOIL seemed to be secret even within NSA; someone at Snowden’s level wouldn’t have had access to the details.

So, what was (or is) inside the NSA’s cryptanalytic black box?  A quantum computer?  Maybe even one that they bought from D-Wave?  (Rimshot.)  A fast classical factoring algorithm?  A proof of P=NP?  Commentators on the Internet rushed to suggest each of these far-reaching possibilities.  Some of us tried to pour cold water on these speculations—pointing out that one could envision many scenarios that were a little more prosaic, a little more tied to the details of how public-key crypto is actually used in the real world.  Were we just naïve?

This week, a new bombshell 14-author paper (see also the website) advances an exceedingly plausible hypothesis about what may have been the NSA’s greatest cryptanalytic secret of recent years.  One of the authors is J. Alex Halderman of the University of Michigan, my best friend since junior high school, who I’ve blogged about before.  Because of that, I had some advance knowledge of this scoop, and found myself having to do what regular Shtetl-Optimized readers will know is the single hardest thing in the world for me: bite my tongue and not say anything.  Until now, that is.

Besides Alex, the other authors are David Adrian, Karthikeyan Bhargavan, Zakir Durumeric, Pierrick Gaudry, Matthew Green, Nadia Heninger, Drew Springall, Emmanuel Thomé, Luke Valenta, Benjamin VanderSloot, Eric Wustrow, Santiago Zanella-Béguelin, and Paul Zimmermann (two of these, Green and Heninger, have previously turned up on Shtetl-Optimized).

These authors study vulnerabilities in Diffie-Hellman key exchange, the “original” (but still widely-used) public-key cryptosystem, the one that predates even RSA.  Diffie-Hellman is the thing where Alice and Bob first agree on a huge prime number p and a number g, then Alice picks a secret a and sends Bob g^a (mod p), and Bob picks a secret b and sends Alice g^b (mod p), and then Alice and Bob can both compute (g^a)^b=(g^b)^a=g^(ab) (mod p), but an eavesdropper who’s listening in only knows p, g, g^a (mod p), and g^b (mod p), and one can plausibly conjecture that it’s hard from those things alone to get g^(ab) (mod p).  So then Alice and Bob share a secret unknown to the eavesdropper, which they didn’t before, and they can use that secret to start doing cryptography.
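
The exchange just described fits in a few lines of Python.  This is a toy sketch only: the prime 2^127-1 and the base g=3 are arbitrary choices made here for illustration (far too small, and not a vetted group), and the variable names are hypothetical.

```python
import secrets

p = 2**127 - 1   # a Mersenne prime, used here only as a toy modulus (assumption)
g = 3            # arbitrary public base for illustration (assumption)

a = secrets.randbelow(p - 2) + 1   # Alice's secret exponent
b = secrets.randbelow(p - 2) + 1   # Bob's secret exponent

A = pow(g, a, p)   # Alice sends g^a (mod p)
B = pow(g, b, p)   # Bob sends g^b (mod p)

shared_alice = pow(B, a, p)   # (g^b)^a (mod p)
shared_bob = pow(A, b, p)     # (g^a)^b (mod p)

print(shared_alice == shared_bob)
```

The eavesdropper sees p, g, A, and B, but never a or b; both parties end up holding g^(ab) (mod p).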

As far as anyone knows today, the best way to break Diffie-Hellman is simply by calculating discrete logarithms: that is, solving the problem of recovering a, given only g and h=g^a (mod p).  At least on a classical computer, the fastest known algorithm for discrete logarithms (over fields of prime order) is the number field sieve (NFS).  Under plausible conjectures about the distribution of “smooth” numbers, NFS uses time that grows like exp((1.923+o(1))(log p)^(1/3)(log log p)^(2/3)), where the exp and logs are base e (and yes, even the lower-order stuff like (log log p)^(2/3) makes a big difference in practice).  Of course, once you know the running time of the best-known algorithm, you can then try to choose a key size (that is, a value of log(p)) that’s out of reach for that algorithm on the computing hardware of today.

(Note that the recent breakthrough of Antoine Joux, solving discrete log in quasipolynomial time in fields of small characteristic, also relied heavily on sieving ideas.  But there are no improvements from this yet for the “original” discrete log problem, over prime fields.)

But there’s one crucial further fact, which has been understood for at least a decade by theoretical cryptographers, but somehow was slow to filter out to the people who deploy practical cryptosystems.  The further fact is that in NFS, you can arrange things so that almost all the discrete-logging effort depends only on the prime number p, and not at all on the specific numbers g and h for which you’re trying to take the discrete log.  After this initial “precomputation” step, you then have a massive database that you can use to speed up the “descent” step: the step of solving g^a=h (mod p), for any (g,h) pair that you want.
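
To get a feel for the “pay once, reuse many times” shape of this, here is a small Python sketch.  To be clear, it is not the number field sieve: it swaps in the much simpler baby-step giant-step method with a deliberately oversized table, and unlike the NFS precomputation, its table depends on g as well as p.  The modulus 2^31-1, the base 7, the table size, and the function names are all assumptions made for illustration.  (The negative exponent in pow needs Python 3.8 or later.)

```python
def precompute(p, g, table_size):
    """One-time cost: store g^j (mod p) -> j for 0 <= j < table_size."""
    table = {}
    cur = 1
    for j in range(table_size):
        table.setdefault(cur, j)
        cur = cur * g % p
    giant = pow(g, -table_size, p)   # g^(-table_size) mod p
    return table_size, table, giant

def descent(h, p, pre):
    """Per-target cost: find some a with g^a = h (mod p) using the shared table."""
    table_size, table, giant = pre
    steps = -(-(p - 1) // table_size)   # ceil((p-1)/table_size) giant steps suffice
    gamma = h % p
    for i in range(steps):
        if gamma in table:
            return i * table_size + table[gamma]
        gamma = gamma * giant % p
    return None   # h is not a power of g

p = 2**31 - 1   # toy prime modulus (assumption)
g = 7           # toy base (assumption)
pre = precompute(p, g, table_size=2**18)   # done once

# The same table then serves every target h.
for secret in (1, 12345, 2**30):
    h = pow(g, secret, p)
    a = descent(h, p, pre)
    print(a is not None and pow(g, a, p) == h)
```

Making the table bigger shifts work from each descent into the one-time step, which is the trade an attacker with a shared prime would be happy to make.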

It’s a little like the complexity class P/poly, where a single, hard-to-compute “advice string” unlocks exponentially many inputs once you have it.  (Or a bit more precisely, one could say that NFS reveals that exponentiation modulo a prime number is sort of a trapdoor one-way function, except that the trapdoor information is subexponential-size, and given the trapdoor, inverting the function is still subexponential-time, but a milder subexponential than before.)

The kicker is that, in practice, a large percentage of all clients and servers that use Diffie-Hellman key exchange use the same few prime numbers p.  This means that, if you wanted to decrypt a large fraction of all the traffic encrypted with Diffie-Hellman, you wouldn’t need to do NFS over and over: you could just do it for a few p’s and cache the results.  This fact can singlehandedly change the outlook for breaking Diffie-Hellman.

The story is different depending on the key size, log(p).  In the 1990s, the US government insisted on “export-grade” cryptography for products sold overseas (what a quaint concept!), which meant that the key size was restricted to 512 bits.  For 512-bit keys, Adrian et al. were able to implement NFS and use it to do the precomputation step in about 7 days on a cluster with a few thousand cores.  After this initial precomputation step (which produced 2.5GB of data), doing the descent, to find the discrete log for a specific (g,h) pair, took only about 90 seconds on a 24-core machine.

OK, but no one still uses 512-bit keys, do they?  The first part of Adrian et al.’s paper demonstrates that, because of implementation issues, even today you can force many servers to “downgrade” to the 512-bit, export-grade keys—and then, having done so, you can stall for time for 90 seconds as you figure out the session key, and then do a man-in-the-middle attack and take over and impersonate the server.  It’s an impressive example of the sort of game computer security researchers have been playing for a long time—but it’s really just a warmup to the main act.

As you’d expect, many servers today are configured more intelligently, and will only agree to 1024-bit keys.  But even there, Adrian et al. found that a large fraction of servers rely on just a single 1024-bit prime (!), and many of the ones that don’t rely on just a few other primes.  Adrian et al. estimate that, for a single 1024-bit prime, doing the NFS precomputation would take about 45 million years using a single core—or to put it more ominously, 1 year using 45 million cores.  If you built special-purpose hardware, that could go down by almost two orders of magnitude, putting the monetary cost at a few hundred million dollars, completely within the reach of a sufficiently determined nation-state.  Once the precomputation was done, and the terabytes of output stored in a data center somewhere, computing a particular discrete log would then take about 30 days using 1 core, or mere minutes using a supercomputer.  Once again, none of this is assuming any algorithmic advances beyond what’s publicly known.  (Of course, it’s possible that the NSA also has some algorithmic advances; even modest ones could obviate the need for special-purpose hardware.)

While writing this post, I did my own back-of-the-envelope, and got that using NFS, calculating a 1024-bit discrete log should be about 7.5 million times harder than calculating a 512-bit discrete log.  So, extrapolating from the 7 days it took Adrian et al. to do it for 512 bits, this suggests that it might’ve taken them about 143,840 years to calculate 1024-bit discrete logs with the few thousand cores they had, or 1 year if they had 143,840 times as many cores (since almost all this stuff is extremely parallelizable).  Adrian et al. mention optimizations that they expect would improve this by a factor of 3, giving us about 100 million core-years, very similar to Adrian et al.’s estimate of 45 million core-years (the lower-order terms in the running time of NFS might account for some of the remaining discrepancy).
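
The back-of-the-envelope above can be reproduced in a few lines of Python.  This sketch simply plugs 512 and 1024 bits into the heuristic NFS running time quoted earlier, exp(1.923 (ln p)^(1/3) (ln ln p)^(2/3)), dropping the o(1) term entirely (an assumption, and a crude one, since the lower-order terms matter in practice).  The function name nfs_cost is made up here.

```python
import math

def nfs_cost(bits, c=1.923):
    """Heuristic NFS cost for a prime of the given bit length, ignoring the o(1) term."""
    ln_p = bits * math.log(2)
    return math.exp(c * ln_p ** (1 / 3) * math.log(ln_p) ** (2 / 3))

ratio = nfs_cost(1024) / nfs_cost(512)   # how much harder 1024 bits is than 512
years = ratio * 7 / 365                  # scaling the 7-day 512-bit precomputation

print(f"ratio ~ {ratio:.3g}, wall-clock ~ {years:.3g} years on the same cluster")
```

The ratio should land in the neighborhood of the 7.5 million figure, and the year count near the estimate given above; small differences come down to rounding.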

Adrian et al. mount a detailed argument in their paper that all of the details about NSA’s “groundbreaking cryptanalytic capabilities” that we learned from the Snowden documents match what would be true if the NSA were doing something like the above.  The way Alex put it to me is that, sure, the NSA might not have been doing this, but if not, then he would like to understand why not—for it would’ve been completely feasible within the cryptanalytic budget they had, and the NSA would’ve known that, and it would’ve been a very good codebreaking value for the money.

Now that we know about this weakness of Diffie-Hellman key exchange, what can be done?

The most obvious solution—but a good one!—is just to use longer keys.  For decades, when applied cryptographers would announce some attack like this, theorists like me would say with exasperation: “dude, why don’t you fix all these problems in one stroke by just, like, increasing the key sizes by a factor of 10?  when it’s an exponential against a polynomial, we all know the exponential will win eventually, so why not just go out to where it does?”  The applied cryptographers explain to us, with equal exasperation in their voices, that there are all sorts of reasons why not, from efficiency to (maybe the biggest thing) backwards-compatibility.  You can’t unilaterally demand 2048-bit keys, if millions of your customers are using browsers that only understand 1024-bit keys.  On the other hand, given the new revelations, it looks like there really will be a big push to migrate to larger key sizes, as the theorists would’ve suggested from their ivory towers.

A second, equally-obvious solution is to stop relying so much on the same few prime numbers in Diffie-Hellman key exchange.  (Note that the reason RSA isn’t vulnerable to this particular attack is that it inherently requires a different composite number N for each public key.)  In practice, generating a new huge random prime number tends to be expensive—taking, say, a few minutes—which is why people so often rely on “standard” primes.  At the least, we could use libraries of millions of “safe” primes, from which a prime for a given session is chosen randomly.

A third solution is to migrate to elliptic-curve cryptography (ECC), which as far as anyone knows today, is much less vulnerable to descent attacks than the original Diffie-Hellman scheme.  Alas, there’s been a lot of understandable distrust of ECC after the Dual_EC_DRBG scandal, in which it came out that the NSA backdoored some of NIST’s elliptic-curve-based pseudorandom generators by choosing particular parameters that it knew how to handle.  But maybe the right lesson to draw is that mod-p groups and elliptic-curve groups both seem to be pretty good for cryptography, but the mod-p groups are less good if everyone is using the same few prime numbers p (and those primes are “within nation-state range”), and the elliptic-curve groups are less good if everyone is using the same few parameters.  (A lot of these things do seem pretty predictable with hindsight, but how many did you predict?)

Many people will use this paper to ask political questions, like: hasn’t the NSA’s codebreaking mission once again usurped its mission to ensure the nation’s information security?  Doesn’t the 512-bit vulnerability that many Diffie-Hellman implementations still face, as a holdover from the 1990s export rules, illustrate why encryption should never be deliberately weakened for purposes of “national security”?  How can we get over the issue of backwards-compatibility, and get everyone using strong crypto?  People absolutely should be asking such questions.

But for readers of this blog, there’s one question that probably looms even larger than those of freedom versus security, openness versus secrecy, etc.: namely, the question of theory versus practice.  Which “side” should be said to have “won” this round?  Some will say: those useless theoretical cryptographers, they didn’t even know how their coveted Diffie-Hellman system could be broken in the real world!  The theoretical cryptographers might reply: of course we knew about the ability to do precomputation with NFS!  This wasn’t some NSA secret; it’s something we discussed openly for years.  And if someone told us how Diffie-Hellman was actually being used (with much of the world relying on the same few primes), we could’ve immediately spotted the potential for such an attack.  To which others might reply: then why didn’t you spot it?

Perhaps the right lesson to draw is how silly such debates really are.  In the end, piecing this story together took a team that was willing to do everything from learning some fairly difficult number theory to coding up simulations to poring over the Snowden documents for clues about the NSA’s budget.  Clear thought doesn’t respect the boundaries between disciplines, or between theory and practice.

(Thanks very much to Nadia Heninger and Neal Koblitz for reading this post and correcting a few errors in it.  For more about this, see Bruce Schneier’s post or Matt Green’s post.)

Five announcements

Saturday, May 16th, 2015

1. Sanjeev Arora sent me a heads-up that there’s a discussion about the future of the STOC conference  at the Windows on Theory blog—in particular, about the idea of turning STOC into a longer “CS theory festival.”  If you have opinions about this, don’t miss the chance to make your voice heard.

2. Back in January, I blogged about a new quantum optimization algorithm by Farhi, Goldstone, and Gutmann, which was notable for being, as far as anyone could tell, the first quantum algorithm to achieve a provably better approximation ratio than the best-known classical algorithm for an NP-hard optimization problem.  Today, I report that a fearsome list of authors—Boaz Barak, Ankur Moitra, Ryan O’Donnell, Prasad Raghavendra, Oded Regev, David Steurer, Luca Trevisan, Aravindan Vijayaraghavan, David Witmer, and John Wright—has put out an eagerly-awaited paper that gives a classical algorithm for the same problem, with better performance than the quantum algorithm’s.  (They write that this “improves both qualitatively and quantitatively” on Farhi et al.’s work; I assume “qualitatively” refers to the fact that the new algorithm is classical.)  What happened, apparently, is that after I blogged (with enthusiasm) about the Farhi et al. result, a bunch of classical complexity theorists read my post and decided independently that they could match or beat the quantum algorithm’s performance classically; then they found out about each other and decided to merge their efforts.  I’m proud to say that this isn’t the first example of this blog catalyzing actual research progress, though it’s probably the best example so far.  [Update: Luca Trevisan now has a great post explaining what happened in much more detail, entitled “How Many Theoreticians Does It Take to Approximate Max 3Lin?”]

Another update: Farhi et al. have posted a new version of their paper, in which they can almost match the performance of the classical algorithm using their quantum algorithm.

3. Jennifer Ouellette has a wonderful article in Quanta magazine about recent progress in AdS/MERA (i.e., “the emergence of spacetime from entanglement”), centered around the ideas of Brian Swingle.  This is one of the main things that I’d love to understand better right now—if I succeed even partially, you’ll know because I’ll write a blog post trying to explain it to others.  See also this blog post by Sean Carroll (about this paper by Ning Bao et al.), and this paper by Pastawski, Yoshida, Harlow, and Preskill, which explicitly mines the AdS/CFT correspondence for examples of quantum error-correcting codes.

4. Celebrity rationalist Julia Galef, who I had the great honor of meeting recently, has a podcast interview with Sean Carroll about why Carroll accepts the many-worlds interpretation.  (Or if, like me, you prefer the written word to the spoken one, click here for a full transcript.)  Unfortunately, Sean is given the opportunity at the end of the interview to recommend one science book to his listeners—just one!—but he squanders it by plugging some weird, self-indulgent thing called Quantum Computing Since Democritus.  Julia also has a YouTube video about what she learned from the interview, but I haven’t yet watched it (is there a transcript?).

5. I came across an insightful if meandering essay about nerd culture by Meredith L. Patterson.  In particular, noticing how the term “nerd” has been co-opted by normal, socially-skilled people, who’ve quickly set about remaking nerd social norms to make them identical to the rest of the world’s norms, Patterson coins the term “weird-nerd” to describe people like herself, who are still nerds in the original sense and who don’t see nerd culture as something horribly, irreparably broken.  As she writes: “We’ll start to feel less defensive when we get some indication — any indication — that our critics understand what parts of our culture we don’t want to lose and why we don’t want to lose them.”  (But is this the start of a linguistic treadmill?  Will we eventually need to talk about weird-weird-nerds, etc.?)