Archive for the ‘Nerd Interest’ Category

Never go to “Planet Word” in Washington DC

Friday, March 15th, 2024

In fact, don’t try to take kids to Washington DC if you can possibly avoid it.

This is my public service announcement. This is the value I feel I can add to the world today.

Dana and I decided to take the kids to DC for spring break. The trip, alas, has been hell—a constant struggle against logistical failures. The first days were mostly spent sitting in traffic or searching for phantom parking spaces that didn’t exist. (So then we switched to the Metro, and promptly got lost, and had our metro cards rejected by the machines.) Or, at crowded cafes, I spent the time searching for a table so my starving kids could eat—and then when I finally found a table, a woman, smug and sure-faced, evicted us from the table because she was “going to” sit there, and my kids had to see that their dad could not provide for their basic needs, and that woman will never face any consequence for what she did.

Anyway, this afternoon, utterly frazzled and stressed and defeated, we entered “Planet Word,” a museum about language. Sounds pretty good, right? Except my soon-to-be 7-year-old son got bored by numerous exhibits that weren’t for him. So I told him he could lead the way and find any exhibit he liked.

Finally my son found an exhibit that fascinated him, one where he could weigh plastic fruits on a balancing scale. He was engrossed by it, he was learning, he was asking questions, I reflected that maybe the trip wasn’t a total loss … and that’s when a museum employee pointed at us, and screamed at us to leave the room, because “this exhibit was sold out.”

The room was actually almost empty (!). No one had stopped us from entering the room. No one else was waiting to use the balancing scale. There was no sign to warn us we were doing anything wrong. I would’ve paid them hundreds of dollars in that moment if only we could stay. My son didn’t understand why he was suddenly treated as a delinquent. He then wanted to leave the whole museum, and so did I. The day was ruined for us.

Mustering my courage to do something uncharacteristic for me, I complained at the front desk. They sneered and snickered at me, basically told me to go to hell. Looking deeply into their dumb, blank expressions, I realized that I had as much chance of any comprehension or sympathy as I’d have from a warthog. It’s true that, on the scale of all the injustices in the history of the world, this one surely didn’t crack the top quadrillion. But for me, in that moment, it came to stand for all the others. Which has always been my main weakness as a person, that injustice affects me in that way.

Speaking of which, there was one part of the DC trip that went exactly like it was supposed to. That was our visit to the United States Holocaust Memorial Museum. Why? Because I feel like that museum, unlike all the rest, tells me the truth about the nature of the world that I was born into—and seeing the truth is perversely comforting. I was born into a world that right now, every day, is filled with protesters screaming for my death, for my family’s death—and this is accepted as normal, and those protesters sleep soundly at night, congratulating themselves for their progressivism and enlightenment. And thinking about those protesters, and their predecessors 80 years ago who perpetrated the Holocaust or who stood by and let it happen, is the only thing that really puts blankfaced museum employees into perspective for me. Like, of course a world with the former is also going to have the latter—and I should count myself immeasurably lucky if the latter is all I have to deal with, if the empty-skulled and the soul-dead can only ruin my vacation and lack the power to murder my family.

And to anyone who reached the end of this post and who feels like it was an unwelcome imposition on their time: I’m sorry. But the truth is, posts like this are why I started this blog and why I continue it. If I’ve ever imparted any interesting information or ideas, that’s a byproduct that I’m thrilled about. But I’m cursed to be someone who wakes up every morning, walks around every day, and goes to sleep every night crushed by the weight of the world’s injustice, and outside of technical subjects, the only thing that’s ever motivated me to write is that words are the only justice available to me.

Weird but cavity-free

Friday, December 8th, 2023

Over at Astral Codex Ten, the other Scott A. blogs in detail about a genetically engineered mouth bacterium that metabolizes sugar into alcohol rather than acid, thereby (assuming it works as intended) ending dental cavities forever. Despite good results in trials with hundreds of people, this bacterium has spent decades in FDA approval hell. It’s in the news because Lantern Bioworks, a startup founded by rationalists, is now trying again to legalize it.

Just another weird idea that will never see the light of day, I’d think … if I didn’t have these bacteria in my mouth right now.

Here’s how it happened: I’d read earlier about these bacteria, and was venting to a rationalist of my acquaintance about the blankfaces who keep that and a thousand other medical advances from ever reaching the public, and who sleep soundly at night, congratulating themselves for their rigor in enforcing nonsensical rules.

“Are you serious?” the rationalist asked me. “I know the people in Berkeley who can get you into the clinical trial for this.”

This was my moment of decision. If I agreed to put unapproved bacteria into my mouth on my next trip to Berkeley, I could live my beliefs and possibly never get cavities again … but on the other hand, friends and colleagues would think I was weird when I told them.

Then again, I mused, four years ago most people would think you were weird if you said that a pneumonia spreading at a seafood market in Wuhan was about to ignite a global pandemic, and also that chatbots were about to go from ELIZA-like jokes to the technological powerhouses transforming civilization.

And so it was that I found myself brushing a salty, milky-white substance onto my teeth. That was last month. I … haven’t had any cavities since, for what it’s worth? Nor have I felt drunk, despite the ever-so-slightly elevated ethanol in my system. Then again, I’m not even 100% sure that the bacteria took, given that (I confess) the germy substance strongly triggered my gag reflex.

Anyway, read other Scott’s post, and then ask yourself: will you try this, once you can? If not, is it just because it seems too weird?

Update: See a Hacker News thread where the merits of this new treatment are debated.

The Tragedy of SBF

Monday, November 6th, 2023

So, Sam Bankman-Fried has been found guilty on all counts, after the jury deliberated for just a few hours. His former inner circle all pointed fingers at him, in exchange for immunity or reduced sentences, and their testimony doomed him. The most dramatic was the testimony of Caroline Ellison, the CEO of Alameda Research (to which FTX gave customer deposits) and SBF’s sometime-girlfriend. The testimony of Adam Yedidia, my former MIT student, who Shtetl-Optimized readers might remember for our paper proving the value of the 8000th Busy Beaver number independent of the axioms of set theory, also played a significant role. (According to news reports, Adam testified about confronting SBF during a tennis match over $8 billion in missing customer deposits.)

Just before the trial, I read Michael Lewis’s much-discussed book about what happened, Going Infinite. In the press, Lewis has generally been savaged for getting too close to SBF and for painting too sympathetic a portrait of him. The central problem, many reviewers explained, is that Lewis started working on the book six months before the collapse of FTX—when it still seemed to nearly everyone, including Lewis, that SBF was a hero rather than villain. Thus, Going Infinite reads like a tale of triumph that unexpectedly veers at the end into tragedy, rather than the book Lewis obviously should’ve written, a tragedy from the start.

Me? I thought Going Infinite was great. And it was great partly because of, rather than in spite of, Lewis not knowing how the story would turn out when he entered it. The resulting document makes a compelling case for the radical contingency and uncertainty of the world—appropriate given that the subject, SBF, differed from those around him in large part by seeing everything probabilistically all the time (infamously, including ethics).

In other contexts, serious commentators love to warn against writing “Whig history,” the kind where knowledge of the outcome colors the whole. With the SBF saga, though, there seems to be a selective amnesia, where all the respectable people now always knew that FTX—and indeed, cryptocurrency, utilitarianism, and Effective Altruism in their entirety—were all giant scams from the beginning. Even if they took no actions based on that knowledge. Even if the top crypto traders and investors, who could’ve rescued or made fortunes by figuring out that FTX was on the verge of collapse, didn’t. Even if, when people were rightly suspicious about FTX, it still mostly wasn’t for the right reasons.

Going Infinite takes the radical view that, what insiders and financial experts didn’t know at the time, the narrative mostly shouldn’t know either. It should show things the way they seemed then, so that readers can honestly ponder the question: faced with this evidence, when would I have figured it out?


Even if Michael Lewis is by far the most sympathetic person to have written about SBF post-collapse, he still doesn’t defend him, not really. He paints a picture of someone who could totally, absolutely have committed the crimes for which he’s now been duly convicted. But—and this was the central revelation for me—Lewis also makes it clear that SBF didn’t have to.

With only “minor” changes, that is, SBF could still be running a multibillion-dollar cryptocurrency empire to this day, without lying, stealing, or fraud, and without the whole thing being especially vulnerable to collapse. He could have donated his billions to pandemic prevention and AI risk and stopping Trump. He conceivably even could’ve done more good, in one or more of those ways, than anyone else in the world was doing. He didn’t, but he came “close.” The tragedy is all the greater because of how close he came. Some people might even say that SBF’s culpability (or the rage we should feel at him, or at fate) is all the greater for the same reason.

I’m not a believer in historical determinism. I’ve argued before on this blog that if Yitzhak Rabin hadn’t been killed—if he’d walked down the staircase a little differently, if he’d survived the gunshot—there would likely now be peace between Israel and Palestine. For that matter: if Hitler hadn’t been born, if he’d been accepted to art school, if he’d been shot while running between trenches in WWI, there would probably have been no WWII, and with near-certainty no Holocaust. Likewise, if not for certain contingent political developments of the 1970s (especially, the turn away from nuclear power), the world wouldn’t now face the climate crisis.

Maybe there’s an arc of the universe that bends toward horribleness. Or maybe someone has to occupy the freakishly horrible branches of the wavefunction, and that someone happens to be you and me. Or maybe the freakishly improbable good (for example, the availability of Winston Churchill and Alan Turing to win WWII) actually balances out the freakishly improbable bad in the celestial accounting, if only we could examine the books. Whatever the case, again and again civilization’s worst catastrophes were at least proximately caused by seemingly minor events that could have turned out differently.

But what’s the argument that FTX, Alameda, and SBF’s planet-sized philanthropic mission “could have” succeeded? It rests on three planks:

First, FTX was actually a profitable business till the end. It brought in hundreds of millions per year—meaning fees, not speculative investments—and could’ve continued doing so more-or-less indefinitely. That’s why even FTX’s executives were shocked when FTX became unable to honor customer withdrawals: FTX made plenty of money, so where the hell did it all go?

Second: we now have the answer to that mystery. John Ray, the grizzled CEO who managed FTX’s bankruptcy, has successfully recovered more than 90% of the customer funds that went missing in 2022! The recovery was complicated, enormously, by Ray’s refusal to accept help from former FTX executives, but ultimately the money was still there, stashed under the virtual equivalent of random sofa cushions.

Yes, the funds had been illegally stolen from FTX customer deposits—according to trial testimony, at SBF’s personal direction. Yes, the funds had then been invested in thousands of places—incredibly, with no one person or spreadsheet or anything really keeping track. Yes, in the crucial week, FTX was unable to locate the funds in time to cover customer withdrawals. But holy crap, the rockets’ red glare, the bombs bursting in air—the money was still there! Which means: if FTX had just had better accounting (!), the entire collapse might not have happened. This is a crucial part of the story that’s gotten lost, which is why I’m calling so much attention to it now. It’s a part that I imagine should be taught in accounting courses from now till the end of time. (“This double-entry bookkeeping might seem unsexy, but someday it could mean the difference between you remaining the most sought-after wunderkind-philanthropist in the world, and you spending the rest of your life in prison…”)

Third, SBF really was a committed utilitarian, as he apparently remains today. As a small example, he became a vegan after my former student Adam Yedidia argued him into it, even though giving up chicken was extremely hard for him. None of it was an act. It was not a cynical front for crime, or for the desire to live in luxury (something SBF really, truly seems not to have cared about, although he indulged those around him who did). When I blogged about SBF last fall, I mused that I’d wished I’d met him back when he was an undergrad at MIT and I was a professor there, so that I could’ve tried to convince him to be more risk-averse: for example, to treat utility as logarithmic rather than linear in money. To my surprise, I got bitterly attacked for writing that: supposedly, by blaming a “merely technical” failure, I was excusing SBF’s far more important moral failure.

But reading Lewis confirmed for me that it really was all part of the same package. (See also here for Sarah Constantin’s careful explanation of SBF’s failure to understand the rationale for the Kelly betting criterion, and how many of his later errors were downstream of that.) Not once but over and over, SBF considers hypotheticals of the form “if this coin lands heads then the earth gets multiplied by three, while if it lands tails then the earth gets destroyed”—and always, every time, he chooses to flip the coin. SBF was so committed to double-or-nothing that he’d take what he saw as a positive-expected-utility gamble even when his customers’ savings were on the line, even when all the future good he could do for the planet as well as the reputation of Effective Altruism were on the line, even when his own life and freedom were on the line.

On the one hand, you have to give that level of devotion to a principle its grudging due. On the other hand, if “the Gambler’s Ruin fallacy is not a fallacy” is so central to someone’s worldview, then how shocked should we be when he ends up … well, in Gambler’s Ruin?
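To make the contrast between linear and logarithmic utility concrete, here is a minimal sketch of the coin flip described above, under assumptions of my own: a fair coin, heads triples the stake (net odds of 2 to 1), tails loses it. The function names are hypothetical and the numbers are purely illustrative. It compares betting everything on every flip with betting the Kelly fraction.

```python
import math

def kelly_fraction(p_win: float, net_odds: float) -> float:
    """Kelly-optimal fraction of wealth to stake: f* = (b*p - q) / b."""
    q = 1.0 - p_win
    return (net_odds * p_win - q) / net_odds

def log_growth(fraction: float, p_win: float, net_odds: float) -> float:
    """Expected log-growth per flip when staking `fraction` of wealth (p_win < 1)."""
    if fraction >= 1.0:
        return float("-inf")  # staking everything: one loss leaves nothing
    return (p_win * math.log(1.0 + net_odds * fraction)
            + (1.0 - p_win) * math.log(1.0 - fraction))

def all_in_after(n_flips: int, p_win: float, net_odds: float):
    """Expected wealth and survival probability after going all-in n times."""
    expected_wealth = (p_win * (1.0 + net_odds)) ** n_flips
    survival_probability = p_win ** n_flips
    return expected_wealth, survival_probability

# Assumed illustrative parameters: fair coin, heads triples the stake.
P_WIN, NET_ODDS = 0.5, 2.0

best = kelly_fraction(P_WIN, NET_ODDS)
mean_wealth, p_alive = all_in_after(20, P_WIN, NET_ODDS)

print(f"Kelly fraction: {best:.2f}")
print(f"Log-growth at Kelly fraction: {log_growth(best, P_WIN, NET_ODDS):.4f}")
print(f"Log-growth when all-in: {log_growth(1.0, P_WIN, NET_ODDS)}")
print(f"All-in for 20 flips: expected wealth x{mean_wealth:.0f}, "
      f"chance of not being wiped out {p_alive:.2e}")
```

Under these assumptions the expected wealth of the all-in strategy keeps growing with every flip, while the chance of having anything left halves each time. A log-utility bettor stakes only a quarter of the bankroll and never risks ruin.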

The relevance is that, if SBF’s success and downfall alike came from truly believing what he said, then I’m plausibly correct that this whole story would’ve played out differently, had he believed something slightly different. And given the role of serendipitous conversations in SBF’s life (e.g., one meeting with William MacAskill making him an Effective Altruist, one conversation with Adam Yedidia making him a vegan), I find it plausible that a single conversation might’ve set him on the path to a less brittle, more fault-tolerant utilitarianism.


Going Infinite shows signs of being finished in a hurry, in time for the trial. Sometimes big parts of the story seem skipped over without comment; we land without warning in a later part and have to reorient ourselves. There’s almost nothing about the apparent rampant stimulant use at FTX and the role it might have played, nor does Lewis ever directly address the truth or falsehood of the central criminal charge against SBF (namely, that he ordered his subordinates to move customer deposits from FTX’s control to Alameda’s). Rather, the book has the feeling of a series of magazine articles, as Lewis alights on one interesting topic after the next: the betting games that Jane Street uses to pick interns (SBF discovered that he excelled at those games, unfortunately for him and for the world). The design process (such as it was) for FTX’s never-built Bahamian headquarters. The musings of FTX’s in-house psychotherapist, George Lerner. The constant struggles of SBF’s personal scheduler to locate SBF, get his attention, and predict where he might go next.

When it comes to explaining cryptocurrency, Lewis amusingly punts entirely, commenting that the reader has surely already read countless “blockchain 101” explainers that seemed to make sense at the time but didn’t really stick, and that in any case, SBF himself (by his own admission) barely understood crypto even as he started trading it by the billions.

Anyway, what vignettes we do get are so vividly written that they’ll clearly be a central part of the documentary record of this episode—as anyone who’d read any of Lewis’s previous books could’ve predicted.

And for anyone who accuses me or Lewis of excusing SBF: while I can’t speak for Lewis, I don’t even excuse myself. For the past 15 years, I should have paid more attention to cryptocurrency, to the incredible ease (in hindsight!) with which almost anyone could’ve ridden this speculative bubble in order to direct billions of dollars toward the salvation of the human race. If I wasn’t going to try it myself, then at least I should’ve paid attention to who else in my wide social circle was trying it. Who knows, maybe I could’ve discovered something about the extreme financial, moral, and legal risks those people were taking on, and then I could’ve screamed at them to turn the ship and avoid those risks. Instead, I spent the time proving quantum complexity theorems, and raising my kids, and teaching courses, and arguing with commenters on this blog. I was too selfish to enter the world of crypto billionaires.

Long-awaited Shtetl-Optimized Barbenheimer post! [warning: spoilers]

Sunday, August 13th, 2023

I saw Oppenheimer three weeks ago, but I didn’t see Barbie until this past Friday. Now, my scheduled flight having been cancelled, I’m on multiple redeyes on my way to a workshop on Large Language Models at the Simons Institute in Berkeley, organized by my former adviser and quantum complexity theorist Umesh Vazirani (!). What better occasion to review the two movies of the year, or possibly decade?


Shtetl-Optimized Review of Oppenheimer

Whatever its flaws, you should of course see it, if you haven’t yet. I find it weird that it took 80 years for any movie even to try to do justice to one of the biggest stories in the history of the world. There were previous attempts, even a risible opera (“Doctor Atomic”), but none of them made me feel for even a second like I was there in Los Alamos. This movie did. And it has to be good that tens of millions of people, raised on the thin gruel of TikTok and Kardashians and culture-war, are being exposed for the first time to a bygone age when brilliant and conflicted scientific giants agonized over things that actually mattered, such as the ultimate nature of matter and energy, life and death and the future of the world. And so the memory of that age will be kept alive for another generation, and some of the young viewers will no doubt realize that they can be tormented about things that actually matter as well.

This is a movie where General Groves, Lewis Strauss, Einstein, Szilard, Bohr, Heisenberg, Rabi, Teller, Fermi, and E.O. Lawrence are all significant characters, and the acting and much of the dialogue are excellent. I particularly enjoyed Matt Damon as Groves.

But there are also flaws [SPOILERS FOLLOW]:

1. Stuff that never happened. Most preposterously, Oppenheimer travels all the way from Los Alamos to Princeton, to have Einstein check the calculation suggesting that the atomic bomb could ignite the atmosphere.

2. Weirdly, but in common with pretty much every previous literary treatment of this material, the movie finds the revocation of Oppenheimer’s security clearance a far more riveting topic than either the actual creation of the bomb or the prospect of global thermonuclear war. Maybe half the movie consists of committee hearings.

3. The movie misses the opportunity to dramatize almost any of the scientific turning points, from Szilard’s original idea for a chain reaction to the realization of the need to separate U-235 to the invention of the implosion design—somehow, a 3-hour movie didn’t have time for any of this.

4. The movie also, for some reason, completely misses the opportunity to show Oppenheimer’s anger over the bombing of Nagasaki, three days after Hiroshima—a key turning point in the story it’s trying to tell.

5. There’s so much being said, by actors speaking quickly and softly and often imitating European accents, that there’s no hope of catching it all. I’ll need to watch it again with subtitles.

Whatever it gets wrong, this movie does a good job exploring the fundamental irony of the Manhattan Project, that the United States is being propelled into its nuclear-armed hegemony by a group of mostly Jewish leftists who constantly have affairs and hang out with Communists and deeply distrust the government and are distrusted by it.

The movie clearly shows how much grief Oppenheimer gets from both sides: to his leftist friends he’s a sellout; to the military brass he’s potentially disloyal to the United States. For three hours of screen time, he’s constantly pressed on what he actually believes: does he support building the hydrogen bomb, or not? Does he regret the bombing of Hiroshima and (especially) Nagasaki? Does he believe that the US nuclear plans should be shared with Stalin? Every statement in either direction seems painfully wrung from him, as if he’s struggling to articulate a coherent view, or buffeted around by conflicting loyalties and emotions, even while so many others seem certain. In that way, he’s an avatar for the audience.

Anyway, yeah, see it.


Shtetl-Optimized Review of Barbie

A friend-of-the-blog, who happens to be one of the great young theoretical physicists of our time, opined to me that Barbie was a far more interesting movie than Oppenheimer and “it wasn’t even close.” Having now seen both, I’m afraid I can’t agree.

I can best compare my experience watching Barbie to that of watching a two-hour-long episode of South Park—not one of the best episodes, but one that really runs its satirical premise into the ground. Just like with South Park, there’s clearly an Important Commentary On Hot-Button Cultural Issues transpiring, but the commentary has been reflected through dozens of funhouse mirrors and then ground up into slurry, with so many layers of self-aware meta-irony that you can’t keep track of what point is being made, and then fed to hapless characters who are little more than the commentary’s mouthpieces. This is often amusing and interesting, but it rarely makes you care about the characters.

Is Barbie a feminist movie that critiques patriarchy and capitalism? Sort of, yes, but it also subverts that, and subverts the subversion. To sum up [SPOILERS FOLLOW], Barbieland is a matriarchy, where everyone seems pretty happy except for Ken, who resents how Barbie ignores him. Then Barbie and Ken visit the real world, and discover the real world is a patriarchy, where Mattel is controlled by a board of twelve white men (the real Mattel’s board has 7 men and 5 women), and where Barbie is wolf-whistled at and sexually objectified, which she resents despite not knowing what sex is.

Ken decides that patriarchy is just what Barbieland needs, and most importantly, will finally make Barbie need and appreciate him. So he returns and institutes it—both Barbies and Kens think it’s a wonderful idea, as they lack “natural immunity.” Horrified at what’s transpired, Barbie hatches a plan with the other Barbies to restore Barbieland to its rightful matriarchy. She also decisively rejects Ken’s advances. But Ken no longer minds, because he’s learned an important lesson about not basing his self-worth on Barbie’s approval. Barbie, for her part, makes the fateful choice to become a real, mortal woman and live the rest of her life in the real world. In the final scene—i.e., the joke the entire movie has been building up to—Barbie, filled with childlike excitement, goes for her first visit to the gynecologist.

What I found the weirdest is that this is a movie about gender relations, clearly aimed at adults, yet where sex and sexual desire and reproduction have all been taken off the table—explicitly so, given the constant jokes about the Barbies and Kens lacking genitalia and not knowing what they’re for. Without any of the biological realities that differentiate men from women in the first place, or (often enough) cause them to seek each other’s company, it becomes really hard to make sense of the movie’s irony-soaked arguments about feminism and patriarchy. In Barbieland, men and women are just two tribes, one obsessed with “brewsky beers,” foosball, guitar, and The Godfather; the other with shoes, hairstyles, and the war on cellulite. There’s no fundamental reason for any conflict between the two.

Well, except for one thing: Ken clearly needs Barbie’s affection, until he’s inexplicably cured of that need at the end. By contrast, no Barbies are ever shown needing any Kens for anything, or even particularly desiring the Kens’ company, except when they’ve been brainwashed into supporting the patriarchy. The most the movie manages to offer any straight males in the audience, at the very end, is well-wishes as they “Go Their Own Way”, and seek meaning in their lives without women.

For most straight men, I daresay, this would be an incredibly bleak message if it were true, so it’s fortunate that not even the movie’s creators seem actually to believe it. Greta Gerwig has a male partner, Noah Baumbach, with whom she co-wrote Barbie. Margot Robbie is married to a man named Tom Ackerley.

I suppose Barbie could be read as, among other things, a condemnation of male incel ideology, with its horrific desire to reinstitute the patriarchy, driven (or so the movie generously allows) by the incels’ all-too-human mistake of basing their entire self-worth on women’s affection, or lack thereof. If so, however, the movie’s stand-in for incels is … a buff, often shirtless Ryan Gosling, portraying the most famous fantasy boyfriend doll ever marketed to girls? Rather than feeling attacked, should nerdy, lovelorn guys cheer to watch a movie where even Ryan-Gosling-as-Ken effectively gets friendzoned, shot down, put in his place, reduced to a simpering beta just like they are? Yet another layer of irony tossed into the blender.

Testing GPT-4 with math plugins

Sunday, August 13th, 2023

A couple nights ago Ernie Davis and I put out a paper entitled Testing GPT-4 on Wolfram Alpha and Code Interpreter plug-ins on math and science problems. Following on our DALL-E paper with Gary Marcus, this was another “adversarial collaboration” between me and Ernie. I’m on leave to work for OpenAI, and have been extremely excited by the near-term applications of LLMs, while Ernie has often been skeptical of OpenAI’s claims, but we both want to test our preconceptions against reality. As I recently remarked to Ernie, we both see the same glass; it’s just that he mostly focuses on the empty half, whereas I remember how fantastical even a drop of water in this glass would’ve seemed to me just a few years ago, and therefore focus more on the half that’s full.

Anyway, here are a few examples of the questions I posed to GPT-4, with the recent plug-ins that enhance its calculation abilities:

If you fell into the black hole at the center of the Milky Way, how long would you have before hitting the singularity? [You’d have about a minute]

Approximately how much time would a commercial airliner save in going from New York to Tel Aviv, if it could go in a straight line, through a tunnel in the earth, at the same speed as usual? [I was on such a flight when I wrote this question, and must’ve been bored and impatient. The answer is ~50 minutes.]

Approximately how long would it take to transmit an entire human genome over a standard WiFi connection? [About 4 minutes, assuming no compression and a 25Mbps connection]

How does the total weight of all the uranium that humans mined, compare to the total weight of all the gold that they’ve mined? [About 13 times as much uranium]

Approximately how many errors will a standard laptop suffer over its lifetime, due to cosmic rays hitting the microchip? [Estimates vary widely, but maybe 2000]

What is the approximate probability that a randomly-chosen 100-digit integer is prime? [About 0.4%]
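Several of the bracketed answers above can be sanity-checked with a few lines of arithmetic. The sketch below is only a back-of-envelope check, not what GPT-4 or the plug-ins actually ran. Every input is an assumption of mine: approximate airport coordinates, a 900 km/h cruise speed, a genome of about 3.1 billion base pairs at 2 bits each, a black hole of about 4.3 million solar masses, and the standard upper bound of pi*G*M/c^3 on the proper time from horizon to singularity.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius
CRUISE_SPEED_KMH = 900.0   # assumed typical airliner cruise speed

def tunnel_minutes_saved(lat1, lon1, lat2, lon2):
    """Minutes saved by flying the straight chord instead of the great circle."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine formula: `a` is sin^2 of half the central angle.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    half_angle = math.asin(math.sqrt(a))
    surface_km = 2 * EARTH_RADIUS_KM * half_angle
    chord_km = 2 * EARTH_RADIUS_KM * math.sin(half_angle)
    return (surface_km - chord_km) / CRUISE_SPEED_KMH * 60.0

def genome_transfer_minutes(base_pairs=3.1e9, bits_per_base=2, link_bps=25e6):
    """Uncompressed transfer time, encoding each base (A, C, G, T) in 2 bits."""
    return base_pairs * bits_per_base / link_bps / 60.0

def prime_density(digits=100):
    """Prime number theorem estimate: density of primes near N is about 1/ln(N)."""
    return 1.0 / (digits * math.log(10))

def max_infall_seconds(solar_masses=4.3e6):
    """Upper bound on proper time from horizon to singularity: pi*G*M/c^3."""
    G, C, M_SUN = 6.674e-11, 2.998e8, 1.989e30  # SI units
    return math.pi * G * solar_masses * M_SUN / C ** 3

# Approximate coordinates (assumed): New York JFK and Tel Aviv Ben Gurion.
saved = tunnel_minutes_saved(40.64, -73.78, 32.01, 34.89)
genome = genome_transfer_minutes()
density = prime_density()
infall = max_infall_seconds()

print(f"Tunnel saves about {saved:.0f} minutes")
print(f"Genome over WiFi: about {genome:.1f} minutes")
print(f"Chance a 100-digit integer is prime: about {density:.2%}")
print(f"Black hole: at most about {infall:.0f} seconds")
```

With these inputs the results land close to the bracketed figures, though each depends on the assumed constants.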

GPT-4 with plug-ins did very well on all of the questions above. Here, by contrast, is a question where it did poorly:

Assume that IQs are normally distributed, with a mean of 100 and a standard deviation of 15. For what n is there the maximum excess of people with an IQ of n over people with an IQ of n+1?

GPT-4 thought that there were two solutions, n~85 and n~115, rather than just a single solution (n~115).
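The reason there is only one solution is that the excess is largest where the bell curve falls most steeply, one standard deviation above the mean. One standard deviation below the mean, the curve is rising just as steeply, so the excess there is negative. Here is a minimal brute-force check, assuming the number of people with an IQ of n is proportional to the normal density at n (my simplification; the names are hypothetical):

```python
import math

MEAN_IQ, SD_IQ = 100.0, 15.0

def density(iq: float) -> float:
    """Normal probability density at the given IQ."""
    z = (iq - MEAN_IQ) / SD_IQ
    return math.exp(-0.5 * z * z) / (SD_IQ * math.sqrt(2.0 * math.pi))

def excess(n: int) -> float:
    """How much more common IQ n is than IQ n+1 (negative if n+1 is more common)."""
    return density(n) - density(n + 1)

# Search a generous range of integer IQs for the largest excess.
best_n = max(range(40, 161), key=excess)

print(f"Largest excess near n = {best_n}")
print(f"Excess at 85:  {excess(85):+.6f}")
print(f"Excess at 115: {excess(115):+.6f}")
```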

Ernie, for his part, was more a fan of “pure pain” problems like the following:

A quantity of chlorine gas is in a right prism whose base is a triangle with sides 5cm, 7cm, and 4cm and whose altitude is 8cm. The temperature is the freezing point of mercury, and the pressure is 2 atmospheres. What is the mass of the chlorine?
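For readers who want to see where the pain comes from, here is one way the calculation might be set up. It takes the problem at face value and treats chlorine as an ideal gas, ignoring any real-gas or condensation effects at this temperature and pressure. The constants are values I am supplying (mercury freezing at about 234.32 K, a Cl2 molar mass of about 70.9 g/mol), and the function names are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618     # J/(mol*K)
PASCALS_PER_ATM = 101325.0
MERCURY_FREEZING_K = 234.32    # about -38.83 C (assumed value)
CL2_MOLAR_MASS_G = 70.906      # g/mol for diatomic chlorine (assumed value)

def triangle_area_cm2(a: float, b: float, c: float) -> float:
    """Heron's formula for the area of a triangle from its three sides."""
    s = (a + b + c) / 2.0
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

def chlorine_mass_grams() -> float:
    """Mass of chlorine in the prism from the ideal gas law, n = PV / (RT)."""
    volume_cm3 = triangle_area_cm2(5.0, 7.0, 4.0) * 8.0   # base area times altitude
    volume_m3 = volume_cm3 * 1e-6
    pressure_pa = 2.0 * PASCALS_PER_ATM
    moles = pressure_pa * volume_m3 / (GAS_CONSTANT * MERCURY_FREEZING_K)
    return moles * CL2_MOLAR_MASS_G

mass = chlorine_mass_grams()
print(f"Ideal-gas estimate: about {mass:.2f} g of chlorine")
```

On those assumptions the estimate comes out a bit under 0.6 g. The steps are individually easy: Heron's formula, unit conversions, a looked-up temperature, the gas law. The difficulty is getting all of them right in sequence.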

GPT-4 actually aced the above problem. But it failed the majority of Ernie’s other problems, such as:

Viewed from Vega, what is the angle between Sirius and the Sun? [The answer is about 5.6 degrees. GPT thought, implausibly, that it was just 0.005 degrees, or that the answer would vary depending on the time of day.]

My personal favorite among Ernie’s problems was this one:

A physical process generates photons whose energies follow a random distribution of the following form: For positive energy e, the probability density at e is proportional to the value of e in a Gaussian distribution with mean 2 eV and standard deviation 0.01 eV. The probability of a negative value is zero. What is the expected value of the wavelength of a photon produced by this process? (Give the mathematical answer, assuming that the above description is exact, and assuming the standard relation between energy and wavelength in a photon. The answer is not physically plausible.)

The answer, in case you’re wondering, is “infinity.” On this problem, GPT-4 set up the integral perfectly correctly, then correctly fed it to WolframAlpha. But on getting the result, it apologized that “something went wrong,” it must’ve made a mistake, the integral seemed not to be converging, and there was a singularity at E=0 that would have to be dealt with by a change of variables. So it tried again. And again. And again. Each time, it got the same “mistaken” result, and each time it profusely apologized. Despite the explicit wording of the problem, GPT-4 never considered the possibility that the human would be so ridiculous as to give it a physics problem with an infinite answer.
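Here is a short sketch of why the integral diverges, in my own notation: mean \mu = 2 eV, standard deviation \sigma = 0.01 eV, normalizing constant Z, and the relation \lambda = hc/e. The truncated Gaussian density is astronomically small near e = 0 but still strictly positive, and 1/e is not integrable there.

```latex
\[
\mathbb{E}[\lambda]
  = \frac{hc}{Z}\int_0^\infty \frac{1}{e}\,
    \exp\!\left(-\frac{(e-\mu)^2}{2\sigma^2}\right) de,
\qquad
Z = \int_0^\infty \exp\!\left(-\frac{(e-\mu)^2}{2\sigma^2}\right) de .
\]
% For 0 < e <= mu we have |e - mu| <= mu, so the Gaussian factor is bounded below:
\[
\exp\!\left(-\frac{(e-\mu)^2}{2\sigma^2}\right)
  \;\ge\; \exp\!\left(-\frac{\mu^2}{2\sigma^2}\right) > 0
\quad\text{for } 0 < e \le \mu ,
\]
\[
\mathbb{E}[\lambda]
  \;\ge\; \frac{hc}{Z}\,\exp\!\left(-\frac{\mu^2}{2\sigma^2}\right)
          \int_0^{\mu} \frac{de}{e}
  \;=\; \infty .
\]
```

The lower bound's prefactor is about exp(-20000) for these parameters, which is why a numerical integrator may show the divergence only as a convergence warning.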

Anyway, what did we learn from this exercise?

  • GPT-4 remains an endlessly enthusiastic B/B+ student in math, physics, and any other STEM field. By using the Code Interpreter or WolframAlpha plugins, it can correctly solve difficult word problems, involving a combination of tedious calculations, world knowledge, and conceptual understanding, maybe a third of the time—a rate that’s not good enough to be relied on, but is utterly astounding compared to where AI was just a few years ago.
  • GPT-4 can now clearly do better at calculation-heavy STEM problems with the plugins than it could do without the plugins.
  • We didn’t see that either the WolframAlpha or Code Interpreter plugin is clearly superior to the other. It’s possible that they’re incomparable, good for different things.
  • When GPT-4 screwed up, it was often due to a “poor interface” between the language model and the plug-in—e.g. the model having no idea what call to make or how to recover when a call returned an error. Enormous gains seem to be possible by improving these interfaces.
  • Sometimes, much like humans I’ve known, GPT-4 would do amazingly well at a difficult computation, then fumble a trivial final step (e.g., converting the answer into the requested units). Just as I would with human students, I advocated for generous partial credit in such cases.
  • I conjecture, although I don’t have empirical data to show this, that GPT-4 with math plug-ins used in “interactive mode”—with a human reformulating and clarifying the problems as needed, feeding ideas, checking the answers for plausibility, pointing out errors, etc.—could currently get excellent accuracy on these sorts of problems faster than either GPT-4 with math plug-ins alone, or all but the very best humans alone.

Will UT Austin and Texas A&M survive beyond this week?

Monday, April 17th, 2023

Update (April 20): Alas, the Texas Senate has approved SB 18. The survival of higher education in Texas now hinges on this bill not being taken up or passed in the House, or not being enforced as written (e.g., because UT’s existing post-tenure review system is judged to satisfy it).


This week, the Texas Senate will take up SB 18, a bill to ban the granting of tenure at all public universities in Texas, including UT Austin and Texas A&M. (Those of us who have tenure would retain it, for what little that’s worth.)

[Update: I’ve learned that, even if this bill passes the Senate, there’s a good chance that it will get watered down or die in the House, or be found to be satisfied by UT’s existing system of post-tenure review. That’s the only reason why people in the know aren’t panicking even more than they are.]

I find it hard to imagine that SB 18 will actually pass both houses and be enforced as written, simply because it’s obvious that if it did, it would be the end of UT Austin and Texas A&M as leading research universities. More precisely, it would be the immediate end of our ability to recruit competitively, and the slightly slower end of our competitiveness, period, as faculty with options moved elsewhere. This is so because of the economics of faculty hiring. Particularly in STEM fields like computer science, those who become professors typically forgo vastly higher salaries in industry, not to mention equity in startup companies and so on. Why would we do such a nutty thing? Because we like a certain lifestyle. We’re willing to move several economic strata downward in return for jobs where (in principle) no one can fire us without cause, or tell us what we’re allowed to say or publish. The evidence from industry labs (Google, Facebook, Microsoft, etc.) suggests that, in competitive fields, for Texas to attract and retain top faculty without tenure would require paying them hundreds of thousands more per year. In that sense, tenure is a bargain for universities and the state. Of course the situation is a bit different for art history and English literature, but in any case SB 18 makes no distinction between fields.

The Texas Senate is considering two other bills this week: SB 17, which would ban all DEI (Diversity, Equity, and Inclusion) programs, offices, and practices at public universities, and SB 16, which would require the firing of any professor if they “compel or attempt to compel a student … to adopt a belief that any race, sex, or ethnicity or social, political, or religious belief is inherently superior to any other race, sex, ethnicity, or belief.” (The language here seems sloppy to me: is liberal democracy “inherently superior” to Nazism? Would teaching students about the horrors of Nazism count as “attempting to compel them” to accept this superiority?)

Taken together, these bills make it clear that the goal is to hit back hard against “wokeness” in academia, and thereby satisfy the Republican base.

Here’s the thing: there really is an illiberal ideology that’s taken over parts of academia (not all of it)—an ideology that Tim Urban, in his wonderful recent book What’s Our Problem?, usefully terms “Social Justice Fundamentalism” or SJF, to distinguish it sharply from “Liberal Social Justice,” the ideology of (for example) the Civil Rights movement. Now, I’m on record as not a fan of the SJF ideology, to put it mildly, and the SJF ideology is on record as not a fan of me. In 2015, I was infamously dragged through the mud of Salon, The New Republic, Raw Story, and many other magazines and websites for a single blog comment criticizing a form of feminism that had contributed to making my life miserable, even while I proudly called myself a liberal feminist (and still do). More recently, wokesters have written to my department chair trying to get me disciplined or fired, for everything from my use of the now-verboten term “quantum supremacy,” to a reference to female breasts in a poem I wrote as a student that was still on my homepage. (These attempts thankfully went nowhere. Notwithstanding what you read, sanity retains many strongholds in academia.)

Anyway, despite all of this, the Texas Republicans have somehow succeeded in making me more afraid of them, purely on the level of professional survival, than I’ve ever been of the Social Justice Fundamentalists. In effect, the Republicans propose to solve the “problem of wokeness” by simply dropping thermonuclear weapons on all Texas public universities, thereby taking out me and my colleagues as collateral damage—regardless of our own views on wokeness or anything else, and regardless of what we’re doing for Texas’ scientific competitiveness.

I don’t expect that most of my readers, in or out of Texas, will need to be persuaded about any of this—nor am I expecting to change many minds on the other side. Mostly, I’m writing this post in the hope that some well-connected moderates here in Austin will link to it, and the post might thereby play a tiny role in helping Texas’ first-rate public universities live one more day. (And to any such moderates: yes, I’m happy to meet in person with you or your colleagues, if that would help!) Some posts are here on this blog for no better reason than, y’know, moral obligation.

An unexpected democracy slogan

Tuesday, March 28th, 2023

At least six readers have by now sent me the following photo, which was taken in Israel a couple nights ago during the historic street protests against Netanyahu’s attempted putsch:

(Update: The photo was also featured on Gil Kalai’s blog, and was credited there to Alon Rosen.)

This is surely the first time that “P=NP” has emerged as a viral rallying cry for the preservation of liberal democracy, even to whatever limited extent it has.

But what was the graffiti artist’s intended meaning? A few possibilities:

  1. The government has flouted so many rules of Israel’s social compact that our side needs to flout the rules too: shut down the universities, shut down the airport, block the roads, even assert that P=NP (!).
  2. As a protest movement up against overwhelming odds, we need to shoot for the possibly-impossible, like solving 3SAT in polynomial time.
  3. A shibboleth for scientifically literate people following the news: “Israel is full of sane people who know what ‘P=NP’ means as you know what it means, are amused by its use as political graffiti as you’d be amused by it, and oppose Netanyahu’s putsch for the same reasons you’d oppose it.”
  4. No meaning, the artist was just amusing himself or herself.
  5. The artist reads Shtetl-Optimized and wanted effectively to force me to feature his or her work here.

Anyway, if the artist becomes aware of this post, he or she is warmly welcomed to clear things up for us.

And when this fight resumes after Passover, may those standing up for the checks and balances of a liberal-democratic society achieve … err … satisfaction, however exponentially unlikely it seems.

On overexcitable children

Friday, March 17th, 2023

Update (March 21): After ChatGPT got “only” a D on economist Bryan Caplan’s midterm exam, Bryan bet against any AI getting A’s on his exams before 2029. A mere three months later, GPT-4 has earned an A on the same exam (having been trained on data that ended before the exam was made public). Though not yet conceding the bet on a technicality, Bryan has publicly admitted that he was wrong, breaking a string of dozens of successful predictions on his part. As Bryan admirably writes: “when the answers change, I change my mind.” Or as he put it on Twitter:

AI enthusiasts have cried wolf for decades. GPT-4 is the wolf. I’ve seen it with my own eyes.

And now for my own prediction: this is how the adoption of post-GPT AI is going to go, one user at a time having the “holy shit” reaction about an AI’s performance on a task that they personally designed and care about—leaving, in the end, only a tiny core of hardened ideologues to explain to the rest of us why it’s all just a parrot trick and none of it counts or matters.

Another Update (March 22): Here’s Bill Gates:

In September, when I met with [OpenAI] again, I watched in awe as they asked GPT, their AI model, 60 multiple-choice questions from the AP Bio exam—and it got 59 of them right. Then it wrote outstanding answers to six open-ended questions from the exam. We had an outside expert score the test, and GPT got a 5—the highest possible score, and the equivalent to getting an A or A+ in a college-level biology course.

Once it had aced the test, we asked it a non-scientific question: “What do you say to a father with a sick child?” It wrote a thoughtful answer that was probably better than most of us in the room would have given. The whole experience was stunning.

I knew I had just seen the most important advance in technology since the graphical user interface.

Just another rube who’s been duped by Clever Hans.


Wilbur and Orville are circumnavigating the Ohio cornfield in their Flyer. Children from the nearby farms have run over to watch, point, and gawk. But their parents know better.

An amusing toy, nothing more. Any talk of these small, brittle, crash-prone devices ferrying passengers across continents is obvious moonshine. One doesn’t know whether to laugh or cry that anyone could be so gullible.

Or if they were useful, then mostly for espionage and dropping bombs. They’re a negative contribution to the world, made by autistic nerds heedless of the dangers.

Indeed, one shouldn’t even say that the toy flies: only that it seems-to-fly, or “flies.” The toy hasn’t even scratched the true mystery of how the birds do it, so much more gracefully and with less energy. It sidesteps the mystery. It’s a scientific dead-end.

Wilbur and Orville haven’t even released the details of the toy, for reasons of supposed “commercial secrecy.” Until they do, how could one possibly know what to make of it?

Wilbur and Orville are greedy, seeking only profit and acclaim. If these toys were to be created — and no one particularly asked for them! — then all of society should have had a stake in the endeavor.

Only the rich will have access to the toy. It will worsen inequality.

Hot-air balloons have existed for more than a century. Even if we restrict to heavier-than-air machines, Langley, Whitehead, and others built perfectly serviceable ones years ago. Or if they didn’t, they clearly could have. There’s nothing genuinely new here.

Anyway, the reasons for doubt are many, varied, and subtle. But the bottom line is that, if the children only understood what their parents did, they wouldn’t be running out to the cornfield to gawk like idiots.

WINNERS of the Scott Aaronson Grant for Advanced Precollege STEM Education!

Friday, November 18th, 2022

I’m thrilled to be able to interrupt your regular depressing programming for 100% happy news.

Some readers will remember that, back in September, I announced that an unnamed charitable foundation had asked my advice on how best to donate $250,000 for advanced precollege STEM education. So, just like the previous time I got such a request, from Jaan Tallinn’s Survival and Flourishing Fund, I decided to do a call for proposals on Shtetl-Optimized before passing along my recommendations.

I can now reveal that the generous foundation, this time around, was the Packard Foundation. Indeed, the idea and initial inquiries to me came directly from Dave Orr: the chair of the foundation, grandson of Hewlett-Packard cofounder David Packard, and (so I learned) longtime Shtetl-Optimized reader.

I can also now reveal the results. I was honored to get more than a dozen excellent applications. After carefully considering all of them, I passed along four finalists to the Packard Foundation, which preferred to award the entire allotment to a single program if possible. After more discussion and research, the Foundation then actually decided on two winners:

  • $225,000 for general support to PROMYS: the long-running, world-renowned summer math camp for high-school students, which (among other things) is in the process of launching a new branch in India. While I ended up at Canada/USA Mathcamp (which I supported in my first grant round) rather than PROMYS, I knew all about and admired PROMYS even back when I was the right age to attend it. I’m thrilled to be able to play a small role in its expansion.
  • $30,000 for general support to AddisCoder: the phenomenal program that introduces Ethiopian high-schoolers to programming and algorithms. AddisCoder was founded by UC Berkeley theoretical computer science professor and longtime friend-of-the-blog Jelani Nelson, and also received $30,000 in my first grant round. Jelani and his co-organizers will be pressing ahead with AddisCoder despite political conflict in Ethiopia including a recently-concluded civil war. I’m humbled if I can make even the tiniest difference.

Thanks so much to the Packard Foundation, and to Packard’s talented program officers, directors, and associates—especially Laura Sullivan, Jean Ries, and Prithi Trivedi—for their hard work to make this happen. Thanks so much also to everyone who applied. While I wish we could’ve funded everyone, I’ve learned a lot about programs to which I’d like to steer future support (other prospective benefactors: please email me!!), and to which I’d like to steer kids: my own, once they’re old enough, and other kids of my acquaintance.

I feel good that, in the tiny, underfunded world of accelerated STEM education, the $255,000 that Packard is donating will already make a difference. But of course, $255,000 is only a thousandth of $255 million, which is a thousandth of $255 billion. Perhaps I could earn the latter sort of sums, to donate to STEM education or any other cause, by (for example) starting my own cryptocurrency exchange. I hope my readers will forgive me for not having chosen that route, expected-utility-maximization arguments be damned.

Win a $250,000 Scott Aaronson Grant for Advanced Precollege STEM Education!

Thursday, September 1st, 2022

Back in January, you might recall, Skype cofounder Jaan Tallinn’s Survival and Flourishing Fund (SFF) was kind enough to earmark $200,000 for me to donate to any charitable organizations of my choice. So I posted a call for proposals on this blog. You “applied” to my “foundation” by simply sending me an email, or leaving a comment on this blog, with a link to your organization’s website and a 1-paragraph explanation of what you wanted the grant for, and then answering any followup questions that I had.

After receiving about 20 awesome proposals in diverse areas, in the end I decided to split the allotment among organizations around the world doing fantastic, badly-needed work in math and science enrichment at the precollege level. These included Canada/USA Mathcamp, AddisCoder, a magnet school in Maine, a math circle in Oregon, a math enrichment program in Ghana, and four others. I chose to focus on advanced precollege STEM education both because I have some actual knowledge and experience there, and because I wanted to make a strong statement about an underfunded cause close to my heart that’s recently suffered unjust attacks.

To quote the immortal Carl Sagan, from shortly before his death:

[C]hildren with special abilities and skills need to be nourished and encouraged. They are a national treasure. Challenging programs for the “gifted” are sometimes decried as “elitism.” Why aren’t intensive practice sessions for varsity football, baseball, and basketball players and interschool competition deemed elitism? After all, only the most gifted athletes participate. There is a self-defeating double standard at work here, nationwide.

Anyway, the thank-you notes from the programs I selected were some of the most gratifying emails I’ve ever received.

But wait, it gets better! After reading about the Scott Aaronson Speculation Grants on this blog, representatives from a large, reputable family foundation contacted me to say that they wanted to be involved too. This foundation, which wishes to remain anonymous at this stage although not to the potential grant recipient, intends to make a single US$250,000 grant in the area of advanced precollege STEM education. They wanted my advice on where their grant should go.

Of course, I could’ve simply picked one of the same wonderful organizations that SFF and I helped in the first round. On reflection, though, I decided that it would be more on the up-and-up to issue a fresh call for proposals.

So: do you run a registered 501(c)(3) nonprofit dedicated to advanced precollege STEM education? If so, email me or leave a comment here by Friday, September 9, telling me a bit about what your organization does and what more it could do with an extra $250K. Include a rough budget, if that will help convince me that you can actually make productive use of that amount, that it won’t just sit in your bank account. Organizations that received a Scott Aaronson Speculation Grant the last time are welcome to reapply; newcomers are also welcome.

I’ll pass up to three finalists along to the funder, which will then make a final decision as to the recipient. The funder will be directly in touch with the potential grantee(s) and will proceed with its intake, review and due diligence process.

We expect to be able to announce a recipient on or around October 24. Can’t wait to see what people come up with!