The above question came up in conversation with Michael Vassar and some other nerds in New York City yesterday (before I went with relatives to see Gimpel Tam, an extraordinarily dark and depressing musical performed entirely in Yiddish). Look, I know a massive black hole would swallow the earth extremely quickly, and I also know that a microscopic black hole would quickly evaporate as Hawking radiation. So suppose we chose one of intermediate size so as to maximize the earth’s survival time—how long a time could we achieve? (Does the answer depend on the viscosity of the magma or whatever else is in the earth’s core?) Sure, I could try to calculate an answer myself, but why bother when so many physicists read this blog? Pencils out!
Archive for the ‘Embarrassing Myself’ Category
Today is the 70th birthday of Donald E. Knuth: Priest of Programming, Titan of Typesetting, Monarch of MMIX, intellectual heir to Turing and von Neumann, greatest living computer scientist by almost-universal assent … alright, you get the idea.
That being the case, Jeff Shallit proposed to various CS bloggers that we should all band together and present the master with a birthday surprise: one post each about how his work has inspired us. The posts are now in! Readers who don’t know about Knuth’s work (are there any?) should start with this post from Luca. Then see this from David Eppstein, this from Doron Zeilberger, this from Jeff, this from Bill Gasarch, and this from Suresh.
Knuth’s impact on my own work and thinking, while vast, has not been directly through research: his main influence on my BibTeX file is that if not for him, I wouldn’t have a BibTeX file. (One reason is that I’m one of the people Doron Zeilberger attacks for ignoring constant factors, and supporting what he calls “the ruling paradigm in computational complexity theory, with its POL vs. EXP dichotomy.”) So I decided to leave Knuth’s scientific oeuvre to others, and to concentrate in this post on his contributions to two other fields: mathematical exposition and computational theology.
Knuth’s creation of the TeX typesetting system — his original motivation being to perfect the layout of his own Art of Computer Programming books — was remarkable in two ways. First, because scientific typesetting is of so little interest to industry, it’s not clear if something like TeX would ever have been invented if not for one man and his borderline-neurotic perfectionism. Second, TeX is one of the only instances I can think of when a complicated software problem was solved so well that it never had to be solved again (nor will it for many decades, one hazards to guess). At least in math, computer science, and physics, the adoption of TeX has been so universal that failure to use it is now a reliable crackpot indicator.
Since version 3, TeX has used an idiosyncratic version numbering system, where updates have been indicated by adding an extra digit at the end of the decimal, so that the version number asymptotically approaches π. This is a reflection of the fact that TeX is now very stable, and only minor updates are anticipated. The current version of TeX is 3.141592; it was last updated in December 2002 … Even though Donald Knuth himself has suggested a few areas in which TeX could have been improved, he indicated that he firmly believes that having an unchanged system that will produce the same output now and in the future is more important than introducing new features. For this reason, he has stated that the “absolutely final change (to be made after my death)” will be to change the version number to π, at which point all remaining bugs will become features.
But Knuth’s interest in scientific exposition goes far beyond typesetting. His 1974 Surreal Numbers: How Two Ex-Students Turned on to Pure Mathematics and Found Total Happiness, which he wrote in one week, was weirdness at the highest possible level: the Beatles’ White Album of math. It’s said to represent the only occasion in history when a new mathematical theory (Conway’s theory of surreal numbers) was introduced in the form of a novel. (Though admittedly, with the exception of one sex scene, this is a “novel” whose plot development mostly takes the form of lemmas.)
Those seeking to improve their own writing should consult Mathematical Writing (available for free on the web), the lecture notes from a course at Stanford taught by Knuth, Tracy Larrabee, and Paul Roberts. Like a lot of Knuth’s work, Mathematical Writing has the refreshing feel of an open-ended conversation: we get to see Knuth interact with students, other teachers, and visiting luminaries like Mary-Claire van Leunen, Paul Halmos, Jeff Ullman, and Leslie Lamport.
Since I’ve blogged before about the battle over academic publishing, I also wanted to mention Knuth’s remarkable and characteristically methodical 2003 letter to the editorial board of the Journal of Algorithms. Knuth asks in a postscript that his letter not be distributed widely — but not surprisingly, it already has been.
In the rest of this post, I’d like to talk about Things A Computer Scientist Rarely Talks About, the only book of Knuth’s for which I collected one of his coveted $2.56 prizes for spotting an error. (Nothing important, just a typo.)
Things is based on a series of lectures on computer science and religion that Knuth gave in 1997 at MIT. (At the risk of oversimplifying: Knuth practices Christianity, but in a strange form less interested in guns and gays than in some business about “universal compassion.”) Perhaps like most readers, when I bought Things I expected yet another essay on “non-overlapping magisteria,” a famous scientist’s apologia justifying his belief in the Virgin Birth and the Resurrection. But Knuth likes to surprise, and what he delivers instead is mostly a meditation on the typography of Bible verses [sic]. More precisely, Things is a “metabook”: a book about the lessons Knuth learned while writing and typesetting an earlier book, one I haven’t yet read, that analyzed verse 3:16 of every book of the Bible.
But this being a lecture series, Knuth also fields questions from the audience about everything from sin and redemption to mathematical Platonism. He has a habit of parrying all the really difficult questions with humor; indeed, he does this so often one comes to suspect humor is his answer. As far as I could tell, there’s only one passage in the entire book where Knuth directly addresses what atheists are probably waiting for him to address. From one of the question periods:
Q: How did you become so interested in God and religion in the first place?
A: It was because of the family I was born into. If I had been born in other circumstances, my religious life would no doubt have been quite different. (p. 155)
And then on to the next question.
To me, what’s remarkable about this response is that Knuth without any hesitation concedes what skeptics from Xenophanes to Richard Dawkins have held up as the central embarrassment of religion. This, of course, is the near-perfect correlation between the content of religious belief and the upbringing of the believer. How, Dawkins is fond of asking, could there possibly be such a thing as a Christian or Hindu or Jewish child? How could a four-year-old already know what he or she thinks about profound questions of cosmogony, history, and ethics — unless, of course, the child were brainwashed by parents or teachers?
My Bayesian friends, like Robin Hanson, carry this argument a step further. For them, the very fact that Knuth knows his beliefs would be different were he born to different parents must, assuming he’s rational, force him to change his beliefs. For how can he believe something with any conviction, if he knows his belief was largely determined by a logically-irrelevant coin toss?
And yet, openly defying the armies of Bayes arrayed against him, here we have Knuth saying, in effect: yes, if I know that if I were some other person my beliefs would be different, but I’m not that other person; I’m Knuth.
So, readers: is Knuth’s response a cop-out, the understandable yet ultimately-indefensible defense of an otherwise-great scientist who never managed to free himself from certain childhood myths? Or is it a profound acknowledgment that none of us ever escape the circumstances of our birth, that we might as well own up to it, that tolerance ought not to require a shared prior, that the pursuit of science and other universal values can coexist with the personal and incommunicable?
Taking a cue from Knuth himself, I’m going to dodge this question. Instead, I decided to end this post by quoting some of my favorite passages from Chapter 6 of Things A Computer Scientist Rarely Talks About.
On computer science and God: “When I talk about computer science as a possible basis for insights about God, of course I’m not thinking about God as a super-smart intellect surrounded by large clusters of ultrafast Linux workstations and great search engines. That’s the user’s point of view.” (p. 168)
“I think it’s fair to say that many of today’s large computer programs rank among the most complex intellectual achievements of all time. They’re absolutely trivial by comparison with any of the works of God, but still they’re somehow closer to those works than anything else we know.” (p. 169)
On infinity: “Infinity is a red herring. I would be perfectly happy to give up immortality if I could only live Super K years before dying ['Super K' being defined similarly to an Ackermann number]. In fact, Super K nanoseconds would be enough.” (p. 172)
On the other hand: “I once thought, if I ever had to preach a sermon in church, I would try to explain Cantor’s theorem to my non-mathematical friends so that they could understand something about the infinite.” (p. 172)
On God and computational complexity: “I think it’s fair to say that God may well be bound by the laws of computational complexity … But I don’t recommend that theologians undertake a deep study of computational complexity (unless, of course, they really enjoy it). ” (p. 174)
On quantum mechanics: “Several years ago, I chanced to open Paul Dirac’s famous book on the subject and I was surprised to find out that Dirac was not only an extremely good writer but also that his book was not totally impossible to understand. The biggest surprise, however — actually a shock — was to learn that the things he talks about in that book were completely different from anything I had ever read in Scientific American or in any other popular account of the subject. Apparently when physicists talk to physicists, they talk about linear transformations of generalized Hilbert spaces over the complex numbers; observable quantities are eigenvalues and eigenfunctions of Hermitian linear operators. But when physicists talk to the general public they don’t dare mention such esoteric things, so they speak instead about particles and spins and such, which are much less than half the story. No wonder I could never really understand the popular articles.” (p. 181)
“The extra detail that gets suppressed when quantum mechanics gets popularized amounts to the fact that, according to quantum mechanics, the universe actually consists of much more data than could ever be observed.” (p. 182)
On free will and the problem of evil: “I can design a program that never crashes if I don’t give the user any options. And if I allow the user to choose from only a small number of options, limited to things that appear on a menu, I can be sure that nothing anomalous will happen, because each option can be foreseen in advance and its effects can be checked. But if I give the user the ability to write programs that will combine with my own program, all hell might break loose. (In this sense the users of Emacs have much more free will than the users of Microsoft Word.) … I suppose we could even regard Figure 5 [a binary tree representing someone's choices] as the Tree of the Knowledge of Good and Evil.” (p. 189-190)
No particular news to report — it’s about the same as it was 400 years ago, I guess. I just wanted to liveblog from the Taj Mahal, is all. (Jonathan Walgate is the one who suggested it.). Now I’ll go back to looking at it.
Theorem: Suppose NP-complete problems are efficiently solvable by quantum computers. Then either the polynomial hierarchy collapses, or else BQP ⊄ AM (that is, quantum computations can’t be efficiently simulated by Arthur-Merlin protocols).
Proof: Suppose NP ⊆ BQP and BQP ⊆ AM. Then coNP ⊆ BQP ⊆ AM, and hence the polynomial hierarchy collapses to the second level by a result of Boppana, Håstad, and Zachos.
Note: If only we could delete the weasel phrase “or else BQP ⊄ AM” from my Most Trivial Theorem, we would’ve achieved a long-sought breakthrough in quantum computing theory. In particular, we would’ve shown that any fast quantum algorithm to solve NP-complete problems would imply an unlikely collapse of classical complexity classes. But while the weasel phrase is weaselly enough to make the Most Trivial Theorem a triviality, I don’t think it’s infinitely weaselly. The reason is my growing suspicion that BQP ⊆ AM in the unrelativized world.
Second Note: When I call this my “Most Trivial Theorem,” obviously I’m excluding homework exercises.
Welcome to an occasional new Shtetl-Optimized series, where physicists get to amuse themselves by watching me struggle to understand the most basic concepts of their discipline. I’ll consider my post on black hole singularities to be retroactively part of this series.
Official motto: “Because if I talked about complexity, you wouldn’t understand it.”
Unofficial motto: “Because if I talked about climate change, I’d start another flamewar — and as much as I want to save civilization, I want even more for everyone to like me.”
Today’s topic is Understanding Electricity. First of all, what makes electricity confusing? Well, besides electricity’s evil twin magnetism (which we’ll get to another time), what makes it confusing is that there are six things to keep track of: charge, current, energy, power, voltage, and resistance, which are measured respectively in coulombs, amps, joules, watts, volts, and ohms. And I mean, sure you can memorize formulas for these things, but what are they, in terms of actual electrons flowing through a wire?
Alright, let’s take ‘em one by one.
Charge is the q in kqq/r2. Twice as many electrons, twice as much charge. ‘Nuff said.
Current is charge per unit time. It’s how many electrons are flowing through a cross-section of the wire every second. If you’ve got 100 amps coming out, you can send 50 this way and 50 that way, or π this way and 100-π that way, etc.
Energy … Alright, even I know this one. Energy is what we fight wars to liberate. In our case, if you have a bunch of electrons going through a wire, then the energy scales like the number of electrons times the speed of the electrons squared.
Power is energy per unit time: how much energy does your appliance consume every second? Duh, that’s why a 60-watt light bulb is environmentally-friendlier than a 100-watt bulb.
Voltage is the first one I had trouble with back in freshman physics. It’s energy per charge, or power per current. Intuitively, voltage measures how much energy gets imparted to each individual electron. Thus, if you have a 110-volt hairdryer and you plug it into a 220-volt outlet, then the trouble is that the electrons have twice as much energy as the hairdryer expects. This is what transformers are for: to ramp voltages up and down.
Incidentally, the ability to transform voltages is related to why what comes out of your socket is alternating current (AC) instead of direct current (DC). AC, of course, is the kind where the electrons switch direction 60 times or so per second, while DC is the kind where they always flow in the same direction. For computers and other electronics, you clearly want DC, since logic gates are unidirectional. And indeed, the earliest power plants did transmit DC. In the 1890′s, Thomas Edison fought vigorously against the adoption of AC, going so far as to electrocute dogs, horses, and even an elephant using AC in order to “prove” that it was unsafe. (These demonstrations proved about as much as D-Wave’s quantum computer — since needless to say, one can also electrocute elephants using DC. To draw any conclusions a comparative study is needed.)
So why did AC win? Because it turns out that it’s not practical to transmit DC over distances of more than about a mile. The reason is this: the longer the wire, the more power gets lost along the way. On the other hand, the higher the voltage, the less power gets lost along the way. This means that if you want to send power over a long wire and have a reasonable amount of it reach its destination, then you want to transmit at high voltages. But high voltages are no good for household appliances, for safety and other reasons. So once the power gets close to its destination, you want to convert back down to lower voltages.
Now, the simplest way to convert high voltages to low ones was discovered by Michael Faraday, and relies on the principle of electromagnetic induction. This is the principle according to which a changing electric current creates a changing magnetic field, which can in turn be used to drive another current. (Damn, I knew we wouldn’t get far without bumping into electricity’s evil and confusing magnetwin.) And that gives us a simple way to convert one voltage to another — analogous to using a small, quickly-rotating gear to drive a big, slowly-rotating gear.
So to make a long story short: while in principle it’s possible to convert voltages with DC, it’s more practical to do it with AC. And if you don’t convert voltages, then you can only transmit power for about a mile — meaning that you’d have to build millions of tiny power plants, unless you only cared about urban centers like New York.
Resistance is the trickiest of the six concepts. Basically, resistance is the thing you need to cut in half, if you want to send twice as much current through a wire at the same voltage. If you have two appliances hooked up serially, the total resistance is the sum of the individual resistances: Rtot = R1 + R2. On the other hand, if you have two appliances hooked up in parallel, the reciprocal of the total resistance is the sum of the reciprocals of the individual resistances: 1/Rtot = 1/R1 + 1/R2. If you’re like me, you’ll immediately ask: why should resistance obey these identities? Or to put it differently, why should the thing that obeys one or both of these identities be resistance, defined as voltage divided by current?
Well, as it turns out, the identities don’t always hold. That they do in most cases of interest is just an empirical fact, called Ohm’s Law. I suspect that much confusion could be eliminated in freshman physics classes, were it made clear that there’s nothing obvious about this “Law”: a new physical assumption is being introduced. (Challenge for commenters: can you give me a handwaving argument for why Ohm’s Law should hold? The rule is that your argument has to be grounded in terms of what the actual electrons in a wire are doing.)
Here are some useful formulas that follow from the above discussion:
Power = Voltage2/Resistance = Current2 x Resistance = Voltage x Current
Voltage = Power/Current = Current x Resistance = √(Power x Resistance)
Resistance = Voltage/Current = Power/Current2 = Voltage2/Power
Current = Power/Voltage = Voltage/Resistance = √(Power/Resistance)
Understand? Really? Take the test!
Update (4/16): Chad Orzel answers my question about Ohm’s Law.
I am, I’m slightly embarrassed to admit, quoted pretty extensively in the cover story of this week’s New Scientist magazine (alas, only available to subscribers or those willing to shell out $4.95). The story, by Michael Brooks, is about an interesting recent paper by Lucien Hardy of Perimeter Institute, on the power of “quantum gravity computers.” Lucien’s paper considers the following question: by exploiting quantum fluctuations in the causal structure of spacetime, can one efficiently solve problems that are not efficiently solvable with a garden-variety quantum computer?
As I told Brooks, I really do think this is a hell of a question, one that’s intimately related to the challenge of understanding quantum gravity itself. The trouble is that, until an actual quantum theory of gravity chooses to make itself known to us, almost everything we can say about the question is pure speculation.
But of course, pure speculation is what New Scientist gobbles up with french fries and coleslaw. And so, knowing what kind of story they were going to run, I did my best to advocate giving reality at least a few column inches. Fortunately, the end result isn’t quite as bad as I’d feared.
(Full disclosure: recently New Scientist asked me to write an article for them on theoretical computer science breakthroughs of the last 30 years. Remembering some of the steamers NS has unloaded in the recent past, I faced a moral dilemma for approximately five minutes. I then wrote back to them and said I’d be delighted to do it.)
Anyway, here are a few relevant excerpts from the article. If New Scientist wants me to take these down, then of course I’ll have to comply — though I imagine that being linked to from the 25,000th most popularest blog on the entire Internet could only boost their sales.
A NEW computer is always welcome, isn’t it? It’s always faster than your old one, and it always does more stuff. An upgrade, the latest model with all the bells and whistles is an exciting prospect.
And when it comes to the kind of machine physicists are hoping for, you really are looking at something special. No ordinary upgrade for them: this will be the ultimate computer, and radically different from anything we have ever seen. Not only might it be supremely powerful, defying the logic of cause and effect to give instantaneous answers, it might also tell us exactly how the universe works. It might even tell us how our minds produce the phenomenon we call consciousness. Clear a space on your desk, then, for the quantum gravity computer.
Of course, there’s a chance it may not fit on your desktop because we don’t yet know what the machine will look like. Neither do we know how to build it, or even whether it will do all that its proponents hope. Nevertheless, just thinking about how this processor works could improve our understanding of the universe. “The power of quantum gravity computers is one of the deepest problems in physics,” says Scott Aaronson, a mathematician based at the University of Waterloo in Ontario, Canada.
Put [quantum theory and general relativity] together to make a quantum theory of gravity and it is almost inevitable that we are going to have trouble with notions of cause and effect: the logic of tock following tick or output following input just won’t apply in the quantum-gravity universe.
Aaronson agrees with Hardy. “General relativity says that the causal structure can vary, and quantum mechanics says that anything that can vary can be in superposition,” he says. “So to me, an indefinite causal structure seems like the main new conceptual feature.”
The big question is how powerful [a quantum gravity computer] could be: will it be the ultimate processor?
It turns out this is a hard question to answer. Traditionally, a computer’s power is rated by the number of computations it can do in a given time. IBM’s Blue Gene computer currently tops the world rankings for classical computers: it can do 280 trillion calculations per second. In theory, a quantum computer can do even better. It will be able to crack the world’s toughest codes in the blink of an eye.
The quantum gravity computer, on the other hand, can’t compete under these rules because “quickly” doesn’t mean anything in a scheme where space and time can’t be separated. Or, as Aaronson puts it: “It would be nice if the quantum gravity theorists could at least tell us what they mean by ‘time’.”
Nevertheless, Hardy thinks there is good reason to suppose the quantum gravity computer would indeed be a more powerful machine than anything we have so far envisioned. The fact that it might glimpse its results without running a computation hints at this, he says — though he admits this is just speculation.
What’s more convincing, he says, is the difficulty of simulating a quantum gravity computer on a quantum computer. The fact that we have no algorithm for simulating quantum systems on classical computers highlights the gulf between a classical computer and a quantum computer. If a quantum computer cannot simulate a quantum gravity computer, then that implies there might be another huge leap in computing power waiting to be exploited.
It is a controversial conclusion, though. Seth Lloyd of the Massachusetts Institute of Technology thinks there is no reason to invoke a discontinuity that separates quantum gravity from more familiar processes … Aaronson’s money is on the Lloyd camp: quantum gravity computers can’t be more powerful than quantum computers, he says. In his view, it is a short step from ultra-powerful quantum gravity computers to total cosmic anarchy. If, as Hardy suggests, a quantum gravity computer might be able to see its result without having to run its algorithms, it is essentially no different to having a quantum computer strapped to a time machine. As we all know, time machines don’t make sense: they would enable us to do things like travel back in history to kill our grandparents and thereby cease to exist. “It’s hard to come up with any plausible way to make quantum computers more powerful that wouldn’t make them absurdly more powerful,” he says.
Whatever the truth, this is why investigating the characteristics of the quantum gravity computer is so valuable. It ties theories to the real world, Aaronson says, and stops the important issues, such as a link with observable facts or staying within the bounds of what’s physically possible, from being swept under the carpet. After all, a computer has to produce an observable, measurable output based on an input and a known set of rules. “The connection to observation is no longer a minor detail,” Aaronson says. “It’s the entire problem.”
Two obvious corrections:
- I certainly don’t think that quantum gravity computers “can’t” be more powerful than ordinary quantum computers. What I think is that, at the moment, there’s no good evidence that they would be.
- I am not a mathematician.
Update: Six months ago, New Scientist ran a credulous, uncomprehending story about a rocket propulsion system that flagrantly violates conservation of momentum (!). This led to interesting discussions here, here, and here about what can be done to improve the magazine’s standards. If you enjoyed the D-Wave fiasco, you’ll also like the spectacle of commenters rushing to defend the article against those elitist, ivory-tower academics with their oh-so-sacred conservation laws. In a world of Homer Simpsons, it’s not easy being a Lisa.
Lenny Susskind — the Stanford string theorist who Shtetl-Optimized readers will remember from this entry — is currently visiting Perimeter Institute to give a fascinating series of lectures on “Black Holes and Holography.”
After this morning’s lecture (yes, I’m actually getting up at 10am for them), the following question occurred to me: what’s the connection between a black hole having an event horizon and its having a singularity? In other words, once you’ve clumped enough stuff together that light can’t escape, why have you also clumped enough together to create a singularity? I know there’s a physics answer; what I’m looking for is a conceptual answer.
Of course, one direction of the correspondence — that you can’t have a singularity without also having an event horizon — is the famous Cosmic Censorship Hypothesis popularized by Hawking. But what about the other direction?
When I posed this question at lunch, Daniel Gottesman patiently explained to me that singularities and event horizons just sort of go together, “like bacon and eggs.” However, this answer was unsatisfying to me for several reasons — one of them being that, with my limited bacon experience, I don’t know why bacon and eggs go together. (I have eaten eggs with turkey bacon, but I wouldn’t describe their combined goodness as greater than the sum of their individual goodnesses.)
So then Daniel gave me a second answer, which, by the time it lodged in my brain, had morphed itself into the following. By definition, an event horizon is a surface that twists the causal structure in its interior, so that none of the geodesics (paths taken by light rays) lead outside the horizon. But geodesics can’t just stop: assuming there are no closed timelike curves, they have to either keep going forever or else terminate at a singularity. In particular, if you take a causal structure that “wants” to send geodesics off to infinity, and shoehorn it into a finite box (as you do when creating a black hole), the causal structure gets very, very angry — so much so that it has to “vent its anger” somewhere by forming a singularity!
Of course this can’t be the full explanation, since why can’t the geodesics just circle around forever? But if it’s even slightly correct, then it makes me much happier. The reason is that it reminds me of things I already know, like the hairy ball theorem (there must be a spot on the Earth’s surface where the wind isn’t blowing), or Cauchy’s integral theorem (if the integral around a closed curve in the complex plane is nonzero, then there must be a singularity in the middle), or even the Nash equilibrium theorem. In each of these cases, you take a geometric structure with some global property, and then deduce that having that property makes the structure “angry,” so that it needs a special point (a singularity, an equilibrium, or whatever) to blow off some steam.
So, question for the relativistas: is there a theorem in GR anything like my beautiful story, or am I just talking out of my ass as usual?
Update (3/22): Well, it turns out that I was ignorantly groping toward the famous Penrose-Hawking singularity theorems. Thanks to Dave Bacon, Sean Carroll, and ambitwistor for immediately pointing this out.
I botched the calculation. While I got the answer I wanted (a quadratic improvement in energy), and while I more-or-less correctly identified the reason for that answer (unintuitive properties of the relativistic velocity addition formula), I did the calculation in the rest frame of one of the particles instead of the zero-momentum rest frame, and thereby obtained a scaling of 1/sqrt(ε) versus 1/ε instead of 1/ε1/4 versus 1/sqrt(ε). As a result, my answer flagrantly violates conservation of energy.
Thanks to rrtucci and perseph0ne. In my defense, I did call it a doofus discovery.
There’s a trivial question about particle accelerators that bugged me for a while. Today I finally figured out the answer, and I’m so excited by my doofus “discovery” that I want to tell the world.
In Ye Olde Times, accelerators used to smash particles against a fixed target. But today’s accelerators smash one particle moving at almost the speed of light against another particle moving at almost the speed of light — that’s why they’re called particle colliders (duhhh). Now, you’d think this trick would increase the collision energy by a constant factor, but according to the physicists, it does asymptotically better than that: it squares the energy!
My question was, how could that be? Even if both particles are moving, we can clearly imagine that one of them is stationary, since the particles’ motion with respect to the Earth is irrelevant. So then what’s the physical difference between a particle hitting a fixed target and two moving particles hitting each other, that could possibly produce a quadratic improvement in energy?
[Warning: Spoiler Ahead]
The answer pops out if we consider the rule for adding velocities in special relativity. If in our reference frame, particle 1 is headed left at a v fraction of the speed of light, while particle 2 is headed right at a w fraction of the speed of light, then in particle 1′s reference frame, particle 2 is headed right at a (v+w)/(1+vw) fraction of the speed of light. Here 1+vw is the relativistic correction, “the thing you put in to keep the fraction less than 1.” If v and w are both close to 0, then of course we get v+w, the Newtonian answer.
Now set v=w=1-ε. Then (v+w)/(1+vw) = 1 – ε2/(2-2ε+ε2), which scales like 1-ε2. Aha!
To finish the argument, remember that relativistic energy increases with speed like 1/sqrt(1-v2). If we plug in v=1-ε, then we get 1/sqrt(2ε-ε2), while if we plug in v=1-ε2, then we get 1/sqrt(2ε2-ε4). So in the case of a fixed target the energy scales like 1/sqrt(ε), while in the case of two colliding particles it scales like 1/ε.
In summary, nothing’s going on here except relativistic addition of velocities. As with Grover’s algorithm, as with the quantum Zeno effect, it’s our intuition about linear versus quadratic that once again leads us astray.
I’ve learned my lesson, at least for the next day or two.
And speaking of learning — in computational learning theory, there’s an “obvious” algorithm for learning a function from random samples. Here’s the algorithm: output any hypothesis that minimizes the error on those samples.
I’m being intentionally vague about what the learning model is — since as soon as you specify a model, it seems like some version of that algorithm is what you want to do, if you want the best tradeoff between the number of samples and the error of your hypothesis. For example, if you’re trying to learn a Boolean function from a class C, then you want to pick any hypothesis from C that’s consistent with all your observations. If you’re trying to learn a Boolean function based on noisy observations, then you want to pick any hypothesis that minimizes the total number of disagreements. If you’re trying to learn a degree-d real polynomial based on observations subject to Gaussian noise, then you want to pick any degree-d polynomial that minimizes the least-squared error, and so on.
Here’s my question: is the “obvious” algorithm always the best one, or is there a case where a different algorithm needs asymptotically fewer samples? That is, do you ever want to pick a hypothesis that disagrees with more of your observations over one that disagrees with less?
While I’m on the subject, have you ever wished you could help Scott Aaronson do his actual research, and even be thanked — by name — in the acknowledgments of one of his papers? Well then, don’t miss this chance! All you have to do is read this seminal paper by Alon, Ben-David, Cesa-Bianchi, and Haussler, and then tell me what upper bound on the sample complexity of p-concept learning follows from their results. (Perversely, all they prove in the paper is that some finite number of samples suffices — must be a mathematician thing.)