Jacob Bekenstein (1947-2015)

Today I learned the sad news that Jacob Bekenstein, one of the great theoretical physicists of our time, passed away at the too-early age of 68.

Everyone knows what a big deal it was when Stephen Hawking discovered in 1975 that black holes radiate.  Bekenstein was the guy who, as a grad student at Princeton in the early 1970s, was already raving about black holes having nonzero entropy and temperature, and satisfying the Second Law of Thermodynamics—something just about everyone, including Hawking, considered nuts at the time.  It was, as I understand it, Hawking’s failed attempt to prove Bekenstein wrong that led to Hawking’s discovery of Hawking radiation, and thence to the modern picture of black holes.

In the decades since, Bekenstein continued to prove ingenious physical inequalities, often using thought experiments involving black holes.  The most famous of these, the Bekenstein bound, says that the number of bits that can be stored in any bounded physical system is finite, and is upper-bounded by ~2.6×10^43 MR, where M is the system’s mass in kilograms and R is its radius in meters.  (This bound is saturated by black holes, and only by black holes, which therefore emerge as the most compact possible storage medium—though probably not the best for retrieval!)  Bekenstein’s lectures were models of clarity and rigor: at conferences full of audacious speculations, he stood out to my non-expert eyes as someone who was simply trying to follow chains of logic from accepted physical principles, however mind-bogglingly far those chains led, but no further.

I first met Bekenstein in 2003, when I was a grad student spending a semester at Hebrew University in Jerusalem.  I was struck by the kindness he showed a 21-year-old nobody, who wasn’t even a physics student, coming to bother him.  Not only did he listen patiently to my blather about applying computational complexity to physics, he said that of course physics should ultimately aim to understand everything as the output of some computer program, that he too was thinking in terms of computation when he studied black-hole entropy.  I remember pondering the fact that the greatest reductionist I’d ever met was wearing a yarmulke—and then scolding myself for wasting precious brain-cycles on such a trivial thought when there was science to discuss.  I met Bekenstein maybe four or five more times on visits to Israel, most recently a year and a half ago, when we shared walks to and from the hotel at a firewall workshop at the Weizmann Institute.  He was unfailingly warm, modest, and generous—totally devoid of the egotism that I’ve heard can occasionally afflict people of his stature.  Now, much like with the qubits hitting the event horizon, the information that comprised Jacob Bekenstein might seem to be gone, but it remains woven into the cosmos.

27 Responses to “Jacob Bekenstein (1947-2015)”

  1. Bram Cohen Says:

    The Bekenstein bound seems a bit paradoxical: If there’s such a firm limit on the amount of information necessary to describe the state of a quantum system, how come it requires so much computation to simulate?

  2. HoS Says:

    That was beautifully written, Scott. I will have to learn more about him and his work.

  3. Scott Says:

    Bram #1: Aha! That’s why I talked about the number of bits that can be “stored” in the system, rather than the number of bits needed to describe the system’s state! The latter will grow like exp(2.6×10^43 MR).

    To put it differently, the Bekenstein bound is really an upper bound on the number of qubits (i.e., on the log of the dimension of the Hilbert space), not on the number of classical bits. However, we also know from Holevo’s Theorem that n qubits can’t be used to reliably store and retrieve more than n classical bits—and for that reason, we can also phrase Bekenstein (if we like) as a bound on the number of classical bits that can be reliably stored and retrieved.
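
    (For the curious: that 2.6×10^43 is just 2πc/(ħ ln 2) in SI units, so the bound reads I ≤ 2πcMR/(ħ ln 2) bits. Here’s a minimal numerical sketch of that arithmetic; the function name and sample inputs are only illustrative, not anything from Bekenstein’s papers.)

        # Bekenstein bound: I_max = 2*pi*c*M*R / (hbar * ln 2), with M in kg and R in m
        import math

        c = 2.99792458e8        # speed of light, m/s
        hbar = 1.054571817e-34  # reduced Planck constant, J*s

        def bekenstein_bound_bits(mass_kg, radius_m):
            # Upper bound on the number of bits storable in a system of this mass and radius.
            return 2 * math.pi * c * mass_kg * radius_m / (hbar * math.log(2))

        print(bekenstein_bound_bits(1.0, 1.0))   # ~2.6e43: the constant quoted above
        print(bekenstein_bound_bits(1.5, 0.1))   # a 1.5 kg, 10-cm-radius system: ~4e42 bits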

  4. Bram Cohen Says:

    Thanks for the explanation. That’s… very weird. Are there ‘quantum forms of storage’ for which quantum storage does better than regular storage, despite being limited in its classical storage capacity? I have no idea what that API would even look like.

  5. Shantanu Says:

    You also forgot his tour de force paper on TeVeS in 2004, and his work on radiation recoil from astrophysical black holes in the ’70s, which was also quite seminal.

    Although I never met him, he did seem modest. You cannot find a single seminar by him at Perimeter (which boasts a 10+ year collection of talks in gravity and cosmology). In fact you cannot even find a photo of him from the ’70s or ’80s.
    shantanu

  6. Scott Says:

    Bram #4:

      Are there ‘quantum forms of storage’ for which quantum storage does better than regular storage, despite being limited in its classical storage capacity?

    Yes! As the simplest example, quantum storage is exponentially more efficient than classical, if the thing that you’re trying to store is a quantum state. 😉

    More interesting examples come from communication complexity. This seminal 2006 paper by Gavinsky et al., and this one by Klartag and Regev, give examples of communication scenarios (i.e., Alice gets an input x, Bob gets an input y, and they want to compute f(x,y)), where Alice can tell Bob everything he needs to know to learn f(x,y) by sending him only O(log n) qubits, but she’d need to send something like √n or n^(1/3) classical bits to achieve the same thing.

  7. Miguel Says:

    The bound seems to imply that the bit unit is equivalent to kg*m…. surely this can’t be right?

  8. Job Says:

    As the simplest example, quantum storage is exponentially more efficient than classical, if the thing that you’re trying to store is a quantum state.

    I understand the reasoning behind that statement, but it doesn’t seem immediately impossible that we might be able to encode the information present in a quantum state, given enough time and assuming we know what it is, into a probability distribution for a classical system with less than exponential storage overhead.

    Sure, the classical state can’t be fed to a quantum computer, but it might be fed into a different computer that can exploit the probability distribution in the same way that a QC would – assuming we’re allowed exponential time here as well.

    Basically, I’m wondering if it has been shown that, as far as storage compression goes, quantum states are more efficient than classical probabilistic states.

  9. Scott Says:

    Job #8: If you define the task to be, “output this n-qubit quantum state,” then it’s easy to prove that you need to remember exp(n) classical bits (even with probabilistic encodings) if you want to succeed at the task even approximately. But there’s a reason why I put a smiley after that sentence: this task is tautological and uninteresting.

    For examples of classical tasks where quantum states have been proven to give an exponential compression in message length compared to classical probabilistic encodings, see the later part of my comment #6 (the Gavinsky et al. and Klartag-Regev results).

  10. Sniffnoy Says:

    Miguel #7:

    The bound seems to imply that the bit unit is equivalent to kg*m…. surely this can’t be right?

    You’ll notice that Scott wrote not “where M is the system’s mass and R is its radius”, but rather “where M is the system’s mass in kilograms and R is its radius in meters”, the way they often do in high-school textbook exercises, so that M and R are, technically, according to the way he wrote it, dimensionless. Obviously that’s not really how you want to think about it; the more physical way would be to say M is the mass, R is the radius, and that the constant that appears there is not actually dimensionless.

    …of course, I gather that physicists doing this sort of thing frequently use natural units, fixing certain constants that we would normally regard as dimensioned as a dimensionless 1 (or some other specified dimensionless number), so that things we would normally regard as different dimensions become interconvertible; for instance, with c=1, one second is 299792458 meters. And if you fix enough of these (for instance as with Planck units) then all the usual physical dimensions become dimensionless (so that 1 meter is about equal to the number 6*10^34).

    So you do have to watch out for that in this sort of thing, and maybe remember to multiply things by appropriate physical constants if you want to convert back to ordinary non-natural units, where all the usual dimensions really are different. That said, that doesn’t appear to be what’s going on here; this is just the high-school textbook “let’s not think about dimensioned quantities and think about numbers instead” shortcut.
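
    To make those conversions concrete, here’s a quick numerical check (a sketch, using the standard SI values of the constants):

        # With c = 1, times become lengths; in Planck units, lengths become pure numbers.
        c = 299_792_458                  # speed of light, m/s
        hbar = 1.054571817e-34           # reduced Planck constant, J*s
        G = 6.67430e-11                  # Newton's constant, m^3 kg^-1 s^-2

        second_in_meters = c * 1.0       # one second of time, measured in meters
        planck_length = (hbar * G / c**3) ** 0.5
        meter_in_planck_lengths = 1.0 / planck_length

        print(second_in_meters)          # 299792458, as above
        print(meter_in_planck_lengths)   # ~6.2e34, the "about 6*10^34" above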

  11. Job Says:

    If you define the task to be, “output this n-qubit quantum state,” then it’s easy to prove that you need to remember exp(n) classical bits

    Is that because the negative/complex amplitudes in the quantum state can’t be encoded directly as a probability, or is there another reason? Sorry I even have to ask, but I’m not seeing it.

    If we had 2^n classical probabilities (with a sum of 1) that we wished to encode into n bits, we could prepare the n bits such that each of the original probabilities is assigned to one of the 2^n classical bit states.

    Then, to extract the information back out of the system, we’d perform a large number of measurements – as many as necessary, depending on precision – and observe the measured distribution for the 2^n bit states.

    Negative amplitudes will make this strategy less viable, but suppose we use two sets of n bits instead. We could subtract the probabilities for the first set from those of the second set – this would be able to encode negative values with only a polynomial overhead.

    We’d use a third set for the imaginary parts of the complex amplitudes.

    You’re saying we’d need up to an exponential number of such sets? I can see how that might be the case, but what’s the proof?

  12. Job Says:

    Actually, I see the problem with my argument.

    The n classical bits would be in a definite state, despite being in a probabilistic superposition, so measuring them multiple times will always yield the same result.

    We can’t access the set of probabilities in a classical system any more than we can access the amplitudes in a quantum system.

    Is that why it’s uninteresting and tautological?

  13. Scott Says:

    Job #11: To be clear what we’re talking about, suppose someone gives you a classical description of an n-qubit state |ψ⟩. Then you get to send either an n-qubit state, or an m-bit sample from a classical probability distribution, to a processing station. The processing station uses what you sent to prepare a new n-qubit quantum state, and send it to a referee. Finally, the referee measures the state in an orthonormal basis containing |ψ⟩, and accepts if and only if it sees |ψ⟩. You and the processing station, working in concert, want to maximize the probability that the referee accepts, assuming |ψ⟩ is random and unknown.

    If you’re allowed to send a quantum state to the processing station, then the solution is obvious: just send |ψ⟩ itself! That takes only n qubits.

    By contrast, if you need to send a classical m-bit probabilistic message, then I claim that m≥exp(n) is necessary. The proof (which is simple, but I agree not 100% trivial) goes like this. For a given |ψ⟩, let Dψ be the probability distribution over m-bit classical messages that leads to the referee accepting with probability (say) ≥1-ε. Then by convexity, there must be a specific string xψ in the support of Dψ that also leads to the referee accepting with probability ≥1-ε. Furthermore, for every |ψ⟩ and |φ⟩ that are nearly orthogonal, we must have xψ≠xφ, because otherwise the processing station wouldn’t know whether to prepare |ψ⟩ or |φ⟩, so the referee couldn’t accept with probability close to 1 for both states.

    But now we come to the crux of the matter: there are “only” 2^m strings in {0,1}^m, but there are exp(exp(n)) quantum states of n qubits, every one of which has inner product close to 0 with every other (as can be seen by choosing the states at random, or by constructing them using an error-correcting code). Therefore we need m≥exp(n).
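
    (If it helps to see the “nearly orthogonal” part numerically, here’s a small sketch of my own, with arbitrary parameters: random states in the 2^n-dimensional Hilbert space have pairwise overlaps of order 2^(-n/2), which is what lets you pack doubly-exponentially many of them while keeping all the overlaps tiny.)

        # Random n-qubit states are nearly orthogonal: overlaps concentrate around 2^(-n/2).
        import numpy as np

        rng = np.random.default_rng(0)
        n = 10                    # number of qubits
        dim = 2 ** n              # Hilbert-space dimension
        num_states = 50           # just a handful, for illustration

        # Haar-random states: normalized complex Gaussian vectors.
        states = rng.normal(size=(num_states, dim)) + 1j * rng.normal(size=(num_states, dim))
        states /= np.linalg.norm(states, axis=1, keepdims=True)

        overlaps = np.abs(states @ states.conj().T)   # |<psi_i|psi_j>| for every pair
        np.fill_diagonal(overlaps, 0.0)
        print(overlaps.max())     # ~0.1 here, shrinking like 2^(-n/2) as n grows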

  14. asdf Says:

    Sad to hear about Bekenstein. I also noticed only recently that Ed Nelson died just over a year ago.

  15. asdf Says:

    Sorry, meant to say Ed Nelson died just under a year ago.

    https://en.wikipedia.org/wiki/Edward_Nelson

  16. Rahul Says:

    Scott:

    Can you elaborate on the “greatest reductionist” comment you made about Bekenstein?

    That sounded intriguing but I’m not sure I understood what you meant there.

  17. Miguel Says:

    Sniffnoy #10: Thanks for explaining. I’m an engineer, and I like to think about the units because they provide both an error-detection mechanism for my calculations and insight into what is really going on. Of course, it makes sense to take well-known shortcuts in a blog post; I’m not so sure about the Wikipedia page linked, though.

  18. Scott Says:

    Miguel #7, Sniffnoy #10: Actually, I don’t see any problem with the way Wikipedia or I wrote it. It’s comprehensible to a layperson, who can easily get a sense of how many bits we’re talking about (namely: a lot) for reasonable values of M and R, but it’s also completely precise for an expert, who could easily convert it to whatever other format they wanted (e.g., keeping mass and length abstract, while collecting all the units in the leading constant). If I put a constant in front with units of 1/(kg m), it would just befuddle the layperson, with no compensating increase in correctness or precision.

  19. Serge Says:

    In 1990 Bekenstein wrote the following in his paper entitled “Quantum limitations on the storage and transmission of information”:

    “Thus far, as customary in the field, we have treated information storage and communication as separate issues. But clearly they are not. A situation can be described purely in terms of information storage only in the rest frame of the storing device. In another Lorentz frame information flows with the motion of the device, and the communication facet surfaces. In fact, the Lorentz invariance of the laws of physics must mean that information storage and communication are inextricably linked, and proper understanding of one of them suffices for understanding of the other. Thus far no unified treatment of this sort exists. But there are some insights into how information is intertwined with the concept of space-time.”

    Scott, do you think Bekenstein had foreseen a deep relativistic connection between space complexity and communication complexity? If so, is it possible to apply it to complexity theory?

  20. Gil Kalai Says:

    I was shocked to hear the sad news and my heart goes out to Jacob’s family. Jacob was a great scientist, and I also share Scott’s impressions of him as a person. I saw Jacob on campus almost every day, and, as Scott describes, he was always extremely kind and nice. Let me add that Jacob was a member of our quantum information center and gave (like Scott) one of the main lectures at the Center’s opening conference QSTART, Black Holes and the Physical Limits of Information (the link is to the video). And here is another videotaped lecture, in Hebrew: Thermodynamics: From Steam Engines to Black Holes.

  21. Miguel Says:

    Scott, the constant in front would have to have units of bits/(kg m) to satisfy me 🙂

    I think my confusion arose because I’m completely mystified by the Bekenstein bound. I work in digital and wireless communications, and I know information theory (Shannon’s), so I work with bits every day. To see an equation that combines the speed of light, Planck’s constant, mass, size, pi, and ln 2, and returns bits… it just blew my mind.

  22. Scott Says:

    Serge #19:

      Scott, do you think Bekenstein had foreseen a deep relativistic connection between space complexity and communication complexity? If so, is it possible to apply it to complexity theory?

    I think the passage you quoted conveys an important insight. Indeed, I would say that we already use something sort of like it in complexity theory, in that lower bounds on one-way communication complexity are routinely used to get lower bounds on the amount of data storage needed in streaming and online computation, and other related models. (See, e.g., this paper by Francois Le Gall.) In effect, one thinks about data storage as a “message” sent from the past to the future—so lower bounds on communication can then be used to get lower bounds on the number of bits one needs to store.

  23. Scott Says:

    Rahul #16:

      Can you elaborate on the “greatest reductionist” comment you made about Bekenstein?

    I just meant that at that point in my life, Bekenstein was the individual I’d met who’d done the most to advance the reductionist vision, of reality as the output of a simple computer program—particularly through his discovery of the Bekenstein bound, with its implication that any bounded part of the universe can be fully described using a finite number of qubits. (Since then, I’ve met Steven Weinberg, and others who could also make strong claims to have advanced reductionism. 🙂 ) Of course, one could reasonably wonder: if a simple computer program accounts for all observable phenomena, then what role could possibly be left for an interventionist deity like that of the Old Testament? I never talked to Bekenstein about such things, though I’m sure he would’ve had an interesting answer.

  24. Michael Gogins Says:

    Rahul 16, Scott 23:

    For what it’s worth there are a few tidbits from Jacob Bekenstein and David Eichler on reductionism and the Old Testament creator in an interview on the Huffington Post here.

    Personally, as a non-affiliated philosophical theist with a deep interest in science, a Darwinian without qualification, I think that there is an obvious tension between theism and the scientific assumptions about Nature. This tension is revealed as an uncomfortable feeling that, if the laws of Nature are a simple computer program, then who wrote the program (the position, apparently, of Bekenstein), on the one hand; and, on the other hand, an equally uncomfortable feeling that if we discover anything like this program, or think it highly plausible that it could be discovered, then what need is there for any other information about the program, and the question of who did or did not write it is meaningless (the position, as far as I can make it out, of Quine and his disciples).

    I share Bekenstein’s view much more than Quine’s. Not only is there the question of where the simple program came from; there’s also the question of how human thinkers came up with the assumptions that justify looking for the simple program. Our contemporary naturalistic view of cause and effect, for example, is radically different from what went under the same name before the Renaissance. You really kind of have to make this assumption of naturalistic, mechanistic cause and effect before you can even get started doing science in the modern sense.

    I would also note that, based on knowing a number of Jewish friends of varying persuasions ranging from haredim to “Modern Orthodox” to Reconstructionist to Reform to completely secular, there is a radical loss of interest in science and clinging to Creationism that sets in as one moves from the “Modern Orthodox” persuasion further to the right.

  25. Daniel Moskovich Says:

    BD”E. I was very much hoping I would someday have the chance to meet Bekenstein myself and to talk to him about stuff, but now that won’t happen in this world.

    Regarding religion, I don’t imagine Bekenstein was bothered by matters of religion vs. science: we try to understand what we can observe, and what we can’t observe needn’t scientifically concern us unduly.

  26. Whewell’s Gazette: Year 2, Vol. #06 | Whewell's Ghost Says:

    […] Shtetl–Optimized: Jacob Bekenstein (1947–2015) […]

  27. fred Says:

    Scott #23
    “Of course, one could reasonably wonder: if a simple computer program accounts for all observable phenomena, then what role could possibly be left for an interventionist deity like that of the Old Testament?”

    God had to program that computer, then pick his program’s initial conditions just so that he would himself appear in the output.