## Summer recapitulates life

Last week, I was back at the IAS in Princeton, to speak at a wonderful PITP summer school entitled “From Qubits to Spacetime,” co-organized by Juan Maldacena and Edward Witten. This week, I’ll be back in Waterloo, to visit old and new friends at the Perimeter Institute and Institute for Quantum Computing and give a couple talks. Then, over the weekend, I’ll be back in Boston to see old friends, colleagues, and students. After some other miscellaneous travel, I’ll then return to Austin in late August when the semester begins. The particular sequence IAS → Waterloo → Boston → Austin is of course one that I’ve followed before, over a longer timescale.

Two quick announcements:

First, at the suggestion of reader Sanketh Menda, I’m thinking of holding a *Shtetl-Optimized* meetup in Waterloo this week. Please send me an email if you’re interested, and we’ll figure out a time and place that work for everyone.

Second, many of the videos from the IAS summer school are now available, including mine: Part I and Part II. I cover some basics of complexity theory, the complexity of quantum states and unitary transformations, the Harlow-Hayden argument about the complexity of turning a black hole event horizon into a firewall (with my refinement), and my and Lenny Susskind’s work on circuit complexity, wormholes, and AdS/CFT. As a special bonus, check out the super-embarrassing goof at the beginning of my first lecture—claiming a mistaken symmetry of conditional entropy and even attributing it to Edward Witten’s lecture! (But Witten, who I met for the first time on this visit, was kind enough to call my talk “lots of fun” anyway, and give me other positive comments, which I should put on my CV or something.)

**Addendum:** Many of the PITP videos are well worth watching! As one example, I found Witten’s talks to be shockingly accessible. I’d been to a previous talk of his involving Khovanov homology, but beyond the first few minutes, it went so far over my head that I couldn’t tell you how it was for its intended audience. I’d also been to a popular talk of Witten’s on string theory, but that’s something he could do with only 3 awake brain cells. In these talks, by contrast, Witten proves some basic inequalities of classical and quantum information theory, then proves the Reeh-Schlieder Theorem of quantum field theory and the Hawking and Penrose singularity theorems of GR, and finally uses quantum information theory to prove positive energy conditions from quantum field theory that are often needed to make statements about GR.

Comment #1 July 24th, 2018 at 11:15 am

Any chance of a *Shtetl-Optimized* meetup in Boston too?

Comment #2 July 24th, 2018 at 11:28 am

atreat #1: OK, sure! Who’s interested? Shoot me an email if you are.

Comment #3 July 24th, 2018 at 5:14 pm

I’d be interested in a Boston meetup, and have heard positive responses from at least three others in the local community.

Comment #4 July 25th, 2018 at 1:29 am

Random questions, I apologize if this is “basic” as these are new thoughts to me and I’m probably just not googling the correct terms:

How can an oracle truly be an oracle if it cannot model the entire universe as observed by me at my particular location and orientation in space-time, plus what all my future (and past) locations and orientations will be in the universe w.r.t. space-time?

And, on Newcomb’s Paradox — what if my decision is based on examining the Predictor, in that I will only pick both boxes if the Predictor’s main processing loop halts, or some variation where I respond based on whether or not the Predictor halts? Wouldn’t that Predictor then need to solve the halting problem?

I’m probably being “dumb” here and proposing a thing that violates the terms of the paradox, but under this scenario I’d consider it meaningless. Also, due to how I view the first question, not only is this undecidable, but the only real Predictor would be me situated in a perfect simulation of the entire universe. It seems to me like there’s a connection here to symmetry, in that putting “me” either a.) into a box that isn’t me, or b.) at a different “place”, is not a symmetrical operation.

Comment #5 July 25th, 2018 at 1:36 am

…Or, perhaps, I hypothesize that there can be no gauge-symmetrical version of me, therefore an oracle that predicts “me” cannot exist separate from me. Conversely, an “oracle” can only exist for an object that is “completely” gauge-symmetric, in that all of the properties are gauge invariant.

Comment #6 July 25th, 2018 at 1:57 am

Andrew #4: Your first question is just a matter of how we use the word “oracle” in computer science. We don’t mean an omniscient being that knows the answers to life, the universe, and everything—we just mean a hypothetical device that’s able to solve a particular computational problem (for example, 3SAT, or some other specific problem of interest) in a single time step.

For your second question: yes, if the Chooser has oracle access to the Predictor, then Newcomb’s problem leads to a flat-out logical contradiction, just as you say. In the usual setup, however, the Chooser does *not* have access to the Predictor: the Chooser is “ordinary” (we’re to imagine a being much like ourselves), and the Predictor is flat-out more powerful. In that case you don’t get a contradiction (at least, not without some additional assumptions from decision theory), but “merely” a very interesting dilemma faced by the Chooser.

Comment #7 July 25th, 2018 at 7:34 am

Hello Scott, I could swing by Boston for a S.O. meetup.

Comment #8 July 25th, 2018 at 9:44 am

Hi Scott,

a very basic noob question.

Isn’t complexity theory always making the implicit assumption that basic arithmetic operations (like addition) can be done in constant time regardless of the number of bits in the data?

Which, in a practical classical computer, is only achievable for a limited number of bits (like 64-bit registers); but once you want to do something like drawing the Mandelbrot set at arbitrary zoom levels, you need to manually re-implement all arithmetic operations on arbitrary-size arrays, and it all slows down the higher the zoom level.

How does this issue affect quantum computers?

Comment #9 July 25th, 2018 at 10:07 am

fred #8: No. In Turing machines, quantum circuits, and other models of computation based on bits or qubits, what can be done at constant cost is only operations on constant numbers of *bits*, not unbounded-length integers.

In practical algorithm design, it’s often convenient to adopt the *fiction* that you can do arithmetic operations at unit cost, simply because you’re uninterested in the tedious counting of bit-operations below that. And in the field of algebraic complexity, that fiction even gets promoted to a foundational assumption—i.e., you imagine that (e.g.) addition and multiplication even of arbitrary reals or complex numbers can be done at unit cost, and then you explore what the consequences are (for example, does the permanent then have polynomial-size circuits?).

But believe it or not, not all of us in CS are complete morons 🙂 who confuse these convenient fictions with reality. In any situation where someone was claiming to perform an immense computation at almost zero cost, by shoehorning the computation into exponentially large integers and then assuming unit-cost arithmetic—well then, the way to expose that person is to abandon the unit-cost idealization and insist on a full accounting of bit-operations, just like Turing did in 1936, and just as we continue to do in all the parts of complexity theory where the unit-cost assumption isn’t useful.
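To make the “full accounting of bit-operations” concrete, here is a minimal Python sketch of a ripple-carry adder. (The count of 5 Boolean gates per bit position is my own illustrative bookkeeping, not a standard constant.) The point is just that adding two n-bit integers costs Θ(n) bit-operations, so arithmetic is not a unit-cost primitive once the numbers outgrow a fixed word size.

```python
def add_bits(a, b):
    """Ripple-carry addition of two little-endian bit lists.
    Returns (sum_bits, bit_op_count); the count grows linearly
    with the number of bits, so big-integer addition is not unit cost."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    out, carry, ops = [], 0, 0
    for i in range(n):
        s = a[i] ^ b[i] ^ carry                          # sum bit
        carry = (a[i] & b[i]) | (carry & (a[i] ^ b[i]))  # carry-out
        out.append(s)
        ops += 5  # the five Boolean gates used in the two lines above
    if carry:
        out.append(1)
    return out, ops

def to_bits(x):
    """Nonnegative int -> little-endian bit list."""
    return [int(c) for c in bin(x)[2:][::-1]] if x else [0]

def from_bits(bits):
    """Little-endian bit list -> int."""
    return sum(bit << i for i, bit in enumerate(bits))
```

Schoolbook multiplication, built out of such additions, then costs Θ(n²) bit-operations, which is the accounting that defeats any “free lunch” from packing a computation into exponentially large integers.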

Comment #10 July 25th, 2018 at 11:16 am

Thanks Scott!

From a computational complexity point of view, is it obvious whether computing a 100×100 image of the Mandelbrot set at a higher and higher zoom factor (assuming the same level of average recursion for the pixels) has to be more and more expensive?

We clearly need more and more significant digits as we zoom in, but isn’t there a way to “re-normalize” the domain and the recursion function to make the cost constant?

Especially considering the nature of fractals – self-similarity is scale invariance.
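For context, the per-pixel computation being described is the standard escape-time iteration; a minimal Python sketch using hardware complex floats, whose roughly 53 bits (~16 decimal digits) of precision are exactly what run out at deep zoom:

```python
def escape_time(c, max_iter=100):
    """Iterate z -> z^2 + c from z = 0; return the step at which |z|
    exceeds 2 (proving escape), or max_iter if it never does."""
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return n + 1
    return max_iter
```

Zooming in much past ~10^15 requires replacing the hardware floats with arbitrary-precision arithmetic, at which point each iteration’s cost grows with the number of digits carried, which is the slowdown described above.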

Comment #11 July 25th, 2018 at 5:16 pm

I would definitely come to the meetup in Waterloo, if you still have time. Assuming I didn’t miss it?

Comment #12 July 25th, 2018 at 6:13 pm

Devin #11: Sorry you missed it! Hope to meet you tomorrow after my talk.

Comment #13 July 25th, 2018 at 6:18 pm

fred #10: Interesting question! At least judging by this MathOverflow thread, the answer seems to be that it’s not known. Indeed, Andrej Bauer says that it’s not even known whether the following problem is decidable (let alone efficiently decidable): given a disk, specified by a rational center and a rational radius, determine whether the disk intersects the Mandelbrot set.

Comment #14 July 25th, 2018 at 8:14 pm

Ah, shoot, I missed the meetup. Didn’t check my blogroll until today. I assume the talks you’re giving tomorrow aren’t public? I’m unfortunately a layman rather than a researcher.

Comment #15 July 26th, 2018 at 12:28 am

Harrison #14: No, both of my talks, like almost all academic talks, are totally open to anyone. The one tomorrow is about gentle measurement of quantum states and differential privacy. It’s a research talk, not a popular talk, but you’re welcome to come and introduce yourself afterwards if you’re interested.

Comment #16 July 26th, 2018 at 10:04 am

Scott #13

Thanks Scott!

Even in the case of a straightforward function f(x) = x^2, I was still a bit puzzled as to why the cost of evaluation depends on the representation of x.

E.g. If we sample 10 values within a domain, the smaller the domain, the more bits we need, in general (evaluation for x = 0.100000000000101 is more costly than evaluation for x = 0.1).

It’s just not something we observe in the real world – nature doesn’t care where things happen … but if the real world were a simulation, then position would matter?!

But what I was missing (I think) is that it all boils down to the relation of the function and the “units” considered.

In nature, we never have functions like f(x) = x^2; all physical laws have constants in them which would require a huge amount of bits to capture exactly (at the very least corresponding to a zoom level all the way down to the Planck scale?).

Those constants are what’s bringing a huge amount of bits in the computation (regardless of the compactness of position representation).
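The puzzle about representation-dependent cost can be seen directly with exact rational arithmetic; a small Python sketch (the particular numbers are just illustrative):

```python
from fractions import Fraction

def square(x):
    return x * x

a = Fraction(1, 10)                    # "0.1": a few bits suffice
b = Fraction(900719925474099, 2**53)   # a nearby value needing ~53 bits

# Same mathematical operation, very different operand sizes:
small = square(a).denominator.bit_length()   # denominator 100: 7 bits
large = square(b).denominator.bit_length()   # denominator 2^106: 107 bits
print(small, large)
```

Squaring roughly doubles the bit-length of an exact rational, so evaluating x² genuinely costs more for an x whose representation needs more bits. That is all the “position-dependent cost” amounts to: it is a property of the encoding, not of the function.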

Comment #17 July 26th, 2018 at 10:35 am

I’d certainly be down for attending a Boston meetup if it happens at any point.

Comment #18 July 29th, 2018 at 8:41 pm

Fred #16

It’s possible I’m missing what you’re asking because I don’t think your question and your example line up, but let me give a shot at answering “Even in the case of a straightforward function f(x) = x^2, I was still a bit puzzled as to why the cost of evaluation depends on the representation of x.”

The fact that different representations result in different costs isn’t a fact about algorithms or about the functions themselves. It’s a fact about computer architecture. If there were One True Way to build a computer, or if there were only one computer in the world, we wouldn’t be having this conversation. In the case of the universe being a simulation, the architecture of that particular computer would determine run-times, but there’d be no way to observe this from within the simulation itself. If the outside-the-simulation world works roughly like ours and you had access to an oracle for the current time outside the simulation in the reference frame of the computer, you could determine properties of its architecture by using the oracle to observe the universe “running slowly.”

Comment #19 July 30th, 2018 at 4:38 pm

Dear Prof. Aaronson,

At one point I thought you had done some outreach with Native Americans. But my memory is vague and Google unhelpful.

Do you think you could point me in the correct direction?

Many thanks,

Comment #20 July 30th, 2018 at 5:04 pm

Stella #18

I’m actually looking at things from the simulator’s point of view 😛

When it comes to regular smooth functions, I would think that the cost of evaluating the function over a certain small window, for a given number of samples, and for a given relative precision, should be independent of scale/zoom level.

Especially considering that given a sufficient zoom level, all smooth functions become linear.

There should be a way to show that transforming the input dimensions so that the domain of interest is now centered around zero (making sure that all numbers can be expressed more compactly) guarantees a constant cost. But I’m not sure how to express that yet.

Basically if you’re using floats for your simulation, you want the computations to be done around the zero, so that all your bits can be put to good use for maximum precision (instead of “wasted” in the distance to origin).

Note that here I’m talking about doing things in a small window of interest (x0 +/- dx), I’m not talking about another type of issue when doing a simulation both at a large scale and small scale at once, e.g. when you try to simulate the entire galaxy, at a precision of a few inches (you then have to use some hierarchical scheme).

When it comes to fractals (involving non-obvious recursive functions), you would think that the cost of rendering the same relative area at different zoom level should be constant as well (assuming you’re hitting the average density of interest).

Creatures living at 10X zoom level or 1,000,000 times zoom level see a world that has roughly the same characteristics, by definition of fractals.

So, the cost of rendering the same relative area, at same relative precision, should be independent of scale as well, by some transformation… I just don’t want to have to re-implement arithmetic for arbitrary precision!
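The recentering idea above can be demonstrated with ordinary double precision; a small Python sketch (x0 and dx are arbitrary illustrative values, chosen so dx falls below the float spacing near x0):

```python
# Computing the increment of f(x) = x^2 far from the origin.
x0, dx = 1e8, 5e-9

# Direct: dx is smaller than half the float spacing (ulp) near 1e8,
# so x0 + dx rounds back to x0 and the increment vanishes entirely.
direct = (x0 + dx) ** 2 - x0 ** 2        # exactly 0.0

# Recentered: expand f around x0, so the mantissa bits are spent
# on the increment itself instead of on the distance to the origin.
recentered = 2 * x0 * dx + dx * dx       # ~1.0, the correct answer

print(direct, recentered)
```

This is precisely the “keep the computation near zero” point: shifting the window of interest to the origin lets the same 53 bits of a double represent the quantity you actually care about.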

Comment #21 July 30th, 2018 at 8:19 pm

Rhenium #19: Such outreach sounds like it could be interesting! But I’m honestly not sure what you’re referring to.

Comment #22 July 31st, 2018 at 8:43 am

Dear Scott,

reading about the complexity of quantum operations reminded me of a question on Stackexchange I bookmarked a long time ago, and which completely puzzled me:

https://cstheory.stackexchange.com/questions/23766/why-is-bqpspace-in-pspace-if-it-can-have-doubly-exponentially-long-running-times

Basically, say I have a classical TM counting on a tape, it can count to exp(length of the tape). That is clearly in PSPACE. But in BQPSPACE, I could count to exp(exp(length of tape)), no? Or does that again depend on the representation of the number?

Greetings,

Sven

Comment #23 July 31st, 2018 at 10:30 am

Sven #22: You could only do that using a probabilistic/approximate counting algorithm. And if you were OK with such an algorithm, you could also get one classically, with no QM needed. For deterministic counting, what’s relevant is not the number of nearly pairwise-orthogonal states (which is indeed doubly exponential in QM), but the number of exactly pairwise-orthogonal states (which is “merely” singly exponential).

Comment #24 July 31st, 2018 at 6:01 pm

“… the Hawking and Penrose singularity theorems …” After quantum averaging, are Einstein’s field equations 100% correct? The empirical successes of Milgrom’s MOND might indicate that something is seriously wrong with Newtonian-Einsteinian gravitational theory. Google “kroupa milgrom”, “mcgaugh milgrom”, “sanders milgrom”, and “scarpa milgrom”.

Comment #25 August 2nd, 2018 at 1:04 pm

Scott

Can you refer to the most up to date survey or a book on quantum computer hardware/design? A text written for physicists would be great.

thanks

a.

Comment #26 August 2nd, 2018 at 7:06 pm

Argyn #25: I’m not sure. While the underlying principles haven’t changed all that much since Nielsen & Chuang, QC hardware is evolving sufficiently quickly right now that I’m not sure if there’s *any* book that’s really up to date. (That’s why we have the arXiv!) Would anyone else like to field this one?

Comment #27 August 2nd, 2018 at 9:35 pm

Scott, when will you post your take on the work of the recently announced Fields medalists? It would be interesting to hear how and to what extent their work has influenced your own.

Comment #28 August 2nd, 2018 at 11:10 pm

Hong-Konger #27: I’m not qualified to comment on the work of any of the four new Fields Medalists. I was just reading about them today, going back and forth among Quanta magazine, Wikipedia, and the abstracts of some of their papers to try to get a sense. The problems are obviously very far removed from anything I’ve worked on.

The Nevanlinna winner, Costis Daskalakis, by contrast, is a former colleague of mine from the theory group at MIT. I know him quite well, and also know some of his great results—most famously on the hardness of finding Nash equilibria, but also some of his other stuff on e.g. the complexity of auctions. Huge congratulations to Costis!

Comment #29 August 9th, 2018 at 12:22 am

‘For your second question: yes, if the Chooser has oracle access to the Predictor, then Newcomb’s problem leads to a flat-out logical contradiction, just as you say. In the usual setup, however, the Chooser does not have access to the Predictor: the Chooser is “ordinary” (we’re to imagine a being much like ourselves), and the Predictor is flat-out more powerful. In that case you don’t get a contradiction (at least, not without some additional assumptions from decision theory), but “merely” a very interesting dilemma faced by the Chooser.’

Thanks for your response, I appreciate your time! That makes sense. Is it possible to simply know if the Predictor is or is not “halted” within the bounds of the paradox? To me, being able to know the answer to that question at any time T I choose is essentially the same thing as being able to peer inside the Predictor, maybe examine source code, or do other actions to determine if it will “halt” in the future. Because, regardless of how much “access” I have to the Predictor, the Predictor’s job is still essentially the same (“will I halt at time T?”). I don’t know how “time” formally plays into the bounds of the paradox, but I envision “reducing” myself to a very simple “machine” w.r.t. the Predictor:

1.) I walk into the room. There are two boxes and I can pick box b or both a & b.

2.) I wait some amount of time T which I determine at random in the moment (maybe I pick it beforehand, maybe I just stand around awhile until I feel like making my observation). At the end of that time, I observe if the Predictor has “halted”. Maybe there’s a glowing red LED on the front and I observe whether or not that LED is still lit to determine if it has “halted”, or some other set of physical indicators that show the machine’s current state. This requires me to know “something” about the Predictor, but I don’t need to know anything beyond what I can physically observe in the moment. If the Predictor is some super-intelligent being, then I observe whether or not said being is “alive” at the moment.

3.) If the Predictor has “halted”, I take both boxes. Otherwise I take box b.

It seems that if I want to limit this to a purely decision-theory problem, it’s necessary to completely isolate the Predictor from me so I can’t make a decision based on its current observed state. It also seems that by making my decision in this way, I win because an inherent property of the Predictor seems to be that it doesn’t halt/die at random points in time. Hence predicting that it will be “alive”/not halted is a safe bet. Also, this implies that “oracle”-like access to the Predictor is, in the context of this problem, essentially the same as being able to observe the Predictor’s current state in some way.

Comment #30 August 9th, 2018 at 12:48 pm

@Andrew 29:

Assuming the usual rules, the Predictor has already halted by the time you are offered the choice. Allowing you to choose before then would defeat the purpose, since the eventual “prediction” would be of an action you’ve already taken.

Comment #31 August 12th, 2018 at 11:52 pm

@Ben 30: [took me two posts to figure out the protocol, sorry!]

Ah, o.k. So, when I walk into the room I replace step 2 with:

I have a smartphone in my pocket. I pull it out and bring up the hot new app “Am I Halt Or Not”. You can swipe left or right — left brings up a new random program submitted by various random people on the internet, right selects a particular random program. Once selected, I provide some “random” input using finger-wiggling on the screen and hit “go”. After a certain amount of time, I hit “stop” and an indicator will be either green or red — green if the program wasn’t halted, red if it was halted.

I make the same set of decisions based on the result. To me the temporal bit of when the Predictor makes its prediction is less important, but again maybe I’m somehow violating the rules of the game.

Comment #32 August 13th, 2018 at 1:01 pm

Andrew: Is there some reason for all this complication rather than just saying, “I make my decision by flipping a coin?”

In any case the usual response to this is to restate the problem to say that, if the predictor predicts that you will randomize, it does not place the money in the box.

Comment #33 August 13th, 2018 at 3:48 pm

@Sniffnoy 32:

I’m trying to probe the confines of what the “oracle” or “predictor” is and can actually do. If it gets a pass by saying “oh, the choice will be random, no soup for you”, then I suppose that’s part of the “paradox”. It seems though like the paradox is rather limited in scope. I feel a weird inkling that a lot of decision theory basically posits a totally rational world at the outset, when that’s far from being the case. Hence to me its utility is rather limited, other than to maybe make statistical predictions about things. And, if all the “oracle” is is a really good guesser, then what sort of special power is that? From the way the paradox is framed, it seems like the “oracle” has some magical ability to predict what I’m going to do. Well, it’s easy enough for me to turn that decision over to some other thing X, so that the predictor’s actual job is to predict the behavior of X, rather than me. And if that’s the case, then it’s easy to tie it into knots by giving it a problem it can’t solve. Basically in my mind it boils down to being somewhat pointless, like “all-powerful being moving an immovable object”.

I guess when I think of “paradox”, I think of Schrödinger’s Cat or something similar where there’s a real underlying philosophical question of interpretation that may also spill over into experimentation. Hence to me, Newcomb’s Paradox then really only exists as a paradox if you confine it to a rather limited “space” governed by the mathematics of decision theory, rather than being a statement about the larger universe and how it may operate.