## Teaching quantum in junior high: special Thanksgiving guest post by Terry Rudolph

Happy Thanksgiving!

People have sometimes asked me: “how do you do it?  how do you do your research, write papers, teach classes, mentor grad students, build up the quantum center at UT, travel and give talks every week or two, serve on program committees, raise two rambunctious young kids, and also blog and also participate in the comments and also get depressed about people saying mean things on social media?”  The answer is that increasingly I don’t.  Something has to give, and this semester, alas, that something has often been blogging.

And that’s why, today, I’m delighted to have a special guest post by my good friend Terry Rudolph.  Terry, who happens to be Erwin Schrödinger’s grandson, has done lots of fascinating work over the years in quantum computing and the foundations of quantum mechanics, and previously came up on this blog in the context of the PBR (Pusey-Barrett-Rudolph) Theorem.  Today, he’s a cofounder and chief architect at PsiQuantum, a startup in Palo Alto that’s trying to build silicon-photonic quantum computers.

Terry’s guest post is about the prospects for teaching quantum theory at the junior high school level—something he thought about a lot in the context of writing his interesting recent book Q is for Quantum.  I should stress that the opinions in this post are Terry’s, and don’t necessarily reflect the official editorial positions of Shtetl-Optimized.  Personally, I have taught the basics of quantum information to sharp junior high and high school students, so I certainly know that it’s possible.  (By undergrad, it’s not only possible, but maybe should become standard for both physics and CS majors.)  But I would also say that, given the current state of junior high and high school education in the US, it would be a huge step up if most students graduated fully understanding what’s a probability, what’s a classical bit, what’s a complex number, and any of dozens of other topics that feed into quantum information—so why not start by teaching the simpler stuff well?  And also, if students don’t learn the rules of classical probability first, then how will they be properly shocked when they come to quantum? 🙂

Can we/should we teach Quantum Theory in Junior High?

by Terry Rudolph

Should we?

Reasons which suggest the answer is “yes” include:

Economic: We are apparently heading into a labor-market shortage of quantum engineers.  We should not, however, need the recent hype around quantum computing to make the economic case – the frontier of many disparate regions of the modern science and technology landscape is quantum.  Surely if students do decide to drop out of school at 16 they should at least be equipped to get an entry-level job as a quantum physicist?

Educational: If young peoples’ first exposures to science are counterintuitive and “cutting edge,” it could help excite them into STEM.  The strong modern quantum information theoretic connections between quantum physics, computer science and math can help all three subjects constructively generate common interest.

Pseudo-Philosophical: Perhaps our issues with understanding/accepting quantum theory are because we come to it late and have lost the mental plasticity for a “quantum reset” of our brain when we eventually require it late in an undergraduate degree.  It may be easier to achieve fluency in the “language of quantum” with early exposure.

Can we?

There are two distinct aspects to this question: Firstly, is it possible at the level of “fitting it in” – training teachers, adjusting curricula and so on?  Secondly, can a nontrivial, worthwhile fraction of quantum theory even be taught at all to pre-calculus students?

With regards to the first question, as the child of two schoolteachers I am very aware that an academic advocating for such disruption will not be viewed kindly by all.  As I don’t have relevant experience to say anything useful about this aspect, I have to leave it for others to consider.

Let me focus for the remainder of this post on the second aspect, namely whether it is even possible to appropriately simplify the content of the theory.  This month it is exactly 20 years since I lectured the first of many varied quantum courses I have taught at multiple universities. For most of that period I would have said it simply wasn’t possible to teach any but the most precocious of high school students nontrivial technical content of quantum theory – despite some brave attempts like Feynman’s use of arrows in QED: The Strange Theory of Light and Matter (a technique that cannot easily get at the mysteries of two-particle quantum theory, which is where the fun really starts).  I now believe, however, that it is actually possible.

A pedagogical method covering nontrivial quantum theory using only basic arithmetic

My experience talking about quantum theory to 12-15 year olds has only been in the idealized setting of spending a few hours with them at science fairs, camps and similar.  In fact it was on the way to a math camp for very young students, desperately trying to plan something non-trivial to engage them with, that I came up with a pedagogical method which I (and a few colleagues) have found does work.

I eventually wrote the method into a short book Q is for Quantum, but if you don’t want to purchase the book then here is a pdf of Part I, which takes a student knowing only the rules of basic arithmetic through to learning enough quantum computing that they can understand the Deutsch–Jozsa algorithm.  In fact not only can they do a calculation to see how it works in detail, they can appreciate conceptual nuances often under-appreciated in popular expositions, such as why gate speed doesn’t matter – it’s all about the number of steps; why classical computing also can have exponential growth in “possible states,” so interference is critical; why quantum computers do not compute the uncomputable; and so on.

Before pointing out a few features of the approach, here are some rules I set myself while writing the book:

• No analogies, no jargon – if it can’t be explained quantitatively then leave it out.
• No math more than basic arithmetic and distribution across brackets.
• Keep clear the distinction between mathematical objects and the observed physical events they are describing.
• Be interpretationally neutral.
• No soap opera: Motivate by intriguing with science, not by regurgitating quasi-mythological stories about the founders of the theory.
• No using the word “quantum” in the main text! This was partly to amuse myself, but I also thought if I was succeeding in the other points then I should be able to avoid a word almost synonymous with “hard and mysterious.”

One of the main issues to confront is how to represent and explain superposition.  It is typical in popular expositions to draw analogies between a superposition of, say, a cat which is dead and a cat which is alive by saying it is dead “and” alive.  But if superposition was equivalent to logical “and”, or, for that matter, logical “or”, then quantum computing wouldn’t be interesting, and in this and other ways the analogy is ultimately misleading.  The approach I use is closer to the latter – an unordered list of possible states for a system (which is most like an “or”) can be used to represent a superposition. Using a list has some advantages – it is natural to apply a transformation to all elements of a list, for instance doubling the list of ingredients in a recipe.  More critically, given two independent lists of possibilities the new joint list of combined possibilities is a natural concept.  This makes teaching the equivalent of the Kronecker (tensor) product for multiple systems easy, something often a bit tricky even for undergrads to become comfortable with.
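For concreteness, here is a minimal Python sketch of the list-combination idea (my own illustrative encoding, not the book’s notation): each system is an unordered list of possible ball configurations, and combining two independent systems just pairs up every possibility from one list with every possibility from the other.

```python
# Hypothetical encoding of the "lists of possibilities" idea (not the book's code):
# a system's state is a list of possible configurations, e.g. ["W", "B"].

def combine(list_a, list_b):
    """Joint possibilities for two independent systems: every pairing of one
    entry from each list -- the list analogue of the Kronecker product."""
    return [a + b for a in list_a for b in list_b]

# One ball that could be white or black, next to another such ball:
print(combine(["W", "B"], ["W", "B"]))  # ['WW', 'WB', 'BW', 'BB']
```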

Conceptually the weirdest part of the whole construction, particularly for someone biased by the standard formalism, is that I use a standard mathematical object (a negative or minus sign) applied to a diagram of a physical object (a black or white ball).  Moreover, positive and negative balls in a diagram can cancel out (interfere).  This greatly simplifies the exposition, by removing a whole level of abstraction in the standard theory (we do not need to use a vector containing entries whose specific ordering must be remembered in order to equate them to the physical objects).  While it initially seemed odd to me personally to do this, I have yet to have any young person think of it as any more weird than using the negative sign on a number.  And if it is always kept clear that drawing and manipulating the whole diagram is an abstract thing we do, which may or may not have any correspondence to what is “really going on” in the physical setups we are describing, then there really is no difference.
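To illustrate the signed-ball idea, here is a small Python sketch (my own representation, assuming a `(sign, configuration)` encoding that is not in the book): opposite-signed copies of the same diagram cancel, which is all that “interference” amounts to in this formalism.

```python
from collections import Counter

def interfere(possibilities):
    """Cancel opposite-signed copies of the same ball configuration,
    mimicking how positive and negative balls cancel in the diagrams."""
    counts = Counter()
    for sign, config in possibilities:
        counts[config] += sign
    return [(s, c) for c, s in counts.items() if s != 0]

# +WB and -WB interfere and cancel; +BB survives:
print(interfere([(+1, "WB"), (-1, "WB"), (+1, "BB")]))  # [(1, 'BB')]
```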

There are some subtleties about the whole approach – while the formalism is universal for quantum computing, it can only make use of unitary evolution that is proportional to a matrix with integer entries.  Thus the Hadamard gate (PETE box) is ok, and the Controlled-NOT and Toffoli likewise, but a seemingly innocuous gate like the controlled-Hadamard cannot be incorporated (without adding a whole bunch of unintuitive and unjustified rules).  The fact that the approach covers a universal gate set means some amazing things can be explained in this simple diagrammatic language.  For example, the recent paper Quantum theory cannot consistently describe the use of itself, which led to considerable discussion on this blog, can be fully reproduced.  That is, a high school student can in principle understand the technical details of a contemporary argument between professional physicists.  I find this amazing.
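As a sketch of how far integer entries can take you, here is the PETE box written as a rule on signed balls in Python (my own encoding, under the assumption that PETE acts as the unnormalized Hadamard: W goes to W + B, and B goes to W − B): applying it twice makes the B components cancel, so a white ball comes out white with certainty.

```python
from collections import Counter

# Assumed encoding: (sign, ball) pairs; the PETE box acts as the integer
# (unnormalized) Hadamard: W -> W + B, and B -> W - B.

def pete(sign, ball):
    if ball == "W":
        return [(sign, "W"), (sign, "B")]
    return [(sign, "W"), (-sign, "B")]

def pete_all(possibilities):
    """Apply the PETE box to every signed ball in a list of possibilities."""
    out = []
    for sign, ball in possibilities:
        out.extend(pete(sign, ball))
    return out

def interfere(possibilities):
    """Cancel opposite-signed copies of the same ball."""
    counts = Counter()
    for sign, ball in possibilities:
        counts[ball] += sign
    return [(s, b) for b, s in counts.items() if s != 0]

# Two PETE boxes in a row: the B parts cancel, leaving only W (with weight 2):
print(interfere(pete_all(pete(+1, "W"))))  # [(2, 'W')]
```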

Based on communication with readers I have come to realize the people at most risk of being confused by the book are actually those already with a little knowledge – someone who has done a year or two’s worth of undergraduate quantum courses, or someone who has taken things they read in pop-sci books too literally.  Initially, as I was developing the method, I thought it would be easy to keep “touching base” with the standard vector space formalism.  But in fact it becomes very messy to do so (and irrelevant for someone learning quantum theory for the first time).  In the end I dropped that goal, but now realize I need to develop some supplementary notes to help someone in that situation.

Q is for Quantum is certainly not designed to be used as a classroom text – if nothing else my particular style and choice of topics will not be to others’ tastes, and I haven’t included all the many, many simple examples and exercises I have students do along with me in class when I actually teach this stuff.  It should be thought of as more a “proof of principle” that the expository challenge can be met.  Several colleagues have used parts of these ideas already for teaching, and they have given me some great feedback.  As such I am planning on doing a revised and slightly expanded version at some point, so if you read it and have thoughts for improvement please send them to me.

### 48 Responses to “Teaching quantum in junior high: special Thanksgiving guest post by Terry Rudolph”

1. jonas Says:

> (By undergrad, [teaching quantum theory is] not only possible, but maybe should become standard for both physics and CS majors.)

Scott: should it be standard for mathematicians who aren’t studying physics or CS? I studied mathematics in university, and learned very little about quantum theory. I was taught a lot of the particular prerequisites that you mention.

2. Scott Says:

jonas #1: I guess it’s not my place to say. I think that some exposure to QI will increasingly be part of being an educated computer scientist, and I’ll also go to the mat for the claim that, for many students, this is the best way ever discovered to introduce them to QM, which of course the physicists already care about. But it’s for mathematicians to decide whether this (as far as we know) exact realization of complex linear algebra, by the deepest level of physical reality, should be part of the core of math education.

I’ll say this, though: when I learned about eigenspaces, diagonalization, trace, matrix exponentiation, etc. in my Honors Linear Algebra class, as a 16-year-old at Cornell—well, I could do it, but I didn’t get the motivation for any of it. Had I been better at math, I would have, but I didn’t. Then, a couple years later, I learned quantum information, and suddenly it all snapped into place. And to this day I often use ket notation even for doing linear algebra that has nothing to do with QM.

3. John Sidles Says:

My copy of Q is for Quantum is scheduled for Monday delivery … and thank you, Terry, for so thoughtfully providing free public access to the first half (http://qisforquantum.org/wp-content/uploads/2018/11/PART-I-draft-Q-is-for-Quantum.pdf)!

PS: thoughtful public discussion of any book, when shared with librarians, concretely motivates libraries to acquire the book being discussed … so if a few Shtetl Optimized readers are willing to post their thoughts, then I am entirely confident that both the Seattle Public Library system and the University of Washington library system (for example) can be induced to stock Q is for Quantum … to the considerable long-term benefit of young readers.

4. jonas Says:

Scott: you are a mathematician and aren’t a physicist, you know quantum theory, and you’re willingly giving lectures in a university. If it’s not your place to say, then I don’t know who to ask.

Your second paragraph, however, makes me less confident of the importance of teaching quantum theory to mathematicians than I was. I find it natural to not appreciate tricky linear algebra at first, and to understand its significance much later when you see its practical consequences, but that happened to me without quantum mechanics. So if that’s the best reason you can give, then I’m happy to leave all that stuff to mathematical physicists like Baez and complexity theorists like you, and if I’m motivated to learn physics, then I’d better spend that effort learning about the various classical physics topics that I know little about.

5. The problem with the gatekeepers Says:

Somehow tangentially related to the topic, since at the end of the day this post is about addressing the question of whether more money should be spent on teaching a particular scientific topic in high school.

What’s Scott’s take on this enlightening article published by the Atlantic https://www.theatlantic.com/science/archive/2018/11/diminishing-returns-science/575665/ that asserts that we are spending more than ever in science and yet at the same time we are getting less return on investment than ever? The co-authors -who come of the world of software entrepreneurship- used a very ingenious methodology to reach the conclusion.

Happy Thanksgiving to everyone!

6. Shmi Says:

Scott #2

> Then, a couple years later, I learned quantum information, and suddenly it all snapped into place.

Interestingly, my experience was the opposite. Learning linear algebra at 17 was an almost physical pleasure. Applying it to numerical methods a year later was neat but annoying, because of all the little details one had to keep track of. Using it in undergrad QM was confusing, because of the idiosyncratic notation, while in grad QM it all finally fell into place. Sadly, it never happened with quantum information, which was enough of a struggle every time I tried to avoid learning it in depth. Yet it came very handy when learning General Relativity, not just because GR made intuitive sense to me, but also because I again used linear algebra for what it was invented: continuous transformations in arbitrary dimensions.

7. John Sidles Says:

Waiting for my copy of Terry Rudolph’s Q is for Quantum motivated me to discover Terry’s recent article “Why I am optimistic about the silicon-photonic route to quantum computing” (APL Photonics, 2017). Particularly relevant to perennial Shtetl Optimized concerns is the Appendix of Terry’s article, which is sub-titled “Sociological Musings”:

Inevitably different routes to building a quantum computer will compete for resources. That time should not be now. … Every route to quantum computing that is being pursued seriously is pushing us to a deeper understanding of the physics of the systems involved, as well as pushing engineering boundaries.

Be that as it may, any individual scientist has finite resources and so has to pick what to work on. The mundane, unspoken drivers of this choice are normally what specific physics we understand best given our background, and the fact you can only raise funding for stuff you are already acknowledged as an expert in. As such few people work with systems much different from those they did their early research in, despite us all knowing this was primarily an accident of geography.

The spoken justification for our choice, however, is often of the form “my choice is going to be the winner of the race” (a fact we should blame partly on our individual human competitiveness and partly on funding agencies and their government overlords, who expect such tripe from us). Workers in quantum photonics can, I think, say this with the same level of conviction as those in any other area.

These are well-grounded observations (in my view)!

It is interesting that Terry’s arguments accommodate too — mathematically, physically, and even psychologically — the Kalai-compatible thesis that (in Terry’s phrase) “The ECT is going to be the winner of the race” … in that no other passage or argument in Terry’s entire article need be modified, to accommodate the possibility that the ECT is true.

Supposing then, for the sake of discourse, that in the relatively near future, the PsiQ Corporation discerns that technologically feasible devices for photonic computation invariably can be ingeniously simulated by efficient Church-Turing (ECT) computations … well heck … that’s no problem! … because ECT algorithms, and their associated software, and (most important) the physical understanding that is fostered by algorithms and software, can themselves compose a product line for companies like PsiQ … a product line at least as valuable as “mere” quantum computer chips.

These considerations lead us (or me anyway) to appreciate the futility of “motte and bailey” quantum informatic discourse, that is, technological discourse in which, too readily, optimistic claims for near-term Quantum Supremacy serve as a psychological “motte” that defensively lays claim to a “bailey” of scalable quantum computations.

In summary, ECT fans (like me) prefer a technologically balanced style of scientific discourse; such discourse is exemplified (for me) by lectures like Sir Anthony Leggett’s “Condensed matter theory from a quantum information perspective” (2015, video here) and “Quantum phase slips in superconducting nanowires” (2012, video here; as background, quantum phase-slip (QPS) current standards are potentially central to NIST’s practical realization of a “Quantum SI”).

The common-sense point is that Terry Rudolph’s admirable optimism in respect to scalable quantum computing depends crucially upon the appreciation of condensed matter dynamics that Tony Leggett’s lectures foster … an appreciation of quantum dynamics that is broadly shared, technologically creative, dynamically superradiant, and (of course) eminently ECT-compatible.

8. Gene Chase Says:

I “learned” QM as a junior mathematics major at MIT in 1963 in a required physics course. But I didn’t understand it until I read Chapter 9 “Quantum” in Scott’s book Quantum Computing since Democritus. Thank you, Scott.

9. Terry Says:

Shmi #6: my experience was almost identical to yours, but it was many years ago. I think modern undergraduate quantum teaching is much better than what I experienced, partly because there are good textbooks now which follow a “qubits first” approach (i.e. start with finite-dimensional QM instead of wave mechanics). Although that was the approach Feynman took in his lectures, it took a very long time to catch on.

10. Craig Gidney Says:

When teaching quantum information it might be beneficial to start even simpler than qubits: with quantum state machines.

Deterministic finite state machines are standard in the CS curriculum. After you explain how to generalize them to probabilistic processes, you can just keep going and explain how to generalize them to quantum processes. This approach has a few possible benefits:

1) You avoid the early conceptual hurdle of going from 1 qubit to 2 qubits, where you need to realize you’re dealing with 2*2 underlying states instead of 2+2 underlying states. States are just states, one per circle in the diagram. Qubits can be introduced later, as a particular way of grouping/labeling states.

2) Conceptually important physical experiments translate into simple state machines that don’t loop back on themselves – Mach–Zehnder interferometers, for example. This gives an easy way to explain how to predict the results of those experiments.

3) Detectors are handled in a way that makes it dead obvious that something odd is going on. In particular, adding a detector doubles the number of circles in your diagram because you need distinct states for each possible configuration of the detector and the system under test.

4) You can contrast with the classical state machines. For example, a Hadamard state machine looks a lot like a coin flip machine but you can show the quantum one cycles between two states while the classical one converges to a single state. Another good example is doing a random walk machine vs a quantum walk machine, and contrasting the wildly different distributions you get.

The cost of this approach is you now have N+1 things to teach, instead of N.
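The contrast in point 4 can be sketched in a few lines of Python (my own minimal version, with the usual coin-flip and Hadamard matrices assumed): iterating the classical machine converges to the uniform distribution, while iterating the Hadamard machine cycles back to its starting state.

```python
import math

def apply(matrix, vec):
    """Multiply a 2x2 matrix by a 2-vector."""
    return [sum(matrix[i][j] * vec[j] for j in range(2)) for i in range(2)]

coin = [[0.5, 0.5], [0.5, 0.5]]      # stochastic matrix: a fair coin flip
s = 1 / math.sqrt(2)
hadamard = [[s, s], [s, -s]]         # unitary matrix: amplitudes, with a sign

p, a = [1.0, 0.0], [1.0, 0.0]        # both machines start in state 0
for _ in range(2):
    p = apply(coin, p)
    a = apply(hadamard, a)

print(p)                             # [0.5, 0.5] -- the classical machine converged
print([round(x, 6) for x in a])      # [1.0, 0.0] -- the quantum machine cycled back
```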

11. Sniffnoy Says:

I just want to say that as someone who doesn’t do quantum information and mostly only knows about it the things I read here, the quantum state machines way of thinking is basically how I’ve always thought about it! 🙂

12. Richard Gaylord Says:

scott:

“given the current state of junior high and high school education in the US, it would be a huge step up if most students graduated fully understanding what’s a probability.” What explanation of probability would you teach? There is no one accepted definition of probability (e.g. two recent articles ‘explain’ probability in terms of the so-called multiverse – UGH!). As Bertrand Russell said, “Probability is the most important concept in modern science, especially as nobody has the slightest notion what it means.”

13. Paul Hayes Says:

From that wikipedia entry for the PBR theorem:

“The theorem states that either the quantum state corresponds to a physically real object and is not merely a statistical tool, or else all quantum states, including non-entangled ones, can communicate by action at a distance.”

Shocking.

And teaching (applied) general / quantum probability theory and its interpretation(s) to the young student members of a species whose professional mathematicians, physicists and philosophers have often – sometimes quite bizarrely obtusely – struggled even with [simple problems in] classical probability?

Good luck with that.

/bitter sarcasm. 😉

14. Scott Says:

Paul #13: The PBR Theorem proves something interesting, but how best to render that something in English words has been the subject of considerable dispute on this blog and elsewhere. In any case, this thread is not the place to reignite that argument.

15. Scott Says:

Richard #12: I wasn’t even thinking of teaching students about the philosophical foundations of probability theory, Dutch book arguments, the Bayesian vs frequentist debate, etc — that’s possible; but is more advanced than what I had in mind! I just had in mind teaching students how to think probabilistically, using everyday life, news stories, puzzles, or whatever else as the grist, in situations where Bayesians, frequentists, and everyone else who’s studied the subject agrees on the right answer, but where much of the public gets it wrong.

16. Terry Says:

Part III of the book covers PBR, trying to keep absolutely clear at every stage what is being assumed, what the different “pieces” of the math do or do not correspond to, and so on. I doubt I can make it any simpler than what is in there.

17. michel Says:

@Terry Your introduction says “A fairly recent mathematical breakthrough (not by me) suggested a very different method of presenting some of the most interesting and weird phenomena of modern physics.” What was this mathematical breakthrough? (Apologies if it’s mentioned somewhere in the book — I haven’t read that far yet.)

18. Serge Massar Says:

Dear Terry,

When I first learned about Q is for Quantum, I immediately ordered it and read it.

I really admire what you achieved. I would never have thought it possible that one could explain so much of quantum mechanics, with quantitative results, with only the elementary math you use. The fact that it is possible may be hiding some deep insights about quantum mechanics which we still have to learn.

But unfortunately I also felt I could not recommend your book, or only with strong qualifiers.

For the professional quantum mechanic like myself it’s a very nice and interesting read.

And maybe for somebody that will never go further, reading the first chapter or two will help them glimpse that the quantum world is completely different.

But I quail when I think of somebody who painfully reads through your book, puts much effort into understanding it, and thinks he is thereby getting a good start on quantum theory. In fact he will have to unlearn everything in your book when he goes deeper and starts learning the standard formalism. It will be a big waste of time for him. (To belabour the point: if a student of my undergraduate quantum mechanics course asked me whether I recommend your book, I would have to strongly discourage him.)

The problem is that we are stuck with the notation Dirac invented. It may sometimes mislead us. It may sometimes be unwieldy. But it’s what everybody uses. Teaching people another notation is just going to confuse them, and make learning quantum mechanics even harder. It should not be done.

This reminds me of an anecdote Feynman tells in one of his books of stories (I forget which): in high school he reinvented a lot of trigonometry and invented his own notation for it. But as an undergrad he realised that he could not use his personal notation, even though it was perfectly good, because otherwise he could not communicate. He needed to unlearn his notation and start using sin and cos like everybody else.

In your post you mention that you are going to add an appendix on the ket-bra notation. I hope you can do more than that, and make Q is for Quantum into a book that can be recommended to an undergraduate. So that if he picks up your book during the summer before his first quantum mechanics course, he will have a head start when the course begins, rather than the contrary. I would really love it if version 2.0 of your book could achieve this.

I am sorry to be so harsh, and I hope you will forgive me. I unfortunately feel that your book cannot be recommended too freely. But I also really admire what you have achieved.

Serge

19. John Sidles Says:

Serge Massar says: “The problem is that we [quantum researchers] are stuck with the notation Dirac invented. It may sometimes mislead us. It may sometimes be unwieldy. But it’s what everybody uses.”

Perhaps it might be more accurate — and more helpful, too, to beginning quantum students — to say more precisely that “Dirac’s bra-ket notation is an older mathematical notation that many quantum research articles still use; however, bra-ket notation is by no means adequate for all research purposes.”

Here we can be guided by Terry Tao’s ongoing lecture notes in fluid dynamics. For example, Comment #18 of Tao’s recent lecture “254A, Notes 3: Local well-posedness for the Euler equations” begins as follows:

“One can interpret the vorticity equation in the language of differential geometry, which is a more convenient formalism when working on more general Riemann manifolds than R^d” … [derivations follow]

As with fluid mechanics, so too with quantum mechanics. In particular, computational quantum ECT unravellings generically are pulled-back onto varietal state-spaces (aka “tensor networks”), such that for “quantum ECTopian purposes” — as research into computationally efficient quantum dynamical simulation might be called — it is natural to agree with Terry Tao’s observation that “the language of differential geometry is a more convenient formalism.”

The historical example of Misner, Thorne, and Wheeler’s Gravitation (1973) provides a compatible pedagogic path. In this textbook, spacetime dynamics is presented in two tracks: (1) an older coordinate-based “Track 1” notation, intermixed with (2) a newer coordinate-free “Track 2” notation.

Nowadays, of course, the “Track 2” language of differential geometry has evolved to dominate general relativity research and pedagogy, and this same language is increasingly prevalent too in classical fluid dynamics (as Terry Tao’s lecture illustrates).

These considerations are why it is mathematically plausible — and immensely helpful, too, in computationally demonstrating “Quantum ECTacy”, as the ECT-affirming counterpoise to “Quantum Supremacy” might be called — to foresee that an accelerating embrace of the language of differential geometry will be taking place, too, in quantum research and pedagogy.

After all, how likely is it, that just one mathematical language can suffice, to encompass all the potentialities of quantum dynamical systems? Not very likely!

20. Scott Says:

Serge #18: My own preference, when learning a new field, is indeed usually to learn the notation that the experts use as quickly as I can. But I admire the ambition of what Terry is trying to do, in probing the frontier of just how simple QM can be made without loss of accuracy. We could use some more people at that frontier!

21. Terry Says:

Michel #17 – I list some references at the end of the book. The original proof that Toffoli and Hadamard are universal (which is what I am relying on) is by Shi (https://arxiv.org/abs/quant-ph/0205115), although it is more recent simplifications by Dorit Aharonov and others that made me realize it could actually be useful for this kind of thing.

Serge #18 I am sympathetic to that viewpoint, as I mentioned I already discovered the people it is most dangerous for are those with a little knowledge already.

The book is aimed at making it possible for *everybody* with the barest level of mathematical competency to get some technical and quantitative understanding of quantum theory, which they simply cannot achieve reading standard pop-sci books. The tiny, tiny fraction of those people who will go on to do an undergrad had better be competent enough mathematically that they can map a list like [W,W,-B] to a vector [2;-1] and then to a normalized vector [2;-1]/sqrt(4+1), or they have far larger issues ahead of them than unlearning the stuff in the book! But of course I should get off my butt and make a few notes and put them up on the webpage. I’m sure someone levelled a similar criticism when Feynman tried to teach quantum using a simplification of path integrals – maybe his critics were right, as that didn’t work for me personally either when I was young: it couldn’t get me to understanding why entanglement was mysterious. If nothing else I feel the method in “Q is for Quantum” can get someone over that particular line.
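The mapping Terry describes here can be spelled out in a few lines of Python (a hedged sketch with my own helper names, not anything from the book): count the signed occurrences of W and B in the list, then normalize.

```python
import math

def list_to_vector(balls):
    """Map a list like ["W", "W", "-B"] to the unnormalized vector [2, -1]."""
    counts = {"W": 0, "B": 0}
    for b in balls:
        sign = -1 if b.startswith("-") else +1
        counts[b.lstrip("-")] += sign
    return [counts["W"], counts["B"]]

v = list_to_vector(["W", "W", "-B"])
norm = math.sqrt(sum(x * x for x in v))   # sqrt(4 + 1)
print(v)                                  # [2, -1]
print([x / norm for x in v])              # the normalized vector, [2, -1]/sqrt(5)
```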

22. Job Says:

I would probably have been derailed by the discussion around the internals of the PETE/Hadamard box.

It’s given as an abstraction to something that’s very strange. I would need to understand what’s going on before proceeding.

That’s not necessarily a bad thing, but IMO it happens too early in the book.

Also, at one point it’s practically challenging the reader to produce a sensible explanation for the PETE gate. And it’s fun times trying to do so, but it really interrupts the flow.

It’s like when I first heard the Halting problem was unsolvable. Didn’t learn anything else that week, including the respective proof.

I would have left the Hadamard as an imaginary gate until later in the book.

23. John D. Says:

Re: Scott’s comment #2: “And to this day I often use ket notation even for doing linear algebra that has nothing to do with QM.”

Now I’m intrigued. Any chance you can put up a quick problem in both notations, so we can see how it’s done? (All I remember offhand about bra/ket is something like: one of them represents the “rows” of a matrix, and the other is the “columns”.)

24. Kevin S. Van Horn Says:

> if students don’t learn the rules of classical probability first, then how will they be properly shocked when they come to quantum?

They shouldn’t be; quantum amplitudes are not probabilities and quantum states are not probability distributions. Endless unnecessary confusion has been caused by a wrongheaded insistence in trying to portray them as such. The amplitudes *yield* probabilities, but are not themselves probabilities.

A good analogy is the linear predictor vector $\eta = \beta x$ in a multinomial logit regression model:

# x is an n-vector, beta is an m x n matrix
eta = beta x                      # linear predictor: unbounded reals
p = softmax(eta)                  # now a genuine probability vector
y ~ Categorical(p_1, ..., p_m)    # the only random draw

We could get all breathless about how shocking it is that in a logistic regression model we have “probabilities” $\eta_i$ that don’t sum to 1 and can range from negative infinity to positive infinity, etc. … but statisticians are too sensible to do that.
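(The point that the linear predictor is nothing like a probability vector is easy to check numerically; a minimal sketch, with a hand-picked beta and x since none are given above:)

```python
import numpy as np

def softmax(eta):
    # subtract the max for numerical stability; the result sums to 1
    e = np.exp(eta - np.max(eta))
    return e / e.sum()

beta = np.array([[1.5, -2.0], [0.0, 3.0], [-4.0, 0.5]])  # m x n, m=3, n=2
x = np.array([2.0, -1.0])                                # n-vector

eta = beta @ x   # linear predictor: unbounded, need not sum to 1
p = softmax(eta) # probabilities: nonnegative, sums to 1

print(eta)       # [ 5.  -3.  -8.5] -- clearly not probabilities
print(p, p.sum())
```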

25. fred Says:

Hi Terry,

What are your thoughts on your grandfather’s book “My View of the World”?

I found his approach to the mysteries of consciousness really fascinating.
It had a big effect on me – both in terms of being more compassionate and deepening of my mindfulness training.

26. Job Says:

> They shouldn’t be; quantum amplitudes are not probabilities and quantum states are not probability distributions. Endless unnecessary confusion has been caused by a wrongheaded insistence in trying to portray them as such.

In the context of classical vs quantum computation it’s useful to consider quantum amplitudes as probabilities.

And it takes more than just a probabilistic system that’s described or parameterized with negative or imaginary quantities to capture the separation between a classical and a quantum computer.

Like, we could label the parameters in terms of cat breeds too. Cat based “probabilities” are even weirder than negative “probabilities” but that doesn’t make them interesting.

27. Kevin S. Van Horn Says:

Job #26 says:

> In the context of classical vs quantum computation it’s useful to consider quantum amplitudes as probabilities.

The fact that Scott can talk about being “properly shocked” by quantum amplitudes argues against your claim. If it were appropriate and enlightening to consider quantum amplitudes as probabilities, nobody would be shocked by them and there would be nothing mysterious about them to anyone familiar with probability theory.

Quantum amplitudes exhibit interference phenomena; probabilities do not. Considering quantum amplitudes as probabilities obscures this crucial point.

Quantum amplitudes obey *none* — not even one! — of the basic laws of probability theory. Can you think of any other area where people insist on saying “A is sort of a variation of B” when A exhibits none of B’s properties at all?

28. Scott Says:

Kevin and Job: Obviously amplitudes are in no sense probabilities; they’re numbers whose absolute squares are probabilities, not known to be relevant to nature until the discovery of QM. The shocking part is that they change how you have to calculate probabilities.
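(A two-line instance of the “changes how you calculate” point is applying a Hadamard twice; a minimal numpy sketch:)

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard gate
ket0 = np.array([1.0, 0.0])

# One Hadamard: amplitudes (1/sqrt2, 1/sqrt2), so measuring gives 50/50.
one = H @ ket0
print(one**2)   # [0.5, 0.5]

# Two Hadamards: the |1> amplitudes cancel (+1/2 and -1/2), so the outcome
# is |0> with certainty -- a fair classical coin flipped twice is still 50/50.
two = H @ H @ ket0
print(two**2)   # [1.0, 0.0] up to rounding
```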

29. Job Says:

> Quantum amplitudes exhibit interference phenomena; probabilities do not. Considering quantum amplitudes as probabilities obscures this crucial point.

That’s the whole point behind the idea of “negative probabilities”, it highlights the fact that quantum computers have access to interference.

It helps transition from classical to quantum, from a computing perspective. You might find it confusing but it is a useful analogy.

In some cases, a quantum computer is acting exactly like a classical probabilistic machine except for the phase that’s applied right at the end.

It’s almost to the point where, if we had a random number generator that would apply a “phase” corresponding to the dot product between the seed and the random value, mod 2, we’d match a quantum computer for some of the existing black-box separations.

30. Other Michael Says:

@the Problem With Gatekeepers#5- Scott Alexander has a nice rebuttal to that argument here:
http://slatestarcodex.com/2018/11/26/is-science-slowing-down-2/

31. Scott Says:

With apologies—Lily (age 5) dictated the following paragraph to me, and then asked me to “put it on the Internet where everyone could see my idea.” I promised her that I would, and this comment section seemed like the easiest venue.

Lily drew 5 houses. 1 house is big and 4 are little. First there’s a person in a lovely little house. There’s a person who was bigger and he had a bigger house. They couldn’t build an even bigger house. So there was another little person who had another little house. And there was a littler person who had a littler house. And there was an even littler person who had an even littler house.

32. The problem with the gatekeepers Says:

Other Michael #30,

I have skimmed through the article, and I don’t see in which way it refutes the article by Collison and Nielsen. It’s comparing apples to oranges. The question is not whether the number of transistors per chip increases, but rather that we still rely on the transistor, a device invented in the 1940s, not on something totally different and revolutionary, as when vacuum tubes were replaced by transistors. A similar point could be made about the LIGO experiment: it’s a very impressive feat, but LIGO is still testing the consequences of a theory (Einstein’s General Relativity) first proposed in 1916.

Collison and Nielsen used a very innovative methodology: asking existing experts to rank the significance of those inventions that have been awarded what is considered the highest award in hard science: the Nobel Prize.

33. fred Says:

Scott #28

“not known to be relevant to nature until the discovery of QM”

Nature needs quaternions to do rotations! 😛

34. Shmi Says:

Scott #31

Have you been discussing Zeno with Lily? Or is she discovering series convergence on her own?

35. Kevin S. Van Horn Says:

Scott #28 and Job #29:

All of this “amplitudes as probabilities” talk is missing the point that the only place probabilities enter into QM is in the act of measurement. Everything before that point is unitary linear operators and tensor products. This is why I use the analogy with a multinomial logit model, or a neural network classifier: with these it is also true that probabilities only show up in one final operation that converts a numeric vector into a probability vector. But all the interesting stuff happens in the earlier layers, which don’t involve probabilities in any form.

Job, this makes the analogy with a classical probabilistic machine very inapt. The probabilistic machine is doing random draws all throughout the computation; with a quantum algorithm the random draw happens only once, at the output layer, when reading the result. This makes it much more similar to a neural network classifier.

Scott, how is processing the final layer of a quantum circuit by taking squared amplitudes fundamentally different than applying softmax to the final layer of a neural network? Why does one “change how you have to calculate probabilities,” while the other does not?

Instead of strained analogies with probabilities, I think these are the important points to make:

* You aggregate qbits using the tensor product; hence O(n) physical resources (n qbits) yields a quantum state vector of length 2^n.

* A classical state for n bits corresponds to a quantum state vector of length 2^n that is all 0’s except for a single 1 (“one-hot” encoding).

* Quantum gates are unitary linear operators (ULOs).

* You can do general computations with ULOs because any one-to-one, onto function on a finite domain can be implemented as a permutation operator when using the “one-hot” encoding.

* But things only get interesting once you start playing with other sorts of ULOs…

* You only get to observe the final layer of a quantum circuit, and doing so probabilistically converts a state vector of length 2^n into a classical bit vector of length n.
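(The bullets above can be run end to end in a few lines; a minimal numpy sketch of my own, using CNOT as the permutation example and a Hadamard on the first qbit as the “other sort” of ULO — the big-endian qbit ordering is an assumption:)

```python
import numpy as np

n = 2                             # two qbits -> state vector of length 2^n = 4

# Classical state |10> as a one-hot vector of length 4 (index 2 in binary).
state = np.zeros(2**n)
state[0b10] = 1.0

# CNOT is a permutation operator: it swaps the |10> and |11> basis states,
# i.e. flips the second bit whenever the first bit is 1.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)
flipped = CNOT @ state            # one-hot again: |11>
print(flipped)

# A non-permutation ULO: Hadamard on the first qbit, tensored with identity.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
psi = np.kron(H, np.eye(2)) @ state   # no longer one-hot: a superposition

# Observing the final layer: squared amplitudes give the output distribution.
print(psi**2)
```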

36. lewikee Says:

Bought the book – loving it so far. This is precisely the approach that every discipline could benefit from – quantitative, with specific examples, bringing up ambiguous interpretations only when the reader is sufficiently exposed – but particularly quantum.

The ball / machine framework is amazingly clear (and fun!).

37. Job Says:

> Job, this makes the analogy with a classical probabilistic machine very inapt. The probabilistic machine is doing random draws all throughout the computation; with a quantum algorithm the random draw happens only once, at the output layer, when reading the result. This makes it much more similar to a neural network classifier.

Really, it’s the number of random draws that bothers you?

I think the analogy with a neural network classifier is actually worse. Why? Because it doesn’t capture the power of a quantum computer. It’s too generic.

Can you explain why a quantum computer can solve Deutsch’s or Simon’s problem exponentially faster than a classical machine using the classifier analogy?

By comparing a QC to TensorFlow, you’re describing the computing platform, not why it’s faster.

The concept of “negative probabilities” is meant to describe why a QC is faster, not how it works.

38. Eric L Says:

As a physics minor who enjoyed Feynman’s QED, I did find the online chapter helpful and may end up buying the book. This isn’t the best notation for quantum algorithms, but after reading this I was inspired to read some Wikipedia pages on quantum algorithms and I’m finding the bra-ket notation much less daunting than I used to. I’m not sure I can precisely point to something I understand now that I didn’t before, but I have a stronger intuitive grounding for it I guess?

But that leads me to an otherwise unrelated question — is the scalability of the Quantum Fourier Transform a sham?

For this to be polynomial, my expectation is that once you’ve developed a reliable quantum computer that could do a 16-bit QFT, doing a 4096-bit QFT is a matter of adding more qbits to the same processor and waiting a bit longer. But based on the Wikipedia description it seems that for every additional qbit you also need more phase shift gates that are *exponentially* more precise. This makes it seem like the challenges of cracking RSA aren’t just the usual challenges of building a quantum computer, but also the challenges of building an arbitrarily precise analog computer.

Am I understanding this correctly? Or is there a way to simulate precise tiny phase shifts on a quantum computer that only supports large shifts?

39. Job Says:

Am I understanding this correctly? Or is there a way to simulate precise tiny phase shifts on a quantum computer that only supports large shifts?

The Solovay-Kitaev theorem does basically that. It shows that any single-qubit gate can be efficiently approximated with a sequence of gates drawn from a universal set, up to the desired precision.

Shor’s paper also addresses this problem using another result:

> When k − j is large in the gate S_j,k in (4.3), we are multiplying by a very small phase factor. This would be very difficult to do accurately physically, and thus it would be somewhat disturbing if this were necessary for quantum computation. Luckily, Coppersmith [1994] has shown that one can define an approximate Fourier transform that ignores these tiny phase factors, but which approximates the Fourier transform closely enough that it can also be used for factoring.

I would like to see a Bloch sphere visualization showing how such a tiny rotation can be approximated using a typical universal gate set.
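(How negligible those factors are is easy to quantify: the QFT’s controlled rotation applies exp(2πi/2^k), which differs from 1 by 2·sin(π/2^k) ≈ 2π/2^k. A quick numeric sketch:)

```python
import numpy as np

# The QFT's controlled phase gate applies the factor exp(2*pi*i / 2**k).
# For large k this is exponentially close to doing nothing at all, which is
# why Coppersmith's approximate QFT can simply drop these gates.
ks = [2, 8, 16, 32]
diffs = []
for k in ks:
    phase = np.exp(2j * np.pi / 2**k)
    diffs.append(abs(phase - 1.0))
    print(k, diffs[-1])   # shrinks roughly like 2*pi / 2**k
```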

40. Scott Says:

Eric #38: In every generation, there is one who notices that Shor’s circuit for the QFT involves exponentially small rotations. So in every generation, there must be one ready to reply that those rotations, because they’re exponentially small, can be omitted without loss of generality.

41. John Sidles Says:

In regard to DFT algorithms, the STEM-history stans of Shtetl Optimized may enjoy Richard Garwin’s personal account in “The Fast Fourier Transform as an example of difficulty in gaining wide use of a new technique” (1968, PDF here), along with James Cooley’s personal account in “The Re-Discovery of the Fast Fourier Transform algorithm” (1987, PDF here) and “How the FFT gained acceptance” (1992, PDF here).

These FFT articles provide further concrete grounds to wonder — as prior Shtetl Optimized comments have wondered — why was 20th century STEM-progress immensely slower than might reasonably have been strategically projected and practically achieved?

42. Kevin S. Van Horn Says:

Job #37 Says:

> The concept of “negative probabilities” is meant to describe why a QC is faster, not how it works.

And the oxymoron of “negative probabilities” utterly fails at this task. Seriously, who in the world thinks, “Oh, yeah, I can see how negative probabilities might allow you to solve in polynomial time a problem that takes a classical computer exponential time”?

Take a look again at my list of what I think should be emphasized. The first item hints at why a QC can be faster: “You aggregate qbits using the tensor product; hence O(n) physical resources (n qbits) yields a quantum state vector of length 2^n.”

The third and last items hint at why getting that speedup is tricky, and why a QC (probably) can’t solve NP-complete problems in polynomial time.

> I think the analogy with a neural network classifier is actually worse.

That analogy is not meant to explain how QC works, only to emphasize that all the important stuff happens before probabilities ever enter the picture.

43. Job Says:

> Take a look again at my list of what I think should be emphasized. The first item hints at why a QC can be faster: “You aggregate qbits using the tensor product; hence O(n) physical resources (n qbits) yields a quantum state vector of length 2^n.”

I don’t need to look at your list for that basic misconception, I can just read it off of a pop-sci magazine.

The vector size by itself says nothing of the power of a quantum computer. In fact there are several quantum gate sets that can be efficiently simulated by a classical computer, for state spaces as large as you want.

You have to go further than that. Once you get there you might realize that:

> Oh, yeah, I can see how negative probabilities might allow you to solve in polynomial time a problem that takes a classical computer exponential time

44. Job Says:

I’m tempted to implement the random number generator I mentioned previously and upload it to GitHub with a sample solver for Simon’s problem.

And maybe Kevin can implement an equivalent solver using TensorFlow.

They’ll both be slow, that’s for sure, but the random-number generator’s abstraction yields a polynomial-time algorithm since all of the real work (the interference) happens implicitly.

And that’s the reason why it’s useful to consider amplitudes as probabilities in the context of quantum vs classical. You don’t have to explicitly track a huge vector, but you still get the interference needed to solve some difficult problems.

If the concept of negative probabilities bothers you so much then think of them as [x, y] pairs, where x is a boolean and y is a probability.

45. Alex Says:

I would like to share a thought on the pseudo-philosophical argument Terry provided for “should we teach Quantum Theory in Junior High?”.

Losing the mental plasticity required for understanding quantum theory, progressively and until relatively late in our educational lives, is a very real risk. But it is not “brain age” alone that plays such a part.

It is also the academic context in which someone learns to operate that makes him/her less open to nonconformist ideas. Nonconformist in content, but also nonconformist as to who expresses them. In other words, the “more” expert someone becomes in one field, the deeper he/she enters the circles of his/her expert peers. And that is a good thing; it is how science makes progress. But when it comes to answering big, philosophical questions, one should be open to as many ideas as possible, whoever provides them.

Of course, I am not talking about spending resources in listening to what anyone has to say about QM after having seen only a couple documentaries on the topic (maybe in an ideal world). But I believe we could widen the group of people talking about it if we offered more chances to grad students or even undergrads and amateurs (with some level of quantum-physics maturity that needs to be proven) to formally express any innovative ideas they have on e.g. the interpretation of Quantum Mechanics!

I know that there is a good reason why someone is considered an expert, but a true expert knows that a great idea can be found anywhere. And I am not so romantically optimistic as to expect a Ramanujan-Hardy kind of collaboration, but you never really know what to expect in any field as long as you are willing to consider new, serious ideas from (almost) wherever they come.

46. Kevin S. Van Horn Says:

> I don’t need to look at your list for that basic misconception, I can just read it off of a pop-sci magazine.

If you’re going to grossly misrepresent what I wrote, there’s no point in continuing the discussion. Where is your self-respect?

47. Job Says:

> If you’re going to grossly misrepresent what I wrote, there’s no point in continuing the discussion. Where is your self-respect?

Let’s recap, because you’re being very evasive here.

You open by saying that treating amplitudes as probabilities is wrongheaded. I say that it’s useful in understanding why QCs are faster than classical machines.

You react with disbelief. I attempt to explain why it’s helpful to consider amplitudes as probabilities in order to capture the quantum speedup.

You double down on your disbelief and point to the vector size as the basis for the quantum speedup – why look any further? I point out that this is a popular misconception since classical systems can efficiently manipulate equally large vectors and that you need to go further.

Why is going further important? Because you’re leaving out interference, which is what negative probabilities add to the classical picture.

Terry is adding signs to physical objects in order to help develop an intuition for quantum computing but apparently adding a sign to a probability is just going too far for you. And why?

The probabilities are the things you want to add the signs to because they are implicit.

48. Mason Green Says:

You know, I think part of the problem might be that the English language doesn’t yet have the explanatory power to talk about quantum mechanics without resorting to big words and symbols. Take the concept of a superposition and how it differs from the logical “and”. Why isn’t there a separate conjunction (or family of conjunctions) used in English to describe superpositions? That would really drive home the point that a superposition is different from “both A and B”.

I wish all the people who are spending time creating scores of new pronouns and P.C. euphemisms would focus on a more worthy task: making the English language quantum-friendly. Then it might even be possible to teach quantum theory to 5-year-olds!