Customers who liked this quantum recommendation engine might also like its dequantization

I’m in Boulder, CO right now for the wonderful Boulder summer school on quantum information, where I’ll be lecturing today and tomorrow on introductory quantum algorithms.  But I now face the happy obligation of taking a break from all the lecture-preparing and schmoozing, to blog about a striking new result by a student of mine—a result that will probably make an appearance in my lectures as well.

Yesterday, Ewin Tang—an 18-year-old who just finished a bachelor’s at UT Austin, and who will be starting a PhD in CS at the University of Washington in the fall—posted a preprint entitled A quantum-inspired classical algorithm for recommendation systems. Ewin’s new algorithm solves the following problem, very loosely stated: given m users and n products, and incomplete data about which users like which products, organized into a convenient binary tree data structure; and given also the assumption that the full m×n preference matrix is low-rank (i.e., that there are not too many ways the users vary in their preferences), sample some products that a given user is likely to want to buy.  This is an abstraction of the problem that’s famously faced by Amazon and Netflix, every time they tell you which books or movies you “might enjoy.”  What’s striking about Ewin’s algorithm is that it uses only polylogarithmic time: that is, time polynomial in log(m), log(n), the matrix rank, and the inverses of the relevant error parameters.  Admittedly, the polynomial involves exponents of 33 and 24: so, not exactly “practical”!  But it seems likely to me that the algorithm will run much, much faster in practice than it can be guaranteed to run in theory.  Indeed, if any readers would like to implement the thing and test it out, please let us know in the comments section!

As the title suggests, Ewin’s algorithm was directly inspired by a quantum algorithm for the same problem, which Kerenidis and Prakash (henceforth KP) gave in 2016, and whose claim to fame was that it, too, ran in polylog(m,n) time.  Prior to Ewin’s result, the KP algorithm was arguably the strongest candidate there was for an exponential quantum speedup for a real-world machine learning problem.  The new result thus, I think, significantly changes the landscape of quantum machine learning, by killing off one of its flagship applications.  (Note that whether KP gives a real exponential speedup was one of the main open problems mentioned in John Preskill’s survey on the applications of near-term quantum computers.)  At the same time, Ewin’s result yields a new algorithm that can be run on today’s computers, that could conceivably be useful to those who need to recommend products to customers, and that was only discovered by exploiting intuition that came from quantum computing. So I’d consider this both a defeat and a victory for quantum algorithms research.

This result was the outcome of Ewin’s undergraduate thesis project (!), which I supervised. A year and a half ago, Ewin took my intro quantum information class, whereupon it quickly became clear that I should offer this person an independent project.  So I gave Ewin the problem of proving a poly(m,n) lower bound on the number of queries that any classical randomized algorithm would need to make to the user preference data, in order to generate product recommendations for a given user, in exactly the same setting that KP had studied.  This seemed obvious to me: in their algorithm, KP made essential use of quantum phase estimation, the same primitive used in Shor’s factoring algorithm.  Without phase estimation, you seemed to be stuck doing linear algebra on the full m×n matrix, which of course would take poly(m,n) time.  But KP had left the problem open, I didn’t know how to solve it either, and nailing it down seemed like an obvious challenge, if we wanted to establish the reality of quantum speedups for at least one practical machine learning problem.  (For the difficulties in finding such speedups, see my essay for Nature Physics, much of which is still relevant even though it was written prior to KP.)

Anyway, for a year, Ewin tried and failed to rule out a superfast classical algorithm for the KP problem—eventually, of course, discovering the unexpected reason for the failure!  Throughout this journey, I served as Ewin’s occasional sounding board, but can take no further credit for the result.  Indeed, I admit that I was initially skeptical when Ewin told me that phase estimation did not look essential after all for generating superfast recommendations—that a classical algorithm could get a similar effect by randomly sampling a tiny submatrix of the user preference matrix, and then carefully exploiting a variant of a 2004 result by Frieze, Kannan, and Vempala.  So when I was in Berkeley a few weeks ago for the Simons quantum computing program, I had the idea of flying Ewin over to explain the new result to the experts, including Kerenidis and Prakash themselves.  After four hours of lectures and Q&A, a consensus emerged that the thing looked solid.  Only after that gauntlet did I advise Ewin to put the preprint online.

So what’s next?  Well, one obvious challenge is to bring down the running time of Ewin’s algorithm, and (as I mentioned before) to investigate whether or not it could give a practical benefit today.  A different challenge is to find some other example of a quantum algorithm that solves a real-world machine learning problem with only a polylogarithmic number of queries … one for which the exponential quantum speedup will hopefully be Ewin-proof, ideally even provably so!  The field is now wide open.  It’s possible that my Forrelation problem, which Raz and Tal recently used for their breakthrough oracle separation between BQP and PH, could be an ingredient in such a separation.

Anyway, there’s much more to say about Ewin’s achievement, but I now need to run to lecture about quantum algorithms like Simon’s and Shor’s, which do achieve provable exponential speedups in query complexity!  Please join me in offering hearty congratulations, see Ewin’s nicely-written paper for details, and if you have any questions for me or (better yet) Ewin, feel free to ask in the comments.


Update: On the Hacker News thread, some commenters are lamenting that such a brilliant mind as Ewin’s would spend its time figuring out how to entice consumers to buy even more products that they don’t need. I confess that that’s an angle that hadn’t even occurred to me: I simply thought that it was a beautiful question whether you actually need a quantum computer to sample the rows of a partially-specified low-rank matrix in polylogarithmic time, and if the application to recommendation systems helped to motivate that question, then so much the better. Now, though, I feel compelled to point out that, in addition to the potentially lucrative application to Amazon and Netflix, research on low-rank matrix sampling algorithms might someday find many other, more economically worthless applications as well.

Another Update: For those who are interested, streaming video of my quantum algorithms lectures in Boulder are now available:

You can also see all the other lectures here.

59 Responses to “Customers who liked this quantum recommendation engine might also like its dequantization”

  1. Mike Says:

    Does this new approach offer clear insights that might be applicable beyond the machine learning case?

  2. Carl Lumma Says:

    So glad you blogged about this — I didn’t get the spelling of Ewin’s name after brunch the other week, and no similar name is listed as a postdoc on your home page (I didn’t guess he was an undergrad!), and couldn’t find a relevant video on the Simons Institute youtube channel.

  3. Scott Says:

    Mike #1: Good question! Check out the comments in Ewin’s paper about L2 sampling; they might offer some insights about where else these ideas might be useful. I should point out, though, that in many cases (Simon’s problem, period-finding, Forrelation, quantum walk on glued trees…), people have proved exponential separations between the quantum and randomized query complexities, which means that there certainly won’t be a dequantization in those cases along the same lines as what Ewin did.

  4. New top story on Hacker News: Customers who liked this Recommendation Engine may also like its Dequantization – Tech + Hckr News Says:

    […] Customers who liked this Recommendation Engine may also like its Dequantization 3 by beefman | 0 comments on Hacker News. […]

  5. Vaso Vuk Says:

    Hello I’m interested to implement this algorithm — I think memoization would end up creating great reductions in run time. Quantum computing could lead to exponential speedups, but it is unclear how one would draw the quantum circuit for such a sophisticated idea at this point. Any leads on that would be greatly appreciated.

  6. asdf Says:

    > Yesterday, Ewin Tang—an 18-year-old who just finished a bachelor’s at UT Austin, and who will be starting a PhD in CS at the University of Washington in the fall

    Could he finish that PhD before he even starts it? That result plus some incantations sounds like a thesis.

  7. Scott Says:

    asdf #6: Possibly, but why? Even when an undergrad is already an accomplished researcher, grad school provides an excellent opportunity to become even better. The pay isn’t high, but one will never again have as much free time to learn new things and think about hard problems (at least in my experience). At this level, it’s not about clearing some arbitrary milestone so one can get a degree, but about developing into the strongest researcher one can be.

    Having said that, if U. of Washington causes any trouble, we in Austin stand ready to take Ewin back… 🙂

  8. J-K Says:

    Just to chip in that recommender algorithms are also used in biology.

  9. C. Sowak Says:

    @Scott:
    You are so right about learning and time. I think not realizing that early enough might have been one of the biggest mistakes in my whole life.
    The other thing is that it’s really good for one’s character to live without much money – helps you to stay closer to the roots.

    @Ewin:
    Nice one! Keep up the good work 🙂

  10. Scott Says:

    J-K #8: Cool! For what?

  11. Brian Says:

    Vaso #5: The implementation of this will require working through some of the assumptions in the paper. For one, a “low-overhead data structure”. Maybe it could be an extension of an already well-engineered framework for building recommendation systems.

    Scott #10: Here’s a paper where researchers predicted cancer drug response using a recommendation system: https://www.biorxiv.org/content/early/2017/11/07/215327

  12. pst Says:

    Shot in the dark, just because I happened to be skimming arxiv.org/abs/1803.04189 yesterday and it seems to do with training machines on sparse data — related?

  13. JimV Says:

    After reading the post and then coming back the next day and reading the title again, I see what a great title it was. As my nephews say, sweet!

  14. rick samborski Says:

    Scott,
    You say: “Now, though, I feel compelled to point out that, in addition to the potentially lucrative application to Amazon and Netflix, research on low-rank matrix sampling algorithms might someday find many other, more economically worthless applications as well.” Don’t you mean more economically worthwhile applications as well? Or maybe that’s a joke.

  15. Scott Says:

    rick #14: Reread that sentence in the context of the whole paragraph and see if you get the joke.

  16. Mike Maxwell Says:

    I glance at this blog occasionally in hopes that some day I’ll understand s.t., and that it will make a difference. Alas, I’m a linguist with some computing knowledge (I understand the basics of algorithmic complexity), but most of this is over my head.

    But I have a question about a linguistic application that *sounds* like it might be related to this recommendation problem. In morphology, we have languages with multiple inflection classes, like Spanish -ar/-er/-ir verbs (similar systems in other Romance languages); the set of affixes is the same for all verbs within an inflection class, for example Spanish hablar (“to speak”) and arreglar (“to arrange”) both belong to the ar class, and therefore have all the same suffixes. Often some of the affixes are the same across all classes, for instance -o is the first person present indicative for -ar, -er and -ir verbs.

    Some languages have far more than three such systems; and these inflection classes can occur for nouns and adjectives as well as for verbs. And languages tend to have lots more nouns than verbs.

    To some extent cross-cutting these inflection classes are alternations in the stem. The Spanish verb tener (“to have”) is -er class, and venir (“to come”) is -ir class; and although they have some affixes that are different because of this, they have many of the same changes in their stem (= the part before the suffix), like tienes (“you have”) and vienes (“you come”), where the ‘e’ of the stem has changed to ‘ie’.

    A linguistic term that I’ll need to use is ‘lexeme’: it means a word, including all its affixed affixed forms. For English, an example would be RUN, with inflected forms run, runs, running, and ran.

    Spanish dictionaries of course tell you all this, i.e. given a lexeme, the dictionary tells you enough that you can construct all its affixed forms. But if you’re working on a less studied language, you may not have a good dictionary, indeed you may be creating the language’s first dictionary from words you find in corpora (texts). And except for the most common words (and sometimes not even then), you won’t find every form of every word in the corpus. In other words (pun intended), the matrix of inflected forms of lexemes is sparse (the matrix would contain, for each of the m lexemes, all n of its affixed forms, including stem changes).

    Your job as a lexicographer is then to decide which inflection class and which kinds of stem change a given lexeme has (or could have, since the answer may be ambiguous given the attested forms). For any given a column (let’s say different verbs are different columns) that’s the most similar to that column, in some sense of similar (same affixes, same stem changes).

    I’m guessing there’s something about this problem that is different from the recommender problem, but I’m not seeing what it is. (Actually, I know of one such difference: it has to do with the number of inflection classes you get in real languages, which is far lower than what you could get in principle, given the number of distinct affixes. But part of my interest lies in why this constraint exists in real languages.)

  17. Yovel Says:

    Hi Ewin (hopefully you are reading this),
    I’m going to start my thesis soon and I would love to try writing it about your paper. Of course I would not want to steal your own paper (plus you will probably be in a better position to write such a paper :)). Are you planning to try looking into lowering the polylog degree yourself soon, or is it free for grabs?

  18. Ewin Says:

    Mike #16:

    That does seem like a sensible application! I will note, though, that if n is sufficiently small (as it sounds like it might be in this case), the brute-force ways to do these types of linear algebra tasks become pretty fast (not much more time than it would take to upload the corpus into the matrix).

    Yovel #17:

    Feel free! I’m not currently planning to try lowering the exponents. From my perspective, the easiest way of doing this would be to abandon my algorithm strategy completely and do something more direct. It’s reasonable to me that such an approach could improve the exponents by at least half.

  19. FeelFreeToLeaveMeInModerationQueue Says:

    “Now, though, I feel compelled to point out that, in addition to the potentially lucrative application to Amazon and Netflix, research on low-rank matrix sampling algorithms might someday find many other, more economically worthless applications as well.”

    How pithy.

    Have you even SEEN the average 1bedroom rent in the Bay? Is your economic perspective solely focused on “globalist” total-revenue-output measures at the utter expense of all “local” issues? (You are making an argument based on internet-service jobs, of course.) How do you expect elementary school teachers and janitors and fast-food workers to afford to live in a city where the average primary wage-earner needs to make $150,000/yr to comfortably afford housing? Should the Plebeians of Rome shit the streets while the Patricians avert their eyes and noses? Or was Universal Basic Income and socialism-teetering-on-communism the end-goal in the first place?

    Teach me more about your views on economics, please. Apparently my degree in it hasn’t been a sufficient education. Show me the true future utopian worldview, Scott.

  20. Links for July 2018 – foreXiv Says:

    […] 18-year-old Ewin Tang has proved that the Kerenidis and Prakash recommendation algorithm does not provide an example of an exponential speed up in quantum machine learning. Here’s his advisor Scott Aaronson on the implications: […]

  21. Scott Says:

    FeelFreeToLeaveMeInTheModerationQueue #19: Ok, but what does any of that have to do with what I wrote? And what could’ve been going through your head to make you imagine it was related? I wasn’t talking those huge economic questions, important though they are, just about the applications of Ewin’s result.

  22. Ethan Says:

    We’re going to want a post on today’s news from China. https://www.scientificamerican.com/article/chinese-researchers-achieve-stunning-quantum-entanglement-record/

  23. Haggai Nuchi Says:

    I’ve implemented the BST sparse vector and matrix classes, as an exercise. I was thinking to implement the whole algorithm, but then I got distracted and stopped. Maybe someone else will continue where I left off!

    https://github.com/nuchi/bst-matrix-vector

  24. Scott Says:

    Ethan #22: Nope. Sorry. I don’t do a post about every experimental advance that gets picked up and hyped by the press as if it were the biggest thing in 500 years. I wouldn’t have time for anything else if I did.

  25. Bram Cohen Says:

    Analogizing with graph isomorphism, it seems likely (or maybe just plausible) that there’s a hack job algorithm with much lower exponent than Ewin’s which ‘seems’ to work right ‘most’ of the time on ‘practical’ examples but is extremely difficult to analyze properly.

  26. Scott Says:

    Bram #25: Agreed.

  27. AJ Says:

    Randomness seems essential here.

  28. Scott Says:

    AJ #27: Yes, especially since the desired output is a sample from a distribution. 🙂

  29. William Hird Says:

    @ Left in the Moderation Queue:
    What’s your point, do you expect Scott to come up with answers for all our socio-political-economic woes? I think we as a society get what we deserve (on average !) (or instant group karma if you have metaphysical leanings). Think about the fact that every day tens of thousands of Americans pack themselves into sports arenas to watch ( and spend their hard earned money on) mindless athletic exhibitions. But if someone were to try an host an event to have a meaningful discussion about how our government wastes trillions of our tax dollars, you will be lucky to get 5 people to show up , even if you offer free pizza and beer. There’s your answer.

  30. Meow Says:

    Suppose if $P=NP$ and SAT is in Linear time by a simple idea (yes the magic gorilla from mars comes to eat UT Austin) and your student proves it would you go to the same extent of taking student to experts and arguining with them for four hours and coming with a consensus?

  31. AJ Says:

    So P is not BPP?

  32. Scott Says:

    Meow #30: All the more so.

  33. Scott Says:

    AJ #31: No, this has no bearing at all on P vs. BPP.

    One reason is that we’re talking about sublinear-time algorithms, which have random access to the input, but not enough time even to read in all n bits of it. Exponential separations between deterministic and randomized complexity have long been known in the sublinear setting; Ewin’s work gives another (very interesting) example. But P vs. BPP asks only whether every nO(1)-time randomized algorithm for a decision problem can be simulated by an nO(1)-time deterministic one.

    A second reason is that we’re talking about sampling problems, whereas P vs. BPP talks only about decision problems. As I mentioned before, deterministic algorithms don’t even really make sense for sampling problems.

  34. Sugestões para seminários de CQ 2018/2 | Prof. Franklin Marquezino Says:

    […] Algoritmo clássico inspirado em quântico para sistemas de recomendação: https://scottaaronson-production.mystagingwebsite.com/?p=3880. […]

  35. AJ Says:

    Scott @33 Strange what other sublinear cases do we know randomness has upper hand? Also do we know that it is consistent with P=BPP that decision problems where randomness could only help in average case but still on some perhaps polynomially many inputs need exponential time determinism may not help in exponentially many inputs?

  36. Mitchell Porter Says:

    Readers of this blog might be interested to know that Gil Kalai will be giving a plenary lecture at next month’s International Congress of Mathematicians in Rio (where the Fields Medals for 2018 will be awarded).

  37. Scott Says:

    AJ #35: If we’re willing to talk about promise problems—which we should be, since the KP recommendation systems problem also involves a promise on the input!—then it’s easy to prove huge, unconditional deterministic vs. randomized separations in the sublinear world.

    Here’s an example: I promise you that the input, an n-bit string, has Hamming weight either at least 2n/3 or else at most n/3. Your task is to decide which.

    A random sampling algorithm trivially solves this problem, with high probability, using only O(1) samples of the bits (for the same reason political polling usually works 🙂 ). By contrast, it’s not hard to see that any deterministic algorithm for the problem needs to make a linear number of queries.

    Unfortunately I didn’t understand your second question.

  38. AJ Says:

    Suppose $\Pi$ is a decision problem and it admits a randomized algorithm where for all but polynomially many inputs of every length the algorithm is in polynomial time however every deterministic algorithm fails to be in polynomial time for exponentially many inputs of every length then is such a picture consistent with P=BPP?

  39. AJ Says:

    Are there still inconsistencies in my problem?

  40. AJ Says:

    Case 1. $\Pi$ can be derandomized Case 2. No such $\Pi$ can exist. Case 3. P=BPP is irrelevant to this. I am not sure which case applies here.

  41. Scott Says:

    AJ: Ah, now I understand the question; thanks. As far as I know, what you suggest is formally compatible with P=BPP, but it’s not compatible with the slightly stronger conjecture PromiseP=PromiseBPP, and all the concrete approaches to derandomization that are known (Impagliazzo-Wigderson, etc.) would actually derandomize PromiseBPP and not just BPP.

  42. AJ Says:

    This is first time I hear of PromiseP=PromiseBPP conjecture. How is this not compatible with this conjecture and so if there is such a problem then we have to find a different strategy for P=BPP? I thought only way for P=BPP is through good PRGs. I do not understand.

  43. AJ Says:

    In https://simons.berkeley.edu/sites/default/files/docs/6119/doesrandomnesshelp.pdf it is said ‘But at least some formalizations of the most moderate positions– the structuralist and the heuristic — have been proved false’.

    1. Is situation I seek similar to heuristic view which has been proved false and cannot occur? What exactly formally is meant here (an informal explanation is done in slides) and is there any references?

    2. Roei Tell shows PromiseP=PromiseBPP implies NTIME[n^f(n)]\not\subseteq P/poly at every f(n) in \omega(1) and also says it cannot be improved to NP\not\subseteq P/poly without proving latter unconditionally. Is P=NP and PromiseP\not=PromiseBPP compatible?

  44. Scott Says:

    AH: Google “derandomizing PromiseBPP.” It’s a pretty well-known question. In practice, pretty much the only approach we know to handle these problems is to develop PRGs, which would handle BPP and PromiseBPP alike. In principle, though, no one can rule out the weird possibility of some other approach that would work for languages but not for promise problems.

    I regret that I don’t have time for all your followups—maybe someone else can field them.

  45. AJ Says:

    Ok sorry for too many queries. However ‘Is situation I seek similar to heuristic view which has been proved false and cannot occur? What exactly formally is meant here (an informal explanation is done in slides) and is there any references?’ directly affects your answer.

    You said ‘it’s not compatible with the slightly stronger conjecture PromiseP=PromiseBPP’. However Impagliazzo says such ‘‘But at least some formalizations of the most moderate positions– the structuralist and the heuristic — have been proved false’’. So I think if my problem is of nature where randomness helps from heuristicview then it cannot exist and that is why want to confirm.

  46. Hansi Says:

    A few comments regarding the lectures about quantum mechanics, the content was ok and some aspects were really interesting, but I’m having a few formal issues:

    The videos seem to require a lot of bandwidth, while they basically broadcast an almost stationary image. I’m here with 16MBit/s connection and can hardly skip forward, it seems to take ages. It would be ok for me, if I’d only get a few frames/second, what matters, is sound and the chalkboard. Isn’t there none out there able to create a lecturerer’s codec that specifically adressed such needs?

    You moved way to many content into the exercises. While this can be encouraging for people who already have a sound understanding of at least some basics in that field, it is annoying like hell for newbies. Now comes a bit of a rant, don’t take it too seriously, but imo there may be some gist in it, the academic circus having serious isses with didactics. E.g. I used to learn a lot from the algorithm book of Brassard/Bratley, even though they covered almost half of the actual content, theorems, lemmas by their exercises. This worked, since I had some prior knowledge and later a few lectures at the university. Years later, after I left university, I tried to learn something about matroids, which were covered only very superficially even in the best lectures, so I got the book of Gordon/McNulty. While the book is ok and sometimes even didactically excellent, they did the usual no solution trick (aka dont wanna mess the university business by having to create new exercises). Annoying like hell, especially since it turned out, that some exercises were ill-posed or at least ambiguous. Is that really necessary? From what I’ve learned over the years, all cool kids (people in fraternities, cronys of TAs, etc did/do have access to full solutions, only the honest or dumb kids are messed up. What kind of pedagogics is that? Imo it is absolutely justified for students to buy instructors’ solution manuals. Yes I know, this is usually not doing the bad students a favour at all, since they just copy/paste without thinking or even wanting to learn, but they generally don’t benefit from tools like calculators or notebooks, but that’s another issue. I know of some solution manuals, which have way better explanations of theorems and stuff (I’m not talking about the exercises themselves) in them than in the corresponding text book, which clearly shows the enormous flaw in didactics, people who already grasped the ideas and don’t have to learn anymore, get way better explanations than the people who really need help, struggling with new concepts (from their point of view). Another business idea, scientific text books for autodidacts? Would have to be genuinely for self-learning, not another “easily understandable” introduction, that will only work for you, if you have a lecture and a good TA.

    You seem to be living in an industrialized country. Still no sponge and maybe a bucket of water publicly available? This kind of cleaning ^h blurring the chalkboard was really an atrocity, a crime against humanity, and I’m not an sjw snowflake. Watersaving may be ok, but pc should never go that far. I’m also a big fan of squeegees (hope that is the right word, I’m not a native speaker). When it comes to chalkboard cleaning, they are a must, costing only a few more dollars, if you can afford a bucket or a lavatory, don’t try to save on the squeegee. With a good squeegee you can almost immediately continue writing. Was that a trade union thingy? Or possible issues with reflections? Ok, just kidding, not a real issue for me, just when I saw it, I was seriously distracted.

    Overall I didn’t want to sound too negative, the lectures themselves were nice, it’s always hard to convey content like that, depending on the target group, even many engineers, mathematics, biologists, medics working in related areas usually don’t have much previous knowledge, I just read a few books, which mostly covered only the abstract part (Brac-Ket), but not the physical implementation. So I noticed, that I know way to little about constructive/destructive interference (repeatedly applying Deutsch-Josza).

    On youtube many people are asking for e.g. an introduction into quantum mechanics, but most yt videos about quantum stuff are way too superficial or completely useless, this target group usually would need lectures for high school kids, building up all the mathematics from scratch, so it would take many hours. Similar to what I’ve wrote above, in many areas you’d really need tutorials or books for autodidacts, outside of the university business, taking care of people with very limited previous knowledge. With ongoing digitization (whatever that means) and automation a lot of people may be forced to learn completely new stuff, not everybody will be suited for or will have the time and ressources for a complete course of studies.

    just my two cents,
    kind regards,

  47. Jalex Stark Says:

    AJ #45: For questions like yours, cstheory.stackexchange.com often has good discussions at the level you’re looking for. Likely such a result would show up if you google the keyword that Scott gave you.

    Hansi #46: You have to do exercises to learn concepts of the definition-theorem-proof kind. If you don’t come up with some of the proofs yourself, you’re going to have large gaps in your knowledge — things that you don’t really know but you recognize and skim over when other people say them. The author of a book or lecture notes does their job by presenting a picture of the subject that looks okay if you don’t fill in the holes.

    Also, these particular lectures are for a summer school. The intended audience probably spent a few hours that day working over the exercises and talking to Scott / other instructors / each other before going to the next day’s lecture. In this context, it would be a pedagogical *mistake* not to put important results in the exercises.

    I’m not sure what you’re getting at when you mention youtube. If you want a self-contained online course about quantum computing, there are a few good open online courses, see e.g. Vazirani https://www.edx.org/course/quantum-mechanics-quantum-computation-uc-berkeleyx-cs-191x and Vidick–Wehner https://www.edx.org/course/quantum-cryptography-caltechx-delftx-qucryptox-0

    (As an aside, I found your comments about the eraser choice pretty mean-spirited. Dry-erase is a pretty entrenched standard at most US institutions, and it seems like the wrong battle to fight if you want to convince people to lecture differently. Saying you’re “just kidding” at the end doesn’t absolve you of guilt if you didn’t make any jokes.)

  48. Raoul Ohio Says:

    Hansi #46: Number me with the fans of a clean chalkboard if a white board cannot be found.

    Many of us cling to obsolete technology because we enjoy being reminded of the good old days. E.g., I have just given up my last standard transmission vehicle.

    A common, if inexplicable, manifestation of this phenomena is presenting highly math oriented topics on chalk boards rather than whiteboards. A reasonable analogy is preferring graphics on a B&W CGA monitor rather than a modern monitor — except this does not capture icky “chalk dust on hands” aspect.

  49. gentzen Says:

    Hansi #46: If you are unhappy with the bandwidth of the videos, you can watch them on youtube instead. You loose the ability to switch between the slides view and the speaker view, but most lectures don’t use the slide view anyway.

    In other cases, I guess it makes more sense to view the content on its original website instead of YouTube. Take for example Robert Lawrence Kuhn’s Closer To Truth interviews or Sean Carroll’s new Mindscape Podcast. Those original websites contain explaining text and links to related content, and a matching atmosphere. (Those are interviews/conversations with really interesting people in a very relaxed atmosphere, so it is nice to consume them in a context where that relaxed atmosphere is not disturbed.)

  50. Phil Gossett Says:

    1) Now that we have this (very interesting) result, is there any remaining example of an exponential quantum speedup that is *not* an example of the (possibly generalized) abelian hidden subgroup problem?

    2) Given P/=NP, is there a proof that Shor or Simon (or any other such problem) is classically exponentially hard?

    Thanks!

  51. Scott Says:

    Phil #50:

    1) Yes. Forrelation, the Childs et al. glued trees problem, the vanDam-Hallgren-Ip shifted quadratic character problem, and probably at least a half-dozen others. (Here I’m not counting the simulation of quantum physics and chemistry, estimation of the Jones polynomial, or anything else BQP-complete, since in some sense those are near-tautological examples.)

    2) No, we don’t know how to prove such a statement. The closest we can get, in some sense, is to prove that there’s no polynomial-time classical algorithm to sample exactly from the BosonSampling distribution unless the polynomial hierarchy collapses (and similar statements for other sampling-based quantum supremacy proposals). Of course, Shor’s algorithm solves something classically hard if you assume that factoring is not in P. 🙂

  52. LowerBound Says:

    If Forrelation already proves quantum and classical query complexity is exponentially (in fact arbitrarily) apart then does it not prove quantum computing is superior to classical in Turing model?

  53. Major Quantum Computing Advance Made Obsolete by UT Grad | Physics Forums Says:

    […] On a separate note, here is Scott Aronson's blog post about Ewin Tang's result. https://scottaaronson-production.mystagingwebsite.com/?p=3880   atyy, Aug 2, 2018 at 11:25 […]

  54. James Gallagher Says:

    Hilariously, the increasingly “losing touch with reality” nutcase that is Lubus Motl actually rebanned me from his blog (after about 5 years with no posts) just for upvoting your posts there.

  55. James Gallagher Says:

    Just to clarify, It’s probable that my disqus upvotes were deleted automatically due to the old (~five years ago) ban, rather than a new ban, I forgot that disqus votes initially register even if you are banned on a blog but are not permanent, and they go away after a page refresh.

  56. QuantumBiceps Says:

    Can some one explain how the distribution F is formed, in Algorithm 3 it is mentioned “Let F denote the distribution given by choosing an s ∼u [p], and choosing a column
    from DA ˜i”. Are we sampling column indices from a distribution of row l2-norms ?

    And one more question what does “choosing a column from DA ˜i” mean ?

  57. Krishna Vishal Says:

    @Scott I think there is a typo in the Algorithm 3: MODFKV.
    In the step “Let F denote the distribution given by choosing an s ~_u [p], and choosing a column from …”. I think there shouldn’t be a tilde over A_{i_s}. Because if there is a tilde we would be sampling from a vector with one element. Is my interpretation correct ?

    And can you clarify why W is called a submatrix ?

  58. Shtetl-Optimized » Blog Archive » Annual recruitment post Says:

    […] and six PhD students—as well as some amazing undergrads striving to meet the bar set by Ewin Tang. Course offerings in quantum information currently include Brian La Cour’s Freshman Research […]

  59. Shtetl-Optimized » Blog Archive » Quantum Dominance, Hegemony, and Superiority Says:

    […] my brilliant former student Ewin Tang (yes, that one) relayed to me a suggestion by Kevin Tian: “quantum eclipse” (that is, the moment when […]