Quantum Computing Since Democritus Lecture 6: P, NP, and Friends

P, NP, and Friends. A whole undergraduate complexity course in one HTML file.

For those of you who already know this stuff: forgive me for boring you. For those who don’t: read, learn, and join the Enlightened.

Note: The comment section was down all day, but it’s back now. Google ought to be ashamed to be running something as rickety and unreliable as Blogger. Well, I guess you get what you pay for.

32 Responses to “Quantum Computing Since Democritus Lecture 6: P, NP, and Friends”

  1. logopetria Says:

    Another very nice installment of a great lecture series. Thanks for making these available.

    Do you have any tips or advice on how to go about finding reductions to prove NP-completeness? I have a problem in mind that I’ve reformulated to look sort of SAT-like, but now I don’t know where to start in trying to reduce some other problem to it.

    Of course, the best way to start would be to practise, practise, practise, by reading (or better yet, proving for myself) some known reductions. Then I might get more of a feeling for how these things work and where to look for promising footholds in a problem. But that’s hard work! So if you have any distilled wisdom on the matter that you could pass on, it would be much appreciated.

  2. Scott Says:

    Sorry, no distilled wisdom from me. Try Garey & Johnson. Or if your problem is easy enough to state, post it here and maybe someone will solve it for you.

  3. Scott Evil Says:

    John von Neumann was absolutely desperate to automate mathematics. Over at the IAS, Vladimir Voevodsky is so tired of manually checking all of those peer-reviewed papers that he is frantically examining his lambda calculus in hopes of rescue. Can Nicolaas de Bruijn help us automate proof verification? This has been an age-old quest.

  4. Anonymous Says:

    Scott, one question about the “extra credit” problem: can you state more precisely what the explicit algorithm is supposed to do when given as input an unsatisfiable formula?

    Is it supposed to halt at all, and within what time bound?

  5. Nagesh Adluru Says:

    Your honest statement about taking a coarser (broader) outlook in complexity theory is really enlightening. :)

    Also, you nicely bring out that even at that level of coarseness you cannot avoid messy fine details.

  6. Nagesh Adluru Says:

    Your honesty is one of the most important ingredients in the enlightening flavor of your lectures.

  7. Scott Says:

    Scott, one question about the “extra credit” problem: can you state more precisely what the explicit algorithm is supposed to do when given as input an unsatisfiable formula?

    Given an unsatisfiable formula, your algorithm can behave arbitrarily. I edited the question to clarify this.

  8. Anonymous Says:

    But so far, it seems like we’re using the right abstraction. Of the big problems solvable in polynomial time — matching, linear programming, primality testing, etc. — most of them really do have practical algorithms.

    Here you are channeling Garey and Johnson, perhaps subconsciously, where a similar argument appears. However, that statement was true back when G&J was published; it no longer holds. Robert Sedgewick first observed this in what is now called Sedgewick’s corollary to Moore’s law: suppose you have an algorithm that runs in time n^2, and currently the largest data set you run on is of size N. Moore’s law makes computers roughly twice as fast every 1.5 years, so after y years your program runs on a computer that is d = 2^(y/1.5) times faster, on a data set that is d times larger. The total running time is then (N*d)^2/d = N^2*d, i.e. d times slower than before. All your programs are getting slower by the day, and the larger the degree of your polynomial, the slower they get! In fact Sedgewick observed that this even applies to sorting, which is O(n log n). (A toy version of this calculation appears at the end of this comment.)

    Second, those of us in industry run into problems every day for which all known polynomial algorithms are impractical. Even in academia you can find such problems, particularly in computer graphics, databases and numerical analysis. Ask any numerical analyst whether an n^2 algorithm is acceptable on a large matrix.

    Lastly, it is surprising that you missed the circular nature of your statement, since that is a common pet peeve of yours. For example, in computer graphics there are many ideas that at first sound good and desirable, until one quickly realizes that they are equivalent to some well-known n^2 or n^3 algorithm. The idea is thus discarded and it is back to the drawing board. So you never get to hear about these problems, because we know they are impractical; instead we use heuristics such as photon sampling.
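    To make the Moore’s-law arithmetic above concrete, here is a tiny sketch of the calculation (the factor d = 8 is purely illustrative):

    def slowdown(d, exponent):
        # Relative running time of an n**exponent algorithm after the data set
        # grows by a factor of d and the machine gets d times faster.
        return d ** exponent / d

    print(slowdown(8, 1))  # 1.0  -> a linear algorithm keeps up exactly
    print(slowdown(8, 2))  # 8.0  -> a quadratic algorithm is now 8x slower
    print(slowdown(8, 3))  # 64.0 -> a cubic algorithm is 64x slower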

  9. Anonymous Says:

    matching, linear programming, primality testing…

    The point that Scott was making, which the previous commenter obviously missed, is that each of these problems is solved very efficiently in practice. This does not mean that practitioners necessarily use the algorithms which prove these problems are in P (though, for the above examples, the approaches used in practice are intimately tied to the _reason_ those problems have P-time algorithms).

  10. Anonymous Says:

    Addendum to my previous post:
    As computers get faster by a factor of d, the problems we want to solve are getting harder by a factor of d^2…

  11. Anonymous Says:

    The point that Scott was making that the previous commenter obviously missed is that each of these problems is solved in practice very efficiently.

    Bzzt. Wrong. I’m not disagreeing that matching is solved efficiently in practice. I’m disagreeing with the part that says “big problems solvable in polynomial time … really do have practical algorithms”.

    This statement is both false and somewhat circular: people rely on matching because matching is fast, and hence matching counts as a big problem because everybody uses it.

  12. Anonymous Says:

    Bzzt. Wrong. I’m not disagreeing that matching is solved efficiently in practice. I’m disagreeing with the part that says “big problems solvable in polynomial time … really do have practical algorithms”.

    Not just matching; both primality testing and linear programming as well. And if you didn’t realize it, Scott classifies these as “big” problems NOT because they are “big in practice,” but because they were HUGE landmarks in the theory of computing.

  13. Anonymous Says:

    Scott classifies these as “big” problems NOT because they are “big in practice,” but because they were HUGE landmarks in the theory of computing.

    Not as presented in the notes. While LP and primality testing were landmarks in ToC at the time of solution, matching wasn’t. The importance of matching is derived from its many applications, both within theory and in practice, and I bet that if matching had proved hard but some similar problem had proved easy, people would have relied on that other problem as the basic primitive.

  14. Scott Says:

    I bet that if matching had proved hard but some similar problem had proved easy, people would have relied on that other problem as the basic primitive.

    Even if I accept this, I don’t see how it shows that my argument is circular. It would be circular if there were primitives that people wanted to use in practice, except that they took n^10 or n^40 time. My point was that we rarely encounter such primitives.

    (That’s less true in our current Derandomaceous Era than it was in the Karpian Explosion, but still very true.)

    Incidentally, I completely agree with you that in graphics, you really do want linear or n log n. You don’t want to be solving a large linear program 30 times per second. The meaning of “efficient” in this context is that, if you were desperate to solve some particular linear program, you could do it in a few minutes or hours — not in 10^5000 years.

  15. Anonymous Says:

    It would be circular if there were primitives that people wanted to use in practice, except that they took n^10 or n^40 time. My point was that we rarely encounter such primitives.

    What I’m trying to say is that we do not let them become primitives, since we know they are hard. Say, if we can solve problem A using matrix multiplication or some not-so-efficient n^10 technique, we’ll choose matrix multiplication. But if tomorrow we discovered a sublinear-time algorithm for the alternative technique, we would suddenly solve a lot more problems with it, and it would become an important primitive.

    Expanders come to mind as one such tool whose importance as a primitive rose tremendously after it was discovered that they could be constructed efficiently.

  16. Scott Says:

    Expanders come to mind as one such tool whose importance as a primitive rose tremendously after it was discovered that they could be constructed efficiently.

    When something that was known to be in P, but believed to be “impractical,” turns out to have a practical algorithm when people really look at it, that seems like an argument in favor of the polynomial-time paradigm…

    But let’s move this conversation in a more interesting direction. What are some of the problems that you’d like to solve in practice, such that polynomial-time algorithms are known but they take n^10 or n^20 time?

  17. Anonymous Says:

    What are some of the problems that you’d like to solve in practice, such that polynomial-time algorithms are known but they take n^10 or n^20 time?

    Primality testing (n^6). Range queries in d dimensions (n^d). Convex hulls (n^(d/2)). In graphics: shortest paths on polytope surfaces (n^2), image recognition primitives (n^4).

  18. Scott Says:

    Thanks for the list!

    Primality testing is n^3 if you’re willing to use a randomized algorithm (as is done routinely for key generation). Of course, it would be fantastic to do better. (A sketch of such a test appears at the end of this comment.)

    Algorithms whose running time is n^d (or n^(1/epsilon), etc.) are really in a separate category, and in my view are best handled by the Downey-Fellows theory of parameterized complexity.

    I’ve already accepted that for graphics and streaming applications, linear or n log n is what you want. Having said that, it’s interesting that your other examples don’t go above n^4.
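    To be concrete about the randomized algorithm mentioned above: the standard choice is the Miller-Rabin test. A minimal sketch follows, with an illustrative number of rounds; each round is one modular exponentiation, roughly cubic in the bit length with schoolbook arithmetic.

    import random

    def is_probable_prime(n, rounds=20):
        # Miller-Rabin: a "composite" answer is always correct; "probably prime"
        # is wrong with probability at most 4**(-rounds).
        if n < 2:
            return False
        for p in (2, 3, 5, 7, 11, 13):
            if n % p == 0:
                return n == p
        # Write n - 1 = d * 2^s with d odd.
        d, s = n - 1, 0
        while d % 2 == 0:
            d //= 2
            s += 1
        for _ in range(rounds):
            a = random.randrange(2, n - 1)
            x = pow(a, d, n)              # one modular exponentiation
            if x in (1, n - 1):
                continue
            for _ in range(s - 1):
                x = pow(x, 2, n)
                if x == n - 1:
                    break
            else:
                return False              # a witnesses that n is composite
        return True

    print(is_probable_prime(2**127 - 1))  # True: a Mersenne prime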

  19. Anonymous Says:

    While LP and primality testing were landmarks in ToC at the time of solution, matching wasn’t.

    Edmonds’ paper on non-bipartite matching was a pretty big deal…

  20. Anonymous Says:

    Primality testing (n^6)

    Randomized primality checkers are used (at least) millions of times per day.

    Range queries in d dimensions (n^d). Convex hulls (n^(d/2)).

    Approximate solutions to the rescue…

    In graphics: shortest paths on polytope surfaces (n^2), image recognition primitives (n^4).

    I don’t know anything about graphics… except that in coding graphics demos for mode 13h (320×200) on old x86 boxes we tried really hard to optimize our PutPixel (down to 13 clock cycles I think)…

  21. Anonymous Says:

    Randomized primality checkers are used (at least) millions of times per day.

    Approximate solutions to the rescue…

    Your point being? Polynomial solutions weren’t fast enough, so we had to settle for second best. That works in practice, but it still disproves the statement that all “big problems” ended up having practical polynomial solutions.

    I’m sure many of those who so sternly defend “polynomial = easy” would change their minds if they spent a couple of years in industry dealing with real-life problems.

    There, Omega(n^3) tends to be prohibitively large, Theta(n^2) barely acceptable, Theta(n log n) OK, and O(n) (meaning linear or below) nirvana.
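    To put rough numbers on those thresholds, here is a toy back-of-the-envelope calculation, assuming an input of a million items and a (purely illustrative) machine doing 10^9 simple operations per second:

    import math

    n = 10**6            # illustrative input size: a million items
    ops_per_sec = 1e9    # illustrative machine speed

    for label, ops in [("n", n),
                       ("n log n", n * math.log2(n)),
                       ("n^2", n**2),
                       ("n^3", n**3)]:
        print(f"{label:8s} ~ {ops / ops_per_sec:.3g} seconds")

    # Roughly: n -> 0.001 s, n log n -> 0.02 s,
    # n^2 -> 1000 s (about 17 minutes), n^3 -> 10^9 s (about 30 years).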

  22. Anonymous Says:

    Another P algorithm that is slow in practice: semidefinite programming. In working on embedding problems, I’ve wanted to explicitly solve SDPs to find good embeddings of given graphs into Euclidean space. It’s extremely annoying that, for graphs with more than a few thousand vertices, solving the SDP is too slow in practice.

    Here is a point I’ve thought about a lot: It seems that every argument presented in favor of “polynomial time seems to be a very good theoretical approximation to efficient-in-practice” holds equally for the claim “n polylog(n) time seems to be a very good theoretical approximation to efficient-in-practice”. I think that quasilinear time should be the proper standard for efficient algorithms.

  23. Scott Says:

    Maybe we should change the way we talk about these things to the following:

    Not in P: Your algorithms are getting slaughtered left and right, like foot soldiers in the Trojan War.

    In P: The war might be over, but like Odysseus, you still have a long and perilous journey home.

    n^2: Home (but dozens of suitors have been hitting on your wife).

    n*polylog(n): You’ve slaughtered the suitors.

  24. HN Says:

    Scott, please consider using WordPress. There are tons of tools which automatically convert Blogger blogs to WordPress blogs.

    It is a crime to have nice ideas presented using a bad bad bad blogging tool. It’s like writing STOC/FOCS/SODA papers with MS Word.

  25. Bram Cohen Says:

    An interesting constant-factor problem is that there are variants of Shellsort which are known to be n*log(n), but their constant factor is in the thousands, so they aren’t used in practice despite considerable intrinsic parallelism and beauty.

  26. Scott Says:

    Scott, please consider using WordPress. There are tons of tools which automatically convert Blogger blogs to WordPress blogs.

    Thanks for the suggestion — I’ll try it when I get a chance. The main question is whether I know enough to set this up on my web server (with Blogger you just set it up to ftp the files). An alternative would be to wait for the new version of Blogger (developed in-house by Google) to get rolled out of beta.

  27. Anonymous Says:

    AFAIK the fastest known Shellsort takes time O(n log^2(n)).
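    Presumably that is the variant using Pratt’s gap sequence (all gaps of the form 2^p * 3^q). A minimal, illustrative sketch, making no claims about the constant factors discussed above:

    def pratt_gaps(n):
        # All gaps of the form 2^p * 3^q below n (Pratt's sequence), largest first.
        gaps = []
        two = 1
        while two < n:
            g = two
            while g < n:
                gaps.append(g)
                g *= 3
            two *= 2
        return sorted(gaps, reverse=True)

    def shellsort(a):
        # Gapped insertion sort over Pratt's gaps: O(n log^2 n) comparisons in the
        # worst case, but not competitive in practice.
        for gap in pratt_gaps(len(a)):
            for i in range(gap, len(a)):
                x, j = a[i], i
                while j >= gap and a[j - gap] > x:
                    a[j] = a[j - gap]
                    j -= gap
                a[j] = x
        return a

    print(shellsort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]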

  28. Anonymous Says:

    Your description of the practical interpretations of NP, P, quadratic and n polylog(n) is bang on the money. That is the message we should pass on to students:

    NP-hard = surrender all hope of obtaining the optimum solution; try approximations or heuristics.

    P = things are looking better; often we can take an n^k algorithm down to n^2 or below with some extra effort.

    n^2 = you are in safe territory, unless your input is particularly large or your application needs to be particularly speedy.

    n polylog(n) = then, in the immortal words of Dr. H. Simpson: woo-hoo!

  29. Douglas Knight Says:

    so they aren’t used in practice despite considerable intrinsic paralellism

    Parallelism doesn’t sound like evidence that they’d be used in practice, since most people don’t have access to lots of parallel resources. Do you know anything about what, say, Google uses? (Isn’t quicksort already pretty parallel, anyhow?)

  30. Bram Cohen Says:

    Anonymous: my memory could very well be wrong. Please don’t cite my blog comments in academic journals.

    Doug Knight: In practice everyone uses either merge sort or bucket sort, except in extremely highly optimized implementations which *sometimes* use quicksort.

  31. Rick Neff Says:

    I wonder if you would clarify something for me. My understanding of NP, NP-Complete and NP-Hard must be a little fuzzy. By definition, any given NP-Hard problem is reducible in polynomial time from any other NP-Hard problem. Just like NP-Complete problems, except that an NP-Hard problem need not be in NP, meaning it need not be able to be “verified quickly”.

    So if problem ABC is NP-Hard, all other NP-Hard problems are reducible to ABC, in polynomial time. So ABC is at least as hard as any other NP-Hard problem. So if we find a solution to ABC, we can find a solution to any problem that reduces to it, say XYZ. Let’s also say XYZ is NP-Complete. So why can’t we use a verifier for problem XYZ to also verify ABC? And wouldn’t that make ABC NP-Complete as well? Am I making sense?

  32. Abhinav Says:

    Are you going to NZ? There’s an awesome skydiving center at Queenstown. I’m going there after QIP.

    Abhinav