## Tools for the modern complexity theorist

You’re deep in the Congo.  You’ve got an iPhone with some charge left, but there’s no cell tower for hundreds of miles.  With life-or-death urgency, you need to know the definition of the complexity class SBP and its relationship to BPPpath.  What do you do?

Not to worry: Satoshi Hada has created a free Complexity Zoo app for the iPad and iPhone.  I tried it out and it works great!

You get a cold call from yet another solitary genius who’s discovered a simple linear-time 3SAT algorithm.  You tell him to implement the algorithm and test it out, and then you’ll talk.  Half an hour later, he tells you he’s done so and it works perfectly.  So you tell him to go factor the 617-digit RSA challenge number.  But being an iconoclastic genius, he never learned how to reduce factoring to 3SAT.  What do you do?

Relax: answering challenge #2 from this blog post even before the post went up, USC students Henry Yuen and Joe Bebel have created a great web application called ToughSAT, which generates hard SAT instances on demand, based on factoring, subset sum, or even a “hard problem cocktail.”  As a commenter helpfully alerted us, a few years ago Paul Purdom and Amr Sabry of Indiana University already created a similar site that generates hard SAT instances based on factoring.

### 23 Responses to “Tools for the modern complexity theorist”

1. Mitch Says:

Is generating hard SAT problems mostly theory based…or is it just trying to solve a bunch of them and picking out the ones you can’t do?

2. Scott Says:

Mitch: A combination. We expect SAT instances based on cryptographic problems to be hard for the “theoretical” reason that solving them would break the underlying cryptosystems. (Though even there, obviously the empirical fact that the cryptosystems in question have withstood attack plays a major role!)

At the “other end of the spectrum,” there are uniformly-random k-SAT instances near the critical ratio, for which we have empirical data and semi-rigorous statistical physics arguments suggesting their apparent hardness, but essentially no reductions from other problems. (On the other hand, complexity theory has been gradually catching up here—to take one example, a few years ago Achlioptas et al. explained the failure of certain classes of algorithms on random k-SAT via rigorous results about the structure of the solution set.)

3. Raoul Ohio Says:

A major part of TCS are theorems for reducing various hard problems into other problems. A full set of programs that implement these reductions (for concrete problem instances) would probably have great practical and theoretical use. It would also provide lots of great practical research problems, namely improving the various steps, or proving an existing step is best possible.

Constructing a framework to implement such a set of programs would be a major step in ATCS (Applied TCS, did I just invent a topic?). To help get the ball rolling, I hereby offer the ROP (Raoul Ohio Prize!), namely \$10 and a couple old books by Knuth that are kicking around in my attic, for filling out my fuzzy goal, and doing something concrete.

4. Scott Says:

ATCS—I love it!

5. Henry Yuen Says:

Hi Scott, I don’t know if you’ve seen this yet, but it appears that wolfram alpha has added a “complexity module” to their engine, meaning you can look up complexity classes or even query for various inclusion relationships between classes! Of course, the answer to most queries will be “not known”, but I don’t think that’s a deficiency in wolfram alpha per se…

For example,

Apparently they misspelled sanjeev’s last name as “Aurora”.

This complements the complexity zoo app well, I think.

6. Scott Says:

Henry: Dammit, are they trying to horn in on the Complexity Zoo’s customers? 😉

I tried it and it worked well, though it would be nice if it returned more information than just a bare “true,” “false,” or “open.”

7. fagagr4bh Says:

OK, a nitpick, but I don’t think the “self-taught genius” bit is fair 🙁 I consider myself “self-taught” up until grad school and I am doing just fine in academia. Most of the best programmers I know are self-taught (and some of the academics too, like one of MIT’s recent hires)

8. Scott Says:

fagagr4bh: Thanks, I changed “self-taught” to “iconoclastic” to emphasize the irony. 🙂 I, too, was mostly “self-taught” as a teenager (and one could argue that anyone who ever manages to learn anything is “self-taught,” with at most small amounts of external help).

9. James Says:

> You get a cold call from yet another solitary genius who’s discovered a simple linear-time 3SAT algorithm.

Out of curiosity, how many people have ever contacted you with such an attempt? Are any of them actually interesting?

10. Scott Says:

James: In my life, probably a couple dozen (not counting P!=NP proofs, or P=NP proofs that I reviewed for conferences). Once I had to contact the police, over a guy who was making graphic and repeated death threats against me because I wouldn’t post his P=NP proof on my blog.

Many of the attempts have been absolutely fascinating from a psychological standpoint—relating P vs NP to everything from Jesus to the Bhagavad Gita to the author’s father. Only sympathy for the authors prevents me from making this stuff public.

No attempt, at least to date, has been especially interesting from a mathematical standpoint.

11. Raoul Ohio Says:

Back in the 70’s, a considerable bogon flux (bogon = unit of crackpotness) accumulated in physics departments. When I was a grad student, the chairman used to drop off the submissions in the faculty lounge for light reading while recaffinating. I claimed some top specimens before they got tossed by the cleaning crew.

A prize example (in my attic now) is a top end hardback tome, reorganizing all of math and physics using the insight that sqrt(10) is exactly equal to pi. I recall making some (Fermi estimates) of how much it must have cost to print and distribute this book.

A more recent example is the advertisements for the “Null Physics” book that appeared in top end science magazines for a couple of years. Can anyone calculate an estimate on how much that guy spent? For openers, what does it cost for the back cover of Scientific American?

12. Anonymous Says:

How do you reduce factoring to 3SAT? Is there a more direct way than going via Turing machines?

13. Scott Says:

Anonymous: Yes, a more direct way to reduce factoring to 3SAT is to construct a Boolean circuit to multiply integers–it’s not that complicated (certainly less so than Turing machines) and incurs “only” a quadratic blowup. If you want smaller blowup, you can use the Karatsuba algorithm, or even FFT multiplication.

14. jonas Says:

It’s interesting to read the comments to older (closed) posts on this blog. For example, today I learnt that Scott thinks graph isomorphism is not in P (and also not in NP).

15. Scott Says:

jonas: What on earth are you talking about??

I think GI is probably in P, and it’s known to be in NP. (On the other hand, we also know that it can’t be NP-complete unless the polynomial hierarchy collapses.)

16. Douglas Knight Says:

maybe jonas is referring to this, where you are agnostic about whether GI is in P.

Searching for this, I see you often invoke GI being easy in practice as evidence for its being in P. Does it seem any easier than factoring? Factoring a random number is easy, isn’t it? It’s just that we often like to make hard instances. I think there was a time when people didn’t know how to make hard pairs of graph, but now I think they do.

It seems weird to me to expect GI to be easier than factoring.

17. Scott Says:

Douglas: Thanks for researching what I’ve said! 🙂 I guess some days, I think GI is in P; other days, I think there might be hard instances (which, however, are very hard to find).

The situation with factoring is completely different: as far as anyone knows, factoring a random n-bit integers is hard. So in particular, it’s very easy to generate hard instances of factoring, but very hard to generate hard instances of GI (as soon as you describe how you’re generating the graphs, someone can probably come up with a distinguishing criterion for non-isomorphic pairs).

18. fagagr4bh Says:

The Wikipedia page on zero-knowledge proofs (http://en.wikipedia.org/wiki/Zero-knowledge_proof) uses graph isomorphism in a ZKP for Hamiltonian cycles. If GI is easy in practice (or it is likely to P) their example seems broken. Clearly, if GI is in P, their example is broken: one only needs to request to see the Hamiltonian cycle on the isomorphism, and then you can map the cycle on to the graph of interest in polynomial time.

However, even if it is not P but often easy the protocol seems broken. Is it the case that every graph has hard instances of GI? For instance, following Lipton’s recent post, if you knew that the graph of interest was free of a certain minor, could one solve the isomorphism quickly? Likewise, is it the case that one can generate a never ending stream of hard instances of GI, so you can request isomorphisms until you get an easy one?

19. Douglas Knight Says:

Scott,

As far as I can tell, there are no known hard instances of GI. not even ones that were subsequently made easy. Cai-Fürer-Immerman gave examples that showed that one putative invariant was inadequate and the same technique later showed another to be inadequate. Miyazaki used the CFI technique to produce “medium” difficulty instances that show Mckay’s algorithm (nauty) to take exponential time, but which are bounded degree, and thus solved by Luks’s algorithm. My impression is that the best algorithm by worst-case complexity, Babai-Luks, is related to Luks, and thus it, too should classify these graphs quickly, but I’m not sure.

So, maybe CFI could produce examples that could stump BL, but that seems to be an open problem. I don’t think anyone has ever had to produce an algorithm to deal with a hard instance.

Also, there’s an asymmetry between positive and negative examples. If you (constructively) prove that a family of graphs are different, the method of proof yields an invariant, which you might expect to be computable in polynomial time, and thus you can add it to your algorithm to handle this family. But I believe Miyazaki’s medium instance is a positive instance, where nauty struggles to find the isomorphism. Thus it seems difficult to patch the algorithm to handle the example (though other examples handle it already). I’m not sure what conclusion to draw from this.

fagagr4bh,

wikipedia is correct, but leaves out a lot of detail about the procedure, especially what is being committed to.

20. henri Says:

Scott,

ToughSat is really cool! However, it seems that the factoring2 mode has a bug. I have generated the cnf corresponding to 101 (which is prime) and minisat have found it satisfiable.

I have also a suggestion to simplify the CNF for factoring2. When factoring a n bits number, we can assume that one of the two factors will have a size less than |n/2| + 1 bits (enough to encode the square root of n). So it can be encoded with a multiplication of (|n/2| + 1) * n bits instead of a n*n bits as it is the case here (according to the comments about the factor’s encoding). This should significantly reduce the number of clauses and variables.

21. jonas Says:

To Scott and Douglas Knight: actually I was reading the comment “http://www.scottaaronson.com/blog/?p=284#comment-8159” about graph isomorphism.

22. Douglas Knight Says:

My previous comment is rather confused.

I now think BL always takes worst case time, so it’s irrelevant to the question of hard instances. Miyazaki’s instances probably have two parameters, number of vertices and degree. Keeping degree fixed, they are hard for McKay, but not Luks, but probably letting degree increase they are hard for Luks. Thus they are legitimate hard instances. And it took 10 years for Tener to modify the algorithm to solve them quickly.

So there has been only one hard instance, but it wasn’t quickly resolved.

23. Joe Bebel Says:

Hi Henri,

Thanks for the suggestions! We fixed (I hope 🙂 ) the bug you mentioned. Putting primes into “factoring2” should always result in unsatisfiable instances now.

You’re right about reducing the size of the multiplier – possibly in a future version. There are some other possible “optimizations” not done yet either. Hopefully, the source code will be ready for release soon.