## Experimental complexity theory

I just came back from the MIT CSAIL (Computer Science and Artificial Intelligence Lab) annual meeting, which was held at a beach resort in Cape Cod. No, it isn’t California, but for at least a few months a year “my” coast can put up a respectable showing too:

Out of all the ideas I heard at the CSAIL meeting, the one that made me proudest to have become a professor was this: *computer scientists should make a serious effort to address world hunger, deforestation, climate change, and other global crises, because of the significant opportunities to tap funding resource**s that are becoming available in these areas.* I’m telling you, if a giant asteroid were going to hit the earth in a week, the first question academics would ask would be how to beat out competing proposals for the $50-million “Deflection of Space-Based Objects” initiative at NSF.

The meeting ended with a “Wild & Crazy Ideas Session,” at which I (naturally) spoke. I briefly considered talking about quantum gravity computing, closed timelike curves, or quantum anthropic postselection, but ultimately decided on something a little less mainstream. My topic was “Experimental Computational Complexity Theory,” or “why do theoretical physicists get $8-billion machines for the sole purpose of confirming or refuting their speculative ideas, whereas theoretical computer scientists get diddlysquat?” More concretely, my proposal is to devote some of the world’s computing power to an all-out attempt to answer questions like the following: does computing the permanent of a 4-by-4 matrix require more arithmetic operations than computing its determinant? You can read my slides here.

Comment #1 June 27th, 2007 at 12:44 pm

Scott, are you proposing, like Avi Wigderson said about algorithms being sanity checks, huge experimental work has to be taken up

justfor sanity check on lower bounds? Or is there something more to your proposal?Comment #2 June 27th, 2007 at 12:50 pm

I think it’s more promising to take such

hugeexperiments for speculative ideas rather than for sanity checks of negative results.Comment #3 June 27th, 2007 at 1:56 pm

Did Wigderson say that, or was it Sipser? Or both? Anybody know?

Comment #4 June 27th, 2007 at 2:05 pm

Nagesh: No, the hope would be that experimental results could help us prove lower bounds. For example, if we found that minimal circuits for the permanent always had a certain structure, we could then try to prove that structure was necessary and that it forced the size to increase superpolynomially with n.

To be clear, I see this as an extreme long-shot proposal, for the reasons laid out in the talk. But right now, theoretical computer science operates almost entirely without the sort of data that theoretical physics takes for granted (or

tookfor granted until recently), and I think we should at least consider the possibility that the limit of n→infinity can be fruitfully approached from below, in the same way the Planck scale can.Comment #5 June 27th, 2007 at 3:20 pm

experimental complexity theory == computer hardware & software

Comment #6 June 27th, 2007 at 3:44 pm

This is exactly what I do whenever I need to prove something. I was pretty surprised to see in your slides that the theorist conventional wisdom is that you can’t learn anything using the methods that I use all the time. I’m an IR guy, not a theorist, though, so maybe I didn’t get the cultural conditioning that it’s supposed to be worthless? Or I’m just not smart enough to be able to prove something without looking at examples?

Comment #7 June 27th, 2007 at 5:32 pm

Oh

nowI get it. So the data acquired with such experiments might reveal insights into the hardness that canleadto actual lower bound proofs.In fact this is the approach most applied researchers in CS (like those in data mining, computer vision etc.) take for inventing their so called “theories”. Since you want this approach for negative results the experiments ought be exhaustive and huge.

Thanks a lot for the explanation Scott.

Comment #8 June 27th, 2007 at 5:39 pm

Did Wigderson say that, or was it Sipser?It was Sipser if Bill Gasarch isn’t wrong. Sorry for the mistake. I don’t know if Wigderson also said it though.

Comment #9 June 27th, 2007 at 6:32 pm

experimental complexity theory == computer hardware & softwareNo, that equation completely fails. Complexity theory asks about the best algorithms that could possibly exist. Ordinary hardware and software tells us nothing about such questions, nor is it designed to. Some software (like RSA) is certainly

appliedcomplexity theory, but that’s different from experimental.Comment #10 June 27th, 2007 at 6:42 pm

lylebot: Yes, even pure mathematicians often resort to experiment (more often than they admit!) when they want to guess at the truth or falsity of a conjecture. But complexity theory is a special branch of mathematics — one that asks about the asymptotic behavior, not of some

particularalgorithm or atypicalalgorithm, but of thebestalgorithm that could possibly exist. And the set of possible algorithms is both huge beyond imagination and lacking in any simple characterization. That’s why, in the past, experimental work hasnotbeen a big help to complexity theory: because the amount of computation that you’d have to do to learn anything interesting is so astronomical. My “wild & crazy” proposal is that we at least ask ourselves whether that’sstilltrue, given both the better understanding and the vastly increased computing power that we have today.Comment #11 June 27th, 2007 at 6:52 pm

But complexity theory is a special branch of mathematicsThank you, Scott, for finally admitting this! 🙂

Comment #12 June 27th, 2007 at 6:54 pm

Admitting? Why not boasting?

Comment #13 June 27th, 2007 at 7:13 pm

Admitting? Why not boasting?Even better, of course! I take it, then, that you would not consider it unreasonably presumptuous on the part of mathematicians were they to adopt the following as their slogan:

“Mathematics: the field that contains complexity theory”

(thereby subsuming complexity theory into mathematics in the same way that you yourself would subsume, well, the entire universe into complexity theory!)

Comment #14 June 27th, 2007 at 7:43 pm

Universe ⊆ Complexity

Complexity ⊆ Math

Math ⊆ Universe

Comment #15 June 27th, 2007 at 11:43 pm

So universe = math? That’s what I’ve been trying to tell people for a long time.

Comment #16 June 28th, 2007 at 12:50 am

More concretely, my proposal is to devote some of the world’s computing power to an all-out attempt to answer questions like the following: does computing the permanent of a 4-by-4 matrix require more arithmetic operations than computing its determinant? You can read my slides here.I haven’t heard of the permanent of a matrix but Scott just told us about it in his PowerPoint slides. The difference between it and the determinant of a matrix is the absence of the minus signs when doing the multiplication and addition of the matrix elements. It is very surpising to me that I am told that we haven’t found a program that calculates the permanent of a 4x 4 matrix with minimal steps, though we know the minimal steps for calculating its corresponding determinant.

Comment #17 June 28th, 2007 at 2:29 am

though we know the minimal steps for calculating its corresponding determinant.No, we don’t even know that. 🙁

Comment #18 June 28th, 2007 at 6:37 am

Several people have tried (since Brent’s 1970 Master’s thesis, at least) to discover fast algorithms for small matrix multiplication by brute force (by restricting to bilinear algorithms, which makes things simpler). As far as I know, there was little success.

BTW, for the 4×4 determinant, you can save 1 operation by using Cramer’s formulas for the inner 3×3 one.

Comment #19 June 28th, 2007 at 6:45 am

Indeed, why to jump to #P-hard problems (such as the permanent) if we don’t know yet how to derive lower bounds for P problems? For instance, I would love to know if the complexity of matrix multiplication is O(n^2). The is the other side of the coin: optimal algorithms/minimal curcuits for tractable problems may open to us new algorithmic ideas that are not discovered yet by human creativity/intuition. Then, this may lead to new approaches to hard problems, and they may happen to be not hard at all… This is why I tend to consider P=NP possibility seriously: we know only very few techniques leading to efficient algorithms, and discovering one of them takes ages (like AKS for PRIMES). So, I don’t agree with the bayesian argument that if there was a polynomial algorithm for NPC we would discover it, thus P!=NP. Simply, we are not efficient at discovering efficient algorithms, and if there is a chance to power us up by performing a complete automated search for them, it may help a lot. Besides, positive results are more useful for practical applications, so assuming equal probability of success, I would rather go for a positive result.

Comment #20 June 28th, 2007 at 7:04 am

Stas: Nontrivial lower bounds for P problems are

wonderfulto whatever extent we can get them, and this is universally understood by complexity theorists. On the other hand, sometimes lower bounds for NP and #P problems are (perhaps not surprisingly) easier to prove. As an example, Fortnow and van Melkebeek showed that SAT can’t be solved simultaneously in n^{1.618…}time and n^{o(1)}space (1.618… being the golden ratio). To my knowledge, we don’t currently have any similar result for a P problem. Likewise, we know that there exist problems in #P that don’t have linear-size circuits, but we don’t know the same for P (or for that matter NP).Comment #21 June 28th, 2007 at 7:34 am

Okay. I retract

experimental complexity theory == computer hardware & software

Instead I say,

\mu(experimental complexity theory – computer hardware & software) = 0

where \mu is MY own, well defined measure.

Scott, You yourself, when pressed for an example of experimental complexity theory, talk about: “to devote some of the world’s computing power”, instead of ” to devote some of the world’s pencil resources”. I’m glad you are duly kowtowing to computer hardware and software. We computer programmer’s rule the world. Without us, you complexity theorists are utterly unattractive in the eyes of the world.

Comment #22 June 28th, 2007 at 7:48 am

the whole theoretical disussion ignores constant

factors, but in practice these are important.

Isn’t this all a bit too distant from practice ?

I’d prefer to see computational bounds in seconds(n)

rather than some bounds with several parameters

and epsilons.

Comment #23 June 28th, 2007 at 7:50 am

correction:

Instead of – of the two sets, I meant their symmetric difference

Comment #24 June 28th, 2007 at 8:21 am

Nontrivial lower bounds for P problems are wonderful to whatever extent we can get them, and this is universally understood by complexity theorists. On the other hand, sometimes lower bounds for NP and #P problems are (perhaps not surprisingly) easier to prove.Thank you, Scott, I got it — the harder the problem, the easier the derivation of a superlinear lower bound.

As an example, Fortnow and van Melkebeek showed that SAT can’t be solved simultaneously in n1.618… time and no(1) space (1.618… being the golden ratio).I just wonder if there is a similar result for my favorite NP problem CLIQUE :).

To my knowledge, we don’t currently have any similar result for a P problem.It makes the experimental search for them even more exciting…

Comment #25 June 28th, 2007 at 9:36 am

Eric:

Actually I and two of my coauthors have been working a complete computer search for matrix multplication algorithms, for rectangular matrices. So far we have mostly been working with cases were the optimal number of multiplications is known, but for some of these cases we have now generated all optimal methods. For one case we have in this way also proven that the best method known is in fact optimal.

No paper available yet, but we are writing it now.

Comment #26 June 28th, 2007 at 9:53 am

Klas: When you have a preprint, send it to me!

Comment #27 June 28th, 2007 at 9:57 am

the whole theoretical disussion ignores constantfactors, but in practice these are important.

If you’re searching for an optimal algorithm for 4-by-4 matrices, it’s obvious that the whole problem boils down to constant factors, and I say as much in the slides. Specifically, the problem is to reduce the number of cases to check from 10

^{123}down to some manageable number.Comment #28 June 28th, 2007 at 10:29 am

Scott said

“More concretely, my proposal is to devote some of the world’s computing power to an all-out attempt to answer questions like the following: does computing the permanent of a 4-by-4 matrix require more arithmetic operations than computing its determinant?”

Scott said

“Complexity theory asks about the best algorithms that could possibly exist. Ordinary hardware and software tells us nothing about such questions, nor is it designed to.”

Comment #29 June 28th, 2007 at 10:31 am

I’ll be happy do to that Scott. I hope we will have the preprint written up by the end of summer, depending on conference travel, vacations and so on

Comment #30 June 28th, 2007 at 10:40 am

rrtucci, we seem to be talking past each other.

Computer hardware and software can be

usedto do experimental research in complexity theory, just as magnets can be used to do experimental research in high-energy physics. But experimental high-energy physics is notaboutmagnets, and experimental complexity theory would not be about existing hardware and software.Comment #31 June 28th, 2007 at 11:23 am

Scott, do you mechanize any of your proofs? Because I’d like seeing huge amounts of computing power burned to generate a google-sized database of lemmas.

Comment #32 June 28th, 2007 at 12:02 pm

I guess Scott means a

naturalP problem…Comment #33 June 28th, 2007 at 1:00 pm

Rahul: Yes, thanks for the clarification. Stuffing an n

^{k}-time Turing machine into the problem statement is definitely not allowed.Comment #34 June 28th, 2007 at 1:28 pm

Hi Scott,

Did you presentation only have 8 slides? that was the number of slides in the file linked from your post.

Cheers

Comment #35 June 28th, 2007 at 1:41 pm

Ernesto: Yes, and I didn’t even get to finish it. I was allotted 4 minutes, plus 1 minute for questions.

Comment #36 June 29th, 2007 at 10:11 am

Okay, I’ll say my last word about this and then I’ll shut up. Scott, you are fond of making a distinction between A is_used_by_or_uses B and A is_not_strictly_about B. You’ll say Complexity Theory is_not_strictly_about computer hardware or software. Or Experimental Complexity Theory is_not_strictly_about computer hardware or software. Or

“experimental complexity theory == computer hardware & software

No, that equation completely fails. Complexity theory asks about the best algorithms that could possibly exist. Ordinary hardware and software tells us nothing about such questions, nor is it designed to.”

Your last assertion might be true in the is_not_strictly_about sense, but it is false in the is_used_by_or_uses sense. I was of course talking in the is_used_by_or_uses sense, which is the sense most people would use to interpret a statement such as “experimental complexity theory == computer hardware & software”.

I also think that the A is_not_strictly_about B is a pedantic distinction, invented by and used almost exclusively by complexity theorist to distance themselves from computer programmers, because they consider themselves high above such trades. Of course, they are living in a bubble; the vast majority of the world admires a computer programmer much more than a complexity-what? I mean, so what if complexity-what?s eventually prove that P != NP. Everyone knew that all along. If I were a complexity theorist, I would stop making these true but pointless is_not_strictly_about distictions which make non-complexity theorists yawn. I would instead model myself around Donald Knuth, who is an excellent mathematician, and a very eager programmer. He is not concerned with these artificial class distinctions.

Comment #37 June 29th, 2007 at 11:36 am

Of course, they are living in a bubble; the vast majority of the world admires a computer programmer much more than a complexity-what? I mean, so what if complexity-what?s eventually prove that P != NP.Apparently, being admired by the masses is important to you. But it’s not a priority with most complexity theorists and mathematicians. They would rather that their work be admired by peers than by the general public, who have very little understanding of and, therefore, appreciation for theoretical research.

Having said that, however, I would have to agree with your last two sentences. The distinction between pure and applied mathematics is an artificial one, and a good mathematician (including the complexity theorist) should not look down on applications.

Comment #38 June 29th, 2007 at 12:59 pm

I’m amused by people who’ve somehow gotten the idea that I look down on applications. I worked on applied projects for four years as an undergrad and summer student at Bell Labs, and just this week I started working with a student to implement the ideas in my “Learnability of Quantum States” paper, and turn them into a practical tool for experimental physicists.

Comment #39 June 29th, 2007 at 1:35 pm

Scott, I wasn’t referring to you in particular; it was a general statement about the attitude of some theorists, that’s all. 🙂

Comment #40 June 30th, 2007 at 9:16 pm

Good idea Scott. I would suggest, however, collaborating with researchers working on theorem provers if you are serious about this. I’m guessing that if the “experimental complexity” you describe ever comes to be, those involved would build something resembling a theorem prover running on a massive computer cluster.

Comment #41 July 1st, 2007 at 6:19 am

Does anyone know offhand the respective costs for 3×3 matrices? Also, does one want to treat multiplication and

addition on equal footing for these problems?

Comment #42 July 1st, 2007 at 7:10 am

Anon: Excellent questions!

The situation

alreadylooks nontrivial for 3-by-3 matrices — the exact answers might be known, but not by me. I know that 14 operations are sufficient, and I conjecture that that’s optimal. I can show that 9 operations are necessary (i.e., 8 don’t suffice) :-). I’ll be extremely interested if anyone has better upper or lower bounds.No, there’s no need to treat multiplication and addition gates on an equal footing — I was doing so for simplicity, but it would also be extremely interesting to know the tradeoffs. Indeed, we could even look at the

three-waytradeoff among addition, multiplication, and division gates.(Gaussian elimination of course requires division gates. We can always replace division by addition and multiplication using Taylor series, but to do so we need to be working over either a field of small finite characteristic, or else over the reals or complex numbers where we’re willing to tolerate small errors. And at least for now, I’d prefer to look for determinant circuits that are formally correct regardless of the underlying field.)

Comment #43 July 1st, 2007 at 8:07 am

Scott,

Are you aware of Dodgson’s (a.k.a. Lewis Carroll) condensation method? It’s an alternative method for computing the determinant. It might not require less operations than Gaussian elimination, but seems to be different than the methods you are considering in your presentation.

Comment #44 July 1st, 2007 at 9:20 am

Assuming you want to compute a polynomial, to remove divisions (and keep an exact result!), what you need is that your base field has enough points (so that you can find a point where no denominator vanishes). Of course, there is an overhead in doing so, which is something like O(d log(d)loglog(d)), where d is the degree of the polynomial you want to compute.

Comment #45 July 1st, 2007 at 11:30 am

Alon: No, I didn’t know about Dodgson condensation! Does anyone know its complexity? Otherwise I’ll work it out myself.

Comment #46 July 1st, 2007 at 11:58 am

This week I was teaching Order of Operations to my High School students. I quoted: “Ambition, Distraction, Uglification, and Derision.”

I asked if they knew the author that I cited, “Lewis Carroll.” One young lady said: “Yeah; he molested little girls.”

We went off on a long highly-charged tangent. Eventually, I explained who Charles Dodgson was, what his day job was, and I told the anecdote of his audience with Queen Victoria. And the book that he sent Her Majesty. That book containing part of what Scott Aaronson now wants to probe more deeply. Ummm, I mean probe down the rabbit hole. I mean…

Bareiss, E. H. “Sylvester’s Identity and Multistep Integer-Preserving Gaussian Elimination.” Math. Comput. 22, 565-578, 1968.

Bressoud, D. and Propp, J. “How the Alternating Sign Matrix Conjecture was Solved.” Not. Amer. Math. Soc. 46, 637-646.

Dodgson, C. L. “Condensation of Determinants, Being a New and Brief Method for Computing their Arithmetic Values.” Proc. Roy. Soc. Ser. A 15, 150-155, 1866.

Lotkin, M. “Note on the Method of Contractants.” Amer. Math. Soc. 55, 476-479, 1959.

Macmillan, R. H. A New Method for the Numerical Evaluation of Determinants.” J. Roy. Aeronaut. Soc. 59, 772, 1955.

Robbins, D. P. and Rumsey, H. Jr. “Determinants and Alternating Sign Matrices.” Adv. Math. 62, 169-184, 1986.

Weisstein, Eric W. “Condensation.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/Condensation.html

Comment #47 July 1st, 2007 at 5:24 pm

Scott — Dodgson condensation has complexity O(n^3), which is, if I recall correctly, the same as Gaussian elimination. It tends to be useful for matrices which have a high degree of symmetry, as you wind up performing the same computations repeatedly.

Comment #48 July 1st, 2007 at 5:34 pm

Thanks, David!

Comment #49 July 1st, 2007 at 11:04 pm

As I mentioned at FCRC, I am currently working on a project that automatizes almost all of the existing framework for time lower bounds established via “alternation-trading proofs”. (The Fortnow-van Melkebeek result cited above falls in this category.) My approach applies not only to the existing proofs of statements like

“SAT requires n^c time and n^{o(1)} space”

but also things like

“Pi_4 SAT requires n^d time on a one-tape Turing machine.”

So far, my implementations have just been written in Maple. Given that, they’re surprisingly powerful– enough to have checked all possible alternation-trading proofs with at most 20 “lines” for the best possible time lower bound for SAT on n^{o(1)} space algorithms (for a natural definition of “line”).

In some settings, this automated approach has found no better time lower bound than the current best– a strong suggestion that the current best is “optimal” in some sense. In other more technically annoying settings (where humans fear to tread?), it has managed to yield new improvements. Eventually, when my thesis is fully revised and all that, I’ll post a preprint.

Comment #50 July 2nd, 2007 at 5:31 pm

As an example, Fortnow and van Melkebeek showed that SAT can’t be solved simultaneously in n1.618… time and no(1) space (1.618… being the golden ratio). To my knowledge, we don’t currently have any similar result for a P problem.It depends a little on exactly what you mean by ‘similar’. We can’t get time bounds in such tradeoffs to be even as large as n1.00001 for problems in P but Mike Saks, Xiaodong Sun, Erik Vee and I, following work of Miki Ajtai, did prove that a particular simple problem in P can’t be computed simultaneously in n(log n/loglog n)1/2 time and n.99999 space, even non-uniformly.

Also, though Ryan Williams was too modest to note it, his recent CCC 2007 paper improved that golden ratio to a number above 1.80.

I don’t think that the hardness of the permanent alone is what makes it so appealing. For low level complexity lower bounds, it is often easier to prove that a simple function is hard rather than a complex one. (The alternation-trading arguments are notable exceptions.) The appeal of the permanent is not only that it is #P hard but also that it has many symmetries and is very similar to the determinant which is simple to compute.

Comment #51 July 3rd, 2007 at 2:54 pm

The number above 1.80 which Ryan believes might be tight is, amusingly, 2cos(pi/7).

Comment #52 July 4th, 2007 at 1:39 am

You didn’t think we would recognize that on our own? Give us some credit ryano.

Comment #53 July 5th, 2007 at 6:39 pm

I think this is a great idea.

On the one hand, experimental complexity theory seems much more difficult than, say, experimental number theory, since it seems to require a much larger search space, ruling out many naive approaches. Thus, it’s clear that designing and executing useful experiments would require not just strong computers but a combination of both theoretical understanding and engineering/programming skills of the kind many theoreticians do not posses (even those who know can program in other languages than BASIC :). But of course, this is also true for experimental physicists/biologists etc..

On the other hand, I think that this difficulty just demonstrates why experimental complexity theory could be so important. We don’t really understand the space of algorithms, and I’m sure that such experiments could come up with surprising results. (Probably good candidates are arithmetic problems with relatively efficient algorithms such as matrix multiplication and Fourier transform, since then we still have a good notion of what are the elementary operations but the search space is not doubly exponential.)

Probably the first step, before asking for funding, is for us complexity theorists to be receptive and supportive of such works.

Comment #54 July 5th, 2007 at 6:48 pm

To specify more precisely what P problems I would like to see attacked experimentally, let me refer to

http://www.win.tue.nl/~gwoegi/CO2a/exact2.pdf

page 9, Open problem 4.3

It’s all related to k-CLIQUE complexity with fixed k. If you let me know about any progress in this direction, I would greatly appreciate it.

Comment #55 July 9th, 2007 at 6:21 am

“computer scientists should make a serious effort to address world hunger, deforestation, climate change, and other global crises, because of the significant opportunities to tap funding resources that are becoming available in these areas.”

What?! And all these years, Ive been doing this for the sex and the drugs…

No, wait, thats Bono.

Comment #56 July 9th, 2007 at 7:59 am

Tapan: 🙂

Whatever your motivations, I’m a big fan of your work, and it made me feel pretty shitty to be competing with you on the job market this year. I mean, yes, the dearth of quantum complexity theorems

isone of the most urgent crises facing our civilization … but more urgent than third-world poverty? That’s a tough one.So I’m delighted to read on your homepage that you ended up at Berkeley. I loved it there, and I don’t think you could’ve made a better choice. If you get lucky, you

mighteven encounter others there who share your interest in social justice.Also, I highly recommend the mango smoothies at Hummingbird on Euclid Avenue.

Comment #57 August 1st, 2007 at 10:41 am

The exploration of geometry in large dimensions is a worthy “experimental” mathematics topic.

In our own work, we are presently computing (non-sparse) Riemann tensors on Kahler manifolds of (complex) dimension up to 99. Each tensor thus requires 99^4 = 10^8 complex entries. This is about the limit that can be computed on a desktop scale machine. These tensors are showing us remarkable mathematical regularities that we are far from understanding analytically.

We would dearly love to move to larger dimensions, say Riemann tensors of dimension 1000^4. This would let us explore the Kählerian geometry to protein-scale spin simulations. IBM’s Blue Gene racks seem ideally suited for this scale of computation.

Upon a review of the literature, we presently believe that our 99^4 Kählerian Riemann tensors are the largest ever computed in explicit numerical form — does anyone know of a comparable or larger geometric computation effort?