My philomath project: Sensitivity versus block-sensitivity

If you like math, and you don’t yet have a Math Overflow account, stop reading this post now (not right now, but by the end of the sentence) and set one up, before returning here to finish reading the post. Math Overflow is the real deal: something that I’ve missed, dreamed about, and told my friends someone ought to set up for the last fifteen years, and that now finally actually exists. (It was founded by Berkeley grad students and postdocs Anton Geraschenko, David Brown, and Scott Morrison.) If you have a research-related math problem you can’t solve, you can post it there and there’s a nontrivial chance someone will solve it (or at least tell you something new), possibly within eleven minutes. If you’re an ambitious student looking for a problem to solve, you can go there and find one (or a hundred).

To take one example, here’s a terrific complexity question asked by Timothy Gowers, about a notion of “average-case NP-completeness” different from the usual notions (if you think he’s asking about a well-studied topic, read the question more carefully). I didn’t have a good answer, so I wrote a long, irrelevant non-answer summarizing what’s known about whether there are average-case NP-complete problems in the conventional sense.

But my real topic today is the sensitivity versus block-sensitivity problem, which I recently posted to MO in a disguised (and, dare I say, improved) form.

For non-Boolean-function-nerds, sensitivity vs. block-sensitivity is a frustrating and elusive combinatorial problem, first asked (as far as I know) by Noam Nisan and by Nisan-Szegedy around 1991. Here’s a lovely paper by Claire Kenyon and Samuel Kutin that gives background and motivation as well as partial results.

Briefly, let f:{0,1}ⁿ→{0,1} be a Boolean function, with n input bits and 1 output bit. Then given an input x=x₁…x_n to f, the sensitivity of x, or s_x(f), is the number of bits of x that you can flip to change the value of f. The sensitivity of f is s(f) = max_x s_x(f). Also, the block-sensitivity of an input x, or bs_x(f), is the maximum number of disjoint sets of bits of x (called “blocks”) that you can flip to change the value of f, and the block sensitivity of f is bs(f) = max_x bs_x(f). Clearly 1 ≤ s(f) ≤ bs(f) ≤ n for every non-constant Boolean function f. (bs(f) is at least s(f) since you could always just take each block to have size 1.)

To give some examples, the n-bit OR function satisfies s(OR)=bs(OR)=n, since the all-zeroes input is sensitive to flipping any of the n input bits. Likewise s(AND)=bs(AND)=n, since the all-ones input is sensitive to flipping any of the bits. Indeed, it’s not hard to see that s(f)=bs(f) for every monotone Boolean function f. For non-monotone Boolean functions, on the other hand, the block-sensitivity can be bigger. For example, consider the “sortedness function”, a 4-input Boolean function f that outputs 1 if the input is 0000, 0001, 0011, 0111, 1111, 1110, 1100, or 1000, and 0 otherwise. Then you can check that bs(f) is 3, whereas s(f) is only 2.

Here’s the question: What’s the largest possible gap between s(f) and bs(f)? Are they always polynomially related?

What makes this interesting is that block-sensitivity is known to be polynomially related to a huge number of other interesting complexity measures: the decision-tree complexity of f, the certificate complexity of f, the randomized query complexity of f, the quantum query complexity of f, the degree of f as a real polynomial, you name it. So if, as is conjectured, sensitivity and block-sensitivity are polynomially related, then sensitivity—arguably the most basic of all Boolean function complexity measures—ceases to be an outlier and joins a large and happy flock.

The largest known gap between sensitivity and block-sensitivity is quadratic, and is achieved by “Rubinstein’s function.” To define this function, assume for simplicity that n is an even perfect square, and arrange the input bits into a √n-by-√n square grid. Then we’ll set f(x)=1, if and only if there exists a row that has two consecutive 1’s and all other entries equal to 0. You can check that bs(f)=n/2 (for consider the all-zeroes input), whereas s(f)=2√n (the worst case is when every row contains exactly one 1).

It’s a reasonable guess that Rubinstein’s function gives pretty much the largest gap possible, and how hard could that possibly be to prove? Well, how hard could a white rabbit in front of a cave possibly be to kill?

I’ll confess to going on sensitivity versus block-sensitivity binges every couple of years since I first learned about this problem as an undergraduate at Cornell. The last binge occurred this weekend, triggered by the strange block-sensitivity properties of my counterexample to the GLN Conjecture. And that’s when it occurred to me to use the hyper-inter-network tools of Web 2.0, together with my power and influence here at Shtetl-Optimized, to unleash a new flood of activity on the problem. There are at least four factors that make this problem well-suited to a collaborative math project:

The statement can be understood by almost anyone. I could explain it to my parents.
It seems unlikely (though not impossible) that the solution will require any heavy-duty math. What seems needed, rather, is lots of creativity to come up with new ideas specific to the problem at hand, as well as diabolical examples of Boolean functions that refute those ideas.
Even though the problem has been around for 20 years, the relevant literature is very small (maybe half a dozen papers); it would take at most a day to learn everything known about the problem.
Despite 1-3, this is a real problem that a significant number of people would care about the answer to.

If you feel like you want a new angle on the problem—something that hasn’t already been explored to death, or even to serious injury—you can try my “geometric variant” of sensitivity vs. block sensitivity described on Math Overflow.

I’m calling this a “philomath project,” a term that pays tribute to the successful polymath projects popularized by (and carried out on) Timothy Gowers’ wonderful blog, but that avoids infringing on a registered trademark of GowersCorp.

So, here are the philomath project rules: do you have an idea about sensitivity vs. block sensitivity? Or a vague pseudo-idea? Or a proposal for an easier variant? Then post it here! Or go over to Math Overflow and post it there. Let’s see if a block of us acting in unison can flip this problem.

This entry was posted on Tuesday, July 13th, 2010 at 9:48 am and is filed under Complexity, Nerd Interest. You can follow any responses to this entry through the RSS 2.0 feed. Both comments and pings are currently closed.

89 Responses to “My philomath project: Sensitivity versus block-sensitivity”

Elad Verbin Says:
Comment #1 July 13th, 2010 at 11:47 am
Whoops, I forgot about greater-than signs being part of HTML tags and the last comment got messed up. Can you delete the last one?

1. It might be useful to remind people that, at least if you want to go by the rules of polymath, more or less anything one can say should be posted, and that short comments that just have one idea, however vague, are expected. In other words, laziness and half-baked ideas are king.

2. Let me start by giving one well-known observation about how to “lift” gaps between s and bs: Suppose we have a function f on n variables, with some gap between s(f) and bs(f). I’ll show how to “lift” this gap, i.e. to create a new function g=f^k where s(g)=s(f)^k and bs(g)≥=bs(f)^k.

Let’s see how to do this. The function g is just “recursive-f”, where the recursion is of depth k. Namely, g is a n^k-bit function, and to compute g(x) you arrange the bits of x in the leaves of a depth-k tree of degree n. At each node you apply f on the labels of the children, to get the label of the node. The output g(x) is the label of the root.

It’s quite easy to see that s(g)=s(f)^k. (I leave this as an exercise). It’s also quite easy to see that bs(g)≥bs(f)^k. I don’t know if the latter bound is tight (it would be very interesting if it isn’t! In fact, I’m willing to guess that it isn’t, and that this might have some relation with independent sets in graph powers, as in e.g. the paper of Dinur, Friedgut and Regev, but this is just a hunch)

Note that this construction doesn’t help improve a polynomial gap: if bs(f)=s(f)^t, then the best we can get from the relation of g and f that I have above is that bs(g)≥s(g)^t. However, this construction is interesting for various other reasons. It also suggests that to get an improved gap, it might be possible to play with the construction of g above (by, for example, reusing variables in some clever way) in a way that will not increase s(g) but will increase bs(g).

3. I’d like to suggest that to actually get progress, it might be worthwhile to summarize (maybe in a blog post or in a wiki) what is known about the s vs. bs question. Any upper bound? Any special classes of functions where the conjecture is closed (what about low-degree polynomials over F₂?), or where non-obvious results are known?

4. Let me start this, by citing a few interesting results from the Kenyon-Kutin paper that you link to:

(i) they define a concept of “l-block sensitivity”. This is the same as block sensitivity, where only blocks of length ≤ l are allowed. They prove that the l-block sensitivity of f is at most s(f)^l / l! (the `!` is a factorial, not an exclamation point). This means that to get a huge gap between bs(f) and s(f), you’ll have to choose a function where the block sensitivity is attained using large blocks.

(ii) they prove that if bs(f)≥Kn, then s(f)≥Ω(n^K/2). This implies that to get a huge gap, one would have to use a function f whose block sensitivity is sub-linear in the number of variables.

5. Just to throw out a guess, I’m willing to conjecture that for any f, s(f)≤2^√bs(f). Any takers?
gowers Says:
Comment #2 July 13th, 2010 at 12:03 pm
This is a very nice question. I hope you won’t mind if I post a very elementary thought indeed (by which I mean so elementary that anyone who has thought about the problem — by which I mean the variant over at Mathoverflow — for five minutes will have had) but one that helps me understand the question better.

My initial reaction was, “But surely there must be a point with at least d/2 neighbours of a different colour.” I think I was imagining some kind of false proof involving the intermediate value theorem. Anyhow, the example that helps me to understand the problem is where you colour the a point x in Z^d red if x_1 > 0 and blue otherwise. Then the only points that have neighbours of different colours are ones where x_1=0 or 1, and those have only one neighbour of a different colour. However, they violate the non-triviality condition you impose.

So now I understand the non-triviality condition.

Of course, a more sophisticated way to understand the condition would be to read more carefully your O(sqrt{d}) example, which I will now do.
Scott Says:
Comment #3 July 13th, 2010 at 12:08 pm
Elad, thanks so much for your excellent comment! By way of thanks, I’ve taken the liberty of fixing up your HTML.

Let me go you one further: on this philomath project, not only are half-baked ideas king, but quarter-baked ideas are Emperor.

As for summarizing what’s known about s vs. bs, yes, it’s an excellent idea! Besides what I already mentioned in the post, here are the main things that I can think of:

1. Suppose you look at average sensitivity versus average block-sensitivity, where average just means that you average s_x(f) and bs_x(f) respectively over all inputs x∈{0,1}ⁿ. Then the analogous conjecture is false: there can be exponential gaps between the two. I’m pretty sure that was first shown by Anna Bernasconi. My counterexample to the GLN Conjecture gives another example of this phenomenon.

2. Sourav Chakraborty’s Masters thesis shows that the bs(f)=s(f)^O(1) conjecture holds for the class of minterm-transitive Boolean functions — indeed, s(f)=Ω(n^1/3) for that class.

3. Gotsman and Linial showed the s vs. bs problem equivalent to a different problem on the Boolean hypercube.

I’ll be happy to write some more about 1 above; anyone else want to write something about 2 and/or 3?

Also, what am I missing?
Scott Says:
Comment #4 July 13th, 2010 at 12:14 pm
BTW: Elad, I’m assuming that in your point 5 you meant to switch s(f) and bs(f)? 🙂
Scott Says:
Comment #5 July 13th, 2010 at 12:21 pm
Tim, thanks for joining!!

Yes, the nontriviality condition is essential, for exactly the reason you say.

My student Alex Arkhipov conjectures that the nontriviality condition could be relaxed, to merely require that, for each of the d directions, there’s at least one red/blue boundary as you move along that direction.

It seems plausible that his weaker condition suffices. Of course, lower-bounding s(C) with the stronger “nontriviality condition” would be enough to get the desired conclusion about the sensitivity vs. block-sensitivity problem.
gowers Says:
Comment #6 July 13th, 2010 at 1:35 pm
To continue in a very simple vein, here is a proof that a non-trivial colouring of Z^2 must give rise to a point with two neighbours of different colours.

Suppose that this is not the case. Let (x,y) be red and (x+1,y) blue. (The non-triviality condition guarantees that WLOG such a pair exists.) Then (x,y+1) must be red (or (x,y) would have two blue neighbours) and (x+1,y+1) must be blue. By induction in both directions, the entire vertical line containing x is red and the entire vertical line containing x+1 is blue. We can also find a WLOG red point (u,v) such that (u,v+1) is blue and get two horizontal lines, one red and one blue. Contradiction.

This of course implies the same conclusion (that some point must have at least two neighbours of the opposite colour) in d dimensions for all d > 1. So perhaps the first “interesting” case is to show that if d=3 then some point has at least three neighbours of the opposite colour — if that’s true.
gowers Says:
Comment #7 July 13th, 2010 at 1:43 pm
And this is a question, sufficiently different from the previous comment to belong to a different comment.

Suppose you use the following stronger non-triviality hypothesis: that the origin is red, and all points sufficiently far from the origin are blue. Is this likely to have a significant impact on the problem, or is it morally the same question?

Another question I have is this. One might expect a linear lower bound, for the following type of geometrical reason. Suppose that the boundary between the red set and the blue set is reasonably smooth. Then somewhere it will either have a normal that points in an oblique direction like (1,1,1,…,1). But then the set will look locally like the set of all points with coordinates summing to at most m, and each point on the boundary there has about d/2 neighbours of the opposite colour.

Obviously this geometric intuition is completely wrong, since there’s a sqrt{d} upper bound. But is it possible to explain in a pithy way (as opposed to simply saying go and look at the counterexample) why it is wrong? (I have a similar difficulty with the famous result of Kahn, Kalai and Linial, where the best example is not given by chopping up the cube using a hyperplane normal to (1,1,…,1).)
Andy D Says:
Comment #8 July 13th, 2010 at 2:00 pm
Hi Tim,

“Suppose you use the following stronger non-triviality hypothesis: that the origin is red, and all points sufficiently far from the origin are blue. Is this likely to have a significant impact on the problem, or is it morally the same question?”

This strongly impacts the conclusion: then the sensitivity is at least d. For, let’s start at the origin (= blue), staying in the blue set for as long as possible, and always increasing our distance from the origin. Eventually we get stuck. This means there’s a red point a lattice point away from us along every axis.

As a simple example where there’s a ‘smooth boundary’ but sensitivity is low, consider the red set defined by the condition [x_1 > 0] on the first coordinate. This coloring fails your first condition, of course.
Andy D Says:
Comment #9 July 13th, 2010 at 2:01 pm
Sorry to reverse colorings there 😛
Ashley Montanaro Says:
Comment #10 July 13th, 2010 at 2:02 pm
Hi Scott,

Great idea to choose this problem! I should admit to also going on s-vs-bs binges on a semi-regular basis, with a distinct lack of success. However, my most recent progress-less idea seemed to have a bit potential than previous ones, so in the spirit of half-bakedness, I’ll sketch it out here. Apologies for the length! Also, to keep it semi-concise, I’m going to use some standard notation in the literature on Fourier analysis of boolean functions (see Ryan O’Donnell’s course of the same name for definitions etc.).

There are many other complexity measures of boolean functions which *are* known to be polynomially related to block sensitivity (see the lovely review paper by Buhrman and de Wolf, “Complexity measures and decision tree complexity: a survey” for some examples). In particular, to prove the s-vs-bs conjecture, it suffices to show that the sensitivity of a boolean function is close to its degree as a polynomial over the reals. It’s known that s(f) can be as low as O(sqrt(deg(f)), and the conjecture is that this is the largest separation possible.

Anyway, the reason why polynomial degree might be related to sensitivity is rooted in Fourier analysis. The function S, where S(x) is the sensitivity of f at x, and f is a boolean function f:{0,1}^n -> {+1,-1}, is the convolution of a “discrete derivative” function D and f. Define “directional derivatives” d_i, where d_i(0) = 1/2 and d_i(e_i) = -1/2 (e_i being the bit-string with a 1 at position i, and 0 elsewhere), and set D = sum_i d_i. Then S is just D convolved with f, and D has a very simple Fourier expansion. This seems to suggest that Fourier analysis is a natural way to study S.

A first thing to do with this idea is to make the sensitivity more “non-local” by taking higher powers of D. In particular, if we do this enough times, we can approximate the degree of f! (In an attempt to thwart HTML comment gremlins, I’ll write the inner product between f and g as [f,g], f convolved with g as f * g, and use D^{*m} to mean the function D convolved with itself m times in what follows.) It turns out that, looking at the Fourier side,

[ f, D^{* m} f ] = sum_S |S|^m \hat{f}(S)^2

and thus, as m goes to infinity, ([f, D^{* m} f] )^(1/m) approaches deg(f).

So it would suffice to show

[ f, D^{* m} f ] = O( s(f)^(2m) ),

for example, to show deg(f) = O(s(f)^2). This would follow from showing

[ f, D^{* m} f ] is at most s(f)^2 [ f, D^{* (m-1)} f ].

This seems plausible, because we have from Holder’s inequality

[ f, D^{* m} f ] is upper bounded by || D * f ||_\infty || D^{*(m-1)} * f ||_1= s(f) || D^{*(m-1)} * f ||_1.

So if it were true that the sign of (D^{*(m-1)} * f)(x) were the same as the sign of f(x) everywhere, we would be done, as then

|| D^{*(m-1)} * f ||_1 = [ f, D^{* (m-1)} f ].

However, this need not be true… but perhaps these two quantities could be related somehow. As you’ll no doubt have guessed, I couldn’t get it to work 🙂

Anyway, that’s it; sorry again for the length, and also for the unexplained and no doubt soon-to-be-chewed-up notation…
TCS reminder Says:
Comment #11 July 13th, 2010 at 3:27 pm
Please be aware also for the attempt of the TCS community to build up its own “mathOverflow” site. We currently have about 40% (~300 users), for initiating this project.

MathOverflow is great, but I think a site for only TCS might be a good idea after all.

If you support this, please go HERE and commit!
Scott Says:
Comment #12 July 13th, 2010 at 3:33 pm
Hi Tim,

First, thanks for blogging this! (Now your post, my post, and MathOverflow all link to the other two.)

Obviously this geometric intuition is completely wrong, since there’s a sqrt{d} upper bound … as opposed to simply saying go and look at the counterexample) why it is wrong?

Besides Andy’s point—that what I hereby christen the Wall Counterexample (all blue if the first coordinate is less than 1 and all red otherwise)—has a “smooth boundary” between red and blue and yet sensitivity=1—there’s just one other general comment I can make. Namely, I haven’t had any success at all so far at porting “continuous intuitions” onto this problem—sensitivity seems like an extremely discrete notion, dependent on the neighborhood structure of Z^d.

As for why the sensitivity bound should be O(√d), rather than O(d^1/3) or whatever, there my only intuition comes from the counterexample itself. Namely, it’s possible to partition the coordinates into arbitrary “blocks”, B₁,…,B_k, such that

(1) For each i, there exists a blue point that’s sensitive to every coordinate in B_i, and
(2) for every set of coordinates S that intersects each B_i at exactly one coordinate, there exists a red point that’s sensitive to every coordinate in S.

This general sort of construction clearly gives you s(C)=O(√d) but no better, and extrapolating from one data point, it’s obviously the best possible… 🙂
Scott Says:
Comment #13 July 13th, 2010 at 3:59 pm
Just a suggestion: maybe we should take further discussion over to MathOverflow, given WordPress’s tendency to mangle everyone’s comments?
gowers Says:
Comment #14 July 13th, 2010 at 4:14 pm
Re the Wall counterexample, my suggestion was not that a smooth boundary on its own is enough (since I too had mentioned the Wall counterexample) but that in conjunction with the non-triviality conditions a smooth boundary would be forced to be oblique somewhere. And Andy has basically pointed out that in a sense this intuition is correct: if the set of red points is bounded then somewhere the sum of all the coordinates will be maximized for a red point, and then it will have d blue neighbours.

So this shows that the red and blue points have to be thoroughly mixed up. I find it slightly troubling that the example that gives the sqrt{d} bound is so hard to think about geometrically — it needs a fairly high dimension to make sense and is defined completely combinatorially. But I still wonder if it might be worth trying to get a mental picture of just how the red and blue points are managing to form boundaries that never seem to point in directions that aren’t aligned with most of the coordinate axes.
gowers Says:
Comment #15 July 13th, 2010 at 4:16 pm
My only anxiety about taking the discussion over to Mathoverflow is that the sort of tentative and possibly nonsensical things that are perfectly appropriate Polymath comments are not in the usual style of MO. But maybe that doesn’t matter too much.
Andy D Says:
Comment #16 July 13th, 2010 at 4:27 pm
“(since I too had mentioned the Wall counterexample)”

Sorry to’ve missed that.
Scott Says:
Comment #17 July 13th, 2010 at 5:41 pm
Thanks, Tim!

You actually brought up a general question I had about MO: can and should it be used for working collaboratively on hard questions? It has a number of features that make it very attractive for that purpose, not least of which is the beautiful HTML/TeX integration.

Reading the documents on the MO site, the single most important requirement seems to be that questions be specific and discussions have a clear goal—and a polymath-style project certainly wouldn’t violate that! On the other hand, there also seems to be a rule against “open questions”—or rather, if you ask one, then you have to be willing to accept “it’s open” as a satisfactory answer. But of course, there’s a huge gray area between the Riemann Hypothesis and a straightforward clarification question. That gray area is populated by things like my question, which is new but related to an open problem that’s somewhat well-known in one community, and not at all known in others.
gowers Says:
Comment #18 July 13th, 2010 at 6:12 pm
My thought was really that posts to MO are called “answers” whereas a Polymath-style post is more like a contribution to a conversation. Another possibility would be to use a WordPress blog. Probably you don’t want to migrate your blog over to WordPress, but there is a dedicated Polymath blog as well, and you’d be welcome to become some kind of moderator there (giving you full editing powers etc.). I’m not saying that’s what you should do, but it’s an option that’s there.
gowers Says:
Comment #19 July 13th, 2010 at 6:13 pm
It might also be worth going over to the meta part of MO and see what the MO people think about this possible use of the site.
Elad Verbin Says:
Comment #20 July 13th, 2010 at 7:07 pm
Scott, thanks for correcting my HTML! I’ll need to learn to do it the right way.

Here is a special case of the geometric variant that might be easier: Given a coloring C of Zd by{red,blue}, define its’ “color regions” to be maximal connected monochromatic sets of points. (Note that regions can be infinite). My question is: suppose that for each coordinate i, the region that includes the origin touches a region that contains a blue point on the i-th axis. Can you prove the conjecture for these cases?

Note that I restricted the set of inputs in such a way that now the “origin red region” (the one that contains that origin) *directly touches* regions that contain the promised blue points on the axes. Does this help us in any way? Maybe we have more geometric properties to work with now?
Scott Says:
Comment #21 July 13th, 2010 at 7:29 pm
Elad: I like your variant! Let R be the “origin red region.” Then indeed, one could imagine some sort of “contraction argument,” which started with d points in R that touched a blue region along axes 1 to d respectively, and then moved those d points closer and closer together (still staying on the boundary of R), until we found a single point on the boundary of R that touched blue points along many different axes simultaneously.

One obvious problem is that I don’t see where such an argument, if it worked, would lose the factor of √d that we know it has to lose! But the idea (as stated) is so vague that maybe we shouldn’t even worry about that yet.
Scott Says:
Comment #22 July 13th, 2010 at 7:35 pm
Tim:

I find it slightly troubling that the example that gives the sqrt{d} bound is so hard to think about geometrically — it needs a fairly high dimension to make sense and is defined completely combinatorially.

Yes, I had exactly the same worry! I came up with the Z^d problem specifically in order to move away from “Boolean function combinatorics,” and into something more geometric that one might have a better handle on. But then thinking about the √d example seems to send us right back to combinatorics! Indeed, as I mentioned on MO, I obtained that example by starting with a known example from the Boolean function world, and then applying my reduction. I almost certainly couldn’t have come up with it by “thinking geometrically.”

Then again, maybe the situation isn’t so bad. For note that the number of dimensions needed to exhibit the counterexample is not that large: it’s 4, or arguably even 3. (More about that in my next comment!)
Scott Says:
Comment #23 July 13th, 2010 at 8:32 pm
Let me come back to the situation in low dimensions, which Tim rightly asked about a while ago:

So perhaps the first “interesting” case is to show that if d=3 then some point has at least three neighbours of the opposite colour — if that’s true.

It’s false! Let me now give a coloring in d=3 where each point has at most 2 neighbors of the opposite color. (Indeed, I believe one can even do the same when d=4.)

As a preliminary step, let sx(C) be the number of differently-colored neighbors of x, and let rx(C) be the number of axes along which x has a differently-colored neighbor. Then as I said on MO, clearly rx(C) ≤ sx(C) ≤ 2rx(C). But the situation is better than that: I claim that we can speak completely interchangeably about sx and rx, without even losing a factor of 2.

Why? Well, suppose we have a coloring C that satisfies the nontriviality condition, such that

r(C) = maxx rx(C) = k.

Then here’s how to get a new coloring C’ that also satisfies the nontriviality condition, such that

s(C’) = maxx rx(C) = k

as well. Simply take each point x in C, and “blow it up” into a unit cube of 2d points, each colored the same way as x. The cubes in C’ should have exactly the same adjacency relations as the points in C. So in particular, the sensitivity isn’t affected by this transformation, nor is the nontriviality condition (henceforth NC). Indeed, the only thing the transformation really does is to remove the annoying problem of a point having two differently-colored neighbors along the same axis.

OK, with that out of the way, how do we give a coloring C of Z3 such that r(C)=2? Well, it was a little hard to describe, so I made a graphic of it (and crossposted to the MO site). Enjoy!
Scott Says:
Comment #24 July 13th, 2010 at 8:42 pm
Ashley M.: Thanks for your comment (and sorry for the delay)!

During some my previous s(f) vs. bs(f) binges, I also wondered whether we should forget entirely about bs(f), and concentrate instead on relating s(f) to some other measure that’s known to be polynomially related to bs(f), such as D(f) or deg(f).

Unfortunately, I couldn’t see how to use Fourier analysis to get a handle on the problem. To my way of thinking, the problem is fundamental: Fourier analysis is designed to tell us about the behavior of a function on random inputs. But we know that the average sensitivity can be exponentially smaller than the degree! So for sure we’ll need techniques that talk about the sensitivity of the worst-case input. But what analytic techniques are there that could plausibly do that? Can you explain how your approach, if it worked, would get around this problem?
Scott Says:
Comment #25 July 13th, 2010 at 8:52 pm
Btw, maybe it would be helpful if I gave an example of a Boolean function f whose average sensitivity is exponentially smaller than its average block-sensitivity. This isn’t hard: consider the TRIBES function, which is defined as follows. We partition the input bits into 2^m “tribes” of m bits each. We then set f(x)=1, if and only if there exists a tribe that’s all 1’s.

By construction, f(x)=0 with constant probability over x (specifically, ~1/e). And conditioned on f(x)=0, we can flip the 0 bits within any tribe to get a 1-input. Therefore the average block-sensitivity of f is ~2^m. On the other hand, it’s not hard to see that the average sensitivity of f is only ~m.
asdf Says:
Comment #26 July 13th, 2010 at 11:48 pm
Terry Tao just ran a mini-polymath about a math olympiad question, using a combination of a wordpress blog and a wiki. See this post about it, which points to the actual polymath thread and the wiki:

http://terrytao.wordpress.com/2010/07/08/mini-polymath2-discussion-thread/

I think some of the participants are looking into developing new software just to make this sort of thing easier.
Raoul Ohio Says:
Comment #27 July 14th, 2010 at 12:33 am
For a 14 second break from deep thinking, check out the flyby of 21 Lutetia a couple days ago:

http://www.sciencenews.org/view/generic/id/61072/title/Snapshots_of_the_past_#video

It is similar to flyby views in SF shows, but it is the real deal. It actually looks clean.
Amit Chakrabarti Says:
Comment #28 July 14th, 2010 at 12:36 am
Scott: Here’s some recent work done with Dartmouth undergrad Karn Seth, who got a Senior Thesis out of it. He started by essentially rediscovering Kenyon-Kutin! http://www.cs.dartmouth.edu/reports/abstracts/TR2010-673/
Kaveh Says:
Comment #29 July 14th, 2010 at 12:43 am
… what we need is just (*) s(Lift f)
Kaveh Says:
Comment #30 July 14th, 2010 at 12:45 am
Hi Scott,

It is also nice to see you blog again. 🙂
We were talking about your post with a friend, and would like to join the fun.

1. Elad mentioned a way of lifting the function. Can the lifting used in communication complexity be helpful here? It seems that the function f used for s(f) = O(sqrt(n)) is a lifted function.
( we have a function f, and (x,y) ( |x| = m, |y|=C(m,n) ), Lift(phi,f,x,y) is phi(f(x_1),…f(x_C(m,n)),y). In one version phi just returns the y-th input. In the function used by Rubinstein, phi is OR. In Elad’s tree construction it is more involved because we allow f to be nested, i.e. phi can apply f to arbitrary values.)

For Elad’s tree lift, s(Lift f) = s(f)^k is not clear. The problem is that for the value of x that has the maximum sensitivity, some sensitive bits are 1 and others are 0, so we need to be able to flip the value of the function both from 0 to 1 and from 1 to 0, i.e. 0-sensitivity and 1-sensitivity should be large for the function. The same problem arises with the bs. We can fix one of these problems with making x zero and output of the function 0, but not both of them. On the other hand, what we need is just (*) s(Lift f) leq s(f)^k leq bs(f)^k leq bs(Lift f). This does not give a better relation though.
The trivial attempt to make k grow by input size in the tree construction does not seem to work. The problem is that in *, both sides are raised to power k, so the relation between s and bs remains the same. We probably need a better estimate for bs(Lift f) or s(Lift f). Or some other way of lifting (although Elad’s versions feels to be optimal in some sense.)

2. I thought that Simon’s upperbound (lg(bs(f)) – lg lg(bs(f) +1)/2 leq s(f) means that s(f) should grove when bs(f) groves, so if I am not making a mistake, in the geometric version, for sufficiently large d (=bs) there should be a point with arbitrary large neighbors of different color? Does not it give a logarithmic lowerbound on s (w.r.t. d)?

Considering restricted class of functions might be helpful. So here is a question:
3. What is known about s/bs of restricted class of functions like AC^0, TC^0, … ?

(Scott, could you please remove the previous ones, I also used the less than sign. Sorry.)
Gil Says:
Comment #31 July 14th, 2010 at 1:12 am
Indeed a very nice question. Is it clear that the block sensitivity (or average block sensitivity) is bounded above by an exponential function of the sensitivity (average sensitivity)?
Noah Snyder Says:
Comment #32 July 14th, 2010 at 1:33 am
Having no good math ideas yet, let me respond to your question “Can and should MO be used for working collaboratively on hard questions?”

I think the consensus on MO is that it can be a good component of a polymath style attack on a problem, but that it’s best use in such a project is farming out smaller manageable problems rather than hosting the main discussion. The reason is that the MO interface is not optimized for discussion: voting rearranges comments and answer, comments are not all displayed, etc. What it is optimized for is answers to specific questions.

So I think your initial MO post was good for advertising the question (and who knows, maybe someone would just solve it straight away), but the discussion is probably best hosted here until you come up with specific sub-questions which you then post separately to MO.
Ashley Montanaro Says:
Comment #33 July 14th, 2010 at 3:19 am
Scott: Indeed, the fact that (intuitively) Fourier analysis deals with the average case rather than the worst case does seem to be a problem. This was what the idea of taking higher powers of this derivative operator was supposed to mitigate – the vague idea being that it might be possible to use the sensitivity to upper bound how large (D^{*m} * f) can be, in terms of the size of D^{*m-1} and s(f), turning the problem into just a question of proving an inductive step. As I described above, it’s *almost* possible to do this using Holder’s inequality, but it doesn’t quite work.
gowers Says:
Comment #34 July 14th, 2010 at 3:22 am
Scott, are the sets in your Z^3 example aligned properly? It looks to me as though the origin is red but has four blue neighbours (one to the right, one to the left, one above and one below).
Ashley Montanaro Says:
Comment #35 July 14th, 2010 at 3:24 am
Gil: In the paper “Sensitivity, block sensitivity, and l-block sensitivity of Boolean functions” by Kenyon and Kutin, they prove what I believe is the best known relationship between s(f) and bs(f): there can be at most an exponential gap.

In fact, it’s quite easy to see that the gap can be at most a (larger) exponential: by a result of Simon, the sensitivity is lower bounded by O(log(n)) for any boolean function depending on n variables, and the block sensitivity is at most n.
Gil Says:
Comment #36 July 14th, 2010 at 3:40 am
Thanks Ashley. One example we should look at in this context is the Ajtai-Linial example of a Boolean function described by a random depth 3 circuits with certain parameters. This function is rather balanced and it has average sensitivity (log n)^2. I wonder what is the maximum sensitivity.
Ben Wieland Says:
Comment #37 July 14th, 2010 at 3:41 am
gowers, he has switched to r, rather than s, so opposite directions only count once. You can get back to s by replacing each box by a size 2 cube of the same color.

I posted it on MO, but that example seems pretty complicated. Can’t you get by with a sea of red and three blue lines: (0,2,*), (2,*,0), (*,0,2)?
Scott Says:
Comment #38 July 14th, 2010 at 3:41 am
Noah (#31): Thanks!! That’s exactly the sort of information I was looking for.
Scott Says:
Comment #39 July 14th, 2010 at 3:47 am
Gil (#30):

Is it clear that the block sensitivity (or average block sensitivity) is bounded above by an exponential function of the sensitivity (average sensitivity)?

Great question! No, it’s not obvious, but Kenyon and Kutin proved such an exponential relation (improving a weaker exponential relation due to Hans-Ulrich Simon, I think).

It would be great if someone could explain their argument here. If no one has by the time I wake up tomorrow, I will. 🙂
Scott Says:
Comment #40 July 14th, 2010 at 3:52 am
Ashley M. (#34): I see that our comments went past each other!

So that pushes the problem back to explaining Simon’s result: why is s(f) at least log(n) for any Boolean function f depending on n variables?

I went through the argument once, and I remember it as being quite tricky. Just like for the s vs. bs problem, you somehow need to find an input that has large sensitivity, and Simon implicitly gives an (exponential-time) algorithm to do that.

Now that I think about it, the relevance to what we’re trying to do here could hardly be greater…
Scott Says:
Comment #41 July 14th, 2010 at 3:55 am
gowers (#33): Yes, what Ben Wieland (#36) said!
Scott Says:
Comment #42 July 14th, 2010 at 3:59 am
Ben Wieland (#36): Yes, you’re completely right! Your counterexample works and is much simpler than mine. And it has the further advantage of generalizing much more easily to d=4. In your notation, a coloring of Z^4 with r(C)=2 can be obtained by taking a sea of red, together with the following four blue planes:

(0,2,*,*)
(2,0,*,*)
(*,*,0,2)
(*,*,2,0)

The question remains of whether r(C) is 2 or 3 when d=5.
Scott Says:
Comment #43 July 14th, 2010 at 4:19 am
Kaveh (#30): Welcome! To take your questions in order:

1. Before we get into this, what’s the goal with lifting Boolean functions? Are we trying to construct a counterexample that beats Rubinstein’s?

2. A lower bound on the geometric sensitivity s(C) would indeed imply the same lower bound on the Boolean sensitivity s(f). But the converse isn’t clear at all! (At least not to me.) So, if we want a lower bound of the form s(C) ≥ log(d) (and we do!), we’ll probably need to delve into Simon’s proof and see if we can adapt it to the geometric setting.

3. That’s a great question! Of course, we know a great deal about the average sensitivity of AC0 functions. But I don’t know if we know anything about the ordinary s(f) and bs(f) for AC0 functions, that we don’t also know for arbitrary f.
Klas Markström Says:
Comment #44 July 14th, 2010 at 4:49 am
This is a nice problem.

One question that immediately comes to mind is the following:

Given a colouring C1 in dimension d1 and a colouring C2 in d2, define a direct product colouring in dimension d1+d2 by saying that a point (x1,x2) has colour C1(x1)+C2(x2) mod 2.

What is the best bound one can give for the sensitivity of the direct product colouring in terms of the sensitivities of C1 and C2?

One could consider the same question for a tensor product colouring.
Ashley Says:
Comment #45 July 14th, 2010 at 5:39 am
Here’s a brief sketch of Simon’s argument that, for any boolean function f depending on n variables, s(f) = Omega(log n). The idea is to say that, if s = s(f) is low, then in any direction i, there must be “relatively many” points x such that f(x) != f(x+e_i). Imagine that in each direction there are at least K points with this property. Then, summing over i, we’d have that n * K is a lower bound on (s * 2^n), which we get by summing the number of different neighbours over all points x.

Simon shows is that K is at least 2^(n – O(s)), so we get that s(f) * 2^(O(s)) is at least n, and the result follows.

Why should this be true? The basic idea is that, if s is low, then if f(x) != f(x+e_i) for even one point x, this forces there to be many “nearby” points y such that f(y) != f(y+e_i). For example, consider an x such that x_i = 0, f(x)=0, f(x+e_i)=1. Then there can be at most s-1 points y neighbouring x such that y_i = 0, f(y) = 1, and at most s-1 points z neighbouring x + e_i such that z_i = 1, f(z) = 0. So you can work out that there must be at least n-2s+1 points y neighbouring x such that y_i = 0, f(y)=0, f(y+e_i)=1.

This argument can be repeated for points neighbouring the initial point x, and using an isoperimetric-type result for the cube one can show that, as we have a set of vertices such that each vertex has a “lot” of differently-coloured neighbours, the size of this set must be “large” (exponentially big in (n-2s+1)).
Andris Says:
Comment #46 July 14th, 2010 at 9:01 am
Two notes:

1. A small (lower-order) improvement to Rubinstein’s example is possible. The new function f has (2k+1)^2 variables, which are divided into (2k+1) blocks with (2k+1) variables in each block. f=1 if there is a block x_1, …, x_{2k+1} such that either
a) x_{2i-1}=x_{2i}=1 for some i from 1 to k, or
b) x_{2k+1} = 1.

This function has bs(f)=(2k+1)(k+1) and s(f)=2k+1. Hence, bs(f) = (1/2+1/(4k+2)) (s(f))^2.

For the original Rubinstein’s function, bs(f)= 1/2 (s(f))^2.

2. Computer search shows that the example above is the best separation that is achievable for function with at most 12 variables.

Both of those results are from undegraduate project of Madars Virza which I recently supervised.
Tyson Williams Says:
Comment #47 July 14th, 2010 at 9:27 am
Here is my quarter-baked thought.

Does anything interesting come from generalizing the Rubinstein function to higher dimensions? That is, the Rubinstein function arranges the bits in a square. What if the bits where arranged in a (hyper) cube. Is a larger gap achieved?
gowers Says:
Comment #48 July 14th, 2010 at 9:53 am
@Tyson Williams: I thought about that while having a bath last night and briefly thought I could improve the example to get a bound of n^{1/k} for any fixed k. I then realized that I had made a silly mistake and couldn’t improve on k=2.

For what it’s worth, here is what I tried. It’s just one of many things one might try, so it doesn’t necessarily kill off your suggestion.

Let m be a positive integer and let’s name the points things like ijk, where i, j and k are between 1 and m (so there are m^3 points). I’m talking about Boolean functions on [m^3].

Now let us colour a Boolean function f red if there exists i such that for every j there exists k such that f(ijk)=1. It is straightforward to check that if f is red, then there is a set of m bits such that as long as you don’t change those bits then it remains red. (To find that set of bits, choose i such that for every j there exists k such that f(ijk)=1 and for every j choose k such that f(ijk)=1.)

For some reason I persuaded myself that by a piece of magic, if f was blue then there were still m^3-m bits that you could afford to change. But that’s rubbish — for each i you have to choose a j such that for every k you leave the bits ijk alone. And that adds up to m^2 bits. Since the product of the number of bits needed in the two cases is the total number of bits, it’s clear that there is no improvement.

Incidentally, that suggests a conjecture. I’ve given no thought to it at all, so it may well be nonsense. But can we generalize Scott’s conjecture in Z^d to the assertion that if ab=d then either there is a red point with a blue neighbours or a blue point with b red neighbours (up to a constant, say)?
Dániel Says:
Comment #49 July 14th, 2010 at 10:09 am
There is one advantage of Scott’s R^3 example over Ben’s simpler one. It is a tiling of a 4x2x2 rectangular box. We can assume in general that our low-sensitivity coloring is periodic. But what can we say about the size of the tiles? Is it easier to lower bound s(C) if we assume tiles with small sides? The smallest asymptotically interesting case seems to be the 4x…x4 box. Would it say anything interesting about the original question if we solved this one?
Gil Says:
Comment #50 July 14th, 2010 at 10:17 am
Inspired by Ajtai-Lineal example we can think about something like this. f is equal 1 if n functions f_1 …f_n all all equals 1; f_j equal 1 if one of g_j1,…,g_jn is equal 1
and every g_ji is a tribe function based on dividing the n variables into tribes of size roughly log n assigning each variable a random sign and setting g_ji be 1 if one of these tribes agree that f=1.
we can set the size of the tribes that prob (g_ij=1)=logn/n
and so prob f_j=1-1/n so prob f=1 is a constant.

Now the average sensitivity is something like (log n)^3 I wonder what the maximum sensitivity is. Maybe it is also polylog? I suppose finding function in AC_0 is max sensitivity polylog is known and of interest but I dont know it. Of course maybe it is easy to see that the max sensitivity is large.

Then we will worry about block sensitivity.

goes roughly like this. Divide the n variables into tribes in n different ways. Now, the function f is 1 if in each
Scott Says:
Comment #51 July 14th, 2010 at 3:58 pm
Dániel (#49):

The smallest asymptotically interesting case seems to be the 4x…x4 box. Would it say anything interesting about the original question if we solved this one?

Great question! Yes, it would imply a lower bound on the sensitivity in terms of what Kenyon and Kutin call the 2-block-sensitivity: basically, the block sensitivity where all blocks are restricted to have size at most 2.

Alas, Kenyon and Kutin already proved directly that Rubinstein’s function gives the optimal gap between sensitivity and 2-block-sensitivity.

In general, if you can prove a lower bound when the coloring is restricted to a 2^k…2^k periodic box, then you get a corresponding lower bound on the k-block-sensitivity.
Scott Says:
Comment #52 July 14th, 2010 at 4:07 pm
One other development. I have a counterexample to Alex Arkhipov’s conjecture, that if a coloring of Z^d has a red-blue boundary in each of the d axis directions, then that’s enough to imply the existence of a point with sensitivity k√d.

The counterexample is based on the standard example of a Boolean function that depends on all d variables but has sensitivity only log(d). Namely, the INDEX LOOKUP function f(x₁,…,x_d): examine the first log(d) bits of the input, treat those bits as an index to one of the remaining d-log(d) bits x_i, and then output x_i.

Take a d-dimensional cube, of size 2x2x…x2, and color each vertex blue if the corresponding f(x₁,…,x_d) equals 1 and red otherwise. Then tile Z^d with that cube. That will give a coloring C such that s(C)=O(log d), yet that has a red-blue boundary in each of the d axis directions.

The conclusion is that, if the conjecture is true, then it must be important somehow that there are blue points on the actual axes that pass through the red origin point.
Scott Says:
Comment #53 July 14th, 2010 at 5:59 pm
Andris (#46): That’s extremely good to know; thanks so much for posting!

Out of curiosity, how did Madars manage to check all 2⁴⁰⁹⁶ Boolean functions on 12 inputs? There must have been a very clever idea for pruning the search space!
Rudimentary M3 Says:
Comment #54 July 14th, 2010 at 9:41 pm
How can we prove the problem is NP–complete by reduction from Vertex-cover?

Problem: Given a collection of sets { S1, S2 ,…., Sn} and a positive integer K. Does there exist a set T with at most K elements with T ∩ Si, 1 in?

[we can use the edges on a graph to create the sets in the collection]
Andris Says:
Comment #55 July 15th, 2010 at 9:07 am
Scott (#53), Madars reduced the problem to SAT by taking an instance of SAT with 2^{12} variables (corresponding to values of the function f(x_1, …, x_12)) and constraints that describe bs(f) \geq b and s(f)\leq s. He then put the resulting SAT instances into a SAT solver.
Ian Says:
Comment #56 July 15th, 2010 at 9:41 am
Taking up Daniel’s train of thought:

Any given C can be converted to a periodic coloring C’ with r(C)=r(C’) as follows: from C take the d-dimensional rectangular prism with corners at the origin and at the blue points closest to the origin along each of the coordinate axes. Then C’ is produced by repeatedly reflecting the prism about the planes containing its faces. [Alternatively you could directly preserve s(C) by reflecting about planes distance 1/2 off each face.]

So WLOG we can consider just colorings of d-dimensional rectangular prisms whose coloring along edges emanating from the origin is all red except at their non-origin endpoints..

Using this idea, Scott’s d=3 example gives rise to a coloring generated by just a 3x2x2 cube.
Seamus Says:
Comment #57 July 15th, 2010 at 12:04 pm
OK. I’ve not read through the above comments, but I wanted to write this down before I forgot it.

It feels like there should be a related idea to this kind of sensitivity: “permutation sensitivity”. The n-place OR and AND functions are invariant under permuting the indices of their arguments. Other functions won’t be. Permutations have some structure that it might be nice to be able to use.

I don’t know if this even counts as a quarter-baked idea. It’s certainly 1/(2^n) baked for some n, but I can’t say much more than n \ge 2…
Scott Says:
Comment #58 July 15th, 2010 at 12:41 pm
Madars reduced the problem to SAT … He then put the resulting SAT instances into a SAT solver.

As I said, a very clever idea! 😉
Yuval Says:
Comment #59 July 15th, 2010 at 3:20 pm
Just a short note: Simon provides an extremely simple example of a function with low sensitivity. His function has two arguments, and index of length log(n), and a vector of length n. The function uses the index to, ahem, index the vector. The resulting sensitivity is log(n)+1, which is roughly log(log(n) + n).

I only mention this example since this is exactly the device used to lift functions (re Kaveh’s comment); the other device, XORing several inputs, is not helpful here.
Ian Says:
Comment #60 July 15th, 2010 at 4:20 pm
Concerning Elad’s idea of color regions/maximum connected monochromatic components:

Is is possible that a coloring that minimizes s(C) must have exactly one red component and one blue component (except d=1 of course)? Unless I’m being dumb it seems to hold for the examples we’ve seen so far.

Note that when traversing a particular monochromatic component, at each point you can continue within that component by moving in at least 2d-s(C) directions, so they certainly seem to spread out.

I’m not sure yet exactly how this would get us to the final solution, but since I tend to think geometrically I guess I’d like to have more geometric intuition about the problem, and if I’m wrong maybe a new type of counterexample would provide some insight…
Klas Markström Says:
Comment #61 July 15th, 2010 at 5:25 pm
Regarding Ian #60.

I think you might be right. If there is a finite monochromatic component I believe the extreme points of its convex hull, which are points belonging to the component, will have many neighbours outside the component.
Ian Says:
Comment #62 July 16th, 2010 at 11:46 am
Klas,

That sounds right to me, to prove that if s(C)
Ian Says:
Comment #63 July 16th, 2010 at 11:48 am
Klas,

That sounds right to me, to prove that if s(C) is less than d then there are no finite components. But can you use that to prove that there’s only one component of each color?

Thinking about this a bit more, I realize you can make a “double wall” example with s(C) = 1 and two disconnected components of the same color. So again we have a situation where a hypothesis is supported by the known examples with blue points on the coordinate axes, but not if you do away with that requirement. It makes me wonder if maybe we just don’t have exotic enough examples satisfying the coordinate condition…
Ben Wieland Says:
Comment #64 July 16th, 2010 at 2:51 pm
My d=3 r=2 example of 3 blue lines in a sea of red has three components of blue. If you are less careful about which lines you use, you might end up with two blue components.

In higher dimensions, you can get isolated components with even very good sensitivity. Take an optimal configuration in dimension d-1 and insert it into a d-dimensional sea of red. This worsens the sensitivity by 1. Then add a blue hyperplane parallel to the old stuff, and thus disconnected to whatever is going on there.

You can iterate this to get O(sqrt(d)) components and O(sqrt(d)) sensitivity, but I think you ought to be able to achieve O(sqrt(d)) components without sacrificing the leading coefficient of the sensitivity.
Klas Markström Says:
Comment #65 July 16th, 2010 at 3:39 pm
Ian #63, Yes my argument was intended for showing that there are no finite components.

Ben #64, Can’t you just sandwich copies of the optimal configurations from dimension d-1 with monochromatic hyperplanes and get infinitely many components in dimension d, and only increase the sensitivity by two?
Ben Wieland Says:
Comment #66 July 16th, 2010 at 6:07 pm
Sure, but that’s boring. One way of making that precise is to only count one blue component that crosses each axis. All other blue components can be turned red without worsening the example. Similarly, a red component not containing the origin in unnecessary.
Klas Markström Says:
Comment #67 July 18th, 2010 at 4:46 am
It is easy to see that s(d+1) is at most s(d)+1, just take an optimal d colouring and extend it as a d=1 colouring in the new direction, but is it known whether s(d) is monotone in d or not?
David Speyer Says:
Comment #68 July 18th, 2010 at 4:08 pm
Here is a half-baked attempt to improve Rubinstein’s example. Let F be the set of pairs (point, line passing through point) over a projective plane with q^2+q+1 points. Recall that such a pair is called a “flag”.

There are roughly q^3 points in F. I’ll take my n to be 2 |F|, so there are two bits of my input for each element of F. My goal is to build a function f with block sensitivity roughly q^3, and sensitivity roughly q.

The function f is 1 if there is a flag (p, \ell) both of whose bits are 1 but, for any flag (p’, \ell) or any flag (p, \ell’), both bits must be zero. Clearly, the block sensitivity is at least |F|, as I can start with the zero function and change the two bits of any flag to ones.

What is the sensitivity? If f(x) is 1, then there are only O(q) bits that can stop f from being 1: Letting (p, \ell) be the flag which makes x be 1, the only bits that matter are those for the q flags of the form (p’, \ell) and those for the q flags of the form (p, \ell’).

Now, what if f(x) is 0? It seems clear that the worst situation is to have a whole bunch of flags for which one bit is set, so that setting the other bit of any of these makes f become 1. Let S be the set of flags for which one bit is set; we want that there do not exist any two elements (p_1, \ell_1) and (p_2, \ell_2) of S with \ell_1 passing through p_2 or \ell_2 passing through p_1.

If we can show there does not exist such a set of flags S with |S| as large as q^{3/2}, then we beat Rubenstein’s example. Random constructions and geometric constructions both seem to fail around |S|=q. (But I only tried for half an hour.) Can anyone show such a set of flags does not exist, or construct one?
David Speyer Says:
Comment #69 July 18th, 2010 at 4:13 pm
I’m sorry, description of the conditions on S are mangled. Let me think for a bit and make the right statement.
David Speyer Says:
Comment #70 July 18th, 2010 at 4:17 pm
OK, right statement is just that you don’t want two flags with the same point or the same line. In other words, the idea here is to
David Speyer Says:
Comment #71 July 18th, 2010 at 4:19 pm
OK, right statement is just that you don’t want two flags with the same point or the same line. How many distinct points can you draw in the projective plane, and a choose a line through each point, without drawing the same line twice?
Klas Markström Says:
Comment #72 July 18th, 2010 at 4:32 pm
David #71,

Create a bipartite graph with the points of the plane as vertices on one side, and the lines as the vertices of the other. Add an edge between P and L if P is a point on the line L.

If I read your final question right you want a maximum matching in this graph. However the graph is regular and a bipartite regular graph has a perfect matching.
David Speyer Says:
Comment #73 July 18th, 2010 at 9:07 pm
Grrr, that was dumb of me. And, even worse, I bet that approximately regular graphs have roughly perfect matchings. Which makes a strategy like this unlikely to work.
Gábor Pete Says:
Comment #74 July 19th, 2010 at 4:15 am
Sorry, this will be a naive comment after having just read the \Z^d description at Tim Gowers’ blog and not having read any of the comments here, nor having actually thought for more than a minute, but maybe that’s OK for polymath, even if it is called philomath…

I have some topology feeling about this problem. It looks vaguely similar to

Sperner’s lemma, http://en.wikipedia.org/wiki/Sperner%27s_lemma,

and to the Alon-West solution of the necklace problem using Borsuk-Ulam, http://www.math.tau.ac.il/~nogaa/PDFS/Publications/The%20Borsuk-Ulam%20Theorem%20and%20bisection%20of%20necklaces.pdf.

Has anything like that been tried? Of course, this is somehow more quantitative now, but maybe that’s why homology groups exist… E.g., there is Robin Forman’s discrete Morse theory approach to evasiveness, Combinatorica (2000), http://www.math.rice.edu/~forman.
gowers Says:
Comment #75 July 22nd, 2010 at 5:05 am
I want to see if anyone will give me feedback on a heuristic argument that sqrt{d} ought to be the right bound. There’s a fairly big chasm at one point, and it may be that people will know examples that demonstrate that there is no way of bridging it.

The vague thought that lies behind the argument is this. Suppose you have two adjacent points of different colours. Let’s say they are x and y=x+e_1, where e_1=(1,0,…,0). And let’s suppose that the colouring has sensitivity k (by which I mean that each point has at most k neighbours not of the same colour). Then there are at most 2k unit vectors e_i with i>1 such that x+e_i and y+e_i have different colours. So if you look at the hyperplane P through x that’s orthogonal to e_1, and the hyperplane P+e_1 through y that’s parallel to it, then you find that if a point z in P has a different colour from its counterpart z+e_1 in P+e_1, then the same is true for almost all of z’s neighbours. So it looks as though points that are coloured differently from their counterparts form a pretty big set. (Actually, if you also look at points that are coloured the same way, you can say the same. So we’ve got a colouring in the hyperplane with similar properties to the original colouring. All I’m saying here is that the symmetric difference of two colourings of sensitivity at most k has sensitivity at most 2k.)

Now comes the leap. We know that one simple way of building a colouring of sensitivity at most k is to colour a codimension-k subspace with one colour and colour its complement with the other colour. Let’s suppose that that is in some sense the “worst” example. In that case we can argue as follows. Let the origin be red. For each i, go along the e_i axis until you first hit a blue point. Then look at the two hyperplanes P_i and P_i+e_i orthogonal to e_1 and containing the two points a_ie_i and (a_i+1)e_i where the change occurs. Then there is a codimension-2k subspace S_i generated by some of the e_j (that do not include e_i) such that every point in S_i + a_ie_i is red and every point in S_i + (a_i+1)e_i is blue (because for some magic reason in the worst case this is what happens — I’m not saying I actually believe that things would be anything like this simple). Let K_i be the set such that S_i=.

Suppose that i is in K_j and j is in K_i. That implies that S_i + a_ie_i has a non-empty intersection with S_j + (a_j+1)e_j (since we can choose (a_j+1)e_j from S_i and a_ie_i from S_j). And that is a contradiction.

Now let me show that if k is too small then we must have such a pair i,j. I’ll get it as follows. Define a sequence i_1,i_2,i_3,… by setting i_1=1 and for each h>1 choosing i_h to belong to the intersection of K_{i_1},…,K_{i_{h-1}}. We can continue to produce this sequence for at least m=n/2k steps (since each K_i has size at least n-2k). But that means that if m-1 is greater than k, then K_{i_m} must intersect {i_1,…,i_{m-1}}, which gives us some j such that i_j is in K_{i_m} and i_m is in K_{i_j}. And that gives us a square-root lower bound for k.

All this argument does is give a proof that a certain very special sort of example cannot give better than a square-root bound. But it feels as though this could be at one extreme of a more general argument, where we are dealing with very structured sets (codimension-2k subspaces). If we try to make the sets less structured, then it feels as though they will have to fill up more dimensions, in a certain sense, so there might be some trade-off to exploit. At the very least, it might be possible to obtain from the above thoughts a statement that some quite general class of examples cannot do better than Csqrt{d}.

This is of course another quarter-baked idea, but in the polymath/philomath spirit I am putting it out as it is rather than trying to work it into something more polished.
Ian Says:
Comment #76 July 22nd, 2010 at 1:24 pm
gowers, I think your definition of K_i got mangled by the comment gremlins. Can you clarify it?
gowers Says:
Comment #77 July 22nd, 2010 at 5:06 pm
Oops. I used angle brackets to denote “subspace generated by” — big mistake. What I wrote could be written less symbolically as follows: let K_i be the set of j that has the property that S_i is the subspace generated by the vectors e_j. (It’s annoyingly hard to express that: I want to say “Let K_i be the set of j such that S_i is the subspace generated by the vectors e_j,” but that sounds as though I am defining K_i to be the nonsensical {j: S_i is generated by the vectors e_j}.)

The following is a little test for my own benefit: $latex \langle e_j:j \in K_i\rangle$.
Tracy Hall Says:
Comment #78 July 22nd, 2010 at 8:57 pm
So far the discussion has been about s(f) as a function of bs(f), but to study their relative asymptotics it seems more natural to define bsmax(s) as the maximum block sensitivity of any Boolean function with sensitivity s; the question is then whether bsmax is bounded by a polynomial in s and in particular whether it is quadratic.

Andris (#46): The slight improvement to Rubinstein you described shows that for odd s, bsmax(s) is at least s*(s+1)/2, for a function on s^2 variables. Is there a general construction for even s that also achieves s*(s+1)/2, using s^2 variables? The “sorted” function Scott gave as an example does precisely this for the case s = 2. In fact a bit of hand calculation shows that bsmax(2) = 3 exactly, and moreover that every function that achieves it can be reduced to Scott’s example, up to symmetry, by replacing variables with constants. (Of course bsmax(1) = 1 uniquely as well, which is related to the wall counterexample.) What exactly did the exhaustive search by Madars Virza show? In particular, did it show that bsmax(3) = 6, with an essentially unique maximizing function up to 12 variables? That would indeed start to be suggestive that bsmax is just the sequence of triangular numbers, although on the other hand we know that there must be blocks of size at least 3 to do better than quadratic, which doesn’t leave a whole lot of room for a counterexample on 12 variables.
gowers Says:
Comment #79 July 23rd, 2010 at 5:23 am
In my previous comment (not including the one where I corrected the garbled text) I made the big and unjustified assumption that a set of sensitivity at most k must contain a k-codimensional affine subspace. In fact, I assumed even more than that — basically that every point in that set must be contained in such a subspace. But it occurs to me that there are things one can say that point in that direction, so perhaps there is a statement along those lines that is both provable and sufficiently strong to enable a modification of the argument to work.

Let’s just think what we know about sets of sensitivity k (again, I mean by this a non-empty subset A of Z^d such that every point x in A has at most k neighbours not in A). Andy pointed out some way back that such a set must be unbounded, since if it is not, then pick a point x such that x_1+…+x_d is maximized, and it has at least d neighbours not in the set.

Now it is clear that we can get a lot more out of this argument. For example, let K be any subset of {1,2,…,d} of size k+1. Then the (orthogonal) projection of A to the subspace S_K generated by those e_i with i in K cannot be bounded, since then the functional sum_{i in K}x_i would be maximized somewhere, and that would give us a point with k+1 neighbours not in A.

What sort of set has the property that its projection to all the subspaces S_K is unbounded? Well, a k-codimensional subspace will do, but it is far from necessary. We clearly need to say something more.

Can we characterize 1-sensitive subsets of Z^2? A representative example is to take the union of two adjacent horizontal or vertical lines. But another kind of example is to take the entirety of Z^2 and then remove from it a collection of isolated points (we could think of that as a complement of an “everywhere zero-dimensional” set). The second construction shows that we cannot hope for a 1-sensitive set to contain a horizontal or vertical line, but it still suggests that we have to have some kind of global “one-dimensionality”. I say “one” rather than “two” because we can also take a set such as {(x,y): -100 less than x less than 100} and remove isolated points from its interior. And that set is “globally one-dimensional” in a sense that I have yet to make precise. (But it involves dimensions that are aligned with coordinate axes.)

So here is a vague suggestion. Can we prove something like this? If A is a set of sensitivity k, then there must be some r such that A is globally (n-k+r)-dimensional with at most r-dimensional pieces removed from it. (Dimensions here are accurate to within plus or minus 1.)

As a starting problem, which ought to be easy, can we show that a non-empty subset of Z^10000000 of sensitivity 100, say, must contain infinitely many points on some line parallel to one of the coordinate axes?
Gil Says:
Comment #80 July 26th, 2010 at 12:12 pm
While I did not yet internalized Scott’s Z^d formulation it looks that in both formulations it will be very useful to characterize sets with small sensitivity. Thinking more about Boolean functions described by “random” small depth Boolean circuits while the average sensitivity is polylog (n) I am leaning to think that the average sensitivity will be much larger. (But I am not sure and have no clue how to prove such a thing.) Anyway, I realize that if yout function depends on all variables the sensitivity is at least log n but I would like to see various examples (ot just links to previous comments) of Boolean functions with sensitivity log n or polylog n .
Peter Shor Says:
Comment #81 July 26th, 2010 at 12:18 pm
I think I have a counterexample in Z^d. I’ve posted a description at Math Overflow (where there’s better LaTeX support) and already Scott has simplified it.

http://mathoverflow.net/questions/31482/the-sensitivity-of-2-colorings-of-the-d-dimensional-integer-lattice/33399
Scott Says:
Comment #82 July 27th, 2010 at 6:42 am
Just to let people know: Peter and I discussed it yesterday and unfortunately found a bug in his counterexample. The s(C)=Ω(√d) conjecture stands.
Ian Says:
Comment #83 July 29th, 2010 at 4:15 pm
I’d like to emphasize the earlier observation (originally by Daniel) that the coloring can be taken to be periodic. In fact if we want we can even assume the same periodicity in all coordinate directions.

I mention this again because I agree with previous comments that this problem has a topological/Sperner’s lemma “feel” to it, and a periodic coloring can be pushed to the d-dimensional torus, which has some significant topological differences from R^d. (Also, the periodicity comes naturally from how the coloring problem arose from the original boolean function problem.)

So I propose the torus-coloring sensitivity problem as a variant for consideration, whose results still strictly translate over to the original boolean function problem.
Ian Says:
Comment #84 July 30th, 2010 at 1:41 am
Here’s some ideas that naturally invoke a sqrt(d) bound, it’s late so I hope it’s right:

Start at a blue point with k red points adjacent to it. There are at least d-(k+1)r coordinate axes (“tangent coordinates”) we can move along from the blue point that gets us to another blue point with at least k red points in the same “normal directions” as before. Note that this number is positive unless d is less than or equal to r(r+1), in which case the sqrt(d) bound holds, so for contradiction purposes we can assume it always positive. In fact, we can assume, say, d-(k+1)r is at least d/2, or any other constant multiple of d, by the same argument.

Moving along these directions, one of two things can happen: 1) The same tangent directions work always, and we get a plane of dimension d-(k+1)r with k constant normal directions. This is a slight generalization of a property Gowers assumes above. Or, 2) at some blue point the previous tangent direction now contains a red point. In this case, this blue point now has at least (k+1) normal directions, and applying the above result, we can continue from here along a new d-(k+2)r dimensional plane with (k+1) fixed normal directions. Repeating the above, case 2) can only happen a finite number of times, namely, until we reach a blue point with r (mutually perpendicular) normal directions, i.e. a point with maximum sensitivity. In particular, any point of maximum sensitivity must belong to a d-(r+1)r dimensional plane, where every point in the plane has maximum sensitivity, the same color, and the same normal directions.

Now build on Gowers’ idea. Consider the possible maximum-sensitivity planes reachable from each of the d blue points on coordinate axes. Note that by construction, the plane(s) associated to the ith coordinate axis have constant ith coordinate (b/c there’s a normal in that direction always). Each max-sensitivity plane has at most 2r normal directions, and each contains the normal direction from the original coordinate axis, so there must be are at least d/2r such max-sensitivity planes.

I think we might be able to push through the other side of the inequality with at least a polynomial bound but I need to go to sleep now and I’m thinking there’s a good chance some or all of the above is wrong..

To be continued..
Ian Says:
Comment #85 July 30th, 2010 at 10:53 pm
I reworded my previous ideas more cleanly and better typeset on the Math Overflow site:
http://mathoverflow.net/questions/31482/the-sensitivity-of-2-colorings-of-the-d-dimensional-integer-lattice/33971#33971
anon Says:
Comment #86 August 9th, 2010 at 1:23 pm
Scott … the overall post and the idea seems somewhat taken from a paper you recently reviewed for a journal ?
Scott Says:
Comment #87 August 9th, 2010 at 5:33 pm
anon #86: I truly, honestly have no clue what you’re talking about. I didn’t recently review any paper having anything to do with s vs. bs.
michael Says:
Comment #88 August 10th, 2010 at 8:17 pm
I hope you don’t mind frivolous comments on this impressive blog. I want to point out a nice link to a different Philomath concept: the Philomath Frolic and Rodeo –
http://www.philomathrodeo.org/
Andris Says:
Comment #89 August 13th, 2010 at 8:38 am
Tracy (#78):

we do not have such an example for even s at the moment.

Regarding the maximizing example, it’s non-unique. Another function that achieves s=3, bs=6 is as follows:
– Denote the variables as x_0, x_1, x_2, y_0, y_1, y_2, z_0, z_1, z_2.
– If there is an i such that x_i\neq y_i and x_{(i+1) mod 3}=y_{i+1}, let f=(y_i+y_{(i+1) mod 3}+z_i) mod 2. (If such i exists, it is unique because we either have only one i:x_i=y_i or only one i:x_i\neq y_i.)
– Otherwise, let f=(x_0+x_1+x_2) mod 2.

We do not know if there are more functions or not. The drawback of using a SAT solver is that it outputs one solution to the SAT instance, not all solutions. Madars tried several different SAT solvers. One of them gave the modified Rubinstein function, the other gave the function above.