Thursday, June 08, 2006

Analytical approaches to Szemeredi's Theorem: general case

After having discussed the notion of Gowers uniformity, the statement of Szemeredi's theorem, and the proof for progressions of length 3, let me try to say something about the case of progressions of length 4 or more, and the several related open questions.

As already discussed, Gowers's proof is based on an "algorithm" that, given a subset A of ZN of size dN, does one of the following two things:

  1. It immediately finds a progression of length k in A; or
  2. it constructs a subset A' of ZN' such that: (i) if there is a progression of length k in A', then there is also such a progression in A; (ii) A' has size at least (d+poly(d))*N'; N' is at least a constant root of N.

As in the proof for the case of progressions of length 3, once we have such an algorithm we are able to find a progression of length k in A, provided that, initially, N is at least exp(exp(poly(1/d))).

How does the "algorithm" work? Let us, again, identify the set A with its characteristic function A: ZN -> {0,1}, and consider the expression
(*) Ex,y A(x)*A(x+y)*A(x+2y)*...*A(x+(k-1)*y)

It is possible to show that (*) equals dk (which is what we would expect if A were a "pseudorandom" set) plus or minus an error term that depends on Uk-1(A-d), where Uk-1 is the dimension-(k-1) Gowers uniformity as defined earlier. A result of this kind (relating Gowers uniformity and number of arithmetic progressions) is called a "generalized Von Neumann theorem" in the literature, for reasons that I unfortunately ignore. The proof is not difficult at all, it is just a series of applications of Cauchy-Schwartz. The genius is in the choice of the right definition. (And the technical difficulties are in what comes next.)

This means that if the dimension-(k-1) Gowers uniformity of the function f(x):=A(x)-d is small, then A contains a lot of length-k arithmetic progressions, and we are in case (1)

What remains to prove is that if the set A is such that f(x):=A(x)-d has noticeably large dimension-(k-1) Gowers uniformity then we can find a set A' in ZN' as in case (2) above. Gowers's paper is approximately 130 pages long, and about 100 pages are devoted to proving just that.

Obviously, I have no idea what goes on in those 100 pages, but (based on later work by Green and Tao) I can guess how it would be if we were trying to prove Szemeredi's theorem not in ZN but in Fnp with p prime and larger than k. Translated to such a setting (I think), Gowers's argument would proceed by showing that if Uk(A-d) is at least eps, then there is an affince subspace V of Fnp of dimension n/O(1) such that A restricted to V has density at least d+poly(eps). This, in turn, would be proved by showing

  1. If f has noticeably large dimension-(k-1) Gowers uniformity, then there is an affine subspace V of dimension n/O(1) and a polynomial phase function g of degree (k-2) such that, restricted to V, f and g are noticeably correlated
  2. If f is a function that has noticeable correlation with a low degree polynomial phase function, then there is a subspace of dimension n/O(1) such that f is correlated to a linear function when restricted to that subspace.

Let's make things simple by restricting ourselves to Boolean function (even though this p=2 case has no direct application to results about arithmetic progressions).

Then, part (1) is saying that if f:{0,1}n->{-1,1} has noticeably large dimension-k Gowers uniformity, then there is a polynomial p() of degree (k-1) over Z2, and an affine subspace V of dimesion n/O(1), such that f(x) and (-1)p(x) agree on noticeably more than 1/2 of the inputs x. Part (2) is saying that if f:{0,1}n->{-1,1} and (-1)p(x) agree on noticeably more than 1/2 of the inputs, where p is a low-degree polynomial, then there is an affine subspace V of dimension n/O(1) such that, restricted to V, f has agreement with a linear function on noticeably more than 1/2 of the inputs.

The two results together imply that if f has noticeably large Gowers uniformity then there is a large sub-space in which f is correlated with a linear function. (To reach the conclusion, you need to apply part (1), then do a change of variables so that now the V of part (1) looks like {0,1}n/O(1), and then apply part(2).)

Actually, what I just wrote may be an open question, but it should be provable along the lines of Gowers's proof, and it should be much easier to prove.

For the case of progressions of length 4 in Fnp, for small prime p>4, Green and Tao prove that if f is a bounded function of noticeably large dimension-3 Gowers uniformity, then f is correlated to a degree-2 polynomial over all of Fnp, and the correlation is especially good on a subspace of dimension n-O(1). (As opposed to n/O(1).) This improvement, and several additional ideas, lead to an improved quantitative version of Szemeredi's theorem for progressions of length 4 in Fnp, and they also announced a similar improved result for progression of length 4 over the integers.

Their proof relies on the analysis of a certain "linearity test in the highly noisy case," which is also used (and, I think, proved for the first time) in Gowers's paper. In the, simpler, Boolean, case, the result is
Let F:{0,1}n -> {0,1}m, and suppose that

Prx,y[F(x)+F(y) = F(x+y)] > eps

where all operations are bitwise mod 2. Then there is an eps' (depending only on eps, but independent of n,m) and a matrix M such that the functions x->F(x) and x->Mx have agreement at least eps'


(Update: as Alex explains in his comment below, the above sentence is quite misleading. Gowers proves quite a different statement in ZN, a setting in which an analogous "highly noise linearity test" statement is provably false, and where it is difficult to even get the statement of the right analog. The Boolean statement above is proved for the first time in Alex's paper, with a proof whose outline is similar to Gowers's; Green and Tao prove the highly noise linearity test result in other prime fields, also following Gowers's argument.)

In the known proof, which uses difficult results from additive combinatorics, eps' is exponentially small in eps. Here two major open questions are: (i) find a simple proof and (ii) find a proof where eps'=poly(eps)

More ambitiously, there could be a simple proof that (say, in the Boolean case) if f has noticeable Gowers uniformity, then f is noticeably correlated with a linear function when restricted to a large subspace. The experts in the area would probably be able to turn a proof in the Boolean case to a proof that applies to the setting where we have ZN instead of {0,1}n, Bohr sets instead of subspaces, and things I do not understand instead of polynomials. Even so, if the proof in Boolean setting were simple enough, the whole thing could be significantly simpler than Gowers's proof, and there could be room for quantitative improvements.

Monday, June 05, 2006

Poincare conjecture proved for real?

The Poincare conjecture is one the seven Clay millenium problems, each worth a million-dollar prize. It says something about all 3-dimensional "simply connected" manifolds being "equivalent" to the sphere in R4. A Fields medal was awarded for the proof for k-dimensional manifolds, k>4, and another Fields medal for k=4. (In those cases the "equivalence" is with the sphere in Rk+1.)

A few years ago, the Russian mathematician Grigori Perelman posted a series of papers on the arxiv. The papers outlined a proof of the "Thurston geometrization conjecture," a statement that implies the Poincare conjecture.

Ever since, the status of the Poincare conjecture has remained uncertain. Perelman's papers do not contain a full proof, but various experts have been in agreement that most likely Perelman's ideas can be successfully formalized. Meanwhile, Perelman's himself seemed to be oblivious to the matters of writing a full paper and claiming the prize.

Now comes the news (via Not Even Wrong) that two Chinese mathematicians have written a 300-page paper that formalizes Perelman's argument, and the paper is about to be published in the Asian Journal of Mathematics.

Update 6/6/06: The Guardian technology blog picks up the story.

Update 6/6/06: The Notices of the AMS have an article on the genesis of the Clay millenium prizes. The stories are very interesting. (For example Wiles's insights into mathematical politics.) I noticed one paragraph in particular. Perhaps with Perelman's case in mind, Arthur Jaffe writes
The rules for the prize resulted from a fair amount of thought. [...] One major safeguard involved the importance of publication of the solution. [...] Of course there can also be unforeseen circumstances. For example, an author of a solution may not write it down completely [...]


Update 6/7/06: An April 10 press releas from the organizers of ICM'06 suggests that there were plans to announce in Madrid that Perelman's proof has been "checked."

Update 6/8/06: see also the May 15 ICM bulletin.

Sunday, June 04, 2006

Analytical approaches to Szemeredi's Theorem: k=3

Here is the idea of Roth's proof of the k=3 case of Szemeredi's Theorem. (See yesterday's post if you have no idea what I am talking about.) We have a subset A of ZN of size dN and we want to find a length-3 arithmetic progression in A. A first observation is that if d > 2/3 then we are done, because if we pick a,b at random there is positive probability that a,a+b,a+b+b are all in A. (Use the union bound.) The proof will work by "induction" on d, showing that the truth of the theorem for a smaller value of d can be derived from the truth of the theorem for a larger value d. Of course d is a continous parameter, so it will not really be induction, but that's how we should think of it.

The main result is an "algorithm" that given A subset of ZN of size dN does one of the following two things:

  1. It immediately finds a length-3 progression in A; or

  2. it constructs a subset A' of ZN' such that (i) if A' has a length-3 progression then so does A; (ii) A' has size at least about (d+d2)*N'; and (iii) N' is about sqrt(N).

So, given A, we run the algorithm, and we either immediately find the progression, or we get A'. In the latter case, we run the algorithm on A', and we either get a progression in A' (implying a progression in A) or we get a new set A'', and so on. But how many times do we have to keep doing this? At each step, the density of the set increases, and if it ever goes above 2/3 then we are also done. So, how many times can you do the d -> d + d2 transformation until you get 2/3? Certainly no more than O(1/d2) times, but actually O(1/d) is enough. This means that we will always find a length-3 progression in A, provided that N is large enough that we can take its square root O(1/d) times. So if N is doubly exponential in 1/d we have our proof.

Gowers's proof for the general case has the same outline. Now we want to find a length-k progression in A. If the density of A is more than (k-1)/k, we are done. Gowers describes an algorithm that, given A, either finds a length-k progression or constructs a set A' in ZN' such that if A' has a length-k progression that so does A'. Furthermore, A' has density d+poly(d) and N' is a constant root of N.

How do these arguments work? Let us look at Roth's proof in the case in which A is a subset of Fnp with prime p; for concreteness, take p=3. Given A, consider its characteristic function (that we denote again A) A: Fn3 -> {0,1} and compute its Fourier transform.

Consider now the probability that, when picking a,b at random, the three points a,a+b,a+b+b are all in A. This is the same as
(*) Ea,b A(a)*A(a+b)*A(a+b+b)

which can be expressed really cleanly in terms of the Fourier coefficients of A. In particular, the expression (*) is d3, which is the number you would get if you were looking at 3 independent points, plus or minus an error term that depends on the largest Fourier coefficient of A. If all Fourier coefficients are less than, say, d2/2, then epression (*) is at least d3/2, and we have plenty of length-3 progressions in A; this is case (1) of the proof. In this case, we should think of A as a "pseudorandom set against adversaries that look for random length-3 progressions."

What happens if A has a large Fourier coefficient? This means that A is correlated with a linear function, and so there is an affine subspace V in Fn3 of dimension n-1 such that A is denser in V than in the rest of Fn3. Now, map V to Fn-13 via an invertible affine map, and let A' be the image of A under this map. If there is a length-3 progression in A', that's just 3 points on a line; but then it means that also in A we had 3 points on a line. The set A' has density about d+d2 in Fn-13, and so we have part (2) of the argument.

As a bonus, we only lost a constant fraction of our group size at each step, so d can be as large as about 1/logN, where N=3n is the size of the initial group containing A.

Note the "win-win" structure of the argument. If A has small Fourier coefficients, then it is "pseudorandom", and we get what we want from the pseudorandomness. Otherwise, A has some "linear structure," and we can take advantage of it to reduce to a simpler instance of our original problem.

This "either we get what we want or we reduce to a simpler case" proof strategy has been used in some constructions of extractors, but it seems such a good idea that it may apply more widely, and one should keep it mind.

What about lower bounds? Excellent question. There is a very simple construction due to Behrend of a subset of size N/exp(sqrt(log N)) of {1,...,N} that has no length-3 progression, so N must be super-polynomial in 1/d when we prove the theorem over the integers (or over ZN with prime N, the two cases are essentially equivalent). Note that if you try to find a large set with no length-3 progression using the probabilistic method, you will not go very far. This is a better than random explicit construction, a rarity in extremal combinatorics. By the way, Behrend's construction is very simple (here is a half-page description), it is 60 year old, and nobody has been able to improve it ever since.

What about lower bounds for Fn3? There is no known construction of a large subset avoiding length-3 progressions. The best lower bounds are of the form cn, with c<3. Is there a (3-o(1))n lower bound? This is a major open question. If there were a (2.999)n upper bound, then it would be a stunning result, because it would show that the business of translating proofs from finite fields to the integers using Bohr sets cannot be applied as a black box, but it works on some proofs and does not work on other proofs.

Xi Shuashua!

This is getting too serious, it is time for some mandarin pop.



This song, 嘻唰唰, while (intentionally) silly, is quite catchy, and it was a big hit in China. It has also been at the center of a controversy: the band has been accused of plagiarizing the song from an older Japanese pop hit. It really makes you think: to write a song like this you have to plagiarize it? It is like the story of the worst referee report ever.
The paper presents a wrong proof of a known result. The mistake, however, is not new.