Saturday, June 03, 2006

Szemeredi's theorem

Szemeredi's theorem on arithmetic progressions is one of the great triumphs of the "Hungarian approach" to mathematics: pose very difficult problems, and let deep results, connections between different areas of math, and applications, come out as byproducts of the search for a solution.

(Tim Gowers's essay The two cultures of mathematics discusses brilliantly this "problem solving" way of doing math, as distinct from the "theory building" way.)

Some of the math surrounding Szemeredi's theorem has found applications in property testing, in PCP constructions, and in extractor constructions, and more future applications are likely. Besides, it is truly beautiful (and somewhat accessible) math. It can be a lot of fun for a computer scientist to delve into this literature.

In telling the story of Szemeredi's theorem, one usually starts from Van der Waerden's 1927 theorem that, no matter how you color the integers with a finite number of colors, there are arbitrarily long monochromatic arithmetic progressions. (An aritmetic progression is a sequence of the form a,a+b,a+2b,...,a+kb,....) In a more quantitative form, the theorem says that for every constants c,k there is an N(c,k) such that for every N>N(c,k), no matter how we color the integers {1,...,N} with c colors, there will be a monochromatic arithmetic progression of length k.

In 1936, Erdos and Turan conjectured that the coloring in Van der Waerden's theorem was just a "distraction," and that the "true reason" why the theorem is true is that every dense enough set of integers (in particular, the largest color class in Van der Waerden's theorem) must contain long arithmetic progressions. Specifically, they conjectured that for every density 0 < d <1 and every integer k there is an N(d,k) such that every subset A of {1,...,N} of cardinality dN contains a length-k arithmetic progression, provided N> N(d,k).

This conjecture became one of the great unsolved problems in Ramsey theory. In 1953, Roth proved the k=3 case of the Erdos-Turan conjecture. In Roth's proof, N(d,3) is doubly exponential in 1/d. In other words, Roth proves that a subset of {1,...,N} containing at least about N/log log N elements must contain a length-3 progression. The proof is analytical, and uses Fourier analysis. (Or the "circle method" as number theorists say.)

It was only in 1974 that Szemeredi proved the Erdos-Turan conjecture in the general case, and this is the result that is now known as Szemeredi's Theorem. Szemeredi's proof is combinatorial, and works via a reduction to a graph-theoretic problem. The Szemeredi regularity Lemma, that has several applications in computer science, is part of the proof. The value of even N(d,4) is a tower of exponentials in Szemeredi's proof; one only gets that a subset of {1,...,N} of size about N/log* log* n must contain a length-4 progression.

In 1977 Furstemberg found a completely different proof. He proved a "transfer theorem" showing that results about arithmetic progressions (and other related structures) can be derived from statements about certain continous objects, and then he proved the continous statements by "standard methods" in Ergodic Theory. The transfer theorem uses a result that requires the Axiom of Choice and so, while the proof establishes that finite values of N(d,k) exist, it is impossible, even in principle, to extract any quantitative bound.

Quantitative bounds are important because of the long-standing (and recently solved) question of whether the primes contain arbitrarily long arithmetic progressions. If one could show that a subset of {1,...,N} of size about N/log N contains long arithmetic progressions, then the result for the primes follows purely from the fact that the set of primes is dense enough.

There is a simple reduction that shows that, to prove Szemeredi's theorem over the integers, it is enough to prove it for the additive group ZN, with N prime. (That is, it is enough to look for arithmetic progressions mod N in {1,...,N} .) Also, if one looks at Roth's proof, it can be seen that it gives a much stronger bound in the additive group Fn3; in that setting, the proof shows that in every subset A of size about 3n/n there are three points in arithmetic progressions (that is, three points of the form a,a+b,a+b+b). The bound is of the form N/log N, where N is the size of the group. The proof, however, uses the fact that Fn3 is a linear space, that it has subspaces, and so on. Bourgain (1999) was able to "simulate" this proof in ZN by using the notion of "Bohr" set of ZN, and showing that it can play a role analogous to that of "subspace" in Fn3. Bourgain obtains a bound that is about N/sqrt(log N).

The great quantitative breakthrough came with the work of Gowers (1998-2001), who proved bounds of the form N/loglog N (as in Roth's proof) for progressions of any length. His work introduced the notion of Gowers uniformity I discussed two days ago.

So there are now three completely different proofs, Szemeredi's proof using graph theoretic arguments, Furstemberg's proof using Ergodic Theory, and Gowers's analytical proof. Recent work by Tao, however, shows that there are deep similarities between the proofs, although a complete "understanding" of what goes on is probably still to come.

There is also a new (2004-05) fourth proof, that uses a hypergraph regularity lemma, and that is similar in spirit to (but technically very different from, and simpler than) Szemeredi's proof. It also has deep relations with the the other three proofs. (See e.g. this paper.) It is an independent achievement of Nagle, Rödl, Schacht, and Skokan; Gowers; and Tao.

The most famous result on arithmetic progressions is probably the Green-Tao proof that the primes contain arbitrarily long arithmetic progressions.

The structure of the proof is quite amazing, and perhaps it contains ideas that could also be useful in computer science. They begin by defining a notion of pseudorandomness for sets of integers (which requires two conditions: Gowers uniformity and an additional properties). Then they show:

  1. Assuming Szemeredi's theorem as a black-box, the theorem is also true for subsets of pseudorandom sets. That is, if S is a large enough and pseudorandom enough set of integers, and A is a subset of A of constant density, then A must contain long arithmetic progressions. (Roughly speaking, a "distinguisher" that looks for arithmetic progressions cannot distinguish a dense subset of a pseudorandom set from a dense subset of all the integers. This is akin to the notion of "pseudo-entropy" in HILL, but the quantification is different.)

  2. The set of "pseudoprimes" (integers having only few, large, divisors) is pseudorandom in the integers. (This uses new results of Goldston and Yildirim.)

  3. The primes have constant density in the pseudoprimes.

Coming up next: Roth's proof for k=3, the structure of Gowers's proof for general k, the Green-Tao improvement for k=4, and the constellation of wonderful open questions around the k=3 and k=4 cases in finite vector spaces. (After that, back to politics, food, movies, and China.)


  1. Anonymous Helger
    6/03/2006 04:24:00 PM

    Thanks! Because I've read your paper with Alex and skimmed the Green-Tao paper, most of this post was understandable. However, it would be easier if you could define pseudorandomness/pseudoprimes: otherwise people with (not only) crypto background would get confused.

  2. Anonymous Anonymous
    6/03/2006 08:11:00 PM

    Could you recommend papers/tutorials for TCS graduate students to get into this material?

  3. Anonymous Anonymous
    6/03/2006 10:18:00 PM


    Have a look at Tao's expository note Obstructions to uniformity, and arithmetic patterns in the primes for a good discussion of the number-theoretic issues involved.

    At a rough level, you can think of the "pseudoprimes" as all numbers in {1,2,...,n} which are coprime to (say) {1,2,...,n^{0.01}}.

  4. Anonymous Helger
    6/06/2006 03:49:00 AM

    Anonymous: my point was just that it is confusing to use the standard notions(at least in one community) for something completely different in another community. Pseudorandom/pseudoprimes are well established notions. Moreover, Growers himself now uses the word "quasirandom" (eg., ) but it still means something different.

    Btw, as I said, I've myself read some of the background papers.

  5. Anonymous Anonymous
    6/06/2006 02:11:00 PM

    Different notions in different fields are often given the same name. Despite much fuss often being made about it, there is almost never any confusion when one reads statements in context. Even in TCS, pseudorandom means something different to cryptographers than it does to people trying to derandomize BPP, for instance.

  6. Blogger altan
    3/14/2008 04:35:00 AM

    This comment has been removed because it linked to malicious content. Learn more.


Post a Comment

Links to this post:

Create a Link

<< Home