### Expanders and groups

I have started to read a new paper by Gowers titled

(The distinction is sometimes artificial: for example the support of a pseudorandom distribution can also be thought of as a quasi-random set. To add confusion, in older literature on extractors certain high-entropy sources were called "quasi-random" sources.)

Although the paper does not directly talk about expander graphs, it addresses related questions, and it gives an opportunity for a post (or more than one) on expander graph constructions.

Informally, an expander graph is a graph that, while having small bounded degree, is very well-connected. For example, a $\sqrt{n} \times \sqrt{n}$ grid can be disconnected into two equal-size connected components after removing just $\sqrt{n}$ edges. This means that a $\Omega(1)$ fraction of all pairs of vertices can be disconneted while only removing a $o(1)$ fraction of all edges; for this reason, we do not think of a 2-dimensional grid as a good expander. In a graph of degree $O(1)$ it is always possible to cut off $k$ vertices from the rest of the graph by removing $O(k)$ edges; a graph is an expander if this is best possible. That is, a graph is an expander if for every set of $k$ vertices (where $k$ is less than half of the total number of vertices) you need to remove at least $\Omega(k)$ edges to disconnect those vertices from the rest of the graph.

More quantitatively, a $d$-regular graph has

We have a good construction of expanders if for fixed constants $\epsilon$ and $d$ we are able to construct $\epsilon$-expanders of degree $d$ with any desired number of vertices.

The applications of expander graphs are vast, they are useful to reduce randomness in probabilistic algorithms, they come up in certain constructions of data structures, of randomness extractors, and of error-correcting codes, and they have many applications in complexity theory. Most recently, Dinur's new proof of the PCP Theorem uses expanders. A great set of lecture notes on expanders is here, and not so great one is here (lectures 8-12 are on expanders).

Before the great zig-zag graph product revolution, almost all the good expander graph constructions were

If $G$ is a group and $D\subset G$ is a subset, then $G$ and $D$ define a $|D|$-regular graph with $|G|$ vertices as follows: every element of $G$ is a vertex, and vertices $u$ and $v$ are connected if $u-v \in D$. If we assume that, for every $a\in D$ we also have $-a \in D$, then we can think of the graph as being undirected.

A first trivial observation is that if $D$ is a set of generators, that is, if we can realize every element of $G$ as a sum of elements of $D$, then the graph is connected, otherwise it is disconnected.

Another point, which is true for all graphs, not just Cayley graphs, is that we can approximate the expansion of a graph by computing the eigenvalues of the adjacency matrix and looking at the difference between the largest and the second largest in absolute value. This was a point that I used to find mystifying, until I studied it and realized that it makes perfect sense. I may talk about it in another post, but I want to mention an amusing fact. The connection between eigenvalues and expansion has an easy direction (if there is a large eigenvalue gap then there is large expansion) and a difficult direction (if there large expansion then there must be large eigenvalue gap). The easy direction was "well known," and the difficult direction was discovered around the same time in two distinct communities for

Alon discovered it while working on constructions of expanders. It was known how to construct graphs with noticeable eigenvalue gap which would then (by the "simple direction") be good expanders. This is because, for the algebraic constructions that we will talk about in a minute, it is easier to reason about the eigenvalues of the adjacency matrix than about the number of edges in a generic cut. Alon, by proving the hard direction, showed that the approach via eigenvalue can be taken "with no loss of generality,"

Others (Aldous and Diaconis? Jerrum and Sinclair? I never get this reference right) discovered the same result working on the problem of bounding the time that it takes for a random walk on certain graphs to converge to the stationary distribution. To answer such questions, one needs to know the eigenvalue gap of the graphs, a quantity that is difficult to estimate directly in important cases. For those graphs, however, it is possible to reason about the size of cuts, and hence, via the difficult connection, obtain results on the eigenvalue gap (and, thus, on the convergence time, or "mixing" time of a random walk).

This "difficult direction" result has also an important algorithmic application. The counterpositive statement is that if largest and second largest eigenvalue are close in value, then the graph is not expanding, so one can cut off a set of vertices by deleting a comparatively small number of vertices. The proof is algorithmic: given the eigenvector of the second largest eigenvalue, this small cut can be found efficiently. This is the principle at work in "spectral partitioning" algorithms, often used in practice (and, occasionally, in theory).

Well, that was a long digression, but it's also an interesting story. Back to our

Unfortunately, a Cayley graph based on an Abelian group can never be a good expander. To see why, you'll have to believe my claim that in a good expander with $n$ vertices the diameter is at most $O(\log n)$. Consider now an $n$-vertex Cayley graph of degree $d$. Every vertex $x$ is reachable from vertex $0$ via a path of length $O(\log n)$. That is, $x$ is the sum of $O(\log n)$ elements of the set $D$. Because of commutativity, we can rearrange the sum so that it looks as

$(a_1+ \ldots + a_1) + (a_2 + \ldots + a_2) +\ldots (a_d + \ldots + a_d)$

where $a_1,\ldots,a_d$ are the elements of $D$. We see that the sum is specified just by saying how many time each element of $D$ must be added, and so $x$ is specified by $d$ numbers, each between $1$ and $O(\log n)$. This gives $n \leq O((\log n)^d)$ which is a contradiction.

Indeed, we see that Cayley graphs based on Abelian groups fail very badly to have the required diameter condition: if the degree is $d$, there will be vertices at distance about $n^{1/d}$ from each other, a bound that is met by a $d$-dimensional grid.

Intuitively, then, we can get an expanding bounded-degree Cayley graph only if we start from a "very non-Abelian" group. But how do we quantify the "Abelianity" of a group, and what kind of groups are "very non-Abelian"? This will be the subject of the next post.

*Quasi-random groups*. "Quasi-random" is the more-or-less standard term used to denote a single object that has some of the typical properties of a random object; this is distinct from the term "pseudo-random" which is more often applied to a distribution that satisfies a certain set of properties with approximately the same probability as the uniform distribution.(The distinction is sometimes artificial: for example the support of a pseudorandom distribution can also be thought of as a quasi-random set. To add confusion, in older literature on extractors certain high-entropy sources were called "quasi-random" sources.)

Although the paper does not directly talk about expander graphs, it addresses related questions, and it gives an opportunity for a post (or more than one) on expander graph constructions.

Informally, an expander graph is a graph that, while having small bounded degree, is very well-connected. For example, a $\sqrt{n} \times \sqrt{n}$ grid can be disconnected into two equal-size connected components after removing just $\sqrt{n}$ edges. This means that a $\Omega(1)$ fraction of all pairs of vertices can be disconneted while only removing a $o(1)$ fraction of all edges; for this reason, we do not think of a 2-dimensional grid as a good expander. In a graph of degree $O(1)$ it is always possible to cut off $k$ vertices from the rest of the graph by removing $O(k)$ edges; a graph is an expander if this is best possible. That is, a graph is an expander if for every set of $k$ vertices (where $k$ is less than half of the total number of vertices) you need to remove at least $\Omega(k)$ edges to disconnect those vertices from the rest of the graph.

More quantitatively, a $d$-regular graph has

*edge expansion*at least $\epsilon$ if for every set $S$ of vertices there are at least $\epsilon \cdot |S|$ edges going between vertices in $S$ and vertices not in $S$. (Provided $|S|$ is less than half the total number of vertices.)We have a good construction of expanders if for fixed constants $\epsilon$ and $d$ we are able to construct $\epsilon$-expanders of degree $d$ with any desired number of vertices.

The applications of expander graphs are vast, they are useful to reduce randomness in probabilistic algorithms, they come up in certain constructions of data structures, of randomness extractors, and of error-correcting codes, and they have many applications in complexity theory. Most recently, Dinur's new proof of the PCP Theorem uses expanders. A great set of lecture notes on expanders is here, and not so great one is here (lectures 8-12 are on expanders).

Before the great zig-zag graph product revolution, almost all the good expander graph constructions were

*Cayley graphs of non-Abelian groups*. Let's see what this means and how it helps.If $G$ is a group and $D\subset G$ is a subset, then $G$ and $D$ define a $|D|$-regular graph with $|G|$ vertices as follows: every element of $G$ is a vertex, and vertices $u$ and $v$ are connected if $u-v \in D$. If we assume that, for every $a\in D$ we also have $-a \in D$, then we can think of the graph as being undirected.

A first trivial observation is that if $D$ is a set of generators, that is, if we can realize every element of $G$ as a sum of elements of $D$, then the graph is connected, otherwise it is disconnected.

Another point, which is true for all graphs, not just Cayley graphs, is that we can approximate the expansion of a graph by computing the eigenvalues of the adjacency matrix and looking at the difference between the largest and the second largest in absolute value. This was a point that I used to find mystifying, until I studied it and realized that it makes perfect sense. I may talk about it in another post, but I want to mention an amusing fact. The connection between eigenvalues and expansion has an easy direction (if there is a large eigenvalue gap then there is large expansion) and a difficult direction (if there large expansion then there must be large eigenvalue gap). The easy direction was "well known," and the difficult direction was discovered around the same time in two distinct communities for

*opposite reasons*.Alon discovered it while working on constructions of expanders. It was known how to construct graphs with noticeable eigenvalue gap which would then (by the "simple direction") be good expanders. This is because, for the algebraic constructions that we will talk about in a minute, it is easier to reason about the eigenvalues of the adjacency matrix than about the number of edges in a generic cut. Alon, by proving the hard direction, showed that the approach via eigenvalue can be taken "with no loss of generality,"

Others (Aldous and Diaconis? Jerrum and Sinclair? I never get this reference right) discovered the same result working on the problem of bounding the time that it takes for a random walk on certain graphs to converge to the stationary distribution. To answer such questions, one needs to know the eigenvalue gap of the graphs, a quantity that is difficult to estimate directly in important cases. For those graphs, however, it is possible to reason about the size of cuts, and hence, via the difficult connection, obtain results on the eigenvalue gap (and, thus, on the convergence time, or "mixing" time of a random walk).

This "difficult direction" result has also an important algorithmic application. The counterpositive statement is that if largest and second largest eigenvalue are close in value, then the graph is not expanding, so one can cut off a set of vertices by deleting a comparatively small number of vertices. The proof is algorithmic: given the eigenvector of the second largest eigenvalue, this small cut can be found efficiently. This is the principle at work in "spectral partitioning" algorithms, often used in practice (and, occasionally, in theory).

Well, that was a long digression, but it's also an interesting story. Back to our

*constructions*of expander graphs, we want to describe a bounded-degree graph explicitly and prove that there is a noticeable gap between largest and second largest eigenvalue. If the graph is a Cayley graph derived from a group $G$ and set of generators $D$ (and $D$ is chosen properly), it turns out that the eigenvalues of the adjacency matrix of the graph can be explicitly computed by looking at the*representations*of the group $G$. If $G$ is Abelian (if the group operation is commutative), then the representation theory of $G$ is very simple and it is "just" the theory of Fourier transforms of functions $f: G \rightarrow C$ (where C are the complex numbers), which is much easier than it sounds.Unfortunately, a Cayley graph based on an Abelian group can never be a good expander. To see why, you'll have to believe my claim that in a good expander with $n$ vertices the diameter is at most $O(\log n)$. Consider now an $n$-vertex Cayley graph of degree $d$. Every vertex $x$ is reachable from vertex $0$ via a path of length $O(\log n)$. That is, $x$ is the sum of $O(\log n)$ elements of the set $D$. Because of commutativity, we can rearrange the sum so that it looks as

$(a_1+ \ldots + a_1) + (a_2 + \ldots + a_2) +\ldots (a_d + \ldots + a_d)$

where $a_1,\ldots,a_d$ are the elements of $D$. We see that the sum is specified just by saying how many time each element of $D$ must be added, and so $x$ is specified by $d$ numbers, each between $1$ and $O(\log n)$. This gives $n \leq O((\log n)^d)$ which is a contradiction.

Indeed, we see that Cayley graphs based on Abelian groups fail very badly to have the required diameter condition: if the degree is $d$, there will be vertices at distance about $n^{1/d}$ from each other, a bound that is met by a $d$-dimensional grid.

Intuitively, then, we can get an expanding bounded-degree Cayley graph only if we start from a "very non-Abelian" group. But how do we quantify the "Abelianity" of a group, and what kind of groups are "very non-Abelian"? This will be the subject of the next post.

## 3 Comments:

12/07/2006 04:50:00 PM

a minor correction, luca -- Caley

should instead be Cayley.

aravind

12/08/2006 01:40:00 PM

Thank you, its a very informative post.

3/12/2007 02:49:00 PM

Excellent post!

Post a Comment

<< Home