Wednesday, December 27, 2006

Breaking News!

You may remember that, in July, Italy won the football World Cup. No? Best defense ever? Materazzi? Headbutt? "Long live Italy" (意大利万岁)? Anybody?

Anyways, I was in Rome that night, I borrowed a friend's camera, and I took some pictures on the street. I gave the camera back, and he told me he would email me the pictures "in a few days." I got the pictures today, which is what he meant. I suppose anybody who has sent me a paper to referee, only to be assured I would send the review "in a few days," is now nodding knowingly.

I did not know how the settings of the camera worked, it was night, we were never able to stop the car (except in traffic), so the pictures are dark and shaky, then the battery ran out just when we got to the center, plus I ran out of gas, I had a flat tire, I didn't have enough money for cab fare, my tux didn't come back from the cleaners, an old friend came in from out of town, someone stole my car, there was an earthquake, a terrible flood, LOCUSTS!

Having dispensed with the excuses, in the interest of timely dissemination here are some of the pictures.

We start driving from Monte Sacro, a neighborhood in the north-east of the city, about 5 miles away from the center. It's less than two hours since the game ended, and a newspaper kiosk is already selling day-after newspapers with chronicles of the game.



In this much time they wrote the articles, printed the papers, and got them all over the city, which is, of course, completely gridlocked. This shows that when something is really important, Italians can be efficient. (No, I don't know why there is advertising for Newsweek in a Monte Sacro newspaper kiosk.)

After more than an hour, we get to the Muro Torto, the wide road (with tunnels) that runs along historical walls and goes toward Piazza del Popolo.

Almost all the traffic is, of course, in the direction toward the center, which is where we are trying to go.


This gentleman has "W la fica" written on his chest. (Sorry, no translation.)

Since the traffic is not moving, one guy has the time to get out of his car and climb on top of a truck.


Then there is the group of guys running around in tighty whiteys.

This guy, instead, is jumping up and down on a Mercedes S-series. Note that he removed his shoes, so there is no risk of damaging the car.

The friend behind him, instead, is standing on the windshield. The Germans sure know how to make sturdy cars.


Then there is the Zidane coffin.

The lady in the red car is not showing a lot of enthusiasm. Seven people have fit into this small Citroen cabrio.

Note again the serious lady in the red car, and the fact that nobody is driving the Citroen. Carrying open alcoholic beverages in a car is actually legal in Italy. Driving this way, however, is allowed only on special occasions.


This is a Fiat 500, the car on which I learned how to drive. (No, I am not that old, it was a used car!)



This is the closest I got to taking a picture of Piazza del Popolo.

Catenaccio (heavy chain - the kind used to lock a gate) is the term used to define Italy's defense.

Monday, December 25, 2006

Slaughtering the Cow to Get the Butter

Americans are fond of rankings and lists, and the end of the year is the time when you see top ten this and worst seven that wherever you look.

Media Matters has compiled a list of the 11 most outrageous comments of 2006 by right-wing commentators.

There is, for example, nationally syndicated obese radio host Rush Limbaugh pointing out that obesity is more prevalent in heavily Republican states and least prevalent in heavily Democratic ones, thus showing that obesity is caused by leftist liberal policies. But the best part is when he explains that, in dealing with the poor, the government is not teaching the poor how to slaughter the cow to get the butter, it just gives them the butter. Just listen to it.

Among the honorable mentions, nationally syndicated radio host Neal Boortz commented on the hairstyle of Representative Cynthia McKinney of Georgia as follows: "She looks like a ghetto slut." Representative McKinney is black.

It is, however, the screen captions on Fox News that are the most hilarious.


Sunday, December 24, 2006

Indonesia, Papua New Guinea, and Berkeley

This is where the notable earthquakes of the last few days have taken place, according to the US Geological Survey web site. And Berkeley had the distinction of three earthquakes in four days, as reported, among other sources, in the People's Daily.



It must be especially annoying if you happen to be living in a tree.

An Associated Press article is so very reassuring:

But the minor earthquakes should not be interpreted as omens of a more destructive one to come, said Jessica Sigala, a geophysicist with the National Earthquake Information Center.

"It could mean there’s something coming, it could mean there’s nothing coming," Sigala said.


Merry Christmas! Happy Holidays!

[Update 12/27: now that a real earthquake has hit Taiwan, this post sounds especially needless and petulant.]

Thursday, December 21, 2006

Characters and Expansion

In which we look at the representation theory of Abelian groups and at the discrete Fourier transform, we completely understand the eigenvalues and eigenvectors of the Cayley graph of an Abelian group, and we apply this powerful theory to prove that the cycle is a connected graph.

When we talked about group representations, we saw how to map each element of a group into a matrix in such a way that group addition becomes matrix multiplication.

In general, if $G$ is a finite group, then an n-dimensional representation of $G$ is a map

$\rho: G \rightarrow {\mathbb C}^{n\times n}$

such that $\rho$ is a group homomorphism between $G$ and the group of matrices $\rho(G)$. That is, we have that $\rho(g)$ is an invertible matrix for every $g$, $\rho(0)=I$, $\rho(a+b) = \rho(a) \times \rho(b)$ and $\rho(-g) = (\rho(g))^{-1}$.

(Scott will be delighted to know that the theory need not stop at finite matrices, and one may also look at representations mapping group elements into invertible linear transformations over an infinite-dimensional Hilbert space. To study finite groups, however, finite-dimensional representations are good enough.)

Note that $\rho$ need not be injective, and so, for every group, we can always define the 1-dimensional representation $\rho(g) = 1$ for every $g$. Perhaps unsurprisingly, this is usually called the trivial representation. The precise statement of a theorem of Frobenius that we mentioned earlier is that every non-trivial representation of $PSL(2,p)$ has dimension at least $(p-1)/2$.

In an Abelian group, however, there are plenty of 1-dimensional representations, and, in fact, all the information about the group is "encoded" in the set of its 1-dimensional representations. We will call a 1-dimensional representation a character of the group. (This equivalence holds only for finite groups, otherwise we need an extra condition.) Hence a character of a group $G$ is a mapping

$\chi: G \rightarrow {\mathbb C}$

such that $\chi(g) \neq 0$ for every $g$, $\chi(0)=1$, $\chi(a+b) = \chi(a)\cdot \chi(b)$ and $\chi(-g) = (\chi(g))^{-1}$. In a finite group $G$, we always have the property that if we add an element to itself $|G|$ times we get $0$; from this we can deduce that, if $\chi()$ is a character, then $\chi(g)$ must be a $|G|$-th root of 1, that is, it must be a number of the form $e^{2\pi x i/|G|}$ for some integer $x$. In particular, $|\chi(g)|=1$ for every $g$.

(For general groups, we say that $\chi$ is a character if it is a 1-dimensional representation and it has the property that $|\chi(g)|=1$ for every $g$. While the latter property is a consequence of the former for finite groups and, more generally, for torsion groups, this is not the case for other groups. For example, $\chi(x) := e^{x}$ is a 1-dimensional representation for $\mathbb Z$ but it is not a character.)

At first sight, it might look like a finite group might have an infinite number of characters (or at least that's what I thought when I first saw the definition). There is however a simple argument that shows that a finite group $G$ cannot have more than $|G|$ characters.

Consider the set of all functions of the form $f: G \rightarrow {\mathbb C}$; we can think of it as a $|G|$-dimensional vector space over $\mathbb C$. We have that the characters of $G$ are linearly independent, and so there cannot be more than $|G|$ of them. In fact, even more is true: the characters of $G$ are orthogonal to each other. For two functions $f,h$, define their inner product as

$(f,h) := \frac1 {|G|} \sum_{a\in G} f(a) \overline{h(a)} $

Then, if $\chi$ and $\eta$ are two different characters, we have

$(\chi,\eta) = \frac1 {|G|} \sum_{a\in G} \chi(a) \overline{\eta(a)}=0 $

The normalization in the definition of inner product is convenient because then we also have $(\chi,\chi)=1$ if $\chi$ is a character. (We will not prove these claims, but the proofs are very simple.)

We will now show that the cyclic group ${\mathbb Z}_N$ has indeed $N$ distinct characters $\chi_0,\ldots,\chi_{N-1}$. This means that this set of characters forms an orthonormal basis for the set of functions $f: {\mathbb Z}_N \rightarrow {\mathbb C}$, and that each such function can be written as a linear combination

$f(x) = \sum_{c=0}^{N-1} \hat f(c) \chi_c (x)$ (*)
where the coefficients $\hat f(c)$ can be computed as the inner product
$\hat f(c) = (f,\chi_c)$

(The equation (*) is the Fourier expansion of $f$, and the function $\hat f()$ is called the Fourier transform of $f$.)

Here is how to define the characters $\chi_c$: for each $c$, define $\chi_c(x):= e^{2\pi i c x /N}$. It is immediate to verify that each such function is a character, and that, for $c=0,\ldots,N-1$, we are, indeed, defining $N$ different functions. By the previously mentioned results, there are no other characters.
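As a sanity check, here is a short numerical verification (in Python with numpy; the variable names are mine) that these $N$ functions are pairwise orthonormal under the inner product defined above, and that the Fourier expansion (*) reconstructs an arbitrary function:

```python
import numpy as np

# Characters of Z_N: chi_c(x) = exp(2*pi*i*c*x/N), for c = 0, ..., N-1.
N = 8
x = np.arange(N)
chi = [np.exp(2j * np.pi * c * x / N) for c in range(N)]

def inner(f, h):
    # (f, h) := (1/|G|) * sum_a f(a) * conjugate(h(a))
    return (f * np.conj(h)).sum() / N

# The characters form an orthonormal system: (chi_c, chi_d) = 1 if c = d, else 0.
gram = np.array([[inner(chi[c], chi[d]) for d in range(N)] for c in range(N)])

# Fourier expansion of an arbitrary f: Z_N -> C, with fhat(c) = (f, chi_c).
rng = np.random.default_rng(0)
f = rng.standard_normal(N) + 1j * rng.standard_normal(N)
fhat = [inner(f, chi[c]) for c in range(N)]
reconstructed = sum(fhat[c] * chi[c] for c in range(N))
```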

Let's see one more example: consider $({\mathbb Z}_2)^n$, that is, $\{0,1\}^n$ with the operation of bit-wise XOR. For each ${\mathbf a} = (a_1,\ldots,a_n)\in \{0,1\}^n$, define the function

$\chi_{\mathbf a} (x_1,\ldots,x_n) := (-1)^{\sum_i a_i x_i }$
We can again verify that all these $2^n$ functions are distinct, that each of them is a character, and so that they include all the characters of $({\mathbb Z}_2)^n$. The reader should be able to see the pattern and to reconstruct the $N_1\times \cdots \times N_k$ characters of the group
${\mathbb Z}_{N_1} \times \cdots \times {\mathbb Z}_{N_k}$
thus having a complete understanding of the set of characters of any given finite Abelian group.
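For instance, here is a quick Python check of the pattern just described for ${\mathbb Z}_3 \times {\mathbb Z}_4$ (the choice of group and the helper names are mine): the $N_1 \cdot N_2$ candidate characters are multiplicative and pairwise distinct.

```python
import numpy as np
from itertools import product

# Candidate characters of Z_3 x Z_4, following the pattern for product groups:
# chi_{(c1,c2)}(x1, x2) = exp(2*pi*i*(c1*x1/N1 + c2*x2/N2)).
N1, N2 = 3, 4
elements = list(product(range(N1), range(N2)))

def chi(c, x):
    return np.exp(2j * np.pi * (c[0] * x[0] / N1 + c[1] * x[1] / N2))

def add(a, b):
    # Coordinate-wise addition in Z_3 x Z_4.
    return ((a[0] + b[0]) % N1, (a[1] + b[1]) % N2)

# Each candidate is a character: chi(a + b) = chi(a) * chi(b).
multiplicative = all(np.isclose(chi(c, add(a, b)), chi(c, a) * chi(c, b))
                     for c in elements for a in elements for b in elements)

# The N1 * N2 candidates are pairwise distinct functions.
tables = {tuple(np.round(chi(c, x), 8) for x in elements) for c in elements}
```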

At long last, we can get to the subject of expansion.

Let $A$ be a finite Abelian group, and $S$ be a set of generators such that if $a\in S$ then $-a\in S$. Define the Cayley graph $G=(A,E)$ where every element of $A$ is a vertex and where $(a,b)$ is an edge if $a-b\in S$. This is clearly an $|S|$-regular undirected graph. Let $\chi_1,\ldots,\chi_{|A|}$ be the set of characters of $A$. Think of each character $\chi_i$ as an $|A|$-dimensional vector. (The vector that has the value $\chi_i(a)$ in the $a$-th coordinate.) Then these vectors are orthogonal to each other. We claim that they are also eigenvectors for the adjacency matrix of $G$.

Note a surprising fact here: the graph $G$ depends on the group $A$ and on the choice of generators $S$. The eigenvectors of the adjacency matrix, however, depend on $A$ alone. (The eigenvalues, of course, will depend on $S$.)

The proof of the claim is immediate. Let $M$ be the adjacency matrix of $G$, then the $b$-th entry of the vector $\chi_i \times M$ is

$\sum_{a\in A} \chi_i(a) M(a,b) = \sum_{s\in S} \chi_i(b+s) =$
$ \sum_{s\in S} \chi_i(b)\chi_i (s) = \chi_i (b) \sum_{s\in S} \chi_i(s)$
We are done! The vector $\chi_i$ is indeed an eigenvector, of eigenvalue $\sum_{s\in S} \chi_i(s)$. Since we have $|A|$ eigenvectors, and they are linearly independent, we have found all eigenvalues, with multiplicities.
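The claim is also easy to check numerically. Here is a small Python verification (the group ${\mathbb Z}_{10}$ and the generator set $S$ are arbitrary choices of mine) that each character is an eigenvector of the Cayley graph's adjacency matrix, with eigenvalue $\sum_{s\in S} \chi(s)$:

```python
import numpy as np

# Cayley graph of Z_N with a symmetric set of generators
# (an arbitrary example: S = {1, -1, 3, -3} mod N).
N = 10
S = [1, N - 1, 3, N - 3]
M = np.zeros((N, N))
for b in range(N):
    for s in S:
        M[(b + s) % N, b] = 1  # (a, b) is an edge iff a - b is in S

x = np.arange(N)
checks = []
for c in range(N):
    chi = np.exp(2j * np.pi * c * x / N)                  # character chi_c as a vector
    lam = sum(np.exp(2j * np.pi * c * s / N) for s in S)  # claimed eigenvalue
    checks.append(np.allclose(M @ chi, lam * chi))
```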

Let's look at the cycle with $N$ vertices. It is the Cayley graph of ${\mathbb Z}_N$ with generators $\{-1,1\}$. We have $N$ eigenvalues, one for each $c=0,\ldots,N-1$, and we have
$\lambda_c = \chi_c(-1) + \chi_c(1) = e^{-2\pi c i/N} + e^{2\pi c i/N}$
This may seem to contradict our earlier claim that all eigenvalues were going to be real. But two lines of trigonometry show that, in fact,
$\lambda_c = 2 \cos (2\pi c/N)$
As expected in a 2-regular graph, the largest eigenvalue is $\lambda_0 = 2\cos(0)=2$. The second largest eigenvalue, however, is $2\cos(2\pi /N)$, which is $2-\Theta(1/N^2)$. This means that the expansion of the cycle is at least $\Omega(1/N^2)$ and, in particular, the cycle is a connected graph!
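This calculation can be confirmed numerically (Python with numpy; $N=20$ is an arbitrary choice): the spectrum of the $N$-cycle is exactly $\{2\cos(2\pi c/N)\}_{c=0,\ldots,N-1}$, with second largest eigenvalue $2\cos(2\pi/N)$.

```python
import numpy as np

# Adjacency matrix of the cycle on N vertices (Cayley graph of Z_N, S = {1, -1}).
N = 20
M = np.zeros((N, N))
for b in range(N):
    M[(b + 1) % N, b] = 1
    M[(b - 1) % N, b] = 1

# Eigenvalues predicted by the characters: 2*cos(2*pi*c/N), c = 0, ..., N-1.
predicted = sorted(2 * np.cos(2 * np.pi * c / N) for c in range(N))
computed = list(np.linalg.eigvalsh(M))  # numerical eigenvalues, ascending

second_largest = predicted[-2]  # equals 2*cos(2*pi/N) = 2 - Theta(1/N^2)
```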

(After we are done having a good laugh at the expense of algebraic graph theory, I should add that this calculation, plus some relatively easy facts, implies that a random walk on the cycle mixes in $O(N^2)$ steps. Not only is this the right bound, but it is also something that does not have a completely straightforward alternative proof.)

We do a little better with the hypercube. The hypercube is the Cayley graph of the group $({\mathbb Z}_2)^n$ with the generator set that contains all the vectors that have precisely one 1. Now we have one eigenvalue for each choice of $(a_1,\ldots,a_n)$, and the corresponding eigenvalue is
$\sum_{i=1}^n (-1)^{a_i}$.

The largest eigenvalue, when all the $a_i$ are zero, is $n$, as expected in an $n$-regular graph. The second largest eigenvalue, however, is at most $n-2$. This means that the expansion of the hypercube is at least 1. (As it happens, it is precisely 1, as shown by a dimension cut.)
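The same computation, carried out numerically for the hypercube (Python; $n=4$ is an arbitrary small choice): the spectrum is $\{\sum_i (-1)^{a_i}\}$, the top eigenvalue is $n$, and the second largest is $n-2$.

```python
import numpy as np
from itertools import product

# Hypercube: Cayley graph of (Z_2)^n with the n unit vectors as generators.
n = 4
verts = list(product([0, 1], repeat=n))
idx = {v: i for i, v in enumerate(verts)}
M = np.zeros((2**n, 2**n))
for v in verts:
    for i in range(n):
        w = list(v)
        w[i] ^= 1  # flip one coordinate: a hypercube edge
        M[idx[tuple(w)], idx[v]] = 1

# Predicted eigenvalues: sum_i (-1)^{a_i}, one for each a in {0,1}^n.
predicted = sorted(sum((-1) ** a_i for a_i in a) for a in verts)
computed = list(np.linalg.eigvalsh(M))
```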

Sunday, December 17, 2006

Applied Philosophy

This weekend the New York Times Magazine has an article by Princeton philosopher Peter Singer on the subject of philanthropy, poverty, and ethics.

Singer is known for his position that all human lives have the same value, a position that sounds entirely uncontroversial until one starts to explore its ramifications. Consider for example the case of real estate investor Zell Kravinsky.

Kravinsky gave almost all of his \$45 million real estate fortune to health-related charities, retaining only his modest family home in Jenkintown, near Philadelphia, and enough to meet his family’s ordinary expenses. After learning that thousands of people with failing kidneys die each year while waiting for a transplant, he contacted a Philadelphia hospital and donated one of his kidneys to a complete stranger.

[...] Kravinsky has a mathematical mind — a talent that obviously helped him in deciding what investments would prove profitable — and he says that the chances of dying as a result of donating a kidney are about 1 in 4,000. For him this implies that to withhold a kidney from someone who would otherwise die means valuing one’s own life at 4,000 times that of a stranger, a ratio Kravinsky considers “obscene.”


I have to say that I have never found Utilitarianism convincing (and you don't want to get a Utilitarian started on his thought experiments), even though I admit that there is hardly any better premise known on which to reason about ethics.

The article has a lot of interesting information, including the following argument, which I had never seen before in these terms:

Thomas Pogge, a philosopher at Columbia University, has argued that at least some of our affluence comes at the expense of the poor. He bases this claim not simply on the usual critique of the barriers that Europe and the United States maintain against agricultural imports from developing countries but also on less familiar aspects of our trade with developing countries. For example, he points out that international corporations are willing to make deals to buy natural resources from any government, no matter how it has come to power. This provides a huge financial incentive for groups to try to overthrow the existing government. Successful rebels are rewarded by being able to sell off the nation’s oil, minerals or timber.

In their dealings with corrupt dictators in developing countries, Pogge asserts, international corporations are morally no better than someone who knowingly buys stolen goods [...]

Thursday, December 14, 2006

Zeilberger on why P differs from NP

Scott Aaronson, Lance Fortnow and Bill Gasarch have discussed the reasons why they believe P differs from NP here, here and here.

Doron Zeilberger, motivated by Avi Wigderson's talk in Madrid, devotes his latest opinion to the subject. (Via Luca A.)

His dismissal of the "creativity cannot be mechanized" argument is based on his long-standing belief (which I share) that human creativity, especially human mathematical creativity, can in fact be emulated by algorithms and that, in the long run, algorithms will end up being superior. I think this is a misunderstanding of the argument, whose point is rather that creating something good (by humans and by algorithms) seems to require much more effort than appreciating something good, and that there are levels of "genius" that we would be able to recognize if we saw them but that we are prepared to consider infeasible. In the end, this is the same as the "a fool can ask more questions than a wise man can answer" argument that Zeilberger himself proposes.

Then there is the issue of whether it is fair to say that "P $\neq$ NP is a statement that affirms its own intractability." Indeed, the P versus NP question is a statement about asymptotics, while proving it is a problem of finite size.

I have two observations.

One is that the "natural proofs" results show that, assuming strong one-way functions exist (an assumption in the "ballpark" of P $\neq$ NP), there are boolean functions that are efficiently computable but have all the efficiently computable properties of random functions. This means that any circuit lower bound proof must work in a way that either would fail when applied to random functions (and there are reasons why it is difficult to come up with such proofs) or would rely on hard-to-compute properties of the function in question. So although the proof is a finite object, it does define an "algorithm" (the one that describes the properties of the function that are used in the proof) and such an algorithm cannot be asymptotically efficient.

The other is that, however cleaner our theories are when formulated asymptotically, we should not lose sight of the fact that the ultimate goals of complexity theory are finite results. It will be a historic milestone when we prove that P $\neq$ NP by showing that SAT requires time at least $2^{-300} \cdot n^{(\log n) / 100 }$, but the real deal is to show that there is no circuit of size less than $2^{300}$ that solves SAT on all formulae having 10,000 clauses or fewer. The statements that we care about are indeed finite.

Wednesday, December 13, 2006

Russian humor

The January issue of the Notices of the AMS is out, and it contains an article by Anatoly Vershik on the Clay Millennium prize. You may remember a discussion we had on the wisdom of awards in general, in the context of the AMS idea of creating a fellows program and of the pros and cons of having a best paper award at STOC and FOCS.

Vershik returns to some of the standard points in this debate, and makes a few new ones. Although the tone of the article is completely serious, there are hints of deadpan humor (especially in the way he characterizes the opinions of others).

There is really no connection, but I was reminded of Closing the collapse gap, the hilarious presentation by Dmitry Orlov (found via the Peking Duck), in which he argues that when the U.S. economy collapses, we will be much less prepared than the Russians were, and we will be in much deeper trouble. As stated in the comments at the Peking Duck, when Russians are deadly serious about something, they deal with it through dark humor.

Tuesday, December 12, 2006

Eigenvalues and expansion

Before trying to describe how one reasons about the eigenvalues of the adjacency matrix of a Cayley graph, I would like to describe the "easy direction" of the relationships between eigenvalues and expansion.

First, a one-minute summary of linear algebra. If $A$ is an $n\times n$ matrix, then a (column) non-zero vector $x$ is an eigenvector of $A$ provided that, for some scalar $\lambda$, we have $Ax= \lambda x$. If so, then $\lambda$ is an eigenvalue of $A$. A couple of immediate observations:

  • The equation $Ax=\lambda x$ can be rewritten as $(A-\lambda I)x = 0$, so if $\lambda$ is an eigenvalue of $A$ then the columns of the matrix $A-\lambda I$ are not linearly independent, and so the determinant of $A-\lambda I$ is zero. But we can write $det(A-\lambda I)$ as a degree-$n$ polynomial in the unknown $\lambda$, and a degree-$n$ polynomial cannot have more than $n$ roots.

    So, an $n\times n$ matrix has at most $n$ eigenvalues.

  • If $\lambda$ is an eigenvalue of $A$, then the set of vectors $x$ such that $Ax=\lambda x$ is a linear space. If $\lambda_1$ and $\lambda_2$ are different eigenvalues, then their respective spaces of eigenvectors are independent (they only intersect at $0$).


If $A$ is a symmetric real-valued matrix, then a series of miracles (or obvious things, depending on whether you have studied this material or not) happen:


  • All the roots of the polynomial $det(A-\lambda I)$ are real; so, counting multiplicities, $A$ has $n$ real eigenvalues.
  • If $\lambda$ is an eigenvalue of $A$, and it is a root of multiplicity $k$ of $det(A-\lambda I)$, then the space of vectors $x$ such that $Ax=\lambda x$ has dimension $k$.
  • If $\lambda_1 \neq \lambda_2$ are two different eigenvalues, then their respective spaces of eigenvectors are orthogonal


From these facts, it is easy to conclude that, starting from a symmetric real matrix $A$, we can find a sequence of $n$ numbers

$\lambda_1\geq \ldots \geq \lambda_n$

and a sequence of $n$ unit vectors $v_1,\ldots,v_n$ such that the $v_i$ are orthogonal to each other and such that $A v_i = \lambda_i v_i$.

Now the claim is that these $n$ numbers and $n$ vectors completely describe $A$. Indeed, suppose that we are given a vector $x$ and that we want to compute $Ax$. First, we can write $x$ as a linear combination of the $v_i$:

$x= \alpha_1 v_1 + \cdots + \alpha_n v_n$

and since the $v_i$ are an orthonormal basis, we know what the coefficients $\alpha_i$ are like: each $\alpha_i$ is just the inner product of $x$ and $v_i$.

Now, by linearity, we have

$Ax = \alpha_1 A v_1 + \cdots + \alpha_n A v_n$

and, since the $v_i$ are eigenvectors,

$= \alpha_1 \lambda_1 v_1 + \cdots + \alpha_n \lambda_n v_n$

So, once we express vectors in the basis defined by the $v_i$, we see that multiplication by $A$ is just the effect of multiplying coordinate $i$ by the value $\lambda_i$.

But there is more. Recall that $\alpha_i = v_i^T x$ and so we can also write

$Ax = (\lambda_1 v_1 v_1^T) x + \cdots + (\lambda_n v_n v_n^T) x$

and so, in particular, matrix $A$ is just

$A = \lambda_1 v_1 v_1^T + \cdots + \lambda_n v_n v_n^T $

a sum of $n$ rank-1 matrices, with the coefficients of the sum being the eigenvalues.
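This decomposition is easy to verify numerically; here is a small sketch in Python (the $5\times 5$ random symmetric matrix is an arbitrary example):

```python
import numpy as np

# Spectral decomposition of a symmetric real matrix:
# A = lambda_1 v_1 v_1^T + ... + lambda_n v_n v_n^T.
rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2  # symmetrize

lam, V = np.linalg.eigh(A)  # eigenvalues ascending; columns of V orthonormal
rebuilt = sum(lam[i] * np.outer(V[:, i], V[:, i]) for i in range(5))
```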

Let now $A$ be the adjacency matrix of a $d$-regular undirected graph (so $A$ is real-valued and symmetric), $\lambda_1\geq \cdots \geq \lambda_n$ be its eigenvalues, and $v_1,\ldots,v_n$ be a corresponding set of orthonormal eigenvectors.

It is not hard to see (no, really, it's two lines, see the bottom of page 2 of these notes) that all eigenvalues are between $d$ and $-d$. Also, $d$ is always an eigenvalue, as shown by the eigenvector $(1,\ldots,1)$. This means that $\lambda_1 = d$ and we can take

$v_1 = \left( \frac 1 {\sqrt n},\cdots, \frac 1 {\sqrt n} \right)$

and we have

$A = d\, v_1 v_1^T + \lambda_2 v_2 v_2^T + \cdots + \lambda_n v_n v_n^T$

Suppose now that all the other $\lambda_i$, $i=2,\ldots,n$, are much smaller than $d$ in absolute value. Then, intuitively, the sum is going to be dominated by the first term, and $A$ behaves sort of like the matrix $d\, v_1 v_1^T$, that is, the matrix that has the value $d/n$ in each entry.

You may be skeptical that such an approximation makes sense, but see where this leads. If our graph is $G=(V,E)$, and $(S,V-S)$ is a cut of the set of vertices into two subsets, let $1_S$ be the $n$-dimensional vector that is 1 in coordinates corresponding to $S$ and 0 in the other coordinates, and define $1_{V-S}$ similarly. Then the number of edges crossing the cut is precisely

$1_S^T A 1_{V-S}$

and if we replace $A$ by the matrix that has $d/n$ in each coordinate we get the approximation $|S|\cdot|V-S|\cdot d /n$, which is the expected number of edges that would cross the cut if we had a random graph of degree $d$. This is pretty much the right result: if $|\lambda_2|,\ldots, |\lambda_n|$ are all much smaller than $d$, and if $|S|$ and $|V-S|$ are at least a constant times $n$, then the number of edges crossing the cut is indeed very close to $|S|\cdot |V-S| \cdot d/n$. This result is known as the Expander Mixing Lemma, and the proof is just a matter of doing the "obvious" calculations. (Write $1_S$ and $1_{V-S}$ in the eigenvector basis, distribute the product, remember that the $v_i$ are orthogonal, use Cauchy-Schwarz ...)
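A numerical illustration of the Expander Mixing Lemma, in the form $\big| e(S,V-S) - d|S||V-S|/n \big| \le \lambda \sqrt{|S||V-S|}$ with $\lambda = \max(|\lambda_2|,|\lambda_n|)$ (the $d$-regular circulant graph and the random cut below are arbitrary choices of mine):

```python
import numpy as np

# A d-regular circulant graph on Z_n with a symmetric set of d generators.
rng = np.random.default_rng(2)
n, d = 64, 8
gens = set()
while len(gens) < d:
    s = int(rng.integers(1, n))
    if s != (n - s) % n:       # skip s = n/2, which is its own inverse
        gens.update([s, n - s])
A = np.zeros((n, n))
for b in range(n):
    for s in gens:
        A[(b + s) % n, b] = 1

# A random cut (S, V - S) with |S| = n/2.
one_S = np.zeros(n)
one_S[rng.choice(n, size=n // 2, replace=False)] = 1
one_T = 1 - one_S

edges_cut = one_S @ A @ one_T                 # e(S, V - S), via 1_S^T A 1_{V-S}
expected = d * (n // 2) * (n - n // 2) / n    # d * |S| * |V-S| / n

eigs = np.linalg.eigvalsh(A)                  # ascending; eigs[-1] = d
lam = max(abs(eigs[0]), abs(eigs[-2]))        # largest non-trivial |eigenvalue|
```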

It is perhaps less intuitive that, provided that all other eigenvalues are at least a little bit smaller than $d$, then a reasonably large number of edges crosses the cut. In particular, if $|S| \leq n/2$ and if $\lambda_2 \leq d-\epsilon$, then at least $\epsilon |S|/2$ edges cross the cut. That is, if all eigenvalues except the largest one are at most $d-\epsilon$ then the edge expansion of the graph is at least $\epsilon/2$. (It takes about five lines to prove this, see the beginning of the proof of Thm 1 in these notes.)

To summarize: the adjacency matrix of an undirected graph is symmetric, and so all eigenvalues are real. If the graph is $d$-regular, then $d$ is the largest eigenvalue, and all others are between $d$ and $-d$. If all the others are between $d-\epsilon$ and $-d$, then the graph has edge-expansion $\epsilon/2$ (or better).

Sunday, December 10, 2006

Madonna and Letterman (1994)

As a break from expanders and groups, here is Madonna's notorious 1994 appearance on Letterman's show.



The incident has its own Wikipedia entry, one more piece of evidence that Wikipedia is superior to the Encyclopaedia Britannica.

Friday, December 08, 2006

Group representation

(As will become clear soon, I am totally out of my depth: Readers who find glaring mistakes, and even small ones, should note them in the comments.)

In the previous post on expanders we left off with the observation that if we want to construct an expanding Cayley graph, then we need to start from a group where all $n$ elements of the group can be obtained as a sum of $O(\log n)$ terms chosen, with repetitions, from a set of $O(1)$ generators; this means that a constant root of the $n^{O(1)}$ ways of rearranging such sums must lead to different results. The group, then, has to be non-commutative in a very strong sense.

I will describe a way in which a group used to construct expanders is "very non-commutative." This will not be a sufficient property for the expansion, but it is closely related and it will come up in the paper of Gowers that we will talk about later.

With this objective in mind, let's see the notion of group representation. Let's start from the simplest group, the cyclic group of $N$ elements $Z_N$ with the operation of addition (or $Z/NZ$ as mathematicians write). One can visualize this group as $N$ points equally spaced along a circle in the plane, with element $a$ forming an angle of $2\pi a/ N$ with the $x$ axis. If we want to add an element that forms an angle $\alpha$ to an element that forms an angle $\beta$, that's the same as putting the "two angles next to each other" and we get a point that forms an angle $\alpha+\beta$. More algebraically, we identify an element $a$ with the complex number $e^{2\pi a i/N}$ and we realize group addition as multiplication of complex numbers. There is, however, another way of visualizing the cyclic group on a cycle. We can think of group element $a$ as the operation that takes a point on the cycle and moves it by an angle of $2\pi a/N$. Addition of group elements now corresponds to "composition," that is, corresponds to applying one rotation after the other.

In this view, element $a$ is now the function that maps complex number $x$ into complex number $x\cdot e^{2\pi a i/N}$.
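In code (a two-line sketch in Python), this says that $a \mapsto e^{2\pi a i/N}$ turns addition mod $N$ into multiplication of complex numbers:

```python
import cmath

# Z_N as rotations: a -> exp(2*pi*i*a/N); addition mod N becomes multiplication.
N = 7
rho = lambda a: cmath.exp(2j * cmath.pi * a / N)
homomorphism = all(abs(rho((a + b) % N) - rho(a) * rho(b)) < 1e-9
                   for a in range(N) for b in range(N))
```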

Suppose now that our group is a product $Z_{N_1} \times \cdots \times Z_{N_k}$ of cyclic groups. This means that a group element is a $k$-tuple of the form $(a_1,\ldots,a_k)$, and that

$(a_1,\ldots,a_k) + (b_1,\ldots,b_k) = (a_1+b_1 \bmod N_1,\ldots,a_k+b_k \bmod N_k)$

It now makes sense to view a group element $(a_1,\ldots,a_k)$ as a "high-dimensional rotation" operation that takes as input $k$ complex numbers $(x_1,\ldots,x_k)$ and outputs the $k$ complex numbers

$(x_1 \cdot e^{2\pi a_1 i /N_1},\ldots, x_k \cdot e^{2\pi a_k i /N_k})$

If we take this view, we have, again, the property that group addition becomes composition of functions. Note, also, that the function that we have associated to each group element is a very simple type of linear function: it is simply multiplication of $x$ times a diagonal matrix that has diagonal

$(e^{2\pi a_1 i /N_1},\ldots, e^{2\pi a_k i /N_k})$

Notice, also, that if $f$ is a function of the form $f(x) = A\cdot x$, where $A$ is a matrix, and $g$ is a function of the form $B\cdot x$ where $B$ is a matrix, then $f(g(x))= A\cdot B\cdot x$. That is, for linear functions, function composition is the same as matrix multiplication.

To summarize, we have started from the group $Z_{N_1} \times \cdots \times Z_{N_k}$, and we have found a way to associate a complex-valued diagonal matrix to each group element in such a way that group addition becomes matrix multiplication.
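Here is a small Python check of this (the group ${\mathbb Z}_3 \times {\mathbb Z}_5$ is an arbitrary example): each element maps to a diagonal matrix, and group addition becomes matrix multiplication.

```python
import numpy as np
from itertools import product

# Represent (a1, a2) in Z_3 x Z_5 by the diagonal matrix
# diag(exp(2*pi*i*a1/3), exp(2*pi*i*a2/5)).
N1, N2 = 3, 5

def rho(a):
    return np.diag([np.exp(2j * np.pi * a[0] / N1),
                    np.exp(2j * np.pi * a[1] / N2)])

elements = list(product(range(N1), range(N2)))

def add(a, b):
    return ((a[0] + b[0]) % N1, (a[1] + b[1]) % N2)

# Group addition corresponds to matrix multiplication: rho(a + b) = rho(a) rho(b).
is_representation = all(np.allclose(rho(add(a, b)), rho(a) @ rho(b))
                        for a in elements for b in elements)
```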

It is known that all finite Abelian groups can be written as $Z_{N_1} \times \cdots \times Z_{N_k}$, so this type of representation via diagonal matrices is possible for every finite Abelian group.

What about more general groups? If $G$ is an arbitrary finite group, it is possible to associate to each element a block-diagonal matrix in such a way that group addition becomes matrix multiplication.

(It is common to call a group operation "multiplication" when a group is not Abelian, but for consistency with the rest of the post I will keep calling it addition.)

(By the way, it is also possible to represent infinite groups by associating a linear operator to each group element, but we will only discuss finite groups here.)

If the group is Abelian, then the matrices are diagonal, that is, they are block-diagonal matrices with "block size one." So one way of quantifying "how non-Abelian" is a group is to consider how small the blocks can be in such matrix representations. That's the dimension of the representation.

Here is an example of a family of groups whose representations cannot be low-dimensional. Let $p$ be a prime (it could also be a prime power) and let us consider $2 \times 2$ matrices whose entries are integers $\bmod p$. Let us restrict ourselves to matrices whose determinant is 1 modulo $p$, and consider the operation of matrix multiplication (where, also, all operations are mod $p$). This set of matrices forms a group, because the matrices are invertible (the determinant is non-zero) and the set contains the identity matrix and is closed under product. This group is called $SL(2,p)$. The group $SL(2,p)$ contains the tiny subgroup of two elements $\{ I,-I\}$; in the representation of $SL(2,p)$ this shows up as a block of size 1. If we take the quotient of $SL(2,p)$ by $\{ I,-I \}$ then we get another group, which is called $PSL(2,p)$.

It is a theorem of Frobenius that every representation of $PSL(2,p)$ has dimension at least $(p-1)/2$. This is really large compared to the size of $PSL(2,p)$: the group $SL(2,p)$ has $p(p^2-1)$ elements, and so $PSL(2,p)$ has $p(p^2-1)/2$ elements. The dimension of the representation is thus approximately the cube root of the number of elements of the group.
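The order of $SL(2,p)$ can be checked by brute force for small primes (a Python sketch; the counting function is mine):

```python
from itertools import product

# |SL(2,p)|: count 2x2 matrices over Z_p with determinant 1 (mod p).
def sl2_order(p):
    return sum(1 for a, b, c, d in product(range(p), repeat=4)
               if (a * d - b * c) % p == 1)

orders = {p: sl2_order(p) for p in (2, 3, 5, 7)}
```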

Going back to representations of Abelian groups, we see that, in that case, not only did each block have size one, but the entire matrix had size at most logarithmic in the number of elements of the group. This shows that $SL(2,p)$ and $PSL(2,p)$ are, at least in this particular sense, "very non-Abelian," and it is an encouraging sign that they may be useful in constructing expanders and other quasi-random objects.

On his way toward a question about subsets of groups not containing triples of the form $\{ a, b, a+b\}$, Gowers shows that every dense Cayley graph constructed on $PSL(2,p)$ has constant edge expansion. (By dense, we mean that the degree is $\Omega(n)$, where $n$ is the number of vertices.) This is a result that, assuming Frobenius's theorem, has a simple proof, which I hope to understand and describe at a later point.

The celebrated Ramanujan graphs of Lubotzky, Phillips and Sarnak are constant-degree Cayley graphs constructed from $PGL(2,p)$, which is defined like $PSL(2,p)$ except that we put no restriction on the determinant. The relationship between their degree and their eigenvalue gap is best possible. The analysis of Lubotzky et al. is, unfortunately, completely out of reach.

Wednesday, December 06, 2006

Expanders and groups

I have started to read a new paper by Gowers titled Quasi-random groups. "Quasi-random" is the more-or-less standard term used to denote a single object that has some of the typical properties of a random object; this is distinct from the term "pseudo-random" which is more often applied to a distribution that satisfies a certain set of properties with approximately the same probability as the uniform distribution.

(The distinction is sometimes artificial: for example the support of a pseudorandom distribution can also be thought of as a quasi-random set. To add confusion, in older literature on extractors certain high-entropy sources were called "quasi-random" sources.)

Although the paper does not directly talk about expander graphs, it addresses related questions, and it gives an opportunity for a post (or more than one) on expander graph constructions.

Informally, an expander graph is a graph that, while having bounded degree, is very well connected. For example, a $\sqrt{n} \times \sqrt{n}$ grid can be disconnected into two equal-size connected components by removing just $\sqrt{n}$ edges. This means that an $\Omega(1)$ fraction of all pairs of vertices can be disconnected while removing only an $o(1)$ fraction of all edges; for this reason, we do not think of a 2-dimensional grid as a good expander. In a graph of degree $O(1)$ it is always possible to cut off $k$ vertices from the rest of the graph by removing $O(k)$ edges; a graph is an expander if this is best possible. That is, a graph is an expander if for every set of $k$ vertices (where $k$ is less than half of the total number of vertices) one needs to remove at least $\Omega(k)$ edges to disconnect those vertices from the rest of the graph.

More quantitatively, a $d$-regular graph has edge expansion at least $\epsilon$ if for every set $S$ of vertices there are at least $\epsilon \cdot |S|$ edges going between vertices in $S$ and vertices not in $S$. (Provided $|S|$ is less than half the total number of vertices.)
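For small graphs, edge expansion can be computed directly from this definition by brute force over all subsets. A sketch in Python (names are mine), confirming that a 6-cycle has edge expansion $2/3$: cutting it into two arcs of 3 vertices removes only 2 edges.

```python
from itertools import combinations

def edge_expansion(n, edges):
    """Min over nonempty S with |S| <= n/2 of (# edges leaving S) / |S|."""
    edge_set = set(map(frozenset, edges))
    best = float('inf')
    for k in range(1, n // 2 + 1):
        for S in combinations(range(n), k):
            S = set(S)
            # Count edges with exactly one endpoint inside S.
            cut = sum(1 for e in edge_set if len(e & S) == 1)
            best = min(best, cut / len(S))
    return best

cycle = [(i, (i + 1) % 6) for i in range(6)]
assert abs(edge_expansion(6, cycle) - 2 / 3) < 1e-9
```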

We have a good construction of expanders if for fixed constants $\epsilon$ and $d$ we are able to construct $\epsilon$-expanders of degree $d$ with any desired number of vertices.

The applications of expander graphs are vast: they are useful to reduce randomness in probabilistic algorithms, they come up in certain constructions of data structures, of randomness extractors, and of error-correcting codes, and they have many applications in complexity theory. Most recently, Dinur's new proof of the PCP Theorem uses expanders. A great set of lecture notes on expanders is here, and a not-so-great one is here (lectures 8-12 are on expanders).

Before the great zig-zag graph product revolution, almost all the good expander graph constructions were Cayley graphs of non-Abelian groups. Let's see what this means and how it helps.

If $G$ is a group and $D\subset G$ is a subset, then $G$ and $D$ define a $|D|$-regular graph with $|G|$ vertices as follows: every element of $G$ is a vertex, and vertices $u$ and $v$ are connected if $u-v \in D$. If we assume that, for every $a\in D$ we also have $-a \in D$, then we can think of the graph as being undirected.

A first trivial observation is that if $D$ is a set of generators, that is, if we can realize every element of $G$ as a sum of elements of $D$, then the graph is connected, otherwise it is disconnected.
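For a cyclic group $Z_n$ this construction, and the connectivity observation, are easy to code up. A sketch (names are mine): $\{2,-2\}$ generates $Z_9$, since $\gcd(2,9)=1$, but in $Z_{10}$ it only reaches the even residues.

```python
from collections import deque

def cayley_graph(n, D):
    """Cayley graph of Z_n with connection set D (assumed symmetric)."""
    return {v: [(v + a) % n for a in D] for v in range(n)}

def is_connected(graph):
    """Breadth-first search from vertex 0."""
    seen, queue = {0}, deque([0])
    while queue:
        for w in graph[queue.popleft()]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(graph)

assert is_connected(cayley_graph(9, [2, -2]))       # {2,-2} generates Z_9
assert not is_connected(cayley_graph(10, [2, -2]))  # but not Z_10
```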

Another point, which is true for all graphs, not just Cayley graphs, is that we can approximate the expansion of a graph by computing the eigenvalues of the adjacency matrix and looking at the difference between the largest and the second largest in absolute value. This was a point that I used to find mystifying, until I studied it and realized that it makes perfect sense. I may talk about it in another post, but I want to mention an amusing fact. The connection between eigenvalues and expansion has an easy direction (if there is a large eigenvalue gap then there is large expansion) and a difficult direction (if there is large expansion then there must be a large eigenvalue gap). The easy direction was "well known," and the difficult direction was discovered around the same time in two distinct communities for opposite reasons.

Alon discovered it while working on constructions of expanders. It was known how to construct graphs with a noticeable eigenvalue gap, which would then (by the "easy direction") be good expanders. This is because, for the algebraic constructions that we will talk about in a minute, it is easier to reason about the eigenvalues of the adjacency matrix than about the number of edges in a generic cut. Alon, by proving the hard direction, showed that the approach via eigenvalues can be taken "with no loss of generality."

Others (Aldous and Diaconis? Jerrum and Sinclair? I never get this reference right) discovered the same result working on the problem of bounding the time that it takes for a random walk on certain graphs to converge to the stationary distribution. To answer such questions, one needs to know the eigenvalue gap of the graphs, a quantity that is difficult to estimate directly in important cases. For those graphs, however, it is possible to reason about the size of cuts, and hence, via the difficult connection, obtain results on the eigenvalue gap (and, thus, on the convergence time, or "mixing" time of a random walk).

This "difficult direction" result also has an important algorithmic application. The contrapositive statement is that if the largest and second largest eigenvalues are close in value, then the graph is not expanding, so one can cut off a set of vertices by deleting a comparatively small number of edges. The proof is algorithmic: given the eigenvector of the second largest eigenvalue, this small cut can be found efficiently. This is the principle at work in "spectral partitioning" algorithms, often used in practice (and, occasionally, in theory).

Well, that was a long digression, but it's also an interesting story. Back to our constructions of expander graphs, we want to describe a bounded-degree graph explicitly and prove that there is a noticeable gap between the largest and second largest eigenvalue. If the graph is a Cayley graph derived from a group $G$ and set of generators $D$ (and $D$ is chosen properly), it turns out that the eigenvalues of the adjacency matrix of the graph can be explicitly computed by looking at the representations of the group $G$. If $G$ is Abelian (that is, if the group operation is commutative), then the representation theory of $G$ is very simple and it is "just" the theory of Fourier transforms of functions $f: G \rightarrow C$ (where $C$ denotes the complex numbers), which is much easier than it sounds.
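For $Z_n$ this Fourier computation is completely explicit: the Cayley graph with connection set $D$ is a circulant graph, and its eigenvalues are $\lambda_k = \sum_{a \in D} e^{2\pi i k a/n}$, one per character. A sketch (using this circulant formula; names are mine) showing that the $n$-cycle has a vanishing eigenvalue gap, consistent with its terrible expansion:

```python
import cmath

def cayley_eigenvalues(n, D):
    """Eigenvalues of the Cayley graph of Z_n with connection set D:
    lambda_k = sum over a in D of exp(2*pi*i*k*a/n), for k = 0..n-1."""
    return sorted((sum(cmath.exp(2j * cmath.pi * k * a / n) for a in D).real
                   for k in range(n)), reverse=True)

# The n-cycle is the Cayley graph with D = {1, -1}: the largest eigenvalue
# is the degree 2, and the second largest is 2*cos(2*pi/n) -> 2 as n grows.
eigs = cayley_eigenvalues(100, [1, -1])
assert abs(eigs[0] - 2) < 1e-9
assert eigs[1] > 1.99   # tiny eigenvalue gap
```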

Unfortunately, a Cayley graph based on an Abelian group can never be a good expander. To see why, you'll have to believe my claim that in a good expander with $n$ vertices the diameter is at most $O(\log n)$. Consider now an $n$-vertex Cayley graph of degree $d$. Every vertex $x$ is reachable from vertex $0$ via a path of length $O(\log n)$. That is, $x$ is the sum of $O(\log n)$ elements of the set $D$. Because of commutativity, we can rearrange the sum so that it looks like

$(a_1+ \ldots + a_1) + (a_2 + \ldots + a_2) + \ldots + (a_d + \ldots + a_d)$

where $a_1,\ldots,a_d$ are the elements of $D$. We see that the sum is specified just by saying how many times each element of $D$ must be added, and so $x$ is specified by $d$ numbers, each between $0$ and $O(\log n)$. This gives $n \leq O((\log n)^d)$, which is a contradiction for fixed $d$ and sufficiently large $n$.

Indeed, we see that Cayley graphs based on Abelian groups fail very badly to have the required diameter condition: if the degree is $d$, there will be vertices at distance about $n^{1/d}$ from each other, a bound that is met by a $d$-dimensional grid.
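This $n^{1/d}$ behavior is easy to observe numerically. A sketch (names are mine) measuring, by breadth-first search, the diameter of the degree-4 Cayley graph of $Z_{900}$ with generators $\pm 1, \pm 30$; with $d=2$ generator "directions," the diameter comes out on the order of $\sqrt{900}=30$, far larger than $\log_2 900 \approx 10$:

```python
from collections import deque

def diameter(n, D):
    """Max distance from 0 in the Cayley graph of Z_n with connection set D."""
    dist = {0: 0}
    queue = deque([0])
    while queue:
        v = queue.popleft()
        for a in D:
            w = (v + a) % n
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())

# d = 2 "directions" (each with its inverse): diameter ~ sqrt(n), not log n.
d = diameter(900, [1, -1, 30, -30])
assert 21 <= d <= 30
```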

Intuitively, then, we can get an expanding bounded-degree Cayley graph only if we start from a "very non-Abelian" group. But how do we quantify the "Abelianity" of a group, and what kind of groups are "very non-Abelian"? This will be the subject of the next post.

Monday, December 04, 2006

Post-modern cryptography

Oded Goldreich has written an essay in response to two essays on "provable security" by Koblitz and Menezes. Oded says that "Although it feels ridiculous to answer [the claims of Koblitz and Menezes], we undertake to do so in this essay. In particular, we point out some of the fundamental philosophical flaws that underly the said article and some of its misconceptions regarding theoretical research in Cryptography in the last quarter of a century."

Neal Koblitz spoke here at IPAM in October on the somewhat related matter of how to interpret results in the Random Oracle model and in the Generic Group model. There is an audio file of his talk.