Pseudorandomness for Polynomials
I am currently in Hong Kong for my second annual winter break visit to the Chinese University of Hong Kong. If you are around, come to CUHK on Tuesday afternoon for a series of back-to-back talks by Andrej Bogdanov and me.
First, I'd like to link to this article by Gloria Steinem. (It's old but I have been behind with my reading.) I believe this presidential campaign will bring up serious reflections on issues of gender and race, and I look forward to the rest of it.
Secondly, I'd like to talk about pseudorandomness against low-degree polynomials.
Naor and Naor constructed in 1990 a pseudorandom generator whose output is pseudorandom against tests that compute affine functions over $F_2$. Their construction maps a seed of length $O(\log (n/\epsilon))$ into a string in $F_2^n$ such that if $L: F_2^n \to F_2$ is an arbitrary affine function, $X$ is the distribution of outputs of the generator, and $U$ is the uniform distribution over $F_2^n$, then we have
(1) $ | Pr [ L(X)=1] - Pr [ L(U)=1] | \leq \epsilon $
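To make definition (1) concrete, here is a minimal Python sketch (not from the post) that brute-forces the bias of a candidate output multiset against every non-constant affine test; it is only feasible for very small $n$, and the multiset is supplied explicitly rather than produced by an actual Naor-Naor construction.

```python
from itertools import product

def affine_bias(outputs, n):
    """Largest deviation |Pr[L(x)=1] - 1/2| over all non-constant affine tests
    L(x) = a.x + b, where x ranges over the given multiset of n-bit strings."""
    worst = 0.0
    for a in product((0, 1), repeat=n):
        if not any(a):
            continue  # skip constant tests; against them (1) is trivial
        for b in (0, 1):
            hits = sum((sum(ai * xi for ai, xi in zip(a, x)) + b) % 2
                       for x in outputs)
            p = hits / len(outputs)
            worst = max(worst, abs(p - 0.5))  # Pr[L(U)=1] = 1/2 for non-constant L
    return worst

# Toy usage: the full cube {0,1}^3 has bias 0; removing one point creates bias.
cube = list(product((0, 1), repeat=3))
print(affine_bias(cube, 3))        # 0.0
print(affine_bias(cube[:-1], 3))   # > 0
```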
This has numerous applications, and it is related to other problems. For example, if $C\subseteq F_2^m$ is a linear error-correcting code with $2^k$ codewords, and if it is such that any two codewords differ in at least a $\frac 12 - \epsilon$ fraction of coordinates, and in at most a $\frac 12 + \epsilon$ fraction, then one can derive from the code a Naor-Naor generator mapping a seed of length $\log m$ into an output of length $k$. (It is a very interesting exercise to figure out how.) Here is another connection: let $S$ be the (multi)set of outputs of a Naor-Naor generator over all possible seeds, and consider the Cayley graph constructed over the additive group of $F_2^n$ using $S$ as a set of generators. (That is, take the graph that has a vertex for every element of $\{0,1\}^n$, and an edge between $u$ and $u+s$ for every $s\in S$, where operations are mod 2 and componentwise.) Then this graph is an expander: the largest eigenvalue is $|S|$, the degree, and all other eigenvalues are at most $\epsilon |S|$ in absolute value. (Here too it's worth figuring out the details by oneself. The hint is that in a Cayley graph of an abelian group the eigenvectors are always the characters, regardless of which generators are chosen.) In turn this means that if we pick $X$ uniformly and $Y$ according to a Naor-Naor distribution, and if $A\subseteq F_2^n$ is a reasonably large set, then the events $X\in A$ and $X+Y \in A$ are nearly independent. This wouldn't be easy to argue directly from definition (1), and it is an example of the advantages of this connection.
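As a hint toward the exercise about the spectrum, here is a sketch of the eigenvalue computation (standard material, not spelled out in the post). Write $\beta = \max_{a \neq 0} | E[(-1)^{a \cdot X}] |$ for the bias of the generator measured multiplicatively; under definition (1) we have $\beta \leq 2\epsilon$. For every $a \in F_2^n$, the character $\chi_a(x) = (-1)^{a \cdot x}$ is an eigenvector of the adjacency matrix, because

$ \sum_{s \in S} \chi_a(x+s) = \left( \sum_{s \in S} (-1)^{a \cdot s} \right) \cdot \chi_a(x) $

so the eigenvalue associated to $\chi_a$ is $\lambda_a = \sum_{s \in S} (-1)^{a \cdot s} = |S| \cdot E_{s \in S}[(-1)^{a \cdot s}]$, which equals $|S|$ for $a = 0$ and has absolute value at most $\beta |S|$ for $a \neq 0$.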
There is more. If $f: \{0,1\}^n \rightarrow \{0,1\}$ is such that the sum of the absolute values of the Fourier coefficients is $t$, $X$ is a Naor-Naor distribution, and $U$ is uniform, we have
$ | Pr [ f(X)=1] - Pr [ f(U)=1] | \leq t \epsilon $
and so a Naor-Naor distribution is pseudorandom against $f$ too, provided $t$ is not too large. This has a number of applications: Naor-Naor distributions are pseudorandom against tests that look only at a bounded number of bits, against functions computable by read-once branching programs of width 2, and so on.
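The inequality follows in one line from the Fourier expansion; here is a sketch, again writing $\beta = \max_{a \neq 0} |E[(-1)^{a \cdot X}]| \leq 2\epsilon$ for the multiplicative bias of $X$ (so the constant in front depends on which normalization of the bias one adopts). Expanding $f(x) = \sum_{S} \hat f(S) (-1)^{\sum_{i \in S} x_i}$ and using $E[f(U)] = \hat f(\emptyset)$,

$ Pr[f(X)=1] - Pr[f(U)=1] = E[f(X)] - E[f(U)] = \sum_{S \neq \emptyset} \hat f(S) \cdot E\left[ (-1)^{\sum_{i \in S} X_i} \right] $

and each expectation on the right-hand side has absolute value at most $\beta$, so the whole sum is at most $\beta \cdot \sum_S |\hat f(S)| \leq \beta t$ in absolute value.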
Given all these wonderful properties, it is natural to ask whether we can construct generators that are pseudorandom against quadratic polynomials over $F_2$, and, more generally, against low-degree polynomials. This question has been open for a long time. Luby, Velickovic, and Wigderson constructed such a generator with seed length $2^{O((\log n)^{1/2})}$, using the Nisan-Wigderson methodology, and this was not improved upon for more than ten years.
When dealing with polynomials, several difficulties arise that are not present when dealing with linear functions. One is the loss of the correspondence between pseudorandomness against linear functions and Fourier analysis; until the development of Gowers uniformity there was no analogous analytical tool for reasoning about pseudorandomness against polynomials (and even Gowers uniformity is unsuitable for reasoning about very small sets). Another difference is that, in Equation (1), we know that $Pr [L(U)=1] = \frac 12$, except for the constant functions (against which pseudorandomness is trivial). This means that in order to prove (1) it suffices to show that $Pr[L(X)=1] \approx \frac 12$ for every non-constant $L$. When we deal with a quadratic polynomial $p$, the value $Pr [p(U)=1]$ can be anywhere between $1/4$ and $3/4$ (for non-constant polynomials; for instance, $Pr[x_1 x_2 = 1] = 1/4$), and so we cannot simply prove that $Pr[p(X)=1]$ is close to a certain known value.
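For reference, here is the standard definition of the dimension-$d$ Gowers uniformity norm, which the post uses without spelling it out: for a bounded function $f: F_2^n \rightarrow R$,

$ \| f \|_{U^d}^{2^d} = E_{x, y_1, \ldots, y_d} \left[ \prod_{S \subseteq \{1,\ldots,d\}} f\left(x + \sum_{i \in S} y_i\right) \right] $

and the Gowers uniformity of a polynomial $p$ is understood as the $U^d$ norm of $(-1)^p$. A degree-$d$ polynomial satisfies $\| (-1)^p \|_{U^{d+1}} = 1$, and having small $U^d$ norm is a strong quantitative form of being unbiased.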
A first breakthrough with this problem came with the work of Bogdanov on the case of large fields. (Above I stated the problem for $F_2$, but it is well defined for every finite field.) I don't completely understand his paper, but one of the ideas is that if $p$ is an absolutely irreducible polynomial (meaning it does not factor even in the algebraic closure of $F$), then $p(U)$ is close to uniform over the field $F$; so to analyze his generator construction in this setting one "just" has to show that $p(X)$ is nearly uniform, where $X$ is the output of his generator. If $p$ factors then somehow one can analyze the construction "factor by factor," or something to this effect. This approach, however, is not promising for the case of small fields: over $F_2$, for instance, the absolutely irreducible polynomial $x_1 x_2 + x_3 x_4$ equals 1 with probability only $3/8$, so it has noticeable bias.
The breakthrough for the boolean case came with the recent work of Bogdanov and Viola. Their starting point is the proof that if $X$ and $Y$ are two independent Naor-Naor generators, then $X+Y$ is pseudorandom for quadratic polynomials. To get around the unknown bias problem, they divide the analysis into two cases. First, it is known that, up to affine transformations, a quadratic polynomial can be written as $x_1x_2 + x_3x_4 + \cdots + x_{k-1} x_k$, so, since applying an affine transformation to a Naor-Naor generator gives a Naor-Naor generator, we may assume our polynomial is in this form.
- Case 1: if $k$ is small, then the polynomial depends on few variables, and so even just one Naor-Naor distribution is going to be pseudorandom against it;
- Case 2: if $k$ is large, then the polynomial has very low bias, that is, $Pr[p(U)=1] \approx \frac 12$. This means that it is enough to prove that $Pr[p(X+Y)=1] \approx \frac 12$, which can be done using (i) Cauchy-Schwarz, (ii) the fact that $U$ and $U+X$ are nearly independent if $U$ is uniform and $X$ is Naor-Naor, and (iii) the fact that for fixed $y$ the function $x \rightarrow p(x+y) - p(x)$ is affine, because the quadratic part cancels. (A toy numerical check of the statement being proved is sketched right after this list.)
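This is not the Bogdanov-Viola argument, but the following Python sketch shows what the statement asserts on a toy scale; as an assumption, a random multiset stands in for the output of an actual Naor-Naor generator (a random set of this size is small-biased with high probability), and everything is brute-forced over a handful of variables.

```python
import random
from itertools import product

def bias_against(points, test):
    """|Pr[test(x)=1] over the multiset - Pr[test(x)=1] under uniform|, by brute force."""
    n = len(points[0])
    cube = list(product((0, 1), repeat=n))
    p_points = sum(test(x) for x in points) / len(points)
    p_uniform = sum(test(x) for x in cube) / len(cube)
    return abs(p_points - p_uniform)

def p(x):
    # a fixed quadratic in the canonical form: x1*x2 + x3*x4 (mod 2)
    return (x[0] * x[1] + x[2] * x[3]) % 2

random.seed(0)
n = 4
# stand-in for the output multiset of a Naor-Naor generator (assumption:
# a random set of this size is small-biased with high probability)
S = [tuple(random.randint(0, 1) for _ in range(n)) for _ in range(64)]
# the multiset of all pairwise sums x+y, i.e. the distribution of X+Y
sums = [tuple((a + b) % 2 for a, b in zip(x, y)) for x in S for y in S]
print(bias_against(S, p), bias_against(sums, p))  # both should be small
```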
Now, it would be nice if every degree-3 polynomial could be written, up to affine transformations, as $x_1x_2 x_3 + x_4x_5x_6 + \cdots$, but there is no such characterization, so one has to find the right way to generalize the argument.
In their paper, Bogdanov and Viola prove the following:
- Case 1: if a degree-$d$ polynomial $p$ is correlated with a polynomial of degree $d-1$, and if $R$ is a distribution that is pseudorandom against degree-$(d-1)$ polynomials, then $R$ is also pseudorandom against $p$;
- Case 2: if a degree-$d$ polynomial $p$ has small dimension-$d$ Gowers uniformity norm, then $Pr [p(U)=1] \approx \frac 12$ (this was already known; see the note right after this list), and if $R$ is pseudorandom for degree-$(d-1)$ polynomials and $X$ is a Naor-Naor distribution, then $Pr[p(R+X)=1] \approx \frac 12$ too.
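The "already known" step in Case 2 has a one-line explanation via the standard monotonicity of the Gowers norms (my gloss, not a quote from the paper):

$ | E[(-1)^{p(U)}] | = \| (-1)^p \|_{U^1} \leq \| (-1)^p \|_{U^2} \leq \cdots \leq \| (-1)^p \|_{U^d} $

so if the dimension-$d$ uniformity norm of $(-1)^p$ is small, then the bias of $p$ is small, that is, $Pr[p(U)=1] \approx \frac 12$.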
There is a gap between the two cases, because Case 1 requires correlation with a polynomial of degree $d-1$ and Case 2 requires small Gowers uniformity $U^d$. The Gowers norm inverse conjecture of Green and Tao is that a noticeably large $U^d$ norm implies a noticeable correlation with a degree-$(d-1)$ polynomial, and so it fills the gap. The conjecture was proved by Samorodnitsky for $d=3$ in the boolean case, and by Green and Tao for $d=3$ over larger fields. Assuming the conjecture, the two cases combine to give an inductive proof that if $X_1,\ldots,X_d$ are $d$ independent Naor-Naor distributions then $X_1+\cdots+X_d$ is pseudorandom for every degree-$d$ polynomial.
Unfortunately, Green and Tao, and Lovett, Meshulam, and Samorodnitsky, proved that the Gowers inverse conjecture (as stated above) fails for $d\geq 4$ in the boolean case.
Lovett has given a different argument to prove that the sum of Naor-Naor generators is pseudorandom for low-degree polynomials. His analysis also splits into two cases, but the cases are defined based on the largest Fourier coefficient of the polynomial, rather than on its Gowers uniformity. (Thus, his analysis does not differ from the Bogdanov-Viola analysis for quadratic polynomials, because the dimension-2 Gowers uniformity measures the largest Fourier coefficient, but it does differ when $d\geq 3$.) Lovett's analysis only shows that $X_1 +\cdots + X_{2^{d-1}}$ is pseudorandom for degree-$d$ polynomials, where $X_1,\ldots,X_{2^{d-1}}$ are $2^{d-1}$ independent Naor-Naor distributions, compared to the $d$ that would have sufficed in the conjectural analysis of Bogdanov and Viola.
The last word on this problem (for now) is this paper by Viola, where he shows that the sum of $d$ independent Naor-Naor generators is indeed pseudorandom for degree-$d$ polynomials.
Again, there is a case analysis, but this time the cases depend on whether or not $Pr [p(U)=1] \approx \frac 12$.
If $p(U)$ is noticeably biased (this corresponds to the case of small $k$ in the quadratic example above), then it follows from the earlier Bogdanov-Viola analysis that a distribution that is pseudorandom against degree-$(d-1)$ polynomials is also pseudorandom against $p$.
The other case is when $p(U)$ is nearly unbiased, and we want to show that $p(X_1+\cdots+X_d)$ is nearly unbiased as well. Note how weak this assumption is compared to the assumption that $p$ has small dimension-$d$ Gowers norm (in Bogdanov-Viola) or that all Fourier coefficients of $p$ are small (in Lovett). The same three tools that work in the quadratic case, however, work here too, in a surprisingly short proof.
2 Comments:
1/16/2008 03:33:00 PM
How about a post on what a beginning grad student can read to get up to speed on this area? It seems there are no surveys or lecture notes yet, and obtaining the necessary math background is difficult without knowing which math is needed (and which books are good for this purpose). Or do you recommend just diving in to the papers and learning the background as needed?
1/18/2008 05:27:00 AM
You have described one generalization of Naor and Naor's work: moving from pseudo-randomness with respect to linear functions to pseudo-randomness wrt polynomials.
What about another (maybe easier) generalization: pseudo-randomness wrt several linear functions simultaneously?
For instance, if we have two (distinct, non identically zero) linear functions f and g, one would expect each of the 4 possible values of the pair (f(x),g(x)) to be taken with probability approximately 1/4.