Wednesday, October 18, 2006

The limitations of linear and semidefinite programming

The fact that Linear Programming (LP) can be solved in polynomial time (and, also, efficiently in practice) and that it has such a rich geometric theory and such remarkable expressive power makes LP a powerful unifying concept in the theory of algorithms. It "explains" the existence of polynomial time algorithms for problems such as Shortest Paths and Min Cut, and if one thinks of the combinatorial algorithms for such problems as algorithms for the corresponding linear programs, one gains additional insights.

When looking for algorithms for a new combinatorial problem, a possible approach is to express the problem as a 0/1 integer program, then relax it to a linear program, by letting variables range between 0 and 1, and then hope for the best. "The best" being the lucky event that the value of the optimum of the relaxation is the same as that of the combinatorial program, or at least a close approximation. If one finds that, instead, the relaxation has optimal fractional solutions of cost very different from the combinatorial optimum, one may try to add further inequalities that are valid for 0/1 solutions and that rule out the problematic fractional solutions.
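To make the recipe concrete, here is a minimal sketch in Python (a toy example of my own, not taken from any of the papers mentioned in this post): the Vertex Cover linear program on a triangle, solved with scipy. The 0/1 optimum is 2, but the relaxation already drops to 3/2, which is exactly the kind of gap one then tries to close by adding further valid inequalities.

```python
import numpy as np
from scipy.optimize import linprog

# Vertex Cover on a triangle: minimize x_0 + x_1 + x_2 subject to
# x_i + x_j >= 1 on each edge. The 0/1 optimum is 2; the relaxation
# (0 <= x_i <= 1) achieves 3/2 at x = (1/2, 1/2, 1/2).
edges = [(0, 1), (1, 2), (0, 2)]
n = 3
c = np.ones(n)                        # objective: minimize the sum of the x_i
A_ub, b_ub = [], []
for i, j in edges:                    # x_i + x_j >= 1 rewritten as -x_i - x_j <= -1
    row = np.zeros(n)
    row[i], row[j] = -1, -1
    A_ub.append(row)
    b_ub.append(-1.0)

res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(0, 1)] * n, method="highs")
print(res.fun)                        # 1.5, strictly below the combinatorial optimum 2
```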

Many "P=NP" papers follow this approach, usually by presenting a polynomial-size linear programming relaxation of TSP and then "proving" that the optimum of the relaxation is the same as the combinatorial optimum. One can find recent examples here and here.

Similar results were "proved" in a notorious series of papers by Ted Swart in the mid 1980s. After counterexamples were found, he would revise the paper, adding more inequalities that ruled out the counterexamples.

Finally, Mihalis Yannakakis took matters into his own hands and proved that all "symmetric" relaxations of TSP of sub-exponential size have counterexamples on which the optimum of the relaxation is different from the combinatorial optimum. (All of the LPs suggested by Swart were "symmetric" according to Yannakakis's definition.)

This is actually one of the few known lower bounds that applies to a "model of computation" capturing a general (and otherwise promising) class of algorithms.

(I first read about this story in Christos Papadimitriou's complexity book, but I found the above references in Gerhard J Woeginger's P versus NP page.)

In the theory of approximation algorithms, we have plenty of problems that are believed to be intractable but that are not known to be NP-hard, such as approximating Vertex Cover within a factor of 1.9, or approximating Sparsest Cut within a factor of 10. LP and Semidefinite Programming (SDP) approaches are more or less the only promising tools we have to prove that such problems are tractable and, while we wait for NP-hardness results (for now, we only have "Unique-Games-hardness"), it is good to see whether certain candidate LP and SDP algorithms have any chance, or whether they admit counterexamples showing large gaps between the optimum of the relaxation and the combinatorial optimum.

The problem with this approach is the nearly endless variety of relaxations that one can consider: what happens when we add triangle inequalities? and pentagonal inequalities? and so on. As in the case of Yannakakis's result, it would be great to have a result that says "all SDP relaxations of Vertex Cover of type X fail to achieve an approximation ratio smaller than 2," where "type X" is a general class of sub-exponential size SDP relaxations that include the type of inequalities that people use "in practice."

Lovasz and Schrijver describe a method, denoted LS+, that starts from an LP relaxation of a problem (actually it can start from any convex relaxation), and then turns it into tighter and tighter SDP relaxations, by adding auxiliary variables and linear and semidefinite constraints. A weaker version of the method, denoted LS, only adds auxiliary variables and linear constraints.

A nice thing about the method is that, after you apply it to your initial relaxation, thus getting a tighter relaxation, you can then apply it again to the tighter one, thus getting an even better relaxation, and so on. Starting from an LP relaxation with $n$ variables and poly($n$) constraints, $k$ applications of the method yield a relaxation solvable in $n^{O(k)}$ time, which is polynomial for every fixed $k$ and sub-exponential for $k=o(n/\log n)$. Lovasz and Schrijver prove that, after $k$ applications (or "rounds"), the resulting relaxation enforces all inequalities over $k$-tuples of variables that are valid for 0/1 solutions. (In particular, one gets the combinatorial optimum after $n$ rounds.) Typical approaches in the design of approximation algorithms are SDPs with local inequalities (triangle inequalities etc.), and all of this is captured after a few rounds of LS+.

It would be great to show that no constant (ideally, no sublinear) number of rounds of LS+ starting from the basic LP relaxation gives a $2-\epsilon$ approximation for Vertex Cover. Arora, Bollobas, and Lovasz considered related questions in a FOCS 2002 paper that has inspired a considerable amount of later work. (See the introduction of the journal version.) Unfortunately the question remains open even for two rounds of LS+. After one round, one gets an SDP relaxation equivalent to (the number of vertices minus) the Lovasz Theta function, and Goemans and Kleinberg prove that such an SDP does not achieve an approximation better than 2. Beyond that, it is pretty much terra incognita. Charikar proves that a relaxation with triangle inequalities (which is incomparable with two rounds of LS+ and is weaker than three rounds) does not achieve an approximation better than 2. Also, a sublinear number of rounds of LS+ does not achieve an approximation better than 7/6. For LS, which, I remind you, generates linear programming relaxations rather than semidefinite programming ones, we know that no sublinear number of rounds leads to an approximation better than 2.

I will illustrate the main idea in the LS and LS+ methods using the example of Vertex Cover. In the linear programming formulation, we have variables $x_i$, one for each vertex $i$, and the constraints that $x_i + x_j \geq 1$ for each edge $(i,j)$ and that $0\leq x_i \leq 1$. We would like to add constraints that are only satisfied by 0/1 solutions, and the constraint $x_i^2 = x_i$ would work beautifully except that it is not linear (nor convex). Instead, Lovasz and Schrijver add new variables $y_{i,j}$, one for each pair of vertices, with the idea that we would like to have $y_{i,j}=x_i \cdot x_j$; then they add the requirement $y_{i,i} = x_i$ and various inequalities that make the $y_{i,j}$ be "sort of like" $x_i \cdot x_j$. In particular, we would like to require that if $x_k \neq 0$, then $x_i = y_{i,k}/x_k$. This is again a non-linear constraint, but, at least, we can check whether, for each fixed $k$, the vector $(y_{i,k}/x_k)_i$ is a fractional vertex cover: we just need to check the inequalities $y_{i,k} + y_{j,k} \geq x_k$ for each edge $(i,j)$.

Similarly, if $x_k \neq 1$, we expect to have $x_i = (x_i-y_{i,k})/(1-x_k)$. We cannot check it directly, but we can check if
$x_i - y_{i,k} + x_j - y_{j,k} \geq 1-x_k$
holds for each edge $(i,j)$. Finally, we obviously want $y_{i,j} = y_{j,i}$. This describes the LS method. For LS+, we add the requirement that the symmetric matrix $(y_{i,j})$ be positive semidefinite. Being positive semidefinite means that there are vectors $b_1,\ldots,b_n$ such that
$y_{i,j} = \langle b_i , b_j \rangle$
where $\langle \cdot,\cdot \rangle$ denotes the inner product. If $y_{i,j} = x_i \cdot x_j$ then $(y_{i,j})$ is clearly positive semidefinite.
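To make the lifting concrete, here is a small sketch of one round applied to the Vertex Cover LP, written with the cvxpy modeling library (the choice of library, the function name, and the tiny test instance are mine; the constraints simply follow the informal description above, including imposing positive semidefiniteness on the matrix $(y_{i,j})$ itself).

```python
import cvxpy as cp

def ls_plus_one_round_vertex_cover(n, edges):
    """One round of the lifting described above, applied to the Vertex Cover LP.

    A sketch under the post's conventions: variables x_i and y_{i,j}, with the
    intention that y_{i,j} behaves like x_i * x_j.  Drop the PSD constraint on Y
    to get the weaker LS relaxation instead of LS+.
    """
    x = cp.Variable(n)
    Y = cp.Variable((n, n), symmetric=True)   # y_{i,j} = y_{j,i}

    cons = [x >= 0, x <= 1,
            cp.diag(Y) == x,                  # y_{i,i} = x_i
            Y >> 0]                           # LS+: the matrix (y_{i,j}) is PSD
    for i, j in edges:
        cons.append(x[i] + x[j] >= 1)         # the base LP constraints
    for k in range(n):
        for i, j in edges:
            # if x_k != 0, the vector y_{.,k}/x_k should again be a vertex cover
            cons.append(Y[i, k] + Y[j, k] >= x[k])
            # if x_k != 1, the vector (x - y_{.,k})/(1 - x_k) should again be one
            cons.append(x[i] - Y[i, k] + x[j] - Y[j, k] >= 1 - x[k])

    prob = cp.Problem(cp.Minimize(cp.sum(x)), cons)
    prob.solve()
    return prob.value

# e.g. on the triangle: the basic LP gives 3/2, and one can check what one round buys.
print(ls_plus_one_round_vertex_cover(3, [(0, 1), (1, 2), (0, 2)]))
```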

3 Comments:

  1. Anonymous
    10/18/2006 03:53:00 PM

    Note that it would be a very good result if someone proved that after $\Omega_{\epsilon}(n)$ rounds of the LS+ operator, we get an integrality gap of $2-\epsilon$ for the Vertex Cover problem.

    This would provide (yet another) strong indication that VC is $(2-\epsilon)$-hard to approximate (unless P=NP).

     
  2. Anonymous
    10/18/2006 09:35:00 PM

    Luca, great post!



    I would only argue that, in a field where we order author names alphabetically, one should rely less on Author1 et al. citations, especially if the paper has only three authors. I think "Arora and others" is perhaps slightly strange sounding... who are those others? Oh yes, Bollobas and Lovasz (no slackers). But it's only a blog! Yes, so space is less expensive here. Maybe this is a ridiculous comment... who cares? Somehow, it seems like it matters.

     
  3. Anonymous
    10/31/2006 04:00:00 PM

    Wonderful post, thanks Luca!

     
