In the previous

random kSAT post we saw that for every $k$ there is a constant $c_k$ such that

- A random kSAT formula with $n$ variables and $m$ clauses is conjectured to be almost surely satisfiable when $m/n < c_k - \epsilon$ and almost surely unsatisfiable when $m/n > c_k + \epsilon$;

- There is an algorithm that is conjectured to find satisfying assignments with high probability when given a random kSAT formula with $n$ variables and fewer than $(c_k - \epsilon) n$ clauses.

So, conjecturally, the probability of satisfiability of a random kSAT formula has a sudden jump at a certain threshold value of the ratio of clauses to variables, and in the regime where the formula is likely to be satisfiable, the kSAT problem is easy-on-average.

What about the regime where the formula is likely to be unsatisfiable? Is the problem still easy on average? And what would that exactly mean? The natural question about average-case complexity is: is there an efficient algorithm that, in the unsatisfiable regime, finds with high probability a

*certifiably correct* answer? In other words, is there an algorithm that efficiently delivers a

*proof of unsatisfiability* given a random formula with $m$ clauses and $n$ variables, $m> (c_k + \epsilon) n$?

Some non-trivial algorithms, that I am going to describe shortly, find such unsatisfiability proofs but only in regimes of fairly high density. It is also known that certain broad classes of algorithms fail for all constant densities. It is plausible that finding unsatisfiability proofs for random kSAT formulas with any constant density is an intractable problem. If so, its intractability has a number of interesting consequences, as

shown by Feige.

A first observation is that if we have an unsatisfiable 2SAT formula then we can easily prove its unsatisfiability, and so we may try to come with some kind of reduction from 3SAT to 2SAT. In general, this is of course hopeless. But consider a random 3SAT formula $\phi$ with $n$ variables and $10 n^2$ clauses. Now, set $x_1 \leftarrow 0$ in $\phi$, and consider the resulting formula $\phi'$. The variable $x_1$ occurred in about $30 n$ clauses, positively in about $15 n$ of them (which have now become 2SAT clauses in $\phi'$) and negatively in about $15 n$ clauses, that have now disappeared in $\phi'$. Let's look at the 2SAT clauses of $\phi'$: there are about $15 n$ such clauses, they are random, so they are extremely likely to be unsatisfiable, and, if so, we can easily prove that they are. If the 2SAT subset of $\phi'$ is unsatisfiable, then so is $\phi'$, and so we have a proof of unsatisfiability for $\phi'$.

Now set $x_1 \leftarrow 1$ in $\phi$, thus constructing a new formula $\phi''$. As before, the 2SAT part of $\phi''$ is likely to be unsatisfiable, and, if so, its unsatisfiability is easily provable in polynomial time.

Overall, we have that we can prove that $\phi$ is unsatisfiable when setting $x_1 \leftarrow 0$, and also unsatisfiable when setting $x_1\leftarrow 1$, and so $\phi$ is unsatisfiable.

This works when $m$ is about $n^2$ for 3SAT, and when $m$ is about $n^{k-1}$ for kSAT. By fixing $O(\log n)$ at a time it is possible to shave another polylog factor. These idea is due to Beame, Karp, Pitassi, and Saks.

A limitation of this approach is that it produces polynomial-size

*resolution* proofs of unsatisfiability and, in fact

*tree-like resolution* proofs. It is known that polynomial-size resolution proofs do not exist for random 3SAT formulas with fewer than $n^{1.5-\epsilon}$ clauses, and tree-like resolution proofs do not exist even when the number of clauses is just less than $n^{2-\epsilon}$. This is a limitation that afflicts all backtracking algorithms, and so all approaches of the form ``let's fix some variables, then apply the 2SAT algorithm.'' So something really different is needed to make further progress.

Besides the 2SAT algorithm, what other algorithms do we have to prove that

*no solution exists* for a given problem? There are algorithms for linear and semidefinite programming, and there is Gaussian elimination. We'll see how they can be applied to random kSAT in the next theory post.