Unique Games Conjecture

Very recently, Subhash Khot won the Rolf Nevanlinna Prize, considered one of the top honours in the field of mathematics, for his contributions to computational complexity theory, most notably the Unique Games Conjecture (UGC). The conjecture has broad applications in the theory of hardness of approximation and is unusual in the sense that, unlike the $\textup{P} \stackrel{?}{=} \textup{NP}$ problem, the academic world seems evenly divided on whether it is true.

“Some very natural, intrinsically interesting statements about things like voting and foams just popped out of studying the UGC…. Even if the UGC turns out to be false, it has inspired a lot of interesting math research.”

—Ryan O’Donnell

This post is very basic and targeted towards anyone who knows what the complexity classes $\textup{P}$, $\textup{NP}$, $\textup{NP}$-Hard and $\textup{NP}$-Complete mean.

Assuming $\textup{P} \neq \textup{NP}$, researchers started exploring ways of finding near-optimal solutions efficiently. However, as it turns out, some $\textup{NP}$-Complete optimization problems cannot be approximated beyond a particular factor. Perhaps an example will highlight this point.

Approximation Algorithms

Any $\textup{NP}$-optimization problem $\Pi$ is either a minimization or a maximization problem. For a minimization problem, each instance $I$ of $\Pi$ has a non-empty set of feasible solutions, each of which is assigned an objective value. Our goal is to come up with the solution whose objective value is lowest. Let us call such a solution an optimal solution and denote its value by $\textup{OPT}(I)$. We wish to come up with a solution whose value is as close to $\textup{OPT}(I)$ as possible (but no smaller, since $\Pi$ is a minimization problem). Suppose an approximation algorithm $\mathcal{A}$ always outputs a solution whose value is at most $\alpha$ times $\textup{OPT}(I)$ ($\alpha > 1$). We say that $\mathcal{A}$ is an $\alpha$-factor approximation algorithm. Similar definitions hold for maximization problems as well. We will now prove a hardness of approximation result for TSP.
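
For concreteness, the guarantee for a minimization problem can be written as a pair of inequalities (this is just a restatement of the definition above):

$$\textup{OPT}(I) \;\le\; \mathcal{A}(I) \;\le\; \alpha \cdot \textup{OPT}(I) \quad \text{for every instance } I \text{ of } \Pi,$$

where $\mathcal{A}(I)$ denotes the objective value of the solution returned by $\mathcal{A}$; for a maximization problem the guarantee becomes $\textup{OPT}(I)/\alpha \le \mathcal{A}(I) \le \textup{OPT}(I)$.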

Example: Travelling Salesman Problem (TSP)

We will show that it is hard to approximate TSP to within any approximation factor. To prove this, we will transform the Hamiltonian Cycle Problem to TSP.

TSP: Given a complete weighted undirected graph, find a minimum-weight tour that visits each vertex exactly once.
Hamiltonian Cycle Problem: Given a graph $G$, does there exist a simple cycle that visits all the vertices of $G$ exactly once?

Given an instance $G$ of the Hamiltonian Cycle Problem, construct an instance $H$ of TSP as follows:

  1. $V(H) = V(G)$.

  2. $H$ is a complete graph.

  3. For all edges $e \in E(G)$, $w_e(H)=1$.

  4. For all other edges, $w_e(H)=\alpha n$, where $n=|V(G)|$.

If $G$ has a Hamiltonian cycle, $\textup{OPT}(H)=n$. Otherwise, any tour must include at least one edge of weight $\alpha n$. Hence, $\textup{OPT}(H) > \alpha n$.

Suppose there is an $\alpha$-factor approximation algorithm for TSP. We can then reduce the Hamiltonian Cycle Problem to TSP and decide it as follows. If $G$ has a Hamiltonian cycle, $\textup{OPT}(H)=n$, so the algorithm outputs a tour of weight at most $\alpha n$. Otherwise, $\textup{OPT}(H) > \alpha n$, which implies the tour the algorithm outputs has weight greater than $\alpha n$. The gap between the YES and NO instances of the Hamiltonian Cycle Problem can thus be detected efficiently, which is not possible unless $\textup{P} = \textup{NP}$. Hence, it is hard to approximate TSP to within a factor of $\alpha$, for any $\alpha$.
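
Here is a minimal sketch of this reduction in Python, assuming access to a hypothetical $\alpha$-factor TSP approximation algorithm `approx_tsp` (which, as just argued, cannot exist unless $\textup{P} = \textup{NP}$); the construction of $H$ is exactly the one described above:

```python
import itertools

def hamiltonian_cycle_via_tsp_approx(G_vertices, G_edges, alpha, approx_tsp):
    """Decide the Hamiltonian Cycle Problem on G, given a hypothetical
    alpha-factor TSP approximation algorithm `approx_tsp`.

    G_edges is a set of frozensets {u, v}; `approx_tsp` is assumed to take
    the vertex set and a weight dictionary over all vertex pairs and return
    the weight of a tour that is at most alpha * OPT.
    """
    n = len(G_vertices)
    # Build the complete graph H: weight 1 on edges of G, alpha*n elsewhere.
    weights = {}
    for u, v in itertools.combinations(G_vertices, 2):
        e = frozenset((u, v))
        weights[e] = 1 if e in G_edges else alpha * n
    tour_weight = approx_tsp(G_vertices, weights)
    # YES instances give a tour of weight <= alpha*n; NO instances give > alpha*n.
    return tour_weight <= alpha * n
```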

Reduction

Let $\Pi$ be a minimization problem. A gap-introducing reduction maps an instance $\phi$ of SAT to an instance $x$ of $\Pi$ such that

  • If $\phi$ is satisfiable, then $\textup{OPT}(x) \le f(x)$, and

  • If $\phi$ is not satisfiable, then $\textup{OPT}(x) > \alpha f(x)$.

Obviously, $\alpha \ge 1$. Such a gap-introducing reduction immediately implies an inapproximability factor of $\alpha$ for $\Pi$.
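
The argument is the same as in the TSP example: an $\alpha$-factor approximation algorithm $\mathcal{A}$ would let us decide SAT in polynomial time by comparing its output with $\alpha f(x)$, since

$$\phi \text{ satisfiable} \implies \mathcal{A}(x) \le \alpha \cdot \textup{OPT}(x) \le \alpha f(x), \qquad \phi \text{ not satisfiable} \implies \mathcal{A}(x) \ge \textup{OPT}(x) > \alpha f(x).$$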

One difficulty with this approach is blowing up an “additive” gap into a “multiplicative” gap.

PCP Theorem

A probabilistic characterization of the class $\textup{NP}$ yields a general technique for gap-introducing reductions. Informally speaking, a probabilistically checkable proof for an $\textup{NP}$ language is a proof whose validity can be checked probabilistically by examining only a few of its bits. A probabilistically checkable proof system comes with two parameters: (a) $r(n)$: the number of random bits required by the verifier, and (b) $q(n)$: the number of bits of the proof the verifier is allowed to examine.

A language $L \in \textup{PCP}(r(n),q(n))$ if there is a verifier $V$ that, on input $x$, obtains a random string of length $c \cdot r(|x|)$ and queries $d \cdot q(|x|)$ bits of the proof (where $c$ and $d$ are constants), such that:

  • If $x \in L$, then there is a proof which the verifier accepts with probability 1, and

  • If $x \notin L$, then every proof is accepted with probability $< \frac{1}{2}$.

The PCP Theorem gives another characterization of the class $\textup{NP}$.

PCP Theorem: $\textup{NP}=\textup{PCP}(\log n, 1)$

One direction of the proof, $\textup{PCP}(\log n, 1) \subseteq \textup{NP}$, is easy (try proving it as a small exercise). The other direction is the result of years of research by various CS theorists. For an excellent exposition of the history of the PCP Theorem, refer here. Fortunately, the theorem, modulo its proof, is sufficient to derive hardness results.
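
As a hint for the exercise, here is a minimal sketch of the easy containment: since the verifier uses only $O(\log n)$ random bits, an $\textup{NP}$ machine can guess the proof and then deterministically run the verifier on all polynomially many random strings. The `verifier` function and its signature are assumptions made purely for illustration.

```python
from itertools import product

def accepts_with_certainty(x, proof, verifier, num_random_bits):
    """Deterministic check underlying PCP(log n, 1) ⊆ NP.

    `verifier(x, proof, r)` is a hypothetical constant-query PCP verifier that
    returns True/False given the input x, a proof oracle, and a random string r.
    With num_random_bits = O(log |x|), this loop runs over only polynomially
    many random strings, so the whole check takes polynomial time.
    """
    return all(
        verifier(x, proof, r)
        for r in product((0, 1), repeat=num_random_bits)
    )

# An NP machine guesses `proof` and accepts x iff accepts_with_certainty(...)
# returns True: completeness and soundness of the PCP make this decide L.
```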

Hardness of MAX-3SAT

In this example, we will prove hardness of approximation for $\textup{MAX-3SAT}$. The reduction is from $\textup{3SAT}$. Specifically, there exists a constant $\alpha^* < 1$ such that a $\textup{3SAT}$ formula $\phi$ can be converted into a 3CNF formula $\psi$ (an instance of $\textup{MAX-3SAT}$) such that (here $\textup{OPT}(\psi)$ denotes the maximum fraction of clauses of $\psi$ that can be satisfied simultaneously)

  • If $\phi$ is satisfiable, then $\textup{OPT}(\psi)=1$, and

  • If $\phi$ is not satisfiable, then $\textup{OPT}(\psi) < \alpha^*$.

By the PCP Theorem, $\phi$ has a probabilistically checkable proof, and the essential idea of the reduction is to encode this proof as a $\textup{MAX-3SAT}$ instance. The verifier uses $c \log n$ random bits and queries $q$ bits of the proof. In all, there are $n^c$ possible random strings, and hence at most $qn^c$ locations of the proof can ever be queried by the verifier. $\psi$ will have a boolean variable corresponding to each of these locations.

For a random string $r$ picked by the verifier, the verifier's accept/reject decision depends only on the values of the proof at the $q$ queried locations, so it can be represented as a function $f_r : \{0,1\}^q \to \{0,1\}$. Hence, we can define a boolean formula $\psi_r$ as follows: for every $(v_1,\dots,v_q)$ such that $f_r(v_1,\dots,v_q)=0$, add a clause $g(u_1) \lor \dots \lor g(u_q)$, where $u_1,\dots,u_q$ are the variables corresponding to the queried locations, and $g(u_i)=\bar{u_i}$ if $v_i=1$ and $g(u_i)=u_i$ otherwise. There can be at most $2^q$ clauses in $\psi_r$, each of length $q$. Each such clause can be split into clauses of length 3 by introducing new auxiliary variables, so $\psi_r$ ends up with at most $q2^q$ clauses.
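
As a small illustration, here is a sketch of how the test for one random string, $f_r$, would be turned into clauses (before the length-3 splitting step); the representation of a clause as a list of (variable, negated?) pairs is simply a convention chosen for this sketch:

```python
from itertools import product

def clauses_for_random_string(f_r, queried_vars, q):
    """Turn the verifier's decision function f_r : {0,1}^q -> {0,1} into
    CNF clauses over the proof variables it queries.

    f_r          -- function taking a tuple of q bits, returning 0 (reject) or 1 (accept)
    queried_vars -- the q proof-location variables this random string queries
    Returns a list of clauses; each clause is a list of (variable, negated?) pairs.
    """
    clauses = []
    for bits in product((0, 1), repeat=q):
        if f_r(bits) == 0:
            # Rule out this rejecting assignment: the clause below is falsified
            # exactly when the queried variables take the values in `bits`.
            clause = [(u, bit == 1) for u, bit in zip(queried_vars, bits)]
            clauses.append(clause)
    return clauses
```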

$\psi := \bigwedge_r \psi_r$. $\psi$ has at most $n^c q 2^q$ clauses. If $\phi$ is satisfiable, all the clauses of $\psi$ can be satisfied simultaneously, so $\textup{OPT}(\psi)=1$. However, if $\phi$ is not satisfiable, then for every proof at least half of the random strings reject, i.e., at least half of the $\psi_r$ contain an unsatisfied clause. Hence, the number of unsatisfied clauses in $\psi$ must be at least $n^c/2$ out of at most $n^c q 2^q$, so $\textup{OPT}(\psi) \le 1 - \frac{1}{q2^{q+1}}$, which gives the constant $\alpha^*$ promised above.

The PCP Theorem was a landmark result in the field of computational complexity, and after its inception, the focus moved to producing optimal results, i.e., proving approximability and inapproximability results for a problem that match each other. The most influential developments were the $\textup{Label Cover}$ problem (a.k.a. $\textup{2-Prover-1-Round Game}$), Raz's Parallel Repetition Theorem, the introduction of the Long Code, its application in analyzing PCPs, and Hastad's use of Fourier analysis to analyze the Long Code. I will briefly mention these results.

Label Cover Problem (a.k.a. 2-Prover-1-Round Game)

A $\textup{2-Prover-1-Round Game}$ $\mathcal{U}_{2p1r}(G(V,W,E),[m],[n],\{\pi_e \,|\, e \in E\})$ is a constraint satisfaction problem. It consists of a bipartite graph $G(V,W,E)$ where vertices represent variables and edges represent constraints. The goal is to find a labelling $L : V \to [m], W \to [n]$ that satisfies as many edges as possible, where an edge $e=(v,w) \in E$ is satisfied if the “projection” constraint $\pi_e(L(v))=L(w)$ holds, with $\pi_e : [m] \to [n]$. Let $\textup{OPT}(\mathcal{U}_{2p1r})$ denote its optimal value.

 $$\textup{OPT}(\mathcal{U}_{2p1r}) := \max_{L:V \to [m],\, W \to [n]} \frac{1}{|E|} \cdot |\{e \in E ~|~ L \text{ satisfies } e\}|$$

I will now give the game formulation of the $\textup{2-Prover-1-Round Game}$: given an instance $\mathcal{U}_{2p1r}$, consider a probabilistic verifier $V$ which picks an edge $e=(v,w) \in E$ at random and sends $v$ to prover $P_1$ and $w$ to prover $P_2$. The provers respond with labels from the sets $[m]$ and $[n]$ respectively. The verifier accepts only if $\pi_e(i)=j$, where $i$ and $j$ are the labels returned. The provers' strategy is to maximize the probability of acceptance. This probability, called the value of the game, is clearly the same as $\textup{OPT}(\mathcal{U}_{2p1r})$. This establishes the correspondence between the constraint satisfaction view and the 2-Prover-1-Round view.
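
For very small instances, $\textup{OPT}(\mathcal{U}_{2p1r})$ can be computed by brute force directly from the definition. The sketch below does exactly that; it is exponential in the number of vertices and is only meant to make the definition concrete (the input format is a convention chosen for this sketch):

```python
from itertools import product

def opt_2p1r(V, W, E, m, n, pi):
    """Brute-force OPT of a 2-Prover-1-Round (Label Cover) instance.

    V, W -- lists of left and right vertices
    E    -- list of edges (v, w) with v in V, w in W
    pi   -- dict mapping each edge (v, w) to a dict {label in [m]: label in [n]}
    Returns the maximum fraction of satisfied edges over all labellings.
    """
    best = 0.0
    for v_labels in product(range(m), repeat=len(V)):
        L_v = dict(zip(V, v_labels))
        for w_labels in product(range(n), repeat=len(W)):
            L_w = dict(zip(W, w_labels))
            satisfied = sum(1 for (v, w) in E if pi[(v, w)][L_v[v]] == L_w[w])
            best = max(best, satisfied / len(E))
    return best
```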

We are interested in the case when the label sets $[m]$ and $[n]$ have constant size. The PCP Theorem implies that the gap version of $\mathcal{U}_{2p1r}$ is $\textup{NP}$-Hard, and this gap can be amplified using Raz's Parallel Repetition Theorem.

PCP Theorem + Raz's Parallel Repetition Theorem

For every $\delta > 0$, $\textup{Gap2P1R}_{1,\delta}$ is $\textup{NP}$-Hard for instances with label sets of size $\textup{poly}(1/\delta)$. Specifically, there exists a constant $C$ such that for every $\delta>0$ and every instance $\mathcal{U}_{2p1r}(G(V,W,E),[m],[n],\{\pi_e \,|\, e \in E\})$ with $m=(1/\delta)^C$, it is $\textup{NP}$-Hard to distinguish between:

  • YES case: $\textup{OPT}(\mathcal{U}_{2p1r})=1$.

  • NO case: $\textup{OPT}(\mathcal{U}_{2p1r})<\delta$.

Many inapproximability results are obtained by reduction from $\textup{Gap2P1R}_{1,\delta}$.

The inapproximability results derived from $\textup{Unique Games}$ often use gadgets constructed from the boolean hypercube. These reductions can be viewed as PCPs, and the gadgets test, probabilistically, whether a given codeword is a Long Code or not. A useful encoding scheme here is the Long Code, built from the so-called dictatorship functions on the boolean hypercube: a dictatorship function is a function $f:\{-1,1\}^n \to \{-1,1\}$ that depends on only one coordinate, i.e., $f(\mathbf{x})=x_i$ for some fixed $i$, and its truth table can be thought of as an encoding of $i$. We need a test that a dictatorship function passes with probability $\ge c$, whereas a function that is far from being a dictatorship function passes with probability at most $s$. This gap between $c$ and $s$ essentially translates into a $\textup{Gap}\mathcal{I}_{c,s}$ instance of the target problem $\mathcal{I}$.
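
A minimal sketch of the Long Code encoding, under the convention just described (the truth table of the $i$-th dictatorship function over $\{-1,1\}^n$); the function names are chosen for this sketch only:

```python
from itertools import product

def long_code(i, n):
    """Long Code encoding of an index i in [n]: the full truth table of the
    dictatorship function f(x) = x_i over {-1, 1}^n.

    The encoding has length 2^n, which is why it is called "long".
    """
    return {x: x[i] for x in product((-1, 1), repeat=n)}

def is_dictator(table, i, n):
    """Check whether a truth table is exactly the i-th dictatorship function."""
    return all(table[x] == x[i] for x in product((-1, 1), repeat=n))

# Example: the Long Code of i = 2 with n = 3 is a truth table with 2^3 = 8 entries.
codeword = long_code(2, 3)
assert is_dictator(codeword, 2, 3)
```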

The PCP replaces every vertex of the $\textup{2-Prover-1-Round Game}$ with a boolean hypercube: for $\mathcal{U}_{2p1r}(G(V,W,E),[m],[n],\{\pi_e \,|\, e \in E\})$, every $v \in V$ is replaced by an $m$-dimensional hypercube and every $w \in W$ is replaced by an $n$-dimensional hypercube. The PCP consists of the truth tables of boolean functions on these hypercubes. PCP testing consists of two parts:

  1. Codeword Testing: check that each boolean function is close to a dictatorship function, and

  2. Consistency Testing: for each edge $e=(v,w) \in E$, check that $\pi_e(i)=j$, where $i$ and $j$ are the coordinates to which the dictatorship functions on the hypercubes for vertex $v$ and (respectively) $w$ correspond.

Unique Games Conjecture

While the PCP strategy described above succeeds for some problems ($\textup{MAX-3SAT}$, $\textup{Clique}$, $\textup{Hypergraph Coloring}$), it doesn't yield any useful results for problems such as $\textup{Vertex Cover}$, $\textup{MaxCut}$, $\textup{Min-2SAT-Deletion}$, and $\textup{Graph Coloring}$. For the first set of problems, the PCPs are allowed to make three or more queries, but for the second set of problems, at most two queries are allowed, which makes the PCP very weak.

It was pointed out that another barrier is the “many-to-one”-ness of the projection constraints $\pi_e$ in the $\textup{2-Prover-1-Round Game}$, i.e., when $\frac{m}{n} \to \infty$. This poses a problem in the consistency-testing part, where a 2-query PCP is too weak to ensure consistency between two hypercubes of vastly different dimensions. This motivated the study of $\textup{Unique Games}$, where $m=n$ and $\pi_e: [n] \to [n]$ is a bijection.

Unique Game

A $\textup{Unique Game}$ $\mathcal{U}(G(V,E),[n],\{\pi_e \,|\, e \in E\})$ is a constraint satisfaction problem: given a directed graph $G(V,E)$ where vertices represent variables and edges represent constraints, the objective is to assign a label from the set $[n]$ to each vertex such that the maximum number of edges is satisfied. The constraint on each edge $e$ is a bijection $\pi_e:[n] \to [n]$. An edge $e=(v,w)$ is satisfied by a labelling $L:V \to [n]$ if $\pi_e(L(v))=L(w)$.

 $$\textup{OPT}(\mathcal{U}) := \max_{L:V \to [n]} \frac{1}{|E|} \cdot |\{e \in E ~|~ L \text{ satisfies } e\}|$$

As opposed to the $\textup{2-Prover-1-Round Game}$, the graph here need not be bipartite. This distinction is minor, as can be seen from the following game formulation of a $\textup{Unique Game}$: given an instance $\mathcal{U}(G(V,E),[n],\{\pi_e \,|\, e \in E\})$ of the $\textup{Unique Game}$ problem, the verifier picks an edge $e=(u,v) \in E$ at random and sends $u$ to prover $P_1$ and $v$ to prover $P_2$. $P_1$ and $P_2$ return labels in $[n]$, and the verifier accepts only if $\pi_e(i)=j$, where $i$ and $j$ are the answers of the two provers.

Note that if $\textup{OPT}(\mathcal{U})=1$, then such a labelling can be found in polynomial time: fixing the label of a vertex automatically fixes the label of each of its neighbours (via the bijections $\pi_e$), and so on. From the viewpoint of the $\textup{Unique Games Conjecture}$, the interesting case is when $\textup{OPT}(\mathcal{U})=1-\epsilon$ for small $\epsilon>0$.
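
Here is a minimal sketch of that propagation idea for fully satisfiable instances, using the same dictionary-based input convention as the earlier sketches and assuming, for simplicity, that the constraint graph (ignoring edge directions) is connected; a full implementation would handle each connected component separately:

```python
from collections import deque

def find_perfect_labelling(V, E, n, pi):
    """If OPT(U) = 1, return a labelling satisfying every edge; else return None.

    V  -- list of vertices
    E  -- list of directed edges (v, w)
    pi -- dict mapping each edge (v, w) to a dict acting as the bijection pi_e,
          so the constraint is pi[(v, w)][L(v)] == L(w)
    """
    # Precompute inverse bijections so we can also propagate against edge direction.
    inv = {e: {pi[e][a]: a for a in range(n)} for e in E}
    adj = {u: [] for u in V}
    for (v, w) in E:
        adj[v].append((w, pi[(v, w)]))     # forward: L(w) is forced to pi_e(L(v))
        adj[w].append((v, inv[(v, w)]))    # backward: L(v) is forced to pi_e^{-1}(L(w))

    root = V[0]
    for start_label in range(n):           # try every label for the root vertex
        L = {root: start_label}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for (x, mapping) in adj[u]:
                if x not in L:
                    L[x] = mapping[L[u]]
                    queue.append(x)
        # Accept only if the forced labelling satisfies every edge.
        if all(pi[(v, w)][L[v]] == L[w] for (v, w) in E):
            return L
    return None
```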

Unique Game Conjecture

Unique Games Conjecture: $\textup{Gap}\mathcal{U}_{1-\epsilon,\delta}$ is $\textup{NP}$-Hard

For every $\epsilon,\delta>0$, there exists an $n=n(\epsilon,\delta)$ such that, given a $\textup{Unique Game}$ instance $\mathcal{U}(G(V,E),[n],\{\pi_e \,|\, e \in E\})$, it is $\textup{NP}$-Hard to distinguish between the two cases:

  • YES case: $\textup{OPT}(\mathcal{U}) \ge 1 - \epsilon$.

  • NO case: $\textup{OPT}(\mathcal{U}) \le \delta$.

Note that the conjecture is false if $\epsilon=0$. Also, a uniformly random assignment satisfies, in expectation, a $\frac{1}{n}$ fraction of the edges, and hence $n \ge \frac{1}{\delta}$.
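
The $\frac{1}{n}$ bound is a one-line expectation calculation (a restatement of the claim above, not an extra assumption): for each edge $e=(v,w)$, since $\pi_e$ is a bijection and the labels are chosen uniformly and independently,

$$\Pr[\pi_e(L(v))=L(w)] = \frac{1}{n}, \qquad \text{so} \qquad \mathbb{E}[\text{fraction of satisfied edges}] = \frac{1}{n},$$

and therefore some labelling satisfies at least a $\frac{1}{n}$ fraction of the edges, forcing $\delta \ge \frac{1}{n}$ in the NO case.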