Unique Games Conjecture

Very recently, Subhash Khot won the Rolf Nevanlinna Prize, considered one of the top honours in the field of mathematics, for his contributions to computational complexity theory, most notably the Unique Games Conjecture (UGC). The conjecture has broad applications in the theory of hardness of approximation and is unusual in the sense that, unlike the $\textup{P} \stackrel{?}{=} \textup{NP}$ problem, the academic world seems evenly divided on whether it is true.

“Some very natural, intrinsically interesting statements about things like voting and foams just popped out of studying the UGC…. Even if the UGC turns out to be false, it has inspired a lot of interesting math research.”

—Ryan O’Donnell

This post is very basic and targeted towards anyone who knows what the complexity classes $\textup{P}$, $\textup{NP}$, $\textup{NP}$-Hard and $\textup{NP}$-Complete mean.

Assuming $\textup{P} \neq \textup{NP}$, researchers started exploring ways of finding near-optimal solutions efficiently. However, as it turns out, some $\textup{NP}$-Complete optimization problems cannot be approximated beyond a particular factor. Perhaps an example will highlight this point.

Approximation Algorithms

Any $\textup{NP}$-optimization problem $\Pi$ is either a minimization or a maximization problem. For a minimization problem, each instance $I$ of $\Pi$ has a non-empty set of feasible solutions, each of which is assigned an objective value. Our goal is to come up with the solution whose objective value is lowest. Let us call such a solution an optimal solution and denote its value by $\textup{OPT}(I)$. We wish to come up with a solution whose value is as close to $\textup{OPT}(I)$ as possible (but no smaller, since $\Pi$ is a minimization problem). Suppose an approximation algorithm $\mathcal{A}$ always outputs a solution whose value is at most $\alpha$ times $\textup{OPT}(I)$ ($\alpha > 1$). We say that $\mathcal{A}$ is an $\alpha$-factor approximation algorithm. Similar definitions hold for maximization problems as well. We will now prove a hardness of approximation result for TSP.
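
For concreteness, the guarantee for a minimization problem can be written as a pair of inequalities (this is just a restatement of the definition above):

$$\textup{OPT}(I) \;\le\; \mathcal{A}(I) \;\le\; \alpha \cdot \textup{OPT}(I) \quad \text{for every instance } I \text{ of } \Pi,$$

where $\mathcal{A}(I)$ denotes the objective value of the solution returned by $\mathcal{A}$; for a maximization problem the guarantee becomes $\textup{OPT}(I)/\alpha \le \mathcal{A}(I) \le \textup{OPT}(I)$.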

Example: Travelling Salesman Problem (TSP)

We will show that it is hard to approximate TSP to within any approximation factor. To prove this, we will transform the Hamiltonian Cycle Problem to TSP.

TSP: Given a complete weighted undirected graph, find a minimum-weight tour that visits each vertex exactly once.
Hamiltonian Cycle Problem: Given a graph $G$, does there exist a simple cycle that visits all the vertices of $G$ exactly once?

Given an instance $G$ of the Hamiltonian Cycle Problem, construct an instance $H$ of TSP as follows:

  1. $V(H) = V(G)$.

  2. $H$ is a complete graph.

  3. For all edges $e \in E(G)$, $w_e(H)=1$.

  4. For all other edges, $w_e(H)=\alpha n$, where $n=|V(G)|$.

If $G$ has a Hamiltonian cycle, $\textup{OPT}(H)=n$. Otherwise, any tour must include at least one edge of weight $\alpha n$. Hence, $\textup{OPT}(H) > \alpha n$.

Suppose there is an $\alpha$-factor approximation algorithm for TSP. We can then reduce the Hamiltonian Cycle Problem to TSP and decide it as follows. If $G$ has a Hamiltonian cycle, $\textup{OPT}(H)=n$, so the algorithm outputs a tour of weight at most $\alpha n$. Otherwise, $\textup{OPT}(H) > \alpha n$, which implies the tour the algorithm outputs has weight greater than $\alpha n$. The gap between the YES and NO instances of the Hamiltonian Cycle Problem can thus be detected efficiently, which is not possible unless $\textup{P} = \textup{NP}$. Hence, it is hard to approximate TSP to within a factor of $\alpha$, for any $\alpha$.
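
Here is a minimal sketch of this reduction in Python, assuming access to a hypothetical $\alpha$-factor TSP approximation algorithm `approx_tsp` (which, as just argued, cannot exist unless $\textup{P} = \textup{NP}$); the construction of $H$ is exactly the one described above:

```python
import itertools

def hamiltonian_cycle_via_tsp_approx(G_vertices, G_edges, alpha, approx_tsp):
    """Decide the Hamiltonian Cycle Problem on G, given a hypothetical
    alpha-factor TSP approximation algorithm `approx_tsp`.

    G_edges is a set of frozensets {u, v}; `approx_tsp` is assumed to take
    the vertex set and a weight dictionary over all vertex pairs and return
    the weight of a tour that is at most alpha * OPT.
    """
    n = len(G_vertices)
    # Build the complete graph H: weight 1 on edges of G, alpha*n elsewhere.
    weights = {}
    for u, v in itertools.combinations(G_vertices, 2):
        e = frozenset((u, v))
        weights[e] = 1 if e in G_edges else alpha * n
    tour_weight = approx_tsp(G_vertices, weights)
    # YES instances give a tour of weight <= alpha*n; NO instances give > alpha*n.
    return tour_weight <= alpha * n
```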

Reduction

Let $\Pi$ be a minimization problem. A gap-introducing reduction maps an instance $\phi$ of SAT to an instance $x$ of $\Pi$ such that

  • If $\phi$ is satisfiable, then $\textup{OPT}(x) \le f(x)$, and

  • If $\phi$ is not satisfiable, then $\textup{OPT}(x) > \alpha f(x)$.

Obviously, $\alpha \ge 1$. Such a gap-introducing reduction immediately implies an inapproximability factor of $\alpha$ for $\Pi$.
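
The argument is the same as in the TSP example: an $\alpha$-factor approximation algorithm $\mathcal{A}$ would let us decide SAT in polynomial time by comparing its output with $\alpha f(x)$, since

$$\phi \text{ satisfiable} \implies \mathcal{A}(x) \le \alpha \cdot \textup{OPT}(x) \le \alpha f(x), \qquad \phi \text{ not satisfiable} \implies \mathcal{A}(x) \ge \textup{OPT}(x) > \alpha f(x).$$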

One difficulty with this approach is blowing up an “additive” gap into a “multiplicative” gap.

PCP Theorem

A probabilistic characterization of the class $\textup{NP}$ yields a general technique for gap-introducing reductions. Informally speaking, a probabilistically checkable proof for an $\textup{NP}$ language is a proof whose validity can be checked probabilistically by examining only a few of its bits. A probabilistically checkable proof system comes with two parameters: (a) $r(n)$: the number of random bits required by the verifier, and (b) $q(n)$: the number of bits of the proof the verifier is allowed to examine.

A language $L \in \textup{PCP}(r(n),q(n))$ if there is a verifier $V$ that, on input $x$, obtains a random string of length $c \cdot r(|x|)$ and queries $d \cdot q(|x|)$ bits of the proof (where $c$ and $d$ are constants), such that:

  • If $x \in L$, then there is a proof which the verifier accepts with probability 1, and

  • If $x \notin L$, then every proof is accepted with probability $< \frac{1}{2}$.

The PCP Theorem gives another characterization of the class $\textup{NP}$.

PCP Theorem: $\textup{NP}=\textup{PCP}(\log n, 1)$

One direction of the proof, $\textup{PCP}(\log n, 1) \subseteq \textup{NP}$, is easy (try proving it as a small exercise). The other direction is the result of years of research by various CS theorists. For an excellent exposition of the history of the PCP Theorem, refer here. Fortunately, the theorem, modulo its proof, is sufficient to derive hardness results.
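
As a hint for the exercise, here is a minimal sketch of the easy containment: since the verifier uses only $O(\log n)$ random bits, an $\textup{NP}$ machine can guess the proof and then deterministically run the verifier on all polynomially many random strings. The `verifier` function and its signature are assumptions made purely for illustration.

```python
from itertools import product

def accepts_with_certainty(x, proof, verifier, num_random_bits):
    """Deterministic check underlying PCP(log n, 1) ⊆ NP.

    `verifier(x, proof, r)` is a hypothetical constant-query PCP verifier that
    returns True/False given the input x, a proof oracle, and a random string r.
    With num_random_bits = O(log |x|), this loop runs over only polynomially
    many random strings, so the whole check takes polynomial time.
    """
    return all(
        verifier(x, proof, r)
        for r in product((0, 1), repeat=num_random_bits)
    )

# An NP machine guesses `proof` and accepts x iff accepts_with_certainty(...)
# returns True: completeness and soundness of the PCP make this decide L.
```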

Hardness of MAX-3SAT

In this example, we will prove hardness of approximation for $\textup{MAX-3SAT}$. The reduction is from $\textup{3SAT}$. Specifically, there exists a constant $\alpha^* < 1$ such that a $\textup{3SAT}$ formula $\phi$ can be converted into a 3CNF formula $\psi$ (an instance of $\textup{MAX-3SAT}$) such that (here $\textup{OPT}(\psi)$ denotes the maximum fraction of clauses of $\psi$ that can be satisfied simultaneously)

  • If $\phi$ is satisfiable, then $\textup{OPT}(\psi)=1$, and

  • If $\phi$ is not satisfiable, then $\textup{OPT}(\psi) < \alpha^*$.

By the PCP Theorem, $\phi$ has a probabilistically checkable proof, and the essential idea of the reduction is to encode this proof as a $\textup{MAX-3SAT}$ instance. The verifier uses $c \log n$ random bits and queries $q$ bits of the proof. In all, there are $n^c$ possible random strings, and hence at most $qn^c$ locations of the proof can ever be queried by the verifier. $\psi$ will have a boolean variable corresponding to each of these locations.

For a random string $r$ picked by the verifier, the verifier's accept/reject decision depends only on the values of the proof at the $q$ queried locations, so it can be represented as a function $f_r : \{0,1\}^q \to \{0,1\}$. Hence, we can define a boolean formula $\psi_r$ as follows: for every $(v_1,\dots,v_q)$ such that $f_r(v_1,\dots,v_q)=0$, add a clause $g(u_1) \lor \dots \lor g(u_q)$, where $u_1,\dots,u_q$ are the variables corresponding to the queried locations, and $g(u_i)=\bar{u_i}$ if $v_i=1$ and $g(u_i)=u_i$ otherwise. There can be at most $2^q$ clauses in $\psi_r$, each of length $q$. Each such clause can be split into clauses of length 3 by introducing new auxiliary variables, so $\psi_r$ ends up with at most $q2^q$ clauses.
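
As a small illustration, here is a sketch of how the test for one random string, $f_r$, would be turned into clauses (before the length-3 splitting step); the representation of a clause as a list of (variable, negated?) pairs is simply a convention chosen for this sketch:

```python
from itertools import product

def clauses_for_random_string(f_r, queried_vars, q):
    """Turn the verifier's decision function f_r : {0,1}^q -> {0,1} into
    CNF clauses over the proof variables it queries.

    f_r          -- function taking a tuple of q bits, returning 0 (reject) or 1 (accept)
    queried_vars -- the q proof-location variables this random string queries
    Returns a list of clauses; each clause is a list of (variable, negated?) pairs.
    """
    clauses = []
    for bits in product((0, 1), repeat=q):
        if f_r(bits) == 0:
            # Rule out this rejecting assignment: the clause below is falsified
            # exactly when the queried variables take the values in `bits`.
            clause = [(u, bit == 1) for u, bit in zip(queried_vars, bits)]
            clauses.append(clause)
    return clauses
```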

$\psi := \bigwedge_r \psi_r$. $\psi$ has at most $n^c q 2^q$ clauses. If $\phi$ is satisfiable, all the clauses of $\psi$ can be satisfied simultaneously, so $\textup{OPT}(\psi)=1$. However, if $\phi$ is not satisfiable, then for every proof at least half of the random strings reject, i.e., at least half of the $\psi_r$ contain an unsatisfied clause. Hence, the number of unsatisfied clauses in $\psi$ must be at least $n^c/2$ out of at most $n^c q 2^q$, so $\textup{OPT}(\psi) \le 1 - \frac{1}{q2^{q+1}}$, which gives the constant $\alpha^*$ promised above.

The PCP Theorem was a landmark result in the field of computational complexity, and after its inception, the focus moved to producing optimal results, i.e., proving approximability and inapproximability results for a problem that match each other. The most influential developments were the $\textup{Label Cover}$ problem (a.k.a. $\textup{2-Prover-1-Round Game}$), Raz's Parallel Repetition Theorem, the introduction of the Long Code, its application in analyzing PCPs, and Hastad's use of Fourier analysis to analyze the Long Code. I will briefly mention these results.

Label Cover Problem (a.k.a. 2-Prover-1-Round Game)

A $\textup{2-Prover-1-Round Game}$ $\mathcal{U}_{2p1r}(G(V,W,E),[m],[n],\{\pi_e \,|\, e \in E\})$ is a constraint satisfaction problem. It consists of a bipartite graph $G(V,W,E)$ where vertices represent variables and edges represent constraints. The goal is to find a labelling $L : V \to [m], W \to [n]$ that satisfies as many edges as possible, where an edge $e=(v,w) \in E$ is satisfied if the “projection” constraint $\pi_e(L(v))=L(w)$ holds, with $\pi_e : [m] \to [n]$. Let $\textup{OPT}(\mathcal{U}_{2p1r})$ denote its optimal value.

 $$\textup{OPT}(\mathcal{U}_{2p1r}) := \max_{L:V \to [m],\, W \to [n]} \frac{1}{|E|} \cdot |\{e \in E ~|~ L \text{ satisfies } e\}|$$

I will now give the game formulation of the $\textup{2-Prover-1-Round Game}$: given an instance $\mathcal{U}_{2p1r}$, consider a probabilistic verifier $V$ which picks an edge $e=(v,w) \in E$ at random and sends $v$ to prover $P_1$ and $w$ to prover $P_2$. The provers respond with labels from the sets $[m]$ and $[n]$ respectively. The verifier accepts only if $\pi_e(i)=j$, where $i$ and $j$ are the labels returned. The provers' strategy is to maximize the probability of acceptance. This probability, called the value of the game, is clearly the same as $\textup{OPT}(\mathcal{U}_{2p1r})$. This establishes the correspondence between the constraint satisfaction view and the 2-Prover-1-Round view.
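
For very small instances, $\textup{OPT}(\mathcal{U}_{2p1r})$ can be computed by brute force directly from the definition. The sketch below does exactly that; it is exponential in the number of vertices and is only meant to make the definition concrete (the input format is a convention chosen for this sketch):

```python
from itertools import product

def opt_2p1r(V, W, E, m, n, pi):
    """Brute-force OPT of a 2-Prover-1-Round (Label Cover) instance.

    V, W -- lists of left and right vertices
    E    -- list of edges (v, w) with v in V, w in W
    pi   -- dict mapping each edge (v, w) to a dict {label in [m]: label in [n]}
    Returns the maximum fraction of satisfied edges over all labellings.
    """
    best = 0.0
    for v_labels in product(range(m), repeat=len(V)):
        L_v = dict(zip(V, v_labels))
        for w_labels in product(range(n), repeat=len(W)):
            L_w = dict(zip(W, w_labels))
            satisfied = sum(1 for (v, w) in E if pi[(v, w)][L_v[v]] == L_w[w])
            best = max(best, satisfied / len(E))
    return best
```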

We are interested in the case when the label sets $[m]$ and $[n]$ have constant size. The PCP Theorem implies that the gap version of $\mathcal{U}_{2p1r}$ is $\textup{NP}$-Hard, and this gap can be amplified using Raz's Parallel Repetition Theorem.

PCP Theorem + Raz's Parallel Repetition Theorem

For every $\delta > 0$, $\textup{Gap2P1R}_{1,\delta}$ is $\textup{NP}$-Hard for instances with label sets of size $\textup{poly}(1/\delta)$. Specifically, there exists a constant $C$ such that for every $\delta>0$ and every instance $\mathcal{U}_{2p1r}(G(V,W,E),[m],[n],\{\pi_e \,|\, e \in E\})$ with $m=(1/\delta)^C$, it is $\textup{NP}$-Hard to distinguish between:

  • YES case: $\textup{OPT}(\mathcal{U}_{2p1r})=1$.

  • NO case: $\textup{OPT}(\mathcal{U}_{2p1r})<\delta$.

Many inapproximability results are obtained by reduction from $\textup{Gap2P1R}_{1,\delta}$.

The inapproximability results derived from $\textup{Unique Games}$ often use gadgets constructed from the boolean hypercube. These reductions can be viewed as PCPs, and the gadgets test, probabilistically, whether a given codeword is a Long Code or not. A useful encoding scheme here is the Long Code, built from the so-called dictatorship functions on the boolean hypercube: a dictatorship function is a function $f:\{-1,1\}^n \to \{-1,1\}$ that depends on only one coordinate, i.e., $f(\mathbf{x})=x_i$ for some fixed $i$, and its truth table can be thought of as an encoding of $i$. We need a test that a dictatorship function passes with probability $\ge c$, whereas a function that is far from being a dictatorship function passes with probability at most $s$. This gap between $c$ and $s$ essentially translates into a $\textup{Gap}\mathcal{I}_{c,s}$ instance of the target problem $\mathcal{I}$.
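
A minimal sketch of the Long Code encoding, under the convention just described (the truth table of the $i$-th dictatorship function over $\{-1,1\}^n$); the function names are chosen for this sketch only:

```python
from itertools import product

def long_code(i, n):
    """Long Code encoding of an index i in [n]: the full truth table of the
    dictatorship function f(x) = x_i over {-1, 1}^n.

    The encoding has length 2^n, which is why it is called "long".
    """
    return {x: x[i] for x in product((-1, 1), repeat=n)}

def is_dictator(table, i, n):
    """Check whether a truth table is exactly the i-th dictatorship function."""
    return all(table[x] == x[i] for x in product((-1, 1), repeat=n))

# Example: the Long Code of i = 2 with n = 3 is a truth table with 2^3 = 8 entries.
codeword = long_code(2, 3)
assert is_dictator(codeword, 2, 3)
```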

The PCP replaces every vertex of the $\textup{2-Prover-1-Round Game}$ with a boolean hypercube: for $\mathcal{U}_{2p1r}(G(V,W,E),[m],[n],\{\pi_e \,|\, e \in E\})$, every $v \in V$ is replaced by an $m$-dimensional hypercube and every $w \in W$ is replaced by an $n$-dimensional hypercube. The PCP consists of the truth tables of boolean functions on these hypercubes. PCP testing consists of two parts:

  1. Codeword Testing: check that each boolean function is close to a dictatorship function, and

  2. Consistency Testing: for each edge $e=(v,w) \in E$, check that $\pi_e(i)=j$, where $i$ and $j$ are the coordinates to which the dictatorship functions on the hypercubes for vertex $v$ and (respectively) $w$ correspond.

Unique Games Conjecture

While the PCP strategy described above succeeds for some problems ($\textup{MAX-3SAT}$, $\textup{Clique}$, $\textup{Hypergraph Coloring}$), it doesn't yield any useful results for problems such as $\textup{Vertex Cover}$, $\textup{MaxCut}$, $\textup{Min-2SAT-Deletion}$, and $\textup{Graph Coloring}$. For the first set of problems, the PCPs are allowed to make three or more queries, but for the second set of problems, at most two queries are allowed, which makes the PCP very weak.

It was pointed out that another barrier is the “many-to-one”-ness of the projection constraints $\pi_e$ in the $\textup{2-Prover-1-Round Game}$, i.e., when $\frac{m}{n} \to \infty$. This poses a problem in the consistency-testing part, where a 2-query PCP is too weak to ensure consistency between two hypercubes of vastly different dimensions. This motivated the study of $\textup{Unique Games}$, where $m=n$ and $\pi_e: [n] \to [n]$ is a bijection.

Unique Game

A $\textup{Unique Game}$ $\mathcal{U}(G(V,E),[n],\{\pi_e \,|\, e \in E\})$ is a constraint satisfaction problem: given a directed graph $G(V,E)$ where vertices represent variables and edges represent constraints, the objective is to assign a label from the set $[n]$ to each vertex such that the maximum number of edges is satisfied. The constraint on each edge $e$ is a bijection $\pi_e:[n] \to [n]$. An edge $e=(v,w)$ is satisfied by a labelling $L:V \to [n]$ if $\pi_e(L(v))=L(w)$.

 $$\textup{OPT}(\mathcal{U}) := \max_{L:V \to [n]} \frac{1}{|E|} \cdot |\{e \in E ~|~ L \text{ satisfies } e\}|$$

As opposed to the $\textup{2-Prover-1-Round Game}$, the graph here need not be bipartite. This distinction is minor, as can be seen from the following game formulation of a $\textup{Unique Game}$: given an instance $\mathcal{U}(G(V,E),[n],\{\pi_e \,|\, e \in E\})$ of the $\textup{Unique Game}$ problem, the verifier picks an edge $e=(u,v) \in E$ at random and sends $u$ to prover $P_1$ and $v$ to prover $P_2$. $P_1$ and $P_2$ return labels in $[n]$, and the verifier accepts only if $\pi_e(i)=j$, where $i$ and $j$ are the answers of the two provers.

Note that if $\textup{OPT}(\mathcal{U})=1$, then such a labelling can be found in polynomial time: fixing the label of a vertex automatically fixes the label of each of its neighbours (via the bijections $\pi_e$), and so on. From the viewpoint of the $\textup{Unique Games Conjecture}$, the interesting case is when $\textup{OPT}(\mathcal{U})=1-\epsilon$ for small $\epsilon>0$.
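
Here is a minimal sketch of that propagation idea for fully satisfiable instances, using the same dictionary-based input convention as the earlier sketches and assuming, for simplicity, that the constraint graph (ignoring edge directions) is connected; a full implementation would handle each connected component separately:

```python
from collections import deque

def find_perfect_labelling(V, E, n, pi):
    """If OPT(U) = 1, return a labelling satisfying every edge; else return None.

    V  -- list of vertices
    E  -- list of directed edges (v, w)
    pi -- dict mapping each edge (v, w) to a dict acting as the bijection pi_e,
          so the constraint is pi[(v, w)][L(v)] == L(w)
    """
    # Precompute inverse bijections so we can also propagate against edge direction.
    inv = {e: {pi[e][a]: a for a in range(n)} for e in E}
    adj = {u: [] for u in V}
    for (v, w) in E:
        adj[v].append((w, pi[(v, w)]))     # forward: L(w) is forced to pi_e(L(v))
        adj[w].append((v, inv[(v, w)]))    # backward: L(v) is forced to pi_e^{-1}(L(w))

    root = V[0]
    for start_label in range(n):           # try every label for the root vertex
        L = {root: start_label}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for (x, mapping) in adj[u]:
                if x not in L:
                    L[x] = mapping[L[u]]
                    queue.append(x)
        # Accept only if the forced labelling satisfies every edge.
        if all(pi[(v, w)][L[v]] == L[w] for (v, w) in E):
            return L
    return None
```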

Unique Game Conjecture

Unique Games Conjecture: $\textup{Gap}\mathcal{U}_{1-\epsilon,\delta}$ is $\textup{NP}$-Hard

For every $\epsilon,\delta>0$, there exists an $n=n(\epsilon,\delta)$ such that, given a $\textup{Unique Game}$ instance $\mathcal{U}(G(V,E),[n],\{\pi_e \,|\, e \in E\})$, it is $\textup{NP}$-Hard to distinguish between the two cases:

  • YES case: $\textup{OPT}(\mathcal{U}) \ge 1 - \epsilon$.

  • NO case: $\textup{OPT}(\mathcal{U}) \le \delta$.

Note that the conjecture is false if $\epsilon=0$. Also, a uniformly random assignment satisfies, in expectation, a $\frac{1}{n}$ fraction of the edges, and hence $n \ge \frac{1}{\delta}$.
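
The $\frac{1}{n}$ bound is a one-line expectation calculation (a restatement of the claim above, not an extra assumption): for each edge $e=(v,w)$, since $\pi_e$ is a bijection and the labels are chosen uniformly and independently,

$$\Pr[\pi_e(L(v))=L(w)] = \frac{1}{n}, \qquad \text{so} \qquad \mathbb{E}[\text{fraction of satisfied edges}] = \frac{1}{n},$$

and therefore some labelling satisfies at least a $\frac{1}{n}$ fraction of the edges, forcing $\delta \ge \frac{1}{n}$ in the NO case.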