NP-Hard Introduction: The Boolean Satisfiability Problem (SAT)

Introduction

The Boolean Satisfiability Problem (SAT) is one of the most fundamental problems in computer science and serves as the cornerstone for understanding NP-completeness and NP-hardness. In this post, we’ll explore SAT as an introduction to the theory of computational complexity, particularly focusing on NP-hard problems.

What is SAT?

The Boolean Satisfiability Problem asks: Given a Boolean formula, is there an assignment of truth values to the variables that makes the formula evaluate to TRUE?

Problem Definition

Input: A Boolean formula in Conjunctive Normal Form (CNF)

Variables: x₁, x₂, …, x_n (each can be TRUE or FALSE)
Literals: a variable or its negation (e.g., x₁ or ! x₁)
Clauses: disjunctions of literals (e.g., (x₁ ∨ ! x₁ ∨ x₁))
Formula: conjunction of clauses (e.g., (x₁ ∨ ! x₁) ∧ (! x₁ ∨ x₁) ∧ (x₁ ∨ ! x₁))

Output: YES if there exists an assignment that satisfies all clauses, NO otherwise

Example

Consider the formula: (x₁ ∨ x₁) ∧ (! x₁ ∨ x₁) ∧ (! x₁ ∨ ! x₁)

Is this satisfiable?

Let’s evaluate each clause:

Clause 1: (x₁ ∨ x₁) simplifies to x₁
Clause 2: (! x₁ ∨ x₁) is always TRUE (tautology)
Clause 3: (! x₁ ∨ ! x₁) simplifies to ! x₁

So the formula simplifies to: x₁ ∧ TRUE ∧ ! x₁ = x₁ ∧ ! x₁ = FALSE

Analysis:

If x₁ = TRUE:
- Clause 1: (TRUE ∨ TRUE) = TRUE ✓
- Clause 2: (FALSE ∨ TRUE) = TRUE ✓
- Clause 3: (FALSE ∨ FALSE) = FALSE ✗
If x₁ = FALSE:
- Clause 1: (FALSE ∨ FALSE) = FALSE ✗
- Clause 2: (TRUE ∨ FALSE) = TRUE ✓
- Clause 3: (TRUE ∨ TRUE) = TRUE ✓

Conclusion: This formula is unsatisfiable because no single assignment to x₁ can satisfy all three clauses simultaneously.

Better Example (Satisfiable):

Consider the formula: (x₁ ∨ ! x₁) ∧ (x₁)

This is satisfiable:

Clause 1: (x₁ ∨ ! x₁) is always TRUE (tautology)
Clause 2: (x₁) requires x₁ = TRUE

With x₁ = TRUE, both clauses are satisfied, so the formula is satisfiable.

Complexity Classes: P, NP, and NP-Complete

Before diving deeper into SAT, let’s review the key complexity classes:

P (Polynomial Time)

Problems solvable in polynomial time by a deterministic Turing machine
Examples: Sorting, shortest path, matrix multiplication

NP (Nondeterministic Polynomial Time)

Problems where a solution can be verified in polynomial time
If someone gives you a solution, you can check if it’s correct quickly
Examples: SAT (given an assignment, we can verify it in polynomial time), graph coloring, subset sum

NP-Complete

Problems that are:
1. In NP (verifiable in polynomial time)
2. NP-hard (every problem in NP can be reduced to it in polynomial time)
If any NP-complete problem has a polynomial-time algorithm, then P = NP

NP-Hard

Problems that are at least as hard as the hardest problems in NP
Every problem in NP can be reduced to an NP-hard problem
NP-hard problems may not be in NP themselves (they might not be decision problems)

Why SAT Matters: Cook-Levin Theorem

The Cook-Levin Theorem (1971) established SAT as the first NP-complete problem. This theorem states:

Boolean Satisfiability (SAT) is NP-complete.

Significance

This theorem is fundamental because:

First NP-Complete Problem: SAT was proven to be NP-complete, establishing the concept
Reduction Target: Since SAT is NP-complete, we can prove other problems are NP-complete by reducing SAT to them (or reducing them to SAT)
P vs NP: If we find a polynomial-time algorithm for SAT, we prove P = NP (one of the most important open problems in computer science)

Proof Sketch

The Cook-Levin theorem proves SAT is NP-complete by:

SAT ∈ NP: Given a satisfying assignment, we can verify it in polynomial time by evaluating the formula
SAT is NP-hard: For any problem L in NP, we can construct a polynomial-time reduction that converts instances of L into SAT instances. This is done by encoding the computation of a nondeterministic Turing machine as a Boolean formula.

3-SAT: A Special Case

3-SAT is a restricted version of SAT where each clause contains exactly 3 literals.

Example: (x₁ ∨ x₁ ∨ x₁) ∧ (! x₁ ∨ x₁ ∨ ! x₁) ∧ (x₁ ∨ ! x₁ ∨ x₁)

Why 3-SAT?

3-SAT is NP-complete (can be proven by reducing SAT to 3-SAT)
Many reductions use 3-SAT as the starting point because of its simpler structure
2-SAT (clauses with 2 literals) is NOT NP-complete - it’s actually solvable in polynomial time!

Complexity Comparison

Problem	Complexity	Notes
SAT	NP-complete	General CNF formulas
3-SAT	NP-complete	Exactly 3 literals per clause
2-SAT	P (polynomial-time)	Solvable in O(n + m) time

Key Insight: The jump from 2 literals to 3 literals per clause is the boundary between polynomial-time solvability and NP-completeness!

Reduction Techniques

To prove a problem is NP-complete, we typically:

Show it’s in NP: Demonstrate that solutions can be verified in polynomial time
Show it’s NP-hard: Reduce a known NP-complete problem (like SAT or 3-SAT) to it

Common Reduction Patterns from SAT/3-SAT

Many NP-complete problems are proven by reducing from 3-SAT:

Independent Set: Create a graph with vertices for each literal occurrence, edges between conflicting literals
Vertex Cover: Similar construction to Independent Set
Clique: Related graph constructions
Graph Coloring: Encode clauses and variables as graph constraints

Practical Implications

Why NP-Hardness Matters

If a problem is NP-hard:

No Known Polynomial-Time Algorithm: Despite decades of research, no efficient (polynomial-time) algorithm exists
Exponential Worst-Case: Best known algorithms have exponential worst-case time complexity
Heuristics and Approximation: We often rely on:
- Heuristic algorithms (work well in practice but no guarantees)
- Approximation algorithms (guaranteed to be within some factor of optimal)
- Special cases (restricted versions that are tractable)

Real-World Applications

Despite being NP-complete, SAT solvers are widely used:

Hardware Verification: Checking if circuit designs meet specifications
Software Verification: Finding bugs, proving program correctness
Planning and Scheduling: Resource allocation, task scheduling
Cryptography: Some cryptographic problems reduce to SAT

Modern SAT solvers use sophisticated techniques (conflict-driven clause learning, unit propagation, etc.) and can handle instances with millions of variables in practice.

Runtime Analysis

Brute Force Approach

Algorithm: Try all 2ⁿ possible truth assignments

Time Complexity: O(2^n · m) where n is number of variables and m is number of clauses
Space Complexity: O(n) for storing current assignment
Analysis: For each of 2ⁿ assignments, evaluate m clauses, each taking O(1) time per clause

Backtracking (DPLL Algorithm)

Algorithm: Systematic search with early pruning

Time Complexity: O(2^n) worst-case, but much better in practice with pruning
Space Complexity: O(n) for recursion stack
Improvements: Unit propagation and pure literal elimination reduce search space significantly

Modern SAT Solvers (CDCL)

Conflict-Driven Clause Learning (CDCL):

Time Complexity: Exponential worst-case, but very efficient in practice
Space Complexity: O(n + m + L) where L is learned clauses
Key Techniques:
- Unit propagation: O(m) per decision
- Conflict analysis: O(n) per conflict
- Clause learning: Adds learned clauses to prevent same conflicts
Practical Performance: Can solve instances with millions of variables and clauses

2-SAT Special Case

2-SAT is NOT NP-complete - it’s in P (polynomial-time solvable)!

Algorithm: Build implication graph, check for strongly connected components

Time Complexity: O(n + m) using Kosaraju’s or Tarjan’s algorithm
Space Complexity: O(n + m)
Why Polynomial: 2-SAT has special structure that allows polynomial-time solution
Key Method:
1. Create implication graph: For clause (a ∨ b), add edges (!a → b) and (!b → a)
2. Find strongly connected components (SCCs)
3. Check if any variable and its negation are in the same SCC
4. If no conflict, formula is satisfiable; otherwise unsatisfiable

Why 2-SAT is easier: With only 2 literals per clause, the constraint structure is simpler and can be modeled as a graph problem solvable in polynomial time.

Verification Complexity

Given a candidate assignment:

Time Complexity: O(m) - evaluate each clause once
Space Complexity: O(1) additional space
This polynomial-time verifiability is why SAT is in NP

Key Takeaways

SAT is NP-Complete: The Boolean Satisfiability Problem is the foundational NP-complete problem
Cook-Levin Theorem: Established SAT as the first NP-complete problem, enabling all other NP-completeness proofs
Reduction Strategy: To prove a problem is NP-complete, reduce from a known NP-complete problem (often SAT or 3-SAT)
Practical Impact: NP-hardness doesn’t mean “impossible” - many practical algorithms exist, but they don’t guarantee polynomial-time performance
P vs NP: The question of whether P = NP remains one of the most important open problems in computer science

Practice Problems

Determine if the following 3-SAT instance is satisfiable: (x₁ ∨ x₁ ∨ x₁) ∧ (! x₁ ∨ x₁ ∨ ! x₁) ∧ (x₁ ∨ ! x₁ ∨ x₁) ∧ (! x₁ ∨ ! x₁ ∨ ! x₁)
Why is 2-SAT solvable in polynomial time while 3-SAT is NP-complete?
Research one reduction from 3-SAT to another NP-complete problem (e.g., Independent Set, Vertex Cover)
Explain why verifying a SAT solution is in polynomial time, but finding one might not be.

Understanding SAT and NP-completeness is crucial for algorithms courses and provides the foundation for recognizing when problems are computationally intractable and require alternative approaches.