NP-Hard Introduction: The Boolean Satisfiability Problem (SAT)
Introduction
The Boolean Satisfiability Problem (SAT) is one of the most fundamental problems in computer science and serves as the cornerstone for understanding NP-completeness and NP-hardness. In this post, we’ll explore SAT as an introduction to the theory of computational complexity, particularly focusing on NP-hard problems.
What is SAT?
The Boolean Satisfiability Problem asks: Given a Boolean formula, is there an assignment of truth values to the variables that makes the formula evaluate to TRUE?
Problem Definition
Input: A Boolean formula in Conjunctive Normal Form (CNF)
- Variables: x₁, x₂, …, x_n (each can be TRUE or FALSE)
- Literals: a variable or its negation (e.g., x₁ or ! x₁)
- Clauses: disjunctions of literals (e.g., (x₁ ∨ ! x₁ ∨ x₁))
- Formula: conjunction of clauses (e.g., (x₁ ∨ ! x₁) ∧ (! x₁ ∨ x₁) ∧ (x₁ ∨ ! x₁))
Output: YES if there exists an assignment that satisfies all clauses, NO otherwise
Example
Consider the formula: (x₁ ∨ x₁) ∧ (! x₁ ∨ x₁) ∧ (! x₁ ∨ ! x₁)
Is this satisfiable?
Let’s evaluate each clause:
- Clause 1: (x₁ ∨ x₁) simplifies to x₁
- Clause 2: (! x₁ ∨ x₁) is always TRUE (tautology)
- Clause 3: (! x₁ ∨ ! x₁) simplifies to ! x₁
So the formula simplifies to: x₁ ∧ TRUE ∧ ! x₁ = x₁ ∧ ! x₁ = FALSE
Analysis:
- If x₁ = TRUE:
- Clause 1: (TRUE ∨ TRUE) = TRUE ✓
- Clause 2: (FALSE ∨ TRUE) = TRUE ✓
- Clause 3: (FALSE ∨ FALSE) = FALSE ✗
- If x₁ = FALSE:
- Clause 1: (FALSE ∨ FALSE) = FALSE ✗
- Clause 2: (TRUE ∨ FALSE) = TRUE ✓
- Clause 3: (TRUE ∨ TRUE) = TRUE ✓
Conclusion: This formula is unsatisfiable because no single assignment to x₁ can satisfy all three clauses simultaneously.
Better Example (Satisfiable):
Consider the formula: (x₁ ∨ ! x₁) ∧ (x₁)
This is satisfiable:
- Clause 1: (x₁ ∨ ! x₁) is always TRUE (tautology)
- Clause 2: (x₁) requires x₁ = TRUE
With x₁ = TRUE, both clauses are satisfied, so the formula is satisfiable.
Complexity Classes: P, NP, and NP-Complete
Before diving deeper into SAT, let’s review the key complexity classes:
P (Polynomial Time)
- Problems solvable in polynomial time by a deterministic Turing machine
- Examples: Sorting, shortest path, matrix multiplication
NP (Nondeterministic Polynomial Time)
- Problems where a solution can be verified in polynomial time
- If someone gives you a solution, you can check if it’s correct quickly
- Examples: SAT (given an assignment, we can verify it in polynomial time), graph coloring, subset sum
NP-Complete
- Problems that are:
- In NP (verifiable in polynomial time)
- NP-hard (every problem in NP can be reduced to it in polynomial time)
- If any NP-complete problem has a polynomial-time algorithm, then P = NP
NP-Hard
- Problems that are at least as hard as the hardest problems in NP
- Every problem in NP can be reduced to an NP-hard problem
- NP-hard problems may not be in NP themselves (they might not be decision problems)
Why SAT Matters: Cook-Levin Theorem
The Cook-Levin Theorem (1971) established SAT as the first NP-complete problem. This theorem states:
Boolean Satisfiability (SAT) is NP-complete.
Significance
This theorem is fundamental because:
- First NP-Complete Problem: SAT was proven to be NP-complete, establishing the concept
- Reduction Target: Since SAT is NP-complete, we can prove other problems are NP-complete by reducing SAT to them (or reducing them to SAT)
- P vs NP: If we find a polynomial-time algorithm for SAT, we prove P = NP (one of the most important open problems in computer science)
Proof Sketch
The Cook-Levin theorem proves SAT is NP-complete by:
- SAT ∈ NP: Given a satisfying assignment, we can verify it in polynomial time by evaluating the formula
- SAT is NP-hard: For any problem L in NP, we can construct a polynomial-time reduction that converts instances of L into SAT instances. This is done by encoding the computation of a nondeterministic Turing machine as a Boolean formula.
3-SAT: A Special Case
3-SAT is a restricted version of SAT where each clause contains exactly 3 literals.
Example: (x₁ ∨ x₁ ∨ x₁) ∧ (! x₁ ∨ x₁ ∨ ! x₁) ∧ (x₁ ∨ ! x₁ ∨ x₁)
Why 3-SAT?
- 3-SAT is NP-complete (can be proven by reducing SAT to 3-SAT)
- Many reductions use 3-SAT as the starting point because of its simpler structure
- 2-SAT (clauses with 2 literals) is NOT NP-complete - it’s actually solvable in polynomial time!
Complexity Comparison
| Problem | Complexity | Notes |
|---|---|---|
| SAT | NP-complete | General CNF formulas |
| 3-SAT | NP-complete | Exactly 3 literals per clause |
| 2-SAT | P (polynomial-time) | Solvable in O(n + m) time |
Key Insight: The jump from 2 literals to 3 literals per clause is the boundary between polynomial-time solvability and NP-completeness!
Reduction Techniques
To prove a problem is NP-complete, we typically:
- Show it’s in NP: Demonstrate that solutions can be verified in polynomial time
- Show it’s NP-hard: Reduce a known NP-complete problem (like SAT or 3-SAT) to it
Common Reduction Patterns from SAT/3-SAT
Many NP-complete problems are proven by reducing from 3-SAT:
- Independent Set: Create a graph with vertices for each literal occurrence, edges between conflicting literals
- Vertex Cover: Similar construction to Independent Set
- Clique: Related graph constructions
- Graph Coloring: Encode clauses and variables as graph constraints
Practical Implications
Why NP-Hardness Matters
If a problem is NP-hard:
- No Known Polynomial-Time Algorithm: Despite decades of research, no efficient (polynomial-time) algorithm exists
- Exponential Worst-Case: Best known algorithms have exponential worst-case time complexity
- Heuristics and Approximation: We often rely on:
- Heuristic algorithms (work well in practice but no guarantees)
- Approximation algorithms (guaranteed to be within some factor of optimal)
- Special cases (restricted versions that are tractable)
Real-World Applications
Despite being NP-complete, SAT solvers are widely used:
- Hardware Verification: Checking if circuit designs meet specifications
- Software Verification: Finding bugs, proving program correctness
- Planning and Scheduling: Resource allocation, task scheduling
- Cryptography: Some cryptographic problems reduce to SAT
Modern SAT solvers use sophisticated techniques (conflict-driven clause learning, unit propagation, etc.) and can handle instances with millions of variables in practice.
Runtime Analysis
Brute Force Approach
Algorithm: Try all 2ⁿ possible truth assignments
- Time Complexity: O(2^n · m) where n is number of variables and m is number of clauses
- Space Complexity: O(n) for storing current assignment
- Analysis: For each of 2ⁿ assignments, evaluate m clauses, each taking O(1) time per clause
Backtracking (DPLL Algorithm)
Algorithm: Systematic search with early pruning
- Time Complexity: O(2^n) worst-case, but much better in practice with pruning
- Space Complexity: O(n) for recursion stack
- Improvements: Unit propagation and pure literal elimination reduce search space significantly
Modern SAT Solvers (CDCL)
Conflict-Driven Clause Learning (CDCL):
- Time Complexity: Exponential worst-case, but very efficient in practice
- Space Complexity: O(n + m + L) where L is learned clauses
- Key Techniques:
- Unit propagation: O(m) per decision
- Conflict analysis: O(n) per conflict
- Clause learning: Adds learned clauses to prevent same conflicts
- Practical Performance: Can solve instances with millions of variables and clauses
2-SAT Special Case
2-SAT is NOT NP-complete - it’s in P (polynomial-time solvable)!
Algorithm: Build implication graph, check for strongly connected components
- Time Complexity: O(n + m) using Kosaraju’s or Tarjan’s algorithm
- Space Complexity: O(n + m)
- Why Polynomial: 2-SAT has special structure that allows polynomial-time solution
- Key Method:
- Create implication graph: For clause (a ∨ b), add edges (!a → b) and (!b → a)
- Find strongly connected components (SCCs)
- Check if any variable and its negation are in the same SCC
- If no conflict, formula is satisfiable; otherwise unsatisfiable
Why 2-SAT is easier: With only 2 literals per clause, the constraint structure is simpler and can be modeled as a graph problem solvable in polynomial time.
Verification Complexity
Given a candidate assignment:
- Time Complexity: O(m) - evaluate each clause once
- Space Complexity: O(1) additional space
- This polynomial-time verifiability is why SAT is in NP
Key Takeaways
- SAT is NP-Complete: The Boolean Satisfiability Problem is the foundational NP-complete problem
- Cook-Levin Theorem: Established SAT as the first NP-complete problem, enabling all other NP-completeness proofs
- Reduction Strategy: To prove a problem is NP-complete, reduce from a known NP-complete problem (often SAT or 3-SAT)
- Practical Impact: NP-hardness doesn’t mean “impossible” - many practical algorithms exist, but they don’t guarantee polynomial-time performance
- P vs NP: The question of whether P = NP remains one of the most important open problems in computer science
Further Reading
- Cook-Levin Theorem: Original papers by Cook (1971) and Levin (1973)
- Garey & Johnson: “Computers and Intractability” - the classic reference on NP-completeness
- Modern SAT Solvers: Research on conflict-driven clause learning (CDCL) algorithms
- Approximation Algorithms: For NP-hard optimization problems
Practice Problems
-
Determine if the following 3-SAT instance is satisfiable: (x₁ ∨ x₁ ∨ x₁) ∧ (! x₁ ∨ x₁ ∨ ! x₁) ∧ (x₁ ∨ ! x₁ ∨ x₁) ∧ (! x₁ ∨ ! x₁ ∨ ! x₁)
-
Why is 2-SAT solvable in polynomial time while 3-SAT is NP-complete?
-
Research one reduction from 3-SAT to another NP-complete problem (e.g., Independent Set, Vertex Cover)
-
Explain why verifying a SAT solution is in polynomial time, but finding one might not be.
Understanding SAT and NP-completeness is crucial for algorithms courses and provides the foundation for recognizing when problems are computationally intractable and require alternative approaches.