Backtracking Algorithms — Deep Dive

Build a Sudoku solver, optimize with constraint propagation, and master advanced backtracking patterns in Python.

Technical framing

Backtracking is the systematic exploration of a search tree where each node represents a partial solution and edges represent choices. The search tree can be enormous — the N-Queens problem for N=20 has a search space of 20²⁰ ≈ 10²⁶ configurations. Without pruning, this is intractable. With constraint propagation and intelligent ordering, the same problem becomes solvable in seconds.

This deep dive covers advanced backtracking techniques that transform naive exponential search into practical algorithms.

Sudoku solver with constraint propagation

A complete Sudoku solver that combines backtracking with constraint propagation:

class SudokuSolver:
    def __init__(self, board: list[list[int]]):
        self.board = board
        self.rows = [set() for _ in range(9)]
        self.cols = [set() for _ in range(9)]
        self.boxes = [set() for _ in range(9)]
        self.empty_cells = []
        
        # Initialize constraints from given numbers
        for r in range(9):
            for c in range(9):
                if board[r][c] != 0:
                    num = board[r][c]
                    self.rows[r].add(num)
                    self.cols[c].add(num)
                    self.boxes[(r // 3) * 3 + c // 3].add(num)
                else:
                    self.empty_cells.append((r, c))
    
    def _candidates(self, r: int, c: int) -> set[int]:
        box_idx = (r // 3) * 3 + c // 3
        return set(range(1, 10)) - self.rows[r] - self.cols[c] - self.boxes[box_idx]
    
    def solve(self) -> bool:
        # Sort empty cells by number of candidates (MRV heuristic)
        self.empty_cells.sort(key=lambda rc: len(self._candidates(*rc)))
        return self._backtrack(0)
    
    def _backtrack(self, idx: int) -> bool:
        if idx == len(self.empty_cells):
            return True  # all cells filled
        
        r, c = self.empty_cells[idx]
        box_idx = (r // 3) * 3 + c // 3
        
        for num in self._candidates(r, c):
            # Make choice
            self.board[r][c] = num
            self.rows[r].add(num)
            self.cols[c].add(num)
            self.boxes[box_idx].add(num)
            
            if self._backtrack(idx + 1):
                return True
            
            # Undo choice
            self.board[r][c] = 0
            self.rows[r].remove(num)
            self.cols[c].remove(num)
            self.boxes[box_idx].remove(num)
        
        return False

The MRV heuristic

Minimum Remaining Values (MRV): always process the cell with the fewest valid candidates first. If a cell has only one candidate, fill it immediately. If a cell has zero candidates, backtrack immediately — no need to try anything.

This single heuristic reduces Sudoku solving from seconds to milliseconds for most puzzles.

Naked singles and hidden singles

Before backtracking, apply constraint propagation — deductions that fill cells without guessing:

def propagate(self) -> bool:
    """Apply constraint propagation. Returns False if contradiction found."""
    changed = True
    while changed:
        changed = False
        for r, c in list(self.empty_cells):
            candidates = self._candidates(r, c)
            if len(candidates) == 0:
                return False  # contradiction
            if len(candidates) == 1:
                num = candidates.pop()
                self.board[r][c] = num
                self.rows[r].add(num)
                self.cols[c].add(num)
                self.boxes[(r // 3) * 3 + c // 3].add(num)
                self.empty_cells.remove((r, c))
                changed = True
    return True

Peter Norvig’s Sudoku solver uses constraint propagation so aggressively that it rarely needs to backtrack at all.

Word search on a grid

Find if a word exists in a 2D character grid by moving horizontally or vertically:

def word_search(board: list[list[str]], word: str) -> bool:
    rows, cols = len(board), len(board[0])
    
    def backtrack(r: int, c: int, idx: int) -> bool:
        if idx == len(word):
            return True
        if (r < 0 or r >= rows or c < 0 or c >= cols or 
                board[r][c] != word[idx]):
            return False
        
        # Mark as visited
        original = board[r][c]
        board[r][c] = '#'
        
        for dr, dc in [(0, 1), (0, -1), (1, 0), (-1, 0)]:
            if backtrack(r + dr, c + dc, idx + 1):
                return True
        
        board[r][c] = original  # restore
        return False
    
    for r in range(rows):
        for c in range(cols):
            if backtrack(r, c, 0):
                return True
    return False

The in-place modification (board[r][c] = '#') avoids maintaining a separate visited set, saving memory and hash computation overhead.

Graph coloring

Determine if a graph can be colored with k colors such that no two adjacent nodes share a color:

def graph_coloring(adj: dict[int, list[int]], k: int) -> list[int] | None:
    nodes = sorted(adj.keys())
    colors = [0] * len(nodes)
    node_idx = {n: i for i, n in enumerate(nodes)}
    
    def is_safe(node: int, color: int) -> bool:
        idx = node_idx[node]
        for neighbor in adj[node]:
            if colors[node_idx[neighbor]] == color:
                return False
        return True
    
    def backtrack(i: int) -> bool:
        if i == len(nodes):
            return True
        node = nodes[i]
        for color in range(1, k + 1):
            if is_safe(node, color):
                colors[node_idx[node]] = color
                if backtrack(i + 1):
                    return True
                colors[node_idx[node]] = 0
        return False
    
    if backtrack(0):
        return colors
    return None

Graph coloring is NP-complete in general but tractable for small instances with good pruning. Applications: register allocation in compilers, scheduling, and map coloring.

Generating valid parentheses

Generate all combinations of n pairs of well-formed parentheses:

def generate_parentheses(n: int) -> list[str]:
    result = []
    
    def backtrack(path: list[str], open_count: int, close_count: int):
        if len(path) == 2 * n:
            result.append("".join(path))
            return
        if open_count < n:
            path.append("(")
            backtrack(path, open_count + 1, close_count)
            path.pop()
        if close_count < open_count:
            path.append(")")
            backtrack(path, open_count, close_count + 1)
            path.pop()
    
    backtrack([], 0, 0)
    return result

The constraint close_count < open_count is the pruning condition — you can only close a parenthesis if there is an unmatched open one.

Advanced pruning techniques

Forward checking

After making a choice, immediately check if any remaining variable has zero valid options. If so, backtrack before going deeper:

def backtrack_with_forward_check(state, idx):
    if idx == len(variables):
        return True
    
    for value in domain(variables[idx]):
        assign(variables[idx], value)
        
        # Forward check: do remaining variables still have options?
        if all(len(domain(v)) > 0 for v in variables[idx+1:]):
            if backtrack_with_forward_check(state, idx + 1):
                return True
        
        unassign(variables[idx])
    return False

Arc consistency (AC-3)

A more aggressive form of constraint propagation that removes values from domains that are inconsistent with any constraint:

from collections import deque

def ac3(domains: dict, constraints: dict) -> bool:
    queue = deque(constraints.keys())
    
    while queue:
        (xi, xj) = queue.popleft()
        if revise(domains, xi, xj, constraints[(xi, xj)]):
            if len(domains[xi]) == 0:
                return False  # unsolvable
            for xk in neighbors(xi) - {xj}:
                queue.append((xk, xi))
    return True

def revise(domains, xi, xj, constraint):
    revised = False
    for x in list(domains[xi]):
        if not any(constraint(x, y) for y in domains[xj]):
            domains[xi].remove(x)
            revised = True
    return revised

AC-3 is the backbone of constraint satisfaction problem (CSP) solvers. It reduces domains before search, often solving easy problems entirely without backtracking.

Performance optimization in Python

Bitmask representation

For problems with small domains (< 64 values), represent sets as bitmasks for O(1) set operations:

def n_queens_bitmask(n: int) -> int:
    count = 0
    all_ones = (1 << n) - 1
    
    def backtrack(cols: int, diag1: int, diag2: int):
        nonlocal count
        if cols == all_ones:
            count += 1
            return
        available = all_ones & ~(cols | diag1 | diag2)
        while available:
            pos = available & (-available)  # lowest set bit
            available ^= pos
            backtrack(cols | pos, (diag1 | pos) << 1, (diag2 | pos) >> 1)
    
    backtrack(0, 0, 0)
    return count

This bitmask N-Queens solver is 10-50x faster than the set-based version because:

Set operations become single bitwise operations
No hash computation or memory allocation
Better CPU branch prediction

Benchmarks: N-Queens solutions

N	Set-based (ms)	Bitmask (ms)	Solutions
8	5	0.3	92
10	50	3	724
12	800	45	14,200
14	15,000	800	365,596

Iterative deepening

For problems where the solution depth is unknown, combine DFS with increasing depth limits to get BFS-like completeness with DFS-like memory usage:

def iterative_deepening_backtrack(initial_state, max_depth):
    for depth_limit in range(max_depth + 1):
        result = depth_limited_search(initial_state, depth_limit)
        if result is not None:
            return result
    return None

def depth_limited_search(state, limit):
    if is_solution(state):
        return state
    if limit == 0:
        return None
    for choice in valid_choices(state):
        make_choice(state, choice)
        result = depth_limited_search(state, limit - 1)
        if result is not None:
            return result
        undo_choice(state, choice)
    return None

When to use backtracking vs. other approaches

Problem type	Best approach
Find all solutions	Backtracking
Find any valid solution	Backtracking with early exit
Count solutions	Backtracking or DP
Optimize (min/max)	DP or branch-and-bound
Continuous optimization	Gradient descent, not backtracking
Constraint satisfaction	Backtracking + constraint propagation

One thing to remember: The power of backtracking lies not in the search itself but in the pruning. A backtracking algorithm without pruning is brute force. With good pruning — MRV heuristics, forward checking, constraint propagation — the same algorithm becomes practical for problems with astronomical search spaces.

pythonalgorithmsbacktracking