Quantum Error Correction in Python — Deep Dive

Stabilizer Formalism

Quantum error correction codes are most elegantly described using the stabilizer formalism. A stabilizer code is defined by a set of Pauli operators (tensor products of I, X, Y, Z) that “stabilize” the code space — meaning every valid codeword is a +1 eigenvector of every stabilizer.

For an [[n, k, d]] code:

  • n physical qubits encode k logical qubits
  • d is the code distance — the minimum number of physical qubit errors needed to cause a logical error
  • The code can correct up to ⌊(d-1)/2⌋ errors

The 3-Qubit Bit-Flip Code in Stabilizer Language

# Stabilizers: Z₁Z₂ and Z₂Z₃
# These detect bit-flip errors without revealing the encoded state

import stim

# Build a repetition code circuit
circuit = stim.Circuit("""
    # Initialize logical |0⟩ = |000⟩
    R 0 1 2

    # Encoding (already in |000⟩ for logical |0⟩)

    # Error channel: each qubit has 5% chance of X error
    X_ERROR(0.05) 0 1 2

    # Syndrome extraction
    # Ancilla qubits 3,4 measure stabilizers
    R 3 4
    CX 0 3
    CX 1 3
    CX 1 4
    CX 2 4
    M 3 4

    # Measure data qubits
    M 0 1 2
""")

# Sample many rounds
sampler = circuit.compile_sampler()
samples = sampler.sample(shots=10000)

Stim: High-Performance QEC Simulation

Stim, developed by Google, is the go-to tool for large-scale quantum error correction simulation. It exploits the Gottesman-Knill theorem: stabilizer circuits (Clifford gates + Pauli measurements) can be simulated efficiently in polynomial time.

Building a Surface Code

import stim
import numpy as np

def make_surface_code_circuit(distance, rounds, noise):
    """Generate a distance-d surface code memory experiment."""
    circuit = stim.Circuit()

    # Surface code on a d×d grid
    # Data qubits on vertices, measure qubits on faces/edges
    data_qubits = []
    x_stabilizers = []  # Detect phase errors
    z_stabilizers = []  # Detect bit-flip errors

    for row in range(distance):
        for col in range(distance):
            data_qubits.append(row * distance + col)

    # Use Stim's built-in surface code generator
    circuit = stim.Circuit.generated(
        "surface_code:rotated_memory_z",
        distance=distance,
        rounds=rounds,
        after_clifford_depolarization=noise,
        after_reset_flip_probability=noise,
        before_measure_flip_probability=noise,
        before_round_data_depolarization=noise
    )
    return circuit

# Distance-5 surface code, 10 rounds, 0.1% noise
circuit = make_surface_code_circuit(distance=5, rounds=10, noise=0.001)

# Get the detector error model
dem = circuit.detector_error_model(decompose_errors=True)
print(f"Detectors: {circuit.num_detectors}")
print(f"Observables: {circuit.num_observables}")

Sampling and Decoding

import pymatching

# Compile the matching graph from Stim's detector error model
matching = pymatching.Matching.from_detector_error_model(dem)

# Sample errors
sampler = circuit.compile_detector_sampler()
detection_events, observable_flips = sampler.sample(
    shots=100000,
    separate_observables=True
)

# Decode: predict which logical errors occurred
predicted_flips = matching.decode_batch(detection_events)

# Calculate logical error rate
num_errors = np.sum(predicted_flips != observable_flips)
logical_error_rate = num_errors / len(observable_flips)
print(f"Logical error rate: {logical_error_rate:.6f}")

Threshold Analysis

The threshold is the physical error rate below which increasing code distance reduces the logical error rate:

import matplotlib.pyplot as plt

physical_rates = [0.001, 0.003, 0.005, 0.007, 0.01, 0.015]
distances = [3, 5, 7, 9]

results = {}
for d in distances:
    results[d] = []
    for p in physical_rates:
        circuit = make_surface_code_circuit(distance=d, rounds=d, noise=p)
        dem = circuit.detector_error_model(decompose_errors=True)
        matching = pymatching.Matching.from_detector_error_model(dem)

        det_samples, obs_flips = circuit.compile_detector_sampler().sample(
            shots=50000, separate_observables=True
        )
        predictions = matching.decode_batch(det_samples)
        error_rate = np.mean(predictions != obs_flips)
        results[d].append(error_rate)

# Plot: lines cross at the threshold
for d in distances:
    plt.semilogy(physical_rates, results[d], 'o-', label=f'd={d}')

plt.xlabel('Physical error rate')
plt.ylabel('Logical error rate')
plt.legend()
plt.title('Surface Code Threshold')

The crossing point of the curves reveals the threshold — typically around 0.5-1% for the surface code with minimum-weight perfect matching decoding.

Implementing Decoders

Minimum-Weight Perfect Matching (MWPM)

The standard decoder for surface codes. It models the decoding problem as a graph matching problem: pair up detected errors in a way that minimizes total weight.

import pymatching

# PyMatching implements MWPM efficiently
# Create from a check matrix (parity check perspective)
import scipy.sparse

# Example: repetition code check matrix
# Each row is a stabilizer, each column is a qubit
H = scipy.sparse.csc_matrix(np.array([
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1],
]))

matching = pymatching.Matching(H, weights=np.ones(5))

# Decode a syndrome
syndrome = np.array([1, 0, 0, 1])  # Errors detected at positions 0 and 3
correction = matching.decode(syndrome)
print(f"Correction: {correction}")  # Which qubits to flip

Union-Find Decoder

Faster than MWPM (almost linear time) with slightly worse performance:

# Union-find decoder concept (simplified)
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1

def uf_decode(syndrome, graph):
    """
    Grow clusters from syndrome nodes, merge when they touch.
    Peel edges to find correction.
    """
    uf = UnionFind(len(graph.nodes))
    # Grow clusters from detection events
    active = set(np.where(syndrome)[0])
    # ... cluster growth and peeling logic
    return correction

Neural Network Decoders

An active research area — using ML to decode quantum error correction syndromes:

import torch
import torch.nn as nn

class NeuralDecoder(nn.Module):
    def __init__(self, n_detectors, n_observables, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_detectors, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, n_observables),
            nn.Sigmoid()
        )

    def forward(self, syndromes):
        return self.net(syndromes.float())

# Train on Stim-generated data
def train_neural_decoder(circuit, epochs=100):
    dem = circuit.detector_error_model()
    n_det = circuit.num_detectors
    n_obs = circuit.num_observables

    model = NeuralDecoder(n_det, n_obs)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.BCELoss()

    sampler = circuit.compile_detector_sampler()

    for epoch in range(epochs):
        det, obs = sampler.sample(shots=10000, separate_observables=True)
        det_t = torch.tensor(det, dtype=torch.float32)
        obs_t = torch.tensor(obs, dtype=torch.float32)

        pred = model(det_t)
        loss = loss_fn(pred, obs_t)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if epoch % 10 == 0:
            accuracy = ((pred > 0.5) == obs_t).float().mean()
            print(f"Epoch {epoch}: loss={loss:.4f}, acc={accuracy:.4f}")

Noise Models for QEC Research

Circuit-Level Noise

Realistic noise occurs during specific operations, not uniformly:

# Stim's circuit-level noise model
circuit = stim.Circuit("""
    # After each two-qubit gate: depolarizing noise
    R 0 1 2 3 4
    DEPOLARIZE1(0.001) 0 1 2 3 4

    # Syndrome extraction round
    H 3 4
    DEPOLARIZE1(0.001) 3 4

    CX 3 0
    DEPOLARIZE2(0.005) 3 0  # Two-qubit noise is worse

    CX 3 1
    DEPOLARIZE2(0.005) 3 1

    CX 4 1
    DEPOLARIZE2(0.005) 4 1

    CX 4 2
    DEPOLARIZE2(0.005) 4 2

    H 3 4
    DEPOLARIZE1(0.001) 3 4

    # Measurement errors
    M(0.005) 3 4
""")

Biased Noise

Some hardware has asymmetric noise — phase errors more likely than bit-flip errors:

# Z-biased noise (phase errors 10x more likely)
circuit = stim.Circuit("""
    R 0 1 2
    # Z errors at 1%, X errors at 0.1%
    Z_ERROR(0.01) 0 1 2
    X_ERROR(0.001) 0 1 2
    M 0 1 2
""")

Tailored codes like the XZZX surface code exploit this bias for better performance.

Beyond the Surface Code

Color Codes

Support transversal implementation of more gates, potentially reducing overhead for fault-tolerant computation:

# Color code on a triangular lattice
# Each face has a 3-body stabilizer
# Supports transversal Hadamard and phase gates
color_code = stim.Circuit.generated(
    "color_code:memory_xyz",
    distance=5,
    rounds=5,
    after_clifford_depolarization=0.001
)

Quantum LDPC Codes

The next frontier — encode more logical qubits per physical qubit:

# Hypergraph product codes
# These achieve better encoding rates than surface codes
# but require non-local connectivity

import ldpc  # LDPC decoder library

# Classical LDPC code as building block
from ldpc import bp_osd

# Construct a quantum LDPC code from classical codes
# BP+OSD decoder for hypergraph product codes
decoder = bp_osd.BpOsdDecoder(
    parity_check_matrix,
    error_rate=0.01,
    max_iter=50,
    osd_order=10
)
correction = decoder.decode(syndrome)

Performance Benchmarks

Stim can simulate massive QEC experiments:

ScenarioStim Performance
d=5 surface code, 10 rounds, 1M shots~2 seconds
d=17 surface code, 17 rounds, 100K shots~30 seconds
d=31 surface code, 31 rounds, 10K shots~60 seconds

This performance enables statistical analysis that would be impossible with full statevector simulation.

The Road to Fault Tolerance

Google demonstrated in 2024-2025 that increasing surface code distance from 3 to 5 to 7 actually reduces logical error rates on their hardware — crossing the “break-even” point where error correction helps more than it hurts. This was a landmark result.

The path forward:

  1. Current (2025-2026): Demonstrating error correction at small scales
  2. Near-term (2027-2029): ~1,000 physical qubits, a handful of logical qubits
  3. Medium-term (2030+): ~100,000 physical qubits, ~100 logical qubits
  4. Long-term: Millions of physical qubits, practical fault-tolerant quantum computing

One thing to remember: Quantum error correction transforms the engineering challenge from “build perfect qubits” (impossible) to “build enough okay qubits and protect information through redundancy” — and Python tools like Stim and PyMatching make simulating these protection schemes fast enough to design the codes that will power future quantum computers.

pythonquantum-computingerror-correctionqec

See Also