Quantum Error Correction in Python — Deep Dive

Implement stabilizer codes, surface code decoders, and noise simulations for quantum error correction using Stim, PyMatching, and Qiskit

Stabilizer Formalism

Quantum error correction codes are most elegantly described using the stabilizer formalism. A stabilizer code is defined by a set of Pauli operators (tensor products of I, X, Y, Z) that “stabilize” the code space — meaning every valid codeword is a +1 eigenvector of every stabilizer.

For an [[n, k, d]] code:

n physical qubits encode k logical qubits
d is the code distance — the minimum number of physical qubit errors needed to cause a logical error
The code can correct up to ⌊(d-1)/2⌋ errors

The 3-Qubit Bit-Flip Code in Stabilizer Language

# Stabilizers: Z₁Z₂ and Z₂Z₃
# These detect bit-flip errors without revealing the encoded state

import stim

# Build a repetition code circuit
circuit = stim.Circuit("""
    # Initialize logical |0⟩ = |000⟩
    R 0 1 2

    # Encoding (already in |000⟩ for logical |0⟩)

    # Error channel: each qubit has 5% chance of X error
    X_ERROR(0.05) 0 1 2

    # Syndrome extraction
    # Ancilla qubits 3,4 measure stabilizers
    R 3 4
    CX 0 3
    CX 1 3
    CX 1 4
    CX 2 4
    M 3 4

    # Measure data qubits
    M 0 1 2
""")

# Sample many rounds
sampler = circuit.compile_sampler()
samples = sampler.sample(shots=10000)

Stim: High-Performance QEC Simulation

Stim, developed by Google, is the go-to tool for large-scale quantum error correction simulation. It exploits the Gottesman-Knill theorem: stabilizer circuits (Clifford gates + Pauli measurements) can be simulated efficiently in polynomial time.

Building a Surface Code

import stim
import numpy as np

def make_surface_code_circuit(distance, rounds, noise):
    """Generate a distance-d surface code memory experiment."""
    circuit = stim.Circuit()

    # Surface code on a d×d grid
    # Data qubits on vertices, measure qubits on faces/edges
    data_qubits = []
    x_stabilizers = []  # Detect phase errors
    z_stabilizers = []  # Detect bit-flip errors

    for row in range(distance):
        for col in range(distance):
            data_qubits.append(row * distance + col)

    # Use Stim's built-in surface code generator
    circuit = stim.Circuit.generated(
        "surface_code:rotated_memory_z",
        distance=distance,
        rounds=rounds,
        after_clifford_depolarization=noise,
        after_reset_flip_probability=noise,
        before_measure_flip_probability=noise,
        before_round_data_depolarization=noise
    )
    return circuit

# Distance-5 surface code, 10 rounds, 0.1% noise
circuit = make_surface_code_circuit(distance=5, rounds=10, noise=0.001)

# Get the detector error model
dem = circuit.detector_error_model(decompose_errors=True)
print(f"Detectors: {circuit.num_detectors}")
print(f"Observables: {circuit.num_observables}")

Sampling and Decoding

import pymatching

# Compile the matching graph from Stim's detector error model
matching = pymatching.Matching.from_detector_error_model(dem)

# Sample errors
sampler = circuit.compile_detector_sampler()
detection_events, observable_flips = sampler.sample(
    shots=100000,
    separate_observables=True
)

# Decode: predict which logical errors occurred
predicted_flips = matching.decode_batch(detection_events)

# Calculate logical error rate
num_errors = np.sum(predicted_flips != observable_flips)
logical_error_rate = num_errors / len(observable_flips)
print(f"Logical error rate: {logical_error_rate:.6f}")

Threshold Analysis

The threshold is the physical error rate below which increasing code distance reduces the logical error rate:

import matplotlib.pyplot as plt

physical_rates = [0.001, 0.003, 0.005, 0.007, 0.01, 0.015]
distances = [3, 5, 7, 9]

results = {}
for d in distances:
    results[d] = []
    for p in physical_rates:
        circuit = make_surface_code_circuit(distance=d, rounds=d, noise=p)
        dem = circuit.detector_error_model(decompose_errors=True)
        matching = pymatching.Matching.from_detector_error_model(dem)

        det_samples, obs_flips = circuit.compile_detector_sampler().sample(
            shots=50000, separate_observables=True
        )
        predictions = matching.decode_batch(det_samples)
        error_rate = np.mean(predictions != obs_flips)
        results[d].append(error_rate)

# Plot: lines cross at the threshold
for d in distances:
    plt.semilogy(physical_rates, results[d], 'o-', label=f'd={d}')

plt.xlabel('Physical error rate')
plt.ylabel('Logical error rate')
plt.legend()
plt.title('Surface Code Threshold')

The crossing point of the curves reveals the threshold — typically around 0.5-1% for the surface code with minimum-weight perfect matching decoding.

Implementing Decoders

Minimum-Weight Perfect Matching (MWPM)

The standard decoder for surface codes. It models the decoding problem as a graph matching problem: pair up detected errors in a way that minimizes total weight.

import pymatching

# PyMatching implements MWPM efficiently
# Create from a check matrix (parity check perspective)
import scipy.sparse

# Example: repetition code check matrix
# Each row is a stabilizer, each column is a qubit
H = scipy.sparse.csc_matrix(np.array([
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1],
]))

matching = pymatching.Matching(H, weights=np.ones(5))

# Decode a syndrome
syndrome = np.array([1, 0, 0, 1])  # Errors detected at positions 0 and 3
correction = matching.decode(syndrome)
print(f"Correction: {correction}")  # Which qubits to flip

Union-Find Decoder

Faster than MWPM (almost linear time) with slightly worse performance:

# Union-find decoder concept (simplified)
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1

def uf_decode(syndrome, graph):
    """
    Grow clusters from syndrome nodes, merge when they touch.
    Peel edges to find correction.
    """
    uf = UnionFind(len(graph.nodes))
    # Grow clusters from detection events
    active = set(np.where(syndrome)[0])
    # ... cluster growth and peeling logic
    return correction

Neural Network Decoders

An active research area — using ML to decode quantum error correction syndromes:

import torch
import torch.nn as nn

class NeuralDecoder(nn.Module):
    def __init__(self, n_detectors, n_observables, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_detectors, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, n_observables),
            nn.Sigmoid()
        )

    def forward(self, syndromes):
        return self.net(syndromes.float())

# Train on Stim-generated data
def train_neural_decoder(circuit, epochs=100):
    dem = circuit.detector_error_model()
    n_det = circuit.num_detectors
    n_obs = circuit.num_observables

    model = NeuralDecoder(n_det, n_obs)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.BCELoss()

    sampler = circuit.compile_detector_sampler()

    for epoch in range(epochs):
        det, obs = sampler.sample(shots=10000, separate_observables=True)
        det_t = torch.tensor(det, dtype=torch.float32)
        obs_t = torch.tensor(obs, dtype=torch.float32)

        pred = model(det_t)
        loss = loss_fn(pred, obs_t)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if epoch % 10 == 0:
            accuracy = ((pred > 0.5) == obs_t).float().mean()
            print(f"Epoch {epoch}: loss={loss:.4f}, acc={accuracy:.4f}")

Noise Models for QEC Research

Circuit-Level Noise

Realistic noise occurs during specific operations, not uniformly:

# Stim's circuit-level noise model
circuit = stim.Circuit("""
    # After each two-qubit gate: depolarizing noise
    R 0 1 2 3 4
    DEPOLARIZE1(0.001) 0 1 2 3 4

    # Syndrome extraction round
    H 3 4
    DEPOLARIZE1(0.001) 3 4

    CX 3 0
    DEPOLARIZE2(0.005) 3 0  # Two-qubit noise is worse

    CX 3 1
    DEPOLARIZE2(0.005) 3 1

    CX 4 1
    DEPOLARIZE2(0.005) 4 1

    CX 4 2
    DEPOLARIZE2(0.005) 4 2

    H 3 4
    DEPOLARIZE1(0.001) 3 4

    # Measurement errors
    M(0.005) 3 4
""")

Biased Noise

Some hardware has asymmetric noise — phase errors more likely than bit-flip errors:

# Z-biased noise (phase errors 10x more likely)
circuit = stim.Circuit("""
    R 0 1 2
    # Z errors at 1%, X errors at 0.1%
    Z_ERROR(0.01) 0 1 2
    X_ERROR(0.001) 0 1 2
    M 0 1 2
""")

Tailored codes like the XZZX surface code exploit this bias for better performance.

Beyond the Surface Code

Color Codes

Support transversal implementation of more gates, potentially reducing overhead for fault-tolerant computation:

# Color code on a triangular lattice
# Each face has a 3-body stabilizer
# Supports transversal Hadamard and phase gates
color_code = stim.Circuit.generated(
    "color_code:memory_xyz",
    distance=5,
    rounds=5,
    after_clifford_depolarization=0.001
)

Quantum LDPC Codes

The next frontier — encode more logical qubits per physical qubit:

# Hypergraph product codes
# These achieve better encoding rates than surface codes
# but require non-local connectivity

import ldpc  # LDPC decoder library

# Classical LDPC code as building block
from ldpc import bp_osd

# Construct a quantum LDPC code from classical codes
# BP+OSD decoder for hypergraph product codes
decoder = bp_osd.BpOsdDecoder(
    parity_check_matrix,
    error_rate=0.01,
    max_iter=50,
    osd_order=10
)
correction = decoder.decode(syndrome)

Performance Benchmarks

Stim can simulate massive QEC experiments:

Scenario	Stim Performance
d=5 surface code, 10 rounds, 1M shots	~2 seconds
d=17 surface code, 17 rounds, 100K shots	~30 seconds
d=31 surface code, 31 rounds, 10K shots	~60 seconds

This performance enables statistical analysis that would be impossible with full statevector simulation.

The Road to Fault Tolerance

Google demonstrated in 2024-2025 that increasing surface code distance from 3 to 5 to 7 actually reduces logical error rates on their hardware — crossing the “break-even” point where error correction helps more than it hurts. This was a landmark result.

The path forward:

Current (2025-2026): Demonstrating error correction at small scales
Near-term (2027-2029): ~1,000 physical qubits, a handful of logical qubits
Medium-term (2030+): ~100,000 physical qubits, ~100 logical qubits
Long-term: Millions of physical qubits, practical fault-tolerant quantum computing

One thing to remember: Quantum error correction transforms the engineering challenge from “build perfect qubits” (impossible) to “build enough okay qubits and protect information through redundancy” — and Python tools like Stim and PyMatching make simulating these protection schemes fast enough to design the codes that will power future quantum computers.

pythonquantum-computingerror-correctionqec