Python Approval Testing — Deep Dive

Setting up approvaltests in Python

The approvaltests library provides the canonical implementation of Llewellyn Falco’s approval testing pattern for Python:

# Install: pip install approvaltests
from approvaltests import verify, Options
from approvaltests.reporters import DiffReporter

def test_invoice_rendering():
    invoice = generate_invoice(
        customer="Acme Corp",
        items=[
            {"name": "Widget", "qty": 10, "price": 9.99},
            {"name": "Gadget", "qty": 3, "price": 24.50},
        ],
        tax_rate=0.08,
    )
    rendered = render_invoice_text(invoice)
    verify(rendered, options=Options().with_reporter(DiffReporter()))

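On the first run the test fails and leaves two files next to the test module (the exact file name prefix, such as the module or class name, depends on the namer in use):

  • test_invoice_rendering.received.txt: the actual output
  • test_invoice_rendering.approved.txt: empty, because nothing has been approved yet

Inspect the received file and, if it is correct, rename it to .approved.txt. Subsequent runs compare new output against this approved version and fail on any difference.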
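The received/approved cycle described above can be sketched with the standard library alone. This is a simplified illustration of the workflow, not how approvaltests is implemented; `verify_text` and `approve` are hypothetical helper names, and the file naming is an assumption made for the example.

```python
import tempfile
from pathlib import Path


def verify_text(received_text: str, directory: Path, name: str) -> bool:
    """Compare text against <name>.approved.txt and return True on a match.

    Hypothetical helper that mimics the approval workflow.
    """
    approved = directory / f"{name}.approved.txt"
    received = directory / f"{name}.received.txt"

    if approved.exists() and approved.read_text(encoding="utf-8") == received_text:
        # Match: remove any stale received file left by an earlier failure.
        received.unlink(missing_ok=True)
        return True

    # Mismatch or first run: keep the actual output on disk for review.
    received.write_text(received_text, encoding="utf-8")
    return False


def approve(directory: Path, name: str) -> None:
    """Promote the received file to approved (the manual rename step)."""
    received = directory / f"{name}.received.txt"
    received.replace(directory / f"{name}.approved.txt")


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    first_run = verify_text("Total: 187.27\n", workdir, "test_invoice")
    approve(workdir, "test_invoice")  # a human reviewed and accepted the output
    second_run = verify_text("Total: 187.27\n", workdir, "test_invoice")
    changed_run = verify_text("Total: 190.00\n", workdir, "test_invoice")
```

Here the first run fails because nothing is approved yet, the second passes after approval, and the third fails again because the output changed. The key design point is that approval is a deliberate human step, never something the test does on its own.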

Using syrupy for pytest-native snapshots

Syrupy integrates directly with pytest’s assertion mechanism and supports multiple serialization formats:

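# Install: pip install syrupy
# No conftest.py setup is needed: syrupy registers itself as a pytest plugin
# and provides the `snapshot` fixture.
# client, ValidationError, validate_input, process_order, and sample_order
# come from the application under test.
import pytest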

def test_api_response_shape(snapshot):
    response = client.get("/api/v1/users/1")
    assert response.json() == snapshot

def test_error_messages(snapshot):
    with pytest.raises(ValidationError) as exc_info:
        validate_input({"email": "not-an-email"})
    assert str(exc_info.value) == snapshot

def test_dataclass_output(snapshot):
    result = process_order(sample_order)
    assert result == snapshot

Syrupy stores snapshots in __snapshots__/ directories as .ambr files. Update snapshots with pytest --snapshot-update.

The advantage over approvaltests is tighter pytest integration — snapshots update through a CLI flag rather than file manipulation, and the diff output integrates with pytest’s assertion introspection.

Building custom normalizers

Raw output comparison breaks on dynamic content. Build normalizers to strip non-deterministic elements:

import re

from approvaltests import verify


def normalize_output(text: str) -> str:
    """Strip dynamic content before approval comparison."""
    # Replace ISO 8601 timestamps (optional fractional seconds and "Z")
    text = re.sub(
        r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z?",
        "<TIMESTAMP>",
        text,
    )
    # Replace UUIDs, whether upper- or lowercase
    text = re.sub(
        r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
        "<UUID>",
        text,
        flags=re.IGNORECASE,
    )
    # Truncate (not round) floats to two decimal places
    text = re.sub(r"(\d+\.\d{2})\d+", r"\1", text)
    return text


def test_report_output():
    report = generate_monthly_report(month=3, year=2026)
    normalized = normalize_output(report.to_text())
    verify(normalized)

For approvaltests, you can also use the built-in scrubbing mechanism:

from approvaltests import verify, Options
from approvaltests.scrubbers import create_regex_scrubber

date_scrubber = create_regex_scrubber(
    r"\d{4}-\d{2}-\d{2}", lambda m: "<DATE>"
)

def test_with_scrubber():
    output = generate_output()
    verify(output, options=Options().with_scrubber(date_scrubber))
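
Replacing every match with the same token hides whether two IDs in the output were the same or different. One possible refinement, sketched here with the standard library only, numbers each distinct value so that identity is preserved across the text. `make_numbering_scrubber` is a hypothetical helper written for this example and is not part of approvaltests.

```python
import re
from typing import Callable, Dict

UUID_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def make_numbering_scrubber(
    pattern: "re.Pattern[str]", label: str
) -> Callable[[str], str]:
    """Return a scrubber that maps each distinct match to <label_N>."""

    def scrub(text: str) -> str:
        seen: Dict[str, str] = {}  # fresh numbering for every scrubbed text

        def replace(match: "re.Match[str]") -> str:
            key = match.group(0).lower()  # treat upper/lowercase as equal
            if key not in seen:
                seen[key] = f"<{label}_{len(seen) + 1}>"
            return seen[key]

        return pattern.sub(replace, text)

    return scrub


scrub_uuids = make_numbering_scrubber(UUID_PATTERN, "UUID")

log = (
    "created order 3f2b8c1e-9a4d-4e7b-8c2a-1d5e6f7a8b9c\n"
    "created order 7c9e6679-7425-40de-944b-e07fc1f90ae7\n"
    "shipped order 3F2B8C1E-9A4D-4E7B-8C2A-1D5E6F7A8B9C\n"
)
scrubbed = scrub_uuids(log)
```

In the scrubbed text the first and third lines share the same placeholder, so the approved file still shows that the shipped order is the one created first. Because the returned function takes and returns a string, it should slot into the same place as `normalize_output` above.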

Approval testing for complex data structures

When testing APIs or data pipelines, serialize complex objects to a readable format before approval:

import json

def to_approval_string(obj: dict) -> str:
    """Convert to deterministic, readable string for approval."""
    return json.dumps(obj, indent=2, sort_keys=True, default=str)

def test_etl_pipeline_output():
    raw_data = load_fixture("raw_sales_data.csv")
    transformed = run_etl_pipeline(raw_data)

    # Serialize for stable comparison
    approval_text = to_approval_string({
        "record_count": len(transformed),
        "columns": sorted(transformed[0].keys()) if transformed else [],
        "sample_records": transformed[:5],
        "aggregates": {
            "total_revenue": sum(r["revenue"] for r in transformed),
            "unique_customers": len(set(r["customer_id"] for r in transformed)),
        }
    })
    verify(approval_text)

This approach captures both the structure and key metrics of the output without recording every single row, making the approved file readable and maintainable.
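
One caveat with `default=str` is that it silently stringifies anything, including sets (whose iteration order can differ between runs) and objects whose default repr contains a memory address. A stricter variant, sketched below as one possible approach, handles a few known types explicitly and fails loudly on everything else. The names `approval_default` and `to_stable_string` are made up for this example.

```python
import json
from datetime import date
from decimal import Decimal


def approval_default(obj):
    """Fallback encoder for json.dumps that keeps output stable across runs."""
    if isinstance(obj, (set, frozenset)):
        # Set iteration order is not guaranteed, so sort by repr.
        return sorted(obj, key=repr)
    if isinstance(obj, date):  # also covers datetime, a subclass of date
        return obj.isoformat()
    if isinstance(obj, Decimal):
        return str(obj)
    # Refuse unknown types instead of writing an unstable repr.
    raise TypeError(f"No stable serialization for {type(obj).__name__}")


def to_stable_string(obj) -> str:
    return json.dumps(obj, indent=2, sort_keys=True, default=approval_default)


record = {
    "tags": {"priority", "export", "eu"},
    "shipped_on": date(2026, 3, 14),
    "amount": Decimal("19.90"),
}
text = to_stable_string(record)
```

Raising `TypeError` for unknown types trades convenience for safety: a new field type shows up as an immediate test error rather than as an approved file that flickers between runs.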

CI integration strategies

In CI, approval tests must fail clearly when output changes, without interactive diff tools:

# GitHub Actions
- name: Run approval tests
  run: |
    # pipefail keeps pytest's exit code from being masked by tee
    set -o pipefail
    if ! pytest tests/approval/ -v --tb=short 2>&1 | tee test-output.txt; then
      echo "::error::Approval tests failed. Run locally and update snapshots if changes are intentional."
      exit 1
    fi

For syrupy, configure CI to detect uncommitted snapshot changes:

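- name: Check snapshots are up to date
  run: |
    pytest --snapshot-update
    # Test the exit code inside `if`: the default shell runs with -e and
    # would abort on a failing `git diff` before the message is printed.
    # Note that git diff ignores untracked files, so brand-new snapshot
    # files are not caught by this check.
    if ! git diff --exit-code -- '*__snapshots__*'; then
      echo "Snapshots are outdated. Run 'pytest --snapshot-update' and commit."
      exit 1
    fi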

Migrating legacy code toward assertion-based tests

Approval testing works well as a migration strategy for legacy code with no tests. Instead of understanding every business rule to write assertions, capture current behavior:

# Step 1: Characterization test — capture current behavior
def test_legacy_pricing_engine():
    """Characterization test: captures current behavior, not necessarily correct behavior."""
    scenarios = load_test_scenarios("pricing_scenarios.json")
    results = []
    for scenario in scenarios:
        result = legacy_price_calculator(**scenario["inputs"])
        results.append({
            "scenario": scenario["name"],
            "output": result,
        })
    verify(to_approval_string(results))

# Step 2: Over time, replace with specific assertions
def test_bulk_discount_applied():
    """Specific business rule test — extracted from characterization test."""
    result = legacy_price_calculator(
        items=[{"sku": "A1", "qty": 100, "unit_price": 10.0}],
        customer_tier="gold",
    )
    assert result["discount_pct"] == 15.0
    assert result["total"] == 850.0

The characterization test acts as a safety net while you incrementally add targeted assertions. Once you have good assertion coverage for a module, the approval test can be retired.
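
When no scenario file exists yet, one way to bootstrap a characterization test is to generate inputs as the cross product of a few interesting values per parameter. approvaltests ships its own combination helper (`verify_all_combinations`); the sketch below shows the idea with the standard library only. `legacy_shipping_fee` is a hypothetical stand-in for real legacy code, and `characterize` is a helper invented for this example.

```python
import itertools
import json


def legacy_shipping_fee(weight_kg: float, zone: str, express: bool) -> float:
    """Hypothetical stand-in for legacy code whose rules nobody remembers."""
    fee = 4.0 + 1.5 * weight_kg
    if zone == "international":
        fee *= 2
    if express:
        fee += 10.0
    return round(fee, 2)


def characterize(func, **param_values) -> str:
    """Call func for every combination of inputs and record each result."""
    names = list(param_values)
    lines = []
    for combo in itertools.product(*param_values.values()):
        kwargs = dict(zip(names, combo))
        lines.append(f"{json.dumps(kwargs)} => {func(**kwargs)}")
    return "\n".join(lines)


report = characterize(
    legacy_shipping_fee,
    weight_kg=[0.5, 10],
    zone=["domestic", "international"],
    express=[False, True],
)
```

The resulting `report` has one line per combination (eight here) and can be passed to `verify` as a single approved text. Each line pairs inputs with output, which makes it easier to lift a single case out later into a targeted assertion.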

Tradeoffs and architecture decisions

Aspect          | Assertion Tests         | Approval Tests
Setup effort    | High (write each check) | Low (capture output once)
Maintenance     | Low (focused checks)    | Medium (update approved files)
False positives | Low                     | Medium (formatting changes)
Readability     | Rules are explicit      | Reference is the full output
Best for        | Business logic          | Complex output, legacy code

The pragmatic approach combines both: approval tests for output verification, assertion tests for business rules. A test file might use assertions for “the discount is 15%” and approval testing for “the full invoice looks like this.”

One thing to remember: Approval testing is most powerful as a stepping stone — it captures behavior quickly and provides a safety net while you build understanding. The best teams use it to bootstrap coverage, then gradually replace broad approval tests with precise assertions as they learn the domain.
