Python Approval Testing — Core Concepts
The core idea
Traditional tests assert specific conditions: “the result should equal 42.” Approval tests take a different approach: “the result should match what we previously approved.”
This is sometimes called “golden master” testing. You produce output, inspect it manually, approve it, and save it as a reference file. Future test runs compare new output against this reference. Any difference triggers a failure with a clear diff.
When approval testing shines
Approval testing is most valuable when output is complex and hard to assert piecemeal:
- HTML/email templates: Checking the entire rendered output is simpler than asserting 50 individual DOM elements
- Report generation: A financial report with dozens of rows and calculations is easier to verify as a complete document
- Serialization formats: JSON, XML, or CSV output where the structure matters as much as individual values
- CLI output: Command-line tools where formatting, alignment, and text content all need to be correct
In these cases, writing individual assertions is both tedious and brittle. Approval testing captures the “big picture” in one shot.
The workflow
The approval testing cycle has four steps:
- Run: Execute the code and capture its output
- Compare: Diff the new output against the approved version
- If different: The test fails. You see exactly what changed.
- Decide: Either fix the code (output shouldn’t have changed) or approve the new output (the change was intentional)
The first time a test runs, there’s no approved version. The test fails by default, and you must explicitly approve the initial output after verifying it’s correct. This prevents accidentally approving wrong output.
Key tools in the Python ecosystem
approvaltests is the main Python library for this pattern. It provides verify() functions that handle the capture-compare-diff cycle and stores approved outputs as .approved.txt files alongside your tests.
pytest-regtest takes a similar approach but integrates directly with pytest, storing expected output in a registry directory.
syrupy focuses specifically on snapshot assertions and works well with pytest, supporting multiple serialization formats.
Approved files management
Approved output files live in version control alongside your tests. When a developer changes code that produces different output, they update the approved files as part of their commit. Code reviewers can then see both the code change and its effect on output.
This creates an audit trail. Git history shows exactly when output changed and why. For regulated industries (finance, healthcare), this documentation is valuable for compliance.
Limitations to understand
Brittle to formatting changes: If you refactor code that changes whitespace, line ordering, or timestamp formatting in output, approval tests break even though the content is functionally identical. Use normalizers (strip timestamps, sort keys) to reduce false positives.
Initial approval requires diligence: The entire approach relies on a human correctly verifying the first output. Approve wrong output and your “golden master” is wrong — all future comparisons validate against a flawed reference.
Not a replacement for logic tests: Approval testing verifies output shape, not business rules. You still need unit tests that assert “discount is applied when coupon is valid.” Approval tests complement logic tests, they don’t replace them.
One thing to remember: Approval testing trades writing assertions for maintaining reference files — it’s a workflow shift, not a complexity reduction, and the payoff comes when output is too complex for hand-written assertions to cover practically.
See Also
- Python Acceptance Testing Patterns How Python teams verify software does what real users actually asked for.
- Python Behavior Driven Development Get an intuitive feel for Behavior Driven Development so Python behavior stops feeling unpredictable.
- Python Browser Automation Testing How Python can control a web browser like a robot to test websites automatically.
- Python Chaos Testing Applications Why breaking your own Python systems on purpose makes them stronger.
- Python Contract Testing Why contract testing is like having a written agreement between two teams so neither one accidentally breaks the other's work.