Python pprint — Deep Dive

How the formatting algorithm works

PrettyPrinter uses a recursive approach with a width budget:

  1. Try to format the entire object on one line using repr()
  2. If it fits within the remaining width, use it
  3. If not, expand: format each element on its own line, indented, and recurse

The decision to expand is per-object, not global. A list of short strings might stay on one line while a list of dicts expands — even within the same parent structure.
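The per-object decision can be observed with a small experiment. The sketch below uses made-up sample data (the host names and the width of 60 are assumptions for illustration) and checks which parts of the structure were expanded.

```python
from pprint import pformat

# Hypothetical sample data: one short list and one list of wider dicts.
data = {
    "tags": ["a", "b", "c"],
    "servers": [
        {"host": "alpha.example.com", "port": 8080, "debug": False},
        {"host": "beta.example.com", "port": 8081, "debug": True},
    ],
}

text = pformat(data, width=60)
print(text)

# The short list fits in the remaining width, so it stays on one line.
tags_on_one_line = "'tags': ['a', 'b', 'c']" in text
# The list of dicts does not fit, so the output spans several lines.
servers_expanded = len(text.splitlines()) > 2
print(tags_on_one_line, servers_expanded)
```

Both containers share the same parent dict, yet only the one that overflows its width budget is broken across lines.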

The key internal methods are:

  • _format(object, stream, indent, allowance, context, level) — the recursive core
  • _repr(object, context, level) — generates the one-line repr
  • _pprint_dict, _pprint_list, etc. — type-specific formatters registered via _dispatch
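To see how the dispatch table is keyed, it can be inspected directly. This is a sketch for exploration only: _dispatch is a private attribute of CPython's implementation and may change between versions, and LoudList is a hypothetical class invented for the example.

```python
from pprint import PrettyPrinter

# _dispatch is a private CPython detail; inspect it for learning only.
dispatch = PrettyPrinter._dispatch

# Handlers are keyed by the type's __repr__ function, not by the type itself.
handled_types = [t for t in (dict, list, tuple, set, str, bytes)
                 if t.__repr__ in dispatch]
print([t.__name__ for t in handled_types])

# A subclass that overrides __repr__ no longer matches any table entry,
# so pprint falls back to its one-line repr.
class LoudList(list):
    def __repr__(self):
        return "LoudList(" + super().__repr__() + ")"

print(LoudList.__repr__ in dispatch)
```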

Custom object formatting

By default, pprint calls repr() on objects it doesn’t have a specific formatter for. You can improve this in two ways:

Option 1: Define __repr__ on your class

class Config:
    def __init__(self, host, port, debug):
        self.host = host
        self.port = port
        self.debug = debug

    def __repr__(self):
        return (
            f"Config(host={self.host!r}, "
            f"port={self.port!r}, "
            f"debug={self.debug!r})"
        )

pprint will use this repr when the object appears inside containers.
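As a quick check, here is the same class placed inside a dict and formatted with pformat. The "primary" key and the constructor values are placeholders chosen for the example.

```python
from pprint import pformat

class Config:
    def __init__(self, host, port, debug):
        self.host = host
        self.port = port
        self.debug = debug

    def __repr__(self):
        return (
            f"Config(host={self.host!r}, "
            f"port={self.port!r}, "
            f"debug={self.debug!r})"
        )

configs = {"primary": Config("localhost", 8080, True)}
text = pformat(configs)
print(text)
# → {'primary': Config(host='localhost', port=8080, debug=True)}
```

Because the repr is a single string, pprint treats it as indivisible: it can move the object to its own line, but it will not wrap the repr itself when it is too wide.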

Option 2: Register a custom formatter

For deeper control, subclass PrettyPrinter and register a dispatch handler:

from pprint import PrettyPrinter

class CustomPP(PrettyPrinter):
    # Copy the table so the handler is not registered on PrettyPrinter itself.
    _dispatch = PrettyPrinter._dispatch.copy()

    def _pprint_config(self, object, stream, indent, allowance, context, level):
        cls_name = object.__class__.__name__
        items = list(vars(object).items())
        last_index = len(items) - 1
        stream.write(f"{cls_name}(\n")
        next_indent = indent + self._indent_per_level
        for i, (key, val) in enumerate(items):
            stream.write(" " * next_indent + f"{key}=")
            # _format already incremented level before calling this handler,
            # so pass it through unchanged, as the built-in handlers do.
            self._format(val, stream, next_indent + len(key) + 1,
                         allowance if i == last_index else 1,
                         context, level)
            if i < last_index:
                stream.write(",\n")
        stream.write("\n" + " " * indent + ")")

    # Config is the class from Option 1; it must define its own __repr__.
    _dispatch[Config.__repr__] = _pprint_config

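Note: the dispatch table is keyed by the type’s __repr__ function, which is fragile: the handler silently stops applying to any subclass that overrides __repr__, and it only runs when the one-line repr does not fit in the available width. _dispatch and _format are also private, undocumented APIs that can change between Python versions. For production use, consider wrapping objects, or converting them to plain dicts and lists, before passing them to pprint.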
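One way to avoid the private API is to convert objects into plain containers first. The sketch below is a minimal illustration, not a complete solution: to_printable and Server are hypothetical names, and the helper does not handle cycles, sets, or objects that use __slots__.

```python
from pprint import pformat

def to_printable(obj):
    """Hypothetical helper: turn plain objects into dicts pprint already handles."""
    if isinstance(obj, dict):
        return {key: to_printable(val) for key, val in obj.items()}
    if isinstance(obj, list):
        return [to_printable(item) for item in obj]
    if isinstance(obj, tuple):
        return tuple(to_printable(item) for item in obj)
    # Instances with a __dict__ become a dict tagged with their class name.
    if hasattr(obj, "__dict__") and not isinstance(obj, type):
        fields = {key: to_printable(val) for key, val in vars(obj).items()}
        return {"__class__": type(obj).__name__, **fields}
    return obj

class Server:
    def __init__(self, host, port):
        self.host = host
        self.port = port

result = to_printable({"servers": [Server("alpha", 80)]})
print(pformat(result, width=40))
```

The converted structure contains only built-in types, so pprint's normal width handling and key sorting apply without any subclassing.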

Integration with logging

pprint pairs well with Python’s logging module for structured debug output:

import logging
from pprint import pformat

logger = logging.getLogger(__name__)

def process_response(response):
    logger.debug("API response:\n%s", pformat(response, width=100, depth=4))

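Using pformat (not pprint) avoids writing to stdout and lets the logging framework handle the output. Note that the %s placeholder only defers string interpolation: pformat(response, ...) is an ordinary argument expression, so it is still evaluated on every call, even when DEBUG is disabled.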

For high-volume logging, guard the formatting:

if logger.isEnabledFor(logging.DEBUG):
    logger.debug("State:\n%s", pformat(large_state))

This avoids the pformat cost entirely when debug logging is disabled.
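An alternative to the explicit guard is to pass an object whose __str__ does the formatting, so the work happens only if a handler actually renders the message. The sketch below assumes a hypothetical LazyPformat helper and a logger name chosen for the example.

```python
import logging
from pprint import pformat

class LazyPformat:
    """Hypothetical helper: defers pformat until the log message is rendered."""

    def __init__(self, obj, **kwargs):
        self.obj = obj
        self.kwargs = kwargs
        self.rendered = False  # tracked only to demonstrate the laziness

    def __str__(self):
        self.rendered = True
        return pformat(self.obj, **self.kwargs)

logger = logging.getLogger("pprint_lazy_demo")
logger.setLevel(logging.INFO)  # DEBUG is disabled for this logger

state = {"users": list(range(50))}
lazy = LazyPformat(state, width=100)

# The record is never created at this level, so __str__ is not called.
logger.debug("State:\n%s", lazy)
print(lazy.rendered)
```

The trade-off is that the object is held by reference: if it is mutated before a handler formats the record, the logged text reflects the later state.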

Circular reference handling

pprint detects circular references by tracking the id() of every container it is currently formatting:

from pprint import pprint

a = [1, 2]
a.append(a)  # circular!

pprint(a)
# [1, 2, <Recursion on list with id=...>]

This is one of pprint’s advantages over json.dumps, which raises ValueError on circular structures.
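The difference can be checked with the public helpers isrecursive and isreadable, alongside the json.dumps failure. This sketch reuses the self-referencing list from above.

```python
import json
from pprint import isreadable, isrecursive, pformat

a = [1, 2]
a.append(a)  # the list now contains itself

print(isrecursive(a))  # True: a recursive reference was found
print(isreadable(a))   # False: the output cannot be eval()'d back
marker_present = pformat(a).startswith("[1, 2, <Recursion on list with id=")
print(marker_present)

json_error = None
try:
    json.dumps(a)
except ValueError as exc:
    json_error = str(exc)
print("json.dumps failed:", json_error)
```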

Performance characteristics

pprint is not optimized for speed — it’s a debugging tool. Rough benchmarks:

  Data size                  pformat time
  100-element flat list      ~50μs
  1,000-element flat dict    ~500μs
  10,000-element nested      ~15ms
  100,000-element nested     ~200ms

For large data, use depth to limit recursion and width to control expansion. If you’re formatting megabytes of data, you’re probably using the wrong tool — consider streaming JSON or custom formatters.
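Timings depend heavily on the machine, the Python version, and the shape of the data, so it is worth measuring on your own payload. The sketch below uses a made-up nested dict and a hypothetical time_pformat helper; it prints measurements rather than assuming any particular result.

```python
import timeit
from pprint import pformat

# Hypothetical nested payload; substitute your own data.
nested = {
    f"key{i}": {"values": list(range(20)), "meta": {"id": i, "tags": ["x", "y"]}}
    for i in range(200)
}

def time_pformat(obj, repeat=5, **kwargs):
    """Return the best wall-clock time in seconds for one pformat call."""
    return min(timeit.repeat(lambda: pformat(obj, **kwargs), number=1, repeat=repeat))

full_time = time_pformat(nested)
shallow_time = time_pformat(nested, depth=1)
full_len = len(pformat(nested))
shallow_len = len(pformat(nested, depth=1))

print(f"full:    {full_time * 1e3:.2f} ms, {full_len} chars")
print(f"depth=1: {shallow_time * 1e3:.2f} ms, {shallow_len} chars")
```

With depth=1 every nested container is replaced by a {...} or [...] placeholder, so the output is much shorter and usually faster to produce.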

pprint in the REPL and IPython

The standard Python REPL uses repr() by default. You can make pprint the default display:

import builtins
import sys
from pprint import pprint

# Override displayhook for REPL
def pprint_displayhook(value):
    if value is not None:
        # Clear _ first, as the default hook does, then store the new value.
        builtins._ = None
        pprint(value)
        builtins._ = value

sys.displayhook = pprint_displayhook

IPython pretty-prints results by default; its %pprint magic command toggles this on and off.

Production recipes

Diff-friendly configuration dumps

from pprint import pformat

def dump_config(config, path="config_dump.txt"):
    """Write config in a format that diffs well."""
    formatted = pformat(config, width=60, sort_dicts=True)
    with open(path, "w") as f:
        f.write(formatted + "\n")

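Sorting keys and using a consistent width means git diffs show only actual changes, not reformatting noise. sort_dicts=True is already the default (the parameter was added in Python 3.8); passing it explicitly documents the intent.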
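The effect of key sorting is easy to verify: two dicts with the same content but different insertion order produce identical text. The sample keys below are placeholders, and sort_dicts requires Python 3.8 or later.

```python
from pprint import pformat

first = {"port": 8080, "host": "localhost", "debug": False}
second = {"debug": False, "host": "localhost", "port": 8080}

# Sorted output is independent of insertion order.
same_when_sorted = pformat(first, width=60) == pformat(second, width=60)
# Without sorting, insertion order leaks into the dump and into the diff.
differs_unsorted = (pformat(first, sort_dicts=False)
                    != pformat(second, sort_dicts=False))

print(same_when_sorted, differs_unsorted)
# → True True
```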

Truncated debug output

from pprint import pformat

def debug_preview(obj, max_chars=500):
    """Format object for debug, truncating if too large."""
    full = pformat(obj, width=80, depth=3)
    if len(full) > max_chars:
        return full[:max_chars] + "\n... [truncated]"
    return full

Test assertion messages

from pprint import pformat

def assert_dict_equal(actual, expected):
    if actual != expected:
        msg = (
            f"Dicts differ:\n"
            f"ACTUAL:\n{pformat(actual, width=60)}\n\n"
            f"EXPECTED:\n{pformat(expected, width=60)}"
        )
        raise AssertionError(msg)
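For larger structures, a line-level diff of the two pformat outputs is often easier to read than two full dumps; unittest's assertDictEqual takes a similar approach. The sketch below assumes a hypothetical pformat_diff helper built on difflib.

```python
import difflib
from pprint import pformat

def pformat_diff(actual, expected, width=60):
    """Hypothetical helper: unified diff of the pformat output of two objects."""
    expected_lines = pformat(expected, width=width).splitlines()
    actual_lines = pformat(actual, width=width).splitlines()
    diff = difflib.unified_diff(
        expected_lines, actual_lines,
        fromfile="expected", tofile="actual", lineterm="",
    )
    return "\n".join(diff)

# width=1 forces one dict entry per line, so the diff is per key.
diff_text = pformat_diff({"a": 1, "b": 3}, {"a": 1, "b": 2}, width=1)
print(diff_text)

# Equal objects produce an empty diff.
no_diff = pformat_diff({"a": 1}, {"a": 1})
```

A very small width also makes pprint split long strings across lines, so a moderate width may read better for string-heavy data.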

pprint vs. modern alternatives

  Tool              Best for
  pprint            Quick debugging, stdlib-only environments
  rich.pretty       Color-highlighted, theme-aware terminal output
  icecream          Inline debug printing with variable names
  devtools.debug    pydantic-aware, colored output
  json.dumps        JSON-valid output for APIs

rich.pretty.install() replaces the REPL’s default repr() display with syntax-highlighted, type-aware output. For interactive development it is often the more comfortable choice, but pprint remains valuable in environments where third-party packages aren’t available.
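Code that has to run in both kinds of environment can fall back to the standard library when rich is missing. This is a minimal sketch; the pretty_print alias is a name chosen for the example, and the two functions accept different keyword arguments, so only the positional call is portable.

```python
try:
    # rich is a third-party package and may not be installed.
    from rich.pretty import pprint as pretty_print
except ImportError:
    # Fall back to the standard library formatter.
    from pprint import pprint as pretty_print

pretty_print({"host": "localhost", "ports": [8080, 8081]})
```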

One thing to remember

pprint is the standard library’s general-purpose debugging formatter: it falls back to repr() for types it has no formatter for, and it tolerates circular references. Use pformat for string output, guard expensive calls behind log-level checks, and know that rich.pretty is a richer third-party alternative when you can afford a dependency.
