Compiler Pipeline — Deep Dive

Walk through CPython's compilation architecture — PEG parser, symbol tables, CFG construction, and the eval loop that drives execution.

Architecture Overview

CPython’s compilation pipeline lives primarily in these source files:

Parser/tokenize.c — Tokenizer
Parser/pegen.c, Parser/parser.c — PEG parser (auto-generated from Grammar/python.gram)
Python/ast.c — AST construction and validation
Python/ast_opt.c — AST-level optimizations
Python/symtable.c — Symbol table construction
Python/compile.c — Bytecode compilation
Python/flowgraph.c — CFG construction and optimization (Python 3.12+)
Python/ceval.c — The evaluation loop

Understanding these files gives you a map of the entire journey from source to execution.

Stage 1: Tokenization in Detail

The tokenizer is the first consumer of source bytes. It handles encoding detection (reading # -*- coding: utf-8 -*- or the BOM), converts bytes to characters, and produces a token stream.

Key behaviors:

Indentation tracking: The tokenizer maintains an indentation stack. When a line’s indentation increases, it emits INDENT. When it decreases, it emits one or more DEDENT tokens to match. This is how Python’s significant whitespace is encoded at the grammar level.
Implicit line joining: Expressions inside brackets ((), [], {}) can span multiple lines without backslash continuation. The tokenizer tracks bracket depth to handle this.
F-string tokenization: Since Python 3.12, f-strings are tokenized properly at the tokenizer level (previously they were partially handled by the parser).

import token, tokenize, io

code = """
def greet(name):
    return f"Hello, {name}!"
"""

for tok in tokenize.generate_tokens(io.StringIO(code).readline):
    print(f"{token.tok_name[tok.type]:10} {tok.string!r:20} "
          f"line {tok.start[0]}")

Stage 2: The PEG Parser

Python 3.9 replaced the LL(1) parser with a PEG parser. The grammar is defined in Grammar/python.gram using a notation that supports:

Ordered alternatives: Try each option in order, use the first that matches
Lookahead: &expr (positive) and !expr (negative) without consuming input
Left recursion: Handled automatically (LL(1) cannot do this)
Soft keywords: match and case are keywords only in pattern-matching contexts

The parser generator (Tools/peg_generator/) reads this grammar and produces C code that builds AST nodes directly. There is no intermediate parse tree — tokens go straight to AST.

import ast

source = """
match command:
    case "quit":
        exit()
    case ("go", direction):
        move(direction)
"""

tree = ast.parse(source)
match_node = tree.body[0]
print(type(match_node).__name__)  # Match
print(len(match_node.cases))       # 2

Stage 3: AST Validation and Optimization

After parsing, ast.c validates the tree (checking for invalid constructs that the grammar allows but semantics forbid, like del f()). Then ast_opt.c runs optimizations:

Constant folding evaluates constant expressions recursively. The folder handles:

# All folded at compile time:
x = 1 + 2 * 3         # → 7
y = "ab" + "cd"        # → "abcd"
z = (1, 2) + (3,)      # → (1, 2, 3)
w = not True            # → False

Membership test optimization converts constant containers:

# list → tuple (O(n) but less overhead):
x in [1, 2, 3]  →  x in (1, 2, 3)

# set literal → frozenset (O(1) lookup):
x in {1, 2, 3}  →  x in frozenset({1, 2, 3})

Format string simplification handles simple cases where f-string segments can be merged.

Stage 4: Symbol Table Construction

Python/symtable.c performs a critical pre-pass before compilation. It walks the entire AST and builds a symbol table that maps every name to its scope:

LOCAL — Assigned in this scope (uses LOAD_FAST)
GLOBAL_EXPLICIT — Declared with global statement
GLOBAL_IMPLICIT — Referenced but not assigned locally
FREE — Referenced from an enclosing scope (nonlocal or implicit closure)
CELL — Local but also captured by an inner scope

This pass detects errors:

def broken():
    print(x)  # SyntaxError or UnboundLocalError?
    x = 10    # This makes x local, so the print above fails

# The symbol table marks x as LOCAL because of the assignment,
# so LOAD_FAST is used — but x hasn't been assigned yet at the print

The symbol table also determines whether a function needs a closure (any CELL or FREE variables) and sets appropriate flags in the code object.

Stage 5: Bytecode Compilation

Python/compile.c is the core compiler. It walks the AST depth-first and emits bytecode instructions. The compilation is scope-driven: entering a function definition starts a new compilation unit with its own code object.

Key responsibilities:

Expression compilation: Recursive descent through expression nodes, emitting stack-machine instructions
Control flow: Translating if/elif/else, for, while, try/except/finally into jump instructions
Comprehension compilation: Each comprehension creates its own code object (a hidden nested function)
Class compilation: Class bodies are compiled as separate code objects and executed at class creation time

The compiler produces an intermediate representation — a sequence of instruction objects with symbolic labels for jump targets.

Stage 6: CFG Construction and Optimization

In Python 3.12+, the instruction sequence is converted to a Control Flow Graph (CFG) in Python/flowgraph.c. Each basic block is a sequence of non-branching instructions. Edges represent jumps.

The CFG optimizer performs:

Dead block elimination: Blocks unreachable from the entry point are removed
Jump threading: Chains of jumps are collapsed
Instruction peephole: NOP removal, redundant load/store elimination
Stack depth calculation: The maximum evaluation stack depth is computed by walking all CFG paths

After optimization, the CFG is linearized back into a flat instruction sequence, labels are resolved to concrete offsets, and the final bytecode bytes are assembled.

Stage 7: Code Object Assembly

The final step packages everything into a PyCodeObject:

Bytecode bytes (co_code)
Constants pool (co_consts)
Variable name tables (co_varnames, co_names, co_cellvars, co_freevars)
Source mapping (co_linetable, exception table)
Metadata (argument counts, flags, stack size)

Nested scopes produce nested code objects — inner function code objects appear in the outer’s co_consts.

The Eval Loop

Python/ceval.c contains _PyEval_EvalFrameDefault, a massive function that implements the bytecode interpreter. It is essentially a giant switch statement over opcodes:

// Simplified pseudocode
for (;;) {
    opcode = next_instruction();
    switch (opcode) {
        case LOAD_FAST:
            value = fastlocals[oparg];
            PUSH(value);
            break;
        case BINARY_OP:
            right = POP();
            left = POP();
            result = binary_op(oparg, left, right);
            PUSH(result);
            break;
        case RETURN_VALUE:
            retval = POP();
            goto exit_frame;
        // ... hundreds more cases
    }
}

Python 3.11+ uses computed gotos (on supported compilers) instead of a switch, and Python 3.13 experiments with a register-based interpreter for further speedup.

Specializing Adaptive Interpreter

Python 3.11 added a runtime optimization layer on top of the static compilation. After a bytecode instruction executes a few times, the interpreter replaces it with a specialized version:

BINARY_OP (+)  →  BINARY_OP_ADD_INT    (when both operands are ints)
LOAD_ATTR      →  LOAD_ATTR_INSTANCE_VALUE (when object layout is known)
CALL           →  CALL_PY_EXACT_ARGS   (when target is a Python function)

Specialization is per-instruction and uses inline caches stored after the instruction. If the type assumption fails, the instruction de-specializes back to the generic form. This is not JIT compilation — it is still interpreted — but it eliminates type-checking overhead for common patterns.

Hooking into the Pipeline

You can intercept compilation at several points:

Import hooks (importlib.abc.SourceLoader.source_to_code) — Replace the entire compilation process for imported modules.

AST transformers — Modify the tree between parsing and compilation:

import ast

source = open("module.py").read()
tree = ast.parse(source)
tree = MyTransformer().visit(tree)
code = compile(tree, "module.py", "exec")
exec(code)

compile() built-in — Accepts source strings, AST objects, or (with flags) returns an AST instead of a code object.

sys.settrace() / sys.monitoring — Hook into the eval loop at runtime.

One thing to remember: CPython’s compiler pipeline is a sophisticated multi-stage system that transforms human-readable source into optimized bytecode through tokenization, PEG parsing, AST optimization, symbol resolution, CFG-based compilation, and bytecode optimization. Each stage is well-defined and hookable, giving tool authors and curious developers clear entry points to understand, analyze, and extend how Python processes code.

pythoncompiler-internalslanguage-implementation