Compiler Pipeline — Core Concepts

Trace the path from .py file to running bytecode — tokenizer, parser, AST, optimizer, and code generation in CPython.

The Big Picture

When you run python script.py, CPython does not execute your source text directly. It first compiles the source into bytecode, then interprets that bytecode. The script you run directly is recompiled on every run; modules it imports can instead be loaded from a cached .pyc file when one is up to date. The pipeline has distinct stages, each feeding into the next.

Stage 1: Tokenization (Lexical Analysis)

The tokenizer reads raw source text character by character and groups characters into tokens — the smallest meaningful units of the language. A token might be a keyword (def, if, return), a name (my_variable), a number (42), an operator (+, ==), or punctuation (:, ().

Python’s tokenizer also handles indentation. It tracks the indentation level and generates INDENT and DEDENT tokens when the level changes, which is how Python enforces its whitespace-based block structure.

You can see tokenization in action:

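import io
import tokenize

source = "x = 2 + 3\n"
# generate_tokens() expects a readline-style callable that returns str lines
tokens = tokenize.generate_tokens(io.StringIO(source).readline)
for tok in tokens:
    # Each TokenInfo holds the type, text, start/end positions, and source line
    print(tokenize.tok_name[tok.type], repr(tok.string))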
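
The INDENT and DEDENT tokens described above can be observed the same way. This is a small sketch; the sample source and the variable names are illustrative choices only.

```python
import io
import tokenize

# A two-line block: the indented body should produce INDENT and DEDENT tokens.
block_source = "if x:\n    y = 1\n"

token_names = [
    tokenize.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO(block_source).readline)
]
print(token_names)
```

The INDENT appears right after the NEWLINE that ends the "if x:" line, and the DEDENT appears just before the final ENDMARKER.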

Stage 2: Parsing (Syntax Analysis)

The parser takes the stream of tokens and builds an Abstract Syntax Tree (AST). The AST is a tree structure where each node represents a language construct — an assignment, a function definition, a binary operation — and child nodes represent sub-components.
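
One way to look at the tree is the standard ast module, which exposes the parser's output to Python code. The snippet below is a minimal sketch; the source string is just an example.

```python
import ast

tree = ast.parse("x = 2 + 3")

# The module body holds one statement: the assignment.
assign = tree.body[0]
print(type(assign).__name__)
print(type(assign.value).__name__, type(assign.value.op).__name__)

# ast.dump() renders the whole tree; the indent= argument needs Python 3.9+.
print(ast.dump(tree, indent=2))
```

Note that ast.parse() returns the tree as parsed, so the addition is still a binary-operation node with two constant children at this point.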

Since Python 3.9, CPython uses a PEG (Parsing Expression Grammar) parser, replacing the older LL(1) parser. The PEG parser is generated from a grammar file (Grammar/python.gram) and can handle more complex grammar rules without workarounds.

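Parsing is where most syntax errors are caught. If your tokens do not form a valid tree according to Python’s grammar, you get a SyntaxError with a line number and description. A few errors surface earlier, in the tokenizer (inconsistent indentation, for example), and a few later, in the symbol table pass and the compiler.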
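
As a sketch of what that error carries, the snippet below parses a deliberately broken source string and inspects the exception. The filename "<example>" is an arbitrary label, not a real file.

```python
import ast

bad_source = "x = 1\ny = = 2\n"  # the second line is not valid Python

error = None
try:
    ast.parse(bad_source, filename="<example>")
except SyntaxError as exc:
    error = exc
    # filename, lineno, offset, and msg are attributes of SyntaxError
    print(f"{exc.filename}:{exc.lineno}:{exc.offset}: {exc.msg}")
```

Nothing from the source runs before the error is reported: the first line is valid, but the assignment x = 1 never executes because the whole unit fails to parse.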

Stage 3: AST Optimization

Before generating bytecode, CPython runs an optimization pass over the AST. This is where constant folding happens: expressions like 2 + 3 are evaluated at compile time and replaced with 5. The optimizer also rewrites constant collections in membership tests for faster runtime checks: a constant list, as in x in [1, 2, 3], becomes a tuple, and a constant set, as in x in {1, 2, 3}, becomes a frozenset.
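
The effect of these rewrites is visible in the constants stored on the compiled code object. The following is a hedged sketch: it only shows the end result of compilation, not which internal pass produced it, and the exact contents of co_consts beyond the folded values vary between CPython versions.

```python
# Compile small snippets and look at the constants the compiler kept.
folded = compile("x = 2 + 3", "<example>", "exec")
print(folded.co_consts)

list_test = compile("found = x in [1, 2, 3]", "<example>", "exec")
set_test = compile("found = x in {1, 2, 3}", "<example>", "exec")
print(list_test.co_consts)
print(set_test.co_consts)
```

The folded value 5, the tuple, and the frozenset each appear as a single ready-made constant, so no arithmetic or collection building is needed at run time.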

This stage is relatively lightweight. It only handles transformations that are provably safe based on the tree structure alone, without any type inference or cross-function analysis.

Stage 4: Symbol Table Construction

Before the compiler can generate bytecode, it needs to know the scope of every name. The symbol table pass walks the AST and determines whether each name is local, global, a cell variable (a local that a nested function captures), or a free variable (a name captured from an enclosing function). This information dictates which bytecode instructions are used: LOAD_FAST for locals, LOAD_GLOBAL for globals, and LOAD_DEREF for cell and free variables.
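
The standard symtable module gives read access to this analysis. The sketch below uses made-up function and variable names (outer, inner, total, counter) purely for illustration.

```python
import symtable

source = '''
counter = 0

def outer():
    total = 1
    def inner():
        return total + counter
    return inner
'''

module_table = symtable.symtable(source, "<example>", "exec")
outer_table = module_table.get_children()[0]   # scope of outer()
inner_table = outer_table.get_children()[0]    # scope of inner()

total_in_outer = outer_table.lookup("total")
total_in_inner = inner_table.lookup("total")
counter_in_inner = inner_table.lookup("counter")

print("total in outer: local =", total_in_outer.is_local())
print("total in inner: free =", total_in_inner.is_free())
print("counter in inner: global =", counter_in_inner.is_global())
```

The same name, total, is classified differently depending on the scope it is looked up in, which is why the compiler needs this pass before it can pick load and store instructions.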

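This is also where Python detects scope-related errors at compile time, such as a global or nonlocal declaration that appears after the name has already been used in the same scope. The classic UnboundLocalError has its roots here too: a name assigned anywhere in a function is classified as local for the whole function, so reading it before the assignment runs fails. That error itself is raised at run time, not by the compiler.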
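
A short sketch can make the timing of scope failures concrete: one case compiles fine and only fails when called, the other is rejected before any code runs. The function and variable names are hypothetical examples.

```python
def read_then_assign():
    # 'value' is local to this function because of the assignment below,
    # so this read fails when the function is called.
    print(value)
    value = 1

runtime_error = None
try:
    read_then_assign()
except UnboundLocalError as exc:
    runtime_error = exc

# Declaring a name global after assigning it is rejected at compile time.
late_global = "def f():\n    x = 1\n    global x\n"
compile_error = None
try:
    compile(late_global, "<example>", "exec")
except SyntaxError as exc:
    compile_error = exc

print(type(runtime_error).__name__, "raised when the function ran")
print(type(compile_error).__name__, "raised while compiling")
```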

Stage 5: Bytecode Compilation

The compiler walks the AST and emits bytecode — a sequence of low-level instructions for CPython’s virtual machine. Each instruction is an opcode (like LOAD_FAST, BINARY_OP, CALL, RETURN_VALUE) paired with an argument.
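
The standard dis module lists the instructions the compiler emitted for a function. This is a sketch only: the function add is an arbitrary example, and the exact opcode names and their number differ between CPython versions.

```python
import dis

def add(a, b):
    return a + b

# Collect the opcode names, then print each instruction with its argument.
opnames = [ins.opname for ins in dis.get_instructions(add)]
for ins in dis.get_instructions(add):
    print(ins.opname, ins.argrepr)
```

Calling dis.dis(add) prints a similar listing with offsets and line numbers included.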

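The output is a code object: an immutable container holding the bytecode, a constants pool, variable name tables, and metadata like line number mappings. Every module, class body, function, lambda, and generator expression gets its own code object. List, set, and dict comprehensions did as well until Python 3.12, which in most cases inlines them into the enclosing code object (PEP 709).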
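
Code objects can be inspected directly. In the sketch below, the module-level code object is compiled from a string, and the nested code object for the function is found among its constants; the function name add is just an example.

```python
import types

module_code = compile("def add(a, b):\n    return a + b\n", "<example>", "exec")

# A function's code object is stored as a constant of the enclosing code object.
nested = [c for c in module_code.co_consts if isinstance(c, types.CodeType)]
add_code = nested[0]

print(module_code.co_name)
print(add_code.co_name, add_code.co_argcount, add_code.co_varnames)
print(len(add_code.co_code), "bytes of bytecode")
```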

Stage 6: Bytecode Optimization

After initial bytecode generation, CPython runs another optimization pass, this time at the bytecode level. In modern CPython (3.12+), this pass operates on a Control Flow Graph representation of the instructions, performing jump threading, dead block elimination, and instruction simplification.

The Interpreter Loop

Once compilation is complete, the interpreter (_PyEval_EvalFrameDefault) executes the bytecode. It creates a frame for each execution of a code object (a module being run, a function being called), reads instructions one at a time, and pushes and pops values on that frame's value stack. This is the “eval loop”, the heart of CPython.
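
To illustrate the idea of a stack-based loop, here is a toy interpreter. It is not CPython's implementation: the opcode names (PUSH_CONST, ADD, STORE, LOAD) and the run function are invented for this sketch, and the real loop also handles jumps, calls, and exceptions.

```python
def run(instructions):
    """Execute (opcode, argument) pairs on a value stack. A toy, not CPython."""
    stack = []
    variables = {}
    for opcode, arg in instructions:  # fetch one instruction at a time
        if opcode == "PUSH_CONST":
            stack.append(arg)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "STORE":
            variables[arg] = stack.pop()
        elif opcode == "LOAD":
            stack.append(variables[arg])
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return variables

# Roughly what "x = 2 + 3" would look like without constant folding.
program = [
    ("PUSH_CONST", 2),
    ("PUSH_CONST", 3),
    ("ADD", None),
    ("STORE", "x"),
]
result = run(program)
print(result)
```

Operands are pushed first and the operator consumes them from the stack, which is the same ordering the real bytecode follows.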

Caching: .pyc Files

After compiling an imported module, CPython writes the resulting code object to a .pyc file in the __pycache__ directory. On later imports, if the .pyc is up to date (checked against the source file's timestamp and size, or optionally a hash of its contents), Python loads the cached bytecode and skips the entire compilation pipeline. This is why the first import of a module after its source changes is slower than the same import in later runs.
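
The cache can be explored with the standard library. The sketch below writes a throwaway module (demo_module.py, a made-up name) into a temporary directory, compiles it explicitly with py_compile, and loads the code object back. It assumes the 16-byte .pyc header used since Python 3.7.

```python
import importlib.util
import marshal
import pathlib
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source_path = pathlib.Path(tmp) / "demo_module.py"
    source_path.write_text("ANSWER = 42\n")

    # Where the import system would look for the cached bytecode.
    expected_pyc = importlib.util.cache_from_source(str(source_path))

    # Compile explicitly instead of relying on an import side effect.
    pyc_path = py_compile.compile(str(source_path), doraise=True)
    pyc_exists = pathlib.Path(pyc_path).exists()

    # Assumption: a 16-byte header, then a marshalled code object.
    cached_code = marshal.loads(pathlib.Path(pyc_path).read_bytes()[16:])

# Run the cached code object without touching the source again.
namespace = {}
exec(cached_code, namespace)
print(pathlib.Path(pyc_path).name, namespace["ANSWER"])
```

The file name includes an interpreter tag, so bytecode from different Python versions can sit side by side in the same __pycache__ directory.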

The Full Pipeline Summary

Source text (.py)
    → Tokenizer → Token stream
    → Parser → Abstract Syntax Tree
    → AST Optimizer → Optimized AST
    → Symbol Table Builder → Scope information
    → Compiler → Bytecode (code objects)
    → Bytecode Optimizer → Optimized bytecode
    → Interpreter (eval loop) → Execution
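
The same flow can be driven by hand from Python, which is roughly what tools that rewrite code do. This is a coarse sketch: ast.parse() covers tokenizing and parsing, compile() covers the optimization, symbol table, and code generation stages together, and exec() hands the result to the interpreter. The source string is an arbitrary example.

```python
import ast

source = "result = 2 + 3\n"

# Tokenizer and parser: source text to AST.
tree = ast.parse(source, filename="<example>")

# Optimizer, symbol table, and code generation: AST to code object.
code = compile(tree, "<example>", "exec")

# Interpreter: run the code object in a fresh namespace.
namespace = {}
exec(code, namespace)
print(namespace["result"])
```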

A Common Misconception

Many people describe Python as “interpreted, not compiled.” This is misleading. Python is compiled — your source code goes through a full compilation pipeline to produce bytecode. The difference from languages like C is that C compiles to native machine code, while Python compiles to bytecode that runs on a virtual machine. The compilation just happens transparently every time you run your script.

One thing to remember: Python’s compiler pipeline transforms your source code through a series of well-defined stages — tokenizing, parsing, optimizing, scope analysis, and bytecode generation — before the interpreter ever executes a single instruction. Understanding this pipeline demystifies error messages, performance characteristics, and how tools like linters and debuggers hook into the process.
