Python dis Module and Bytecode — Deep Dive
Bytecode instruction format
In CPython 3.6+, every instruction is exactly 2 bytes (a “word”): one byte for the opcode and one byte for the argument. Instructions that need an argument larger than 255 use EXTENDED_ARG prefixes to build up the value across multiple words. Python 3.11 added inline CACHE entries after some instructions, which occupy extra 2-byte slots in the bytecode, but the logical model stays the same.
import dis
import sys
def example():
    # The value 100_000 lives in co_consts; the instruction only carries its index.
    # EXTENDED_ARG appears only when an argument (such as that index) exceeds 255.
    x = 100_000
# Access raw bytecode
code = example.__code__
print(f"Bytecode bytes: {list(code.co_code)}")
print(f"Python version: {sys.version_info[:2]}")
print()
dis.dis(example)
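To make the word format concrete, the sketch below walks co_code two bytes at a time and folds EXTENDED_ARG prefixes into the argument of the instruction that follows. The helper names decode_words and make_wide_function are hypothetical. The second one builds a function with 300 local variables, on the assumption that storing to a local whose index exceeds 255 forces the compiler to emit EXTENDED_ARG. The decoder does not skip inline cache slots, so on 3.11+ they show up as CACHE words.

```python
import dis

EXTENDED_ARG = dis.opmap["EXTENDED_ARG"]

def decode_words(code_bytes):
    """Yield (offset, opname, arg) for each 2-byte word, folding EXTENDED_ARG prefixes."""
    pending = 0  # high bits accumulated from preceding EXTENDED_ARG words
    for offset in range(0, len(code_bytes), 2):
        opcode, low_byte = code_bytes[offset], code_bytes[offset + 1]
        arg = pending | low_byte
        if opcode == EXTENDED_ARG:
            pending = arg << 8  # shift up to make room for the next byte
            continue
        pending = 0
        yield offset, dis.opname[opcode], arg

def make_wide_function(n_locals=300):
    """Hypothetical helper: compile a function with enough locals to pass index 255."""
    body = "\n".join(f"    v{i} = None" for i in range(n_locals))
    namespace = {}
    exec(f"def wide():\n{body}\n", namespace)
    return namespace["wide"]

wide = make_wide_function()
big_args = [
    (offset, name, arg)
    for offset, name, arg in decode_words(wide.__code__.co_code)
    if arg > 255
]
print(big_args[:3])  # instructions whose argument needed more than one byte
```

Running dis.dis(wide) alongside this should show the same offsets, with the EXTENDED_ARG words listed as separate instructions.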
The dis.Bytecode class
For programmatic analysis, dis.Bytecode provides an iterator of dis.Instruction named tuples:
import dis
def calculate(a, b, op):
    if op == "+":
        return a + b
    elif op == "*":
        return a * b
    return None
bc = dis.Bytecode(calculate)
for instr in bc:
    print(f"offset={instr.offset:3d} "
          f"opname={instr.opname:25s} "
          f"arg={instr.arg!s:5s} "
          f"argval={instr.argval!r}")
Each Instruction has: opcode, opname, arg, argval, argrepr, offset, starts_line, is_jump_target. The argval field resolves the raw argument into its semantic value — a variable name, a constant, or a jump target offset.
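As a small illustration of why argval is more useful than arg, the sketch below collects the names a function reads with LOAD_GLOBAL. The helper referenced_globals and the sample function are made up for this example; the raw arg would only be an index (and on newer versions an encoded one), while argval is already the name.

```python
import dis

def referenced_globals(func):
    """Return the names a function loads with LOAD_GLOBAL, taken from argval."""
    return {
        instr.argval
        for instr in dis.get_instructions(func)
        if instr.opname == "LOAD_GLOBAL"
    }

def sample(values):
    return len(values) + max(values)

print(sorted(referenced_globals(sample)))  # ['len', 'max']
```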
Jump analysis and control flow graphs
You can build a basic control flow graph from bytecode:
import dis
from collections import defaultdict
def build_cfg(func):
    """Group a function's instructions into basic blocks keyed by block number."""
    instructions = list(dis.Bytecode(func))
    blocks = defaultdict(list)
    # Find block leaders: offset 0, jump targets, and instructions after jumps
    leaders = {0}
    for i, instr in enumerate(instructions):
        if instr.opcode in dis.hasjabs or instr.opcode in dis.hasjrel:
            leaders.add(instr.argval)
            # The instruction after a jump is also a leader. Use its real offset:
            # inline CACHE entries (3.11+) mean it is not always offset + 2.
            if i + 1 < len(instructions):
                leaders.add(instructions[i + 1].offset)
    # Group instructions into basic blocks
    sorted_leaders = sorted(leaders)
    leader_to_block = {offset: n for n, offset in enumerate(sorted_leaders)}
    for instr in instructions:
        # An instruction belongs to the nearest leader at or before its offset
        leader = max(l for l in sorted_leaders if l <= instr.offset)
        blocks[leader_to_block[leader]].append(instr)
    return dict(blocks)
def show_cfg(func):
    cfg = build_cfg(func)
    for block_id, instrs in sorted(cfg.items()):
        print(f"\n--- Block {block_id} ---")
        for instr in instrs:
            print(f" {instr.offset:3d}: {instr.opname} {instr.argrepr}")
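Basic blocks are only the nodes of the graph; the edges come from the jumps themselves. The following sketch lists them as (source, target) offset pairs. The names jump_edges and classify are hypothetical, and fall-through edges (from a block into the one that follows it) are left out to keep the example short.

```python
import dis

def jump_edges(func):
    """Return (source_offset, target_offset) pairs for every jump instruction."""
    # hasjabs is empty on versions where all jumps are relative, so guard the lookup
    jump_opcodes = set(dis.hasjrel) | set(getattr(dis, "hasjabs", []))
    return [
        (instr.offset, instr.argval)  # for jumps, argval is the target offset
        for instr in dis.get_instructions(func)
        if instr.opcode in jump_opcodes
    ]

def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"

for source, target in jump_edges(classify):
    print(f"{source:3d} -> {target:3d}")
```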
The specializing adaptive interpreter (Python 3.11+)
Python 3.11 introduced a specializing adaptive interpreter that rewrites bytecode at runtime. Generic instructions are replaced with type-specific versions after a few executions:
- LOAD_ATTR → LOAD_ATTR_INSTANCE_VALUE (for regular attribute access)
- BINARY_OP → BINARY_OP_ADD_INT (for integer addition)
- CALL → CALL_PY_EXACT_ARGS (for Python functions with matching signatures)
You can see these specialized instructions with dis.dis() using the adaptive=True parameter (Python 3.11+):
import dis
def tight_loop():
    total = 0
    for i in range(1000):
        total += i
    return total
# Run the function to trigger specialization
tight_loop()
# Show specialized bytecode
dis.dis(tight_loop, adaptive=True)
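The specialization happens in place: the bytecode array is modified so the next execution uses the fast path directly. If a specialized instruction's guard fails (for example, an operand has a different type than expected), that execution falls back to the generic behavior, and after repeated misses the instruction reverts to its adaptive form and may specialize again.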
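One way to check what was rewritten is to compare the instruction names in the normal view against the adaptive view. This is a minimal sketch: specialized_opnames and sum_to are hypothetical names, and which specialized instructions appear (if any) depends on the interpreter version and build, so no particular output is assumed.

```python
import dis
import sys

def specialized_opnames(func):
    """Return opnames visible only in the adaptive view (empty before 3.11)."""
    if sys.version_info < (3, 11):
        return set()  # the adaptive parameter does not exist on older versions
    generic = {i.opname for i in dis.get_instructions(func)}
    adaptive = {i.opname for i in dis.get_instructions(func, adaptive=True)}
    return adaptive - generic

def sum_to(n):
    total = 0
    for i in range(n):
        total += i
    return total

for _ in range(20):  # warm up so the interpreter has a chance to specialize
    sum_to(1000)

print(sorted(specialized_opnames(sum_to)))
```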
Bytecode optimization patterns
Pattern 1: Constant folding
CPython folds constant expressions at compile time. Older releases did this in the peephole optimizer; since Python 3.7 the compiler's own optimization passes handle it:
import dis
def constants():
    x = 2 * 3 * 7  # Folded to 42 at compile time
    y = "hello" + " " + "world"  # Folded to "hello world"
    z = (1, 2, 3) + (4, 5)  # Folded to (1, 2, 3, 4, 5)
dis.dis(constants)
# You'll see a single constant load of 42, not two constant loads followed by
# BINARY_MULTIPLY (3.10 and earlier) or BINARY_OP (3.11+)
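Reading the listing by eye works, but the same check can be automated. The sketch below, with the hypothetical helper has_arithmetic, reports whether any binary-operation instruction survives compilation. It assumes folding applies to small integer products like this one, which holds on the CPython versions discussed here.

```python
import dis

def has_arithmetic(func):
    """True if the compiled function still contains a runtime binary operation."""
    return any(
        instr.opname.startswith(("BINARY_", "INPLACE_"))
        for instr in dis.get_instructions(func)
    )

def seconds_per_day():
    return 60 * 60 * 24  # every operand is a constant

def scale(n):
    return n * 60  # n is only known at run time

print(has_arithmetic(seconds_per_day))  # False: folded to a single constant
print(has_arithmetic(scale))            # True: the multiplication remains
```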
Pattern 2: Comparing variable access speeds
import dis
x_global = 42
def access_global():
    return x_global  # LOAD_GLOBAL
def access_local():
    x_local = 42
    return x_local  # LOAD_FAST
def access_closure():
    x = 42
    def inner():
        return x  # LOAD_DEREF (closure variable)
    return inner
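The bytecode shows three different instructions for three access patterns. LOAD_FAST (locals) indexes into a C array in the frame. LOAD_GLOBAL looks the name up in the module's globals dictionary and then in builtins; in 3.11+ the specializing interpreter caches the result, guarded by a version check on the dictionary's keys. LOAD_DEREF reads a cell object through a pointer.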
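To confirm which instruction each pattern compiles to on your interpreter, a sketch like the following prints the load instructions per function. The helper load_opnames is hypothetical. The local case uses a parameter rather than an assignment, on the assumption that newer releases may fuse a store and the load right after it into one combined instruction, which would hide the plain LOAD_FAST.

```python
import dis

x_global = 42

def access_global():
    return x_global

def access_local(x_local):
    return x_local

def make_closure():
    x = 42
    def inner():
        return x
    return inner

def load_opnames(func):
    """Return the opnames of all instructions whose name contains LOAD_."""
    return [i.opname for i in dis.get_instructions(func) if "LOAD_" in i.opname]

for func in (access_global, access_local, make_closure()):
    print(f"{func.__name__:15s} {load_opnames(func)}")
```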
Pattern 3: Understanding comprehension overhead
import dis
# Before 3.12, a comprehension compiles to a hidden nested function
def with_comp():
    return [x for x in range(10)]
dis.dis(with_comp)
# 3.11 and earlier: shows MAKE_FUNCTION plus a call for the comprehension's inner function
# 3.12+: the comprehension is inlined into with_comp (PEP 709), so there is no inner function
# Either way, the loop body uses LIST_APPEND, which avoids the
# attribute lookup and call overhead of list.append()
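Whether a hidden function exists can be checked without reading the disassembly: a nested function shows up as a code object among the outer function's constants. The helper nested_code_names below is a hypothetical name for this sketch.

```python
import sys
import types

def nested_code_names(func):
    """Return the names of code objects stored in func's constants."""
    return [
        const.co_name
        for const in func.__code__.co_consts
        if isinstance(const, types.CodeType)
    ]

def with_comp():
    return [x for x in range(10)]

# Expect ['<listcomp>'] before 3.12 and an empty list once comprehensions are inlined
print(sys.version_info[:2], nested_code_names(with_comp))
```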
Analyzing exception handling bytecode
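Python 3.11 changed exception handling from a block-based model to a table-based model. Exception tables replace the old SETUP_FINALLY / POP_BLOCK instructions (SETUP_EXCEPT had already been removed in 3.8), so entering a try block that never raises costs nothing at runtime: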
import dis
def with_exception():
    try:
        risky_operation()
    except ValueError as e:
        handle(e)
    finally:
        cleanup()
dis.dis(with_exception)
# In 3.11+, the listing shows PUSH_EXC_INFO and ends with an
# "ExceptionTable:" section listing the protected ranges and their handlers
# dis.show_code() prints code object metadata (names, constants, flags):
dis.show_code(with_exception)
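The table itself is stored on the code object as raw bytes in co_exceptiontable. As a minimal sketch, the hypothetical helper below only tests whether a function has one; it does not decode the entries, and on interpreters before 3.11 it reports False because the attribute does not exist.

```python
import sys

def has_exception_table(func):
    """True if the code object carries a non-empty exception table (3.11+)."""
    # co_exceptiontable is absent before 3.11, so fall back to empty bytes
    return bool(getattr(func.__code__, "co_exceptiontable", b""))

def guarded():
    try:
        return int("not a number")
    except ValueError:
        return None

def plain():
    return 1

print(sys.version_info[:2], has_exception_table(guarded), has_exception_table(plain))
```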
Instruction frequency analysis
For performance-sensitive code, count which instructions dominate:
import dis
from collections import Counter
def instruction_profile(func):
    counter = Counter()
    for instr in dis.Bytecode(func):
        if instr.opname != "CACHE":  # Skip cache entries (3.11+)
            counter[instr.opname] += 1
    print(f"\nInstruction profile for {func.__name__}:")
    for opname, count in counter.most_common(10):
        bar = "█" * count
        print(f" {opname:30s} {count:3d} {bar}")
    return counter
Cross-version bytecode comparison
A practical pattern for understanding how Python evolves:
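import dis
import sys
def swap(a, b):
    a, b = b, a
    return a, b
print(f"Python {sys.version_info[:2]}")
dis.dis(swap)
# Python 3.10 and earlier: ROT_TWO instruction
# Python 3.11+: ROT_TWO was replaced by the more general SWAP instruction,
# and the compiler can often drop the swap entirely by reordering the stores
# Same result, different implementation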
The co_linetable (Python 3.10+)
Python 3.10 replaced co_lnotab with co_linetable, a new encoding of the line number table, and added the co_lines() method for reading it (PEP 626). Python 3.11 added column information via co_positions() (PEP 657):
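import dis
import sys
def multiline():
    result = (
        some_func(a, b)
        + other_func(c)
    )
if sys.version_info >= (3, 11):
    code = multiline.__code__
    # co_positions() yields one (start_line, end_line, col, end_col) tuple
    # per 2-byte code unit, so the byte offset is the index times 2
    for index, (start_line, end_line, col, end_col) in enumerate(code.co_positions()):
        print(f" offset {index * 2}: "
              f"lines {start_line}-{end_line}, "
              f"cols {col}-{end_col}")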
This precise position information powers the improved error messages in Python 3.11+ where the interpreter underlines the exact expression that caused an error.
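For line-level information alone, co_lines() is enough and works from 3.10 onward. The sketch below collects the source lines that have bytecode attached; lines_covered and add_then_double are hypothetical names, and the lines are printed relative to the def line so the output does not depend on where the code sits in a file.

```python
def lines_covered(func):
    """Return the sorted source line numbers that have bytecode, via co_lines()."""
    code = func.__code__
    if not hasattr(code, "co_lines"):
        return []  # interpreter older than 3.10
    # Each entry is (start_offset, end_offset, line); line can be None
    return sorted({line for _start, _end, line in code.co_lines() if line is not None})

def add_then_double(a, b):
    total = a + b
    return total * 2

first = add_then_double.__code__.co_firstlineno
print([line - first for line in lines_covered(add_then_double)])
```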
Practical debugging with dis
When behavior is confusing, bytecode tells the truth:
import dis
# Why do these two lines behave differently?
def mysterious():
    x = [1, 2, 3]
    x += [4]  # Calls __iadd__ (mutates in place for lists)
    x = x + [5]  # Calls __add__ (creates new list)
dis.dis(mysterious)
# 3.11+: both lines use BINARY_OP, shown with "+=" for the in-place form and "+" for the plain one
# 3.10 and earlier: INPLACE_ADD for += and BINARY_ADD for +
# The in-place form tries __iadd__ first and falls back to __add__
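The practical consequence of the two bytecode paths is easiest to see with a second reference to the same list. In this sketch, opnames_for is a hypothetical helper that compiles a single statement and keeps only the addition instructions; the instruction names it prints vary by version, while the aliasing behavior does not.

```python
import dis

def opnames_for(source):
    """Compile one statement and return its addition instructions with their argrepr."""
    code = compile(source, "<demo>", "exec")
    return [
        (instr.opname, instr.argrepr)
        for instr in dis.get_instructions(code)
        if "ADD" in instr.opname or instr.opname == "BINARY_OP"
    ]

print(opnames_for("x += [4]"))
print(opnames_for("x = x + [5]"))

original = [1, 2, 3]
alias = original
original += [4]  # in place: alias sees the change
print(alias is original, alias)  # True [1, 2, 3, 4]

original = original + [5]  # new list: alias is left behind
print(alias is original, alias)  # False [1, 2, 3, 4]
```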
The one thing to remember: The dis module combined with dis.Bytecode for programmatic access reveals exactly what CPython does with your code — from stack operations and jump targets to the specializing optimizations that make Python 3.11+ significantly faster than earlier versions.