Bytecode Manipulation — Core Concepts
What Is Python Bytecode?
Python bytecode is the intermediate representation that CPython’s compiler produces from your source code. It is a sequence of instructions for CPython’s stack-based virtual machine. Each instruction consists of an opcode (operation code) and optionally an argument.
The compilation pipeline: source code → AST → bytecode → execution by the interpreter loop.
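The pipeline can be walked through by hand with the standard library. The following is a minimal sketch; the source string and the `<example>` filename are illustrative choices, not part of any real program.

```python
import ast

source = "answer = 6 * 7"

# Stage 1: source text -> abstract syntax tree
tree = ast.parse(source, filename="<example>", mode="exec")

# Stage 2: AST -> code object containing bytecode
code = compile(tree, filename="<example>", mode="exec")

# Stage 3: the interpreter loop executes the bytecode
namespace = {}
exec(code, namespace)
print(namespace["answer"])  # 42
```

Passing the source string directly to `compile()` performs the first two stages in one call; splitting them here only makes each stage visible.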
Inspecting Bytecode with dis
The dis module disassembles bytecode into human-readable form:
import dis
def greet(name):
    return f"Hello, {name}!"
dis.dis(greet)
Output on CPython 3.10 (newer releases emit different instructions for the same function):
  2           0 LOAD_CONST               1 ('Hello, ')
              2 LOAD_FAST                0 (name)
              4 FORMAT_VALUE             0
              6 LOAD_CONST               2 ('!')
              8 BUILD_STRING             3
             10 RETURN_VALUE
Each row shows the source line number (printed only on the first instruction of that line), the byte offset, the opcode name, the raw argument, and, in parentheses, the value the argument resolves to when there is one.
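The same fields are available programmatically through `dis.get_instructions()`, which is handy when the listing needs to be filtered or compared rather than read. A small sketch, reusing the `greet` function from above; the exact opcodes printed depend on the Python version:

```python
import dis

def greet(name):
    return f"Hello, {name}!"

# Each Instruction object carries the fields that dis.dis() prints.
for instr in dis.get_instructions(greet):
    print(instr.offset, instr.opname, instr.arg, instr.argrepr)

# Collect just the opcode names for later inspection.
opnames = [instr.opname for instr in dis.get_instructions(greet)]
```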
Code Objects
Every function defined in Python source has a code object accessible via func.__code__ (built-in functions implemented in C do not). Code objects are immutable and contain everything needed to execute the function:
code = greet.__code__
code.co_code # raw bytecode bytes
code.co_consts # constant values used by the function
code.co_varnames # local variable names
code.co_names # names used (globals, attributes)
code.co_stacksize # maximum stack depth needed
code.co_filename # source file
code.co_firstlineno # first line number
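To see the immutability in practice, the sketch below reads a couple of attributes and then tries to assign to one. The `mutated` flag is only an illustrative name used to record the outcome.

```python
def greet(name):
    return f"Hello, {name}!"

code = greet.__code__

print(code.co_varnames)  # ('name',)
print(code.co_argcount)  # 1

# Code object attributes are read-only; assignment raises AttributeError.
try:
    code.co_consts = ()
    mutated = True
except AttributeError:
    mutated = False

print(mutated)  # False
```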
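The co_code attribute contains the raw bytes. Since Python 3.6, bytecode is laid out in 2-byte units (one opcode byte plus one argument byte), a format called wordcode; earlier versions used variable-length instructions. Arguments larger than 255 are built up with EXTENDED_ARG prefix instructions, and from Python 3.11 some instructions are followed by inline cache entries, so a single logical instruction can occupy more than 2 bytes.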
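The 2-byte layout can be checked by walking co_code directly. This is a rough sketch that assumes Python 3.6 or later; it does not combine EXTENDED_ARG prefixes, and on 3.11+ the inline cache slots show up as CACHE entries. The `decoded` list is an illustrative name.

```python
import dis

def greet(name):
    return f"Hello, {name}!"

raw = greet.__code__.co_code

# Walk the byte string two bytes at a time: opcode byte, then argument byte.
decoded = []
for offset in range(0, len(raw), 2):
    opcode, arg = raw[offset], raw[offset + 1]
    decoded.append((offset, dis.opname[opcode], arg))

for offset, name, arg in decoded:
    print(offset, name, arg)
```

For real tooling, `dis.get_instructions()` is the safer choice because it handles extended arguments and cache entries for you.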
The Stack Machine Model
CPython’s bytecode runs on a stack machine. There are no registers — all values are pushed onto and popped from a stack:
# Source: x + y * 2
# Bytecode equivalent (opcode names from Python 3.10 and earlier; 3.11+ uses BINARY_OP for both operations):
LOAD_FAST x # stack: [x]
LOAD_FAST y # stack: [x, y]
LOAD_CONST 2 # stack: [x, y, 2]
BINARY_MULTIPLY # stack: [x, y*2]
BINARY_ADD # stack: [x + y*2]
Each operation pops its inputs from the stack and pushes its result back. LOAD_* pushes values, arithmetic ops consume the top items and push the result.
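The push and pop behaviour can be imitated with a toy evaluator. This is not how CPython is implemented; it is a simplified sketch in which `run`, `program`, and the tuple-based instruction format are all made up for illustration.

```python
def run(program, local_vars):
    """Evaluate a list of (opname, argument) pairs on a toy stack machine."""
    stack = []
    for opname, arg in program:
        if opname == "LOAD_FAST":
            stack.append(local_vars[arg])   # push a local variable
        elif opname == "LOAD_CONST":
            stack.append(arg)               # push a constant
        elif opname == "BINARY_MULTIPLY":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)      # pop two, push the product
        elif opname == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)      # pop two, push the sum
        else:
            raise ValueError(f"unsupported opcode: {opname}")
    return stack.pop()

# x + y * 2, in the same order as the listing above
program = [
    ("LOAD_FAST", "x"),
    ("LOAD_FAST", "y"),
    ("LOAD_CONST", 2),
    ("BINARY_MULTIPLY", None),
    ("BINARY_ADD", None),
]

result = run(program, {"x": 3, "y": 4})
print(result)  # 11
```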
Common Opcodes
| Opcode | What It Does |
|---|---|
| LOAD_FAST | Push a local variable onto the stack |
| STORE_FAST | Pop the top of the stack into a local variable |
| LOAD_CONST | Push a constant value |
| LOAD_GLOBAL | Push a global variable |
| CALL_FUNCTION | Call a function (Python 3.10 and earlier) |
| CALL | Call a callable (Python 3.11+) |
| RETURN_VALUE | Return the top of the stack to the caller |
| POP_JUMP_IF_FALSE | Pop the top of the stack and jump if it is false |
| BINARY_OP | Arithmetic/bitwise operation (Python 3.11+) |
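Note: opcodes change between Python versions. Python 3.11 reorganized many of them (for example, replacing the individual BINARY_* arithmetic opcodes with a single BINARY_OP), and later releases have continued to add and remove opcodes.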
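Because the opcode set varies, code that inspects bytecode usually checks what the running interpreter provides instead of assuming a name exists. A minimal sketch using `dis.opmap`; the `available` helper is a hypothetical name:

```python
import dis
import sys

def available(*names):
    """Return the subset of opcode names defined by the running interpreter."""
    return [name for name in names if name in dis.opmap]

print(sys.version_info[:2])
print(available("BINARY_ADD", "BINARY_OP"))
print(available("CALL_FUNCTION", "CALL"))
```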
Modifying Bytecode
Since code objects are immutable, you modify bytecode by creating a new code object with altered attributes using code.replace() (Python 3.8+):
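def add_offset(x):
    return x + 1000

# Get the original code object
original = add_offset.__code__
# Build a new constants tuple, doubling every int constant
# (type(c) is int skips bool, None, and other constant types)
new_consts = tuple(
    c * 2 if type(c) is int else c
    for c in original.co_consts
)
# Swap in a new code object that uses the modified constants
add_offset.__code__ = original.replace(co_consts=new_consts)
print(add_offset(1))  # 2001, because the constant 1000 became 2000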
For bytecode-level changes, you would modify co_code bytes directly — but this requires understanding the exact byte layout and keeping related attributes (constants, names, stack size) consistent.
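When the original function should stay untouched, one option is to build a second function around the modified code object with `types.FunctionType`. The sketch below is one way to do that under simple assumptions: `with_constant_replaced` and `label` are hypothetical names, and keyword-only defaults are not copied.

```python
import types

def label():
    return "draft version"

def with_constant_replaced(func, old, new):
    """Return a copy of func whose code object has `old` swapped for `new` in co_consts."""
    code = func.__code__
    new_consts = tuple(
        new if type(c) is type(old) and c == old else c
        for c in code.co_consts
    )
    new_code = code.replace(co_consts=new_consts)
    # Build a fresh function object; the original keeps its own code object.
    return types.FunctionType(
        new_code,
        func.__globals__,
        func.__name__,
        func.__defaults__,
        func.__closure__,
    )

patched = with_constant_replaced(label, "draft version", "final version")
print(label())    # draft version
print(patched())  # final version
```

Only co_consts changes here, so the instruction bytes, names, and stack size remain consistent with each other.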
Practical Uses
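Profiling and coverage: Tools like coverage.py track which lines execute, typically through the interpreter's tracing hooks (sys.settrace, or sys.monitoring in Python 3.12+), and inspect code objects to determine which lines are executable.
Debugging: sys.settrace installs a trace function that the interpreter calls on events such as function calls, new source lines, and returns; per-opcode events are available when a frame sets f_trace_opcodes to True.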
Optimization: Libraries like numba analyze bytecode to understand what a function does before JIT-compiling it.
Testing: Some mocking libraries modify bytecode to redirect function calls.
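As a rough illustration of the tracing approach used for coverage and debugging, the sketch below records which lines of one function run. It is a toy, not how any particular tool is implemented; `trace_lines` and `classify` are made-up names, and line numbers are reported relative to the `def` line.

```python
import sys

def trace_lines(func, *args):
    """Run func(*args) and return (result, executed line offsets within func)."""
    code = func.__code__
    executed = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # ignore frames that belong to other functions
        if event == "line":
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer    # keep receiving events for this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier trace function
    return result, sorted(executed)

def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"

result, lines = trace_lines(classify, 5)
print(result, lines)  # non-negative [1, 3]
```

Offset 2 (the `return "negative"` line) is missing from the result, which is exactly the kind of gap a coverage report highlights.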
Common Misconception
Developers often think bytecode is like machine code (compiled C or assembly). Python bytecode is much higher-level — it still references variable names, function objects, and Python-level operations. It is an intermediate format designed for CPython’s interpreter, not for a CPU. Different Python implementations (PyPy, Jython) may not use this bytecode format at all.
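The high-level nature of bytecode is easy to confirm: names and constants survive compilation and sit in the code object as ordinary Python values. A short sketch, with `shout` as an illustrative function:

```python
def shout(text):
    return text.upper() + "!"

code = shout.__code__
print(code.co_varnames)  # local names, here ('text',)
print(code.co_names)     # attribute and global names, includes 'upper'
print(code.co_consts)    # constants, includes '!'
```

Machine code would have reduced these to addresses and registers; here the attribute name `upper` is still looked up by name at run time.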
One thing to remember: Python bytecode is a sequence of stack-machine instructions stored in code objects — you can inspect it with dis and modify it by creating new code objects with code.replace(), but the bytecode format changes between Python versions.
See Also
- Python Abstract Syntax Trees: How Python reads your code like a recipe before cooking it, and the hidden tree structure behind every program.
- Python Code Objects Internals: What Python actually creates when it reads your function, the hidden blueprint that tells the computer what to do.
- Python Compiler Pipeline: The journey your Python code takes from text file to running program, explained like an assembly line.
- Python Frame Objects: Why Python keeps a notepad for every running function, and how it remembers where it left off.
- Python Peephole Optimizer: How Python quietly tidies up your code behind the scenes, making it faster without you lifting a finger.