Bytecode Manipulation — Core Concepts

What Is Python Bytecode?

Python bytecode is the intermediate representation that CPython’s compiler produces from your source code. It is a sequence of instructions for CPython’s stack-based virtual machine. Each instruction consists of an opcode (operation code) and optionally an argument.

The compilation pipeline: source code → AST → bytecode → execution by the interpreter loop.
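The stages of this pipeline can be driven by hand from the standard library. The following is a minimal sketch; the source string and the names `tree`, `code`, and `namespace` are chosen purely for illustration.

```python
import ast

source = "total = 6 * 7"

# Stage 1: source text -> abstract syntax tree
tree = ast.parse(source)

# Stage 2: AST -> code object that holds the bytecode
code = compile(tree, "<example>", "exec")

# Stage 3: the interpreter loop executes the bytecode
namespace = {}
exec(code, namespace)
print(namespace["total"])
```

In everyday use CPython runs all three stages implicitly when it imports a module or calls exec() on a string.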

Inspecting Bytecode with dis

The dis module disassembles bytecode into human-readable form:

import dis

def greet(name):
    return f"Hello, {name}!"

dis.dis(greet)

Output (CPython 3.10; newer versions emit different instructions, such as an initial RESUME in 3.11+):

  2           0 LOAD_CONST               1 ('Hello, ')
              2 LOAD_FAST                0 (name)
              4 FORMAT_VALUE             0
              6 LOAD_CONST               2 ('!')
              8 BUILD_STRING             3
             10 RETURN_VALUE

Each line shows the source line number (printed only on the first instruction for that line), the byte offset, the opcode name, the raw argument, and (in parentheses) the resolved argument value.
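The same fields are available programmatically through dis.get_instructions(), which yields one Instruction object per bytecode instruction. A small sketch that reuses the greet function from above (the exact opcode names printed depend on the CPython version):

```python
import dis

def greet(name):
    return f"Hello, {name}!"

# Each Instruction carries the fields that dis.dis() prints as columns.
instructions = list(dis.get_instructions(greet))
for ins in instructions:
    print(ins.offset, ins.opname, ins.arg, repr(ins.argval))
```

Working with Instruction objects is usually more convenient than parsing the text output of dis.dis().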

Code Objects

Every function defined in Python source (as opposed to a built-in implemented in C) has a code object, accessible via func.__code__. Code objects are immutable and hold the compiled bytecode together with the metadata needed to run it:

code = greet.__code__

code.co_code        # raw bytecode bytes
code.co_consts      # constant values used by the function
code.co_varnames    # local variable names
code.co_names       # names used (globals, attributes)
code.co_stacksize   # maximum stack depth needed
code.co_filename    # source file
code.co_firstlineno # first line number
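To make these attributes concrete, the sketch below prints a few of them for the greet function and shows that assigning to one fails. The flag name `mutated` is only for illustration.

```python
def greet(name):
    return f"Hello, {name}!"

code = greet.__code__
print(code.co_name, code.co_varnames, code.co_consts)

# Code objects are read-only: assigning to an attribute raises AttributeError.
try:
    code.co_consts = ()
    mutated = True
except AttributeError:
    mutated = False
print("mutated:", mutated)
```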

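The co_code attribute contains the raw bytes. Since Python 3.6, every instruction unit is exactly 2 bytes (a one-byte opcode and a one-byte argument), a format called wordcode. Arguments larger than 255 are built up with EXTENDED_ARG prefix instructions, and from Python 3.11 some instructions are followed by inline CACHE entries that occupy additional 2-byte units. Versions before 3.6 used variable-length instructions.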
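As a rough illustration of the wordcode layout, the following sketch walks co_code two bytes at a time and looks up each opcode name in dis.opname. It assumes Python 3.6 or later; on 3.11+ some units will print as CACHE, and the variable names are illustrative only.

```python
import dis

def greet(name):
    return f"Hello, {name}!"

raw = greet.__code__.co_code

# Split the byte string into (opcode, argument) pairs, one per 2-byte unit.
units = [(raw[i], raw[i + 1]) for i in range(0, len(raw), 2)]
for index, (opcode, arg) in enumerate(units):
    print(index * 2, dis.opname[opcode], arg)
```

This is essentially a stripped-down version of what dis does, without resolving arguments to constants or names.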

The Stack Machine Model

CPython’s bytecode runs on a stack machine. There are no registers — all values are pushed onto and popped from a stack:

# Source: x + y * 2
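# Bytecode equivalent (opcode names from CPython 3.10 and earlier;
# 3.11+ uses a single BINARY_OP with an argument for both operations):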
LOAD_FAST    x       # stack: [x]
LOAD_FAST    y       # stack: [x, y]
LOAD_CONST   2       # stack: [x, y, 2]
BINARY_MULTIPLY      # stack: [x, y*2]
BINARY_ADD           # stack: [x + y*2]

Each operation pops its inputs from the stack and pushes its result back. LOAD_* instructions push values; arithmetic instructions consume the top items and push the result.
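The push and pop behavior can be mimicked with a toy interpreter. The sketch below is not how CPython is implemented; `run_stack_program` and the tuple-based instruction format are hypothetical, and only the four instructions from the listing above are supported.

```python
def run_stack_program(program, local_vars):
    """Execute a tiny subset of stack instructions and return the result."""
    stack = []
    for opname, arg in program:
        if opname == "LOAD_FAST":
            stack.append(local_vars[arg])   # push a local variable
        elif opname == "LOAD_CONST":
            stack.append(arg)               # push a constant
        elif opname in ("BINARY_MULTIPLY", "BINARY_ADD"):
            right = stack.pop()             # top of stack is the right operand
            left = stack.pop()
            if opname == "BINARY_MULTIPLY":
                stack.append(left * right)
            else:
                stack.append(left + right)
        else:
            raise ValueError(f"unsupported instruction: {opname}")
    return stack.pop()

# x + y * 2, encoded as in the listing above
program = [
    ("LOAD_FAST", "x"),
    ("LOAD_FAST", "y"),
    ("LOAD_CONST", 2),
    ("BINARY_MULTIPLY", None),
    ("BINARY_ADD", None),
]
result = run_stack_program(program, {"x": 3, "y": 4})
print(result)  # 11, i.e. 3 + 4 * 2
```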

Common Opcodes

Opcode              What It Does
LOAD_FAST           Push a local variable onto the stack
STORE_FAST          Pop the stack into a local variable
LOAD_CONST          Push a constant value
LOAD_GLOBAL         Push a global variable
CALL_FUNCTION       Call a function (Python 3.10 and earlier)
CALL                Call a callable (Python 3.11+)
RETURN_VALUE        Return the top of stack
POP_JUMP_IF_FALSE   Pop the top of stack and jump if it is false
BINARY_OP           Arithmetic/bitwise operation (Python 3.11+)

Note: opcodes change between Python versions. Python 3.11 replaced many of them (for example, the separate BINARY_ADD-style opcodes became BINARY_OP, and CALL_FUNCTION became CALL), and later releases have continued to add, remove, and renumber opcodes.
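One way to see which opcodes the running interpreter defines is to look them up in dis.opmap, which maps opcode names to numbers. A short sketch; the list of names checked is just a sample:

```python
import dis
import sys

names_to_check = ["CALL_FUNCTION", "CALL", "BINARY_ADD", "BINARY_OP"]

# dis.opmap only contains the opcodes defined by this interpreter version.
available = {name: name in dis.opmap for name in names_to_check}

print("Python", ".".join(map(str, sys.version_info[:2])))
for name, present in available.items():
    print(f"{name:15} {'yes' if present else 'no'}")
```

Code that manipulates bytecode typically needs checks like this, or a separate code path per supported version.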

Modifying Bytecode

Since code objects are immutable, you modify bytecode by creating a new code object with altered attributes using code.replace() (Python 3.8+):

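def add_offset(x):
    return x + 1000

# Get the original code object
original = add_offset.__code__

# Build a new constants tuple with every int constant doubled
# (bool is a subclass of int, so it is excluded explicitly)
new_consts = tuple(
    c * 2 if isinstance(c, int) and not isinstance(c, bool) else c
    for c in original.co_consts
)

# replace() returns a modified copy; assign it back to the function
add_offset.__code__ = original.replace(co_consts=new_consts)

print(add_offset(1))  # 2001 instead of 1001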

For bytecode-level changes, you would modify co_code bytes directly — but this requires understanding the exact byte layout and keeping related attributes (constants, names, stack size) consistent.
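Because the byte layout differs between versions, the sketch below avoids hard-coding any opcode numbers. It learns which bytes to change by comparing against a reference function, then patches a copy of co_code. The functions `add` and `subtract` are illustrative, and the approach assumes both compile to bytecode of the same length, which holds for such near-identical functions on the CPython versions this was written against.

```python
def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

original = add.__code__.co_code
reference = subtract.__code__.co_code

# Assumption: the two byte strings have equal length and differ only where
# the arithmetic operation is encoded (an opcode byte or an argument byte,
# depending on the CPython version).
assert len(original) == len(reference)
differing = [i for i in range(len(original)) if original[i] != reference[i]]

patched = bytearray(original)
for i in differing:
    patched[i] = reference[i]

# Constants, names, and stack size happen to match between the two
# functions here, so only co_code needs replacing.
add.__code__ = add.__code__.replace(co_code=bytes(patched))
print(add(5, 3))  # now subtracts
```

In a real transformation the inserted instructions usually change the code length or stack usage, which is where keeping the other attributes consistent becomes the hard part.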

Practical Uses

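Profiling and coverage: Tools like coverage.py rely on CPython's tracing hooks (sys.settrace, or sys.monitoring on Python 3.12+) to record which lines execute, and analyze code objects to work out which lines are executable in the first place.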

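Debugging: The sys.settrace hook is invoked by the interpreter loop on call, line, return, and exception events; setting frame.f_trace_opcodes = True (Python 3.7+) additionally requests an event for each bytecode instruction.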

Optimization: Libraries like numba analyze bytecode to understand what a function does before JIT-compiling it.

Testing: Some mocking libraries modify bytecode to redirect function calls.
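The tracing hook mentioned under Debugging can be tried in a few lines. This is a minimal sketch: `tracer`, `square_plus_one`, and `events` are illustrative names, and the filter on co_name is there only to ignore unrelated frames.

```python
import sys

events = []

def tracer(frame, event, arg):
    # Record events for the function of interest only.
    if frame.f_code.co_name == "square_plus_one":
        events.append((event, frame.f_lineno))
    return tracer  # keep tracing inside this frame

def square_plus_one(n):
    squared = n * n
    return squared + 1

sys.settrace(tracer)
try:
    result = square_plus_one(3)
finally:
    sys.settrace(None)  # always remove the hook again

for event, lineno in events:
    print(event, lineno)
```

Debuggers and line-based coverage tools are built on the same mechanism, typically with a tracer written in C to reduce overhead.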

Common Misconception

Developers often think bytecode is like machine code (compiled C or assembly). Python bytecode is much higher-level — it still references variable names, function objects, and Python-level operations. It is an intermediate format designed for CPython’s interpreter, not for a CPU. Different Python implementations (PyPy, Jython) may not use this bytecode format at all.

One thing to remember: Python bytecode is a sequence of stack-machine instructions stored in code objects — you can inspect it with dis and modify it by creating new code objects with code.replace(), but the bytecode format changes between Python versions.
