Bytecode Manipulation — Deep Dive

Advanced Python bytecode engineering — from instruction-level patching and the bytecode library to building custom optimizers and instrumentation tools.

Bytecode Format Internals

Wordcode (Python 3.6+)

Since Python 3.6, all instructions are exactly 2 bytes: one opcode byte and one argument byte. Instructions needing arguments larger than 255 use EXTENDED_ARG prefixes:

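import opcode

def show_raw_bytecode(func):
    """Print each 2-byte code unit of func: offset, opcode name, raw argument byte."""
    code = func.__code__.co_code
    # code[i] is the opcode and code[i + 1] its argument byte.
    # EXTENDED_ARG prefixes (and CACHE entries on 3.11+) print as separate units.
    for i in range(0, len(code), 2):
        op, arg = code[i], code[i + 1]
        print(f"  {i:4d}: {opcode.opname[op]:<25s} {arg}")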

# EXTENDED_ARG example: argument = (ext_arg << 8) | arg
# For constant index 300:
#   EXTENDED_ARG 1        (accumulated arg = 256)
#   LOAD_CONST  44        (final arg = 256 + 44 = 300)
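
The accumulation rule in the comment above can be checked on a hand-built byte string. This is a minimal sketch: decode_args is a hypothetical helper, not part of any library, and it ignores the inline cache entries that newer interpreters place after some instructions.

```python
import opcode

def decode_args(raw):
    """Yield (offset, opname, full_arg), folding EXTENDED_ARG prefixes into the argument."""
    ext = 0
    for offset in range(0, len(raw), 2):
        op, arg = raw[offset], raw[offset + 1]
        full = (ext << 8) | arg
        if op == opcode.EXTENDED_ARG:
            ext = full          # carry the accumulated high bits to the next unit
            continue
        yield offset, opcode.opname[op], full
        ext = 0                 # a prefix chain applies to one instruction only

# Hand-built wordcode: EXTENDED_ARG 1, LOAD_CONST 44
raw = bytes([opcode.EXTENDED_ARG, 1, opcode.opmap["LOAD_CONST"], 44])
decoded = list(decode_args(raw))
print(decoded)  # [(2, 'LOAD_CONST', 300)]
```

The reported offset is that of the real instruction, not of its prefix, which matches how dis attributes the combined argument.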

Python 3.11+ Changes

Python 3.11 introduced significant bytecode changes, which 3.12 and 3.13 refined further:

Adaptive specialization: The interpreter replaces generic opcodes with specialized versions at runtime (e.g., BINARY_OP → BINARY_OP_ADD_INT for integer addition)
Inline caches: Extra 2-byte CACHE entries after some instructions store runtime optimization data
CALL replaces CALL_FUNCTION/CALL_METHOD: Simplified calling convention (3.11 pairs CALL with PRECALL, which 3.12 removed)
RESUME: New instruction at function entry points
Exception table: Replaces block-based exception handling with a separate table (co_exceptiontable)

# Python 3.11+ instruction layout for ops with inline caches:
# [opcode][arg][cache_0][cache_1]...
# Cache entries are present in co_code but hidden by dis by default
# (pass show_caches=True to dis.dis to display them)
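
One way to observe the inline caches without depending on any single dis option is to compare the number of 2-byte units in co_code with the number of instructions dis lists. The sketch below uses only the standard library; hidden_cache_units and add are hypothetical names, and the exact count varies by interpreter version.

```python
import dis
import sys

def hidden_cache_units(func):
    """Count 2-byte units in co_code that dis does not list as instructions."""
    code = func.__code__
    total_units = len(code.co_code) // 2
    listed = sum(1 for _ in dis.get_instructions(code))
    return total_units - listed

def add(a, b):
    return a + b

hidden = hidden_cache_units(add)
print(sys.version_info[:2], hidden)
```

On interpreters without inline caches (3.10 and earlier) the difference is zero; on 3.11 and later it is positive, because the binary operation carries at least one cache entry.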

Working with the `bytecode` Library

The third-party bytecode library provides a high-level API for bytecode manipulation, abstracting away raw bytes:

from bytecode import Bytecode, Instr, ConcreteBytecode

def original(x):
    return x + 1

# Decompile to abstract bytecode
bc = Bytecode.from_code(original.__code__)
print(bc)  # list of Instr objects

# Modify: change "x + 1" to "x + 10"
for i, instr in enumerate(bc):
    if isinstance(instr, Instr) and instr.name == 'LOAD_CONST' and instr.arg == 1:
        bc[i] = Instr('LOAD_CONST', 10)

# Recompile to a code object
new_code = bc.to_code()
original.__code__ = new_code
print(original(5))  # 15 instead of 6

The library handles EXTENDED_ARG, jump target resolution, and stack depth calculation automatically.
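
When the change is only a constant swap, the standard library may be enough: CodeType.replace() (Python 3.8+) copies a code object with selected fields changed and leaves the instruction bytes alone. This is a sketch with a hypothetical greet function. It assumes the constant lives in co_consts, which is why a string is used rather than a small integer.

```python
def greet(name):
    return "hello, " + name

code = greet.__code__
# Swap one constant; every other field of the code object is copied unchanged.
patched_consts = tuple(
    "goodbye, " if const == "hello, " else const for const in code.co_consts
)
greet.__code__ = code.replace(co_consts=patched_consts)

message = greet("Ada")
print(message)  # goodbye, Ada
```

Because no instruction is added or removed, jump targets and stack depth are untouched, which is exactly the bookkeeping the bytecode library otherwise does for you.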

Instruction-Level Patching

Injecting Function Calls

A common instrumentation pattern injects a call at the beginning of every function:

from bytecode import Bytecode, Instr

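def inject_entry_hook(func, hook):
    """Add a call to hook() at the start of func.

    The hook is looked up by name at call time, so it must be reachable
    from func's globals (or builtins).
    """
    bc = Bytecode.from_code(func.__code__)

    # Build injection sequence (Python 3.12+; 3.11 also needs PRECALL before CALL).
    # Recent releases of the bytecode library expect LOAD_GLOBAL's argument on
    # 3.11+ as a (push_null, name) tuple; True pushes the NULL that CALL requires.
    injection = [
        Instr('LOAD_GLOBAL', (True, hook.__name__)),
        Instr('CALL', 0),
        Instr('POP_TOP'),       # discard return value
    ]

    # Find the insertion point (after RESUME on 3.11+)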
    insert_idx = 0
    for i, instr in enumerate(bc):
        if isinstance(instr, Instr) and instr.name == 'RESUME':
            insert_idx = i + 1
            break

    for j, instr in enumerate(injection):
        bc.insert(insert_idx + j, instr)

    # Update the code object
    new_code = bc.to_code()
    func.__code__ = new_code

# Usage
def my_hook():
    print("Function entered!")

def my_function(x):
    return x * 2

inject_entry_hook(my_function, my_hook)
my_function(5)  # prints "Function entered!" then returns 10

Replacing Operations

import operator
from bytecode import Bytecode, Instr

# BINARY_OP argument values (the NB_* constants, Python 3.11+)
FOLDABLE_OPS = {0: operator.add, 1: operator.and_, 5: operator.mul,
                10: operator.sub, 11: operator.truediv}

def optimize_constant_math(func):
    """Fold constant arithmetic once, when the function is patched."""
    bc = Bytecode.from_code(func.__code__)

    i = 0
    while i < len(bc) - 2:
        # Pattern: LOAD_CONST a, LOAD_CONST b, BINARY_OP
        if (isinstance(bc[i], Instr) and bc[i].name == 'LOAD_CONST' and
            isinstance(bc[i+1], Instr) and bc[i+1].name == 'LOAD_CONST' and
            isinstance(bc[i+2], Instr) and bc[i+2].name == 'BINARY_OP'):

            a, b = bc[i].arg, bc[i+1].arg
            op = bc[i+2].arg
            if (isinstance(a, (int, float)) and isinstance(b, (int, float))
                    and op in FOLDABLE_OPS):
                try:
                    result = FOLDABLE_OPS[op](a, b)
                except (ArithmeticError, TypeError):
                    # Division by zero, or & applied to floats: leave the code as-is
                    pass
                else:
                    bc[i] = Instr('LOAD_CONST', result)
                    del bc[i+1:i+3]  # remove the other two instructions
                    continue  # re-check position i: the result may fold again
        i += 1

    func.__code__ = bc.to_code()
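
The numeric BINARY_OP arguments used in the table above can be checked against the running interpreter rather than taken on trust. This sketch uses only dis; binary_op_codes and sample are hypothetical names. On interpreters that predate BINARY_OP (3.10 and earlier) the result is simply empty.

```python
import dis
import sys

def binary_op_codes(func):
    """Map each BINARY_OP argument used in func to the operator symbol dis reports."""
    return {
        ins.arg: ins.argrepr
        for ins in dis.get_instructions(func)
        if ins.opname == "BINARY_OP"
    }

def sample(a, b):
    return (a + b, a * b, a - b)

codes = binary_op_codes(sample)
print(codes)
```

On Python 3.11 and later this reports 0 for +, 5 for * and 10 for -, matching the keys in the folding table.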

Building a Coverage Tool

The simplest way to record which lines execute is a trace hook installed with sys.settrace, which needs no bytecode rewriting at all:

import sys
from collections import defaultdict

class LineCoverage:
    def __init__(self):
        self.executed_lines = defaultdict(set)

    def trace(self, frame, event, arg):
        if event == 'line':
            filename = frame.f_code.co_filename
            lineno = frame.f_lineno
            self.executed_lines[filename].add(lineno)
        return self.trace

    def start(self):
        sys.settrace(self.trace)

    def stop(self):
        sys.settrace(None)

    def report(self):
        for filename, lines in self.executed_lines.items():
            print(f"{filename}: lines {sorted(lines)}")

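Production tools like coverage.py go further: they implement the trace function in C to cut overhead, can use the lower-overhead sys.monitoring API (PEP 669) on Python 3.12+, and work out which lines and branches are possible through static analysis, so the report shows what was missed as well as what ran. Line numbers come from the code object's line table (co_linetable on Python 3.10+).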
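
To see the trace-hook approach on a small case, the sketch below restricts recording to a single function and reports lines as offsets from its def line. record_lines and classify are hypothetical names. Restoring the previous trace function is an assumption about good practice, not something the class above does.

```python
import sys

def record_lines(func, *args):
    """Run func(*args) under a trace hook; return (result, executed line offsets)."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None              # do not trace unrelated frames
        if event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)       # restore whatever hook was active before
    return result, seen

def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"

result, offsets = record_lines(classify, 5)
print(result, sorted(offsets))
```

For a non-negative argument the recorded offsets are 1 and 3: the if line and the final return. Offset 2, the early return, never appears, which is the gap a coverage report would flag.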

Code Object Surgery

The `co_linetable` Format

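Python 3.10 replaced co_lnotab with a more compact line table, co_linetable (PEP 626). Python 3.11 changed the format again to store full source positions (start and end line and column) in variable-length entries (PEP 657). Because the raw encoding is version-specific, read it through code.co_lines(), which yields (start offset, end offset, line) triples:

def decode_linetable(code):
    """Print the bytecode range covered by each source line (Python 3.10+)."""
    # co_lines() decodes co_linetable for the running interpreter version
    for start, end, lineno in code.co_lines():
        if lineno is not None:  # None marks instructions with no source line
            print(f"  offsets {start}-{end}: line {lineno}")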
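
For tools that only need the first bytecode offset of each line, dis.findlinestarts gives a version-independent view of the same table. This is a minimal sketch; first_offsets_by_line and sample are hypothetical names.

```python
import dis

def first_offsets_by_line(code):
    """Map each source line number to the first bytecode offset attributed to it."""
    table = {}
    for offset, lineno in dis.findlinestarts(code):
        if lineno is not None:           # some entries carry no line number
            table.setdefault(lineno, offset)
    return table

def sample(a):
    b = a + 1
    return b * 2

table = first_offsets_by_line(sample.__code__)
base = sample.__code__.co_firstlineno
for lineno, offset in sorted(table.items()):
    print(f"line +{lineno - base}: offset {offset}")
```

Whether the def line itself shows up depends on the interpreter: versions with RESUME attribute that instruction to it, older ones start at the first body line.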

Creating Code Objects from Scratch

import sys
import types
from bytecode import Bytecode, Instr

def create_function(name, params, body_instrs):
    """Create a function from abstract bytecode instructions."""
    bc = Bytecode()
    bc.name = name
    bc.argnames = params
    bc.argcount = len(params)

    # Functions start with RESUME on Python 3.11+
    if sys.version_info >= (3, 11):
        bc.append(Instr('RESUME', 0))

    bc.extend(body_instrs)

    code = bc.to_code()
    return types.FunctionType(code, globals(), name)

# Create: def double(x): return x * 2
double = create_function('double', ['x'], [
    Instr('LOAD_FAST', 'x'),
    Instr('LOAD_CONST', 2),
    Instr('BINARY_OP', 5),   # 5 = multiply
    Instr('RETURN_VALUE'),
])

print(double(21))  # 42
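
When hand-assembling instructions is more than the task needs, a similar result can be had by letting the compiler emit the code object and wrapping it yourself. This sketch uses only compile() and types.FunctionType. function_from_source is a hypothetical helper, and it deliberately ignores default arguments and closures.

```python
import types

def function_from_source(source, name, namespace=None):
    """Compile source, pull out the code object named `name`, and wrap it in a function."""
    module_code = compile(source, "<generated>", "exec")
    for const in module_code.co_consts:
        # A def statement stores the function's code object as a module constant.
        if isinstance(const, types.CodeType) and const.co_name == name:
            globs = namespace if namespace is not None else {}
            return types.FunctionType(const, globs, name)
    raise LookupError(f"no function named {name!r} in source")

double = function_from_source("def double(x):\n    return x * 2\n", "double")
print(double(21))  # 42
```

The module code is never executed here. Only the nested code object is reused, so any side effects in the source text do not run.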

Frame Hacking and Execution Context

Code objects execute within frame objects. You can manipulate frames for advanced debugging:

import sys

def get_caller_locals():
    """Access the calling function's local variables."""
    frame = sys._getframe(1)
    return dict(frame.f_locals)

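def inject_variable(name, value):
    """Set a local variable in the caller's frame.

    Inside a function this only works for names that are already locals there:
    the compiler fixes the set of fast locals, so new names stay invisible.
    """
    import ctypes
    frame = sys._getframe(1)
    frame.f_locals[name] = value
    # Needed on Python <= 3.12 to copy f_locals back into the fast locals.
    # On 3.13+ (PEP 667) f_locals writes through and this call does nothing.
    ctypes.pythonapi.PyFrame_LocalsToFast(
        ctypes.py_object(frame), ctypes.c_int(0)
    )

Warning: Frame manipulation using ctypes is fragile and version-dependent. In Python 3.13 (PEP 667), frame.f_locals becomes a write-through proxy and PyFrame_LocalsToFast is deprecated and no longer does anything.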
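
Read-only inspection is the safer half of this technique. The usage sketch below repeats get_caller_locals with one assumed refinement (dropping the frame reference promptly) and shows that the returned dict is a copy. area is a hypothetical caller.

```python
import sys

def get_caller_locals():
    """Return a snapshot (a plain dict copy) of the caller's local variables."""
    frame = sys._getframe(1)
    try:
        return dict(frame.f_locals)
    finally:
        del frame  # drop the frame reference promptly to avoid reference cycles

def area():
    width, height = 3, 4
    snapshot = get_caller_locals()
    snapshot["width"] = 100      # edits the copy only
    return snapshot, width * height

snapshot, result = area()
print(snapshot, result)
```

The caller still computes 12 from its own width and height, since changing the snapshot does not write back into the frame.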

Performance Impact

Bytecode manipulation has performance implications:

Technique	Overhead	Use Case
`dis.dis()` inspection	Zero runtime cost	Development/debugging
`sys.settrace`	2-10x slowdown	Debuggers, coverage
Bytecode injection (static)	Cost of the injected instructions only	Instrumentation
Adaptive specialization bypass	Prevents optimization	Avoid in hot paths

Key insight: static bytecode modification (done once before execution) adds only the cost of the injected instructions. Dynamic tracing (settrace) costs far more because the interpreter calls the Python-level trace function on every call, line, and return event, and on every opcode as well if frame.f_trace_opcodes is set.
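
The tracing cost can be made concrete by counting hook invocations instead of timing them, which keeps the result deterministic. This is a sketch using only sys.settrace; count_trace_events and total are hypothetical names.

```python
import sys

def count_trace_events(func, *args):
    """Run func(*args) under sys.settrace and count the events raised for its frame."""
    code = func.__code__
    counts = {"call": 0, "line": 0, "return": 0}

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None              # ignore frames other than func's
        if event in counts:
            counts[event] += 1
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return counts

def total(n):
    result = 0
    for i in range(n):
        result += i
    return result

counts = count_trace_events(total, 100)
print(counts)
```

A loop of 100 iterations triggers at least 200 line events, each one a Python-level function call, whereas a statically injected hook at function entry would run once.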

Real-World Tools Using Bytecode Manipulation

coverage.py: line and branch coverage via trace hooks plus static analysis of the code
codetransformer: bytecode-to-bytecode transformation library
numba: reads bytecode to understand function operations before JIT compilation
cloudpickle: serializes functions by capturing their code objects
forbiddenfruit: patches built-in types by manipulating C-level structures
crosshair: symbolic execution engine that interprets bytecode symbolically

Version Compatibility Strategy

Bytecode changes across Python versions are frequent. Production tools handle this by:

Version-gated opcode tables: Map opcode numbers per Python version
Using dis and opcode modules: These track version changes automatically
Abstracting through bytecode library: Handles format differences internally
Testing across versions: CI matrices covering 3.8–3.13

import sys

if sys.version_info >= (3, 11):
    CALL_OP = 'CALL'          # 3.11 also emits PRECALL before CALL
    BINARY_OP = 'BINARY_OP'
else:
    CALL_OP = 'CALL_FUNCTION'
    BINARY_OP = None  # use specific ops like BINARY_ADD
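
An alternative to comparing version numbers is to ask the running interpreter which opcode names it defines. This is a sketch using opcode.opmap from the standard library; the constant names are hypothetical.

```python
import opcode
import sys

# Feature detection: check which opcode names the running interpreter defines.
CALL_OP = "CALL" if "CALL" in opcode.opmap else "CALL_FUNCTION"
HAS_GENERIC_BINARY_OP = "BINARY_OP" in opcode.opmap
ADD_OP = "BINARY_OP" if HAS_GENERIC_BINARY_OP else "BINARY_ADD"

print(sys.version_info[:2], CALL_OP, ADD_OP)
```

Feature detection keeps working on interpreters released after the code was written, though it only says that an opcode exists, not that its argument or stack behaviour is unchanged.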

One thing to remember: Python bytecode manipulation is a powerful technique for instrumentation, optimization, and metaprogramming, but the bytecode format changes significantly between Python versions. Use abstraction layers such as the bytecode library or the dis module rather than raw bytes, and test thoroughly on every Python version you target.
