Python sys.settrace Hooks — Deep Dive

Building a minimal debugger

The best way to understand sys.settrace is to build something with it. Here is a minimal breakpoint debugger:

import sys

class MiniDebugger:
    def __init__(self):
        self.breakpoints = set()  # (filename, lineno) pairs
        self.enabled = True

    def add_breakpoint(self, filename, lineno):
        self.breakpoints.add((filename, lineno))

    def global_trace(self, frame, event, arg):
        if event == "call":
            # Only trace files where we have breakpoints
            filename = frame.f_code.co_filename
            if any(bp[0] == filename for bp in self.breakpoints):
                return self.local_trace
            return None  # Skip tracing this function entirely
        return None

    def local_trace(self, frame, event, arg):
        if event == "line":
            key = (frame.f_code.co_filename, frame.f_lineno)
            if key in self.breakpoints:
                self._interact(frame)
        return self.local_trace

    def _interact(self, frame):
        print(f"\n*** Breakpoint at {frame.f_code.co_filename}:{frame.f_lineno}")
        print(f"    Function: {frame.f_code.co_name}")
        print(f"    Locals: {dict(frame.f_locals)}")
        input("    Press Enter to continue...")

    def start(self):
        sys.settrace(self.global_trace)

    def stop(self):
        sys.settrace(None)

The critical optimization is in global_trace: returning None for functions outside our interest prevents Python from calling the local trace function on every line of library code.
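The effect of that early return can be checked with a small experiment. The sketch below is illustrative only: the function names `traced` and `skipped`, and the choice to filter by function name rather than by filename, are assumptions made to keep the example self-contained.

```python
import sys

line_events = []  # names of functions that produced "line" events


def traced(n):
    total = 0
    for i in range(n):
        total += i
    return total


def skipped(n):
    total = 0
    for i in range(n):
        total += i
    return total


def local_trace(frame, event, arg):
    if event == "line":
        line_events.append(frame.f_code.co_name)
    return local_trace


def global_trace(frame, event, arg):
    # Hand out a local trace function only for the function we care about.
    if event == "call" and frame.f_code.co_name == "traced":
        return local_trace
    return None  # no per-line callbacks for anything else


def run_demo():
    line_events.clear()
    previous = sys.gettrace()  # preserve any tracer that is already installed
    sys.settrace(global_trace)
    try:
        traced(3)
        skipped(3)
    finally:
        sys.settrace(previous)
    return set(line_events)


print(run_demo())  # only "traced" produced line events
```

Both functions run the same loop, but only the one that received a local trace function pays the per-line cost.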

Frame object internals

The frame object passed to trace functions exposes CPython’s internal execution state:

def inspect_frame(frame, event, arg):
    if event == "call":
        code = frame.f_code
        print(f"Function: {code.co_name}")
        print(f"  File: {code.co_filename}:{frame.f_lineno}")
        print(f"  Args: {code.co_varnames[:code.co_argcount]}")
        print(f"  Free vars: {code.co_freevars}")
        print(f"  Stack size: {code.co_stacksize}")
        print(f"  Flags: {code.co_flags:#x}")

        # Access the caller's frame
        if frame.f_back:
            print(f"  Called from: {frame.f_back.f_code.co_name}:"
                  f"{frame.f_back.f_lineno}")

        return inspect_frame
    return inspect_frame

Key frame attributes for tracing:

  • f_locals: local variable mapping (best treated as read-only: before Python 3.13, writes may not take effect in CPython because of the fast-locals optimization; since 3.13, PEP 667 makes it a write-through proxy)
  • f_globals: global variable dict
  • f_code: the code object containing bytecode and metadata
  • f_lineno: current line number (writable, but only from within a trace function; assigning to it jumps execution to that line)
  • f_back: the calling frame (allows stack walking)
  • f_trace_lines: boolean controlling whether "line" events fire (Python 3.7+)
  • f_trace_opcodes: boolean enabling "opcode" events (Python 3.7+)
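Stack walking through f_back can be sketched without installing a trace function at all. The example below assumes CPython, where `sys._getframe()` returns the current frame; the helper names `call_chain`, `inner`, and `outer` are made up for illustration.

```python
import sys


def call_chain(frame):
    """Collect function names from `frame` outward by following f_back."""
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return names


def inner():
    # sys._getframe() is a CPython implementation detail.
    return call_chain(sys._getframe())


def outer():
    return inner()


chain = outer()
print(chain[:2])  # the innermost frame comes first
```

A trace function receives the same kind of frame object, so the same loop works on its `frame` argument.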

The opcode event (Python 3.7+)

Setting frame.f_trace_opcodes = True fires an event for every bytecode instruction, not just every source line:

import sys
import dis

def opcode_tracer(frame, event, arg):
    if event == "call":
        frame.f_trace_opcodes = True
        return opcode_tracer
    if event == "opcode":
        code = frame.f_code
        offset = frame.f_lasti  # byte offset of the instruction about to run
        opcode = code.co_code[offset]
        print(f"  opcode at offset {offset}: {dis.opname[opcode]}")
        return opcode_tracer
    return opcode_tracer

This is extremely slow but useful for security analysis, taint tracking, and educational tools that visualize bytecode execution.
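When an opcode tracer needs more than the opcode name, one option is to precompute a lookup table with `dis.get_instructions` and index it by `frame.f_lasti`. This is a sketch under the assumption that f_lasti points at the start of an instruction during "opcode" events; the helper name `offset_to_opname` and the sample function `add` are hypothetical.

```python
import dis


def offset_to_opname(code):
    """Map each instruction's byte offset to its opcode name."""
    return {ins.offset: ins.opname for ins in dis.get_instructions(code)}


def add(a, b):
    result = a + b
    return result


table = offset_to_opname(add.__code__)
for offset, name in sorted(table.items()):
    print(f"{offset:>4}  {name}")
```

The exact instruction names and offsets differ between Python versions, so a tool should build the table at run time rather than hard-code it.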

Building a code coverage tool

Coverage tools use settrace to track which lines execute:

import sys
from collections import defaultdict

class Coverage:
    def __init__(self, source_dirs):
        self.source_dirs = source_dirs
        self.executed_lines = defaultdict(set)
        self._original_trace = None

    def _should_trace(self, filename):
        return any(filename.startswith(d) for d in self.source_dirs)

    def _global_trace(self, frame, event, arg):
        if event == "call" and self._should_trace(frame.f_code.co_filename):
            return self._local_trace
        return None

    def _local_trace(self, frame, event, arg):
        if event == "line":
            self.executed_lines[frame.f_code.co_filename].add(frame.f_lineno)
        return self._local_trace

    def start(self):
        self._original_trace = sys.gettrace()
        sys.settrace(self._global_trace)

    def stop(self):
        sys.settrace(self._original_trace)

    def report(self):
        for filename, lines in sorted(self.executed_lines.items()):
            total_lines = self._count_executable_lines(filename)
            covered = len(lines)
            pct = (covered / total_lines * 100) if total_lines else 0
            print(f"{filename}: {covered}/{total_lines} ({pct:.1f}%)")

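    def _count_executable_lines(self, filename):
        # Rough approximation: count distinct lines on which a statement starts.
        import ast
        with open(filename, encoding="utf-8") as f:
            tree = ast.parse(f.read(), filename=filename)
        return len({
            node.lineno for node in ast.walk(tree)
            if isinstance(node, ast.stmt)
        })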

Real coverage tools like coverage.py use C extensions to reduce overhead and handle edge cases like multi-line expressions and decorator lines.
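The core idea can be seen on a single function with a branch. The following is a minimal sketch, not how coverage.py works internally: `record_lines` and `classify` are hypothetical names, and line numbers are reported relative to the `def` line so the result does not depend on where the code sits in a file.

```python
import sys


def record_lines(func, *args):
    """Run func(*args) and return the executed lines, relative to its def line."""
    code = func.__code__
    executed = set()

    def local_trace(frame, event, arg):
        if event == "line":
            executed.add(frame.f_lineno - code.co_firstlineno)
        return local_trace

    def global_trace(frame, event, arg):
        # Trace only the function under test.
        if event == "call" and frame.f_code is code:
            return local_trace
        return None

    previous = sys.gettrace()
    sys.settrace(global_trace)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return executed


def classify(n):
    if n < 0:                  # relative line 1
        return "negative"      # relative line 2
    return "non-negative"      # relative line 3


print(sorted(record_lines(classify, 5)))
print(sorted(record_lines(classify, -1)))
```

Each call covers only the branch it actually took, which is exactly the information a coverage report aggregates across a test run.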

sys.monitoring: the Python 3.12+ replacement

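Python 3.12 introduced sys.monitoring (PEP 669) as a lower-overhead alternative to sys.settrace. Instead of one trace function that receives every event, a tool claims an ID, enables only the events it needs, and registers one callback per event: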

import sys

# Register a tool (each tool gets an ID 0-5)
DEBUGGER_ID = sys.monitoring.DEBUGGER_ID

# Enable specific events instead of getting all events
sys.monitoring.use_tool_id(DEBUGGER_ID, "my_debugger")
sys.monitoring.set_events(
    DEBUGGER_ID,
    sys.monitoring.events.LINE | sys.monitoring.events.CALL
)

# Register per-event callbacks
def on_line(code, line_number):
    print(f"Line {line_number} in {code.co_name}")

def on_call(code, instruction_offset, callable_obj, arg0):
    # `code` is the caller's code object; `callable_obj` is what is being called
    print(f"{code.co_name} is calling {callable_obj!r}")

sys.monitoring.register_callback(
    DEBUGGER_ID, sys.monitoring.events.LINE, on_line
)
sys.monitoring.register_callback(
    DEBUGGER_ID, sys.monitoring.events.CALL, on_call
)

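The performance advantage comes from how events are switched on: sys.monitoring uses a flag-based system that can enable or disable events per code object, and a callback can return sys.monitoring.DISABLE to turn off a single code location. When no events are active for a function, the overhead is near zero, whereas settrace pays a check on every line of every traced frame. Reported gains for selective tracing are in the range of 10–100x less overhead, though the exact figure depends heavily on the workload and on which events are enabled.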
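Per-code-object events and DISABLE can be combined in a few lines. This is a hedged sketch: it assumes Python 3.12+ (returning None otherwise), borrows the COVERAGE_ID slot only if it is free, and uses the hypothetical names `line_events_once` and `loop`.

```python
import sys


def line_events_once(func, *args):
    """Run func(*args), recording each LINE location at most once.

    Returns None when sys.monitoring is unavailable or the tool ID is taken.
    """
    mon = getattr(sys, "monitoring", None)
    if mon is None:
        return None  # Python < 3.12
    tool_id = mon.COVERAGE_ID
    if mon.get_tool(tool_id) is not None:
        return None  # another tool already owns this ID
    seen = []

    def on_line(code, line_number):
        seen.append(line_number)
        return mon.DISABLE  # stop reporting this code location

    mon.use_tool_id(tool_id, "line-demo")
    try:
        mon.register_callback(tool_id, mon.events.LINE, on_line)
        # Enable LINE events for this one code object only.
        mon.set_local_events(tool_id, func.__code__, mon.events.LINE)
        func(*args)
    finally:
        mon.set_local_events(tool_id, func.__code__, mon.events.NO_EVENTS)
        mon.register_callback(tool_id, mon.events.LINE, None)
        mon.restart_events()  # re-arm locations disabled via DISABLE
        mon.free_tool_id(tool_id)
    return seen


def loop(n):
    total = 0
    for i in range(n):
        total += i
    return total


result = line_events_once(loop, 50)
print(result)
```

Even though the loop body runs 50 times, the callback fires only a handful of times, because each location is disabled after its first report.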

Migration strategy from settrace to sys.monitoring

settrace concept                  sys.monitoring equivalent
Global trace function ("call")    register_callback for PY_START events (PY_RESUME for generator resumption)
Local trace function              register_callback for LINE events + set_local_events
Return None to skip               Return sys.monitoring.DISABLE (per code location), or leave the code object out of set_local_events
frame.f_trace_opcodes             INSTRUCTION event type
sys.setprofile                    PY_START + PY_RETURN + CALL + C_RETURN + C_RAISE events

For code that needs to support both Python 3.11 and 3.12+, feature-detect:

import sys

if hasattr(sys, "monitoring"):
    # Use sys.monitoring (3.12+)
    pass
else:
    # Fall back to sys.settrace
    pass

Tracing and generators

Generators interact with tracing in subtle ways. Each yield triggers a "return" event (with the yielded value as arg), and each resumption via next() triggers a "call" event. A generator that yields 1000 times therefore fires about 2000 call/return events, on top of its "line" events. For async generators and coroutines, "return" also fires each time the frame suspends at an await.
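The call/return pairing is easy to observe directly. The sketch below records only "call" and "return" events for one generator function; `generator_events` and `count_up` are hypothetical names chosen for the example.

```python
import sys


def generator_events(gen_func, *args):
    """Exhaust gen_func(*args) and return its "call"/"return" events in order."""
    code = gen_func.__code__
    events = []

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # ignore every other frame
        if event in ("call", "return"):
            events.append(event)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        list(gen_func(*args))  # drives the generator to exhaustion
    finally:
        sys.settrace(previous)
    return events


def count_up(n):
    for i in range(n):
        yield i


events = generator_events(count_up, 3)
print(events)
```

The events alternate between "call" and "return", one pair per resumption of the generator frame, so the count grows with the number of items consumed.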

Security implications

sys.settrace can observe every function call, return value, and exception in a process. Malicious code could install a trace function to intercept passwords, API keys, or session tokens. In security-sensitive environments:

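  • Run Python with -X no_debug_ranges (or set PYTHONNODEBUGRANGES; Python 3.11+) to omit column-position data from code objects. This trims debug metadata but does not prevent tracing
  • Audit for unexpected sys.gettrace() results
  • Consider the -O flag, which strips assert statements and sets __debug__ to False (though it does not disable settrace itself)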
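Beyond polling sys.gettrace(), CPython raises the audit events "sys.settrace" and "sys.setprofile" whenever those hooks are installed, so an audit hook (Python 3.8+) can log the attempt as it happens. This is a minimal sketch: the list `trace_installs` and the hook name `_audit` are illustrative, and a real deployment would probably log or alert instead of appending to a list.

```python
import sys

trace_installs = []  # audit events observed so far


def _audit(event, args):
    # Audit hooks run for every audited event, so keep the check cheap.
    if event in ("sys.settrace", "sys.setprofile"):
        trace_installs.append(event)


# Note: audit hooks cannot be removed once added.
sys.addaudithook(_audit)

# Re-installing the current tracer is enough to trigger the event.
sys.settrace(sys.gettrace())
print(trace_installs)
```

The hook sees the installation attempt but not the trace function itself, so it is best treated as a tripwire rather than a complete defense.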

The one thing to remember: sys.settrace is the hook that makes Python’s entire debugging ecosystem possible, but its per-event overhead makes sys.monitoring (Python 3.12+) the preferred choice for new tools that need selective, low-overhead code instrumentation.
