Code Objects Internals — Deep Dive
The Code Object in CPython’s Architecture
In CPython, a code object is an instance of PyCodeObject (defined in Include/cpython/code.h). It is one of the most important internal types: every compiled scope (module, function, class body, lambda, generator expression) gets one. List, dict, and set comprehensions also had their own code objects until Python 3.12 began inlining them (PEP 709). Code objects are immutable after creation, which enables caching in .pyc files and safe sharing across threads.
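As a small illustration of the paragraph above (a sketch only; the source string and the "<sketch>" filename are made up for this example), compile() returns a code object for a module-level scope, the function inside it gets its own nested code object, and attribute assignment is rejected:

```python
import types

# Compile a tiny module; "<sketch>" is an arbitrary placeholder filename.
module_code = compile("def f():\n    return 1\n", "<sketch>", "exec")

is_code = isinstance(module_code, types.CodeType)

# The function body is compiled to its own code object, stored as a constant.
nested = [c for c in module_code.co_consts if isinstance(c, types.CodeType)]

# Code objects reject attribute assignment.
try:
    module_code.co_name = "renamed"
    mutable = True
except AttributeError:
    mutable = False

print(is_code, [c.co_name for c in nested], mutable)
```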
Complete Attribute Reference
Here are the main co_ attributes available in Python 3.12+, illustrated with this function:
def example(a, b=10, *args, key=None, **kwargs):
    x = a + b
    name = "test"
    return x

co = example.__code__
Argument metadata:
- co_argcount: Positional argument count (2: a and b)
- co_posonlyargcount: Positional-only arguments (0 here)
- co_kwonlyargcount: Keyword-only arguments (1: key)
- co_flags: Bitmask indicating properties like CO_VARARGS (0x04), CO_VARKEYWORDS (0x08), CO_GENERATOR (0x20), CO_COROUTINE (0x80)
Variable tables:
- co_varnames: Local variables, arguments first (positional, then keyword-only, then *args and **kwargs): ('a', 'b', 'key', 'args', 'kwargs', 'x', 'name')
- co_cellvars: Local variables captured by inner functions (closures)
- co_freevars: Variables received from an enclosing scope
- co_names: Global, builtin, and attribute names referenced
Bytecode and constants:
- co_code: The raw bytecode bytes
- co_consts: Immutable constant pool, e.g. (None, 'test') on Python 3.12; the default value 10 is stored in example.__defaults__, not here
- co_stacksize: Maximum evaluation stack depth
Source mapping:
- co_filename: Source file path
- co_name: Scope name (e.g., 'example')
- co_qualname: Qualified name including nesting (Python 3.11+)
- co_firstlineno: First line number
- co_lnotab: Legacy line number table (deprecated)
- co_linetable: New compact line table (Python 3.10+)
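The attribute list above can be checked directly. The following is a minimal sketch that redefines the same example function and decodes co_flags against the named CO_* constants exposed by the inspect module; the choice of which flags to test is just for illustration:

```python
import inspect

def example(a, b=10, *args, key=None, **kwargs):
    x = a + b
    name = "test"
    return x

co = example.__code__

# Decode co_flags against the named constants in the inspect module.
flag_names = [
    flag
    for flag in ("CO_VARARGS", "CO_VARKEYWORDS", "CO_GENERATOR", "CO_COROUTINE")
    if co.co_flags & getattr(inspect, flag)
]

print(co.co_argcount, co.co_posonlyargcount, co.co_kwonlyargcount)
print(co.co_varnames)
print(flag_names)
# Default values live on the function object, not on the code object.
print(example.__defaults__, example.__kwdefaults__)
```

Because the function takes both *args and **kwargs but is neither a generator nor a coroutine, only the first two flags are reported.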
Bytecode Layout
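The co_code attribute contains the actual instructions. In Python 3.6+, every instruction is exactly 2 bytes ("wordcode"): one byte for the opcode, one for the argument. Instructions that need larger arguments use EXTENDED_ARG prefixes. Since Python 3.11, some instructions are also followed by inline CACHE entries of the same 2-byte size, which is why instruction offsets are not always consecutive.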
import dis

def square(n):
    return n * n

dis.dis(square)
# Output (Python 3.12):
#   RESUME        0
#   LOAD_FAST     0 (n)
#   LOAD_FAST     0 (n)
#   BINARY_OP     5 (*)
#   RETURN_VALUE
The dis module decodes co_code and maps argument indices back to the variable tables. LOAD_FAST 0 means “load local variable at index 0 in co_varnames”, which is n.
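To make the 2-byte layout concrete, here is a rough sketch that walks co_code by hand instead of relying on dis.dis. It only uses dis.opname to turn opcode numbers into mnemonics; the exact instruction names printed depend on the interpreter version, so no output is shown:

```python
import dis

def square(n):
    return n * n

raw = square.__code__.co_code

# Walk the byte string two bytes at a time: (opcode, argument).
units = [(raw[i], raw[i + 1]) for i in range(0, len(raw), 2)]

for index, (opcode, arg) in enumerate(units):
    # dis.opname maps an opcode number to its mnemonic for the running interpreter.
    print(index * 2, dis.opname[opcode], arg)
```

On Python 3.11+ some of the printed units are CACHE entries rather than real instructions.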
The Constants Pool
co_consts is a tuple containing every literal value in the function: integers, floats, strings, bytes, None, True, False, tuples of constants, and nested code objects. The compiler deduplicates constants where possible — two identical string literals may share the same entry.
Nested code objects appear here because inner function definitions are themselves constants from the compiler’s perspective:
def outer():
    def inner():
        return 42
    return inner

# outer.__code__.co_consts contains inner's code object
code_objects = [c for c in outer.__code__.co_consts
                if isinstance(c, type(outer.__code__))]
print(code_objects[0].co_name)    # 'inner'
print(code_objects[0].co_consts)  # (None, 42) on Python 3.12; may differ on other versions
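The deduplication of constants described earlier in this section can be observed with a small experiment. This is a sketch with a made-up function; the exact contents and order of co_consts vary between CPython versions, so only the two properties of interest are worth relying on:

```python
def greetings():
    first = "hello world"
    second = "hello world"   # same literal again
    point = (1, 2, 3)        # a tuple made only of constants
    return first, second, point

consts = greetings.__code__.co_consts
print(consts)
print(consts.count("hello world"))  # how many entries hold the repeated literal
```

On CPython the repeated string literal occupies a single slot, and the all-constant tuple is stored as one prebuilt tuple rather than being rebuilt on each call.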
Variable Scope Resolution
The compiler determines variable scope at compile time, not runtime. This decision is encoded in which table a name appears in:
- co_varnames → LOAD_FAST / STORE_FAST (array index, fastest)
- co_cellvars → LOAD_DEREF / STORE_DEREF (closure cell)
- co_freevars → LOAD_DEREF (received from enclosing scope)
- co_names → LOAD_GLOBAL / LOAD_ATTR (dictionary lookup, slowest)
This is why local variables are faster than globals — LOAD_FAST is an array index operation, while LOAD_GLOBAL involves a dictionary lookup in the module namespace.
def closure_example():
    count = 0
    def increment():
        nonlocal count
        count += 1
        return count
    return increment

outer_co = closure_example.__code__
print(outer_co.co_cellvars)  # ('count',)

inner_co = [c for c in outer_co.co_consts
            if hasattr(c, 'co_code')][0]
print(inner_co.co_freevars)  # ('count',)
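The link between name tables and opcodes can also be checked for plain locals and globals. The sketch below uses a hypothetical clamp function and dis.get_instructions; instruction names are matched by prefix because newer interpreters use variants such as combined or specialized LOAD_FAST forms:

```python
import dis

def clamp(value):
    result = min(value, 10)  # `min` is resolved through co_names
    return result

co = clamp.__code__
opnames = {ins.opname for ins in dis.get_instructions(clamp)}

print(co.co_varnames)  # locals, including the argument
print(co.co_names)     # names looked up by dictionary
print(sorted(opnames))
```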
Creating Code Objects Programmatically
You can construct code objects using types.CodeType, though the constructor signature is extensive:
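import dis
import types

# A minimal code object that returns 42.
# Opcode numbers change between Python releases, so look them up by name.
bytecode = bytes([
    dis.opmap['RESUME'], 0,        # RESUME 0 (emitted at function start on 3.11+)
    dis.opmap['LOAD_CONST'], 0,    # LOAD_CONST 0 (42)
    dis.opmap['RETURN_VALUE'], 0,  # RETURN_VALUE
])

# Positional signature as of Python 3.11+; earlier versions take different arguments.
code = types.CodeType(
    0,            # argcount
    0,            # posonlyargcount
    0,            # kwonlyargcount
    0,            # nlocals
    1,            # stacksize
    0,            # flags
    bytecode,     # codestring
    (42,),        # constants
    (),           # names
    (),           # varnames
    '<dynamic>',  # filename
    'answer',     # name
    'answer',     # qualname
    1,            # firstlineno
    b'',          # linetable
    b'',          # exceptiontable
    (),           # freevars
    (),           # cellvars
)

func = types.FunctionType(code, {})
print(func())  # 42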
The code.replace() method (Python 3.8+) is the safer way to derive a modified code object: it returns a copy with all attributes preserved except the ones you override:
new_code = square.__code__.replace(co_name='square_v2')
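A slightly fuller sketch of the same idea, showing that replace() leaves the original untouched and that the result can be wrapped in a callable with types.FunctionType (the name square_v2 is only illustrative):

```python
import types

def square(n):
    return n * n

# replace() returns a new code object; the original is left untouched.
renamed_code = square.__code__.replace(co_name="square_v2")

# Wrap the new code object in a function so it can be called.
square_v2 = types.FunctionType(renamed_code, square.__globals__, "square_v2")

print(square_v2(4), square_v2.__code__.co_name, square.__code__.co_name)
```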
The Line Number Table
Tracebacks, debuggers, and coverage tools need to map bytecode offsets back to source lines. Python 3.10 introduced co_linetable as the replacement for co_lnotab, along with the co_lines() iterator for reading it. Python 3.11 changed the table's format so that it also records end lines and column offsets, which are exposed through co_positions().
# Python 3.11+
for pos in square.__code__.co_positions():
    print(pos)  # (lineno, end_lineno, col_offset, end_col_offset)
This granularity powers the precise error messages in Python 3.11+ that underline the exact expression that caused an error.
Exception Tables (Python 3.11+)
Python 3.11 introduced co_exceptiontable, replacing the old runtime block stack used for try statements. The table maps bytecode ranges to exception handlers, enabling "zero-cost" exception handling: entering a try block executes no setup instructions, so there is almost no overhead when no exception is raised.
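A quick way to see the table is to compare a function that has a try block with one that does not. This is a sketch with two made-up functions; the getattr fallback is there only so the snippet also runs on interpreters older than 3.11, where the attribute does not exist:

```python
def guarded(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return None

def plain(a, b):
    return a / b

# co_exceptiontable only exists on Python 3.11+, hence the getattr fallback.
guarded_table = getattr(guarded.__code__, "co_exceptiontable", None)
plain_table = getattr(plain.__code__, "co_exceptiontable", None)

print(guarded_table)  # encoded handler ranges (bytes) on 3.11+
print(plain_table)    # empty bytes on 3.11+: nothing to handle
```

The encoding of the table is an implementation detail; on 3.11+ dis.dis(guarded) prints a decoded view of it after the instruction listing.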
.pyc Files: Serialized Code Objects
When Python imports a module from source, it serializes the module's code object (and all nested code objects) using the marshal module, writing the result to __pycache__/<name>.cpython-3XX.pyc. The file contains a magic number (identifying the bytecode version), a flags field, a timestamp and source size or a source hash for invalidation, and then the marshalled code object. On subsequent imports, as long as the cached file is still valid for the source, Python loads the .pyc directly, skipping parsing and compilation entirely.
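The file layout can be explored by writing a .pyc by hand and reading it back. The sketch below uses py_compile with an explicit output path in a temporary directory (the module name demo is hypothetical) and assumes the 16-byte header used since Python 3.7:

```python
import importlib.util
import marshal
import os
import py_compile
import tempfile
import types

with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "demo.py")   # hypothetical module name
    pyc_path = os.path.join(tmp, "demo.pyc")
    with open(src_path, "w") as fh:
        fh.write("VALUE = 6 * 7\n")

    # Compile to an explicit .pyc path instead of __pycache__.
    py_compile.compile(src_path, cfile=pyc_path, doraise=True)
    with open(pyc_path, "rb") as fh:
        data = fh.read()

# Assumed layout (3.7+): 4-byte magic, 4-byte flags, 8 bytes of mtime+size or hash.
magic, payload = data[:4], data[16:]
module_code = marshal.loads(payload)

namespace = {}
exec(module_code, namespace)
print(magic == importlib.util.MAGIC_NUMBER, type(module_code).__name__, namespace["VALUE"])
```

Only do this with files you produced yourself; the payload is executable code.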
Security Implications
Code objects can be marshalled (the standard pickle module refuses to pickle them). Executing an untrusted code object is at least as dangerous as eval(): the bytecode can contain arbitrary instructions, and because CPython does not verify bytecode, malformed instructions can crash the interpreter. Never load .pyc files or marshalled code from untrusted sources. The compile() built-in only compiles source text or an AST and does not execute anything by itself, but marshal.loads() can produce executable code objects from arbitrary bytes.
Practical Debugging Patterns
def inspect_code(func):
    co = func.__code__
    print(f"Name: {co.co_name}")
    print(f"Args: {co.co_argcount} positional, {co.co_kwonlyargcount} kw-only")
    print(f"Locals: {co.co_varnames}")
    print(f"Constants: {co.co_consts}")
    print(f"Stack size: {co.co_stacksize}")
    print(f"Flags: {co.co_flags:#06x}")
    print(f"Bytecode: {co.co_code.hex()}")
This kind of introspection is invaluable when debugging import hooks, understanding optimizer behavior, or building code analysis tools.
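A common building block for such tools is a walker that visits every nested code object through co_consts. The helper name iter_code_objects below is hypothetical, and this is only a minimal sketch of the idea:

```python
import types

def iter_code_objects(code):
    """Yield `code` and every code object nested inside its constants."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)

def outer():
    def inner():
        return lambda: 42
    return inner

names = [co.co_name for co in iter_code_objects(outer.__code__)]
print(names)  # ['outer', 'inner', '<lambda>']
```

The same walker works on a module's code object returned by compile(), which makes it a convenient starting point for whole-file analysis.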
One thing to remember: The code object is the bridge between your source code and the interpreter’s execution engine. Understanding its structure — the bytecode, the constants pool, the variable tables, and the scope resolution rules — gives you insight into Python’s performance characteristics and opens the door to advanced metaprogramming, debugging, and tooling.