Python 3.11 New Features — Deep Dive
Technical overview
Python 3.11 (October 2022) was the most performance-focused CPython release ever. The Faster CPython project delivered adaptive interpreter specialisation, and the language gained structured concurrency primitives and exception groups. This deep dive covers the internals that make it all work.
Adaptive interpreter specialisation
How it works
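CPython 3.11’s interpreter monitors bytecode execution. Once a code object is warm (in CPython 3.11, after roughly 8 calls or loop iterations), its instructions become adaptive: each one watches the operand types it sees and, when they are stable, replaces itself with a specialised variant:
- BINARY_OP → BINARY_OP_ADD_INT when both operands are int
- LOAD_ATTR → LOAD_ATTR_INSTANCE_VALUE when the object layout is known
- CALL → CALL_PY_EXACT_ARGS when calling a Python function with matching arity
If the type assumption later breaks (a different type appears), the specialised instruction falls through to the generic code path, and after repeated misses it reverts to its adaptive form so it can re-specialise. A miss costs little more than a failed guard check.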
Quickening
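Specialisation happens through “quickening”: the bytecode is rewritten in place inside the code object, with the cached operand data stored inline next to each instruction. The co_code attribute still returns the original, unspecialised bytecode. The snippet below makes the effect visible: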
import dis
import sys
def add(a, b):
    return a + b
# Call enough times to trigger specialisation
for _ in range(10):
    add(1, 2)
# Show adaptive bytecodes
dis.dis(add, adaptive=True)
# Output shows BINARY_OP_ADD_INT instead of BINARY_OP
The adaptive=True flag in dis.dis is new in 3.11 and reveals the specialised opcodes.
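To watch de-specialisation as well, the sketch below compares the opcode names reported before any call, after an int-only warm-up, and after a run of str calls. The helper name `opnames` and the call counts are illustrative assumptions, and the exact opcodes differ between CPython versions, so no output is shown.

```python
import dis
import sys


def add(a, b):
    return a + b


def opnames(func):
    """Return the opcode names dis reports, specialised ones included."""
    # The adaptive keyword only exists on Python 3.11 and later.
    kwargs = {"adaptive": True} if sys.version_info >= (3, 11) else {}
    return [ins.opname for ins in dis.get_instructions(func, **kwargs)]


cold = opnames(add)            # before any call: generic opcodes only

for _ in range(100):           # warm up with int operands only
    add(1, 2)
after_ints = opnames(add)

for _ in range(1000):          # break the assumption with str operands
    add("a", "b")
after_strs = opnames(add)

for label, names in [("cold", cold), ("ints", after_ints), ("strs", after_strs)]:
    # Only the binary-operation opcode is interesting here.
    print(label, [n for n in names if n.startswith("BINARY")])
```

Comparing the three lines of output shows whether the instruction was specialised for int and what the interpreter did once str operands arrived.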
Measured improvements
The pyperformance benchmark suite showed:
| Benchmark | Speedup |
|---|---|
| richards | 1.26× |
| raytrace | 1.21× |
| regex_compile | 1.19× |
| json_loads | 1.09× |
| django_template | 1.26× |
| async_tree_io | 1.55× |
| spectral_norm | 1.08× |
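Gains vary by workload. Pure computation and template rendering benefit most. Programs that spend most of their time waiting on real network or disk I/O see smaller gains, because the interpreter is not their bottleneck. The async_tree_io benchmark simulates its I/O, so its result mostly reflects interpreter and asyncio overhead rather than time spent waiting.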
Memory impact
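Specialised instructions keep their cached data inline in the bytecode, so code objects grow somewhat. The What’s New FAQ for 3.11 states that memory use is not expected to exceed 3.10 by more than 20%, and that the increase is offset by leaner frame objects and object dictionaries; for most applications the overhead is negligible. PYTHONDONTWRITEBYTECODE=1 does not disable quickening: it only stops CPython from writing .pyc files. CPython 3.11 documents no runtime option for turning specialisation off.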
Zero-cost exception handling
In Python ≤3.10, entering a try block had a small overhead: the interpreter pushed a block onto the exception handling stack. In 3.11, a try block that raises nothing costs almost nothing at runtime. The handler information is stored in a static table attached to the code object (co_exceptiontable) rather than on the runtime stack, and it is consulted only when an exception is actually raised.
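import timeit
# Entering the try block is essentially free in 3.11
def with_try():
    try:
        x = 1 + 2
    except Exception:
        pass
    return x
# Virtually identical performance
def without_try():
    x = 1 + 2
    return x
# Compare the two; absolute numbers depend on the machine
print(timeit.timeit(with_try, number=1_000_000))
print(timeit.timeit(without_try, number=1_000_000))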
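This makes defensive try/except blocks in hot paths nearly free as long as no exception is raised. Raising and catching an exception still has a cost, so the change favours code where the exceptional path is rare.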
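The static table can be inspected directly. The sketch below uses two hypothetical functions, `guarded` and `unguarded`, and reads `co_exceptiontable`, which exists on CPython 3.11 and later; `getattr` is used so the snippet degrades quietly on older interpreters.

```python
import dis
import io


def guarded():
    try:
        return int("42")
    except ValueError:
        return None


def unguarded():
    return int("42")


# co_exceptiontable is a bytes object on CPython 3.11+; None means "not available".
guarded_table = getattr(guarded.__code__, "co_exceptiontable", None)
unguarded_table = getattr(unguarded.__code__, "co_exceptiontable", None)
print("guarded:", guarded_table)
print("unguarded:", unguarded_table)

# On 3.11+, dis prints the decoded table under an "ExceptionTable:" heading.
buffer = io.StringIO()
dis.dis(guarded, file=buffer)
listing = buffer.getvalue()
print(listing)
```

A function without any try block carries an empty table, which is why the happy path has nothing extra to execute.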
Exception groups — semantics and internals
The ExceptionGroup class hierarchy
BaseException
├── BaseExceptionGroup (can contain BaseException subclasses)
│   └── ExceptionGroup (can only contain Exception subclasses)
├── KeyboardInterrupt
└── Exception
    └── ExceptionGroup
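ExceptionGroup appears twice because it inherits from both BaseExceptionGroup and Exception. BaseExceptionGroup can wrap any exception, including KeyboardInterrupt and SystemExit. ExceptionGroup only wraps Exception subclasses, which is why a plain except Exception also catches it.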
except* semantics in detail
try:
    raise ExceptionGroup("errors", [
        ValueError("a"),
        TypeError("b"),
        ValueError("c"),
    ])
except* ValueError as eg:
    print(f"Caught {len(eg.exceptions)} ValueErrors")
    # eg is ExceptionGroup("errors", [ValueError("a"), ValueError("c")])
except* TypeError as eg:
    print(f"Caught {len(eg.exceptions)} TypeErrors")
Critical rules:
- All except* clauses are checked; they don’t short-circuit like except
- The same exception cannot match two except* clauses
- Unmatched exceptions propagate as a new ExceptionGroup
- You cannot mix except and except* in the same try block
- Re-raising in except* re-raises only the matched sub-group
ExceptionGroup.subgroup() and .split()
eg = ExceptionGroup("all", [ValueError("x"), TypeError("y"), ValueError("z")])
# subgroup: returns matching exceptions only
val_errors = eg.subgroup(ValueError)
# ExceptionGroup("all", [ValueError("x"), ValueError("z")])
# split: returns (match, rest)
match, rest = eg.split(TypeError)
# match = ExceptionGroup("all", [TypeError("y")])
# rest = ExceptionGroup("all", [ValueError("x"), ValueError("z")])
These methods preserve the nesting structure — nested ExceptionGroup instances are recursively filtered.
asyncio.TaskGroup internals
TaskGroup is built on ExceptionGroup:
async with asyncio.TaskGroup() as tg:
    task1 = tg.create_task(coro1())
    task2 = tg.create_task(coro2())
Implementation details:
- As soon as any task fails with an exception other than CancelledError, the remaining tasks are cancelled via task.cancel()
- On __aexit__, the group waits for all tasks to finish (including cancelled ones)
- All exceptions are collected into an ExceptionGroup (a BaseExceptionGroup if any of them is not an Exception subclass)
- If a single task raised, you still get an ExceptionGroup with one exception (consistent interface)
This is structured concurrency — the lifetime of spawned tasks is bound to the async with block. No orphaned tasks.
Migration from gather()
# Old pattern (silent failure risk)
results = await asyncio.gather(coro1(), coro2(), return_exceptions=True)
for r in results:
    # BaseException also covers asyncio.CancelledError
    if isinstance(r, BaseException):
        handle(r)
# New pattern (exceptions cannot be silently ignored)
try:
    async with asyncio.TaskGroup() as tg:
        tg.create_task(coro1())
        tg.create_task(coro2())
except* ConnectionError:
    handle_connection_failures()
except* ValueError:
    handle_validation_failures()
tomllib implementation notes
tomllib is a vendored copy of tomli (by Taneli Hukkinen). It’s a spec-compliant TOML v1.0.0 parser. Key characteristics:
- Read-only by design: writing TOML is a separate concern with different trade-offs
- Requires binary mode ("rb") because TOML is always UTF-8
- Returns native Python types: dict, list, int, float, str, bool, datetime.datetime, datetime.date, datetime.time
TypeVarTuple (PEP 646) — variadic generics
Enables typing for variadic structures like tensors:
from typing import TypeVarTuple, Generic, Unpack
Ts = TypeVarTuple("Ts")
class Tensor(Generic[*Ts]):
    def __init__(self, *shape: Unpack[Ts]) -> None:
        self.shape = shape
# Type checkers can verify shape compatibility
def matrix_multiply(
    a: Tensor[int, int],
    b: Tensor[int, int]
) -> Tensor[int, int]:
    ...
This was initially designed for NumPy/PyTorch tensor shape checking but has broader applications for any variadic generic container.
Migration strategy
- Benchmark first: run your test suite under 3.11 and measure the wall-clock improvement
- Search for removed APIs: 3.11 dropped @asyncio.coroutine, inspect.getargspec() and the binhex module. collections.Callable was already removed in 3.10 (use collections.abc.Callable), so it only matters when upgrading from 3.9 or earlier
- Adopt TaskGroup over gather() in new async code
- Replace tomli with tomllib; for 3.10 compatibility, try import tomllib first and fall back to import tomli as tomllib on ModuleNotFoundError
- Add except* only where you genuinely handle concurrent failures; don’t use it as a fancier except
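The tomllib fallback can be written as a version check, which static analysers tend to follow more easily than try/except. This is a sketch: the tomli branch assumes the third-party backport is installed on older interpreters, and the sample config is invented.

```python
import sys

if sys.version_info >= (3, 11):
    import tomllib
else:
    # tomli is the third-party backport; it must be installed separately.
    import tomli as tomllib

config = tomllib.loads('name = "demo"\nworkers = 4\n')
print(config)  # {'name': 'demo', 'workers': 4}
```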
The one thing to remember: Python 3.11’s adaptive specialisation proved that an average speedup of roughly 25% was achievable without a JIT compiler, and exception groups finally gave Python the concurrent error handling model it needed.