Python Deferred Computation — Deep Dive
Generator Protocol Internals
Generators implement the iterator protocol through a suspended frame object. When Python encounters yield, it:
- Saves the current execution frame (local variables, instruction pointer, exception state).
- Returns the yielded value to the caller.
- Suspends execution — the generator’s frame stays on the heap, not the call stack.
- On next(), restores the frame and resumes from the instruction after yield.
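The suspension steps above can be observed from the outside. The following is a minimal sketch using the standard inspect module; the countdown generator and the variable names are made up for illustration.

```python
import inspect

def countdown(n):
    """Hypothetical generator used only to inspect frame state."""
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)
state_created = inspect.getgeneratorstate(gen)    # no code has run yet

first = next(gen)                                 # runs up to the first yield
state_suspended = inspect.getgeneratorstate(gen)  # frame is parked, not on the call stack
saved_locals = dict(gen.gi_frame.f_locals)        # locals preserved in the suspended frame

rest = list(gen)                                  # resume until the generator finishes
state_closed = inspect.getgeneratorstate(gen)
frame_after = gen.gi_frame                        # frame is released once the generator ends
```

The saved frame is what keeps local objects alive between calls, which is why the next paragraph matters for memory and resource handling.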
This frame suspension is key to understanding generator memory behavior. A generator holding references to large objects in its local scope keeps those objects alive as long as the generator exists:
def process_chunks(filepath):
    with open(filepath, 'rb') as f:
        while True:
            chunk = f.read(1_048_576)  # 1 MB
            if not chunk:
                break
            yield transform(chunk)
    # File handle closed here, but only when generator is exhausted or closed

gen = process_chunks("huge_file.bin")
next(gen)  # File is now open and stays open
# ... if we forget about gen, the file handle leaks until GC collects it
Always call .close() on generators that manage resources, or use them within context managers.
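One way to follow this advice is contextlib.closing, which calls .close() on the generator when the block exits. This is a small sketch; managed_numbers and the events list are hypothetical stand-ins for a generator that owns a real resource.

```python
from contextlib import closing

events = []  # records when the "resource" is acquired and released

def managed_numbers():
    """Hypothetical generator that owns a resource for its whole lifetime."""
    events.append("open")
    try:
        for i in range(1000):
            yield i
    finally:
        # Runs on exhaustion or when .close() raises GeneratorExit at the yield
        events.append("closed")

with closing(managed_numbers()) as gen:
    first = next(gen)  # consume one item, then abandon the rest
# Leaving the with block called gen.close(), so cleanup has already run here
```

The cleanup now happens at a known point instead of whenever the garbage collector finalizes the generator.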
Generator Send and Throw
Generators support bidirectional communication through .send() and .throw():
def accumulator():
    total = 0
    while True:
        value = yield total
        if value is None:
            break
        total += value

acc = accumulator()
next(acc)     # Prime the generator → yields 0
acc.send(10)  # → yields 10
acc.send(20)  # → yields 30
acc.send(5)   # → yields 35
This turns generators into coroutines that can receive input at each suspension point — the foundation of asyncio before async/await syntax existed.
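The .throw() half of the protocol can be sketched the same way. In this illustrative variant (resettable_accumulator is a made-up name), the caller raises ValueError inside the generator at its suspension point to signal a reset.

```python
def resettable_accumulator():
    """Hypothetical accumulator that treats a thrown ValueError as 'reset'."""
    total = 0
    while True:
        try:
            value = yield total
        except ValueError:
            total = 0        # exception arrived at the yield: reset state
        else:
            total += value   # normal path: a value arrived via send()

acc = resettable_accumulator()
primed = next(acc)                            # run to the first yield
after_ten = acc.send(10)
after_five = acc.send(5)
after_reset = acc.throw(ValueError("reset"))  # handled inside; returns the next yielded value
after_seven = acc.send(7)
acc.close()
```

Like .send(), .throw() returns whatever the generator yields next, so the caller sees the state after the exception has been handled.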
Building a Lazy Evaluation Framework
For complex deferred computation, you can build a Lazy wrapper that delays any computation:
import threading

class Lazy:
    """Thread-safe lazy evaluation wrapper."""
    _SENTINEL = object()

    def __init__(self, func, *args, **kwargs):
        self._func = func
        self._args = args
        self._kwargs = kwargs
        self._result = self._SENTINEL
        self._lock = threading.Lock()
        self._exception = None

    @property
    def value(self):
        # Fast path: a cached result is returned without taking the lock
        if self._result is self._SENTINEL:
            with self._lock:
                # A failed computation is cached too and re-raised, not retried
                if self._exception is not None:
                    raise self._exception
                # Re-check under the lock: another thread may have finished first
                if self._result is self._SENTINEL:
                    try:
                        self._result = self._func(
                            *self._args, **self._kwargs
                        )
                    except Exception as e:
                        self._exception = e
                        raise
        return self._result

    def is_evaluated(self):
        return self._result is not self._SENTINEL
# Usage
config = Lazy(load_config_from_remote, "https://config.example.com")
# No HTTP call yet
if needs_config:
    print(config.value)  # HTTP call happens now, result cached
    print(config.value)  # Returns cached result instantly
This double-checked locking pattern ensures thread safety while avoiding lock acquisition on subsequent accesses.
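To see the pattern hold up under contention, here is a rough sketch in which eight threads race for the same value. LazyOnce is a trimmed stand-in written for this example (no argument forwarding or exception handling), and slow_load is a hypothetical loader.

```python
import threading
import time

class LazyOnce:
    """Trimmed double-checked-locking wrapper, for illustration only."""
    _SENTINEL = object()

    def __init__(self, func):
        self._func = func
        self._result = self._SENTINEL
        self._lock = threading.Lock()

    @property
    def value(self):
        if self._result is self._SENTINEL:           # first check, no lock
            with self._lock:
                if self._result is self._SENTINEL:   # second check, under the lock
                    self._result = self._func()
        return self._result

call_count = 0

def slow_load():
    """Hypothetical expensive loader; only ever runs while the lock is held."""
    global call_count
    call_count += 1
    time.sleep(0.05)
    return {"debug": False}

lazy_config = LazyOnce(slow_load)
results = []

def worker():
    results.append(lazy_config.value)

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All eight threads end up with the same object, and the loader runs once because the late arrivals fail the second check after acquiring the lock.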
Lazy Module Loading
Python supports lazy imports through importlib.util.LazyLoader, which has been available since Python 3.5:
import importlib.util
import sys

def lazy_import(name):
    """Import a module lazily; actual loading is deferred until attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    # Register the module so later "import name" statements reuse this object
    sys.modules[name] = module
    loader.exec_module(module)
    return module

# numpy isn't actually loaded until you access an attribute
np = lazy_import("numpy")
type(np)  # A lazy subclass of the module type; numpy code hasn't executed yet
np.array([1, 2, 3])  # NOW numpy fully loads
Instagram’s Python server uses lazy imports extensively. They reported a 60% reduction in startup time by deferring imports of modules that aren’t needed on every request path.
PEP 690: Lazy Imports
PEP 690 proposed a -L flag to make all imports lazy by default. The Python Steering Council rejected the PEP in late 2022, so it is not part of CPython, but the concept is used in Meta’s internal Python runtime (Cinder) and has demonstrated significant improvements:
- Server startup time reduced from 12s to 4s at Instagram scale
- Memory usage reduced by 40% for CLI tools that import large frameworks but only use a subset
Deferred Descriptor Chains
Descriptors enable per-attribute deferred computation with fine-grained control:
class LazyDescriptor:
    def __init__(self, func):
        self.func = func
        self.attr_name = f"_lazy_{func.__name__}"

    def __set_name__(self, owner, name):
        self.attr_name = f"_lazy_{name}"

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        try:
            return getattr(obj, self.attr_name)
        except AttributeError:
            value = self.func(obj)
            setattr(obj, self.attr_name, value)
            return value

    def __delete__(self, obj):
        try:
            delattr(obj, self.attr_name)
        except AttributeError:
            pass

class DataPipeline:
    def __init__(self, raw_data):
        self.raw_data = raw_data

    @LazyDescriptor
    def cleaned(self):
        print("Cleaning data...")
        return [x.strip().lower() for x in self.raw_data]

    @LazyDescriptor
    def tokenized(self):
        print("Tokenizing...")
        return [s.split() for s in self.cleaned]

    @LazyDescriptor
    def vocabulary(self):
        print("Building vocabulary...")
        words = set()
        for tokens in self.tokenized:
            words.update(tokens)
        return sorted(words)

pipeline = DataPipeline([" Hello World ", "Python IS Great "])
# Nothing computed yet
print(pipeline.vocabulary)
# Prints: Cleaning data... Tokenizing... Building vocabulary...
# Then the sorted vocabulary
print(pipeline.vocabulary)
# Returns cached result — no recomputation
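The __delete__ method allows cache invalidation: del pipeline.cleaned forces cleaned to be recomputed on next access. Invalidation does not cascade, though. Each descriptor caches its own value, so tokenized and vocabulary keep returning their stale results until they are deleted as well.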
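How invalidation interacts with dependent attributes is easy to check directly. The sketch below uses LazyAttr, a compact stand-in for the descriptor written for this example, and a two-stage Pipeline class; both names are hypothetical.

```python
class LazyAttr:
    """Compact caching descriptor, for illustration only."""
    def __init__(self, func):
        self.func = func

    def __set_name__(self, owner, name):
        self.attr_name = f"_lazy_{name}"

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.attr_name not in obj.__dict__:
            obj.__dict__[self.attr_name] = self.func(obj)
        return obj.__dict__[self.attr_name]

    def __delete__(self, obj):
        obj.__dict__.pop(self.attr_name, None)  # drop only this attribute's cache

class Pipeline:
    def __init__(self, raw_data):
        self.raw_data = raw_data

    @LazyAttr
    def cleaned(self):
        return [x.strip().lower() for x in self.raw_data]

    @LazyAttr
    def tokenized(self):
        return [s.split() for s in self.cleaned]

p = Pipeline([" Hello World "])
before = p.tokenized       # computes cleaned, then tokenized

p.raw_data = [" Goodbye "]
del p.cleaned              # invalidates cleaned only
stale = p.tokenized        # still served from tokenized's own cache

del p.tokenized            # dependents must be invalidated explicitly
fresh = p.tokenized        # recomputed from the new cleaned value
```

If dependents should always follow their inputs, one option is for __delete__ to also clear a declared list of downstream attributes; that bookkeeping is not shown here.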
Deferred Iteration with itertools Recipes
Advanced deferred pipelines combine multiple itertools functions:
import itertools
from typing import Iterable, Iterator, TypeVar

T = TypeVar('T')

def lazy_batch(iterable: Iterable[T], size: int) -> Iterator[list[T]]:
    """Lazily batch items without loading all into memory."""
    it = iter(iterable)
    while True:
        batch = list(itertools.islice(it, size))
        if not batch:
            break
        yield batch

def lazy_flatmap(func, iterable):
    """Lazily apply func to each item and flatten results."""
    return itertools.chain.from_iterable(map(func, iterable))

def lazy_deduplicate(iterable, key=None):
    """Lazily deduplicate, preserving order, streaming."""
    seen = set()  # Grows with the number of distinct keys
    for item in iterable:
        k = key(item) if key else item
        if k not in seen:
            seen.add(k)
            yield item

# Compose a fully deferred pipeline
with open("access.log") as raw_lines:  # Lazy file iteration, closed on exit
    parsed = map(parse_log_line, raw_lines)
    errors = filter(lambda r: r.status >= 500, parsed)
    unique_ips = lazy_deduplicate(errors, key=lambda r: r.ip)
    batches = lazy_batch(unique_ips, 100)
    for batch in batches:
        alert_service.send(batch)
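This pipeline processes a multi-gigabyte log file without loading it into memory. Lines are pulled through the parse, filter, and deduplicate stages one at a time, and only the batching stage holds up to 100 records at once. Memory use is not strictly constant, though: lazy_deduplicate keeps every key it has seen, so its set grows with the number of unique IPs.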
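The same composition can be tried on in-memory data. This is an illustrative sketch: the Record type, the "ip status" line format, and the helper names parse, dedupe, and batch are assumptions made for the example, not the log format or helpers used above.

```python
import itertools
from collections import namedtuple

Record = namedtuple("Record", ["ip", "status"])  # hypothetical parsed log entry

def parse(line):
    """Parse an assumed 'ip status' line format."""
    ip, status = line.split()
    return Record(ip, int(status))

def dedupe(iterable, key):
    seen = set()
    for item in iterable:
        k = key(item)
        if k not in seen:
            seen.add(k)
            yield item

def batch(iterable, size):
    it = iter(iterable)
    while chunk := list(itertools.islice(it, size)):
        yield chunk

lines = [
    "10.0.0.1 200",
    "10.0.0.2 500",
    "10.0.0.2 503",
    "10.0.0.3 502",
    "10.0.0.4 500",
    "10.0.0.3 500",
]

# Each stage is lazy; nothing runs until the final list() pulls batches
errors = filter(lambda r: r.status >= 500, map(parse, lines))
unique = dedupe(errors, key=lambda r: r.ip)
batches = list(batch(unique, 2))
batch_ips = [[r.ip for r in b] for b in batches]
```

Only the first error per IP survives deduplication, and the last batch is shorter because the input ran out.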
Performance: Eager vs Deferred Tradeoffs
Benchmarking on a range of 10 million integers, computing squares and filtering evens:
import itertools
import time

data = range(10_000_000)

# Eager: list comprehensions
start = time.perf_counter()
squares = [x**2 for x in data]
evens = [x for x in squares if x % 2 == 0]
result_eager = evens[:10]
eager_time = time.perf_counter() - start
# ~3.2s, ~400 MB peak memory

# Deferred: generator pipeline
start = time.perf_counter()
squares = (x**2 for x in data)
evens = (x for x in squares if x % 2 == 0)
result_lazy = list(itertools.islice(evens, 10))
lazy_time = time.perf_counter() - start
# ~0.00003s, ~0 MB peak memory
When only 10 results are needed, the deferred approach is roughly 100,000x faster because it pulls only the first 19 inputs (0 through 18, enough to produce 10 even squares) instead of all 10 million.
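How much input the deferred pipeline actually touches can be counted rather than inferred from timings. In this small sketch, counting is a hypothetical pass-through generator that tallies every item pulled from the source.

```python
import itertools

consumed = 0

def counting(iterable):
    """Hypothetical pass-through that counts items pulled from the source."""
    global consumed
    for item in iterable:
        consumed += 1
        yield item

squares = (x**2 for x in counting(range(10_000_000)))
evens = (x for x in squares if x % 2 == 0)
first_ten = list(itertools.islice(evens, 10))
```

The count covers the odd inputs that were squared and then discarded by the filter, which is why it is larger than the number of results.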
But if all results are consumed:
# Eager: ~3.2s for full list
# Deferred: ~4.1s to iterate all generators (frame save/restore overhead)
Generators add ~20-30% overhead per item compared to list comprehensions when all items are consumed. The choice depends on whether you’re processing subsets or everything.
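Memory is the other half of the tradeoff and can be measured with the standard tracemalloc module. The sketch below uses a smaller input (N = 200,000, chosen here to keep it quick), and peak_bytes is a hypothetical helper; absolute numbers will differ from the figures quoted above.

```python
import itertools
import tracemalloc

N = 200_000  # assumed size, smaller than the benchmark above

def peak_bytes(fn):
    """Run fn and return (result, peak traced allocation in bytes)."""
    tracemalloc.start()
    try:
        result = fn()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak

def eager():
    squares = [x**2 for x in range(N)]
    evens = [x for x in squares if x % 2 == 0]
    return evens[:10]

def deferred():
    squares = (x**2 for x in range(N))
    evens = (x for x in squares if x % 2 == 0)
    return list(itertools.islice(evens, 10))

eager_result, eager_peak = peak_bytes(eager)
lazy_result, lazy_peak = peak_bytes(deferred)
```

Both versions return the same ten values; the eager one allocates full intermediate lists first, so its peak is far higher.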
async Generators for Deferred I/O
Async generators combine deferred computation with non-blocking I/O:
import asyncio
import httpx

async def fetch_pages(base_url, max_pages=100):
    """Lazily fetch paginated API results."""
    async with httpx.AsyncClient() as client:
        for page in range(1, max_pages + 1):
            response = await client.get(
                f"{base_url}?page={page}"
            )
            data = response.json()
            if not data["results"]:
                break
            for item in data["results"]:
                yield item

async def main():
    # Only fetches pages as items are consumed
    async for user in fetch_pages("https://api.example.com/users"):
        if user["role"] == "admin":
            await notify(user)
            break  # Stops fetching further pages

asyncio.run(main())
This pattern is used in production API clients, database cursors (asyncpg), and streaming data pipelines.
The one thing to remember: Deferred computation in Python spans from simple generators to lazy descriptor chains and async generators. The key architectural decision is identifying which computations might not be needed and deferring exactly those, while keeping the hot paths that always run to completion eagerly evaluated.