Python Caching with lru_cache — Deep Dive
Internal Data Structure
Under the hood, CPython’s lru_cache maintains two data structures working in tandem:
- A dictionary mapping argument tuples to cached results (O(1) lookup).
- A circular doubly-linked list that tracks access order for eviction.
When a cached function is called with known arguments, the dictionary provides the result immediately. The corresponding node in the linked list is then moved to the head (most recently used position). When the cache is full and a new entry arrives, the tail node (least recently used) is removed from both the list and the dictionary.
This design gives O(1) time complexity for both cache hits and evictions — the same strategy used by operating systems for page replacement and by databases like Redis for key eviction.
The Make Key Function
Before looking up arguments in the dictionary, lru_cache converts them into a hashable key using an internal _make_key function. This function:
- Flattens positional and keyword arguments into a single tuple
- Adds type markers between positional and keyword argument sections
- Includes argument names for keyword arguments so that
f(a=1, b=2)andf(b=2, a=1)produce the same key
from functools import lru_cache
@lru_cache(maxsize=32)
def add(a, b):
return a + b
# These all produce the same cache key:
add(1, 2)
add(a=1, b=2)
add(1, b=2)
This normalization has a cost: _make_key runs on every call, cached or not. For functions with many arguments, this overhead can become significant relative to the function’s actual computation time.
Typed Mode
By default, lru_cache treats 3 (int) and 3.0 (float) as the same key because they compare equal in Python:
@lru_cache(maxsize=128)
def compute(x):
return x * 2
compute(3) # Computes and caches
compute(3.0) # Returns cached result from compute(3)
Setting typed=True forces separate cache entries for different types:
@lru_cache(maxsize=128, typed=True)
def compute(x):
return x * 2
compute(3) # Caches under int key
compute(3.0) # Separate entry, caches under float key
Use typed=True when your function behaves differently for different types (e.g., serialization functions) or when type correctness matters more than cache density.
Thread Safety
In CPython, lru_cache is thread-safe thanks to a C-level lock (not the GIL — there’s an actual dedicated lock). The lock is held during cache lookups and updates, meaning:
- Multiple threads can safely call the same cached function concurrently
- Under high contention, the lock becomes a bottleneck
- The function itself may execute multiple times for the same arguments if several threads call it simultaneously before any result is cached (no thundering-herd protection)
For write-heavy workloads with many unique arguments, consider per-thread caching or a concurrent dictionary instead.
Advanced Patterns
Caching with TTL
lru_cache has no built-in TTL. A common workaround uses a time-bucketing trick:
import time
from functools import lru_cache
def ttl_hash(seconds=300):
"""Return the same value within each time window."""
return round(time.time() / seconds)
@lru_cache(maxsize=128)
def get_config(key, _ttl_hash=None):
return load_config_from_disk(key)
# Usage — cache refreshes every 5 minutes
result = get_config("database_url", _ttl_hash=ttl_hash())
The _ttl_hash parameter changes value every 300 seconds, causing a cache miss and re-computation. Old entries get evicted naturally by the LRU policy.
Caching Instance Methods
Decorating a method with lru_cache directly causes memory leaks because self becomes part of the cache key, holding a strong reference to every instance:
class DataProcessor:
@lru_cache(maxsize=64) # DON'T — leaks instances
def process(self, data_id):
return heavy_computation(data_id)
The standard fix uses __hash__ and __eq__ on the class, or moves the cache outside:
class DataProcessor:
def __init__(self):
self.process = lru_cache(maxsize=64)(self._process)
def _process(self, data_id):
return heavy_computation(data_id)
Or use a module-level function that accepts the relevant data instead of self.
Combining with @property
Caching a property computation avoids repeated work on attribute access:
from functools import cached_property # Python 3.8+
class Report:
@cached_property
def summary(self):
return expensive_aggregation(self.data)
cached_property computes the value once, stores it as an instance attribute, and never recomputes. Unlike lru_cache, it has no size limit or eviction — it’s a one-shot cache per instance.
Benchmarking the Overhead
The overhead of lru_cache comes from key generation and dictionary lookup. For a function that executes in under 100 nanoseconds (like adding two numbers), the caching overhead actually makes it slower. The break-even point is roughly:
| Function cost | Cache benefit |
|---|---|
| < 1 μs | Overhead likely exceeds savings |
| 1–100 μs | Moderate benefit if hit rate is high |
| > 100 μs | Almost always beneficial |
| > 1 ms | Dramatic speedup possible |
Profile with cache_info() before assuming caching helps.
functools.cache vs lru_cache
Python 3.9 added @functools.cache, which is equivalent to @lru_cache(maxsize=None):
- No eviction: every unique call is cached forever
- Slightly faster per-call (no linked list maintenance)
- Best for pure functions with bounded input domains (e.g., mathematical computations with known ranges)
- Dangerous for functions called with unbounded unique inputs
Use @cache when you know the input space is finite. Use @lru_cache(maxsize=N) when you need bounded memory.
Real-World Performance
Instagram’s backend team reported that adding lru_cache to their Django template tag resolution reduced CPU usage by 12% on some endpoints. The pattern works especially well in request-response cycles where the same computations repeat across requests within a single process lifecycle.
Dropbox uses memoization extensively in their Python desktop client for file metadata lookups — computing hashes and checking modification times are expensive operations that benefit from per-session caching.
When Not to Use It
- Side-effectful functions — Functions that write to databases, send emails, or modify global state should never be cached. The cache would suppress repeated side effects.
- Functions with large return values — Caching 100MB DataFrames at
maxsize=128means potentially 12.8 GB of memory held in cache. - High cardinality inputs — If every call has unique arguments, the cache just wastes memory on entries that never get hit.
The one thing to remember: lru_cache is an O(1) memoization layer backed by a hash map and doubly-linked list — powerful for pure, deterministic functions with repeating inputs, but it requires careful attention to memory, mutability, and argument hashability in production.
See Also
- Python Algorithmic Complexity Understand Algorithmic Complexity through a practical analogy so your Python decisions become faster and clearer.
- Python Async Performance Tuning Making your async Python faster is like organizing a busy restaurant kitchen — it's all about flow.
- Python Benchmark Methodology Why timing Python code once means nothing, and how fair testing works like a science experiment.
- Python C Extension Performance How Python borrows C's speed for the hard parts — like hiring a specialist for the toughest job on the worksite.
- Python Caching Strategies Understand Python caching strategies with a shortcut-road analogy so your app gets faster without taking wrong turns.