Memoization Techniques — Deep Dive
Inside functools.lru_cache
CPython’s lru_cache uses a circular doubly-linked list for O(1) LRU eviction and a dictionary for O(1) lookups. When maxsize is None, it skips the linked list entirely and uses a plain dict — this is why @cache (unlimited) is faster than @lru_cache(maxsize=very_large_number).
from functools import lru_cache
@lru_cache(maxsize=256)
def compute(x):
return x ** 2
# Inspect internals
compute(10)
info = compute.cache_info()
print(info) # CacheInfo(hits=0, misses=1, maxsize=256, currsize=1)
# Clear cache
compute.cache_clear()
# Access the wrapped function directly
compute.__wrapped__(10) # bypasses cache
Thread safety
lru_cache is thread-safe in CPython — the GIL protects the cache dict during updates. However, this means two threads calling the same function with the same args simultaneously will both compute the result (cache-miss race). The second result overwrites the first, which is harmless for pure functions. For expensive computations, consider a lock-based approach.
Building a custom memoization decorator
Basic decorator
from functools import wraps
def memoize(func):
cache = {}
@wraps(func)
def wrapper(*args, **kwargs):
key = (args, tuple(sorted(kwargs.items())))
if key not in cache:
cache[key] = func(*args, **kwargs)
return cache[key]
wrapper.cache = cache
wrapper.cache_clear = cache.clear
return wrapper
TTL (time-to-live) memoization
When cached values should expire:
import time
from functools import wraps
def memoize_ttl(ttl_seconds=300):
def decorator(func):
cache = {}
@wraps(func)
def wrapper(*args, **kwargs):
key = (args, tuple(sorted(kwargs.items())))
now = time.monotonic()
if key in cache:
result, timestamp = cache[key]
if now - timestamp < ttl_seconds:
return result
result = func(*args, **kwargs)
cache[key] = (result, now)
return result
wrapper.cache = cache
wrapper.cache_clear = cache.clear
return wrapper
return decorator
@memoize_ttl(ttl_seconds=60)
def fetch_exchange_rate(currency):
# Expensive API call — cache for 60 seconds
import requests
resp = requests.get(f"https://api.example.com/rates/{currency}")
return resp.json()["rate"]
Size-limited dictionary cache
For when you want a simple max-size cache without LRU complexity:
from collections import OrderedDict
from functools import wraps
def memoize_maxsize(maxsize=1024):
def decorator(func):
cache = OrderedDict()
@wraps(func)
def wrapper(*args):
if args in cache:
cache.move_to_end(args) # mark as recently used
return cache[args]
result = func(*args)
cache[args] = result
if len(cache) > maxsize:
cache.popitem(last=False) # evict oldest
return result
wrapper.cache = cache
wrapper.cache_clear = cache.clear
return wrapper
return decorator
Memoization patterns for dynamic programming
Top-down with @lru_cache
The standard approach — write the recursive solution, then add caching:
from functools import lru_cache
@lru_cache(maxsize=None)
def edit_distance(s1, s2, i, j):
if i == 0:
return j
if j == 0:
return i
if s1[i-1] == s2[j-1]:
return edit_distance(s1, s2, i-1, j-1)
return 1 + min(
edit_distance(s1, s2, i-1, j), # delete
edit_distance(s1, s2, i, j-1), # insert
edit_distance(s1, s2, i-1, j-1) # replace
)
# Must pass strings (hashable), not lists
dist = edit_distance("kitten", "sitting", 6, 7)
Converting to bottom-up tabulation
Memoized recursion can always be converted to iterative tabulation. The process:
- Identify the state variables (the function arguments that change)
- Create a table with those dimensions
- Fill base cases
- Fill the table in an order where dependencies are already computed
def edit_distance_tabular(s1, s2):
m, n = len(s1), len(s2)
dp = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(m + 1):
dp[i][0] = i
for j in range(n + 1):
dp[0][j] = j
for i in range(1, m + 1):
for j in range(1, n + 1):
if s1[i-1] == s2[j-1]:
dp[i][j] = dp[i-1][j-1]
else:
dp[i][j] = 1 + min(
dp[i-1][j],
dp[i][j-1],
dp[i-1][j-1]
)
return dp[m][n]
Space optimization: rolling array
When the DP recurrence only depends on the previous row, you can use O(n) space instead of O(mn):
def edit_distance_optimized(s1, s2):
m, n = len(s1), len(s2)
prev = list(range(n + 1))
curr = [0] * (n + 1)
for i in range(1, m + 1):
curr[0] = i
for j in range(1, n + 1):
if s1[i-1] == s2[j-1]:
curr[j] = prev[j-1]
else:
curr[j] = 1 + min(prev[j], curr[j-1], prev[j-1])
prev, curr = curr, prev
return prev[n]
Profiling cache efficiency
Measuring hit rates
from functools import lru_cache
@lru_cache(maxsize=None)
def complex_calc(n):
if n <= 1:
return n
return complex_calc(n - 1) + complex_calc(n - 2) + complex_calc(n - 3)
complex_calc(100)
info = complex_calc.cache_info()
hit_rate = info.hits / (info.hits + info.misses) * 100
print(f"Hit rate: {hit_rate:.1f}%")
print(f"Cache size: {info.currsize}")
A high hit rate (>50%) confirms memoization is beneficial. A very low hit rate (<10%) suggests the function is rarely called with repeated arguments and memoization may be adding overhead without benefit.
Memory profiling
import sys
from functools import lru_cache
@lru_cache(maxsize=None)
def cached_func(n):
return list(range(n))
for i in range(1000):
cached_func(i)
# Rough memory estimate
# Each cache entry stores: key tuple + result + linked list node
cache_size = cached_func.cache_info().currsize
print(f"Entries: {cache_size}")
For precise memory measurement, use tracemalloc:
import tracemalloc
tracemalloc.start()
# Run memoized computations
for i in range(10000):
cached_func(i)
current, peak = tracemalloc.get_traced_memory()
print(f"Current: {current / 1024:.1f} KB")
print(f"Peak: {peak / 1024:.1f} KB")
tracemalloc.stop()
Advanced patterns
Memoization with mutable arguments
@lru_cache requires hashable arguments. For functions that take lists or dicts:
from functools import lru_cache
def solve_knapsack(weights, values, capacity):
# Convert to tuple for hashing
w_tuple = tuple(weights)
v_tuple = tuple(values)
@lru_cache(maxsize=None)
def dp(i, remaining):
if i == 0 or remaining == 0:
return 0
if w_tuple[i-1] > remaining:
return dp(i-1, remaining)
return max(
dp(i-1, remaining),
v_tuple[i-1] + dp(i-1, remaining - w_tuple[i-1])
)
return dp(len(weights), capacity)
Disk-based memoization
For results that should persist across program runs:
import json
import os
import hashlib
from functools import wraps
def disk_memoize(cache_dir=".cache"):
os.makedirs(cache_dir, exist_ok=True)
def decorator(func):
@wraps(func)
def wrapper(*args, **kwargs):
key = hashlib.sha256(
json.dumps((args, kwargs), sort_keys=True).encode()
).hexdigest()
cache_path = os.path.join(cache_dir, f"{func.__name__}_{key}.json")
if os.path.exists(cache_path):
with open(cache_path) as f:
return json.load(f)
result = func(*args, **kwargs)
with open(cache_path, "w") as f:
json.dump(result, f)
return result
return wrapper
return decorator
@disk_memoize(cache_dir="/tmp/my_cache")
def expensive_analysis(dataset_id):
# Takes 10 minutes — results cached to disk
pass
Selective cache invalidation
from functools import lru_cache
@lru_cache(maxsize=None)
def get_user_data(user_id):
# Fetch from database
pass
def update_user(user_id, data):
# Update database
save_to_db(user_id, data)
# Invalidate the entire cache (no selective invalidation in lru_cache)
get_user_data.cache_clear()
For selective invalidation, use a manual cache:
_user_cache = {}
def get_user_cached(user_id):
if user_id not in _user_cache:
_user_cache[user_id] = fetch_from_db(user_id)
return _user_cache[user_id]
def invalidate_user(user_id):
_user_cache.pop(user_id, None)
Benchmarks: memoization impact
import timeit
from functools import lru_cache
def fib_no_cache(n):
if n <= 1:
return n
return fib_no_cache(n - 1) + fib_no_cache(n - 2)
@lru_cache(maxsize=None)
def fib_cached(n):
if n <= 1:
return n
return fib_cached(n - 1) + fib_cached(n - 2)
# n=30: without cache ~0.5s, with cache ~0.00001s
# n=35: without cache ~5s, with cache ~0.00001s
# n=40: without cache ~60s, with cache ~0.00001s
The improvement is not a constant factor — it is a change in complexity class from O(2ⁿ) to O(n). This is why memoization is transformative, not just an optimization.
One thing to remember: Memoization converts exponential-time recursive algorithms into polynomial-time algorithms by eliminating redundant computation. In Python, @cache or @lru_cache gives you this power with a single line. Use it whenever you spot overlapping subproblems, but always verify the hit rate to confirm it is actually helping.
See Also
- Python Backtracking Algorithms How computers solve puzzles by trying every path, backing up when stuck, and never giving up until they find the answer.
- Python Big O Complexity Analysis Why some programs finish in a blink while others take forever — explained with pizza delivery and toy cleanup.
- Python Binary Search Implementation Find anything in a sorted list insanely fast using the same trick you already use with dictionaries and phone books.
- Python Dynamic Programming The clever trick of remembering answers you already figured out so you never solve the same puzzle twice.
- Python Graph Algorithms How computers navigate maps, friendships, and connections using dots and lines — explained without any math.