Python Caching with lru_cache — Core Concepts

How Python's functools.lru_cache works under the hood, when to use it, and the traps that catch most developers off guard.

What lru_cache Actually Does

functools.lru_cache is a decorator that wraps a function and stores its return values in a dictionary keyed by the function’s arguments. When the function is called with arguments it has already seen, the cached result is returned directly — no re-execution.
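As a minimal sketch of that behavior, the snippet below counts how often the function body actually runs. The names square and call_count are made up for illustration.

```python
from functools import lru_cache

call_count = 0  # hypothetical counter: how many times the body really executes


@lru_cache(maxsize=128)
def square(n):
    global call_count
    call_count += 1
    return n * n


first = square(4)   # miss: the body runs and the result is stored
second = square(4)  # hit: the stored result is returned, the body is skipped

print(first, second, call_count)
```

The second call returns the same value without incrementing the counter, which is the whole point of memoization.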

The cache uses an LRU (Least Recently Used) eviction strategy: once the cache reaches its size limit, the entry that hasn’t been accessed for the longest time gets discarded to make room.
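Eviction is easiest to see with a deliberately tiny cache. The following sketch assumes a hypothetical load function and a maxsize of 2, and records which keys actually trigger a computation.

```python
from functools import lru_cache

computed = []  # hypothetical log of keys that were actually computed


@lru_cache(maxsize=2)
def load(key):
    computed.append(key)
    return key.upper()


load("a")  # miss
load("b")  # miss, cache now holds "a" and "b"
load("a")  # hit, "a" becomes the most recently used entry
load("c")  # miss, cache is full so "b" (least recently used) is evicted
load("a")  # hit, "a" survived the eviction
load("b")  # miss, "b" has to be recomputed

print(computed)
print(load.cache_info())
```

Note that "b" is computed twice: it was the least recently used entry when "c" arrived.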

How to Use It

The basic usage is a single decorator line:

from functools import lru_cache

@lru_cache(maxsize=128)
def expensive_lookup(user_id):
    # Imagine a slow database call or complex calculation.
    # `database` is a placeholder for your own data-access object.
    return database.fetch(user_id)

The maxsize parameter controls how many unique input-output pairs to store. The default is 128. Setting maxsize=None creates an unbounded cache that never evicts — useful for pure mathematical functions but dangerous for functions with millions of possible inputs.

Python 3.9 introduced @functools.cache as a shorthand for @lru_cache(maxsize=None).
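A quick way to convince yourself the two spellings behave the same is to compare their cache statistics. This is only a sketch, and the cube functions are invented for the example.

```python
from functools import cache, lru_cache  # functools.cache was added in Python 3.9


@cache
def cube(n):
    return n ** 3


@lru_cache(maxsize=None)
def cube_explicit(n):
    return n ** 3


cube(3)
cube_explicit(3)

# Both report maxsize=None, meaning nothing is ever evicted.
print(cube.cache_info())
print(cube_explicit.cache_info())
```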

When It Shines

Recursive algorithms — Fibonacci, dynamic programming, and tree traversals benefit enormously. A naive recursive Fibonacci runs in O(2ⁿ) time; with lru_cache, it becomes O(n).
Repeated API or database calls — If the same queries are made frequently and the data doesn’t change often.
Configuration lookups — Values that are computed once from environment variables or config files.
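The recursive case from the list above can be sketched as follows. An unbounded cache is used here because the set of inputs is small and known in advance.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    # Each distinct n is computed once; repeated subproblems are cache hits.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


result = fib(30)
info = fib.cache_info()

print(result)       # 832040
print(info.misses)  # 31, one per distinct n from 0 to 30
```

Very deep recursions can still hit Python's recursion limit on a cold cache, so this pattern suits moderate values of n.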

Inspecting the Cache

Every lru_cache-wrapped function gets a .cache_info() method that returns hit/miss statistics and current size. This is incredibly useful for tuning maxsize:

print(expensive_lookup.cache_info())
# CacheInfo(hits=47, misses=12, maxsize=128, currsize=12)

A high hit rate means caching is effective. If your hit rate is near zero, either the function is rarely called with repeating arguments, or maxsize is too small and entries are evicted before they can be reused.
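The hit rate is not reported directly, but it can be derived from the hits and misses fields. A small sketch, with a made-up normalize function and input list:

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def normalize(text):
    return text.strip().lower()


for word in ["A", "b", "A", "A", "c", "b"]:
    normalize(word)

info = normalize.cache_info()
total_calls = info.hits + info.misses
# Guard against division by zero when the function has not been called yet.
hit_rate = info.hits / total_calls if total_calls else 0.0

print(f"hit rate: {hit_rate:.0%}")
```

Here three of the six calls repeat an earlier argument, so half of the calls are served from the cache.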

You can also call .cache_clear() to reset the cache entirely — useful during testing or when the underlying data changes.
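A short sketch of what a reset looks like, using a throwaway double function. Clearing drops the stored entries and also zeroes the statistics.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def double(n):
    return n * 2


double(1)
double(1)
before = double.cache_info()  # one miss, one hit, one stored entry

double.cache_clear()
after = double.cache_info()   # entries and counters are both reset

print(before)
print(after)
```

In a test suite, calling cache_clear() in a setup or teardown step keeps one test's cached values from leaking into the next.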

The Traps

Unhashable arguments kill it immediately. Since the cache is a dictionary, all arguments must be hashable. Passing a list or dictionary as an argument raises a TypeError. Convert lists to tuples first if you need to cache functions that conceptually take sequences.
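The failure and the usual workaround look roughly like this. The total function is hypothetical; the key point is converting the list at the call site.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def total(values):
    # `values` becomes part of the cache key, so it must be hashable.
    return sum(values)


raised = False
try:
    total([1, 2, 3])  # a list is unhashable
except TypeError:
    raised = True

result = total(tuple([1, 2, 3]))  # a tuple of hashable items works

print(raised, result)
```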

Mutable return values are dangerous. The cache stores the actual object, not a copy. If the caller modifies the returned value, the cache itself gets corrupted. Return immutable types (tuples, frozensets, namedtuples) or make defensive copies.
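To make the corruption concrete, here is a minimal sketch with two invented functions: one returns a list, the other a tuple.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def default_tags(kind):
    return ["new", kind]  # a mutable list ends up in the cache


tags = default_tags("post")
tags.append("edited")            # this mutates the cached object itself
polluted = default_tags("post")  # later callers see the modification


@lru_cache(maxsize=128)
def default_tags_safe(kind):
    return ("new", kind)  # a tuple cannot be modified in place


print(polluted)
print(default_tags_safe("post"))
```

If callers genuinely need a list, one option is to cache the tuple and build a fresh list from it at the call site.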

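Memory grows silently. With maxsize=None, every unique combination of arguments adds an entry that stays in memory until cache_clear() is called or the process exits. In a web server processing millions of unique user IDs, memory use keeps climbing until the process runs out of memory or is killed.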
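The difference between a bounded and an unbounded cache shows up directly in currsize. This sketch uses invented function names and a loop of 10,000 IDs as a stand-in for real traffic.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def label_unbounded(user_id):
    return f"user-{user_id}"


@lru_cache(maxsize=1000)
def label_bounded(user_id):
    return f"user-{user_id}"


for user_id in range(10_000):
    label_unbounded(user_id)
    label_bounded(user_id)

unbounded_size = label_unbounded.cache_info().currsize  # one entry per unique ID
bounded_size = label_bounded.cache_info().currsize      # capped at maxsize

print(unbounded_size, bounded_size)
```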

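It’s per-process only. Within a single process, all threads share the same cache, and lru_cache is thread-safe, so concurrent calls won’t corrupt it. However, if two threads call the function with the same new arguments at nearly the same time, the function may run more than once before the result is cached. The cache is never shared across processes. For distributed caching, you need an external store such as Redis or Memcached.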
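Within one process, a value cached by one thread is visible to the others. The sketch below (function and worker names are made up) fills the cache from the main thread and then reads it from a second thread.

```python
import threading
from functools import lru_cache


@lru_cache(maxsize=128)
def slow_square(n):
    return n * n


slow_square(7)  # main thread: miss, result is stored


def worker():
    slow_square(7)  # worker thread: hit, served from the same cache


thread = threading.Thread(target=worker)
thread.start()
thread.join()

info = slow_square.cache_info()
print(info)
```

A separate process started with multiprocessing would begin with its own empty cache instead.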

Common Misconception

Many developers think lru_cache replaces application-level caching systems. It doesn’t. It’s a local, in-memory, single-process optimization. It has no TTL (time-to-live), no invalidation signals, and no persistence. It’s perfect for pure functions and short-lived computations — not for caching user sessions or database query results that change.
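The lack of invalidation is easy to reproduce. In this sketch a plain dictionary named prices stands in for a data source that changes; both names are hypothetical.

```python
from functools import lru_cache

prices = {"apple": 100}  # stand-in for a database row that can change


@lru_cache(maxsize=128)
def get_price(item):
    return prices[item]


first = get_price("apple")  # reads the source and caches 100
prices["apple"] = 120       # the underlying data changes
stale = get_price("apple")  # still 100: the cache knows nothing about the update

get_price.cache_clear()     # manual invalidation is the only built-in option
fresh = get_price("apple")  # re-reads the source

print(first, stale, fresh)
```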

The one thing to remember: lru_cache is the fastest way to add memoization to a Python function, but it only works well when your inputs are hashable, your outputs are immutable, and your cache won’t grow unbounded.
