functools Module — Deep Dive

Advanced functools internals: lru_cache implementation, singledispatch method support, cached_property threading, and production optimization patterns

lru_cache Internals

The C implementation of lru_cache uses a circular doubly-linked list for O(1) access ordering, combined with a dictionary for O(1) lookup:

Dictionary: {args_key: linked_list_node}
Linked List: HEAD ↔ node1 ↔ node2 ↔ ... ↔ TAIL
              (newest)                    (oldest)

When a cached function is called:

1. The arguments are converted to a hashable key: the positional args, then a sentinel followed by the keyword items if any keywords were passed, then the argument types if typed=True.
2. If the key exists in the dict: move its node to the head (most recently used) and return the cached value.
3. If not: call the function, insert a new node at the head, and evict the tail if the cache is at capacity.
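
The steps above can be sketched in pure Python. This is a simplified model, not CPython's code: it uses collections.OrderedDict as a stand-in for the dict plus hand-written linked list, the name simple_lru is hypothetical, and it ignores keyword arguments and thread safety.

```python
from collections import OrderedDict
from functools import wraps

def simple_lru(maxsize):
    """Simplified LRU decorator (hypothetical name; positional args only)."""
    def decorator(func):
        cache = OrderedDict()  # stands in for the dict + linked list pair

        @wraps(func)
        def wrapper(*args):
            if args in cache:
                cache.move_to_end(args)      # hit: mark as most recently used
                return cache[args]
            result = func(*args)             # miss: compute the value
            cache[args] = result             # insert as most recently used
            if len(cache) > maxsize:
                cache.popitem(last=False)    # evict the least recently used
            return result

        wrapper.cache = cache  # exposed only for inspection in this sketch
        return wrapper
    return decorator

@simple_lru(maxsize=2)
def square(n):
    return n * n

square(1); square(2); square(1); square(3)   # 2 is evicted, 1 was refreshed
print(list(square.cache))                    # [(1,), (3,)]
```

Here the most recently used entry sits at the end of the OrderedDict rather than at the head, but the eviction order is the same.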

Key creation is a significant part of the per-call overhead. On every call, hit or miss, the key must be built and all arguments hashed:

# What happens internally for cache key creation:
# f(1, 2, x=3) → key = (1, 2, _KWD_MARK, 'x', 3)

When lru_cache Breaks

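Arguments must be hashable. The decorator itself applies without complaint; the error appears when the function is called with an unhashable argument such as a list:

from functools import lru_cache

@lru_cache
def process(data: list):
    return sum(data)  # process([1, 2, 3]) raises TypeError: unhashable type: 'list'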

Workaround — convert to a hashable type:

@lru_cache
def process(data: tuple):  # tuples are hashable
    return sum(data)

process(tuple([1, 2, 3]))
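
If callers routinely pass lists, the conversion can be moved into a small wrapper. The sketch below is one possible approach, not a functools feature: freeze_args and _freeze are hypothetical names, and only top-level list and set arguments are converted.

```python
from functools import lru_cache, wraps

def _freeze(arg):
    """Convert common unhashable containers to hashable equivalents."""
    if isinstance(arg, list):
        return tuple(arg)
    if isinstance(arg, set):
        return frozenset(arg)
    return arg

def freeze_args(func):
    """Hypothetical helper: freeze positional args before the cache sees them."""
    @wraps(func)
    def wrapper(*args):
        return func(*(_freeze(a) for a in args))
    wrapper.cache_info = func.cache_info   # re-expose the cache statistics
    return wrapper

@freeze_args
@lru_cache(maxsize=128)
def total(data):
    return sum(data)

print(total([1, 2, 3]))   # 6
print(total([1, 2, 3]))   # 6, served from the cache
```

The conversion copies the list on every call, so this only pays off when the cached function is much more expensive than the copy.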

Thread Safety

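lru_cache is safe to call from multiple threads: concurrent calls will not corrupt the cache’s internal state. The pure-Python fallback guards its bookkeeping with a reentrant lock, while the C implementation in standard builds relies on the GIL. The lock is not held while the wrapped function runs, so on a cache miss several threads can execute the function at the same time for the same arguments. If the function is expensive or has side effects, this matters: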

@lru_cache(maxsize=100)
def get_user(user_id):
    # Multiple threads might call this simultaneously for the same user_id
    # Only one result gets cached; others are discarded
    return database.fetch_user(user_id)
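
One way to avoid duplicate work on a miss is to serialize callers per key. The following is a minimal sketch under stated assumptions: slow_fetch stands in for database.fetch_user, the names _key_locks and _cached_fetch are hypothetical, and the lock table is never pruned.

```python
import threading
from collections import defaultdict
from functools import lru_cache

_key_locks = defaultdict(threading.Lock)   # one lock per user_id
_key_locks_guard = threading.Lock()        # protects the lock table itself
fetch_count = 0                            # counts real fetches

def slow_fetch(user_id):
    """Stand-in for database.fetch_user (hypothetical)."""
    global fetch_count
    fetch_count += 1
    return {"id": user_id}

@lru_cache(maxsize=100)
def _cached_fetch(user_id):
    return slow_fetch(user_id)

def get_user(user_id):
    with _key_locks_guard:
        lock = _key_locks[user_id]
    with lock:   # callers for the same key wait here; other keys proceed
        return _cached_fetch(user_id)

threads = [threading.Thread(target=get_user, args=(42,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(fetch_count)   # 1
```

The trade-off is that cache hits also acquire a lock, so this is worth it only when a duplicate miss is clearly more expensive than the extra locking.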

Cache Invalidation

@lru_cache(maxsize=256)
def get_config(key):
    return load_from_file(key)

# Clear entire cache
get_config.cache_clear()

# Inspect cache
info = get_config.cache_info()
# CacheInfo(hits=45, misses=12, maxsize=256, currsize=12)

There’s no public API for invalidating a single key; cache_clear() is all-or-nothing. If you need per-key invalidation, use a dictionary-based cache instead.
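
A dictionary-based cache with per-key invalidation might look like the sketch below. The decorator name invalidatable_cache and its invalidate attribute are hypothetical; the cache is unbounded, handles positional arguments only, and is not thread-safe.

```python
from functools import wraps

def invalidatable_cache(func):
    """Hypothetical decorator: dict cache with per-key invalidation."""
    cache = {}

    @wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    def invalidate(*args):
        cache.pop(args, None)   # ignore keys that are not cached

    wrapper.invalidate = invalidate
    wrapper.cache_clear = cache.clear
    return wrapper

calls = []

@invalidatable_cache
def get_config(key):
    calls.append(key)           # record real loads for demonstration
    return f"value-of-{key}"

get_config("a"); get_config("b"); get_config("a")
get_config.invalidate("a")      # drop only the entry for "a"
get_config("a"); get_config("b")
print(calls)                    # ['a', 'b', 'a']
```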

cached_property: One-Time Computation

cached_property computes a value once and stores it as an instance attribute:

from functools import cached_property

class Dataset:
    def __init__(self, path):
        self.path = path

    @cached_property
    def data(self):
        """Loaded once, then stored as self.data."""
        print("Loading...")
        with open(self.path) as f:
            return f.read()

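Unlike @property with manual caching, cached_property stores the computed value in the instance’s __dict__ under the same name on first access. Because cached_property is a non-data descriptor (it defines no __set__), the instance attribute takes precedence on every later lookup, so subsequent accesses skip the getter entirely. This requires a writable instance __dict__, so it does not work with classes that define __slots__ without including __dict__.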
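
A short check of where the cached value lives, and how to reset it. The Report class and its loads counter are illustrative names only.

```python
from functools import cached_property

class Report:
    def __init__(self):
        self.loads = 0

    @cached_property
    def data(self):
        self.loads += 1          # count how often the getter really runs
        return [1, 2, 3]

r = Report()
print("data" in vars(r))   # False: nothing cached yet
r.data
r.data
print(r.loads)             # 1
print("data" in vars(r))   # True: the value is in the instance __dict__
del r.data                 # deleting the attribute clears the cached value
r.data
print(r.loads)             # 2
```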

Threading Caveat

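Since Python 3.12, cached_property no longer takes a lock. The lock used up to 3.11 was undocumented and shared by all instances of the class, which caused needless contention. As a result, when several threads race on first access, the getter may run more than once. If that is a problem, add your own locking inside the getter: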

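import threading
from functools import cached_property

class ThreadSafeDataset:
    def __init__(self):
        # Per-instance lock: a class-level lock would serialize all instances
        self._lock = threading.Lock()

    @cached_property
    def data(self):
        with self._lock:
            # Double-check: another thread might have set it while we waited
            if 'data' in self.__dict__:
                return self.__dict__['data']
            value = expensive_load()
            # Publish before releasing the lock; cached_property stores the
            # result only after this method returns, too late for waiters
            self.__dict__['data'] = value
            return value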

singledispatch: Advanced Patterns

Method dispatch with singledispatchmethod

singledispatch works on functions. For methods, use singledispatchmethod (Python 3.8+):

from functools import singledispatchmethod

class Serializer:
    @singledispatchmethod
    def serialize(self, value):
        raise TypeError(f"Cannot serialize {type(value)}")

    @serialize.register
    def _(self, value: str):
        return f'"{value}"'

    @serialize.register
    def _(self, value: int):
        return str(value)

    @serialize.register
    def _(self, value: list):
        items = ", ".join(self.serialize(v) for v in value)
        return f"[{items}]"

Registration with type annotations

Since Python 3.7, you can register using type annotations instead of explicit type arguments:

from functools import singledispatch

@singledispatch
def process(value):
    raise TypeError(f"Unsupported: {type(value)}")

@process.register
def _(value: int):
    return value * 2

@process.register
def _(value: str):
    return value.upper()

Union types and abstract classes

from collections.abc import Mapping, Sequence

@process.register(Mapping)
def _(value):
    return {k: process(v) for k, v in value.items()}

@process.register(Sequence)
def _(value):
    return [process(v) for v in value]

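singledispatch works with ABCs and their virtual subclasses: when no implementation is registered for the exact type, it walks the type’s MRO, extended with any matching registered ABCs, and picks the most specific match. Since Python 3.11, register() also accepts union annotations such as int | float, which register the same implementation for each member type.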
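
The lookup order can be observed directly. In this small sketch, describe is an illustrative function; dispatch() is the documented way to ask which implementation a given type would use.

```python
from collections.abc import Sequence
from functools import singledispatch

@singledispatch
def describe(value):
    return "object"

@describe.register
def _(value: int):
    return "int"

@describe.register(Sequence)
def _(value):
    return "sequence"

@describe.register
def _(value: str):
    return "str"

print(describe(True))      # int: bool falls back to int through the MRO
print(describe((1, 2)))    # sequence: tuple is a registered Sequence
print(describe("hi"))      # str: the exact type beats the Sequence ABC
print(describe(3.5))       # object: no match, the base function runs
print(describe.dispatch(bool) is describe.dispatch(int))   # True
```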

partial: Under the Hood

partial creates a functools.partial object (implemented in C) that stores the wrapped function, frozen args, and frozen kwargs:

from functools import partial

def connect(host, port, timeout=30):
    print(f"Connecting to {host}:{port} (timeout={timeout})")

local_connect = partial(connect, "localhost", timeout=5)

# Inspect the partial
print(local_connect.func)      # <function connect at 0x...>
print(local_connect.args)      # ('localhost',)
print(local_connect.keywords)  # {'timeout': 5}

local_connect(8080)  # Connecting to localhost:8080 (timeout=5)
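
Frozen keyword arguments are defaults, not constants: a keyword passed at call time overrides the stored one for that call only. A short illustration, with connect rewritten to return a string so the result is easy to inspect:

```python
from functools import partial

def connect(host, port, timeout=30):
    return f"{host}:{port} (timeout={timeout})"

local_connect = partial(connect, "localhost", timeout=5)

print(local_connect(8080))               # localhost:8080 (timeout=5)
print(local_connect(8080, timeout=60))   # localhost:8080 (timeout=60)
print(local_connect.keywords)            # {'timeout': 5}
```

Frozen positional arguments, by contrast, cannot be overridden; extra positional arguments are appended after them.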

partialmethod for descriptors

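partialmethod works inside class definitions where partial does not behave like a method: a partial stored as a class attribute is not bound to the instance, so self is never passed. (Python 3.13 emits a FutureWarning for this use of partial, since its behavior is expected to change in a future version.)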

from functools import partialmethod

class Connection:
    def set_state(self, state, reason=""):
        self.state = state
        self.reason = reason

    connect = partialmethod(set_state, "connected")
    disconnect = partialmethod(set_state, "disconnected")

conn = Connection()
conn.connect(reason="user request")
print(conn.state)   # "connected"
print(conn.reason)  # "user request"

reduce: When It’s Actually Useful

Beyond simple aggregation, reduce is handy for walking nested structures and composing functions:

from functools import reduce

# Deep dictionary access
def deep_get(d, keys):
    return reduce(lambda obj, key: obj[key], keys, d)

config = {"database": {"primary": {"host": "db.example.com"}}}
deep_get(config, ["database", "primary", "host"])  # "db.example.com"

# Function composition (right to left: the last function runs first)
def compose(*funcs):
    return reduce(lambda f, g: lambda x: f(g(x)), funcs)

transform = compose(str.upper, str.strip)
transform("  hello ")  # "HELLO"

reduce with operator module

from functools import reduce
from operator import mul, or_

# Product of a list
reduce(mul, [1, 2, 3, 4, 5])  # 120

# Bitwise OR of flags
reduce(or_, [0x01, 0x02, 0x08])  # 0x0B (11)
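
Both calls above would fail on an empty list, because reduce has nothing to start from. Passing an initial value as the third argument covers that case; the product helper below is an illustrative name.

```python
from functools import reduce
from operator import mul

def product(numbers):
    # The third argument seeds the reduction and is the result for empty input
    return reduce(mul, numbers, 1)

print(product([2, 3, 4]))   # 24
print(product([]))          # 1

try:
    reduce(mul, [])         # no initial value and nothing to reduce
except TypeError as exc:
    print(type(exc).__name__)   # TypeError
```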

cmp_to_key: Legacy Compatibility

Converts old-style comparison functions (returning a negative number, zero, or a positive number) to key functions for sorted():

from functools import cmp_to_key

def compare_versions(a, b):
    a_parts = list(map(int, a.split(".")))
    b_parts = list(map(int, b.split(".")))
    if a_parts < b_parts:
        return -1
    elif a_parts > b_parts:
        return 1
    return 0

versions = ["1.2.3", "1.0.0", "2.1.0", "1.2.1"]
sorted(versions, key=cmp_to_key(compare_versions))
# ['1.0.0', '1.2.1', '1.2.3', '2.1.0']
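
For version strings, a plain key function gives the same order with less code. cmp_to_key earns its place when the order depends on pairs of items and no per-item key exists. The sketch below shows both; by_concatenation is an illustrative comparison that arranges digit strings to form the largest concatenated number.

```python
from functools import cmp_to_key

# A per-item key is enough for version strings
versions = ["1.2.3", "1.0.0", "2.1.0", "1.2.1"]
print(sorted(versions, key=lambda v: tuple(int(p) for p in v.split("."))))
# ['1.0.0', '1.2.1', '1.2.3', '2.1.0']

# A pairwise rule has no simple key, so cmp_to_key is the natural fit
def by_concatenation(a, b):
    """Order a before b when a + b forms the larger number."""
    if a + b > b + a:
        return -1
    if a + b < b + a:
        return 1
    return 0

parts = ["3", "30", "34", "5", "9"]
ordered = sorted(parts, key=cmp_to_key(by_concatenation))
print("".join(ordered))   # 9534330
```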

Performance Patterns

Warming the cache

For web applications, pre-fill caches at startup:

@lru_cache(maxsize=1000)
def get_template(name):
    return load_and_compile_template(name)

def warm_caches():
    """Call at application startup."""
    for name in get_all_template_names():
        get_template(name)

Typed caching

lru_cache(typed=True) treats 3 and 3.0 as different arguments:

@lru_cache(typed=True)
def process(value):
    return type(value).__name__

process(3)    # "int" — cached separately
process(3.0)  # "float" — different cache entry

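Without typed=True, arguments that compare equal and hash equal, such as 3 and 3.0, are usually treated as the same call and share one cache entry. This is not guaranteed: CPython may still cache some types such as int and str separately, for example when the function takes a single int or str argument.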
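
The difference shows up in cache_info(). The sketch below uses two positional arguments on purpose, because CPython keys a lone int or str argument through a shortcut that can separate 3 from 3.0 even without typed=True. The function names untyped and typed are illustrative.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def untyped(a, b):
    return type(a).__name__

@lru_cache(maxsize=None, typed=True)
def typed(a, b):
    return type(a).__name__

print(untyped(3, 0), untyped(3.0, 0))   # int int (second call is a hit)
print(typed(3, 0), typed(3.0, 0))       # int float
print(untyped.cache_info().currsize)    # 1
print(typed.cache_info().currsize)      # 2
```

The untyped version returns the stale "int" for 3.0, which is exactly the kind of surprise typed=True prevents when the result depends on the argument type.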

Monitoring cache effectiveness

import atexit
from functools import lru_cache

@lru_cache(maxsize=512)
def expensive_query(sql):
    return db.execute(sql)

def report_cache_stats():
    info = expensive_query.cache_info()
    total = info.hits + info.misses
    hit_rate = info.hits / total if total else 0.0  # no calls yet: report 0%
    print(f"Cache hit rate: {hit_rate:.1%} ({info.currsize}/{info.maxsize} entries)")

atexit.register(report_cache_stats)

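A low hit rate (say, below 50%) suggests that maxsize is too small or that the inputs are too varied for caching to help. Compare currsize with maxsize to tell the two apart: a full cache with many misses points to the former, a mostly empty one to the latter.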

One thing to remember: the hot paths of functools (lru_cache, partial, reduce, cmp_to_key) have C implementations in CPython, while singledispatch and cached_property are pure Python. lru_cache uses a linked-list + dict combo for O(1) operations, partial creates lean C-level callable wrappers, and singledispatch builds a type→function registry with MRO-aware lookup. Master the internals and you’ll know when each tool helps, and when a simpler approach is better.
