Weak References — Deep Dive

CPython’s Weak Reference Internals

Every weakref-capable object in CPython carries a __weakref__ slot: a pointer to the head of a doubly-linked list of the weak reference objects that target it. When the object is deallocated, CPython walks this list and clears each weak reference, so that calling the reference afterwards returns None.

The C struct for a basic weak reference:

typedef struct _PyWeakReference {
    PyObject_HEAD
    PyObject *wr_object;     /* The referenced object */
    PyObject *wr_callback;   /* Callback or NULL */
    Py_hash_t hash;          /* Cached hash of referent */
    struct _PyWeakReference *wr_prev;
    struct _PyWeakReference *wr_next;
} PyWeakReference;

The doubly-linked list (prev/next) allows O(1) removal. When you delete a weak reference, it unlinks itself. When the referent dies, CPython iterates the list, clears all references, and calls callbacks.
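
The list bookkeeping can be observed from Python. The following is a minimal sketch that assumes CPython's reference counting, where del frees an object immediately. Target is a placeholder class, and the no-op lambda callbacks are there only to force distinct weak reference objects, since CPython reuses a single callback-free reference per object.

```python
import weakref

class Target:
    pass

t = Target()

# Two references with callbacks are two separate nodes in the object's list.
doomed = weakref.ref(t, lambda ref: None)
survivor = weakref.ref(t, lambda ref: None)
count_with_two = weakref.getweakrefcount(t)

# Dropping a weak reference unlinks it from the referent's list.
del doomed
count_after_del = weakref.getweakrefcount(t)

# Destroying the referent clears every reference still on the list.
del t
print(count_with_two, count_after_del, survivor())
```

On CPython the two counts should differ by one, and the surviving reference should return None once the referent is gone.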

Callback Execution Order

When an object is destroyed, its weak reference callbacks run in reverse order of registration, from the most recently registered to the oldest. The weakref documentation states this ordering explicitly:

import weakref

class Target:
    pass

t = Target()
refs = []

for i in range(3):
    def cb(ref, i=i):
        print(f"Callback {i}")
    refs.append(weakref.ref(t, cb))

del t
# Output (one line per callback): Callback 2, Callback 1, Callback 0

Critical detail: Callbacks receive the weak reference object (already dead — calling it returns None), not the original object. You cannot resurrect the object in a callback.
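
A small sketch of that detail, assuming CPython's immediate deallocation on del. The names Target, seen, and on_dead are illustrative only.

```python
import weakref

class Target:
    pass

seen = []

def on_dead(ref):
    # The argument is the weak reference itself; its referent is already gone,
    # so dereferencing it yields None rather than the original object.
    seen.append(ref())

t = Target()
r = weakref.ref(t, on_dead)  # keep r alive, or the callback never fires
del t
print(seen)
```

The callback has nothing to hold on to: any state it needs has to be captured separately (for example through default arguments or a closure), not through the dying object.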

WeakValueDictionary Implementation

WeakValueDictionary stores KeyedRef objects — weak references that know their own dictionary key. When the referent dies, the callback removes the entry from the dictionary:

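# Simplified internal logic
import weakref

class KeyedRef(weakref.ref):
    __slots__ = ('key',)

    def __new__(cls, ob, callback, key):
        self = super().__new__(cls, ob, callback)
        self.key = key
        return self

    def __init__(self, ob, callback, key):
        # weakref.ref.__init__ accepts only (ob, callback), so drop key here
        super().__init__(ob, callback)

class WeakValueDictionary:
    def __init__(self):
        self.data = {}  # key -> KeyedRef

    def __setitem__(self, key, value):
        def remove(wr, selfref=weakref.ref(self)):
            self = selfref()
            if self is not None:
                del self.data[wr.key]
        self.data[key] = KeyedRef(value, remove, key)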

Notice the nested weak reference: the callback itself holds a weak reference to the dictionary. This prevents the common bug where a callback keeps the dictionary alive after it should be collected.
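
Both behaviours can be checked against the real weakref.WeakValueDictionary. This is a sketch that assumes CPython's reference counting; Resource is a hypothetical value class.

```python
import weakref

class Resource:
    def __init__(self, name):
        self.name = name

registry = weakref.WeakValueDictionary()

# 1. An entry disappears as soon as its value loses its last strong reference.
conn = Resource("db")
registry["db"] = conn
present_before = "db" in registry
del conn
present_after = "db" in registry

# 2. A live value does not keep the dictionary alive, because the removal
#    callback only holds a weak reference back to the dictionary.
orphan = Resource("cache")
registry["cache"] = orphan
registry_ref = weakref.ref(registry)
del registry
dict_collected = registry_ref() is None

print(present_before, present_after, dict_collected)
```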

Iteration Pitfalls

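Entries in a WeakValueDictionary can disappear at any moment, including in the middle of a loop, and a plain dict that shrinks during iteration raises RuntimeError: dictionary changed size during iteration. To avoid this, the public iteration methods guard the loop (in many CPython releases through an internal _IterationGuard helper) and defer removals until iteration finishes: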

cache = weakref.WeakValueDictionary()

# Safe: iteration guard defers cleanup
for key, value in cache.items():
    process(value)

# Dangerous: manual iteration without guard
for key in list(cache.data.keys()):
    ref = cache.data[key]
    obj = ref()  # Might be None if collected between lines

Always use the public API (items(), values(), keys()) — never access .data directly.
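
When the loop body is slow or may drop references itself, one option is to snapshot the public view first. The sketch below assumes CPython's reference counting; Item is a hypothetical value class.

```python
import weakref

class Item:
    def __init__(self, n):
        self.n = n

cache = weakref.WeakValueDictionary()
items = [Item(i) for i in range(3)]
for it in items:
    cache[it.n] = it

# list(...) over the public view yields (key, value) pairs holding strong
# references, so nothing in the snapshot can vanish while it is processed.
snapshot = list(cache.items())
del items, it                      # the original owners go away

total = sum(value.n for _key, value in snapshot)
alive_during = len(cache)          # snapshot still keeps every value alive

del snapshot
alive_after = len(cache)           # entries are gone once the snapshot is
print(total, alive_during, alive_after)
```

The trade-off is that the snapshot extends each value's lifetime until it is released, so it should be kept short-lived.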

WeakKeyDictionary: The Reverse Pattern

While WeakValueDictionary lets values be collected, WeakKeyDictionary lets keys be collected. When a key object dies, the entire entry is removed.

Use case: attaching metadata to objects without preventing their cleanup:

metadata = weakref.WeakKeyDictionary()

class Widget:
    pass

w = Widget()
metadata[w] = {"created_at": "2026-01-15", "clicks": 0}

del w  # Entry removed from metadata automatically

This is how some debugging tools and profilers attach information to objects without leaking.
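
A quick check of the pattern and of its main constraint: keys must support weak references, which many built-in types such as int do not. This is a sketch assuming CPython's immediate deallocation on del.

```python
import weakref

class Widget:
    pass

metadata = weakref.WeakKeyDictionary()

w = Widget()
metadata[w] = {"clicks": 0}
count_before = len(metadata)
del w                               # last strong reference to the key
count_after = len(metadata)

# Built-in ints cannot be weakly referenced, so they are rejected as keys.
try:
    metadata[42] = {"clicks": 0}
    int_key_ok = True
except TypeError:
    int_key_ok = False

print(count_before, count_after, int_key_ok)
```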

weakref.finalize: Robust Cleanup

weakref.finalize is superior to __del__ for several reasons:

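  1. Ordering: at interpreter exit, finalizers that are still alive run in reverse order of creation (LIFO), so creation order determines cleanup order
  2. No resurrection: __del__ receives self, which can accidentally keep the object alive; a finalize callback never receives the object
  3. Atexit integration: finalizers still alive at interpreter shutdown are run then, unless their atexit attribute is set to False
  4. Idempotent and cancellable: calling a finalizer runs its callback at most once, and detach() cancels it without running it

class TempFile: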
    def __init__(self, path):
        self.path = path
        self._finalizer = weakref.finalize(
            self, self._cleanup, path
        )

    @staticmethod
    def _cleanup(path):
        import os
        if os.path.exists(path):
            os.unlink(path)
            print(f"Cleaned up {path}")

    def close(self):
        self._finalizer()  # Run cleanup now

    @property
    def alive(self):
        return self._finalizer.alive

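The cleanup function is a static method that takes the path as an argument, so it never references self. This matters: finalize holds strong references to the callback and its arguments, so passing a bound method that captures self (or passing self as an argument) would keep the object alive and the finalizer would not run until interpreter exit.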
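
The finalizer object's own API can be exercised in isolation. A minimal sketch: Handle and log are illustrative names, and log.append stands in for a real cleanup function.

```python
import weakref

class Handle:
    pass

log = []

h = Handle()
fin = weakref.finalize(h, log.append, "closed")
runs_at_exit = fin.atexit   # True by default; assign False to opt out
fin()                       # runs the callback now
fin()                       # second call is a no-op
still_alive = fin.alive     # False once the callback has run

h2 = Handle()
fin2 = weakref.finalize(h2, log.append, "never")
detached = fin2.detach()    # cancels; returns (obj, func, args, kwargs)
cancelled_args = detached[2]
del h2, detached            # the object dies, but nothing is logged

print(log, still_alive, fin2.alive, cancelled_args)
```

Only the first finalizer's message reaches the log: the repeated call is ignored, and the detached finalizer never runs.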

GC Interaction: Cycles and Weak References

Python’s cyclic garbage collector handles reference cycles that reference counting can’t. Weak references interact with the GC in specific ways:

class Node:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

parent = Node("root")
child = Node("leaf")
parent.children.append(child)
child.parent = weakref.ref(parent)  # Weak back-reference

By making the child→parent reference weak, you break the cycle. Without weak references, parent and child reference each other, and neither can reach refcount zero. The cyclic GC would eventually find them, but weak references avoid the cycle entirely — which means faster, deterministic cleanup.
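
Because child.parent now holds a weakref.ref rather than a Node, it has to be called to get the parent, and the result can be None. A sketch assuming CPython's reference counting; get_parent is a hypothetical helper added here for illustration.

```python
import weakref

class Node:
    def __init__(self, name):
        self.name = name
        self.parent = None      # None, or a weakref.ref to the parent Node
        self.children = []

    def get_parent(self):
        # Dereference the weak back-reference; None if unset or already dead.
        return self.parent() if self.parent is not None else None

parent = Node("root")
child = Node("leaf")
parent.children.append(child)
child.parent = weakref.ref(parent)

name_while_alive = child.get_parent().name
del parent                      # no cycle, so refcounting frees it right away
parent_after = child.get_parent()
print(name_while_alive, parent_after)
```

Callers should always handle the None case, since the parent can disappear between any two dereferences.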

Weak References and __del__

Objects with __del__ methods in reference cycles used to be uncollectable: before Python 3.4 the collector parked them in gc.garbage instead of freeing them. Python 3.4 (PEP 442) fixed this with safe object finalization, but using weak references to break cycles is still cleaner and more predictable.
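
The post-PEP 442 behaviour can be seen directly on Python 3.4 or later. In this sketch, Noisy is a hypothetical class that counts how many of its instances have been finalized.

```python
import gc

class Noisy:
    finalized = 0

    def __del__(self):
        Noisy.finalized += 1

# Build a two-object cycle whose members both define __del__.
a, b = Noisy(), Noisy()
a.other, b.other = b, a
del a, b                 # refcounts never reach zero because of the cycle

gc.collect()             # the cyclic collector finalizes and frees both
print(Noisy.finalized)
```

The cleanup only happens when the collector runs, which is the nondeterminism that a weak back-reference avoids.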

Production Pattern: LRU Cache with Weak References

Combine WeakValueDictionary with an LRU eviction policy:

from collections import OrderedDict
import weakref

class WeakLRUCache:
    def __init__(self, maxsize=128):
        self._cache = OrderedDict()
        self._weak = weakref.WeakValueDictionary()
        self._maxsize = maxsize

    def get(self, key):
        # Check strong cache first
        if key in self._cache:
            self._cache.move_to_end(key)
            return self._cache[key]
        # Fall back to weak cache (might still be alive)
        obj = self._weak.get(key)
        if obj is not None:
            # Promote back to strong cache
            self._cache[key] = obj
            self._cache.move_to_end(key)
            self._trim()
        return obj

    def put(self, key, value):
        self._cache[key] = value
        self._weak[key] = value
        self._cache.move_to_end(key)
        self._trim()

    def _trim(self):
        while len(self._cache) > self._maxsize:
            self._cache.popitem(last=False)
            # Evicted item stays in _weak — accessible if still alive

Evicted items remain accessible through the weak dictionary as long as some other code holds a strong reference. This two-tier approach gives you bounded memory with opportunistic caching.
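
One constraint on this pattern: every cached value must support weak references, because put() also stores it in the WeakValueDictionary. Built-in containers such as list and dict do not, but a trivial subclass does. The sketch below illustrates this; Payload is a hypothetical wrapper name.

```python
import weakref

weak = weakref.WeakValueDictionary()

# A plain list cannot be weakly referenced, so storing it fails.
try:
    weak["a"] = [1, 2, 3]
    plain_list_ok = True
except TypeError:
    plain_list_ok = False

class Payload(list):
    """A list subclass; instances of subclasses support weak references."""

p = Payload([1, 2, 3])
weak["a"] = p               # works, and lives as long as p does
print(plain_list_ok, weak["a"])
```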

Production Pattern: Observable with Auto-Unsubscribe

import weakref

class EventBus:
    def __init__(self):
        self._listeners = weakref.WeakSet()

    def subscribe(self, listener):
        self._listeners.add(listener)

    def emit(self, event):
        # WeakSet automatically removes dead listeners
        for listener in list(self._listeners):
            listener.handle(event)

When a listener object is garbage collected, it vanishes from _listeners. No explicit unsubscribe() needed — though providing one for immediate removal is still good practice.
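
The flip side is that a listener nobody else references is dropped immediately. A sketch using a bare WeakSet, assuming CPython's reference counting; Listener is a hypothetical class with the handle() method the bus expects.

```python
import weakref

class Listener:
    def __init__(self):
        self.events = []

    def handle(self, event):
        self.events.append(event)

listeners = weakref.WeakSet()

kept = Listener()
temp = Listener()
listeners.add(kept)
listeners.add(temp)
del temp                    # auto-unsubscribed: no strong references remain

listeners.add(Listener())   # pitfall: dropped right away, nothing owns it

for listener in list(listeners):
    listener.handle("ping")

print(len(listeners), kept.events)
```

Subscribers therefore need an owner elsewhere in the program; the bus alone will not keep them alive.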

Thread Safety

The weakref module's containers are not documented as thread-safe, so do not rely on concurrent access to a WeakValueDictionary or WeakSet behaving correctly. In multi-threaded code, guard access with a threading.Lock:

import threading
import weakref

class ThreadSafeWeakCache:
    def __init__(self):
        self._cache = weakref.WeakValueDictionary()
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._cache.get(key)

    def set(self, key, value):
        with self._lock:
            self._cache[key] = value
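
Separate get() and set() calls still leave a gap between the check and the insert. One way to close it is to do both under a single lock acquisition. The following is a sketch under that assumption; SessionPool, Session, and get_or_create are hypothetical names, not part of weakref.

```python
import threading
import weakref

class Session:
    def __init__(self, key):
        self.key = key

class SessionPool:
    def __init__(self):
        self._cache = weakref.WeakValueDictionary()
        self._lock = threading.Lock()

    def get_or_create(self, key):
        # Lookup and insertion happen under one lock acquisition, so two
        # threads cannot both miss and create competing objects.
        with self._lock:
            obj = self._cache.get(key)
            if obj is None:
                obj = Session(key)
                self._cache[key] = obj
            return obj

pool = SessionPool()
results = []   # strong references keep the shared Session alive

def worker():
    results.append(pool.get_or_create("user-1"))

threads = [threading.Thread(target=worker) for _ in range(8)]
for th in threads:
    th.start()
for th in threads:
    th.join()

print(len(results), len({id(r) for r in results}))
```

All eight workers should end up with the same Session instance.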

Debugging Weak References

import weakref, gc

class Tracked:
    pass

t = Tracked()
refs = weakref.getweakrefs(t)  # List all weak refs to this object
print(len(refs))               # 0

r = weakref.ref(t)
refs = weakref.getweakrefs(t)
print(len(refs))               # 1

# Force GC and check
gc.collect()
print(r())  # Still alive if t is still referenced

weakref.getweakrefs(obj) is invaluable for debugging: it returns the live weak reference and proxy objects that point at an object (weakref.getweakrefcount(obj) gives just the count), helping trace unexpected reference patterns.
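
The returned objects can also hint at who is watching. The sketch below relies on a CPython implementation detail, namely that WeakValueDictionary stores weakref.KeyedRef instances carrying a key attribute. Treat it as a debugging aid rather than a stable API; Tracked and registry are illustrative names.

```python
import weakref

class Tracked:
    pass

t = Tracked()
plain = weakref.ref(t)
registry = weakref.WeakValueDictionary()
registry["session"] = t

# One plain reference plus the dictionary's internal KeyedRef.
total = weakref.getweakrefcount(t)

# KeyedRef instances reveal under which key a weak dictionary holds the object.
keys = [
    r.key
    for r in weakref.getweakrefs(t)
    if isinstance(r, weakref.KeyedRef)
]
print(total, keys)
```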

One thing to remember: Weak references work through a per-object linked list that CPython walks during deallocation. They’re the foundation of self-cleaning caches (WeakValueDictionary), auto-unsubscribing observers (WeakSet), and robust cleanup (finalize). The key design principle: separate observation from ownership to prevent memory leaks.
