Weak References — Deep Dive
CPython’s Weak Reference Internals
Every weakref-capable object in CPython has a __weakref__ slot: a pointer to the head of a doubly-linked list of the weak reference objects that point at it. When the object is deallocated, CPython walks this list and clears each weak reference's referent pointer, so that calling the dead reference from Python returns None.
The C struct for a basic weak reference:
typedef struct _PyWeakReference {
    PyObject_HEAD
    PyObject *wr_object;    /* The referenced object */
    PyObject *wr_callback;  /* Callback or NULL */
    Py_hash_t hash;         /* Cached hash of referent */
    struct _PyWeakReference *wr_prev;
    struct _PyWeakReference *wr_next;
} PyWeakReference;
The doubly-linked list (prev/next) allows O(1) removal. When you delete a weak reference, it unlinks itself. When the referent dies, CPython iterates the list, clears all references, and calls callbacks.
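The per-object list can be observed from Python. The following is a minimal sketch, assuming CPython's reference counting (so that dropping the last strong reference to a weak reference object frees it immediately). The Target class and variable names are purely illustrative.

```python
import weakref

class Target:
    pass

t = Target()

# CPython reuses the existing callback-free reference for the same object
plain_a = weakref.ref(t)
plain_b = weakref.ref(t)

# A reference with a callback is always a separate entry in the list
with_cb = weakref.ref(t, lambda ref: None)

count_before = weakref.getweakrefcount(t)  # both entries are linked
del with_cb                                # the entry unlinks itself
count_after = weakref.getweakrefcount(t)   # only the plain reference remains
```

Removing with_cb never touches the plain reference, which is what the prev/next pointers make cheap.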
Callback Execution Order
When an object is destroyed, its weak reference callbacks run in reverse order of registration. The weakref.ref documentation states this explicitly: callbacks are called from the most recently registered to the oldest.
import weakref

class Target:
    pass

t = Target()
refs = []
for i in range(3):
    def cb(ref, i=i):
        print(f"Callback {i}")
    refs.append(weakref.ref(t, cb))

del t
# Output: Callback 2, Callback 1, Callback 0
Critical detail: Callbacks receive the weak reference object (already dead — calling it returns None), not the original object. You cannot resurrect the object in a callback.
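A small check makes this concrete. This is a sketch assuming CPython, where del on the last strong reference triggers the callback immediately. The names on_dead and seen are illustrative.

```python
import weakref

class Target:
    pass

seen = []

def on_dead(ref):
    # By the time the callback runs, the referent is already gone
    seen.append(ref())

t = Target()
r = weakref.ref(t, on_dead)
del t
```

The callback records whatever the weak reference yields at that moment, which is why any state it needs must be captured in advance (for example through default arguments, as in the loop above).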
WeakValueDictionary Implementation
WeakValueDictionary stores KeyedRef objects — weak references that know their own dictionary key. When the referent dies, the callback removes the entry from the dictionary:
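# Simplified internal logic
import weakref

class KeyedRef(weakref.ref):
    __slots__ = ('key',)

    def __new__(cls, ob, callback, key):
        self = super().__new__(cls, ob, callback)
        self.key = key
        return self

    def __init__(self, ob, callback, key):
        # weakref.ref.__init__ accepts only (ob, callback), so drop the key
        super().__init__(ob, callback)

class WeakValueDictionary:
    def __init__(self):
        self.data = {}  # key -> KeyedRef

    def __setitem__(self, key, value):
        def remove(wr, selfref=weakref.ref(self)):
            self = selfref()
            if self is not None:
                del self.data[wr.key]
        self.data[key] = KeyedRef(value, remove, key)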
Notice the nested weak reference: the callback itself holds a weak reference to the dictionary. This prevents the common bug where a callback keeps the dictionary alive after it should be collected.
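Both effects can be seen with the real weakref.WeakValueDictionary. The sketch below assumes CPython's immediate reference-count cleanup. Resource and the variable names are hypothetical.

```python
import gc
import weakref

class Resource:
    def __init__(self, name):
        self.name = name

registry = weakref.WeakValueDictionary()
db = Resource("db")
registry["db"] = db
present_while_alive = "db" in registry

del db                       # last strong reference goes away
present_after_del = "db" in registry

# The removal callback only holds a weak reference to the dictionary,
# so a live value does not keep the dictionary itself alive.
keep = Resource("keep")
registry["keep"] = keep
registry_ref = weakref.ref(registry)
del registry
gc.collect()                 # belt and braces; refcounting usually suffices
dictionary_collected = registry_ref() is None
```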
Iteration Pitfalls
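Entries can disappear from a WeakValueDictionary at any moment, including in the middle of a loop, because a referent may be collected while you iterate. On a plain dict that would raise RuntimeError: dictionary changed size during iteration. The public iteration methods protect against this; in CPython releases that use the private _IterationGuard helper, removals triggered during iteration are deferred until the loop finishes: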
cache = weakref.WeakValueDictionary()

# Safe: iteration guard defers cleanup
for key, value in cache.items():
    process(value)  # process() stands in for your own handler

# Dangerous: manual iteration without guard
for key in list(cache.data.keys()):
    ref = cache.data[key]
    obj = ref()  # Might be None if collected between lines
Always use the public API (items(), values(), keys()) — never access .data directly.
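When a loop body may release references, one option is to take a snapshot through the public API first. The sketch below assumes CPython's reference counting. Item and the variable names are illustrative.

```python
import weakref

class Item:
    def __init__(self, n):
        self.n = n

cache = weakref.WeakValueDictionary()
items = [Item(i) for i in range(5)]
for item in items:
    cache[item.n] = item
del item  # drop the loop variable's strong reference

snapshot = list(cache.items())  # holds a strong reference to every value
items.clear()
size_during = len(cache)        # the snapshot keeps all entries alive

seen = [value.n for _, value in snapshot]

del snapshot
size_after = len(cache)         # nothing holds the items any more
```

The trade-off is that the snapshot pins every value for the duration of the loop, which is usually what you want while processing them.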
WeakKeyDictionary: The Reverse Pattern
While WeakValueDictionary lets values be collected, WeakKeyDictionary lets keys be collected. When a key object dies, the entire entry is removed.
Use case: attaching metadata to objects without preventing their cleanup:
metadata = weakref.WeakKeyDictionary()

class Widget:
    pass

w = Widget()
metadata[w] = {"created_at": "2026-01-15", "clicks": 0}
del w  # Entry removed from metadata automatically
This is how some debugging tools and profilers attach information to objects without leaking.
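The following sketch checks that behavior and shows one constraint worth knowing: keys must themselves support weak references. It assumes CPython's immediate cleanup, and the variable names are illustrative.

```python
import weakref

class Widget:
    pass

metadata = weakref.WeakKeyDictionary()

w = Widget()
metadata[w] = {"clicks": 0}
metadata[w]["clicks"] += 1
clicks = metadata[w]["clicks"]
size_alive = len(metadata)

del w
size_dead = len(metadata)

# Keys must support weak references; an int does not
try:
    metadata[42] = {"clicks": 0}
    rejected = False
except TypeError:
    rejected = True
```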
weakref.finalize: Robust Cleanup
weakref.finalize is superior to __del__ for several reasons:
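- Ordering: at interpreter shutdown, surviving finalizers run in reverse order of creation (LIFO), so the order is predictable
- No resurrection: __del__ receives self, which can accidentally keep the object alive; a finalizer callback receives only the arguments you registered
- Atexit integration: finalizers that are still alive run at interpreter shutdown (unless atexit=False)
- Idempotent: calling a finalizer more than once runs the callback only once, and finalize.detach() cancels it cleanly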
class TempFile:
    def __init__(self, path):
        self.path = path
        self._finalizer = weakref.finalize(
            self, self._cleanup, path
        )

    @staticmethod
    def _cleanup(path):
        import os
        if os.path.exists(path):
            os.unlink(path)
            print(f"Cleaned up {path}")

    def close(self):
        self._finalizer()  # Run cleanup now

    @property
    def alive(self):
        return self._finalizer.alive
The cleanup function is a static method taking the path as an argument — it doesn’t reference self, avoiding resurrection.
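The run-once and detach behavior of weakref.finalize can be checked without touching the filesystem. This is a minimal sketch. Resource and the log list are stand-ins for a real resource and its cleanup.

```python
import weakref

class Resource:
    pass

log = []

r = Resource()
fin = weakref.finalize(r, log.append, "closed")
alive_before = fin.alive
fin()                      # runs the callback now
fin()                      # second call is a no-op
alive_after = fin.alive
log_after_calls = list(log)

r2 = Resource()
fin2 = weakref.finalize(r2, log.append, "never")
detached = fin2.detach() is not None  # detach() returns a tuple while alive
del r2                     # nothing runs: the finalizer was cancelled
```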
GC Interaction: Cycles and Weak References
Python’s cyclic garbage collector handles reference cycles that reference counting can’t. Weak references interact with the GC in specific ways:
class Node:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

parent = Node("root")
child = Node("leaf")
parent.children.append(child)
child.parent = weakref.ref(parent)  # Weak back-reference
By making the child→parent reference weak, you break the cycle. Without weak references, parent and child reference each other, and neither can reach refcount zero. The cyclic GC would eventually find them, but weak references avoid the cycle entirely — which means faster, deterministic cleanup.
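One way to keep the weak back-reference from leaking into calling code is to hide the dereference behind a property. The variant below is a sketch, not the class above: _parent and add_child are names introduced here for illustration, and the final check assumes CPython's reference counting.

```python
import weakref

class Node:
    def __init__(self, name):
        self.name = name
        self._parent = None      # weakref.ref to the parent, or None
        self.children = []

    @property
    def parent(self):
        # Dereference on access; None if unset or already collected
        return self._parent() if self._parent is not None else None

    def add_child(self, child):
        self.children.append(child)
        child._parent = weakref.ref(self)

root = Node("root")
leaf = Node("leaf")
root.add_child(leaf)
linked = leaf.parent is root

del root                         # no cycle, so root is freed right away
orphaned = leaf.parent is None
```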
Weak References and __del__
Before Python 3.4, objects with __del__ methods that were part of a reference cycle were uncollectable and ended up in gc.garbage. Python 3.4 (PEP 442) fixed this with safe object finalization, but using weak references to break cycles is still cleaner and more predictable.
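The post-PEP 442 behavior is easy to confirm on Python 3.4 or later. In this sketch, Cyclic is a hypothetical class whose __del__ records its name.

```python
import gc

class Cyclic:
    finalized = []

    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        Cyclic.finalized.append(self.name)

a = Cyclic("a")
b = Cyclic("b")
a.other = b
b.other = a                  # strong cycle between two finalizable objects

del a, b
gc.collect()                 # the cyclic collector has to find them

finalized = sorted(Cyclic.finalized)
leftover = [obj for obj in gc.garbage if isinstance(obj, Cyclic)]
```

Both finalizers run, but only once the collector gets to the cycle, which is the timing uncertainty a weak back-reference avoids.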
Production Pattern: LRU Cache with Weak References
Combine WeakValueDictionary with an LRU eviction policy:
from collections import OrderedDict
import weakref

class WeakLRUCache:
    def __init__(self, maxsize=128):
        self._cache = OrderedDict()
        self._weak = weakref.WeakValueDictionary()
        self._maxsize = maxsize

    def get(self, key):
        # Check strong cache first
        if key in self._cache:
            self._cache.move_to_end(key)
            return self._cache[key]
        # Fall back to weak cache (might still be alive)
        obj = self._weak.get(key)
        if obj is not None:
            # Promote back to strong cache
            self._cache[key] = obj
            self._cache.move_to_end(key)
            self._trim()
        return obj

    def put(self, key, value):
        self._cache[key] = value
        self._weak[key] = value
        self._cache.move_to_end(key)
        self._trim()

    def _trim(self):
        while len(self._cache) > self._maxsize:
            self._cache.popitem(last=False)
            # Evicted item stays in _weak; accessible if still alive
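Evicted items remain accessible through the weak dictionary as long as some other code holds a strong reference. This two-tier approach bounds the number of objects the cache itself keeps alive while still serving opportunistic hits. Cached values must support weak references, so built-in types such as int, str, and tuple cannot be stored directly.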
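The two tiers can be traced with a stripped-down, function-based sketch of the same idea. Page, put, and the maxsize of 1 are assumptions chosen to force an eviction, and the final check relies on CPython's reference counting.

```python
from collections import OrderedDict
import weakref

class Page:
    def __init__(self, url):
        self.url = url

strong = OrderedDict()                  # bounded tier
weak = weakref.WeakValueDictionary()    # opportunistic tier

def put(key, value, maxsize=1):
    strong[key] = value
    weak[key] = value
    strong.move_to_end(key)
    while len(strong) > maxsize:
        strong.popitem(last=False)      # evict the least recently used

home = Page("/home")
put("home", home)
put("about", Page("/about"))            # evicts "home" from the strong tier

evicted = "home" not in strong
still_reachable = weak.get("home") is home  # the caller still holds it

del home
gone = weak.get("home") is None
```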
Production Pattern: Observable with Auto-Unsubscribe
class EventBus:
    def __init__(self):
        self._listeners = weakref.WeakSet()

    def subscribe(self, listener):
        self._listeners.add(listener)

    def emit(self, event):
        # WeakSet automatically removes dead listeners
        for listener in list(self._listeners):
            listener.handle(event)
When a listener object is garbage collected, it vanishes from _listeners. No explicit unsubscribe() needed — though providing one for immediate removal is still good practice.
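The auto-unsubscribe effect can be checked directly on a weakref.WeakSet. This is a sketch assuming CPython's immediate cleanup. Listener is a hypothetical class that records the events it handles.

```python
import weakref

class Listener:
    def __init__(self):
        self.events = []

    def handle(self, event):
        self.events.append(event)

listeners = weakref.WeakSet()
a = Listener()
b = Listener()
listeners.add(a)
listeners.add(b)
count_before = len(listeners)

del b                              # b silently drops out of the set
count_after = len(listeners)

for listener in list(listeners):   # snapshot, as in emit()
    listener.handle("tick")
```

The subscriber must be held strongly somewhere else; a listener created inline and passed straight to add() would vanish immediately.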
Thread Safety
The weakref containers make no documented thread-safety guarantees. Entries can also vanish whenever a referent is collected, so compound operations such as check-then-insert or iteration can race across threads. In multi-threaded code, guard access with a threading.Lock:
import threading
import weakref

class ThreadSafeWeakCache:
    def __init__(self):
        self._cache = weakref.WeakValueDictionary()
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._cache.get(key)

    def set(self, key, value):
        with self._lock:
            self._cache[key] = value
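Separate get and set calls still leave a gap between the lookup and the insert. A get-or-create method that holds the lock across both closes it. The sketch below uses hypothetical Session and SessionRegistry classes.

```python
import threading
import weakref

class Session:
    def __init__(self, key):
        self.key = key

class SessionRegistry:
    def __init__(self):
        self._cache = weakref.WeakValueDictionary()
        self._lock = threading.Lock()

    def get_or_create(self, key):
        # Lookup and insert happen under one lock, so two threads
        # cannot both miss and create separate Session objects.
        with self._lock:
            session = self._cache.get(key)
            if session is None:
                session = Session(key)
                self._cache[key] = session
            return session

registry = SessionRegistry()
results = []

def worker():
    results.append(registry.get_or_create("user-1"))

threads = [threading.Thread(target=worker) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

distinct = {id(session) for session in results}
```

Every thread receives the same Session because the results list keeps it alive for the whole run.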
Debugging Weak References
import weakref, gc
class Tracked:
    pass
t = Tracked()
refs = weakref.getweakrefs(t) # List all weak refs to this object
print(len(refs)) # 0
r = weakref.ref(t)
refs = weakref.getweakrefs(t)
print(len(refs)) # 1
# Force GC and check
gc.collect()
print(r()) # Still alive if t is still referenced
weakref.getweakrefs(obj) is invaluable for debugging: it returns the list of weak reference and proxy objects that currently point at an object (weakref.getweakrefcount(obj) returns just the count), which helps trace unexpected reference patterns.
One thing to remember: Weak references work through a per-object linked list that CPython walks during deallocation. They’re the foundation of self-cleaning caches (WeakValueDictionary), auto-unsubscribing observers (WeakSet), and robust cleanup (finalize). The key design principle: separate observation from ownership to prevent memory leaks.