Python Garbage Collector Tuning — Core Concepts
Python uses reference counting plus a cyclic garbage collector. Most objects are reclaimed immediately when references drop to zero, while the GC handles cycles that reference counting alone cannot free.
Generational Model
CPython’s GC tracks objects in generations:
- younger generation: many short-lived objects
- older generations: longer-lived objects
The intuition: most objects die young, so collect young generations more frequently.
Thresholds and Collection Frequency
You can inspect and adjust thresholds:
import gc
print(gc.get_threshold())
# gc.set_threshold(700, 10, 10)
Lower thresholds trigger collections more often; higher thresholds delay collections.
When Tuning Helps
Tuning can help when profiling shows:
- frequent GC runs adding latency spikes
- excessive memory growth between collections
- workload-specific churn patterns (many temporary container objects)
If you have not measured GC impact, tuning is guesswork.
What to Measure Before and After
- request latency percentiles (p95/p99)
- throughput
- RSS/heap trend
- GC stats (
gc.get_stats()in modern Python)
A threshold change that improves average latency but worsens p99 may not be acceptable.
Temporary Disable Pattern (Careful)
For tight compute sections with no cycle creation, some teams temporarily disable GC:
import gc
gc.disable()
try:
run_hot_section()
finally:
gc.enable()
Only do this with strong evidence and guardrails. Disabling GC globally in long-lived services is risky.
Common Misconception
Misconception: lowering thresholds always improves memory.
Reality: lower thresholds can reduce some memory growth but may introduce more pause overhead and CPU cost.
Practical Approach
- baseline metrics under realistic load
- change one threshold set at a time
- run long enough to observe memory trend
- keep settings only if net outcome improves service goals
Related Topics
Use Python Memory Profiling first; GC tuning should respond to measured allocation/churn patterns, not replace profiling.
A Small Experiment Template
Try this practical experiment:
- baseline run with default GC thresholds for 30 minutes
- run again with one threshold change
- compare p95 latency, CPU, and memory slope
If memory improves but latency worsens significantly, revert. If latency improves but memory drifts toward OOM, revert.
A tuning decision is successful only when overall service health improves, not one isolated metric.
Another useful trick is to graph GC collection counts next to request traffic. If bursts line up with user-facing spikes, you have evidence to optimize allocation patterns before more threshold changes.
One Thing to Remember
GC tuning is an empirical exercise: adjust thresholds with metrics, and choose settings that improve your real latency-memory tradeoff.
See Also
- Python Cpython Vs Pypy CPython and PyPy both run Python code, but one is a careful planner and the other is a speed learner that gets faster as it works.
- Ci Cd Why big apps can ship updates every day without turning your phone into a glitchy mess — CI/CD is the behind-the-scenes quality gate and delivery truck.
- Containerization Why does software that works on your computer break on everyone else's? Containers fix that — and they're why Netflix can deploy 100 updates a day without the site going down.
- Python 310 New Features Python 3.10 gave programmers a shape-sorting machine, friendlier error messages, and cleaner ways to say 'this or that' in type hints.
- Python 311 New Features Python 3.11 made everything faster, error messages smarter, and let you catch several mistakes at once instead of stopping at the first one.