Python String Interning — Core Concepts
What String Interning Means
String interning is a memory optimization where Python stores only one copy of each distinct string value. When multiple variables hold identical strings, they all reference the same object in memory rather than separate copies.
This has two benefits:
- Memory savings — One copy instead of many reduces heap usage.
- Faster comparisons — Identity checks (
is) are O(1) pointer comparisons, much faster than character-by-character equality checks (==) which are O(n).
What Python Interns Automatically
CPython (the standard Python implementation) automatically interns strings in specific situations:
- String literals that look like identifiers — Strings containing only ASCII letters, digits, and underscores. So
"hello_world"gets interned;"hello world"(with a space) usually does not. - Compile-time constants — String literals known at compile time. If two functions both have
"status"as a literal, they share the same object. - Dictionary keys — When a string is used as a dictionary key, CPython interns it to speed up future lookups.
- Attribute names and variable names — Names used internally by the interpreter for attribute access, module lookups, and name resolution.
When It Doesn’t Happen
Strings created dynamically at runtime are generally not interned:
a = "hello"
b = "hel" + "lo" # Compiler folds this → interned
c = "hel"
d = c + "lo" # Runtime concatenation → NOT interned
print(a is b) # True — both are compile-time constants
print(a is d) # False — d was built at runtime
This distinction matters when you see code using is to compare strings — it works sometimes by accident (when strings happen to be interned) but fails unpredictably. Always use == for string comparison.
Manual Interning with sys.intern()
You can force Python to intern a string using sys.intern():
import sys
name = sys.intern(some_dynamic_string)
After interning, any future call to sys.intern() with an equal string returns the same object. This is useful when your program creates millions of strings from a small vocabulary — for example, parsing log files where status codes like "ERROR", "WARNING", and "INFO" appear millions of times.
Real Performance Impact
The benefit is most visible in dictionary-heavy code. When dictionary keys are interned strings, Python can often resolve lookups with a single pointer comparison instead of a full string comparison.
In a benchmark with a dictionary containing 10,000 keys queried 1 million times:
- Without interning: ~380 ms (string equality comparisons on hash collisions)
- With interned keys: ~290 ms (pointer comparisons on hash collisions)
The improvement scales with string length — longer strings benefit more because character-by-character comparison is proportionally more expensive.
Common Misconception
Many developers believe Python interns all short strings or all string literals. Neither is strictly true. The interning rules are implementation-specific (CPython vs PyPy vs others), undocumented as a language guarantee, and have changed between Python versions. The only portable way to ensure interning is to call sys.intern() explicitly.
Relying on automatic interning behavior in application logic is fragile. Use it as a performance optimization you consciously apply, not something you assume Python does for you.
The one thing to remember: String interning saves memory and speeds up comparisons by ensuring identical strings share one object, but it only happens automatically for identifier-like strings — for everything else, use sys.intern() deliberately.
See Also
- Python Algorithmic Complexity Understand Algorithmic Complexity through a practical analogy so your Python decisions become faster and clearer.
- Python Async Performance Tuning Making your async Python faster is like organizing a busy restaurant kitchen — it's all about flow.
- Python Benchmark Methodology Why timing Python code once means nothing, and how fair testing works like a science experiment.
- Python C Extension Performance How Python borrows C's speed for the hard parts — like hiring a specialist for the toughest job on the worksite.
- Python Caching Strategies Understand Python caching strategies with a shortcut-road analogy so your app gets faster without taking wrong turns.