Python PyPy Migration Guide — Core Concepts

What PyPy is

PyPy is an alternative Python interpreter with a built-in JIT (Just-In-Time) compiler. It runs standard Python code but compiles frequently executed paths to optimized machine code at runtime. For CPU-bound pure Python, PyPy is often several times faster than CPython, but the speedup varies widely by workload, so treat any headline figure as a starting point for your own measurements.

PyPy trails CPython by a few language versions: releases since early 2025 implement Python 3.11, while older releases target 3.10. It is available for Linux, macOS, and Windows.

Step 1: Check your dependencies

The biggest migration barrier is C extension compatibility. Pure Python packages work fine. C extensions need special handling.

Compatible out of the box:

  • Pure Python packages (most of PyPI)
  • Packages using cffi (PyPy’s preferred C interop)
  • Standard library (nearly 100% compatible)

Need attention:

  • numpy — supported via PyPy’s compatibility layer, but sometimes slower than CPython+NumPy
  • pandas — works but some operations are slower due to C extension overhead
  • scipy — partial support
  • Packages using ctypes — generally work

Often incompatible:

  • Packages using CPython C API directly
  • Cython-compiled extensions (need recompilation for PyPy)
  • Low-level CPython hacks (sys._getframe, gc module details)

Check with:

# With PyPy installed and on PATH (the binary is often named pypy3), bootstrap pip
pypy -m ensurepip
pypy -m pip install -r requirements.txt
# Watch for build failures
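
A successful install does not prove that every package also loads under PyPy. As a rough follow-up, the sketch below tries to import each top-level module and records any failure. The function name `check_imports` and the module list are hypothetical; substitute the modules your project actually imports.

```python
import importlib
import platform


def check_imports(module_names):
    """Try importing each module; return {name: error message or None}."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # ImportError, or a failure while loading a C extension
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # Hypothetical list: replace with the top-level modules your project imports.
    modules = ["json", "sqlite3", "numpy"]
    print("Interpreter:", platform.python_implementation(), platform.python_version())
    for name, error in check_imports(modules).items():
        print(f"{name:12s} {'ok' if error is None else 'FAILED: ' + error}")
```

Run the same script with both `python` and `pypy` and compare the output. An import that succeeds can still misbehave at runtime, so this complements the test suite rather than replacing it.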

Step 2: Identify where PyPy helps

PyPy accelerates pure Python execution. It doesn’t speed up:

  • C extension calls (NumPy operations, database drivers)
  • I/O-bound waiting (network, disk)
  • Code that barely runs (initialization, configuration)

The best candidates for PyPy migration:

Good for PyPy             Not helped by PyPy
------------------------  ------------------------
Loop-heavy pure Python    NumPy array operations
Text processing           Database queries
Custom algorithms         File I/O
JSON parsing              Calling C libraries
Serialization             Startup-heavy CLI tools
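
One rough way to tell whether a piece of code is waiting rather than computing is to compare its CPU time with its wall-clock time. The sketch below is an illustration under that assumption; `cpu_fraction`, `busy_loop`, and `waiting` are hypothetical names, not part of any library.

```python
import time


def cpu_fraction(func, *args, **kwargs):
    """Return CPU time divided by wall-clock time for one call of func."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func(*args, **kwargs)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall if wall > 0 else 0.0


def busy_loop(n=2_000_000):
    """Loop-heavy pure Python: the kind of code a JIT can speed up."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def waiting(seconds=0.05):
    """Stand-in for network or disk waits."""
    time.sleep(seconds)


if __name__ == "__main__":
    print("busy_loop:", round(cpu_fraction(busy_loop), 2))
    print("waiting:  ", round(cpu_fraction(waiting), 2))
```

A fraction near 1 means the call is CPU-bound; a fraction near 0 means it mostly waits, and PyPy is unlikely to help. CPU time also includes time spent inside C extensions, so a high fraction alone does not show that the work is pure Python.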

Step 3: Measure before migrating

Run your test suite and benchmarks under both interpreters:

# CPython baseline (--benchmark-save requires the pytest-benchmark plugin)
python -m pytest tests/ --benchmark-save=cpython
python benchmark.py  # your custom benchmarks

# PyPy comparison
pypy -m pytest tests/ --benchmark-save=pypy
pypy benchmark.py
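
The contents of `benchmark.py` are not specified here. One possible shape, using only the standard library, is sketched below; `workload` is a hypothetical stand-in for your real hot path.

```python
import json
import platform
import timeit


def workload():
    """Hypothetical stand-in for a real hot path: a loop-heavy pure-Python function."""
    total = 0
    for i in range(100_000):
        total += (i * i) % 7
    return total


def run_benchmark(func, repeat=5, number=10):
    """Time `number` calls of func, `repeat` times; report best and worst per-call times."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    per_call = [t / number for t in timings]
    return {
        "implementation": platform.python_implementation(),
        "version": platform.python_version(),
        "best_s": min(per_call),
        "worst_s": max(per_call),
    }


if __name__ == "__main__":
    print(json.dumps(run_benchmark(workload), indent=2))
```

Recording the interpreter name next to the numbers makes it harder to mix up results from the two runs. A large gap between the best and worst times under PyPy usually points to JIT warmup in the early repetitions.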

Key metrics to compare:

  • Throughput for long-running services
  • Latency including warmup time
  • Memory usage (PyPy often uses more memory due to JIT)
  • Startup time (PyPy is slower to start due to JIT initialization)
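
For the memory comparison, a minimal sketch is to read the process's peak resident set size at the end of a benchmark run. This assumes a Unix-like system, since the standard `resource` module is not available on Windows; `peak_rss_bytes` is a hypothetical helper name.

```python
import resource
import sys


def peak_rss_bytes():
    """Peak resident set size of this process so far, in bytes (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return peak if sys.platform == "darwin" else peak * 1024


if __name__ == "__main__":
    data = [str(i) for i in range(500_000)]  # allocate something measurable
    print(f"peak RSS: {peak_rss_bytes() / 1e6:.1f} MB")
```

Call it once at the end of the same script under each interpreter and compare the two figures.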

Step 4: Handle the differences

Memory management

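CPython uses reference counting plus a cyclic garbage collector, so most objects are freed the moment their last reference disappears. PyPy uses a tracing garbage collector with no reference counting, so objects are freed at some later point when the collector runs. This means: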

# CPython: file closed immediately when last reference dropped
f = open('data.txt')
data = f.read()
f = None  # file closed here in CPython, not guaranteed in PyPy

# Safe for both: use context managers
with open('data.txt') as f:
    data = f.read()
# file closed here — works everywhere
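
The difference can be observed directly with a weak reference. The sketch below is an illustration only: `Resource` is a placeholder class, and the PyPy behavior described in the note is what its garbage collector typically does, not a guarantee.

```python
import gc
import platform
import weakref


class Resource:
    """Placeholder object; stands in for anything holding a file or socket."""


def freed_without_collect():
    obj = Resource()
    ref = weakref.ref(obj)
    del obj                # drop the last strong reference
    return ref() is None   # True only if the object was reclaimed immediately


def freed_after_collect():
    obj = Resource()
    ref = weakref.ref(obj)
    del obj
    gc.collect()           # force a full collection
    return ref() is None


if __name__ == "__main__":
    print(platform.python_implementation())
    print("freed without collect:", freed_without_collect())
    print("freed after collect:  ", freed_after_collect())
```

Under CPython the first function returns True because reference counting reclaims the object at `del`. Under PyPy it typically returns False, and the object goes away only once a collection runs. Calling `gc.collect()` in production code to force cleanup is a workaround; explicit `close()` calls or `with` blocks are the more robust fix.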

Startup overhead

PyPy’s JIT needs warmup time. For long-running servers, this is negligible. For short scripts, it can make PyPy slower:

# Short script: CPython wins
time python -c "print('hello')"   # ~0.03s
time pypy -c "print('hello')"     # ~0.10s

# Long computation: PyPy wins
time python heavy_compute.py       # 45s
time pypy heavy_compute.py         # 6s
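
Warmup can also be seen inside a single process by timing consecutive calls of the same function separately. A minimal sketch, with `hot_function` as a hypothetical stand-in for real work:

```python
import time


def hot_function(n=50_000):
    """Hypothetical loop-heavy function."""
    total = 0
    for i in range(n):
        total += i % 3
    return total


def round_timings(func, rounds=10):
    """Time each of `rounds` consecutive calls separately, in call order."""
    timings = []
    for _ in range(rounds):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    for index, seconds in enumerate(round_timings(hot_function), start=1):
        print(f"round {index:2d}: {seconds * 1000:.2f} ms")
```

Under CPython the rounds should take roughly the same time. Under PyPy the first rounds are usually slower and later ones faster once the JIT has compiled the loop, which is why short-lived scripts see little benefit.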

__del__ timing

Destructors (__del__) run at unpredictable times in PyPy. Code relying on deterministic destruction breaks:

import os

# FRAGILE: depends on __del__ running promptly
class TempFile:
    def __init__(self, path):
        self.path = path
    def __del__(self):
        os.unlink(self.path)

# ROBUST: explicit cleanup, usable as a context manager
class TempFile:
    def __init__(self, path):
        self.path = path
    def close(self):
        os.unlink(self.path)
    def __enter__(self):
        return self
    def __exit__(self, *args):
        self.close()
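
A self-contained variant of the robust pattern, with a usage example, might look like the sketch below. It differs from the class above in two ways that are assumptions of this example: it creates its own file with `tempfile.mkstemp`, and `close()` tolerates being called twice.

```python
import os
import tempfile


class TempFile:
    """Temporary file removed on close() or on leaving a with block."""

    def __init__(self):
        fd, self.path = tempfile.mkstemp()
        os.close(fd)  # keep only the path; callers reopen the file as needed

    def close(self):
        # Tolerate repeated close() calls.
        if os.path.exists(self.path):
            os.unlink(self.path)

    def __enter__(self):
        return self

    def __exit__(self, *args):
        self.close()


def demo():
    """Return (file existed inside the block, file exists after the block)."""
    with TempFile() as tmp:
        with open(tmp.path, "w") as handle:
            handle.write("scratch data")
        existed_inside = os.path.exists(tmp.path)
    return existed_inside, os.path.exists(tmp.path)


if __name__ == "__main__":
    print(demo())
```

Because cleanup happens in `__exit__`, the file is removed at the end of the `with` block on both interpreters, regardless of when the garbage collector runs.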

Common misconception: PyPy replaces all optimization

PyPy reduces interpreter overhead but doesn't fix algorithmic problems. An O(n²) algorithm in PyPy is still O(n²), just with a smaller constant factor. Profile first, improve algorithms, and then consider PyPy for the remaining interpreter-bound hotspots.
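
As an illustration of "profile first, improve algorithms", the sketch below profiles a quadratic duplicate check with the standard `cProfile` module and shows a linear replacement. The function names are hypothetical examples, not code from any particular project.

```python
import cProfile
import io
import pstats


def has_duplicates_quadratic(items):
    """O(n^2): compares every pair."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """O(n): a set lookup replaces the inner loop."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


def profile_report(func, *args, top=5):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    buffer = io.StringIO()
    pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    data = list(range(2_000))
    print(profile_report(has_duplicates_quadratic, data))
    print(profile_report(has_duplicates_linear, data))
```

Switching to the linear version changes how the cost grows with input size, which no interpreter swap can do. Once the algorithm is sound, whatever time remains in pure-Python loops is the part PyPy can reduce.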

Decision framework

Use PyPy when:

  • ✅ Your code is CPU-bound pure Python
  • ✅ Your dependencies are compatible
  • ✅ The application runs long enough for JIT warmup
  • ✅ You can accept higher memory usage

Stay on CPython when:

  • ❌ You depend heavily on C extensions (NumPy-heavy workflows)
  • ❌ Your tool is a short-lived CLI script
  • ❌ You need the latest Python version immediately
  • ❌ Memory is tightly constrained

The one thing to remember: PyPy is a drop-in speed boost for pure Python workloads — check C extension compatibility, measure both interpreters on your actual code, and use context managers instead of relying on deterministic object destruction.
