Python Memory-Mapped Files — Core Concepts
What Memory Mapping Does
When you open a file normally and call read(), Python copies the file’s contents from disk into a buffer in your process’s memory. For a 2 GB file, that’s 2 GB of RAM consumed immediately.
Memory mapping takes a different approach. It asks the operating system to map the file’s contents directly into the process’s virtual address space. The OS uses its virtual memory system — the same mechanism it uses for swap — to load pages from disk on demand. Only the pages your code actually accesses get loaded into physical RAM.
Basic Usage with mmap
import mmap

with open("large_data.bin", "r+b") as f:
    # Length 0 maps the entire file
    with mmap.mmap(f.fileno(), 0) as mm:
        # Slice like a bytes object
        first_100 = mm[:100]

        # Search the mapped bytes directly
        pos = mm.find(b"ERROR")
        if pos != -1:
            print(f"Found at byte {pos}")

        # Random access without seeking
        chunk = mm[1000000:1000100]
    # Leaving the inner with block closes the mapping
The 0 in mmap.mmap(f.fileno(), 0) means “map the entire file.” The OS handles the rest — paging in sections as needed and evicting old sections when memory pressure increases.
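If the code only needs to read, the mapping can be opened read-only instead. The following is a minimal sketch, assuming a throwaway scratch file created with tempfile (the file and its contents are made up for illustration). It shows that a length of 0 maps the whole file and that writes to an ACCESS_READ mapping are rejected.

```python
import mmap
import os
import tempfile

# Hypothetical scratch file so the sketch runs as-is.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"header" + b"\x00" * 4090)  # 4096 bytes in total

with open(path, "rb") as f:
    # ACCESS_READ works with a file opened in plain "rb" mode.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        size = len(mm)   # length 0 mapped the entire file
        head = mm[:6]    # slicing returns a bytes copy
        try:
            mm[0:6] = b"HEADER"
            write_allowed = True
        except TypeError:
            # Read-only mappings reject item and slice assignment.
            write_allowed = False

os.remove(path)
print(size, head, write_allowed)
```

Opening read-only is a reasonable default when nothing needs to be modified, since it rules out accidental writes to the underlying file.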
Key Operations
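Slicing works like bytes: mm[start:end] returns a new bytes object copied from those byte positions. Only the pages containing those bytes are read from disk, and only if they are not already cached.
Searching with mm.find(pattern) scans the mapped bytes and returns the offset of the first match, or -1 if there is none. Pages are loaded as the scan reaches them, and frequently accessed pages stay in the OS page cache, so repeated searches get faster.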
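Since find() accepts an optional start position, it can be called in a loop to locate every match rather than just the first. The helper below, find_all, is a hypothetical name introduced for this sketch, and the log contents are invented test data.

```python
import mmap
import os
import tempfile

def find_all(mm, pattern):
    """Yield the offset of every non-overlapping match of pattern in mm."""
    pos = mm.find(pattern)
    while pos != -1:
        yield pos
        # Resume the search just past the previous match.
        pos = mm.find(pattern, pos + len(pattern))

# Hypothetical scratch log file so the sketch runs as-is.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"ok\nERROR one\nok\nERROR two\n")

with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        # Consume the generator while the mapping is still open.
        error_offsets = list(find_all(mm, b"ERROR"))
        last_error = mm.rfind(b"ERROR")  # search from the end instead

os.remove(path)
print(error_offsets, last_error)
```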
Writing is supported when the file is opened in "r+b" mode:
mm[0:5] = b"HELLO" # Overwrites first 5 bytes
mm.flush() # Ensures changes reach disk
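Changes are written back to the file. A slice assignment must supply exactly as many bytes as the slice covers, because writing this way cannot change the size of the mapping. The flush() call ensures the changes are persisted rather than sitting in the OS page cache.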
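A runnable version of the write path might look like the sketch below. It assumes a small scratch file created with tempfile (invented for illustration) and re-reads the file through ordinary I/O to confirm the change arrived. The except clause catches both IndexError and ValueError because the exact exception type for a wrong-sized assignment is treated here as an implementation detail.

```python
import mmap
import os
import tempfile

# Hypothetical scratch file so the sketch runs as-is.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"hello world")

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as mm:
        mm[0:5] = b"HELLO"       # same length as the slice: accepted
        try:
            mm[0:5] = b"HI"      # shorter than the slice: rejected
            size_changed = True
        except (IndexError, ValueError):
            size_changed = False
        mm.flush()               # push dirty pages to the file

# Read the file back with normal I/O to check the result.
with open(path, "rb") as f:
    on_disk = f.read()

os.remove(path)
print(on_disk, size_changed)
```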
When Memory Mapping Shines
- Large file random access: Log files, binary data, scientific datasets where you jump to different positions.
- Shared memory between processes: Two processes can mmap the same file and see each other’s writes, enabling IPC without sockets.
- File-backed data structures: Databases such as LMDB are built around mmap, and SQLite can optionally use memory-mapped I/O for page access.
- Binary file parsing: Reading headers, jumping to offsets, extracting records from structured binary formats.
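To illustrate the binary parsing case, here is a minimal sketch that reads one fixed-size record straight out of a mapping with struct.unpack_from. The file layout (a "DEMO" magic value, a record count, then id/value pairs) and the helper name read_record are invented for this example and do not correspond to any real format.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout, little-endian with no padding:
HEADER = struct.Struct("<4sI")  # magic bytes, record count
RECORD = struct.Struct("<Id")   # record id, float value

# Write a small scratch file in that layout.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(HEADER.pack(b"DEMO", 3))
    for i in range(3):
        f.write(RECORD.pack(i, i * 1.5))

def read_record(mm, index):
    """Return (id, value) for one record, jumping directly to its offset."""
    magic, count = HEADER.unpack_from(mm, 0)
    if magic != b"DEMO":
        raise ValueError("unexpected magic bytes")
    if not 0 <= index < count:
        raise IndexError(index)
    return RECORD.unpack_from(mm, HEADER.size + index * RECORD.size)

with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        last = read_record(mm, 2)

os.remove(path)
print(last)
```

Because the mapping supports the buffer protocol, unpack_from can decode fields in place without first slicing out an intermediate bytes object.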
Performance Comparison
For a 4 GB file, reading specific byte ranges:
| Method | Time (100 random reads) | Peak Memory |
|---|---|---|
| f.read() entire file | 12s (initial load) | 4 GB |
| f.seek() + f.read(n) | 0.8s | Minimal |
| mmap slicing | 0.3s | ~50 MB (OS-managed) |
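Memory mapping wins for random access mainly because it avoids a system call for every read. f.seek() + f.read(n) also goes through the OS page cache, so both approaches benefit from pages that are already resident. Sequential reads of the entire file show less difference; standard buffered I/O is already efficient for that pattern.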
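Timings like these depend heavily on hardware, file size, and cache state, so it is worth measuring on your own data. The harness below is a rough sketch under stated assumptions: it uses a 1 MiB scratch file of random bytes (far smaller than the 4 GB case), and the function names read_with_seek and read_with_mmap are made up for this example. It checks that both methods return identical data and prints whatever times it observes.

```python
import mmap
import os
import random
import tempfile
import time

def read_with_seek(path, offsets, n):
    """Read n bytes at each offset using seek() + read()."""
    out = []
    with open(path, "rb") as f:
        for off in offsets:
            f.seek(off)
            out.append(f.read(n))
    return out

def read_with_mmap(path, offsets, n):
    """Read n bytes at each offset by slicing a read-only mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return [mm[off:off + n] for off in offsets]

# Hypothetical test data: a 1 MiB scratch file of random bytes.
file_size = 1024 * 1024
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(os.urandom(file_size))

rng = random.Random(0)  # fixed seed so both methods see the same offsets
offsets = [rng.randrange(0, file_size - 100) for _ in range(100)]

t0 = time.perf_counter()
via_seek = read_with_seek(path, offsets, 100)
t1 = time.perf_counter()
via_mmap = read_with_mmap(path, offsets, 100)
t2 = time.perf_counter()

os.remove(path)
print(f"seek+read: {t1 - t0:.6f}s  mmap: {t2 - t1:.6f}s")
```

A single run on a small, freshly written file mostly measures the page cache, so repeat the measurement several times and on realistic file sizes before drawing conclusions.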
Limitations and Gotchas
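File size limits on 32-bit systems. A 32-bit process can address at most 4 GB of virtual memory, and part of that is reserved for the kernel and for the process’s own code, heap, and stacks. Mapping an entire file larger than roughly 2 GB therefore usually fails, although a smaller window of the file can still be mapped by passing a length and an offset. On 64-bit systems, the virtual address space is large enough that this limit is rarely a concern.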
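Mapping only a window of a file keeps the address-space cost bounded regardless of file size. The sketch below assumes a hypothetical helper named read_window and a scratch file of patterned bytes. The one real constraint it works around is that the offset passed to mmap must be a multiple of mmap.ALLOCATIONGRANULARITY, so the helper rounds the requested position down and slices from inside the window.

```python
import mmap
import os
import tempfile

def read_window(path, position, length):
    """Return length bytes at position, mapping only the region needed."""
    gran = mmap.ALLOCATIONGRANULARITY
    aligned = position - (position % gran)  # round down to a valid offset
    delta = position - aligned              # where the data sits in the window
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), delta + length,
                       access=mmap.ACCESS_READ, offset=aligned) as mm:
            return mm[delta:delta + length]

# Hypothetical scratch file spanning a few allocation units.
gran = mmap.ALLOCATIONGRANULARITY
data = bytes(i % 251 for i in range(3 * gran + 100))
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(data)

position = 2 * gran + 10
window = read_window(path, position, 5)
expected = data[position:position + 5]

os.remove(path)
print(window == expected)
```

The caller is responsible for keeping position + length within the file; this sketch does not check that.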
Not a substitute for streaming. If you’re processing a file sequentially from start to end, regular buffered file reading (for line in f) is simpler and equally fast. mmap adds value for random access patterns.
Windows vs Unix differences. On Windows, the file is locked while mapped: other processes can’t delete or resize it. On Unix, the file can be modified, truncated, or deleted while mapped. Deleting it does not disturb an existing mapping, but if the file shrinks, touching pages beyond the new end can crash the process with SIGBUS.
No automatic resizing. You can’t grow a memory-mapped file by writing past its end. To expand the file, close the mapping, resize the file with f.truncate(), and create a new mapping.
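The close, truncate, and remap sequence can be sketched as follows, assuming a small scratch file created with tempfile (invented for illustration). What the newly added bytes contain before they are written is platform-dependent, so the sketch only checks the bytes it writes itself.

```python
import mmap
import os
import tempfile

# Hypothetical scratch file so the sketch runs as-is.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"0123456789")

with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0)
    old_size = len(mm)
    mm.close()                      # 1. release the old mapping

    f.truncate(old_size + 16)       # 2. grow the underlying file

    mm = mmap.mmap(f.fileno(), 0)   # 3. map again at the new size
    new_size = len(mm)
    mm[old_size:old_size + 5] = b"added"  # write into the new region
    mm.flush()
    mm.close()

with open(path, "rb") as f:
    contents = f.read()

os.remove(path)
print(old_size, new_size, contents[:15])
```

Any slices or memoryviews taken from the old mapping should be dropped before step 1, since the new mapping may live at a different address.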
Common Misconception
People often think memory mapping uses no RAM. It does — the OS loads accessed pages into physical memory (the page cache). The advantage is that the OS manages this automatically, evicting pages under memory pressure, and the same page cache is shared across processes. You don’t need to manage buffers yourself, and unused portions never consume RAM.
The one thing to remember: Memory-mapped files give you random access to files of any size with OS-managed paging, making them ideal for large binary files and inter-process data sharing — but they add complexity over simple file I/O that’s only worthwhile for random access or shared memory patterns.