NumPy Memory Views — Deep Dive
Technical foundation
A NumPy ndarray is a thin descriptor that wraps a block of memory, which it may or may not own. The descriptor contains a data pointer, a dtype, a shape tuple, a strides tuple, and flags. This separation between the descriptor and the data buffer is what makes views possible: multiple descriptors can reference the same buffer with different shapes, strides, and starting offsets.
The ndarray memory model
ndarray object (Python heap)
├── data pointer ─────────→ [buffer in memory]
├── shape: (3, 4)           │ 0  1  2  3 │
├── strides: (32, 8)        │ 4  5  6  7 │
├── dtype: float64          │ 8  9 10 11 │
├── flags: C_CONTIGUOUS=True
└── base: None (or parent array)
When you create a view:
b = a[1:3] # view of rows 1-2
b (ndarray object)
├── data pointer ─────────→ [same buffer, offset by 32 bytes]
├── shape: (2, 4)
├── strides: (32, 8) (same strides)
├── dtype: float64
└── base: a (reference to parent)
The key insight: b.ctypes.data equals a.ctypes.data + 32 — it points into the middle of a’s buffer.
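The pointer arithmetic above can be checked directly. The following is a minimal sketch, assuming `a` is a 3×4 float64 array that owns its buffer. The `.copy()` is an addition for this illustration: without it, `reshape` leaves the intermediate 1-D array as the owner, and `b.base` would report that array instead of `a`.

```python
import numpy as np

# Assumption: a owns its buffer, so views of it report a as their base.
a = np.arange(12, dtype=np.float64).reshape(3, 4).copy()
b = a[1:3]  # view of rows 1 and 2

offset = b.ctypes.data - a.ctypes.data
print(offset)                    # 32: one row of 4 float64 values
print(b.base is a)               # True
print(np.shares_memory(a, b))    # True

b[0, 0] = -1.0                   # writes through to a[1, 0]
print(a[1, 0])                   # -1.0
```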
Strides in depth
Strides define the byte distance between consecutive elements along each axis:
import numpy as np
a = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
print(a.strides) # (96, 32, 8)
# Axis 0: 96 bytes = 12 elements × 8 bytes (jump one "plane")
# Axis 1: 32 bytes = 4 elements × 8 bytes (jump one row)
# Axis 2: 8 bytes = 1 element × 8 bytes (jump one column)
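To make the stride arithmetic concrete, the sketch below computes the byte offset of a single element by hand. `byte_offset` is a hypothetical helper written for this illustration, not a NumPy function, and the flat-position check assumes a C-contiguous array.

```python
import numpy as np

def byte_offset(arr, index):
    """Byte distance from the first element to arr[index], derived from strides."""
    return sum(i * s for i, s in zip(index, arr.strides))

a = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
offset = byte_offset(a, (1, 2, 3))       # 1*96 + 2*32 + 3*8
flat_position = offset // a.itemsize     # position in the underlying 1-D buffer

print(offset)                                  # 184
print(a.ravel()[flat_position] == a[1, 2, 3])  # True
```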
Negative strides (reversed views)
b = a[::-1]
print(b.strides) # (-96, 32, 8) — negative stride along axis 0
# b's data pointer (b.ctypes.data) refers to the last plane; each step along axis 0 moves backward
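A short check of where the reversed view starts, as a sketch that assumes the same 2×3×4 float64 array as above. It compares the two data pointers and confirms that no data was copied.

```python
import numpy as np

a = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
b = a[::-1]

start_shift = b.ctypes.data - a.ctypes.data
print(start_shift)                 # 96: b starts at a's last plane
print(b.strides)                   # (-96, 32, 8)
print(np.array_equal(b[0], a[1]))  # True
print(np.shares_memory(a, b))      # True, no data was copied
```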
Zero strides (broadcasting)
c = np.broadcast_to(np.array([1, 2, 3], dtype=np.int64), (1000, 3)) # explicit int64 so the stride is 8 on every platform
print(c.strides) # (0, 8) — zero stride along axis 0
# The same 3 elements are "reused" for all 1000 rows
print(c.flags.writeable) # False — writing would corrupt shared data
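The sketch below illustrates what the zero stride means in practice: the view reports a large logical size while only one row exists in memory, and writing requires an explicit copy. The `dtype=np.int64` is an assumption made here so the byte counts are the same on every platform.

```python
import numpy as np

row = np.array([1, 2, 3], dtype=np.int64)
c = np.broadcast_to(row, (1000, 3))

print(c.nbytes)      # 24000: the logical size of the view
print(row.nbytes)    # 24: the bytes that actually exist

try:
    c[0, 0] = 5
    write_failed = False
except ValueError:
    write_failed = True   # the read-only view refuses the assignment
print(write_failed)  # True

d = c.copy()         # materializes all 1000 rows into a writable array
d[0, 0] = 5
print(row[0])        # 1: the source row is untouched
```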
Contiguity flags
NumPy tracks two contiguity states:
- C-contiguous (row-major): elements in the last axis are adjacent in memory.
- F-contiguous (column-major, Fortran): elements in the first axis are adjacent.
a = np.arange(12).reshape(3, 4)
print(a.flags['C_CONTIGUOUS']) # True
print(a.flags['F_CONTIGUOUS']) # False
b = np.asfortranarray(a)
print(b.flags['F_CONTIGUOUS']) # True
# A 1D array is both C and F contiguous
c = np.arange(10)
print(c.flags['C_CONTIGUOUS'], c.flags['F_CONTIGUOUS']) # True True
Why it matters: many NumPy operations and external libraries (BLAS, LAPACK, cuDNN) expect a specific memory layout. Passing an array with a different layout usually forces a copy into the expected layout, which costs time and memory.
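One way to see when a layout conversion copies is to compare buffers before and after `np.ascontiguousarray`. This is a minimal sketch; whether a given library copies internally depends on that library.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
t = a.T                               # transposed view, same buffer

print(t.flags['C_CONTIGUOUS'])        # False
print(t.flags['F_CONTIGUOUS'])        # True

fixed = np.ascontiguousarray(t)       # copies because t is not C-contiguous
print(np.shares_memory(t, fixed))     # False

same = np.ascontiguousarray(a)        # already C-contiguous, so no copy
print(np.shares_memory(a, same))      # True
```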
The Python buffer protocol
NumPy arrays implement Python’s buffer protocol, enabling zero-copy access from other libraries:
a = np.arange(12, dtype=np.int32)
# memoryview — Python built-in zero-copy view
mv = memoryview(a)
print(mv.format) # 'i' (int32)
print(mv.shape) # (12,)
print(mv.strides) # (4,)
# Modify through memoryview — changes a
mv[0] = 99
print(a[0]) # 99
Libraries like Pillow, PyTorch, TensorFlow, and Arrow use the buffer protocol or closely related interfaces (such as `__array_interface__` and DLPack) to exchange data with NumPy without copying.
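The protocol also works in the other direction: NumPy can wrap any object that exports a buffer. The sketch below uses a standard-library `bytearray` as the exporter, chosen here only because it needs no third-party dependency.

```python
import sys
import numpy as np

buf = bytearray(16)                       # 16 zero bytes in a writable buffer
arr = np.frombuffer(buf, dtype=np.int32)  # 4 int32 values, no copy

arr[0] = 7                                # writes into buf
first = int.from_bytes(buf[:4], sys.byteorder)
print(first)                              # 7
print(arr.flags.writeable)                # True, since bytearray is mutable
```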
Memory-mapped arrays
np.memmap creates an array backed by a file on disk:
# Create a 1 GB memory-mapped array
mmap = np.memmap('big_data.dat', dtype='float64', mode='w+', shape=(125_000_000,))
mmap[:1000] = np.random.randn(1000)
mmap.flush() # write pending changes to disk explicitly
del mmap # release the mapping (deletion also flushes)
# Reopen — OS pages data in on demand
mmap = np.memmap('big_data.dat', dtype='float64', mode='r+', shape=(125_000_000,))
chunk = mmap[500:600] # lazy view; reading it pages in only the OS pages it touches (typically 4 KB each), not 1 GB
Memory maps are views into file-backed memory. The OS virtual memory system handles paging, making it possible to work with datasets larger than physical RAM.
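A common pattern built on this is to process a file in fixed-size chunks so that only a small window is resident at a time. The sketch below is illustrative: `chunked_sum`, the file name, and the chunk size are assumptions, and the file is deliberately tiny so the example runs quickly.

```python
import os
import tempfile
import numpy as np

def chunked_sum(path, n_items, chunk=1000, dtype=np.float64):
    """Sum a file-backed array one chunk at a time (hypothetical helper)."""
    mm = np.memmap(path, dtype=dtype, mode="r", shape=(n_items,))
    total = 0.0
    for start in range(0, n_items, chunk):
        # Each slice is a view; only the pages it touches are read from disk.
        total += float(mm[start:start + chunk].sum())
    del mm  # release the mapping
    return total

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.dat")
    out = np.memmap(path, dtype=np.float64, mode="w+", shape=(10_000,))
    out[:] = np.arange(10_000)
    out.flush()
    del out
    total = chunked_sum(path, 10_000)

print(total)  # 49995000.0
```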
Slicing memory-mapped arrays
Slices of memmaps return views (still memory-mapped):
subset = mmap[1000:2000]
print(type(subset)) # numpy.memmap
print(subset.base is mmap) # True (or a chain of bases)
Fancy indexing returns a regular ndarray (copied into RAM):
selected = mmap[[0, 500, 999]]
print(type(selected)) # numpy.ndarray — no longer memory-mapped
Zero-copy interop patterns
NumPy → PyTorch
import torch
a = np.random.randn(1000, 1000)
t = torch.from_numpy(a) # zero-copy — shares memory
t[0, 0] = 999
print(a[0, 0]) # 999.0 — same memory
# PyTorch → NumPy
t2 = torch.randn(100)
a2 = t2.numpy() # zero-copy (CPU tensors only)
NumPy → Pandas
import pandas as pd
a = np.random.randn(1000)
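s = pd.Series(a, copy=False) # can share memory with a; confirm with np.shares_memory(a, s.to_numpy())
# Note: whether memory is shared depends on the pandas version, the dtype backend (NumPy vs. Arrow), and Copy-on-Write settings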
NumPy → bytes (for network transfer)
a = np.random.randn(1000).astype(np.float32)
raw = a.tobytes() # copy to bytes
b = np.frombuffer(raw, dtype=np.float32) # read-only view into raw bytes
print(np.shares_memory(a, b)) # False — tobytes creates a copy
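Raw bytes carry no dtype or shape, so a receiver needs that information from somewhere else. The sketch below shows one possible round trip; the `meta` dictionary is a hypothetical format invented for this example, not a NumPy or network standard.

```python
import numpy as np

a = np.random.randn(4, 250).astype(np.float32)

# Sender side: the raw bytes carry no dtype or shape, so send those alongside.
payload = a.tobytes()
meta = {"dtype": a.dtype.str, "shape": a.shape}   # hypothetical metadata format

# Receiver side: rebuild a read-only view over the received bytes.
restored = np.frombuffer(payload, dtype=np.dtype(meta["dtype"])).reshape(meta["shape"])

print(np.array_equal(a, restored))     # True
print(restored.flags.writeable)        # False, bytes objects are immutable
editable = restored.copy()             # copy only if you need to modify it
```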
np.lib.stride_tricks.as_strided — advanced views
For expert use, as_strided creates views with arbitrary shapes and strides:
from numpy.lib.stride_tricks import as_strided
# Sliding window view — zero-copy
a = np.arange(10, dtype=np.float64)
window_size = 4
n_windows = len(a) - window_size + 1
windows = as_strided(
a,
shape=(n_windows, window_size),
strides=(a.strides[0], a.strides[0])
)
print(windows)
# [[0. 1. 2. 3.]
#  [1. 2. 3. 4.]
#  [2. 3. 4. 5.]
#  ...
#  [6. 7. 8. 9.]]
NumPy 1.20+ provides a safer alternative:
from numpy.lib.stride_tricks import sliding_window_view
windows = sliding_window_view(a, window_shape=4)
Warning: as_strided does not bounds-check. Invalid strides read garbage memory or crash Python. Prefer sliding_window_view when possible.
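When `as_strided` is unavoidable, passing `writeable=False` at least prevents accidental writes to elements that several windows share. The sketch below compares that against `sliding_window_view` and assumes NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

a = np.arange(10, dtype=np.float64)
window_size = 4
n_windows = len(a) - window_size + 1

# writeable=False guards against accidental writes to overlapping elements.
manual = as_strided(
    a,
    shape=(n_windows, window_size),
    strides=(a.strides[0], a.strides[0]),
    writeable=False,
)
safe = sliding_window_view(a, window_shape=window_size)

print(np.array_equal(manual, safe))   # True
rolling_mean = safe.mean(axis=1)
print(rolling_mean)                   # [1.5 2.5 3.5 4.5 5.5 6.5 7.5]
```

The rolling mean here is only a usage example; any reduction along the last axis works the same way.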
Diagnosing memory issues
def array_info(a):
    """Print diagnostic info about an array's memory."""
    print(f"shape: {a.shape}")
    print(f"strides: {a.strides}")
    print(f"dtype: {a.dtype} ({a.dtype.itemsize} bytes)")
    print(f"data ptr: {a.ctypes.data}")
    print(f"C-contiguous: {a.flags['C_CONTIGUOUS']}")
    print(f"F-contiguous: {a.flags['F_CONTIGUOUS']}")
    print(f"writeable: {a.flags['WRITEABLE']}")
    print(f"owns data: {a.flags['OWNDATA']}")
    print(f"nbytes: {a.nbytes:,}")
    if a.base is not None:
        # The base may be a non-array buffer owner (bytes, mmap), so guard the attribute.
        print(f"base shape: {getattr(a.base, 'shape', type(a.base).__name__)}")
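For the narrower question of whether two arrays overlap in memory, `np.shares_memory` gives a direct answer without reading pointers. A minimal sketch:

```python
import numpy as np

a = np.arange(10)
view = a[2:5]            # basic slicing: a view
fancy = a[[2, 3, 4]]     # fancy indexing: a copy

print(np.shares_memory(a, view))    # True
print(np.shares_memory(a, fancy))   # False
print(view.flags['OWNDATA'])        # False
```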
Reference counting and memory lifetime
An array's data buffer is freed once neither the owning array nor any view still references it. In CPython this happens as soon as the last reference is dropped:
a = np.arange(1_000_000) # allocates ~8 MB
b = a[::2] # view — a's buffer is kept alive
del a # buffer NOT freed — b still references it
del b # NOW the buffer is freed
This means a small view can keep a huge parent array alive. If you only need a small subset, copy it and delete the parent:
a = np.arange(10_000_000)
subset = a[:100].copy() # independent 800-byte array
del a # 80 MB freed immediately
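The lifetime rule can be observed with a weak reference, which watches the parent array without keeping it alive. This is a sketch that relies on CPython's immediate reference counting; other interpreters may release the buffer later.

```python
import weakref
import numpy as np

a = np.arange(1_000_000)
parent_ref = weakref.ref(a)      # observes a without keeping it alive
b = a[::2]                       # view; b.base holds a reference to a

del a
alive_after_del_a = parent_ref() is not None
print(alive_after_del_a)         # True: b.base still references the parent

del b
freed_after_del_b = parent_ref() is None
print(freed_after_del_b)         # True on CPython, where refcounts drop immediately
```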
The one thing to remember: NumPy’s view system is a zero-copy memory sharing protocol built on pointer arithmetic and strides — understanding it lets you control exactly when data is shared, when it is copied, and when it is freed.