NumPy Memory Views — Deep Dive

Explore NumPy's memory model in depth — strides, contiguity flags, buffer protocol, memory-mapped arrays, and zero-copy interop.

Technical foundation

A NumPy ndarray is a thin descriptor over a block of memory. The descriptor holds a data pointer, a dtype, a shape tuple, a strides tuple, and flags. Because the descriptor is separate from the data buffer, several descriptors can reference the same buffer with different shapes and strides, and that is what makes views possible.

The ndarray memory model

ndarray object (Python heap)
├── data pointer ─────────→ [buffer in memory]
├── shape: (3, 4)           │ 0  1  2  3 │
├── strides: (32, 8)        │ 4  5  6  7 │
├── dtype: float64           │ 8  9 10 11 │
├── flags: C_CONTIGUOUS=True
└── base: None (or parent array)

When you create a view:

b = a[1:3]  # view of rows 1-2

b (ndarray object)
├── data pointer ─────────→ [same buffer, offset by 32 bytes]
├── shape: (2, 4)
├── strides: (32, 8)        (same strides)
├── dtype: float64
└── base: a                  (reference to parent)

The key insight: b.ctypes.data equals a.ctypes.data + 32 — it points into the middle of a’s buffer.
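The offset claim can be checked directly. The sketch below builds its own 3×4 float64 array (the literal values are illustrative) and compares data pointers. It uses np.array so that a owns its buffer; an array produced by arange(...).reshape(...) is itself a view, in which case b.base would be the original 1-D array rather than a.

```python
import numpy as np

# np.array allocates a fresh buffer, so `a` owns its data
a = np.array([[0.0, 1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0, 7.0],
              [8.0, 9.0, 10.0, 11.0]])
b = a[1:3]  # basic slicing returns a view

offset = b.ctypes.data - a.ctypes.data
print(offset)                  # 32: one row of 4 float64 values
print(b.base is a)             # True
print(np.shares_memory(a, b))  # True

b[0, 0] = -1.0                 # write through the view
print(a[1, 0])                 # -1.0
```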

Strides in depth

Strides define the byte distance between consecutive elements along each axis:

import numpy as np

a = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
print(a.strides)  # (96, 32, 8)
# Axis 0: 96 bytes = 12 elements × 8 bytes (jump one "plane")
# Axis 1: 32 bytes = 4 elements × 8 bytes (jump one row)
# Axis 2: 8 bytes = 1 element × 8 bytes (jump one column)
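The stride arithmetic above amounts to a dot product between an index and the strides tuple. A minimal sketch follows; byte_offset is a hypothetical helper written for this illustration, not a NumPy function.

```python
import numpy as np

a = np.arange(24, dtype=np.float64).reshape(2, 3, 4)

def byte_offset(index, strides):
    """Byte distance from the data pointer to the element at `index`."""
    return sum(i * s for i, s in zip(index, strides))

idx = (1, 2, 3)
offset = byte_offset(idx, a.strides)  # 1*96 + 2*32 + 3*8
flat_pos = offset // a.itemsize       # position in the flat buffer

print(offset)    # 184
print(flat_pos)  # 23
print(a[idx] == a.ravel()[flat_pos])  # True
```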

Negative strides (reversed views)

b = a[::-1]
print(b.strides)  # (-96, 32, 8) — negative stride along axis 0
# b's data pointer refers to the last plane; each step along axis 0 moves backward
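To make the reversed view concrete, this sketch compares data pointers and shows that a negative stride costs the view its contiguity flag. The variable names are illustrative.

```python
import numpy as np

a = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
b = a[::-1]  # reversed along axis 0, still a view

start_offset = b.ctypes.data - a.ctypes.data
print(start_offset)             # 96: b starts at a's last plane
print(np.shares_memory(a, b))   # True
print(b.flags['C_CONTIGUOUS'])  # False

# Materializing a contiguous copy restores positive strides
c = np.ascontiguousarray(b)
print(c.strides)                # (96, 32, 8)
print(np.shares_memory(b, c))   # False
```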

Zero strides (broadcasting)

c = np.broadcast_to(np.array([1, 2, 3]), (1000, 3))
print(c.strides)  # (0, 8) with 64-bit integers: zero stride along axis 0
# The same 3 elements are "reused" for all 1000 rows
print(c.flags.writeable)  # False — writing would corrupt shared data
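The sketch below shows what the zero stride buys and what it forbids. The dtype is pinned to int64 so the byte counts are the same on every platform; that choice is an assumption of this example.

```python
import numpy as np

row = np.array([1, 2, 3], dtype=np.int64)
c = np.broadcast_to(row, (1000, 3))

print(c.strides)                 # (0, 8)
print(c.nbytes)                  # 24000: the logical size
print(np.shares_memory(row, c))  # True: only row's 24 bytes are stored

try:
    c[0, 0] = 99                 # the broadcast view is read-only
except ValueError as exc:
    print("write rejected:", exc)

d = c.copy()                     # materializes all 1000 rows
d[0, 0] = 99
print(d.strides)                 # (24, 8)
print(row[0])                    # 1: the source is untouched
```

Copying is the usual way out when a writable array is needed, at the cost of allocating the full logical size.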

Contiguity flags

NumPy tracks two contiguity states:

C-contiguous (row-major): elements in the last axis are adjacent in memory.
F-contiguous (column-major, Fortran): elements in the first axis are adjacent.

a = np.arange(12).reshape(3, 4)
print(a.flags['C_CONTIGUOUS'])  # True
print(a.flags['F_CONTIGUOUS'])  # False

b = np.asfortranarray(a)
print(b.flags['F_CONTIGUOUS'])  # True

# A 1D array is both C and F contiguous
c = np.arange(10)
print(c.flags['C_CONTIGUOUS'], c.flags['F_CONTIGUOUS'])  # True True

Why it matters: many NumPy operations and external libraries (BLAS, LAPACK, cuDNN) expect a specific memory layout. Passing an array with a different layout typically triggers an implicit copy.
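One way to see where such copies come from is to inspect the flags of common views and then request a contiguous array explicitly. This is a small sketch with illustrative variable names.

```python
import numpy as np

a = np.arange(12, dtype=np.float64).reshape(3, 4)

t = a.T          # transpose: a view with swapped strides
cols = a[:, ::2] # every other column: a strided view

print(t.flags['C_CONTIGUOUS'], t.flags['F_CONTIGUOUS'])        # False True
print(cols.flags['C_CONTIGUOUS'], cols.flags['F_CONTIGUOUS'])  # False False

# ascontiguousarray copies only when the layout requires it
same = np.ascontiguousarray(a)
fixed = np.ascontiguousarray(t)
print(np.shares_memory(a, same))   # True: already C-contiguous
print(np.shares_memory(t, fixed))  # False: a copy was made
print(fixed.flags['C_CONTIGUOUS']) # True
```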

The Python buffer protocol

NumPy arrays implement Python’s buffer protocol, enabling zero-copy access from other libraries:

a = np.arange(12, dtype=np.int32)

# memoryview — Python built-in zero-copy view
mv = memoryview(a)
print(mv.format)  # 'i' (int32)
print(mv.shape)   # (12,)
print(mv.strides) # (4,)

# Modify through memoryview — changes a
mv[0] = 99
print(a[0])  # 99

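Libraries such as Pillow, PyTorch, TensorFlow, and Arrow can exchange data with NumPy through the buffer protocol or NumPy's closely related array interface, often without copying. Whether a given call actually shares memory depends on the library and the function used.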
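The protocol also works in the other direction: NumPy can wrap any object that exports a buffer. The sketch below uses a plain bytearray as the exporter, chosen here only because it needs no third-party library.

```python
import sys
import numpy as np

buf = bytearray(16)                       # 16 zero bytes owned by Python
arr = np.frombuffer(buf, dtype=np.int32)  # 4 int32 values viewing buf

arr[0] = 1                                # writes land in the bytearray
first_word = int.from_bytes(buf[:4], sys.byteorder)
print(first_word)                         # 1
print(arr.flags['OWNDATA'])               # False: buf owns the memory

mv = memoryview(arr)                      # re-export through the protocol
print(mv.nbytes)                          # 16
print(mv.readonly)                        # False
```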

Memory-mapped arrays

np.memmap creates an array backed by a file on disk:

# Create a 1 GB memory-mapped array
mmap = np.memmap('big_data.dat', dtype='float64', mode='w+', shape=(125_000_000,))
mmap[:1000] = np.random.randn(1000)
mmap.flush()  # write pending changes to disk
del mmap      # release the mapping

# Reopen — OS pages data in on demand
mmap = np.memmap('big_data.dat', dtype='float64', mode='r+', shape=(125_000_000,))
chunk = mmap[500:600]  # lazy view: reading it pages in only the OS pages that hold these 800 bytes, not 1 GB

Memory maps are views into file-backed memory. The OS virtual memory system handles paging, making it possible to work with datasets larger than physical RAM.
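For experimenting without creating a 1 GB file, the same round trip can be sketched on a small temporary file. The file name demo.dat and the 1000-element size are assumptions of this example.

```python
import os
import tempfile
import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.dat")  # hypothetical file name

    mm = np.memmap(path, dtype=np.float64, mode="w+", shape=(1000,))
    mm[:10] = np.arange(10)
    mm.flush()  # write dirty pages to disk explicitly
    del mm      # drop the mapping

    # Reopen read-only; data is paged in when it is accessed
    ro = np.memmap(path, dtype=np.float64, mode="r", shape=(1000,))
    first = np.array(ro[:10])  # copy the slice into ordinary RAM
    writeable = ro.flags['WRITEABLE']
    file_size = os.path.getsize(path)
    del ro      # release the mapping before the directory is removed

print(first)      # [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
print(writeable)  # False
print(file_size)  # 8000
```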

Slicing memory-mapped arrays

Slices of memmaps return views (still memory-mapped):

subset = mmap[1000:2000]
print(type(subset))  # numpy.memmap
print(subset.base is mmap)  # True (or a chain of bases)

Fancy indexing returns a regular ndarray (copied into RAM):

selected = mmap[[0, 500, 999]]
print(type(selected))  # numpy.ndarray — no longer memory-mapped
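The practical difference between the two indexing modes is whether writes reach the file. A self-contained sketch on a small temporary file follows; the file name and sizes are illustrative.

```python
import os
import tempfile
import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.dat")  # hypothetical file name
    mm = np.memmap(path, dtype=np.float64, mode="w+", shape=(2000,))

    view = mm[1000:2000]        # basic slice: still file-backed
    picked = mm[[0, 500, 999]]  # fancy indexing: copied into RAM

    view_shares = bool(np.shares_memory(mm, view))
    picked_shares = bool(np.shares_memory(mm, picked))

    view[0] = 7.0               # visible through the parent mapping
    written_through = bool(mm[1000] == 7.0)

    picked[0] = -1.0            # changes only the in-memory copy
    parent_untouched = bool(mm[0] == 0.0)

    del view, mm                # release the mapping before cleanup

print(view_shares, picked_shares)         # True False
print(written_through, parent_untouched)  # True True
```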

Zero-copy interop patterns

NumPy → PyTorch

import torch

a = np.random.randn(1000, 1000)
t = torch.from_numpy(a)        # zero-copy — shares memory
t[0, 0] = 999
print(a[0, 0])                 # 999.0 — same memory

# PyTorch → NumPy
t2 = torch.randn(100)
a2 = t2.numpy()                # zero-copy (CPU tensors only)

NumPy → Pandas

import pandas as pd

a = np.random.randn(1000)
s = pd.Series(a, copy=False)   # shares memory when the Series keeps a NumPy-backed dtype
# Note: with Arrow-backed dtypes (available in pandas 2.x) memory may not be shared

NumPy → bytes (for network transfer)

a = np.random.randn(1000).astype(np.float32)
raw = a.tobytes()              # copy to bytes
b = np.frombuffer(raw, dtype=np.float32)  # view into raw bytes
print(np.shares_memory(a, b))  # False — tobytes creates a copy
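Two details of the bytes round trip are easy to miss: tobytes drops the shape and dtype, and an array built over a bytes object is read-only. The sketch below assumes the receiver already knows the shape and dtype, which a real protocol would have to transmit separately.

```python
import numpy as np

a = np.arange(6, dtype=np.float32).reshape(2, 3)
payload = a.tobytes()  # raw bytes only, no shape or dtype
print(len(payload))    # 24

# The receiver must supply dtype and shape itself
b = np.frombuffer(payload, dtype=np.float32).reshape(a.shape)
print(np.array_equal(a, b))  # True
print(b.flags['WRITEABLE'])  # False: bytes objects are immutable

c = b.copy()                 # take a copy when a writable array is needed
c[0, 0] = 42.0
print(b[0, 0])               # 0.0: the view over payload is unchanged
```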

np.lib.stride_tricks.as_strided: advanced views

For expert use, as_strided creates views with arbitrary shapes and strides:

from numpy.lib.stride_tricks import as_strided

# Sliding window view — zero-copy
a = np.arange(10, dtype=np.float64)
window_size = 4
n_windows = len(a) - window_size + 1
windows = as_strided(
    a,
    shape=(n_windows, window_size),
    strides=(a.strides[0], a.strides[0]),
    writeable=False,  # rows overlap in memory, so writes would alias
)
print(windows)
# [[0. 1. 2. 3.]
#  [1. 2. 3. 4.]
#  [2. 3. 4. 5.]
#  ...
#  [6. 7. 8. 9.]]

NumPy 1.20+ provides a safer alternative:

from numpy.lib.stride_tricks import sliding_window_view
windows = sliding_window_view(a, window_shape=4)

Warning: as_strided does not bounds-check. Invalid strides read garbage memory or crash Python. Prefer sliding_window_view when possible.
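As an illustration of the safer route, the sketch below computes a moving average over sliding_window_view and checks that the result is still a read-only view of the input. The moving average is an example application chosen here, not something the API prescribes.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy 1.20+

a = np.arange(10, dtype=np.float64)
windows = sliding_window_view(a, window_shape=4)

print(windows.shape)                 # (7, 4)
print(windows.strides)               # (8, 8): rows overlap in memory
print(windows.flags['WRITEABLE'])    # False by default
print(np.shares_memory(a, windows))  # True: no data was copied

means = windows.mean(axis=1)         # the reduction allocates a new array
print(means)                         # [1.5 2.5 3.5 4.5 5.5 6.5 7.5]
```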

Diagnosing memory issues

def array_info(a):
    """Print diagnostic info about an array's memory."""
    print(f"shape: {a.shape}")
    print(f"strides: {a.strides}")
    print(f"dtype: {a.dtype} ({a.dtype.itemsize} bytes)")
    print(f"data ptr: {a.ctypes.data}")
    print(f"C-contiguous: {a.flags['C_CONTIGUOUS']}")
    print(f"F-contiguous: {a.flags['F_CONTIGUOUS']}")
    print(f"writeable: {a.flags['WRITEABLE']}")
    print(f"owns data: {a.flags['OWNDATA']}")
    print(f"nbytes: {a.nbytes:,}")
    # base can be a non-array object (bytes, mmap), so check before using .shape
    if isinstance(a.base, np.ndarray):
        print(f"base shape: {a.base.shape}")
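When a small array turns out to be a view, it helps to find out which object actually holds the memory. The sketch below walks the base chain; root_owner is a hypothetical helper written for this illustration, not part of NumPy.

```python
import numpy as np

def root_owner(arr):
    """Follow .base until reaching the object that holds the memory."""
    obj = arr
    while isinstance(obj, np.ndarray) and obj.base is not None:
        obj = obj.base
    return obj

big = np.zeros(1_000_000)  # owns about 8 MB
small = big[::1000][:10]   # a view of a view, 10 elements

owner = root_owner(small)
print(owner is big)                 # True
print(small.nbytes, owner.nbytes)   # 80 8000000
print(np.shares_memory(small, big)) # True
```

The gap between the two nbytes values is the memory that the small view is keeping alive.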

Reference counting and memory lifetime

An array's data buffer is freed only once the owning array and every view of it have been released (in CPython, when their reference counts drop to zero):

a = np.arange(1_000_000)  # allocates ~8 MB
b = a[::2]                  # view — a's buffer is kept alive
del a                       # buffer NOT freed — b still references it
del b                       # NOW the buffer is freed

This means a small view can keep a huge parent array alive. If you only need a small subset, copy it and delete the parent:

a = np.arange(10_000_000)
subset = a[:100].copy()  # independent 800-byte array
del a                     # 80 MB freed immediately
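The lifetime rule can be observed with a weak reference, which does not keep the array alive itself. This sketch relies on CPython's reference counting for the timing; other interpreters may release the buffer later.

```python
import weakref
import numpy as np

a = np.zeros(1_000_000)  # owns about 8 MB
ref = weakref.ref(a)     # observes `a` without keeping it alive
b = a[:10]               # tiny view; b.base holds a reference to a

del a
alive_while_view_exists = ref() is not None
print(alive_while_view_exists)  # True: the view keeps the parent alive

del b
freed_after_view_deleted = ref() is None
print(freed_after_view_deleted) # True on CPython: the buffer is released
```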

The one thing to remember: NumPy’s view system is a zero-copy memory sharing protocol built on pointer arithmetic and strides — understanding it lets you control exactly when data is shared, when it is copied, and when it is freed.
