NumPy Structured Arrays — Deep Dive

Master NumPy structured dtypes — memory layout, alignment, nested records, memory-mapped I/O, and C interop patterns.

Technical foundation

A NumPy structured array is a contiguous block of memory where each element has a fixed-size, multi-field layout defined by a structured dtype. Unlike Python objects, there is no per-element overhead — each record occupies exactly dtype.itemsize bytes, packed sequentially.

Memory layout and alignment

By default, NumPy packs fields tightly with no padding:

import numpy as np

dt = np.dtype([('x', 'i1'), ('y', 'f8')])
print(dt.itemsize)  # 9 bytes (1 + 8, no padding)

This differs from C compilers, which typically align double to 8-byte boundaries. To match C struct layout (necessary for interop), use align=True:

dt_aligned = np.dtype([('x', 'i1'), ('y', 'f8')], align=True)
print(dt_aligned.itemsize)  # 16 bytes (1 + 7 padding + 8)

Inspect exact offsets:

for name in dt_aligned.names:
    field_dtype, offset = dt_aligned.fields[name]
    print(f"{name}: dtype={field_dtype}, offset={offset}")
# x: dtype=int8, offset=0
# y: dtype=float64, offset=8

Nested structured dtypes

Fields can themselves be structured:

point_dt = np.dtype([('x', 'f4'), ('y', 'f4'), ('z', 'f4')])
particle_dt = np.dtype([
    ('id', 'i4'),
    ('position', point_dt),
    ('velocity', point_dt),
    ('mass', 'f8'),
])

particles = np.zeros(1000, dtype=particle_dt)
particles['position']['x'] = np.random.randn(1000)

Nested access returns views all the way down — no copies until you explicitly request one.
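The view behavior can be checked directly. The following is a minimal sketch that reuses the dtypes above with a smaller array; the variable names (xs, detached, and so on) are illustrative and not part of any API.

```python
import numpy as np

point_dt = np.dtype([('x', 'f4'), ('y', 'f4'), ('z', 'f4')])
particle_dt = np.dtype([
    ('id', 'i4'),
    ('position', point_dt),
    ('velocity', point_dt),
    ('mass', 'f8'),
])

particles = np.zeros(4, dtype=particle_dt)

# Two levels of field access still yield a view into the original buffer.
xs = particles['position']['x']
xs[:] = [1.0, 2.0, 3.0, 4.0]

shares = np.shares_memory(xs, particles)
first_x = float(particles[0]['position']['x'])

# An explicit copy detaches from the parent array.
detached = particles['position'].copy()
detached['x'][0] = 99.0
still_one = float(particles['position']['x'][0])

print(shares, first_x, still_one)
```

Because xs is a strided view, its stride equals the full record size rather than the size of a single float32.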

Sub-arrays in dtypes

A field can be an array itself:

dt = np.dtype([('label', 'U10'), ('readings', 'f4', (5,))])
data = np.zeros(3, dtype=dt)
data['readings'][0] = [1.1, 2.2, 3.3, 4.4, 5.5]
print(data['readings'].shape)  # (3, 5) — a regular 2D float array

This is powerful for fixed-size vector fields (RGB pixels, 3D coordinates, sensor channels) without needing separate arrays.
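As an illustration of the RGB case, here is a small sketch with a hypothetical pixel dtype (the field names pos and rgb are assumptions made for this example). Each sub-array field behaves like an ordinary 2D array once selected.

```python
import numpy as np

# Hypothetical record: an (x, y) position plus an RGB triple.
pixel_dt = np.dtype([('pos', 'i4', (2,)), ('rgb', 'u1', (3,))])
pixels = np.zeros(4, dtype=pixel_dt)

# Selecting a sub-array field gives a (n_records, n_elements) view.
pixels['rgb'][:] = [255, 128, 0]      # broadcast one color to every record
pixels['pos'][:, 0] = np.arange(4)    # set the x coordinate of each pixel

rgb_shape = pixels['rgb'].shape
mean_rgb = pixels['rgb'].mean(axis=0)

print(rgb_shape, mean_rgb)
```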

Memory-mapped structured arrays

For datasets too large for RAM, combine structured dtypes with memory mapping:

# Write
dt = np.dtype([('timestamp', 'f8'), ('sensor_id', 'i4'), ('value', 'f8')])
data = np.memmap('sensors.dat', dtype=dt, mode='w+', shape=(10_000_000,))
data['timestamp'][:100] = np.arange(100, dtype='f8')
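data.flush()  # write dirty pages to disk explicitly
del data      # release the mapping

# Read: pages are loaded lazily as they are touched
mapped = np.memmap('sensors.dat', dtype=dt, mode='r')
first = mapped[:100]  # slicing returns a view; only these pages are read
recent = mapped[mapped['timestamp'] > 50]  # scans every record and copies the matches into RAM

Slicing keeps the memory footprint small, because the OS pages in only the blocks you touch. A boolean mask is different: evaluating it reads the whole timestamp field, which is interleaved with the other fields and therefore touches every page, and the selected records are copied into memory. To keep the footprint bounded on multi-gigabyte binary logs, process the file in fixed-size slices.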
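One way to keep memory use bounded when filtering a memory-mapped file is to walk it in slices. The sketch below writes a small temporary file so it is self-contained; the record count, chunk_size, and the temporary path are assumptions chosen for illustration, not values from a real workload.

```python
import os
import tempfile
import numpy as np

dt = np.dtype([('timestamp', 'f8'), ('sensor_id', 'i4'), ('value', 'f8')])
n_records = 10_000
chunk_size = 1_024  # hypothetical; tune to the memory you can spare

path = os.path.join(tempfile.mkdtemp(), 'sensors.dat')

# Write a small sample file.
writer = np.memmap(path, dtype=dt, mode='w+', shape=(n_records,))
writer['timestamp'] = np.arange(n_records, dtype='f8')
writer['value'] = 1.0
writer.flush()
del writer

# Read it back one slice at a time; each slice is a view over the mapping.
mapped = np.memmap(path, dtype=dt, mode='r')
total = 0.0
matches = 0
for start in range(0, len(mapped), chunk_size):
    chunk = mapped[start:start + chunk_size]
    mask = chunk['timestamp'] > 50          # mask is at most chunk_size long
    total += float(chunk['value'][mask].sum())
    matches += int(mask.sum())

print(matches, total)
```

Only the running totals and one chunk-sized mask live in Python memory at any time; the OS decides which file pages stay cached.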

C struct interop

Structured arrays map directly to C structs, enabling zero-copy data exchange:

// C side
typedef struct {
    int32_t id;
    double x;
    double y;
} Point;

# Python side — must match C layout exactly
point_dt = np.dtype([('id', '<i4'), ('x', '<f8'), ('y', '<f8')], align=True)

# Read binary file written by C program
points = np.fromfile('points.bin', dtype=point_dt)

# Pass to C function via ctypes
import ctypes
lib = ctypes.CDLL('./libpoints.so')
lib.process_points(
    points.ctypes.data_as(ctypes.c_void_p),
    ctypes.c_int(len(points))
)

Key gotchas for C interop:

Byte order must match (< for little-endian on x86).
Alignment must match (align=True if the C compiler uses default alignment).
NumPy's U fields store UCS-4 text (4 bytes per character), which does not match C's char[]; use S (fixed-width byte strings) for C interop.
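A cheap sanity check for the alignment gotcha is to describe the same struct with ctypes and compare sizes and offsets against the NumPy dtype. This is a sketch under the assumption that ctypes follows the platform's default C struct layout; the Point class here is written for the example and mirrors the C struct above.

```python
import ctypes
import numpy as np

class Point(ctypes.Structure):
    # Mirrors the C struct: int32_t id; double x; double y;
    _fields_ = [
        ('id', ctypes.c_int32),
        ('x', ctypes.c_double),
        ('y', ctypes.c_double),
    ]

point_dt = np.dtype([('id', '<i4'), ('x', '<f8'), ('y', '<f8')], align=True)

# Compare the total size and every field offset.
same_size = ctypes.sizeof(Point) == point_dt.itemsize
same_offsets = all(
    getattr(Point, name).offset == point_dt.fields[name][1]
    for name in point_dt.names
)

print(same_size, same_offsets)
```

If either check fails, the dtype does not describe the struct the C code sees, and reading or passing the buffer would silently misinterpret fields.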

Multi-field indexing

Selecting multiple fields returns a view (NumPy 1.16+):

dt = np.dtype([('a', 'i4'), ('b', 'f8'), ('c', 'f8')])
data = np.zeros(10, dtype=dt)
subset = data[['a', 'c']]  # view with fields a and c only

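In older NumPy versions, this returned a packed copy. The view keeps the original itemsize and field offsets, so it is cheap to create, but writes to it modify the original array. If you need an independent, tightly packed array, call numpy.lib.recfunctions.repack_fields on the result. Check your NumPy version if this matters.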
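The difference between the view and a packed copy shows up in the itemsize. The following sketch assumes NumPy 1.16 or later; the variable names are illustrative.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

dt = np.dtype([('a', 'i4'), ('b', 'f8'), ('c', 'f8')])
data = np.zeros(10, dtype=dt)

# The multi-field view keeps the parent's record size, gap for 'b' included.
subset = data[['a', 'c']]
view_itemsize = subset.dtype.itemsize

# Writing through the view changes the original array.
subset['a'] = 7
wrote_through = bool((data['a'] == 7).all())

# repack_fields produces a tightly packed, independent copy.
packed = rfn.repack_fields(subset)
packed_itemsize = packed.dtype.itemsize
independent = not np.shares_memory(packed, data)

print(view_itemsize, packed_itemsize, wrote_through, independent)
```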

Converting between structured and unstructured

# Import the submodule explicitly; a plain "import numpy" may not load it
from numpy.lib import recfunctions as rfn

# Structured → regular 2D array (fields are cast to a common dtype if they differ)
dt = np.dtype([('x', 'f8'), ('y', 'f8'), ('z', 'f8')])
structured = np.zeros(100, dtype=dt)
plain = rfn.structured_to_unstructured(structured)
print(plain.shape)  # (100, 3)

# Regular → structured
back = rfn.unstructured_to_structured(plain, dt)

structured_to_unstructured returns a view when possible (fields are contiguous and same type), avoiding copies.
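When the fields have different types, the conversion casts them to a common dtype and therefore has to copy. A minimal sketch with a hypothetical mixed dtype (the field names are made up for the example):

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical record with three different numeric types.
mixed_dt = np.dtype([('count', 'i4'), ('ratio', 'f4'), ('total', 'f8')])
mixed = np.zeros(3, dtype=mixed_dt)
mixed['count'] = [1, 2, 3]
mixed['ratio'] = 0.5

# Fields are promoted to a common dtype, so the result is a new array.
plain_mixed = rfn.structured_to_unstructured(mixed)
mixed_is_copy = not np.shares_memory(plain_mixed, mixed)

# With uniform, packed fields, check whether a view was returned.
uniform = np.zeros(3, dtype=[('x', 'f8'), ('y', 'f8'), ('z', 'f8')])
plain_uniform = rfn.structured_to_unstructured(uniform)
uniform_is_view = np.shares_memory(plain_uniform, uniform)

print(plain_mixed.dtype, plain_mixed.shape, mixed_is_copy, uniform_is_view)
```

Checking np.shares_memory on your own dtype is more reliable than assuming a view, since the outcome depends on the field types and layout.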

Performance: structured vs separate arrays

Structured arrays offer better cache locality when you access multiple fields per record (row-oriented access). Separate arrays win when you process one field across all records (column-oriented access).

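# Row-oriented: each record's fields sit next to each other in memory.
# structured_data and process are placeholders for your own array and function.
# A Python-level loop like this is dominated by interpreter overhead;
# the locality benefit shows up in compiled code (C, Cython, Numba).
for record in structured_data:
    process(record['x'], record['y'], record['z'])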

# Column-oriented: separate arrays win
result = x_array * 2 + y_array  # pure vectorized, one field at a time

In practice, structured arrays shine for I/O and data transport. For heavy computation, extract fields into separate arrays first.

Practical recipe: parsing a binary protocol

header_dt = np.dtype([
    ('magic', 'S4'),
    ('version', 'u2'),
    ('num_records', 'u4'),
    ('reserved', 'u2'),
])

record_dt = np.dtype([
    ('timestamp', 'u8'),
    ('channel', 'u1'),
    ('flags', 'u1'),
    ('value', 'f4'),
])

with open('protocol_dump.bin', 'rb') as f:
    header = np.frombuffer(f.read(header_dt.itemsize), dtype=header_dt)[0]
    n = int(header['num_records'])
    records = np.frombuffer(f.read(n * record_dt.itemsize), dtype=record_dt)

active = records[(records['flags'] & 0x01) != 0]  # keep records with bit 0 set
print(f"Parsed {n} records, {len(active)} active")

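No manual struct unpacking, no loops, no intermediate lists. The entire parse is two frombuffer calls. Both results are read-only views over the bytes returned by read(); call .copy() if you need to modify them. The dtypes above use native byte order, so add an explicit < or > prefix if the protocol fixes its endianness.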
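To try the recipe without a real dump file, the sketch below builds a small payload in memory and parses it back. The magic value b'DEMO', the sample records, and the little-endian byte order are assumptions made for this example; the two validation checks are additions that a real parser would likely want.

```python
import io
import numpy as np

# Same layout as above, with little-endian byte order assumed.
header_dt = np.dtype([
    ('magic', 'S4'), ('version', '<u2'),
    ('num_records', '<u4'), ('reserved', '<u2'),
])
record_dt = np.dtype([
    ('timestamp', '<u8'), ('channel', 'u1'),
    ('flags', 'u1'), ('value', '<f4'),
])

# Build a sample payload in memory instead of reading a file.
header = np.zeros(1, dtype=header_dt)
header['magic'] = b'DEMO'
header['version'] = 1
header['num_records'] = 3

records = np.zeros(3, dtype=record_dt)
records['timestamp'] = [100, 200, 300]
records['flags'] = [0x01, 0x00, 0x03]
records['value'] = [1.5, 2.5, 3.5]

stream = io.BytesIO(header.tobytes() + records.tobytes())

# Parse: header first, then exactly num_records records.
parsed_header = np.frombuffer(stream.read(header_dt.itemsize), dtype=header_dt)[0]
if parsed_header['magic'] != b'DEMO':
    raise ValueError('unexpected magic bytes')

n = int(parsed_header['num_records'])
raw = stream.read(n * record_dt.itemsize)
if len(raw) != n * record_dt.itemsize:
    raise ValueError('truncated record section')

parsed = np.frombuffer(raw, dtype=record_dt)
active = parsed[(parsed['flags'] & 0x01) != 0]

print(n, len(active), active['timestamp'].tolist())
```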

Performance considerations

Structured arrays store records contiguously (row-oriented layout). This is efficient when you access many fields per record but slower when you process one field across millions of records. For column-oriented workloads, extract individual fields into separate arrays first:

timestamps = records['timestamp'].copy()  # contiguous column
values = records['value'].copy()

# Column operations are now cache-friendly
filtered_values = values[timestamps > some_threshold]

The copy cost is paid once; subsequent vectorized operations then run over contiguous memory, which NumPy's inner loops and the CPU cache handle more efficiently than strided field views.
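The layout difference is visible in the strides. This small sketch reuses the record layout from the protocol recipe; the array length is arbitrary.

```python
import numpy as np

record_dt = np.dtype([
    ('timestamp', 'u8'), ('channel', 'u1'),
    ('flags', 'u1'), ('value', 'f4'),
])
records = np.zeros(1000, dtype=record_dt)

field_view = records['value']      # strided view into the records
column = records['value'].copy()   # contiguous float32 array

# The view steps over a whole record per element; the copy steps 4 bytes.
view_stride = field_view.strides[0]
copy_stride = column.strides[0]
view_contiguous = bool(field_view.flags['C_CONTIGUOUS'])
copy_contiguous = bool(column.flags['C_CONTIGUOUS'])

print(view_stride, copy_stride, view_contiguous, copy_contiguous)
```

A stride equal to the record size means every element access skips past the other fields, which is what the one-time copy avoids.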

The one thing to remember: Structured arrays are NumPy’s zero-copy bridge between raw binary data and typed Python access — master the dtype definition and everything else follows.
