NumPy Structured Arrays — ELI5

Regular NumPy arrays are like a drawer full of identical socks — everything is the same type. Numbers only, or text only, but never mixed. A structured array is more like a filing cabinet where each folder has different sections: a name (text), an age (number), and a salary (bigger number), all bundled together.

Think of a class roster. Each student has a name, a grade, and a student ID. In a normal array, you would need three separate arrays and keep them in sync yourself — if you sort one, you have to sort all three. With a structured array, each student is one row, and all their data travels together. Sort by grade and the names follow automatically.

You define the structure once — “the first field is a ten-character name, the second is an integer, the third is a decimal” — and NumPy packs the data tightly in memory. It is not as flexible as a pandas DataFrame, but it is much faster for large datasets because there is no Python overhead per row.

Structured arrays are popular for reading binary files, working with sensor data, and any situation where you have millions of rows of fixed-format records.

The one thing to remember: Structured arrays let NumPy hold different data types in one array — like a spreadsheet row, but stored at C speed.

pythonnumpydata-science

See Also