NumPy Structured Arrays — ELI5
Regular NumPy arrays are like a drawer full of identical socks — everything is the same type. Numbers only, or text only, but never mixed. A structured array is more like a filing cabinet where each folder has different sections: a name (text), an age (number), and a salary (bigger number), all bundled together.
Think of a class roster. Each student has a name, a grade, and a student ID. In a normal array, you would need three separate arrays and keep them in sync yourself — if you sort one, you have to sort all three. With a structured array, each student is one row, and all their data travels together. Sort by grade and the names follow automatically.
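As a minimal sketch of the roster idea, the snippet below builds a tiny structured array and sorts it by grade. The field names (`name`, `grade`, `student_id`), the field sizes, and the sample students are all made up for illustration.

```python
import numpy as np

# Hypothetical roster layout: field names and sizes are illustrative assumptions.
roster_dtype = np.dtype([("name", "U10"), ("grade", "i4"), ("student_id", "i4")])

roster = np.array(
    [("Ada", 91, 1003), ("Ben", 78, 1001), ("Cleo", 85, 1002)],
    dtype=roster_dtype,
)

# Sort whole records by the "grade" field; each student's data moves as one unit.
by_grade = np.sort(roster, order="grade")

print(by_grade["name"])        # ['Ben' 'Cleo' 'Ada']
print(by_grade["student_id"])  # [1001 1002 1003]
```

Selecting a field by name, as in `by_grade["name"]`, gives back an ordinary one-type array for that column, so no separate arrays have to be kept in sync by hand.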
You define the structure once (for example, “the first field is a ten-character name, the second is an integer, the third is a decimal”) and NumPy packs each record's fields side by side in one contiguous block of memory. It is not as flexible as a pandas DataFrame, but for large datasets it is far more compact and faster than a Python list of tuples or dictionaries, because there is no separate Python object for every row and value.
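Here is one way that "define the structure once" step might look. This is a sketch: the field names and the exact type codes (`U10`, `i4`, `f8`) are assumptions chosen to match the name, integer, and decimal description, not the only possible choices.

```python
import numpy as np

# Assumed layout: 10-character text, 32-bit integer, 64-bit float.
person_dtype = np.dtype([("name", "U10"), ("age", "i4"), ("salary", "f8")])

# Three empty records (zeros and empty strings), then fill some in.
people = np.zeros(3, dtype=person_dtype)
people[0] = ("Dana", 34, 52000.0)
people["age"][1] = 41                      # write into a single field
people[2] = ("Maximilianus", 29, 61000.0)  # too long for U10, so it gets cut off

print(people[2]["name"])      # Maximilian
print(person_dtype.itemsize)  # 52 bytes per record: 40 (name) + 4 (age) + 8 (salary)
print(people.nbytes)          # 156 bytes for all three records
```

The fixed size is the trade-off: every record takes exactly the same number of bytes, which keeps the layout predictable, but text longer than the declared width is silently truncated.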
Structured arrays are popular for reading binary files, working with sensor data, and any situation where you have millions of rows of fixed-format records.
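To make the binary-file case concrete, the sketch below fakes a small block of sensor records in memory and then reads it back with a matching dtype. The record format (a 4-byte timestamp, a 2-byte sensor ID, and a 4-byte float reading, little-endian with no padding) is a hypothetical example, not a real device format.

```python
import struct

import numpy as np

# Hypothetical record layout: little-endian, no padding between fields.
reading_dtype = np.dtype(
    [("timestamp", "<u4"), ("sensor_id", "<u2"), ("value", "<f4")]
)

# Stand-in for bytes that would normally come from a file or a device.
samples = [(1000, 1, 21.5), (1001, 2, 19.25), (1002, 1, 22.0)]
raw = b"".join(struct.pack("<IHf", t, s, v) for t, s, v in samples)

# Interpret the raw bytes as records without copying or looping in Python.
readings = np.frombuffer(raw, dtype=reading_dtype)

# Filter rows with a condition on one field; whole records come back.
warm = readings[readings["value"] > 20.0]
print(len(readings))      # 3
print(warm["timestamp"])  # [1000 1002]
```

For data that lives in an actual file, `np.fromfile(path, dtype=reading_dtype)` does the same kind of interpretation directly from disk, provided the dtype matches the file's byte layout exactly.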
The one thing to remember: Structured arrays let NumPy hold different data types in one array, like a spreadsheet row, but stored in a compact, fixed layout that NumPy can process at C speed.
See Also
- Python Bokeh: Get an intuitive feel for Bokeh so Python behavior stops feeling unpredictable.
- Python Numpy Advanced Indexing: How to cherry-pick exactly the data you want from a NumPy array using lists, masks, and fancy tricks.
- Python Numpy Broadcasting Rules: How NumPy magically makes different-sized arrays work together without you writing any loops.
- Python Numpy Einsum: One tiny function that replaces dozens of NumPy operations; once you learn its shorthand, array math becomes a breeze.
- Python Numpy Fft Spectral: How NumPy breaks apart a signal into its hidden frequencies, like separating a chord into individual notes.