NumPy FFT & Spectral Analysis — Core Concepts

Why this topic matters

Spectral analysis reveals structure in data that is invisible in the time domain. Periodic trends, noise characteristics, and dominant cycles all become obvious once you look at the frequency spectrum. NumPy’s np.fft module provides the tools to compute and interpret these transforms efficiently.

What the FFT actually computes

Given N data points, np.fft.fft() returns N complex numbers. Each complex number represents a frequency component:

  • Magnitude (absolute value) = how strong that frequency is.
  • Phase (angle) = where in its cycle that frequency starts.

import numpy as np

# Create a signal: 5 Hz sine + 12 Hz sine
t = np.linspace(0, 1, 1000, endpoint=False)  # 1 second, 1000 samples
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)

# Compute FFT
spectrum = np.fft.fft(signal)
freqs = np.fft.fftfreq(len(signal), d=1/1000)  # frequency axis

# Power spectrum — magnitude squared
power = np.abs(spectrum) ** 2
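
To make the magnitude and phase idea concrete, here is a minimal sketch that pulls the two sine components back out of the spectrum. It rebuilds the same test signal so it runs on its own. The names fs, pos, top, and the peak_* variables are illustrative, and the 2/N scaling assumes pure sinusoids that sit exactly on a frequency bin.

```python
import numpy as np

fs = 1000                      # sampling rate in Hz (same as the example above)
t = np.arange(fs) / fs         # 1 second of samples
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)

spectrum = np.fft.fft(signal)
freqs = np.fft.fftfreq(len(signal), d=1 / fs)

# Keep only the strictly positive frequency bins
pos = freqs > 0

# For a real sine, half the energy sits in the negative-frequency bin,
# so scale |X[k]| by 2/N to recover the sine's amplitude
amplitude = 2 * np.abs(spectrum[pos]) / len(signal)

# Indices of the two strongest bins, sorted by frequency
top = np.sort(np.argsort(amplitude)[-2:])

peak_freqs = freqs[pos][top]                 # 5 Hz and 12 Hz
peak_amps = amplitude[top]                   # about 1.0 and 0.5
peak_phases = np.angle(spectrum[pos][top])   # about -pi/2 for a sine

print(peak_freqs, peak_amps, peak_phases)
```

The phase comes out near -pi/2 because the FFT measures phase relative to a cosine, and a sine is a cosine delayed by a quarter cycle.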

Interpreting the output

The FFT output has a specific layout:

Index range        Frequency meaning
0                  DC component (sum of all samples, i.e. N times the mean)
1 to N/2-1         Positive frequencies (low to high)
N/2                Nyquist frequency (present only when N is even)
N/2+1 to N-1       Negative frequencies (mirror of positive)

For real-valued signals, the negative-frequency bins are the complex conjugates of the positive ones, so they carry no extra information. Use np.fft.rfft() to get only the non-negative half; it is roughly twice as fast and uses about half the memory.
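
The layout is easier to trust once you see it on a tiny array. The sketch below uses a hypothetical 8-sample signal with a made-up sampling rate of 8 Hz, so the bins land on whole numbers. It checks the DC bin, the conjugate symmetry, and how np.fft.fftshift reorders the axis for plotting.

```python
import numpy as np

N = 8
fs = 8.0  # hypothetical sampling rate in Hz, chosen so bins are 1 Hz apart

freqs = np.fft.fftfreq(N, d=1 / fs)
# freqs is 0, 1, 2, 3, -4, -3, -2, -1
# (NumPy labels the Nyquist bin at index N/2 as negative)

rng = np.random.default_rng(0)
x = rng.standard_normal(N)      # arbitrary real-valued test data
X = np.fft.fft(x)

# Bin 0 holds the sum of the samples (N times the mean)
dc_is_sum = np.isclose(X[0].real, x.sum())

# For real input, bin N-k is the complex conjugate of bin k
k = np.arange(1, N)
is_conj_symmetric = np.allclose(X[N - k], np.conj(X[k]))

# fftshift moves the negative frequencies to the front for plotting
shifted = np.fft.fftshift(freqs)
# shifted is -4, -3, -2, -1, 0, 1, 2, 3

print(freqs, shifted, dc_is_sum, is_conj_symmetric)
```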

The Nyquist limit

You can only detect frequencies up to half your sampling rate. If you sample at 1000 Hz, the highest detectable frequency is 500 Hz. This is the Nyquist frequency.

Signals above the Nyquist limit “fold back” and appear as lower frequencies, a phenomenon called aliasing. This is why CD audio is sampled at 44.1 kHz: its Nyquist frequency of 22.05 kHz sits just above the roughly 20 kHz upper limit of human hearing.
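
Aliasing is easy to reproduce. In this sketch a 700 Hz tone (an arbitrary choice above the 500 Hz Nyquist limit) is sampled at 1000 Hz, and the FFT reports it at 1000 - 700 = 300 Hz instead. The variable names are illustrative.

```python
import numpy as np

fs = 1000                       # sampling rate in Hz, so Nyquist is 500 Hz
t = np.arange(fs) / fs          # 1 second of samples
true_freq = 700                 # Hz, deliberately above the Nyquist limit
x = np.sin(2 * np.pi * true_freq * t)

spectrum = np.fft.rfft(x)
freqs = np.fft.rfftfreq(len(x), d=1 / fs)

# The strongest bin is where the tone *appears* to be
apparent_freq = freqs[np.argmax(np.abs(spectrum))]
print(apparent_freq)            # 300.0

# The samples are indistinguishable from an inverted 300 Hz sine
same_samples = np.allclose(x, -np.sin(2 * np.pi * 300 * t))
print(same_samples)             # True
```

Once the data is sampled there is no way to tell the two tones apart, which is why real systems low-pass filter the signal before sampling.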

rfft vs fft

For real-valued input data (which covers most practical cases):

# Full FFT — N complex outputs, symmetric
full = np.fft.fft(signal)       # 1000 complex numbers

# Real FFT — N/2+1 complex outputs, no redundancy
half = np.fft.rfft(signal)      # 501 complex numbers
rfreqs = np.fft.rfftfreq(len(signal), d=1/1000)

rfft is preferred for real data: faster, less memory, no confusing negative frequency handling.
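
As a quick sanity check, the sketch below confirms that rfft returns exactly the first N//2 + 1 bins of the full FFT, and that np.fft.irfft takes you back to the original samples. It rebuilds the same two-tone signal so it runs standalone; n_half, same_bins, and round_trip_ok are illustrative names.

```python
import numpy as np

fs = 1000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)

full = np.fft.fft(signal)       # 1000 complex numbers
half = np.fft.rfft(signal)      # 501 complex numbers

# rfft keeps bins 0 .. N//2 of the full transform
n_half = len(signal) // 2 + 1
same_bins = np.allclose(half, full[:n_half])

# Pass n explicitly: without it irfft assumes an even output length,
# which would be wrong for odd-length inputs
restored = np.fft.irfft(half, n=len(signal))
round_trip_ok = np.allclose(restored, signal)

print(half.shape, same_bins, round_trip_ok)
```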

Common misconception

Many people think every peak in the FFT output means “this frequency exists in the signal.” That holds only if the signal is periodic within your window, i.e. it completes a whole number of cycles. If you analyze 1 second of a 5 Hz signal, you get a clean spike at 5 Hz. But if the window ends partway through a cycle (say, 1 second of a 5.5 Hz signal), the energy “leaks” into neighboring frequency bins. This is called spectral leakage, and it is reduced, though not eliminated, by applying a window function before the FFT.

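from scipy.signal import windows

# Taper both ends of the signal with a periodic Hann window (sym=False suits spectral analysis)
windowed = signal * windows.hann(len(signal), sym=False)
spectrum = np.fft.rfft(windowed)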
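
To see how much a window helps, here is a rough comparison on a 5.5 Hz tone, which does not fit a whole number of cycles into a 1 second window. To keep the sketch NumPy-only it swaps in np.hanning (NumPy's built-in symmetric Hann window) for the SciPy call. The 20 Hz threshold for "far from the tone" is an arbitrary choice for illustration.

```python
import numpy as np

fs = 1000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 5.5 * t)   # 5.5 cycles: not periodic in the window

freqs = np.fft.rfftfreq(len(x), d=1 / fs)

# Power spectra without and with a Hann window
plain = np.abs(np.fft.rfft(x)) ** 2
hann = np.abs(np.fft.rfft(x * np.hanning(len(x)))) ** 2

# Fraction of total power that ends up far from the tone (above 20 Hz)
far = freqs > 20
leak_plain = plain[far].sum() / plain.sum()
leak_hann = hann[far].sum() / hann.sum()

print(leak_plain, leak_hann)      # the windowed fraction is far smaller
```

The trade-off is a wider main peak: the Hann window spreads the tone over a few adjacent bins in exchange for much lower leakage further away.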

Inverse FFT

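The FFT is reversible up to floating-point rounding:

spectrum = np.fft.fft(signal)            # full FFT of the original, unwindowed signal
reconstructed = np.fft.ifft(spectrum)    # complex output; imaginary part is ~0
np.allclose(signal, reconstructed.real)  # True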

This enables frequency-domain filtering: transform to frequency, zero out unwanted frequencies, transform back.
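
A minimal sketch of that transform, zero, transform-back recipe follows. It removes the 12 Hz component from the two-tone signal used throughout this article. The 8 Hz cutoff is a hypothetical value picked to sit between the two tones.

```python
import numpy as np

fs = 1000
t = np.arange(fs) / fs
low = np.sin(2 * np.pi * 5 * t)            # component to keep
high = 0.5 * np.sin(2 * np.pi * 12 * t)    # component to remove
signal = low + high

# 1. Transform to the frequency domain
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)

# 2. Zero out everything above the cutoff (a "brick-wall" low-pass)
cutoff = 8.0                               # Hz, hypothetical
spectrum[freqs > cutoff] = 0

# 3. Transform back to the time domain
filtered = np.fft.irfft(spectrum, n=len(signal))

removed_high = np.allclose(filtered, low)
print(removed_high)                        # True
```

This works cleanly here because both tones sit exactly on a frequency bin. On real data a hard cutoff like this tends to cause ringing, so a gentler roll-off or a purpose-built filter is often a better fit.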

The one thing to remember: use rfft for real data, respect the Nyquist limit, and apply a window function to reduce spectral leakage. Those three rules cover most everyday spectral analysis tasks.

