NumPy FFT & Spectral Analysis — Deep Dive

Master NumPy FFT internals — algorithm complexity, zero-padding, 2D transforms, spectral filtering, and real-world signal processing patterns.

Technical foundation

The Discrete Fourier Transform (DFT) converts N time-domain samples into N frequency-domain coefficients. Direct computation is O(N²). The Fast Fourier Transform (FFT) exploits symmetry in the DFT matrix to reduce this to O(N log N) — a transformation that made spectral analysis practical for large datasets.

NumPy’s np.fft uses the PocketFFT library (since NumPy 1.17), which supports arbitrary input sizes, not just powers of two.

The DFT equation

For input x[n], the DFT output X[k] is:

X[k] = Σ(n=0 to N-1) x[n] · exp(-2πi·nk/N)

Each X[k] is a complex number whose magnitude is the amplitude at frequency k/N cycles per sample and whose phase is the starting angle of that frequency component.

Frequency resolution and zero-padding

The frequency resolution of the FFT is Δf = fs / N where fs is the sampling rate and N is the number of points. To get finer frequency resolution, you need more data — not zero-padding.

Zero-padding interpolates the spectrum (making it smoother) but does not add new information:

import numpy as np

signal = np.sin(2 * np.pi * 5.5 * np.linspace(0, 1, 100, endpoint=False))

# Without padding: frequency bins at 0, 1, 2, ..., 50 Hz — 5.5 Hz falls between bins
spec_raw = np.abs(np.fft.rfft(signal))

# With padding: frequency bins at 0, 0.25, 0.5, ..., 50 Hz — smoother, but same true resolution
spec_padded = np.abs(np.fft.rfft(signal, n=400))

Zero-padding is useful for visual interpolation and for making convolution lengths match, but it does not improve the ability to distinguish two close frequencies.

Window functions in depth

Spectral leakage occurs because the DFT assumes the signal is periodic within the analysis window. A finite-length signal that does not complete an integer number of cycles creates a discontinuity at the window boundary.

Window functions taper the signal to zero at the edges, reducing leakage at the cost of widening the main lobe:

from scipy.signal import windows

N = 1024
t = np.linspace(0, 1, N, endpoint=False)
signal = np.sin(2 * np.pi * 50 * t) + 0.1 * np.sin(2 * np.pi * 52 * t)

# Different windows trade main-lobe width for side-lobe suppression
wins = {
    'rectangular': np.ones(N),
    'hann': windows.hann(N),
    'blackman': windows.blackman(N),
    'kaiser_14': windows.kaiser(N, 14),
}

for name, w in wins.items():
    spec = np.abs(np.fft.rfft(signal * w))
    peak_db = 20 * np.log10(spec.max())
    sidelobe_db = 20 * np.log10(np.sort(spec)[-3])  # third-highest peak
    print(f"{name:15s}: main={peak_db:.1f} dB, sidelobe~{sidelobe_db:.1f} dB")

Window	Main lobe width	Side lobe level	Best for
Rectangular	Narrowest	-13 dB	Known-periodic signals
Hann	Medium	-31 dB	General purpose
Blackman	Wide	-58 dB	Weak signals near strong ones
Kaiser (β=14)	Tunable	-70 dB	Precision spectral analysis

2D FFT for image processing

Images are 2D signals. The 2D FFT reveals spatial frequency content:

from PIL import Image

img = np.array(Image.open('photo.jpg').convert('L'), dtype=float)

# 2D FFT
F = np.fft.fft2(img)
F_shifted = np.fft.fftshift(F)  # move DC to center

# Power spectrum (log scale for visualization)
power = np.log1p(np.abs(F_shifted))

Applications:

Low-pass filter: zero out high-frequency components to blur.
High-pass filter: zero out low-frequency components to detect edges.
Pattern detection: periodic textures create bright spots in the spectrum.

# Low-pass filter: keep only center frequencies
rows, cols = img.shape
crow, ccol = rows // 2, cols // 2
mask = np.zeros_like(F_shifted)
radius = 30
Y, X = np.ogrid[:rows, :cols]
mask[(Y - crow)**2 + (X - ccol)**2 <= radius**2] = 1

filtered = np.fft.ifft2(np.fft.ifftshift(F_shifted * mask))
blurred = np.abs(filtered)

Convolution via FFT

Convolution in the time domain equals multiplication in the frequency domain. For large arrays, FFT-based convolution is much faster than direct convolution:

def fft_convolve(a, b):
    """Convolve two 1D arrays using FFT."""
    n = len(a) + len(b) - 1
    # Pad to next power of 2 for speed (optional but faster)
    fft_size = 1 << (n - 1).bit_length()
    A = np.fft.rfft(a, n=fft_size)
    B = np.fft.rfft(b, n=fft_size)
    return np.fft.irfft(A * B, n=fft_size)[:n]

# Compare with direct convolution
a = np.random.randn(100_000)
b = np.random.randn(1_000)

# Direct: O(n*m) = 100 million operations
result1 = np.convolve(a, b)

# FFT: O(n log n) ≈ 1.7 million operations
result2 = fft_convolve(a, b)
np.allclose(result1, result2)  # True

For signal lengths above ~500 elements, FFT convolution wins.

Power spectral density (PSD)

The raw FFT magnitude squared gives the periodogram — a noisy estimate of the power spectral density. For smoother PSD estimates, use Welch’s method (available in SciPy):

from scipy.signal import welch

fs = 1000  # sampling rate
f, psd = welch(signal, fs=fs, nperseg=256, noverlap=128)

# psd[i] is the power density at frequency f[i] in units²/Hz

Welch’s method splits the signal into overlapping segments, windows each segment, computes periodograms, and averages them. The tradeoff: lower variance but reduced frequency resolution.

Short-Time Fourier Transform (STFT)

For non-stationary signals (where frequency content changes over time), compute the FFT on sliding windows:

def stft(signal, window_size=256, hop_size=128):
    """Compute Short-Time Fourier Transform."""
    window = np.hanning(window_size)
    n_frames = (len(signal) - window_size) // hop_size + 1
    spectrogram = np.zeros((window_size // 2 + 1, n_frames), dtype=complex)

    for i in range(n_frames):
        start = i * hop_size
        frame = signal[start:start + window_size] * window
        spectrogram[:, i] = np.fft.rfft(frame)

    return spectrogram

# Result is a time-frequency matrix (spectrogram)
S = stft(signal)
magnitude_db = 20 * np.log10(np.abs(S) + 1e-10)

The spectrogram reveals how the frequency content evolves — essential for audio analysis, vibration monitoring, and EEG processing.

Performance considerations

Input size	`fft` time	`rfft` time	Speedup
2^20 (1M)	8.1 ms	4.2 ms	1.9x
10^6	12.3 ms	6.8 ms	1.8x
2^20 + 1	45 ms	24 ms	—

Powers of two are 3-5x faster than arbitrary sizes because the Cooley-Tukey radix-2 algorithm applies. PocketFFT handles arbitrary sizes via mixed-radix decomposition, but composite numbers with small factors (2, 3, 5) are still faster than primes.

# Find the next fast FFT size
from scipy.fft import next_fast_len
good_size = next_fast_len(len(signal))
spectrum = np.fft.rfft(signal, n=good_size)

Common pitfalls

Forgetting to normalize: ifft(fft(x)) returns x — NumPy handles the 1/N normalization in ifft. But if you compute the power spectrum as |X|², the values scale with N². Divide by N² (or N for amplitude) to get physical units.
Interpreting DC offset: X[0] is the sum of all samples (the DC component). For mean-removed signals, it should be near zero.
Aliased sampling: If your signal contains frequencies above Nyquist, no amount of post-processing can fix the aliased spectrum. Anti-alias filtering must happen before sampling.

The one thing to remember: The FFT converts O(N²) brute-force frequency analysis into O(N log N) elegance — but the quality of your results depends on sampling rate, window function, and understanding what each frequency bin actually represents.

pythonnumpydata-science