Convolution Operations — Core Concepts
What convolution actually computes
Convolution combines two arrays — a signal (or image) and a kernel (a small pattern) — by sliding the kernel across the signal and computing a weighted sum at each position.
For 1D discrete convolution, the output at position i is (treating signal samples outside the array as zero):
output[i] = sum(signal[i - k] * kernel[k] for k in range(kernel_size))
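The signal[i - k] indexing means the kernel is applied in reversed order, which is the flip in the mathematical definition. Library routines such as np.convolve and scipy.signal.convolve perform this flip internally, so you pass the kernel unflipped.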
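The formula can be written out as explicit loops. This is a minimal sketch, assuming that signal samples outside the array count as zero; the function name convolve_1d_full is hypothetical and chosen only for illustration.

```python
import numpy as np

def convolve_1d_full(signal, kernel):
    """Direct 1D convolution; out-of-range signal samples count as zero."""
    n, k = len(signal), len(kernel)
    output = np.zeros(n + k - 1)
    for i in range(n + k - 1):
        for j in range(k):
            if 0 <= i - j < n:  # skip indices that fall outside the signal
                output[i] += signal[i - j] * kernel[j]
    return output

signal = np.array([1.0, 2.0, 3.0])
kernel = np.array([1.0, 0.0, -1.0])
result = convolve_1d_full(signal, kernel)
print(result.tolist())  # [1.0, 2.0, 2.0, -2.0, -3.0]
```

The result should match np.convolve(signal, kernel), which computes the same full-length output.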
1D convolution in Python
import numpy as np
from scipy.signal import convolve
signal = np.array([1, 3, 5, 7, 9, 7, 5, 3, 1])
kernel = np.array([1, 2, 1]) / 4 # Smoothing kernel
smoothed = convolve(signal, kernel, mode='same')
The mode parameter controls output size (N is the signal length, K the kernel length, with N ≥ K):
| Mode | Output length | Behavior |
|---|---|---|
| full | N + K - 1 | All overlapping positions |
| same | N | Same size as input (centered) |
| valid | N - K + 1 | Only fully overlapping positions |
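A quick way to check the table is to measure the output length for each mode. The sketch below assumes a signal of length N = 9 and a kernel of length K = 3; the variable names are illustrative.

```python
import numpy as np
from scipy.signal import convolve

signal = np.arange(9, dtype=float)   # N = 9
kernel = np.array([1, 2, 1]) / 4     # K = 3

# Output length for each mode
lengths = {mode: len(convolve(signal, kernel, mode=mode))
           for mode in ("full", "same", "valid")}
print(lengths)  # {'full': 11, 'same': 9, 'valid': 7}

# 'same' and 'valid' are central slices of the 'full' result
full = convolve(signal, kernel, mode="full")
same_from_full = full[1:10]   # drop (K - 1) // 2 samples on each side
valid_from_full = full[2:9]   # drop K - 1 samples on each side
```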
2D convolution for images
Images are 2D arrays. A 2D kernel (typically 3×3 or 5×5) slides across both rows and columns:
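from scipy.signal import convolve2d
# Placeholder input for illustration; substitute any 2D grayscale array
grayscale_image = np.random.rand(64, 64)
# Sobel kernel that responds to horizontal edges (vertical intensity changes)
sobel_y = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]])
# boundary='symm' mirrors the image at its borders; 'wrap' would treat the
# image as periodic and can create false edges along the border
edges = convolve2d(grayscale_image, sobel_y, mode='same', boundary='symm')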
Common kernels and their effects:
| Kernel | Effect | Use case |
|---|---|---|
| [[1,1,1],[1,1,1],[1,1,1]] / 9 | Box blur | Noise reduction |
| [[1,2,1],[2,4,2],[1,2,1]] / 16 | Gaussian blur | Smooth noise |
| [[0,-1,0],[-1,5,-1],[0,-1,0]] | Sharpen | Enhance detail |
| [[-1,-1,-1],[-1,8,-1],[-1,-1,-1]] | Edge detect | Find boundaries |
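One way to see why these kernels behave differently is to apply them to a perfectly flat image. In the sketch below (the constant 5×5 test image is an assumption made for illustration), the blur and sharpen kernels sum to 1 and leave the flat region unchanged, while the edge-detect kernel sums to 0 and produces no response.

```python
import numpy as np
from scipy.signal import convolve2d

kernels = {
    "box_blur": np.ones((3, 3)) / 9,
    "gaussian_blur": np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 16,
    "sharpen": np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]]),
    "edge_detect": np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]),
}

flat = np.full((5, 5), 10.0)  # constant image with no edges or detail

# mode='valid' avoids border effects, so each response is a 3x3 array
responses = {name: convolve2d(flat, k, mode="valid")
             for name, k in kernels.items()}

for name, response in responses.items():
    print(name, float(kernels[name].sum()), float(response.mean()))
```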
How convolution powers neural networks
Convolutional Neural Networks (CNNs) use convolution as their core operation. Instead of hand-designing kernels, the network learns the kernel values during training.
A CNN typically stacks multiple convolutional layers:
- First layers learn simple features — edges, gradients, color blobs
- Middle layers combine simple features into parts — eyes, wheels, letters
- Deep layers combine parts into objects — faces, cars, words
Each layer applies many different kernels (called filters), producing a stack of feature maps. A layer with 64 filters takes one image and produces 64 filtered versions, each highlighting different patterns.
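The "64 filters in, 64 feature maps out" idea can be sketched in plain NumPy and SciPy. This is only an illustration: the 28×28 input size is an assumption, the filters are random stand-ins for learned weights, and real layers also add a bias, a nonlinearity, and a sum over input channels.

```python
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(0)
image = rng.random((28, 28))                # hypothetical single-channel input
filters = rng.standard_normal((64, 3, 3))   # random stand-ins for learned kernels

# Apply every filter to the same image and stack the results
feature_maps = np.stack([correlate2d(image, f, mode="same") for f in filters])
print(feature_maps.shape)  # (64, 28, 28)
```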
Padding and stride
Two parameters control how convolution is applied:
Padding adds zeros (or other values) around the input border. Without padding, the output shrinks by (kernel_size - 1) pixels along each dimension. “Same” padding adds just enough border to keep the output the same size as the input at stride 1.
Stride controls how many positions the kernel moves at each step. Stride 1 (default) moves one pixel at a time. Stride 2 evaluates the kernel at every other position, roughly halving the output size along each dimension, which is a common way to downsample.
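The effect of both parameters on output size follows the standard formula floor((N + 2P - K) / S) + 1 for input length N, kernel length K, padding P on each side, and stride S. The sketch below is a 1D illustration under those assumptions; the helper names conv_output_size and conv1d_strided are hypothetical.

```python
import numpy as np

def conv_output_size(n, kernel_size, padding=0, stride=1):
    """Output length along one dimension."""
    return (n + 2 * padding - kernel_size) // stride + 1

def conv1d_strided(signal, kernel, padding=0, stride=1):
    """1D convolution with zero padding on both sides and a stride."""
    padded = np.pad(signal, padding)   # constant (zero) padding
    k = len(kernel)
    flipped = kernel[::-1]             # true convolution flips the kernel
    out_len = conv_output_size(len(signal), k, padding, stride)
    return np.array([np.dot(padded[i * stride:i * stride + k], flipped)
                     for i in range(out_len)])

x = np.arange(8, dtype=float)
w = np.array([1.0, 2.0, 1.0]) / 4

print(conv_output_size(8, 3))                       # 6: shrinks by K - 1
print(conv_output_size(8, 3, padding=1))            # 8: "same" padding
print(conv_output_size(8, 3, padding=1, stride=2))  # 4: stride 2 halves it
downsampled = conv1d_strided(x, w, padding=1, stride=2)
```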
Correlation vs convolution
Mathematically, convolution flips the kernel before sliding. Cross-correlation skips the flip. For symmetric kernels (like Gaussian blur), they give the same result. Most deep learning frameworks actually implement correlation and call it “convolution” — the distinction rarely matters because learned kernels can compensate for the flip.
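The relationship is easy to verify numerically. In this sketch, the example arrays are chosen only for illustration: convolving with a kernel matches correlating with the reversed kernel, and the two operations agree outright when the kernel is symmetric.

```python
import numpy as np
from scipy.signal import convolve, correlate

signal = np.array([1.0, 3.0, 5.0, 7.0, 9.0])
asymmetric = np.array([1.0, 0.0, -1.0])
symmetric = np.array([1.0, 2.0, 1.0]) / 4

conv_asym = convolve(signal, asymmetric, mode="same")
corr_asym = correlate(signal, asymmetric, mode="same")
corr_flipped = correlate(signal, asymmetric[::-1], mode="same")

print(conv_asym.tolist())                   # [3.0, 4.0, 4.0, 4.0, -7.0]
print(np.allclose(conv_asym, corr_asym))    # False: the flip matters here
print(np.allclose(conv_asym, corr_flipped)) # True: flipping the kernel undoes it

conv_sym = convolve(signal, symmetric, mode="same")
corr_sym = correlate(signal, symmetric, mode="same")
print(np.allclose(conv_sym, corr_sym))      # True: symmetric kernel
```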
Performance considerations
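Direct convolution is O(N × K) for 1D (N signal length, K kernel length). For large kernels, FFT-based convolution is faster: it costs O((N + K) log(N + K)), which for K ≤ N is O(N log N) regardless of kernel size. scipy.signal.fftconvolve always uses the FFT; scipy.signal.convolve, with its default method='auto', chooses between the direct and FFT approaches based on an estimate of which is faster.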
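The two approaches compute the same result up to floating-point rounding, which the following sketch checks. The array sizes are arbitrary assumptions; scipy.signal.choose_conv_method reports which method SciPy estimates to be faster for the given inputs, and its answer can vary by machine and SciPy version.

```python
import numpy as np
from scipy.signal import convolve, fftconvolve, choose_conv_method

rng = np.random.default_rng(0)
signal = rng.standard_normal(4096)
kernel = rng.standard_normal(513)

direct = convolve(signal, kernel, mode="full", method="direct")
via_fft = fftconvolve(signal, kernel, mode="full")

# Largest difference is on the order of floating-point rounding error
max_abs_diff = float(np.max(np.abs(direct - via_fft)))

# Returns the string 'direct' or 'fft'
chosen = choose_conv_method(signal, kernel, mode="full")
print(max_abs_diff, chosen)
```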
Common misconception
Many people think convolution and matrix multiplication are unrelated. In fact, 1D convolution can be expressed as multiplication by a Toeplitz matrix constructed from the kernel. Deep learning frameworks often use a closely related idea, the im2col trick, which unrolls input patches into the columns of a matrix so that the whole convolution becomes one matrix multiplication that optimized GPU routines can handle.
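For the 1D case, the Toeplitz construction can be sketched with scipy.linalg.toeplitz. The example arrays are assumptions for illustration; each column of the matrix holds a shifted copy of the kernel, so multiplying by the signal reproduces the full convolution.

```python
import numpy as np
from scipy.linalg import toeplitz

signal = np.array([1.0, 3.0, 5.0, 7.0])
kernel = np.array([1.0, 2.0, 1.0]) / 4

n, k = len(signal), len(kernel)
# First column: the kernel followed by zeros (length n + k - 1)
first_col = np.concatenate([kernel, np.zeros(n - 1)])
# First row: kernel[0] followed by zeros (length n)
first_row = np.concatenate([[kernel[0]], np.zeros(n - 1)])

conv_matrix = toeplitz(first_col, first_row)  # shape (n + k - 1, n)
as_matmul = conv_matrix @ signal

print(conv_matrix.shape)                                  # (6, 4)
print(np.allclose(as_matmul, np.convolve(signal, kernel)))  # True
```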
One thing to remember: Convolution is a weighted sliding-window operation — change the kernel weights and you change what it detects, which is exactly what neural networks learn to do automatically.