NumPy Ufunc Creation — Core Concepts
Why this topic matters
NumPy ships with more than 60 built-in ufuncs, but real projects often need custom element-wise operations such as unit conversions, domain-specific formulas, or piecewise functions. Creating a ufunc properly means your custom function gets broadcasting, type casting, reduction, and output arguments for free, instead of you reimplementing those features by hand.
The three approaches
1. np.frompyfunc — quickest to write
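import numpy as np

def clamp(x, lo, hi):
    # Limit a single scalar x to the closed interval [lo, hi]
    return min(max(x, lo), hi)

# Wrap the Python function as a ufunc with 3 inputs and 1 output
clamp_ufunc = np.frompyfunc(clamp, 3, 1)
result = clamp_ufunc(np.array([1, 5, 10, 15]), 3, 12)
print(result)  # [3 5 10 12] (dtype=object)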
Pros: Works with any Python function. Supports multiple inputs and outputs.
Cons: Returns object arrays (not float/int). Calls Python per element, so it is slow for large arrays.
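The object-array behavior and the multiple-output support can be sketched as follows. This is a minimal illustration, not a recommended pattern: the choice of np.int64 for the cast and the use of the built-in divmod as a two-output function are assumptions made for the example.

```python
import numpy as np

def clamp(x, lo, hi):
    # Limit a single scalar x to the closed interval [lo, hi]
    return min(max(x, lo), hi)

clamp_ufunc = np.frompyfunc(clamp, 3, 1)

# frompyfunc always produces object arrays, whatever the input dtype
raw = clamp_ufunc(np.array([1, 5, 10, 15]), 3, 12)
print(raw.dtype)    # object

# Cast explicitly when a numeric dtype is needed downstream
typed = raw.astype(np.int64)
print(typed.dtype)  # int64

# Two outputs: divmod returns a (quotient, remainder) pair per element,
# so the ufunc returns a tuple of two object arrays
divmod_ufunc = np.frompyfunc(divmod, 2, 2)
quotients, remainders = divmod_ufunc(np.array([7, 9, 14]), 4)
```

Casting after the call costs an extra pass over the data, which is usually acceptable for the small inputs where frompyfunc makes sense.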
2. np.vectorize — cleaner interface
@np.vectorize
def fahrenheit_to_celsius(f):
    return (f - 32) * 5 / 9

temps = np.array([32, 72, 100, 212])
print(fahrenheit_to_celsius(temps))  # approximately [0. 22.22 37.78 100.]
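Unless otypes is given, vectorize infers the output dtype by calling the function on the first element, then returns a properly typed array. If later elements produce a different type (for example a float after an int), they are cast to the inferred dtype. It also supports the signature parameter for operations on sub-arrays.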
Pros: Returns properly typed arrays. Supports excluded parameters and signatures.
Cons: Still calls Python per element — the docs explicitly warn “not for performance.”
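The dtype inference and the signature parameter described above can be sketched like this. The function names half_if_positive and row_range are hypothetical, chosen only to make the behavior visible.

```python
import numpy as np

def half_if_positive(x):
    # Returns the int 0 for non-positive input and a float otherwise
    return 0 if x <= 0 else x / 2

# Without otypes, the dtype comes from the first call (an int here)
inferred = np.vectorize(half_if_positive)
# With otypes, the output dtype is fixed up front
explicit = np.vectorize(half_if_positive, otypes=[np.float64])

data = np.array([-1, 3, 5])
print(inferred(data))  # [0 1 2]  (1.5 and 2.5 were truncated)
print(explicit(data))  # [0.  1.5 2.5]

# signature: the function receives whole 1-D rows instead of scalars
row_range = np.vectorize(lambda v: v.max() - v.min(), signature="(n)->()")
ranges = row_range(np.array([[1, 4, 2], [10, 12, 7]]))
print(ranges)          # [3 5]
```

Passing otypes is the safer default whenever the function can return more than one Python type.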
3. Numba @vectorize — compiled speed
from numba import vectorize, float64

@vectorize([float64(float64, float64)])
def fast_hypotenuse(a, b):
    return (a**2 + b**2)**0.5

result = fast_hypotenuse(np.random.randn(1_000_000), np.random.randn(1_000_000))
Numba compiles the function to machine code and registers it as a genuine NumPy ufunc. It supports broadcasting, reduction, and even GPU execution with target='cuda'.
Pros: True C-level speed. Full ufunc protocol support.
Cons: Requires Numba. Function body must use Numba-compatible operations only.
Ufunc features your custom function inherits
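Note that np.vectorize returns a wrapper object rather than a ufunc, so it broadcasts its inputs but lacks ufunc methods such as .reduce(). Once your function is a genuine ufunc (as returned by np.frompyfunc or Numba's @vectorize), it gains: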
- Broadcasting: Inputs of different shapes are automatically aligned.
- Type casting: NumPy converts input dtypes to match your function’s signature.
- out parameter: Callers can provide a pre-allocated output array to avoid allocation.
- .reduce(): Fold the function across an axis (like np.add.reduce is sum).
- .accumulate(): Running fold (like np.add.accumulate is cumsum).
- .at(): Unbuffered operation at specific indices.
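Several of these features can be tried without Numba, because np.frompyfunc also returns a genuine ufunc. The sketch below uses a hypothetical two-input function, pick_larger, and explicit object arrays, since that is the dtype frompyfunc works in. It illustrates the protocol only and is not a fast implementation.

```python
import numpy as np

def pick_larger(a, b):
    # Scalar version of an element-wise maximum
    return a if a >= b else b

larger = np.frompyfunc(pick_larger, 2, 1)

# Broadcasting: a (2, 1) column against a (3,) row gives a (2, 3) result
grid = larger(np.array([[1], [5]]), np.array([2, 4, 6]))

# out parameter: write into a pre-allocated object array
out = np.empty(3, dtype=object)
larger(np.array([1, 8, 3]), 4, out=out)

# reduce and accumulate are available because the ufunc has two inputs
values = np.array([3, 9, 2, 11], dtype=object)
overall = larger.reduce(values)      # fold to a single value
running = larger.accumulate(values)  # running maximum
```

A Numba-compiled ufunc exposes the same methods but operates on native numeric dtypes instead of Python objects.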
Common misconception
Developers often think np.vectorize makes their function fast. It does not — it is a convenience wrapper, not a compiler. The name is misleading. For actual vectorized performance, you need either pure NumPy expressions (which use pre-compiled C loops) or a JIT compiler like Numba.
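As a rough sketch of the pure NumPy alternative, the per-element functions used earlier in this article can each be written as whole-array expressions. The piecewise rule at the end is a made-up example, not taken from the sections above.

```python
import numpy as np

# Fahrenheit to Celsius: plain arithmetic already runs element-wise
temps_f = np.array([32, 72, 100, 212])
celsius = (temps_f - 32) * 5 / 9

# Clamping to [3, 12]: np.clip replaces the min/max Python function
clamped = np.clip(np.array([1, 5, 10, 15]), 3, 12)

# Hypotenuse: np.hypot is a built-in ufunc
hyp = np.hypot(np.array([3.0, 5.0]), np.array([4.0, 12.0]))

# Piecewise logic: np.where selects between two array expressions
x = np.array([-2.0, 0.5, 3.0])
piecewise = np.where(x < 0, 0.0, x ** 2)
```

Both branches of np.where are evaluated for every element, so this form suits cheap expressions better than branches with expensive or invalid computations.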
When to use each
| Scenario | Best choice |
|---|---|
| Quick prototype, small data | np.frompyfunc or np.vectorize |
| Production, large arrays | Numba @vectorize |
| Complex logic with branching | Numba @vectorize or @guvectorize |
| Need GPU acceleration | Numba @vectorize(target='cuda') |
| Cannot install Numba | Rewrite as pure NumPy expression |
The one thing to remember: np.vectorize is convenience, not speed — for fast custom ufuncs, you need Numba or pure NumPy expressions.