Python timeit Best Practices — Core Concepts
What timeit actually does
The timeit module measures small code snippets by running them many times in a controlled loop. It disables garbage collection during measurement and uses time.perf_counter() for high-resolution timing.
There are two key parameters:
- number: how many times to run the code in one batch (default: 1,000,000 for timeit.timeit() and timeit.repeat(); the CLI and Timer.autorange() choose it adaptively)
- repeat: how many separate batches to run (default: 5 in timeit.repeat())
The distinction matters. Each “repeat” is an independent measurement. The minimum of the repeats is often the best estimate of true performance because higher values reflect interference from other processes.
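As a minimal sketch of how number and repeat interact, the snippet below runs five batches of 10,000 executions each and keeps the fastest batch. The timed statement and the specific values are illustrative choices, not recommendations.

```python
import timeit

NUMBER = 10_000  # executions per batch (illustrative value)

# Five independent batches (repeat=5), each running the statement NUMBER times.
batch_times = timeit.repeat(
    "sum(range(100))",
    repeat=5,
    number=NUMBER,
)

# Each entry is the total time for one whole batch, in seconds.
best_batch = min(batch_times)

# Divide by NUMBER to estimate the cost of a single execution.
per_call = best_batch / NUMBER
print(f"best of {len(batch_times)} batches: {per_call * 1e6:.3f} usec per call")
```

Note that every value returned by timeit.repeat() is a batch total, so dividing by number is needed before quoting a per-call figure.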
The setup parameter
Code that prepares data should go in setup, not in the timed code:
import timeit
# WRONG: json is never imported, so the timed statement cannot find it
timeit.timeit('json.dumps({"key": "value"})', number=100000)
# NameError: name 'json' is not defined
# RIGHT: setup runs once before timing starts
timeit.timeit(
    'json.dumps({"key": "value"})',
    setup='import json',
    number=100000,
)
Setup runs once per repeat, not once per iteration. This is the right behavior — you want to measure the operation, not the imports.
Command-line usage
The command-line interface is surprisingly capable:
# Basic timing
python -m timeit "sum(range(1000))"
# With setup
python -m timeit -s "import json; data={'a': 1}" "json.dumps(data)"
# Multiple statements (each argument becomes its own line of the timed code)
python -m timeit -s "xs = list(range(1000))" "xs.sort()" "xs.reverse()"
# Control repetitions
python -m timeit -n 10000 -r 7 "list(range(100))"
The CLI auto-calibrates number if you don’t specify -n, which is usually what you want.
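The same calibration is available from Python code through Timer.autorange() (Python 3.6 and later), which increases the loop count until one batch takes at least 0.2 seconds. A minimal sketch, with an arbitrary example statement:

```python
import timeit

timer = timeit.Timer("sum(range(1000))")

# autorange() returns the loop count it settled on and the total
# time that final batch took, in seconds.
number, total = timer.autorange()

print(f"{number} loops took {total:.3f} s "
      f"({total / number * 1e6:.2f} usec per loop)")
```

This gives a reasonable number to pass on to timer.repeat() when the speed of the code is not known in advance.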
Five common traps
1. Timing operations that mutate their input
# WRONG: sort() mutates xs in place, so every iteration after the first sorts an already-sorted list
timeit.timeit('xs.sort()', setup='import random; xs = random.sample(range(1000), 1000)', number=10000)
# This mostly benchmarks sorting a sorted list, not a random one
# RIGHT: sorted() returns a new list and leaves xs shuffled for every iteration
timeit.timeit('sorted(xs)', setup='import random; xs = random.sample(range(1000), 1000)', number=10000)
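When the operation under test has to mutate its input (an in-place sort, for example), one possible approach is to copy the data inside the timed statement and time the copy on its own to estimate the overhead. This is a rough sketch rather than a rigorous method: subtracting two minimums only approximates the cost of the sort.

```python
import timeit

setup = "import random; xs = random.sample(range(1000), 1000)"
number = 2_000  # illustrative value

# Copy first, then sort the copy in place, so xs itself stays shuffled.
copy_and_sort = min(
    timeit.repeat("ys = xs.copy(); ys.sort()", setup=setup, repeat=5, number=number)
)

# Time the copy alone to estimate how much of the figure above is overhead.
copy_only = min(
    timeit.repeat("ys = xs.copy()", setup=setup, repeat=5, number=number)
)

sort_estimate = (copy_and_sort - copy_only) / number
print(f"in-place sort of a shuffled list: roughly {sort_estimate * 1e6:.1f} usec")
```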
2. Forgetting that setup state is shared
If your function needs a database connection or large data structure, put the creation in setup. But be aware that the setup object is shared across all iterations — mutations accumulate.
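The accumulation is easy to observe. In this sketch the shared list is supplied through the globals parameter instead of setup, purely so that it can be inspected after timing; the name namespace is an arbitrary choice.

```python
import timeit

# The list lives in a dict that timeit uses as the global namespace,
# which lets us look at it once timing has finished.
namespace = {"xs": []}

timeit.timeit("xs.append(0)", globals=namespace, number=1_000)

# Every iteration appended to the same list.
print(len(namespace["xs"]))  # 1000
```

If the cost of the operation depends on the size or contents of the object, later iterations are measuring something different from the first one.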
3. Comparing across separate timeit calls
# Unreliable: system load may differ between calls
time_a = timeit.timeit('method_a(data)', setup=setup, number=N)
time_b = timeit.timeit('method_b(data)', setup=setup, number=N)
Better: use timeit.repeat() for each and compare the minimums, or use a proper benchmarking framework.
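A minimal sketch of that comparison follows. The bodies of method_a and method_b are hypothetical stand-ins, and best_time is a helper invented for this example, not part of timeit.

```python
import timeit

# Hypothetical stand-ins for the two implementations being compared.
def method_a(data):
    return [x * 2 for x in data]

def method_b(data):
    return list(map(lambda x: x * 2, data))

data = list(range(1_000))
namespace = {"method_a": method_a, "method_b": method_b, "data": data}

def best_time(stmt, repeat=5, number=1_000):
    """Return the per-call time of the fastest batch, in seconds."""
    times = timeit.repeat(stmt, globals=namespace, repeat=repeat, number=number)
    return min(times) / number

best_a = best_time("method_a(data)")
best_b = best_time("method_b(data)")
print(f"method_a: {best_a * 1e6:.1f} usec, "
      f"method_b: {best_b * 1e6:.1f} usec, "
      f"ratio b/a: {best_b / best_a:.2f}")
```

Taking the minimum over several batches does not remove system noise, but it makes it less likely that a single slow batch decides the comparison.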
4. Number too high for slow code
If your function takes 1 second, number=1000000 means waiting about 11.6 days (1,000,000 seconds). Use a smaller number:
# For slow functions, reduce iterations
timeit.timeit('slow_function()', setup='...', number=100)
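One way to pick a smaller number, sketched under the assumption that a single run is representative, is to time one call and then fit the loop count to a time budget. Here slow_function is a hypothetical stand-in that just sleeps, and the budget value is arbitrary.

```python
import time
import timeit

def slow_function():
    # Hypothetical slow operation: sleeps for about 10 ms.
    time.sleep(0.01)

# Time a single call to get a rough per-call cost.
single = timeit.timeit(slow_function, number=1)

# Choose a loop count that fits the budget (in seconds), but run at least once.
budget = 0.2
number = max(1, int(budget / single))

total = timeit.timeit(slow_function, number=number)
print(f"{number} runs took {total:.2f} s ({total / number * 1e3:.1f} ms per run)")
```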
5. Forgetting globals parameter in scripts
def my_function():
    return sum(range(100))
# WRONG: timeit can't see my_function
timeit.timeit('my_function()')
# RIGHT: pass the current namespace
timeit.timeit('my_function()', globals=globals())
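Two alternatives to globals=globals() are sketched below: passing the function object itself, which timeit accepts as long as it takes no arguments, and passing a small dictionary that exposes only the names the statement needs. The variable names are illustrative.

```python
import timeit

def my_function():
    return sum(range(100))

# Option 1: pass a zero-argument callable instead of a source string.
t_callable = timeit.timeit(my_function, number=10_000)

# Option 2: expose only the required names through a dedicated namespace.
t_scoped = timeit.timeit(
    "my_function()",
    globals={"my_function": my_function},
    number=10_000,
)

print(f"callable: {t_callable:.4f} s, scoped string: {t_scoped:.4f} s")
```

Passing a callable adds one extra function call per iteration, so for very small operations the string form gives slightly lower overhead.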
Common misconception: timeit is only for micro-benchmarks
While timeit excels at microsecond-level measurements, it works fine for operations taking milliseconds or even seconds. Just adjust number downward. The real limitation is that timeit doesn’t provide statistical analysis — it gives you raw times. For histograms, percentiles, and regression tracking, tools like pyperf or pytest-benchmark build on similar principles with richer output.
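For a quick look at the spread without extra dependencies, the raw batch times can be summarized with the standard statistics module. This is only a rough sketch and not a substitute for what pyperf or pytest-benchmark report; the statement and the counts are arbitrary.

```python
import statistics
import timeit

number = 1_000

# Convert each batch total into a per-call time.
samples = [
    t / number
    for t in timeit.repeat("sorted(range(1000), reverse=True)", repeat=7, number=number)
]

summary = {
    "min": min(samples),
    "median": statistics.median(samples),
    "stdev": statistics.stdev(samples),
}

for name, value in summary.items():
    print(f"{name:>6}: {value * 1e6:.2f} usec")
```

A large gap between the minimum and the median is a hint that the machine was busy during some of the batches.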
When to use timeit vs alternatives
| Scenario | Tool |
|---|---|
| Quick comparison of two expressions | timeit CLI |
| Benchmarks in test suite | pytest-benchmark |
| Full statistical analysis | pyperf |
| Production latency tracking | time.perf_counter() in instrumentation |
| Profiling a whole program | cProfile or py-spy |
Key points to remember: always separate setup from measured code, use the minimum of multiple repeats as your estimate, and never time an operation that mutates its input without resetting state.