itertools Module — Deep Dive
C Implementation and Performance
Every function in itertools is implemented in C (Modules/itertoolsmodule.c). This means they’re not just syntactic sugar over Python loops — they’re genuinely faster:
import timeit
from itertools import chain
# itertools.chain vs manual concatenation
lists = [[i] * 100 for i in range(100)]
timeit.timeit(lambda: list(chain.from_iterable(lists)), number=1000) # ~0.25s
timeit.timeit(lambda: [x for lst in lists for x in lst], number=1000) # ~0.45s
The C implementations avoid Python-level function call overhead per element. For chain, this means no per-list Python iterator protocol calls — the C code directly accesses the underlying list storage.
Iterator Algebra: The Recipes Section
The itertools documentation includes a “recipes” section with composable patterns. These are pure Python but rely on itertools primitives:
Chunking an iterator
from itertools import islice
def chunked(iterable, n):
"""Split into chunks of size n."""
it = iter(iterable)
while True:
chunk = list(islice(it, n))
if not chunk:
return
yield chunk
list(chunked(range(10), 3)) # [[0,1,2], [3,4,5], [6,7,8], [9]]
Python 3.12 added itertools.batched() for this exact pattern:
from itertools import batched
list(batched(range(10), 3)) # [(0,1,2), (3,4,5), (6,7,8), (9,)]
Sliding window
from collections import deque
from itertools import islice
def sliding_window(iterable, n):
it = iter(iterable)
window = deque(islice(it, n), maxlen=n)
if len(window) == n:
yield tuple(window)
for x in it:
window.append(x)
yield tuple(window)
list(sliding_window([1, 2, 3, 4, 5], 3))
# [(1,2,3), (2,3,4), (3,4,5)]
Python 3.10+ has itertools.pairwise() for the n=2 case.
Flatten one level
from itertools import chain
def flatten(nested):
return chain.from_iterable(nested)
list(flatten([[1, 2], [3], [4, 5, 6]])) # [1, 2, 3, 4, 5, 6]
chain.from_iterable is the key — it takes a single iterable of iterables, unlike chain(*args) which requires unpacking.
groupby: Deep Mechanics
groupby is the most misunderstood itertools function. Its behavior is modeled after Unix’s uniq command — it groups consecutive identical items.
The iterator consumption trap
from itertools import groupby
data = "AAABBBCCAABB"
groups = groupby(data)
for key, group in groups:
print(key, list(group))
# A ['A', 'A', 'A']
# B ['B', 'B', 'B']
# C ['C', 'C']
# A ['A', 'A']
# B ['B', 'B']
Critical: Each group iterator is only valid until you advance to the next group. If you store the group iterators without consuming them, they become empty:
groups = groupby("AAABBB")
stored = [(k, g) for k, g in groups] # g is NOT consumed
for k, g in stored:
print(k, list(g)) # Second group is empty!
# A ['A', 'A', 'A']
# B [] ← consumed when we advanced past it
Always consume groups immediately: [(k, list(g)) for k, g in groupby(data)].
Practical: Run-Length Encoding
from itertools import groupby
def rle_encode(data):
return [(char, sum(1 for _ in group)) for char, group in groupby(data)]
rle_encode("AAABBBCCAABB")
# [('A', 3), ('B', 3), ('C', 2), ('A', 2), ('B', 2)]
tee: Duplicating Iterators
tee creates independent copies of an iterator. But there’s a hidden cost:
from itertools import tee
numbers = iter(range(1_000_000))
a, b = tee(numbers)
# If you advance 'a' far ahead of 'b', tee must buffer
# all intermediate values — defeating the memory benefit
next(a) # 0
# ... advance a by 500,000 items
# tee now holds 500,000 items in an internal deque
tee is efficient when copies are consumed at roughly the same rate. If one copy races ahead, you’re better off converting to a list.
product: Memory-Efficient Cartesian Products
product doesn’t pre-compute all combinations. It maintains indices into the input iterables and generates one tuple per __next__ call:
from itertools import product
# This works even though the total is 10^6 combinations
for combo in product(range(100), range(100), range(100)):
if sum(combo) == 42:
print(combo)
break # Only generated ~43 items, not 1,000,000
However, product internally converts each input to a tuple (to support random access by index). If inputs are huge generators, this consumes memory:
# BAD: materializes the generator into a tuple
product(range(10**9), range(10)) # First arg becomes a tuple of 10^9 items
# GOOD: range objects are already indexable, so this is fine
# (product detects sequences and avoids copying)
accumulate: Running Totals and More
accumulate produces running results of a binary function:
from itertools import accumulate
import operator
# Running sum
list(accumulate([1, 2, 3, 4])) # [1, 3, 6, 10]
# Running max
list(accumulate([3, 1, 4, 1, 5, 9], max)) # [3, 3, 4, 4, 5, 9]
# Running product
list(accumulate([1, 2, 3, 4], operator.mul)) # [1, 2, 6, 24]
# With initial value (Python 3.8+)
list(accumulate([1, 2, 3], initial=10)) # [10, 11, 13, 16]
Real-World Data Pipeline
Processing a large CSV without loading it all into memory:
import csv
from itertools import chain, islice, groupby, filterfalse
from operator import itemgetter
def process_sales_data(file_paths):
"""Process multiple CSV files as a single stream."""
# Chain multiple files together
def read_all(paths):
for path in paths:
with open(path) as f:
reader = csv.DictReader(f)
yield from reader
records = read_all(file_paths)
# Filter out refunds
sales = filterfalse(lambda r: float(r['amount']) < 0, records)
# Sort key for groupby (data must be pre-sorted by region)
key = itemgetter('region')
# Group by region and aggregate
results = {}
for region, group in groupby(sorted(sales, key=key), key=key):
items = list(group)
results[region] = {
'count': len(items),
'total': sum(float(r['amount']) for r in items),
}
return results
compress and filterfalse: Selective Iteration
from itertools import compress, filterfalse
data = ['a', 'b', 'c', 'd', 'e']
mask = [1, 0, 1, 0, 1]
list(compress(data, mask)) # ['a', 'c', 'e']
# filterfalse: opposite of filter
nums = range(10)
list(filterfalse(lambda x: x % 2, nums)) # [0, 2, 4, 6, 8]
compress is essentially vectorized boolean indexing — similar to NumPy’s boolean masking but for any iterable.
Performance Comparison Table
Benchmarked on Python 3.12, 1 million elements:
| Operation | itertools | Pure Python | Speedup |
|---|---|---|---|
| chain (10 lists × 100K) | 45ms | 89ms | 2.0x |
| islice (first 1000 of 1M) | 0.03ms | 0.04ms | 1.3x |
| product (100 × 100) | 0.4ms | 0.7ms | 1.7x |
| accumulate (running sum) | 52ms | 95ms | 1.8x |
| groupby (pre-sorted) | 120ms | 210ms | 1.7x |
The speedups are consistent because the C implementation eliminates per-element Python function call overhead.
Gotchas Summary
- groupby requires sorted input for full grouping (otherwise only consecutive matches)
- tee buffers everything if copies diverge in consumption speed
- product materializes inputs as tuples internally
- Infinite iterators will hang if consumed without limits (
list(count())→ OOM) - Iterators are single-use — once exhausted, they’re done. Use
teeorlist()to preserve
One thing to remember: itertools provides C-speed iterator combinators that compose like Lego blocks. The mental model is Unix pipes: each function transforms a stream without buffering it. Master chain, islice, groupby, product, and accumulate — they cover 90% of real-world iterator algebra. For the remaining 10%, check the recipes section in the official docs.
See Also
- Python Atexit How Python's atexit module lets your program clean up after itself right before it shuts down.
- Python Bisect Sorted Lists How Python's bisect module finds things in sorted lists the way you'd find a word in a dictionary — by jumping to the middle.
- Python Contextlib How Python's contextlib module makes the 'with' statement work for anything, not just files.
- Python Copy Module Why copying data in Python isn't as simple as it sounds, and how the copy module prevents sneaky bugs.
- Python Dataclass Field Metadata How Python dataclass fields can carry hidden notes — like sticky notes on a filing cabinet that tools read automatically.