Python heapq & Priority Queues — Core Concepts
What problem does heapq solve?
You have a collection of items and you frequently need the smallest one. Maybe you’re processing tasks by deadline, routing network packets by latency, or merging sorted log files. Every time you add an item or remove the current smallest, you want both operations to be fast.
Sorting the whole list every time costs O(n log n). Using heapq, each push or pop costs only O(log n). For a list of a million items, that’s roughly 20 operations instead of 20 million.
The heap data structure
A heap is a binary tree stored inside a plain Python list. The rule is simple: every parent is smaller than or equal to its children. The smallest element always sits at index 0.
Python’s heapq module implements a min-heap — the minimum value is always on top. There’s no built-in max-heap, but you can negate values to flip the order.
Key functions
| Function | What it does | Time |
|---|---|---|
heapq.heappush(heap, item) | Add an item, maintain heap order | O(log n) |
heapq.heappop(heap) | Remove and return the smallest item | O(log n) |
heapq.heapify(list) | Convert an existing list into a heap in-place | O(n) |
heapq.heappushpop(heap, item) | Push then pop — faster than doing both separately | O(log n) |
heapq.nsmallest(k, iterable) | Return k smallest items | O(n log k) |
heapq.nlargest(k, iterable) | Return k largest items | O(n log k) |
heapq.merge(*iterables) | Merge sorted iterables lazily | O(n log k) |
When to use what
- Need min/max once: use
min()ormax(). - Need top k from static data: use
sorted(data)[:k]if k is close to n, orheapq.nsmallestif k is much smaller than n. - Need repeated push/pop of smallest: use heapq — this is its sweet spot.
Practical pattern: task scheduler
import heapq
tasks = []
heapq.heappush(tasks, (1, "fix critical bug"))
heapq.heappush(tasks, (3, "update docs"))
heapq.heappush(tasks, (2, "code review"))
while tasks:
priority, name = heapq.heappop(tasks)
print(f"[P{priority}] {name}")
Output is always in priority order without manually sorting.
Common misconception
Many developers think a heap keeps the entire list sorted. It doesn’t. Only the top element is guaranteed to be the smallest. If you iterate over the list directly, the order looks scrambled. That’s by design — maintaining full sort order would be slower.
Gotcha: comparing items
When two items have the same priority value, Python compares the second element. If those elements don’t support comparison (like dicts), you’ll get a TypeError. The standard fix is to include a tie-breaking counter:
counter = 0
heapq.heappush(heap, (priority, counter, task))
counter += 1
This guarantees a unique ordering without requiring the task itself to be comparable.
Merge sorted streams
heapq.merge() takes multiple sorted iterables and yields items in sorted order without loading everything into memory. Log aggregation tools use this to combine sorted log files from multiple servers in a single pass.
The one thing to remember: Reach for heapq when you need fast, repeated access to the smallest item in a changing collection — it’s the right tool between “just use min()” and “bring in a database.”
See Also
- Python Atexit How Python's atexit module lets your program clean up after itself right before it shuts down.
- Python Bisect Sorted Lists How Python's bisect module finds things in sorted lists the way you'd find a word in a dictionary — by jumping to the middle.
- Python Contextlib How Python's contextlib module makes the 'with' statement work for anything, not just files.
- Python Copy Module Why copying data in Python isn't as simple as it sounds, and how the copy module prevents sneaky bugs.
- Python Dataclass Field Metadata How Python dataclass fields can carry hidden notes — like sticky notes on a filing cabinet that tools read automatically.