Python bisect & Sorted Lists — Deep Dive
How bisect works internally
The CPython implementation of bisect_left is a textbook binary search:
def bisect_left(a, x, lo=0, hi=None, *, key=None):
if hi is None:
hi = len(a)
if key is None:
while lo < hi:
mid = (lo + hi) // 2
if a[mid] < x:
lo = mid + 1
else:
hi = mid
else:
while lo < hi:
mid = (lo + hi) // 2
if key(a[mid]) < x:
lo = mid + 1
else:
hi = mid
return lo
bisect_right differs only in the comparison: a[mid] <= x instead of a[mid] < x. This single change determines whether equal elements land to the left or right.
Since CPython 3.10, the fast path (no key) is implemented in C (_bisectmodule.c), making it significantly faster than a pure Python loop. The key path falls back to Python-level calls per comparison.
The lo and hi parameters
Both bisect_left and bisect_right accept lo and hi to restrict the search range. This is useful for searching within a window of a larger sorted list:
import bisect
data = [10, 20, 30, 40, 50, 60, 70, 80]
# Search only within indices 2..5
pos = bisect.bisect_left(data, 45, lo=2, hi=6)
# pos → 4
This avoids copying a sublist and keeps the operation allocation-free.
Advanced pattern: range queries
To find all elements in a sorted list within [low, high]:
import bisect
def range_query(sorted_list, low, high):
left = bisect.bisect_left(sorted_list, low)
right = bisect.bisect_right(sorted_list, high)
return sorted_list[left:right]
timestamps = [100, 200, 300, 400, 500, 600]
result = range_query(timestamps, 250, 500)
# → [300, 400, 500]
Both lookups are O(log n), and the slice is O(k) where k is the number of results. This is the foundation of index-based range scans in databases.
Advanced pattern: discrete event simulation
Games and simulations schedule events at specific times. A sorted event list with insort provides a simple priority mechanism:
import bisect
from dataclasses import dataclass, field
@dataclass(order=True)
class Event:
time: float
sequence: int = field(compare=True)
callback: object = field(compare=False)
class EventLoop:
def __init__(self):
self.events = []
self._seq = 0
def schedule(self, time, callback):
self._seq += 1
bisect.insort(self.events, Event(time, self._seq, callback))
def run(self, until):
while self.events and self.events[0].time <= until:
event = self.events.pop(0)
event.callback()
The sequence number breaks ties for events at the same time, preserving insertion order. For high-throughput simulations, replacing the list with heapq avoids the O(n) insertion cost of insort — but insort wins when you also need random access to future events (e.g., for cancellation by index).
Performance analysis
bisect_left vs linear scan
For a sorted list of n elements:
| Method | Time complexity | 1M elements |
|---|---|---|
bisect_left | O(log n) | ~20 comparisons |
list.index() | O(n) | up to 1,000,000 comparisons |
x in list | O(n) | up to 1,000,000 comparisons |
x in set | O(1) average | 1 hash + compare |
If you only need membership testing, a set is faster. bisect shines when you need the position, not just existence.
insort vs sort-after-append
Inserting 10,000 items one at a time into a list of 100,000 items (CPython 3.12, benchmarked with timeit):
| Strategy | Time |
|---|---|
insort per item | ~120 ms |
append + sort() after each | ~8,500 ms |
append all + single sort() | ~25 ms |
If you’re batch-inserting, sort once at the end. If you’re inserting one at a time and need the list sorted between insertions, insort is the right choice.
The list-shift bottleneck
insort finds the position in O(log n) but the actual insertion into a Python list is O(n) because elements after the insertion point must be shifted. For workloads dominated by insertions, this makes insort slower than a balanced tree structure.
The sortedcontainers.SortedList provides O(log n) insertion using a tiered list-of-lists structure. Benchmarks typically show SortedList.add() is 3-10x faster than insort for lists above 100,000 elements, while maintaining similar iteration speed.
Implementing custom binary searches with bisect
The bisect module can solve problems beyond simple sorted-list operations. Any monotonic function can be binary-searched by creating a virtual “sorted list”:
import bisect
class MonotonicFunction:
"""Wraps a monotonic function so bisect can search its domain."""
def __init__(self, func, size):
self.func = func
self.size = size
def __len__(self):
return self.size
def __getitem__(self, i):
return self.func(i)
# Find the first integer where x**2 >= 1000
squares = MonotonicFunction(lambda x: x * x, 1000)
result = bisect.bisect_left(squares, 1000)
# result → 32 (32**2 = 1024 >= 1000)
This trick works because bisect only requires __len__ and __getitem__ — it never checks if the object is an actual list.
Combining bisect with the key parameter
Before Python 3.10, the standard workaround for keyed bisect was maintaining a parallel keys list or using a wrapper. Post-3.10, the key parameter simplifies sorted containers of complex objects:
import bisect
from datetime import datetime
log_entries = [
{"ts": datetime(2026, 1, 1, 10, 0), "msg": "start"},
{"ts": datetime(2026, 1, 1, 12, 0), "msg": "checkpoint"},
{"ts": datetime(2026, 1, 1, 14, 0), "msg": "end"},
]
target = datetime(2026, 1, 1, 11, 30)
idx = bisect.bisect_left(log_entries, target, key=lambda e: e["ts"])
# idx → 1, the entry just after 11:30 is "checkpoint" at 12:00
Important caveat: the key function is called on each element during the search, adding function-call overhead. For hot loops, pre-extracting keys into a separate list and bisecting that list can be 2-5x faster.
Thread safety and immutability
bisect_left and bisect_right are read-only and safe to call concurrently on a list that isn’t being modified. insort modifies the list and is not thread-safe — use a lock or process insertions in a single thread.
For concurrent workloads, a common architecture is to batch insertions and periodically merge them into the main sorted list, similar to an LSM-tree approach.
Pitfalls and edge cases
-
Unsorted input produces silent wrong results. bisect has no way to detect this. If your list might be unsorted, validate it or use a data structure that enforces order.
-
Float precision issues.
bisect_left([0.1 + 0.2], 0.3)returns 1, not 0, because0.1 + 0.2 != 0.3in floating point. Usedecimal.Decimalfor exact comparisons. -
Empty list edge case. Both
bisect_left([], x)andbisect_right([], x)return 0, which is correct but means you always need a bounds check before accessing the returned index. -
Mixing types.
bisect_left([1, 2, 3], "a")raisesTypeErrorin Python 3. Python 2 allowed cross-type comparisons, which led to subtle bugs.
The one thing to remember: bisect is a precision tool — O(log n) search that works on any sorted sequence and even on virtual sequences via __getitem__ — but it trusts you to provide sorted input and check bounds on the result.
See Also
- Python Atexit How Python's atexit module lets your program clean up after itself right before it shuts down.
- Python Contextlib How Python's contextlib module makes the 'with' statement work for anything, not just files.
- Python Copy Module Why copying data in Python isn't as simple as it sounds, and how the copy module prevents sneaky bugs.
- Python Dataclass Field Metadata How Python dataclass fields can carry hidden notes — like sticky notes on a filing cabinet that tools read automatically.
- Python Datetime Handling Why dealing with dates and times in Python is trickier than it sounds — and how the datetime module tames the chaos