Big O Complexity Analysis — Core Concepts

What Big O actually measures

Big O notation describes how an algorithm’s runtime (or memory usage) scales relative to the input size n. It drops constants and lower-order terms because it cares about the growth pattern, not exact milliseconds. O(2n + 500) simplifies to O(n) — the 500 and the 2 stop mattering when n reaches millions.

This is not a theoretical exercise. Choosing an O(n) approach over an O(n²) approach on a dataset of 10 million rows can mean the difference between 10 seconds and 3 years.

The complexity hierarchy

Big ONameExample10 items1M items
O(1)ConstantDictionary lookup1 op1 op
O(log n)LogarithmicBinary search3 ops20 ops
O(n)LinearScanning a list10 ops1M ops
O(n log n)LinearithmicSorting33 ops20M ops
O(n²)QuadraticNested loops100 ops1 trillion ops
O(2ⁿ)ExponentialAll subsets1,024 opsHeat death

Python operations and their real costs

Python’s built-in data structures have specific complexities that trip people up.

Lists:

  • list[i] → O(1) — direct index access
  • item in list → O(n) — scans every element
  • list.append(x) → O(1) amortized — occasionally resizes
  • list.insert(0, x) → O(n) — shifts every element right
  • list.sort() → O(n log n) — Timsort

Dictionaries:

  • dict[key] → O(1) average — hash table lookup
  • key in dict → O(1) average — unlike lists
  • dict.items() → O(n) — iterates all pairs

Sets:

  • item in set → O(1) average — hash-based
  • set_a & set_b → O(min(len(a), len(b))) — intersection

The critical insight: in on a list is O(n), but in on a set or dict is O(1). This single fact eliminates most accidental O(n²) code in Python.

How to analyze your own code

Rule 1: Count the loops. One loop over n items is O(n). A loop inside a loop is O(n²). A loop inside a loop inside a loop is O(n³). Each nesting level multiplies.

Rule 2: Watch for hidden loops. if x in my_list inside a for loop is O(n²) — the in operator scans the list each time. Replace the list with a set and it drops to O(n).

Rule 3: Sequential blocks add; nested blocks multiply. Two consecutive loops of O(n) give O(n) + O(n) = O(n). A loop containing another loop gives O(n) × O(n) = O(n²).

Rule 4: Drop constants and smaller terms. O(3n² + 7n + 42) becomes O(n²). The n² term dominates everything else at scale.

Common misconception

“Big O tells you which algorithm is faster.” Not quite — it tells you which algorithm scales better. An O(n²) algorithm with a tiny constant factor can beat an O(n log n) algorithm on small inputs. Python’s Timsort is O(n log n) but its constant factors are so well-tuned that it beats theoretically-better algorithms on real-world data. Big O matters most when your data grows.

Space complexity matters too

Big O applies to memory, not just time. An algorithm that creates a copy of the input uses O(n) space. One that builds a matrix from the input uses O(n²) space. When processing a 2 GB file, an O(n) space algorithm needs 2 GB of RAM; an O(1) space streaming approach needs almost none.

Python’s range(1_000_000) is O(1) space — it generates numbers on demand. list(range(1_000_000)) is O(n) space — it stores all million numbers.

Best, average, and worst case

Algorithms can behave differently depending on the input. Quicksort is O(n log n) on average but O(n²) on already-sorted data (with naive pivot selection). Dictionary lookup is O(1) on average but O(n) in the worst case if every key hashes to the same bucket.

When someone says “dictionary lookup is O(1),” they mean average case. Python’s hash table implementation makes worst-case collisions extremely rare, but they are theoretically possible.

One thing to remember: When you see a loop inside a loop, ask yourself: “Can I replace the inner search with a dictionary or set lookup?” That single refactoring pattern eliminates more performance bugs than any other technique in Python.

pythonalgorithmsbig-ocomplexity

See Also