Trees and Traversals — Deep Dive

Technical framing

Trees are recursive data structures, and most tree algorithms express naturally as recursive functions. But production Python code often needs iterative implementations (to avoid stack overflow), efficient serialization (for storage and transmission), and self-balancing guarantees (for predictable performance). This deep dive covers the advanced techniques that make tree implementations production-ready.

Iterative traversals with explicit stacks

Recursive traversals are elegant but limited by Python’s recursion depth (default 1000). Iterative versions handle trees of any depth.

Iterative pre-order

def preorder_iterative(root):
    if not root:
        return []
    result = []
    stack = [root]
    while stack:
        node = stack.pop()
        result.append(node.val)
        if node.right:  # right first so left is processed first
            stack.append(node.right)
        if node.left:
            stack.append(node.left)
    return result

Iterative in-order

def inorder_iterative(root):
    result = []
    stack = []
    current = root
    while current or stack:
        while current:
            stack.append(current)
            current = current.left
        current = stack.pop()
        result.append(current.val)
        current = current.right
    return result

Iterative post-order (two-stack method)

def postorder_iterative(root):
    if not root:
        return []
    result = []
    stack1 = [root]
    stack2 = []
    while stack1:
        node = stack1.pop()
        stack2.append(node)
        if node.left:
            stack1.append(node.left)
        if node.right:
            stack1.append(node.right)
    while stack2:
        result.append(stack2.pop().val)
    return result

Morris traversal: O(1) space in-order

Morris traversal modifies the tree temporarily using threaded pointers, achieving O(1) space without a stack:

def morris_inorder(root):
    result = []
    current = root
    while current:
        if not current.left:
            result.append(current.val)
            current = current.right
        else:
            # Find in-order predecessor
            predecessor = current.left
            while predecessor.right and predecessor.right is not current:
                predecessor = predecessor.right
            
            if not predecessor.right:
                # Create thread
                predecessor.right = current
                current = current.left
            else:
                # Remove thread, visit node
                predecessor.right = None
                result.append(current.val)
                current = current.right
    return result

Morris traversal is O(n) time and O(1) space but modifies the tree during traversal (restoring it afterwards). Useful for memory-constrained environments.

Self-balancing BST: AVL tree

An AVL tree maintains the invariant that for every node, the heights of the left and right subtrees differ by at most 1. This guarantees O(log n) height.

class AVLNode:
    __slots__ = ('val', 'left', 'right', 'height')
    
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None
        self.height = 0

class AVLTree:
    def _height(self, node):
        return node.height if node else -1
    
    def _balance_factor(self, node):
        return self._height(node.left) - self._height(node.right)
    
    def _update_height(self, node):
        node.height = 1 + max(self._height(node.left), self._height(node.right))
    
    def _rotate_right(self, y):
        x = y.left
        t2 = x.right
        x.right = y
        y.left = t2
        self._update_height(y)
        self._update_height(x)
        return x
    
    def _rotate_left(self, x):
        y = x.right
        t2 = y.left
        y.left = x
        x.right = t2
        self._update_height(x)
        self._update_height(y)
        return y
    
    def _rebalance(self, node):
        self._update_height(node)
        bf = self._balance_factor(node)
        
        if bf > 1:  # Left-heavy
            if self._balance_factor(node.left) < 0:
                node.left = self._rotate_left(node.left)  # Left-Right case
            return self._rotate_right(node)
        
        if bf < -1:  # Right-heavy
            if self._balance_factor(node.right) > 0:
                node.right = self._rotate_right(node.right)  # Right-Left case
            return self._rotate_left(node)
        
        return node
    
    def insert(self, root, val):
        if not root:
            return AVLNode(val)
        if val < root.val:
            root.left = self.insert(root.left, val)
        elif val > root.val:
            root.right = self.insert(root.right, val)
        else:
            return root  # no duplicates
        return self._rebalance(root)
    
    def delete(self, root, val):
        if not root:
            return None
        if val < root.val:
            root.left = self.delete(root.left, val)
        elif val > root.val:
            root.right = self.delete(root.right, val)
        else:
            if not root.left:
                return root.right
            if not root.right:
                return root.left
            # Find in-order successor
            successor = root.right
            while successor.left:
                successor = successor.left
            root.val = successor.val
            root.right = self.delete(root.right, successor.val)
        return self._rebalance(root)

The four rotation cases:

  • Left-Left: single right rotation
  • Right-Right: single left rotation
  • Left-Right: left rotation on left child, then right rotation
  • Right-Left: right rotation on right child, then left rotation

Trie (prefix tree)

Tries are trees optimized for string operations — autocomplete, spell checking, and IP routing:

class TrieNode:
    __slots__ = ('children', 'is_end', 'count')
    
    def __init__(self):
        self.children = {}
        self.is_end = False
        self.count = 0  # words passing through this node

class Trie:
    def __init__(self):
        self.root = TrieNode()
    
    def insert(self, word: str) -> None:
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
            node.count += 1
        node.is_end = True
    
    def search(self, word: str) -> bool:
        node = self._find_node(word)
        return node is not None and node.is_end
    
    def starts_with(self, prefix: str) -> bool:
        return self._find_node(prefix) is not None
    
    def count_prefix(self, prefix: str) -> int:
        node = self._find_node(prefix)
        return node.count if node else 0
    
    def _find_node(self, prefix: str):
        node = self.root
        for char in prefix:
            if char not in node.children:
                return None
            node = node.children[char]
        return node
    
    def autocomplete(self, prefix: str, limit: int = 10) -> list[str]:
        node = self._find_node(prefix)
        if not node:
            return []
        results = []
        self._collect_words(node, prefix, results, limit)
        return results
    
    def _collect_words(self, node, prefix, results, limit):
        if len(results) >= limit:
            return
        if node.is_end:
            results.append(prefix)
        for char in sorted(node.children):
            self._collect_words(node.children[char], prefix + char, results, limit)

Trie operations are O(m) where m is the key length, independent of the number of stored keys.

Tree serialization and deserialization

For storing trees in files or transmitting over networks:

def serialize(root) -> str:
    """Pre-order serialization with None markers."""
    if not root:
        return "#"
    return f"{root.val},{serialize(root.left)},{serialize(root.right)}"

def deserialize(data: str):
    """Reconstruct tree from serialized string."""
    tokens = iter(data.split(","))
    
    def build():
        val = next(tokens)
        if val == "#":
            return None
        node = TreeNode(int(val))
        node.left = build()
        node.right = build()
        return node
    
    return build()

This pre-order serialization with null markers uniquely identifies any binary tree and can be deserialized in O(n) time.

Lowest common ancestor (LCA)

Finding the LCA of two nodes — essential for distance calculations and path queries:

def lca(root, p, q):
    """Find LCA in a binary tree (not necessarily BST)."""
    if not root or root.val == p or root.val == q:
        return root
    left = lca(root.left, p, q)
    right = lca(root.right, p, q)
    if left and right:
        return root  # p and q are in different subtrees
    return left or right

For a BST, LCA can be found more efficiently by using the ordering property:

def lca_bst(root, p, q):
    while root:
        if p < root.val and q < root.val:
            root = root.left
        elif p > root.val and q > root.val:
            root = root.right
        else:
            return root
    return None

Performance comparison of tree types

StructureSearchInsertDeleteSpaceOrdered
Unbalanced BSTO(n) worstO(n) worstO(n) worstO(n)Yes
AVL treeO(log n)O(log n)O(log n)O(n)Yes
Red-black treeO(log n)O(log n)O(log n)O(n)Yes
B-treeO(log n)O(log n)O(log n)O(n)Yes
TrieO(m)O(m)O(m)O(Σ × m × n)Prefix
HeapO(n)O(log n)O(log n)O(n)Partial

Where m = key length, Σ = alphabet size, n = number of keys.

Python-specific tree libraries

  • heapq — binary heap (array-based, not a node-based tree)
  • sortedcontainers.SortedList — B-tree variant, the best general-purpose sorted structure in Python
  • anytree — general tree library with visualization
  • treelib — tree with printing and export
  • lxml.etree — XML/HTML tree operations at C speed

For production Python code, prefer battle-tested libraries over hand-rolled tree implementations. Custom trees are for learning, interviews, and specialized cases where no library fits.

One thing to remember: Trees are recursive structures, and tree algorithms are recursive by nature. Master the four traversal patterns iteratively for production robustness, and understand self-balancing (AVL/red-black) for performance-critical applications.

pythondata-structurestrees

See Also