H3 Hexagonal Indexing — Deep Dive

Master H3 internals: icosahedral geometry, compact cell sets, directed edges, DuckDB integration, and high-performance spatial aggregation at scale.

H3’s hexagonal grid looks simple, but its icosahedral projection, hierarchical structure, and edge/vertex model support sophisticated spatial operations. This deep dive covers the geometry, performance patterns, and production integration techniques that unlock H3’s full potential.

Icosahedral projection model

H3 projects Earth’s surface onto an icosahedron (a solid with 20 triangular faces) and lays a hexagonal grid over each face, refining it by roughly a factor of seven in cell count per resolution. The result is a nearly uniform grid: hexagon areas at a given resolution differ by about a factor of two at most, whereas cells of a latitude-longitude grid shrink by 10× or more toward the poles.

The pentagon problem

An icosahedron has 12 vertices. A hexagonal tiling cannot close around a vertex, so H3 places a pentagon at each one, and every resolution has exactly 12 pentagons. H3 orients the icosahedron so that all of them fall over water. Their approximate locations:

10 in remote ocean areas
2 near the poles

Pentagons have 5 neighbors instead of 6. Code that assumes 6 neighbors will break:

import h3

cell = h3.latlng_to_cell(40.748, -73.985, res=9)
print(h3.is_pentagon(cell))  # False (almost always)

# Safe neighbor count
neighbors = h3.grid_disk(cell, 1)
neighbor_count = len(neighbors) - 1  # 5 or 6
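
The same defensive habit applies to any per-neighbor arithmetic. The sketch below is plain Python with made-up cell IDs and values (the names `neighbor_mean`, `values`, and the letter IDs are hypothetical); it divides by the number of neighbors actually present rather than by a hard-coded 6.

```python
from statistics import fmean

def neighbor_mean(values, neighbor_ids):
    """Average the values of whichever neighbors are present (5 or 6, fewer if data is missing)."""
    present = [values[n] for n in neighbor_ids if n in values]
    return fmean(present) if present else None

# Hypothetical cell IDs and counts; a pentagon simply has a shorter neighbor list.
values = {"a": 10, "b": 20, "c": 30, "d": 40, "e": 50, "f": 60}
hexagon_neighbors = ["a", "b", "c", "d", "e", "f"]
pentagon_neighbors = ["a", "b", "c", "d", "e"]

print(neighbor_mean(values, hexagon_neighbors))   # 35.0
print(neighbor_mean(values, pentagon_neighbors))  # 30.0
```

In real code the neighbor lists would come from `h3.grid_ring(cell, 1)`, which already returns five entries for a pentagon.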

Cell index bit layout

An H3 index is a 64-bit integer:

Reserved (1 bit) | Mode (4 bits) | Reserved (3 bits) | Resolution (4 bits) | Base cell (7 bits) | Digits (3 bits × 15)

Each of the 15 digit slots stores a value 0-6 identifying which child cell was chosen at that resolution level. Slots finer than the cell’s own resolution are unused and filled with 7. This compact encoding enables:

O(1) parent/child lookups (bit masking)
O(1) resolution detection
Lexicographic ordering that preserves spatial locality

# Integer form for database storage
cell_int = h3.str_to_int("892a100d2c3ffff")
cell_str = h3.int_to_str(cell_int)

Store H3 cells as BIGINT in databases for efficient indexing and joins.
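
To make the bit layout concrete, here is a minimal sketch that unpacks the fields with shifts and masks and derives a parent by rewriting bits. It uses only the Python standard library; `decode_h3` and `parent_by_masking` are illustrative names, not functions from the h3 package, and the sketch does not validate its input the way the library does.

```python
def decode_h3(index: int) -> dict:
    """Split a 64-bit H3 cell index into its fields with shifts and masks."""
    resolution = (index >> 52) & 0xF
    # Digit r (1-based) sits 3 * (15 - r) bits above the least significant bit.
    digits = [(index >> (3 * (15 - r))) & 0x7 for r in range(1, 16)]
    return {
        "mode": (index >> 59) & 0xF,        # 1 means "cell"
        "resolution": resolution,
        "base_cell": (index >> 45) & 0x7F,  # 0..121
        "digits": digits[:resolution],      # digits in use
        "unused": digits[resolution:],      # padded with 7s
    }

def parent_by_masking(index: int, parent_res: int) -> int:
    """Coarsen a cell: rewrite the resolution field and pad the freed digits with 7."""
    res = (index >> 52) & 0xF
    if not 0 <= parent_res <= res:
        raise ValueError("parent_res must be between 0 and the cell's resolution")
    index = (index & ~(0xF << 52)) | (parent_res << 52)
    for r in range(parent_res + 1, res + 1):
        index |= 0x7 << (3 * (15 - r))
    return index

cell = int("892a100d2c3ffff", 16)
fields = decode_h3(cell)
print(fields["mode"], fields["resolution"], fields["base_cell"])  # 1 9 21
print(fields["digits"])                                           # [0, 2, 0, 0, 3, 2, 2, 6, 0]
print(format(parent_by_masking(cell, 8), "x"))                    # 882a100d2dfffff
```

Because the highest bit is always 0, the integer form also fits in a signed 64-bit column.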

Compact cell sets

When a set of cells covers a region, many of them form complete sibling groups: all seven children of one parent (six for a pentagon). compact_cells replaces each complete group with its parent, repeating up the hierarchy:

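# polygon is assumed to be an h3.LatLngPoly (h3-py v4) built from (lat, lng) pairs
cells = h3.polygon_to_cells(polygon, res=9)
print(len(cells))        # e.g., 5,000 cells

compacted = h3.compact_cells(cells)
print(len(compacted))    # e.g., 800 cells (mixed resolutions)

# Restore to uniform resolution
restored = h3.uncompact_cells(compacted, res=9)
assert set(restored) == set(cells)  # compare as sets: output order is not guaranteed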

Compacted sets are often 5-10× smaller for large regions (the gain depends on how big the region is relative to the cell size), which cuts storage and transmission costs. Use them for storing coverage areas in databases.
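
The idea behind compaction can be sketched in a few lines of plain Python on integer indexes. This is a simplified, single-pass illustration and not the library’s algorithm: it assumes every cell is at resolution 1 or finer, ignores pentagons (which have six children), and does not repeat the pass up the hierarchy as compact_cells does. The helper names are hypothetical.

```python
from collections import defaultdict

def parent_of(cell: int) -> int:
    """Direct parent: lower the resolution field by one and pad the freed digit with 7."""
    res = (cell >> 52) & 0xF
    shift = 3 * (15 - res)
    return ((cell & ~(0xF << 52)) | ((res - 1) << 52)) | (0x7 << shift)

def children_of(cell: int) -> list[int]:
    """The seven children of a hexagon cell (pentagons are not handled here)."""
    res = ((cell >> 52) & 0xF) + 1
    shift = 3 * (15 - res)
    base = (cell & ~(0xF << 52) & ~(0x7 << shift)) | (res << 52)
    return [base | (d << shift) for d in range(7)]

def compact_once(cells: set[int]) -> set[int]:
    """One pass: swap each complete group of seven siblings for its parent."""
    by_parent = defaultdict(set)
    for c in cells:
        by_parent[parent_of(c)].add(c)
    out = set()
    for parent, kids in by_parent.items():
        out |= {parent} if len(kids) == 7 else kids
    return out

parent = int("882a100d2dfffff", 16)
other_parent = children_of(parent_of(parent))[0]  # a different resolution-8 cell
stray = children_of(other_parent)[3]              # one lone resolution-9 cell

cells = set(children_of(parent)) | {stray}
compacted = compact_once(cells)
print(len(cells), len(compacted))  # 8 2
```

The seven siblings collapse into their parent while the lone cell stays at its own resolution, which is why compacted sets mix resolutions.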

Directed edges

H3 models connections between adjacent cells as directed edges:

origin = h3.latlng_to_cell(40.748, -73.985, res=9)
neighbors = h3.grid_ring(origin, 1)

for neighbor in neighbors:
    edge = h3.cells_to_directed_edge(origin, neighbor)
    print(edge)
    print(h3.directed_edge_to_boundary(edge))  # edge geometry

Directed edges are useful for:

Modeling flow between cells (traffic, migration)
Computing gradients across cell boundaries
Network analysis on the hexagonal grid
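
A directed edge index reuses the 64-bit layout of its origin cell: the mode field is 2 instead of 1, and the three reserved bits hold which of the origin’s sides (1-6) the edge leaves through. The sketch below decodes that with the standard library only; `edge_origin` and `edge_direction` are illustrative names, and in practice `h3.directed_edge_to_cells` does this for you and also returns the destination.

```python
def edge_origin(edge: int) -> int:
    """Recover the origin cell from a directed edge index."""
    mode = (edge >> 59) & 0xF
    if mode != 2:
        raise ValueError("not a directed edge index")
    edge &= ~(0x7 << 56)                      # clear the 3-bit direction field
    return (edge & ~(0xF << 59)) | (1 << 59)  # switch mode from 2 (edge) to 1 (cell)

def edge_direction(edge: int) -> int:
    """Which of the origin's sides (1..6) the edge leaves through."""
    return (edge >> 56) & 0x7

edge = int("115283473fffffff", 16)
print(format(edge_origin(edge), "x"))  # 85283473fffffff
print(edge_direction(edge))            # 1
```

Since the edge carries its origin, an edge-keyed table of flows can be rolled up to per-cell outflow without a lookup.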

Vectorized operations with h3ronpy

For DataFrame-scale operations, avoid row-level .apply():

# Slow: Python loop per row
df["h3"] = df.apply(lambda r: h3.latlng_to_cell(r.lat, r.lng, 9), axis=1)

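# Fast: vectorized with h3ronpy (Rust-based)
# Module paths and argument names have changed between h3ronpy releases, and
# newer releases return Arrow arrays; check the docs for your installed version.
from h3ronpy.vector import coordinates_to_cells
df["h3"] = coordinates_to_cells(df["lat"].values, df["lng"].values, 9)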

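h3ronpy is implemented in Rust (recent releases build on h3o, a Rust implementation of H3, rather than the C library) and operates on whole arrays at once, which typically gives a 10-50× speedup over per-row Python calls.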

DuckDB integration

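DuckDB’s H3 extension, distributed as a community extension, enables SQL-native hexagonal analysis:

import duckdb

# h3 is a community extension, so install it from the community repository
duckdb.sql("INSTALL h3 FROM community")
duckdb.sql("LOAD h3")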

result = duckdb.sql("""
    SELECT
        h3_latlng_to_cell(lat, lng, 9) as cell,
        count(*) as event_count,
        avg(value) as avg_value
    FROM read_parquet('events/*.parquet')
    GROUP BY cell
    HAVING event_count > 10
    ORDER BY event_count DESC
""").df()

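Because DuckDB scans the Parquet files itself and streams the aggregation, the query can cover billions of rows: only the grouped result is materialized as a DataFrame, and the raw rows never pass through Python memory.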

H3 in PostGIS

-- Enable the h3-pg extension (it must already be installed on the server)
CREATE EXTENSION h3;

-- Index a table by H3 cell
ALTER TABLE events ADD COLUMN h3_cell h3index;
UPDATE events SET h3_cell = h3_lat_lng_to_cell(POINT(lng, lat), 9);
CREATE INDEX idx_events_h3 ON events (h3_cell);

-- Aggregate by cell
SELECT h3_cell, count(*), avg(value)
FROM events
GROUP BY h3_cell;

The B-tree index on h3_cell enables fast lookups. For spatial queries, combine with h3_grid_disk:

SELECT * FROM events
WHERE h3_cell = ANY(
    SELECT h3_grid_disk(h3_lat_lng_to_cell(POINT(-73.985, 40.748), 9), 2)
);

Resolution selection strategy

Choosing the right resolution involves balancing precision against data volume:

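def recommend_resolution(avg_feature_spacing_meters):
    """Return the coarsest H3 resolution whose average edge length fits within the feature spacing."""
    # Approximate average hexagon edge length in meters, keyed by resolution
    resolutions = {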
        0: 1_108_000, 1: 418_000, 2: 158_000, 3: 59_000,
        4: 22_600, 5: 8_500, 6: 3_200, 7: 1_220,
        8: 461, 9: 174, 10: 66, 11: 25,
        12: 9, 13: 3.6, 14: 1.4, 15: 0.5,
    }
    for res, edge_m in resolutions.items():
        if edge_m <= avg_feature_spacing_meters:
            return res
    return 15

Rules of thumb:

Ride-share/delivery: Resolution 8-9 (city blocks)
Retail analytics: Resolution 7-8 (neighborhoods)
Environmental monitoring: Resolution 4-6 (regions)
Telecommunications: Resolution 9-10 (cell towers)
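
The data-volume side of the trade-off is easy to quantify, because the number of cells at each resolution follows a closed form: 2 + 120 × 7^res. The sketch below tabulates it along with a rough mean cell area; the Earth surface area constant is an approximation chosen for this example.

```python
def cell_count(res: int) -> int:
    """Total number of H3 cells at a resolution (122 at resolution 0)."""
    return 2 + 120 * 7 ** res

EARTH_AREA_KM2 = 510_065_622  # approximate surface area of Earth (assumption of this sketch)

for res in (0, 5, 9, 12):
    n = cell_count(res)
    # resolution, number of cells, mean cell area in km^2
    print(res, n, EARTH_AREA_KM2 / n)
```

Each step up in resolution multiplies the number of possible cells by about seven, so moving from resolution 8 to 10 can mean roughly 49× more distinct keys in an aggregation.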

Visualization patterns

With Kepler.gl

from keplergl import KeplerGl

map1 = KeplerGl()
map1.add_data(data=cell_counts_df, name="h3_data")
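# Kepler.gl can render H3 cell IDs directly as hexagons

Kepler.gl draws hexagonal fills straight from a column of H3 cell IDs, so no polygon conversion is needed. Auto-detection typically depends on the column name (for example hex_id); otherwise, add an H3 layer and pick the column manually.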

With Folium + h3

import folium
import h3

# gdf is assumed to hold one row per cell, with normalized_count scaled to 0-1
m = folium.Map(location=[40.748, -73.985], zoom_start=13)
for _, row in gdf.iterrows():
    boundary = h3.cell_to_boundary(row["h3_cell"])
    folium.Polygon(
        locations=boundary,
        color="blue",
        fill=True,
        fill_opacity=row["normalized_count"],
    ).add_to(m)

Performance benchmarks

Operation	Time (1M cells unless noted)	Notes
`latlng_to_cell` (loop)	4.5s	Python overhead
`latlng_to_cell` (h3ronpy)	0.08s	Vectorized Rust
`grid_disk(k=2)`	0.3s per call	19 cells returned
`polygon_to_cells` (city)	0.5-2s	Depends on resolution
`compact_cells`	0.1s	1M → ~150K cells
DuckDB H3 GROUP BY	2s (100M rows)	Columnar + native ext

Common pitfalls

Pitfall	Symptom	Fix
Using string cells in databases	Slow joins, 2× storage	Store as BIGINT
Assuming 6 neighbors always	IndexError at pentagons	Check `is_pentagon()` or handle variable neighbor count
Wrong resolution for use case	Either too many cells or lost detail	Use the spacing heuristic above
Mixing resolutions in aggregation	Inconsistent cell sizes	`uncompact_cells` to uniform resolution before analysis
lat/lng argument order confusion	Points in wrong hemisphere	H3 v4 uses `latlng_to_cell(lat, lng)` — lat first

The one thing to remember: H3’s real power is not just hexagonal binning — its hierarchical structure, compact representations, directed edges, and native database extensions make it a complete spatial indexing system that scales from notebook analysis to billion-row production pipelines.
