H3 Hexagonal Indexing — Deep Dive
H3’s hexagonal grid looks simple, but its icosahedral projection, hierarchical structure, and edge/vertex model support sophisticated spatial operations. This deep dive covers the geometry, performance patterns, and production integration techniques that unlock H3’s full potential.
Icosahedral projection model
H3 projects Earth’s surface onto an icosahedron (a solid with 20 triangular faces), then lays a hexagonal grid over each face and subdivides it recursively. The result is a nearly uniform grid: hexagon areas at a given resolution differ by at most about a factor of 2 across the globe, whereas the cells of a latitude-longitude grid shrink by 10× or more at high latitudes.
The pentagon problem
An icosahedron has 12 vertices. A hexagonal tiling cannot close around a vertex, so H3 places a pentagon at each one, and every resolution therefore contains exactly 12 pentagons. H3 orients the icosahedron so that all 12 vertices fall over water, which keeps pentagons away from most land-based datasets.
Pentagons have 5 neighbors instead of 6. Code that assumes 6 neighbors will break:
import h3
cell = h3.latlng_to_cell(40.748, -73.985, res=9)
print(h3.is_pentagon(cell)) # False (almost always)
# Safe neighbor count
neighbors = h3.grid_disk(cell, 1)
neighbor_count = len(neighbors) - 1 # 5 or 6
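How rare pentagons are can be made concrete with a little arithmetic. The sketch below relies on two facts from the H3 documentation rather than from this article: resolution 0 has 122 base cells, and each finer resolution splits a hexagon into 7 children and a pentagon into 6, which gives 2 + 120 × 7^r cells at resolution r. The helper names are illustrative.

```python
NUM_PENTAGONS = 12  # one per icosahedron vertex, at every resolution


def num_cells(res: int) -> int:
    """Total number of H3 cells (hexagons plus pentagons) at a resolution."""
    if not 0 <= res <= 15:
        raise ValueError("H3 resolutions run from 0 to 15")
    return 2 + 120 * 7**res


def pentagon_share(res: int) -> float:
    """Fraction of cells at this resolution that are pentagons."""
    return NUM_PENTAGONS / num_cells(res)


for res in (0, 5, 9):
    print(res, num_cells(res), f"{pentagon_share(res):.2e}")
```

Even though the share shrinks quickly with resolution, global datasets can still hit a pentagon, so the variable neighbor count is worth handling explicitly.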
Cell index bit layout
An H3 index is a 64-bit integer:
Reserved (1 bit) | Mode (4 bits) | Reserved (3 bits) | Resolution (4 bits) | Base cell (7 bits) | Digits (3 bits × 15)
Each of the 15 digit slots stores a value 0-6 identifying which child cell was chosen at that resolution level; slots beyond the cell's own resolution are filled with 7. This compact encoding enables:
- O(1) parent/child lookups (bit masking)
- O(1) resolution detection
- Sort order that largely preserves spatial locality (cells sharing a parent sort next to each other)
# Integer form for database storage
cell_int = h3.str_to_int("892a100d2c3ffff")
cell_str = h3.int_to_str(cell_int)
Store H3 cells as BIGINT in databases for efficient indexing and joins.
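The layout can be checked with plain integer arithmetic. The following is a minimal sketch that assumes the field widths listed above (including the leading reserved bit); `decode_h3` and `parent_index` are illustrative names, not h3-py functions, and in real code the library's own accessors are the safer choice.

```python
def decode_h3(index: int) -> dict:
    """Split a 64-bit H3 cell index into its fields using shifts and masks."""
    resolution = (index >> 52) & 0xF
    # Digit r occupies 3 bits; digit 15 sits in the lowest 3 bits.
    digits = [(index >> (3 * (15 - r))) & 0x7 for r in range(1, 16)]
    return {
        "mode": (index >> 59) & 0xF,      # 1 means "cell"
        "resolution": resolution,
        "base_cell": (index >> 45) & 0x7F,
        "digits": digits[:resolution],    # unused slots hold 7
    }


def parent_index(index: int, parent_res: int) -> int:
    """Derive an ancestor by rewriting the resolution and blanking digits."""
    res = (index >> 52) & 0xF
    if not 0 <= parent_res <= res:
        raise ValueError("parent_res must be between 0 and the cell's resolution")
    index = (index & ~(0xF << 52)) | (parent_res << 52)
    for r in range(parent_res + 1, res + 1):
        index |= 0x7 << (3 * (15 - r))    # mark the digit as unused
    return index


cell = int("892a100d2c3ffff", 16)
fields = decode_h3(cell)
print(fields)
print(format(parent_index(cell, 8), "x"))
```

No lookup tables or geometry are involved, which is why parent and resolution queries cost a handful of bit operations.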
Compact cell sets
When a set of cells covers a region, many of the cells form complete sibling groups. compact_cells replaces every full group of children with its parent and repeats the process up the hierarchy:
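# polygon is assumed to be an H3 polygon shape (e.g. h3.LatLngPoly) for the region
cells = h3.polygon_to_cells(polygon, res=9)
print(len(cells)) # e.g., 5,000 cells
compacted = h3.compact_cells(cells)
print(len(compacted)) # e.g., 800 cells (mixed resolutions)
# Restore to uniform resolution
restored = h3.uncompact_cells(compacted, res=9)
# Compare as sets: the order of the returned cells is not guaranteed
assert set(restored) == set(cells)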
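Compacted sets are often 5-10× smaller (about 6× in the example above), which reduces storage and transmission costs. The gain comes from the interior of the region; cells along the boundary rarely form complete sibling groups and stay at the original resolution. Use compacted sets for storing coverage areas in databases.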
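To show what compaction does mechanically, here is a toy version written against the bit layout rather than the h3 library. It is a sketch under stated assumptions: the input is a uniform-resolution set of hexagon cells (pentagons have 6 children and are not handled), the function names are illustrative, and the second sample index is a hypothetical cell chosen only because it has a different parent.

```python
from collections import defaultdict


def resolution(cell: int) -> int:
    return (cell >> 52) & 0xF


def to_parent(cell: int) -> int:
    """Immediate parent: lower the resolution by one, blank the last digit."""
    res = resolution(cell)
    cell = (cell & ~(0xF << 52)) | ((res - 1) << 52)
    return cell | (0x7 << (3 * (15 - res)))


def children(parent: int) -> list:
    """The 7 children of a hexagon cell (digits 0-6 in the next slot)."""
    res = resolution(parent) + 1
    base = (parent & ~(0xF << 52)) | (res << 52)
    shift = 3 * (15 - res)
    return [(base & ~(0x7 << shift)) | (d << shift) for d in range(7)]


def compact(cells) -> set:
    """Replace every complete group of 7 siblings with its parent, repeatedly."""
    pending, result = set(cells), set()
    while pending:
        siblings = defaultdict(set)
        for cell in pending:
            if resolution(cell) == 0:
                result.add(cell)          # base cells have no parent
            else:
                siblings[to_parent(cell)].add(cell)
        pending = set()
        for parent, group in siblings.items():
            if len(group) == 7:
                pending.add(parent)       # complete family: try one level up
            else:
                result |= group           # incomplete family stays as-is
    return result


parent = int("882a100d2dfffff", 16)
lone = int("892a100d2a3ffff", 16)         # hypothetical cell with another parent
compacted = compact(children(parent) + [lone])
print(sorted(format(c, "x") for c in compacted))
```

Eight input cells collapse to two: the shared parent plus the cell that had no complete sibling group.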
Directed edges
H3 models connections between adjacent cells as directed edges:
origin = h3.latlng_to_cell(40.748, -73.985, res=9)
neighbors = h3.grid_ring(origin, 1)
for neighbor in neighbors:
    edge = h3.cells_to_directed_edge(origin, neighbor)
    print(edge)
    print(h3.directed_edge_to_boundary(edge)) # edge geometry
Directed edges are useful for:
- Modeling flow between cells (traffic, migration)
- Computing gradients across cell boundaries
- Network analysis on the hexagonal grid
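A directed edge is itself a 64-bit H3 index. According to the H3 index documentation, it is the origin cell with two fields changed: the mode is set to 2 and the three reserved bits hold the direction (1-6) toward the destination. The sketch below decodes a sample edge index under that assumption; the helper names are illustrative, and h3-py's own edge functions remain the right tool in real code.

```python
MODE_CELL = 1
MODE_DIRECTED_EDGE = 2


def edge_direction(edge: int) -> int:
    """Direction (1-6) stored in the three reserved bits."""
    return (edge >> 56) & 0x7


def edge_origin(edge: int) -> int:
    """Recover the origin cell by resetting the mode and direction bits."""
    if (edge >> 59) & 0xF != MODE_DIRECTED_EDGE:
        raise ValueError("not a directed edge index")
    cleared = edge & ~((0xF << 59) | (0x7 << 56))
    return cleared | (MODE_CELL << 59)


edge = int("115283473fffffff", 16)        # sample directed edge index
print(edge_direction(edge))
print(format(edge_origin(edge), "x"))
```

Because the origin is embedded in the edge, flow tables keyed by edge index can be rolled up to per-cell totals without a join.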
Vectorized operations with h3ronpy
For DataFrame-scale operations, avoid row-level .apply():
# Slow: Python loop per row
df["h3"] = df.apply(lambda r: h3.latlng_to_cell(r.lat, r.lng, 9), axis=1)
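# Fast: vectorized with h3ronpy (Rust-based)
# Module paths differ between h3ronpy releases (newer ones expose h3ronpy.vector);
# check the documentation for your installed version.
from h3ronpy.pandas.vector import coordinates_to_cells
df["h3"] = coordinates_to_cells(df["lat"].values, df["lng"].values, 9)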
h3ronpy is implemented in Rust and works on whole arrays instead of making one Python call per row, which typically gives a 10-50× speedup over Python loops.
DuckDB integration
DuckDB’s H3 extension enables SQL-native hexagonal analysis:
import duckdb
# h3 is a DuckDB community extension
duckdb.sql("INSTALL h3 FROM community")
duckdb.sql("LOAD h3")
result = duckdb.sql("""
    SELECT
        h3_latlng_to_cell(lat, lng, 9) AS cell,
        count(*) AS event_count,
        avg(value) AS avg_value
    FROM read_parquet('events/*.parquet')
    GROUP BY cell
    HAVING event_count > 10
    ORDER BY event_count DESC
""").df()
DuckDB scans the Parquet files itself, so this can aggregate billions of rows; only the grouped result is loaded into Python memory when .df() is called.
H3 in PostGIS
-- Install the h3-pg extension
CREATE EXTENSION h3;
-- Index a table by H3 cell
ALTER TABLE events ADD COLUMN h3_cell h3index;
UPDATE events SET h3_cell = h3_lat_lng_to_cell(POINT(lng, lat), 9);
CREATE INDEX idx_events_h3 ON events (h3_cell);
-- Aggregate by cell
SELECT h3_cell, count(*), avg(value)
FROM events
GROUP BY h3_cell;
The B-tree index on h3_cell enables fast lookups. For spatial queries, combine with h3_grid_disk:
SELECT * FROM events
WHERE h3_cell = ANY(
    SELECT h3_grid_disk(h3_lat_lng_to_cell(POINT(-73.985, 40.748), 9), 2)
);
Resolution selection strategy
Choosing the right resolution involves balancing precision against data volume:
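def recommend_resolution(avg_feature_spacing_meters):
    """Return the coarsest H3 resolution whose average edge length fits the spacing."""
    # Approximate average hexagon edge length in meters, per resolution
    resolutions = {
        0: 1_108_000, 1: 418_000, 2: 158_000, 3: 59_000,
        4: 22_600, 5: 8_500, 6: 3_200, 7: 1_220,
        8: 461, 9: 174, 10: 66, 11: 25,
        12: 9, 13: 3.6, 14: 1.4, 15: 0.5,
    }
    # Walk from coarse to fine and stop at the first edge length that fits
    for res, edge_m in resolutions.items():
        if edge_m <= avg_feature_spacing_meters:
            return res
    return 15 # spacing below 0.5 m: use the finest resolution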
Rules of thumb:
- Ride-share/delivery: Resolution 8-9 (city blocks)
- Retail analytics: Resolution 7-8 (neighborhoods)
- Environmental monitoring: Resolution 4-6 (regions)
- Telecommunications: Resolution 9-10 (cell towers)
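The data-volume side of that trade-off can be estimated before indexing anything. The sketch below assumes the cell-count formula from the H3 documentation (2 + 120 × 7^r cells at resolution r) and a spherical Earth with a radius of about 6,371 km; the function names and the 784 km² sample area are illustrative. It ignores boundary effects and the variation in cell area, so treat the output as an order-of-magnitude guide.

```python
import math

EARTH_RADIUS_KM = 6371.0072               # assumed mean radius of a spherical Earth
EARTH_AREA_KM2 = 4 * math.pi * EARTH_RADIUS_KM**2


def avg_cell_area_km2(res: int) -> float:
    """Average cell area: Earth's surface divided by the number of cells."""
    return EARTH_AREA_KM2 / (2 + 120 * 7**res)


def estimate_cell_count(region_area_km2: float, res: int) -> int:
    """Rough number of cells needed to cover a region of the given area."""
    return math.ceil(region_area_km2 / avg_cell_area_km2(res))


# Compare candidate resolutions for a city-sized region of about 784 km²
for res in (7, 8, 9, 10):
    print(res, round(avg_cell_area_km2(res), 4), estimate_cell_count(784, res))
```

Each step up in resolution multiplies the cell count by roughly 7, which is usually the deciding factor between two adjacent candidates.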
Visualization patterns
With Kepler.gl
from keplergl import KeplerGl
map1 = KeplerGl()
map1.add_data(data=cell_counts_df, name="h3_data")
# Kepler.gl natively recognizes H3 cell columns and renders hexagons
Because Kepler.gl draws the hexagons directly from the H3 index column, no polygon conversion is needed.
With Folium + h3
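import folium
import h3
m = folium.Map(location=[40.748, -73.985], zoom_start=13)
# gdf is assumed to have "h3_cell" and "normalized_count" (0-1) columns
for _, row in gdf.iterrows():
    # cell_to_boundary returns (lat, lng) pairs, the order Folium expects
    boundary = h3.cell_to_boundary(row["h3_cell"])
    folium.Polygon(
        locations=boundary,
        color="blue",
        fill=True,
        fill_opacity=row["normalized_count"],
    ).add_to(m)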
Performance benchmarks
| Operation | Time (1M cells) | Notes |
|---|---|---|
| latlng_to_cell (loop) | 4.5s | Python overhead |
| latlng_to_cell (h3ronpy) | 0.08s | Vectorized Rust |
| grid_disk(k=2) | 0.3s per call | 19 cells returned |
| polygon_to_cells (city) | 0.5-2s | Depends on resolution |
| compact_cells | 0.1s | 1M → ~150K cells |
| DuckDB H3 GROUP BY | 2s (100M rows) | Columnar + native ext |
Common pitfalls
| Pitfall | Symptom | Fix |
|---|---|---|
| Using string cells in databases | Slow joins, 2× storage | Store as BIGINT |
| Assuming 6 neighbors always | IndexError at pentagons | Check is_pentagon() or handle variable neighbor count |
| Wrong resolution for use case | Either too many cells or lost detail | Use the spacing heuristic above |
| Mixing resolutions in aggregation | Inconsistent cell sizes | uncompact_cells to uniform resolution before analysis |
| lat/lng argument order confusion | Points in wrong hemisphere | H3 v4 uses latlng_to_cell(lat, lng) — lat first |
The one thing to remember: H3’s real power is not just hexagonal binning — its hierarchical structure, compact representations, directed edges, and native database extensions make it a complete spatial indexing system that scales from notebook analysis to billion-row production pipelines.