Kepler.gl Visualization — Deep Dive
Kepler.gl’s Python integration looks simple on the surface, but building production-quality geospatial visualizations at scale requires understanding its rendering pipeline, configuration schema, data format optimizations, and embedding strategies.
Architecture: browser-side rendering
When you instantiate KeplerGl in a Jupyter notebook, it:
- Serializes your DataFrame to JSON (or Arrow if available)
- Embeds the Kepler.gl React/deck.gl application in an IFrame
- Passes data + config to the JavaScript layer
- deck.gl renders layers using WebGL on the client GPU
This means all rendering happens in the browser. Python never touches pixels. The implication: performance depends on the client’s GPU, and data transfer between Python and the browser is the bottleneck for large datasets.
Data format optimization
Reducing transfer size
For DataFrames with millions of rows, JSON serialization is slow and memory-hungry. Strategies:
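from keplergl import KeplerGl
# 1. Downcast numeric types before adding to the map.
#    float32 keeps roughly metre-level precision for lat/lng;
#    int16 only holds -32768..32767, so check the value range first.
df["lat"] = df["lat"].astype("float32")
df["lng"] = df["lng"].astype("float32")
df["value"] = df["value"].astype("int16")
# 2. Drop columns Kepler doesn't need
df_slim = df[["lat", "lng", "value", "timestamp"]]
map1 = KeplerGl(height=600)
map1.add_data(data=df_slim, name="points")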
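A quick way to check whether dropping columns pays off is to measure the serialized payload before and after. The sketch below uses only the standard library and assumes a JSON layout with a column header plus rows; that layout, the helper names, and the sample data are illustrative assumptions, not keplergl's exact wire format.

```python
import json


def split_payload_bytes(columns, rows):
    """Size in bytes of a column-header-plus-rows JSON payload."""
    payload = {"columns": list(columns), "data": [list(r) for r in rows]}
    return len(json.dumps(payload, separators=(",", ":")).encode("utf-8"))


def keep_columns(columns, rows, wanted):
    """Return (columns, rows) restricted to the wanted column names."""
    idx = [columns.index(name) for name in wanted]
    return list(wanted), [[row[i] for i in idx] for row in rows]


# Hypothetical sample: 1,000 rows with two columns the map never uses.
columns = ["lat", "lng", "value", "driver_note", "internal_id"]
rows = [
    [40.7 + i * 1e-4, -73.9 - i * 1e-4, i % 100, "note text %d" % i, "id-%06d" % i]
    for i in range(1000)
]

full_bytes = split_payload_bytes(columns, rows)
slim_cols, slim_rows = keep_columns(columns, rows, ["lat", "lng", "value"])
slim_bytes = split_payload_bytes(slim_cols, slim_rows)
print(f"full: {full_bytes} bytes, slim: {slim_bytes} bytes")
```

Because JSON is text, free-text and identifier columns tend to dominate the payload, which is why removing them usually helps more than changing numeric dtypes.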
GeoArrow integration
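Kepler.gl 3.x can load Arrow and GeoArrow tables in the browser. Whether the Python add_data call accepts a pyarrow Table depends on the installed keplergl version, so verify against your release before relying on this:
import pyarrow as pa
table = pa.Table.from_pandas(df)
map1.add_data(data=table, name="arrow_data")
Where it is supported, Arrow avoids the JSON round trip and can cut data transfer time by roughly 3-10× for large datasets; measure on your own data.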
Practical limits
| Row count | JSON transfer | Rendering FPS | Recommendation |
|---|---|---|---|
| <100K | Fast (<2s) | 60 FPS | Use as-is |
| 100K-1M | Slow (5-30s) | 30-60 FPS | Downcast types, drop columns |
| 1M-5M | Very slow | 15-30 FPS | Aggregate or sample |
| >5M | May crash browser | Unusable | Pre-aggregate with H3 or grid |
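The thresholds in the table can be encoded as a small guard that runs before data is handed to the map. This is a minimal sketch; the function name is hypothetical and the cut-offs are simply the ones from the table, which are rules of thumb and not hard limits.

```python
def recommend_strategy(n_rows: int) -> str:
    """Map a row count to the recommendation from the practical-limits table."""
    if n_rows < 0:
        raise ValueError("n_rows must be non-negative")
    if n_rows < 100_000:
        return "Use as-is"
    if n_rows <= 1_000_000:
        return "Downcast types, drop columns"
    if n_rows <= 5_000_000:
        return "Aggregate or sample"
    return "Pre-aggregate with H3 or grid"


for n in (50_000, 400_000, 2_000_000, 20_000_000):
    print(f"{n:>12,} rows -> {recommend_strategy(n)}")
```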
Multi-layer dashboard construction
Combining layer types
config = {
    "version": "v1",
    "config": {
        "visState": {
            "layers": [
                {
                    "id": "heatmap_layer",
                    "type": "heatmap",
                    "config": {
                        "dataId": "pickups",
                        "columns": {"lat": "lat", "lng": "lng"},
                        "visConfig": {
                            "radius": 30,
                            "colorRange": {
                                "name": "Global Warming",
                                "type": "sequential",
                                "colors": ["#5A1846", "#900C3F", "#C70039",
                                           "#E3611C", "#F1920E", "#FFC300"],
                            },
                        },
                    },
                },
                {
                    "id": "arc_layer",
                    "type": "arc",
                    "config": {
                        "dataId": "flows",
                        "columns": {
                            "lat0": "origin_lat", "lng0": "origin_lng",
                            "lat1": "dest_lat", "lng1": "dest_lng",
                        },
                        "color": [130, 154, 227],
                        "visConfig": {"thickness": 2, "opacity": 0.5},
                    },
                },
            ],
            "filters": [
                {
                    "dataId": ["pickups"],
                    "name": ["timestamp"],
                    "type": "timeRange",
                    "enlarged": True,
                }
            ],
        },
    },
}
from keplergl import KeplerGl
dashboard = KeplerGl(
    height=700,
    data={"pickups": pickups_df, "flows": flows_df},
    config=config,
)
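A common failure with hand-written configs is a layer or filter whose dataId does not match any dataset name passed to the map, so a cheap pre-flight check can be worth having. The helper below is a hypothetical sketch (not part of keplergl) that walks the config structure shown above and reports any dataId with no matching dataset.

```python
def missing_data_ids(config, dataset_names):
    """Return sorted dataIds referenced by layers/filters but absent from the data."""
    vis_state = config.get("config", {}).get("visState", {})
    known = set(dataset_names)
    referenced = set()
    for layer in vis_state.get("layers", []):
        data_id = layer.get("config", {}).get("dataId")
        if data_id is not None:
            referenced.add(data_id)
    for flt in vis_state.get("filters", []):
        ids = flt.get("dataId", [])
        if isinstance(ids, str):  # tolerate a single string as well as a list
            ids = [ids]
        referenced.update(ids)
    return sorted(referenced - known)


# Trimmed-down example config with the same shape as the dashboard config.
example_config = {
    "version": "v1",
    "config": {
        "visState": {
            "layers": [
                {"id": "heatmap_layer", "type": "heatmap",
                 "config": {"dataId": "pickups"}},
                {"id": "arc_layer", "type": "arc",
                 "config": {"dataId": "flows"}},
            ],
            "filters": [{"dataId": ["pickups"], "name": ["timestamp"]}],
        }
    },
}

print(missing_data_ids(example_config, ["pickups"]))
print(missing_data_ids(example_config, ["pickups", "flows"]))
```

Calling it with the keys of the data dict right before constructing KeplerGl turns a silently empty layer into an explicit list of mismatched names.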
Time animation
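The timeRange filter adds a playback slider to the map. During playback, Kepler.gl slides a time window across the data and shows only the rows whose timestamp falls inside it; it does not interpolate between rows (interpolation along a path is specific to the trip layer).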
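Key settings for layer animation sit under visState, next to layers and filters (the timeRange filter carries its own speed setting):
"animationConfig": {
    "currentTime": None,  # auto-detect from data
    "speed": 1,           # playback speed multiplier
}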
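To make the placement concrete, the sketch below adds or updates animationConfig inside an existing config dict without mutating the original. The helper name is hypothetical, and it assumes animationConfig belongs under config.visState, as in configs saved from the Kepler.gl UI.

```python
import copy


def with_animation_speed(config, speed):
    """Return a copy of config with visState.animationConfig.speed set."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    vis_state = updated.setdefault("config", {}).setdefault("visState", {})
    animation = vis_state.setdefault("animationConfig", {"currentTime": None})
    animation["speed"] = speed
    return updated


base_config = {"version": "v1", "config": {"visState": {"layers": []}}}
fast_config = with_animation_speed(base_config, 2)
print(fast_config["config"]["visState"]["animationConfig"])
```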
Trip layer: animated trajectories
The trip layer animates moving objects along LineString geometries. Each coordinate must have four elements, [longitude, latitude, altitude, timestamp], with the timestamp in Unix seconds or milliseconds:
# Each feature is a LineString whose coordinates carry the timestamp
# as the fourth element: [lng, lat, altitude, unix_timestamp]
trip_geojson = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "LineString",
                "coordinates": [
                    [-73.99, 40.73, 0, 1609459200],
                    [-73.98, 40.74, 0, 1609459260],
                    [-73.97, 40.75, 0, 1609459320],
                ],
            },
            "properties": {"vendor": "A"},
        }
    ],
}
# Pass the GeoJSON dict directly; a GeoDataFrame round trip can drop the
# fourth (timestamp) coordinate.
map1.add_data(data=trip_geojson, name="trips")
Select the trip layer type for this dataset, and Kepler renders an animated path along each trajectory.
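Raw GPS data usually arrives as rows of (lng, lat, timestamp), not as GeoJSON. Kepler.gl's documented trip format expects the timestamp as the fourth element of each LineString coordinate, so a small builder helps. The function names below are hypothetical, and the altitude default of 0 is an assumption for data without elevation.

```python
def make_trip_feature(points, properties=None, altitude=0.0):
    """Build a GeoJSON Feature from (lng, lat, unix_ts) tuples.

    Coordinates are emitted as [lng, lat, altitude, unix_ts], sorted by time.
    """
    ordered = sorted(points, key=lambda p: p[2])
    if len(ordered) < 2:
        raise ValueError("a trip needs at least two points")
    coords = [[lng, lat, altitude, ts] for lng, lat, ts in ordered]
    return {
        "type": "Feature",
        "geometry": {"type": "LineString", "coordinates": coords},
        "properties": dict(properties or {}),
    }


def make_trip_collection(trips):
    """trips: iterable of (points, properties) pairs."""
    return {
        "type": "FeatureCollection",
        "features": [make_trip_feature(pts, props) for pts, props in trips],
    }


# Points deliberately out of order to show the time sort.
sample_points = [
    (-73.98, 40.74, 1609459260),
    (-73.99, 40.73, 1609459200),
    (-73.97, 40.75, 1609459320),
]
trip_collection = make_trip_collection([(sample_points, {"vendor": "A"})])
print(trip_collection["features"][0]["geometry"]["coordinates"][0])
```

The resulting dict can be passed straight to add_data as GeoJSON.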
Custom color scales
Quantile breaks for choropleth maps
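import numpy as np
# Kepler.gl computes its own breaks for a quantile scale; this is only for inspection
breaks = np.percentile(df["population"], [20, 40, 60, 80])
# Palette: goes in the layer's visConfig["colorRange"]
color_config = {
    "name": "custom-quantile",
    "type": "sequential",
    "colors": ["#ffffcc", "#a1dab4", "#41b6c4", "#2c7fb8", "#253494"],
}
# Scale: goes in the layer's visualChannels
visual_channels = {
    "colorField": {"name": "population", "type": "integer"},
    "colorScale": "quantile",
}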
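To preview which color a given value would land in under a quantile scale, the breaks can be approximated offline. This sketch uses the standard library's statistics.quantiles; Kepler.gl's own quantile computation may place breaks slightly differently, so treat the result as an approximation. The helper name and sample values are illustrative.

```python
from bisect import bisect_right
from statistics import quantiles


def quantile_color_lookup(values, colors):
    """Return (cut_points, color_for) for an equal-count color scale."""
    # n bins need n - 1 interior cut points.
    cuts = quantiles(values, n=len(colors))

    def color_for(value):
        # bisect_right yields a bin index between 0 and len(colors) - 1.
        return colors[bisect_right(cuts, value)]

    return cuts, color_for


palette = ["#ffffcc", "#a1dab4", "#41b6c4", "#2c7fb8", "#253494"]
population = list(range(1, 101))  # hypothetical sample values
cuts, color_for = quantile_color_lookup(population, palette)
print(cuts)
print(color_for(1), color_for(50), color_for(100))
```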
Diverging scales for change data
color_config = {
"name": "diverging",
"type": "diverging",
"colors": ["#d73027", "#fc8d59", "#fee08b",
"#d9ef8b", "#91cf60", "#1a9850"],
}
Embedding in web applications
Static HTML export
map1.save_to_html(file_name="dashboard.html", read_only=True)
The read_only=True flag hides the configuration sidebar, presenting a clean visualization.
Flask/Django integration
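# Generate the HTML once (may be bytes, depending on the keplergl version)
html = map1._repr_html_()
if isinstance(html, bytes):
    html = html.decode("utf-8")
# Serve in Flask
from flask import Flask, Response
app = Flask(__name__)
@app.route("/map")
def show_map():
    # Return the page directly; rendering it as a Jinja template could
    # misinterpret braces in the embedded JavaScript and data.
    return Response(html, mimetype="text/html")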
Streamlit integration
import streamlit as st
from keplergl import KeplerGl
from streamlit_keplergl import keplergl_static
map1 = KeplerGl(data={"data": df}, config=config)
keplergl_static(map1, height=600)
Combining with preprocessing pipelines
A typical analytical workflow:
# 1. Query data with DuckDB, pre-aggregating to H3 cells.
#    Assumes the DuckDB h3 community extension; function names may vary by version.
import duckdb
duckdb.sql("INSTALL h3 FROM community")
duckdb.sql("LOAD h3")
df = duckdb.query("""
    SELECT avg(lat) AS lat, avg(lng) AS lng, count(*) AS trips
    FROM taxi_data
    WHERE pickup_time >= '2025-06-01' AND pickup_time < '2025-07-01'
    GROUP BY h3_latlng_to_cell(lat, lng, 8)
""").df()
# 2. Enrich with spatial joins
import geopandas as gpd
neighborhoods = gpd.read_file("neighborhoods.geojson")
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df.lng, df.lat), crs=4326)
enriched = gpd.sjoin(gdf, neighborhoods, predicate="within")
# 3. Visualize
from keplergl import KeplerGl
map1 = KeplerGl(height=700)
map1.add_data(data=enriched, name="enriched_trips")
Performance tuning checklist
| Technique | Impact |
|---|---|
| Reduce columns to only needed fields | 2-5× faster data transfer |
| Downcast float64 → float32 | 50% memory reduction |
| Pre-aggregate with H3 hexagons | 100-1000× fewer rows |
| Use hexbin/grid layers instead of points | GPU renders aggregates faster |
| Limit time filter window | Fewer features rendered per frame |
| Use Arrow serialization | 3-10× faster than JSON |
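Pre-aggregation is the technique with the largest payoff in the table, and it does not require H3 to get started. The sketch below bins points into a fixed latitude/longitude grid using only the standard library. It is a simpler stand-in for H3: cells defined in degrees are not equal-area, and the function name and default cell size are illustrative assumptions.

```python
import math
from collections import Counter


def grid_aggregate(points, cell_deg=0.01):
    """Collapse (lat, lng) points into per-cell counts on a regular degree grid.

    Returns one dict per non-empty cell with the cell-center lat/lng and a count.
    """
    counts = Counter()
    for lat, lng in points:
        key = (math.floor(lat / cell_deg), math.floor(lng / cell_deg))
        counts[key] += 1
    return [
        {"lat": (row + 0.5) * cell_deg, "lng": (col + 0.5) * cell_deg, "count": n}
        for (row, col), n in counts.items()
    ]


sample_points = [(40.731, -73.991), (40.732, -73.992), (40.755, -73.971)]
cells = grid_aggregate(sample_points, cell_deg=0.01)
print(len(sample_points), "points ->", len(cells), "cells")
```

The output rows carry lat, lng, and count fields, so they can be loaded as an ordinary point dataset with count driving color or radius.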
Comparison with alternatives
| Tool | Strength | Weakness |
|---|---|---|
| Kepler.gl | Best interactive exploration, large data | Browser-only, no print-quality export |
| Folium | Simple, Leaflet-based, lightweight | Slow with >50K points, no 3D |
| PyDeck | Programmatic deck.gl, good for dashboards | Less interactive exploration UI |
| Plotly Mapbox | Integrated with Plotly ecosystem | Mapbox-hosted base map styles need a token |
Kepler.gl wins when you need to explore large datasets interactively. For static publication maps, use matplotlib + contextily. For programmatic dashboards, consider PyDeck.
The one thing to remember: Kepler.gl’s power comes from client-side GPU rendering — optimize data transfer (fewer columns, smaller types, pre-aggregation) and layer configuration to keep millions of features interactive and visually informative.