Bokeh Interactive Plots — Deep Dive

Bokeh’s architecture separates the data model (Python), the rendering engine (BokehJS/JavaScript), and the synchronization layer (WebSocket for server mode). Understanding these layers lets you build visualizations that scale from quick exploratory charts to production dashboards serving hundreds of concurrent users.

ColumnDataSource Deep Dive

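Every Bokeh glyph reads from a ColumnDataSource. How much data travels to the browser on an update depends on the method you use: assigning source.data sends the full columns again, while stream() and patch() send only the new rows or the changed cells. That is what makes incremental updates efficient.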

from bokeh.models import ColumnDataSource
from bokeh.plotting import figure, curdoc
import numpy as np

source = ColumnDataSource(data={
    "x": np.random.random(1000),
    "y": np.random.random(1000),
    "size": np.random.uniform(2, 15, 1000),
    "color": np.random.choice(["#e74c3c", "#3498db", "#2ecc71"], 1000)
})

fig = figure(width=800, height=500, tools="pan,box_zoom,hover,reset")
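# scatter() draws circle markers by default; circle() with a size argument is deprecated in recent Bokeh releases
fig.scatter("x", "y", size="size", color="color", alpha=0.6, source=source)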

Updating data in server mode uses three methods:

  • source.data = new_dict — full replacement
  • source.stream(new_data, rollover=500) — append rows, keep at most 500
  • source.patch({col: [(index, value)]}) — update specific cells

stream() is essential for real-time dashboards. The rollover parameter caps the dataset size, preventing memory growth. Internally, BokehJS receives only the new rows via WebSocket and appends them to its local data store.
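
As a rough illustration of the methods above, the following sketch calls stream() and patch() on a small source outside of a server, so the effect on the Python-side data can be inspected directly. The column names and values are made up for the example.

```python
from bokeh.models import ColumnDataSource

# Plain lists keep the example easy to inspect; NumPy arrays work too.
source = ColumnDataSource(data={"x": [1, 2, 3], "y": [10, 20, 30]})

# stream() appends rows; every column must be present in the new data.
# rollover=4 keeps only the 4 most recent rows.
source.stream({"x": [4, 5], "y": [40, 50]}, rollover=4)
print(list(source.data["x"]))  # [2, 3, 4, 5]

# patch() overwrites individual cells: column name -> [(row index, new value)].
source.patch({"y": [(0, 99)]})
print(list(source.data["y"]))  # [99, 30, 40, 50]
```

In a running server application the same calls also send the corresponding change to the browser.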

Custom JavaScript Callbacks

For static HTML output, Python callbacks aren’t available. Instead, use CustomJS to define JavaScript that runs in the browser:

from bokeh.models import CustomJS, Slider
from bokeh.layouts import column

slider = Slider(start=0.1, end=2.0, value=1.0, step=0.1, title="Size scale")

# Keep the unscaled sizes so repeated slider moves do not compound.
base_sizes = source.data["size"].tolist()

callback = CustomJS(args=dict(source=source, slider=slider, base_sizes=base_sizes), code="""
    // Always scale from the original sizes, not from the current ones.
    const scale = slider.value;
    source.data['size'] = base_sizes.map(s => s * scale);
    source.change.emit();
""")

slider.js_on_change("value", callback)

CustomJS callbacks have access to all Bokeh model objects passed via args. They execute in the browser’s JavaScript runtime, so they work in static files. The source.change.emit() call triggers BokehJS to re-render affected glyphs.
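
To show how such a callback ends up in a static page, here is a minimal sketch that lays out a slider with a plot and renders both to an HTML string. The data, the base_sizes list, and the page title are assumptions made for illustration; cb_obj is the name BokehJS gives to the model that fired the callback.

```python
from bokeh.embed import file_html
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, CustomJS, Slider
from bokeh.plotting import figure
from bokeh.resources import CDN

base_sizes = [4.0, 8.0, 12.0]  # hypothetical unscaled marker sizes
source = ColumnDataSource(data={"x": [1, 2, 3], "y": [3, 1, 2],
                                "size": list(base_sizes)})

fig = figure(width=400, height=300)
fig.scatter("x", "y", size="size", source=source)

slider = Slider(start=0.1, end=2.0, value=1.0, step=0.1, title="Size scale")
slider.js_on_change("value", CustomJS(
    args=dict(source=source, base_sizes=base_sizes),
    code="""
    // cb_obj is the model that fired the callback, here the slider.
    source.data['size'] = base_sizes.map(s => s * cb_obj.value);
    source.change.emit();
"""))

# Lay out the widget with the plot so both end up in the same page.
html = file_html(column(slider, fig), resources=CDN, title="Size scale demo")
```

Writing html to a file gives a page whose slider works without any Python process running.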

Server Application Architecture

A Bokeh server application is a Python script that the server executes for each new session. The script calls curdoc() to get that session's document and adds its models to it. The server manages sessions, and each browser tab gets its own document instance with isolated state.

from bokeh.plotting import figure, curdoc
from bokeh.models import Select, ColumnDataSource
from bokeh.layouts import column
import pandas as pd

# Note: bokeh serve runs this script once per session, so the CSV is re-read for every new session
full_data = pd.read_csv("sales_data.csv")

# Session-specific state
source = ColumnDataSource(data=full_data[full_data["region"] == "North"].to_dict("list"))

fig = figure(width=900, height=400, title="Sales by Region")
fig.vbar(x="month", top="revenue", width=0.8, source=source)

select = Select(title="Region", value="North",
                options=["North", "South", "East", "West"])

def update_region(attr, old, new):
    filtered = full_data[full_data["region"] == new]
    source.data = filtered.to_dict("list")
    fig.title.text = f"Sales — {new} Region"

select.on_change("value", update_region)
curdoc().add_root(column(select, fig))
curdoc().title = "Sales Dashboard"

Run with bokeh serve app.py. Each browser session gets its own document, returned by curdoc(). When the user changes the dropdown, the on_change callback runs on the Python side and filters the DataFrame, and the updated source is pushed back to the browser over the WebSocket.

Streaming Real-Time Data

For live dashboards — monitoring servers, financial tickers, IoT sensors — Bokeh’s periodic callback mechanism drives updates:

from bokeh.plotting import figure, curdoc
from bokeh.models import ColumnDataSource
import random
from datetime import datetime

source = ColumnDataSource(data={"time": [], "value": []})

fig = figure(width=900, height=300, x_axis_type="datetime",
             title="Live Sensor Feed")
fig.line("time", "value", source=source, line_width=2)

def stream_update():
    new_data = {
        "time": [datetime.now()],
        "value": [random.gauss(50, 10)]
    }
    source.stream(new_data, rollover=200)

curdoc().add_periodic_callback(stream_update, 500)  # every 500ms
curdoc().add_root(fig)

add_periodic_callback runs the function server-side at a fixed interval, given in milliseconds. Combined with stream(), this creates a sliding-window visualization: at 500 ms intervals with a rollover of 200, a full buffer covers roughly the last 100 seconds of data (200 points × 0.5 s).
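
The relationship between interval, rollover, and visible time span can be written down as two small helpers. The function names are hypothetical, and the result is approximate because periodic callbacks are not guaranteed to fire at exact intervals.

```python
import math

def window_seconds(rollover: int, interval_ms: int) -> float:
    """Approximate time span covered by a full buffer of `rollover` points."""
    return rollover * interval_ms / 1000

def rollover_for(window_s: float, interval_ms: int) -> int:
    """Smallest rollover that covers at least `window_s` seconds."""
    return math.ceil(window_s * 1000 / interval_ms)

print(window_seconds(200, 500))  # 100.0
print(rollover_for(300, 500))    # 600
```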

Custom Tools

Bokeh allows extending the toolbar with custom tools written in TypeScript/JavaScript. For simpler cases, TapTool with a callback handles click interactions:

from bokeh.models import TapTool, CustomJS

# Assumes the source has "name" and "value" columns
tap_callback = CustomJS(args=dict(source=source), code="""
    const indices = source.selected.indices;
    if (indices.length > 0) {
        const idx = indices[0];
        const name = source.data['name'][idx];
        const value = source.data['value'][idx];
        alert(`Selected: ${name} = ${value}`);
    }
""")

tap_tool = TapTool(callback=tap_callback)
fig.add_tools(tap_tool)

For server mode, replace CustomJS with a Python callback on source.selected.on_change("indices", handler) to trigger database lookups, API calls, or model inference when users click data points.
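
A minimal sketch of the Python-side variant follows. The source contents and the clicked list (standing in for a database lookup or API call) are assumptions made for the example. Setting the selection from Python triggers the same handler that a tap in the browser would.

```python
from bokeh.models import ColumnDataSource

source = ColumnDataSource(data={"name": ["a", "b", "c"], "value": [1, 2, 3]})
clicked = []  # stand-in for a database lookup or API call

def on_select(attr, old, new):
    # `new` is the list of selected row indices; it is empty on deselect.
    for idx in new:
        clicked.append((source.data["name"][idx], source.data["value"][idx]))

source.selected.on_change("indices", on_select)

# Simulate the selection a user makes by tapping the second point.
source.selected.indices = [1]
print(clicked)  # [('b', 2)]
```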

Embedding Strategies

Bokeh plots integrate into larger applications several ways:

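Standalone HTML: save(fig) writes an .html file. By default the page loads BokehJS from the CDN; pass resources=INLINE (from bokeh.resources) to embed BokehJS and get a fully self-contained file. Expect file sizes from a few hundred KB to several MB, depending on data volume and on whether BokehJS is inlined.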

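Components: components(fig) returns a (script, div) tuple that you embed in templates. The script renders the plot into the div, but it does not load BokehJS; the page must include the matching BokehJS files itself, for example from the CDN. Works with Flask, Django, or any template engine.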

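Bokeh Server embedding: Use server_document("http://bokeh-server:5006/app") in your web app to embed a live Bokeh application. It returns a script tag that, once placed in a page, opens a session on the Bokeh server and renders the app in place, with no iframe required. The Bokeh server handles WebSocket connections and session management.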

from bokeh.embed import components, file_html
from bokeh.resources import CDN

# For templates
script, div = components(fig)

# Full standalone HTML string
html = file_html(fig, resources=CDN, title="Dashboard")
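
When using components(), the surrounding page is responsible for loading BokehJS. One way to assemble such a page by hand is sketched below, with an f-string standing in for a real template engine. The plot data is made up.

```python
from bokeh.embed import components
from bokeh.plotting import figure
from bokeh.resources import CDN

fig = figure(width=400, height=300)
fig.line([1, 2, 3], [4, 6, 5])

script, div = components(fig)

# CDN.render() returns the tags that load the matching BokehJS version.
page = f"""<!DOCTYPE html>
<html>
  <head>{CDN.render()}</head>
  <body>
    {div}
    {script}
  </body>
</html>"""
```

In Flask or Django the same three pieces (resources, div, script) would be passed to the template as variables and rendered unescaped.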

Performance at Scale

By default Bokeh renders on an HTML5 Canvas, which stays responsive up to roughly 100,000 glyphs, depending on glyph type and client hardware. Beyond that, consider:

WebGL rendering: Enable with fig.output_backend = "webgl". GPU acceleration handles millions of points for scatter and line glyphs. Not all glyph types support WebGL — check the docs for your specific glyph.

Downsampling server-side: For datasets with millions of rows, aggregate before sending to the browser. Compute bin counts, moving averages, or min/max envelopes in Pandas and send the summary.
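
One possible way to compute a min/max envelope in Pandas is sketched here. The function name, the column names, and the bin count are assumptions; the right aggregation depends on what the chart needs to show.

```python
import numpy as np
import pandas as pd

def minmax_envelope(df, x="x", y="y", n_bins=500):
    """Reduce df to at most n_bins rows with the min and max of y per x-bin."""
    # Assign each row to one of n_bins equal-width intervals along x.
    bins = pd.cut(df[x], bins=n_bins, labels=False).rename("bin")
    grouped = df.groupby(bins)
    return pd.DataFrame({
        x: grouped[x].mean(),
        "y_min": grouped[y].min(),
        "y_max": grouped[y].max(),
        "count": grouped[y].size(),
    }).reset_index(drop=True)

rng = np.random.default_rng(0)
raw = pd.DataFrame({"x": np.linspace(0, 100, 100_000),
                    "y": rng.normal(size=100_000)})
summary = minmax_envelope(raw, n_bins=500)
print(len(raw), "->", len(summary), "rows")
```

The summary frame can then be passed to a ColumnDataSource and drawn, for example, as a filled band with varea using y_min and y_max.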

Lazy loading: In server mode, load only the visible data range. Register x_range.on_change handlers for both "start" and "end" to detect when the user pans or zooms, then query and push only the relevant subset.

fig.output_backend = "webgl"

def on_range_change(attr, old, new):
    x_start = fig.x_range.start
    x_end = fig.x_range.end
    visible = full_data[(full_data["x"] >= x_start) & (full_data["x"] <= x_end)]
    source.data = visible.to_dict("list")

fig.x_range.on_change("start", on_range_change)
fig.x_range.on_change("end", on_range_change)

Deployment Patterns

Single-user: bokeh serve app.py works for development and internal tools. Defaults to port 5006.

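Multi-user production: Run behind Nginx, configured to forward WebSocket upgrade requests, and pass --allow-websocket-origin with the public hostname so the server accepts WebSocket connections from that origin. Use --num-procs for multiple worker processes. Each process handles its own set of sessions.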

bokeh serve app.py \
    --port 5006 \
    --num-procs 4 \
    --allow-websocket-origin dashboard.example.com

Containerized: Bokeh server runs well in Docker. Set --address 0.0.0.0 to bind all interfaces. Health checks should hit the / endpoint. Memory scales linearly with concurrent sessions — budget ~50MB per session for data-heavy dashboards.

Serverless alternative: For static output, pre-render HTML during CI/CD and deploy to S3/CDN. No server needed; interactivity (zoom, hover, selection) still works fully.

Testing Bokeh Applications

Server applications can be tested with screenshot regression using export_png from bokeh.io (which needs Selenium and a browser driver), or with Selenium directly for interaction testing. For unit testing, validate the data model directly:

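import pandas as pd
from bokeh.models import ColumnDataSource

def test_region_filter():
    full_data = pd.DataFrame({"region": ["North", "South"], "revenue": [10, 20]})  # stand-in for sales_data.csv
    source = ColumnDataSource(data=full_data[full_data["region"] == "North"].to_dict("list"))
    assert len(source.data["revenue"]) > 0
    assert all(r == "North" for r in source.data["region"])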
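
The callback itself can also be exercised without a server, because setting a widget property from Python fires its on_change handlers. The sketch below assumes the wiring is moved into a factory function (build_region_filter is a hypothetical name) so that a test can build it from a small in-memory frame.

```python
import pandas as pd
from bokeh.models import ColumnDataSource, Select

def build_region_filter(full_data, initial="North"):
    """Hypothetical factory that wires a Select to a ColumnDataSource."""
    source = ColumnDataSource(
        data=full_data[full_data["region"] == initial].to_dict("list"))
    select = Select(title="Region", value=initial,
                    options=sorted(full_data["region"].unique()))

    def update_region(attr, old, new):
        source.data = full_data[full_data["region"] == new].to_dict("list")

    select.on_change("value", update_region)
    return select, source

def test_select_updates_source():
    frame = pd.DataFrame({"region": ["North", "South", "South"],
                          "revenue": [10, 20, 30]})
    select, source = build_region_filter(frame)
    select.value = "South"  # fires update_region, as a browser change would
    assert list(source.data["revenue"]) == [20, 30]
    assert set(source.data["region"]) == {"South"}
```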

For integration testing, bokeh.client connects to a running server programmatically, allowing automated interaction sequences.

One thing to remember: Bokeh’s power comes from its dual architecture — rich client-side interactivity for static files, plus a WebSocket-synced server mode when user actions need to drive Python computation. Choose the right mode for your deployment, and you get interactive dashboards without unnecessary infrastructure.
