Bokeh for Web Applications
Build Interactive Web Visualizations with Python! 🌐
Bokeh is a powerful Python library for creating interactive visualizations for modern web browsers. It helps you build complex dashboards, real-time streaming plots, and sophisticated web applications - all from Python! With Bokeh, you can create JavaScript-powered visualizations without writing any JavaScript.
Why Bokeh?
Bokeh excels at creating interactive web-based visualizations with these key features:
- 🌐 Web-Native: Generates HTML and JavaScript that runs in any modern browser
- ⚡ High Performance: Handles large and streaming datasets efficiently
- 🔧 Server Applications: Build complete web apps with Bokeh Server
- 🎨 Customizable: Full control over every visual aspect
- 🔄 Real-time Updates: Stream data to live-updating plots
- 🤝 Integration: Works with Jupyter, Flask, Django, and more
Installation and Setup
# Install Bokeh
# pip install bokeh
# Basic imports
import bokeh  # needed for bokeh.__version__ below
from bokeh.plotting import figure, output_file, show, output_notebook
from bokeh.models import HoverTool, ColumnDataSource, CategoricalColorMapper
from bokeh.layouts import row, column, gridplot
from bokeh.palettes import Spectral, Category20, Viridis256
from bokeh.io import curdoc, push_notebook
from bokeh.transform import dodge, jitter, cumsum
import pandas as pd
import numpy as np
# For Jupyter notebooks
output_notebook()
# For standalone HTML files
output_file("visualization.html")
print("Bokeh version:", bokeh.__version__)
Basic Plotting with Bokeh
Creating Your First Plot
# Create a new plot with tools
p = figure(
title="My First Bokeh Plot",
x_axis_label='X-axis',
y_axis_label='Y-axis',
width=800,
height=400,
tools="pan,wheel_zoom,box_zoom,reset,hover,save",
toolbar_location="above"
)
# Add a line renderer
x = [1, 2, 3, 4, 5]
y = [2, 5, 8, 2, 7]
p.line(x, y, legend_label="Line", line_width=2, color="navy", alpha=0.8)
# Add circle markers (scatter() draws circles by default; circle() with size is deprecated in recent Bokeh releases)
p.scatter(x, y, size=10, color="orange", alpha=0.5, legend_label="Points")
# Customize the legend
p.legend.location = "top_left"
p.legend.click_policy = "hide" # Click legend to hide/show
# Show the result
show(p)
Different Glyph Types
# Create figure
p = figure(width=800, height=600, title="Bokeh Glyph Gallery")
# Different glyph types
x = np.linspace(0, 4*np.pi, 100)
y = np.sin(x)
# Line glyphs
p.line(x, y, legend_label="line", color="blue", line_width=2)
p.multi_line([x[:50], x[50:]], [y[:50], y[50:]],
color=["red", "green"], alpha=0.8, line_width=3)
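# Marker glyphs (scatter() with a marker name replaces the deprecated per-marker methods)
p.scatter(x[::5], y[::5], marker="circle", size=10, color="navy", alpha=0.5)
p.scatter(x[::7], y[::7], marker="square", size=8, color="olive", alpha=0.7)
p.scatter(x[::9], y[::9], marker="triangle", size=10, color="gold")
p.scatter(x[::11], y[::11], marker="diamond", size=12, color="red", alpha=0.5)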
# Area glyphs
p.varea(x=x, y1=y-0.2, y2=y+0.2, alpha=0.2, color="grey")
p.hbar(y=[0.5, 1, 1.5], left=0, right=[1, 2, 3], height=0.1, color="magenta")
# Customize grid
p.grid.grid_line_alpha = 0.3
show(p)
Working with Data Sources
ColumnDataSource is fundamental to Bokeh - it's the object that holds your data:
# Using ColumnDataSource for better data management
import pandas as pd
# Create sample DataFrame
df = pd.DataFrame({
'x': np.random.randn(100),
'y': np.random.randn(100),
'colors': np.random.choice(['red', 'green', 'blue'], 100),
'sizes': np.random.randint(10, 30, 100),
'labels': [f'Point {i}' for i in range(100)]
})
# Convert to ColumnDataSource
source = ColumnDataSource(df)
# Create plot using the source
p = figure(width=800, height=600, title="Data from ColumnDataSource",
tools="pan,wheel_zoom,box_select,lasso_select,reset")
# Plot using column names from source
p.scatter('x', 'y', size='sizes', color='colors', alpha=0.6, source=source)
# Add hover tool with data from source
hover = HoverTool(tooltips=[
("Index", "$index"),
("(X,Y)", "($x, $y)"),
("Label", "@labels"),
("Color", "@colors"),
("Size", "@sizes")
])
p.add_tools(hover)
show(p)
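A ColumnDataSource is essentially a mapping from column names to equal-length sequences. When data arrives as row records instead (for example from a JSON API), it has to be pivoted into that column layout first. The sketch below does this in plain Python; `records_to_columns` is a hypothetical helper name, not part of Bokeh, and the resulting dict is what you would pass as `ColumnDataSource(data=columns)`.

```python
def records_to_columns(records):
    """Pivot a list of row dicts into a dict of equal-length column lists."""
    if not records:
        return {}
    keys = list(records[0])
    columns = {key: [] for key in keys}
    for i, row in enumerate(records):
        # Reject rows with missing or extra fields so columns stay aligned
        if set(row) != set(keys):
            raise ValueError(f"row {i} has fields {sorted(row)}, expected {sorted(keys)}")
        for key in keys:
            columns[key].append(row[key])
    return columns


rows = [
    {"x": 0.5, "y": 1.2, "labels": "Point 0"},
    {"x": 1.5, "y": 0.3, "labels": "Point 1"},
]
columns = records_to_columns(rows)
# columns can now be passed as ColumnDataSource(data=columns)
```

Validating the field names up front is a design choice: mismatched column lengths are easier to diagnose here than after the data reaches the browser.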
Interactive Widgets
Adding Interactive Controls
from bokeh.layouts import column, row
from bokeh.models import ColumnDataSource, Slider, Select, TextInput, Button, CheckboxGroup
from bokeh.plotting import figure, curdoc
import numpy as np
# Create plot
x = np.linspace(0, 4*np.pi, 100)
y = np.sin(x)
source = ColumnDataSource(data=dict(x=x, y=y))
plot = figure(title="Interactive Sine Wave", width=800, height=400)
line = plot.line('x', 'y', source=source, line_width=3, alpha=0.6)
# Create widgets
amplitude = Slider(title="Amplitude", value=1.0, start=0.1, end=5.0, step=0.1)
frequency = Slider(title="Frequency", value=1.0, start=0.1, end=5.0, step=0.1)
phase = Slider(title="Phase", value=0.0, start=0.0, end=2*np.pi, step=0.1)
offset = Slider(title="Offset", value=0.0, start=-5.0, end=5.0, step=0.1)
# Define callback
def update_data(attrname, old, new):
    # Get current widget values
    a = amplitude.value
    f = frequency.value
    p = phase.value
    o = offset.value
    # Generate new data
    x = np.linspace(0, 4*np.pi, 100)
    y = a * np.sin(f * x + p) + o
    # Update data source
    source.data = dict(x=x, y=y)
# Attach callbacks
for widget in [amplitude, frequency, phase, offset]:
    widget.on_change('value', update_data)
# Create layout
layout = column(plot, amplitude, frequency, phase, offset)
# For Bokeh server app
curdoc().add_root(layout)
curdoc().title = "Interactive Sine Wave"
# To run: bokeh serve --show app.py
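The callback above mixes widget access with the wave computation. One way to keep the math testable outside a running Bokeh server is to move it into a pure function. The sketch below uses only the standard library, and `sine_wave` is a hypothetical helper name rather than anything Bokeh provides.

```python
import math


def sine_wave(amplitude, frequency, phase, offset, n=100, x_max=4 * math.pi):
    """Return (x, y) lists for y = amplitude * sin(frequency * x + phase) + offset."""
    step = x_max / (n - 1)
    x = [i * step for i in range(n)]
    y = [amplitude * math.sin(frequency * xi + phase) + offset for xi in x]
    return x, y


# Same starting values as the sliders: amplitude 1, frequency 1, phase 0, offset 0
x, y = sine_wave(1.0, 1.0, 0.0, 0.0)
```

Inside `update_data`, the result could then be assigned with `x, y = sine_wave(a, f, p, o)` followed by `source.data = dict(x=x, y=y)`.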
Creating Dashboards
Multi-Plot Dashboard
from bokeh.plotting import figure, show
from bokeh.layouts import gridplot, column, row
from bokeh.models import TabPanel, Tabs  # TabPanel replaced Panel in Bokeh 3.0
import numpy as np
import pandas as pd
# Generate sample data
n = 500
x = np.random.random(n) * 100
y = np.random.random(n) * 100
colors = np.random.choice(['red', 'green', 'blue', 'yellow', 'purple'], n)
radii = np.random.random(n) * 1.5
# Create multiple plots
# 1. Scatter plot (circle() sizes markers by radius in data units)
p1 = figure(width=400, height=400, title="Scatter Plot",
            tools="pan,wheel_zoom,reset,hover")
p1.circle(x, y, radius=radii, fill_color=colors, fill_alpha=0.6, line_color=None)
# 2. Line plot
p2 = figure(width=400, height=400, title="Time Series",
            x_axis_type="datetime")
dates = pd.date_range('2024-01-01', periods=100)
ts_data = np.cumsum(np.random.randn(100))
p2.line(dates, ts_data, color='navy', alpha=0.8, line_width=2)
p2.scatter(dates, ts_data, size=4, color='navy', alpha=0.5)
# 3. Bar chart
categories = ['A', 'B', 'C', 'D', 'E']
values = [25, 40, 35, 20, 45]
p3 = figure(x_range=categories, width=400, height=400,
            title="Bar Chart", toolbar_location=None)
p3.vbar(x=categories, top=values, width=0.9, color="teal", alpha=0.8)
p3.y_range.start = 0
# 4. Heatmap
data = np.random.randn(10, 10)
p4 = figure(width=400, height=400, title="Heatmap",
            x_range=(0, 10), y_range=(0, 10),
            toolbar_location=None)
p4.image(image=[data], x=0, y=0, dw=10, dh=10, palette="Viridis256")
# Create dashboard layout
dashboard = gridplot([[p1, p2], [p3, p4]], sizing_mode="scale_width")
# Alternative: Using tabs
tab1 = TabPanel(child=row(p1, p2), title="Main Metrics")
tab2 = TabPanel(child=row(p3, p4), title="Additional Analysis")
tabs = Tabs(tabs=[tab1, tab2])
# Show dashboard
show(tabs)
Real-time Streaming Data
# Streaming data example
from bokeh.plotting import figure, curdoc
from bokeh.models import ColumnDataSource
from datetime import datetime
import numpy as np
# Initialize data source
source = ColumnDataSource(data=dict(
time=[],
value=[],
color=[]
))
# Create plot
plot = figure(
title="Real-time Data Stream",
x_axis_type='datetime',
width=900,
height=400,
tools="pan,wheel_zoom,box_zoom,reset"
)
# Add glyphs
plot.line('time', 'value', source=source, line_width=2, alpha=0.8)
plot.scatter('time', 'value', source=source, size=4,
             color='color', alpha=0.8)
# Streaming update function
def update():
    new_data = dict(
        time=[datetime.now()],
        value=[np.random.randn()],
        color=['red' if np.random.random() > 0.5 else 'blue']
    )
    # Stream new data (keeps last 100 points)
    source.stream(new_data, rollover=100)
# Configure x_range once so the view follows the most recent data
plot.x_range.follow = "end"
plot.x_range.follow_interval = 20000  # 20 seconds, in milliseconds on a datetime axis
plot.x_range.range_padding = 0.1
# Add periodic callback
curdoc().add_periodic_callback(update, 1000) # Update every second
curdoc().add_root(plot)
curdoc().title = "Streaming Data"
# Run with: bokeh serve --show streaming.py
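The `rollover=100` argument caps each column at its most recent 100 values, so memory stays bounded however long the stream runs. The plain-Python sketch below models that behaviour with `collections.deque`. It illustrates the semantics only and is not how Bokeh implements streaming; `RollingColumns` is a hypothetical class name.

```python
from collections import deque


class RollingColumns:
    """Column store that keeps only the newest `rollover` values per column."""

    def __init__(self, names, rollover):
        self.columns = {name: deque(maxlen=rollover) for name in names}

    def stream(self, new_data):
        # Every column must receive the same number of new values
        lengths = {len(values) for values in new_data.values()}
        if set(new_data) != set(self.columns) or len(lengths) != 1:
            raise ValueError("new_data must cover every column with equal lengths")
        for name, values in new_data.items():
            self.columns[name].extend(values)


store = RollingColumns(["time", "value"], rollover=100)
for t in range(250):
    store.stream({"time": [t], "value": [t * 0.1]})
```

After 250 updates the store holds only the values for steps 150 through 249, which mirrors what the plot would show.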
Advanced Visualizations
Network Graphs
from bokeh.plotting import figure, show
from bokeh.models import Ellipse, GraphRenderer, StaticLayoutProvider  # Oval is no longer available in Bokeh 3
from bokeh.palettes import Spectral4
import networkx as nx
# Create a network graph
G = nx.karate_club_graph()
# Create Bokeh graph from NetworkX
plot = figure(title="Network Graph Visualization",
              x_range=(-1.1, 1.1), y_range=(-1.1, 1.1),
              tools="pan,wheel_zoom,reset", width=800, height=800)
# Create graph renderer
graph = GraphRenderer()
# Set node positions using NetworkX layout
pos = nx.spring_layout(G)
graph.layout_provider = StaticLayoutProvider(graph_layout=pos)
# Node properties
graph.node_renderer.data_source.data = dict(
    index=list(G.nodes()),
    fill_color=[Spectral4[i%4] for i in range(len(G.nodes()))]
)
graph.node_renderer.glyph = Ellipse(width=0.05, height=0.05, fill_color="fill_color")
# Edge properties
graph.edge_renderer.data_source.data = dict(
start=[e[0] for e in G.edges()],
end=[e[1] for e in G.edges()]
)
# Add graph to plot
plot.renderers.append(graph)
# Add hover tool
from bokeh.models import HoverTool
hover = HoverTool(tooltips=[("Node", "@index")])
plot.add_tools(hover)
show(plot)
Geographic Plots
from bokeh.plotting import figure, show
from bokeh.models import GeoJSONDataSource
import json
# Create figure with Web Mercator axes
p = figure(
    x_axis_type="mercator", y_axis_type="mercator",
    width=900, height=600,
    title="Geographic Visualization",
    tools="pan,wheel_zoom,reset"
)
# Add map tiles; Bokeh 3.x accepts a provider name directly,
# and the older bokeh.tile_providers module is deprecated
p.add_tile("CartoDB Positron")
# Add points on map (convert lat/lon to Web Mercator)
def lat_lon_to_mercator(lat, lon):
    """Convert latitude/longitude to Web Mercator coordinates"""
    from math import pi, log, tan
    x = lon * 20037508.34 / 180
    y = log(tan((90 + lat) * pi / 360)) / (pi / 180) * 20037508.34 / 180
    return x, y
# Example cities
cities = {
'New York': (40.7128, -74.0060),
'London': (51.5074, -0.1278),
'Tokyo': (35.6762, 139.6503),
'Sydney': (-33.8688, 151.2093)
}
x_coords = []
y_coords = []
names = []
for city, (lat, lon) in cities.items():
    x, y = lat_lon_to_mercator(lat, lon)
    x_coords.append(x)
    y_coords.append(y)
    names.append(city)
# Add city markers
p.scatter(x_coords, y_coords, size=15, fill_color='red',
          fill_alpha=0.8, line_color='white', line_width=2)
# Add city labels
p.text(x_coords, y_coords, text=names,
text_align="center", text_baseline="bottom",
text_font_size="10pt", text_color="black")
show(p)
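The conversion used above is the standard spherical Web Mercator projection, and the constant 20037508.34 is pi times the Earth radius of 6378137 m. As a sanity check, the sketch below restates the same formula in terms of that radius and adds an inverse so a round trip can be verified. `mercator_to_lat_lon` and `EARTH_RADIUS_M` are names introduced here for illustration and do not appear in the original example.

```python
from math import atan, degrees, exp, log, pi, radians, tan

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator


def lat_lon_to_mercator(lat, lon):
    """Convert latitude/longitude in degrees to Web Mercator metres."""
    x = EARTH_RADIUS_M * radians(lon)
    y = EARTH_RADIUS_M * log(tan(pi / 4 + radians(lat) / 2))
    return x, y


def mercator_to_lat_lon(x, y):
    """Inverse of lat_lon_to_mercator: metres back to degrees."""
    lon = degrees(x / EARTH_RADIUS_M)
    lat = degrees(2 * atan(exp(y / EARTH_RADIUS_M)) - pi / 2)
    return lat, lon


x, y = lat_lon_to_mercator(51.5074, -0.1278)  # London
lat, lon = mercator_to_lat_lon(x, y)  # should recover the input coordinates
```

Note that the projection is undefined at the poles, so latitudes should stay strictly between -90 and 90 degrees.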
Custom JavaScript Callbacks
Add client-side interactivity without a Bokeh server:
from bokeh.models import CustomJS, Slider
from bokeh.layouts import column
from bokeh.plotting import figure, show, ColumnDataSource
import numpy as np
# Create data
x = np.linspace(0, 10, 500)
y = np.sin(x)
source = ColumnDataSource(data=dict(x=x, y=y))
# Create plot
plot = figure(width=800, height=400, title="Client-side Interaction")
line = plot.line('x', 'y', source=source, line_width=3, alpha=0.6)
# Create slider with JavaScript callback
slider = Slider(start=0.1, end=10, value=1, step=0.1, title="Frequency")
# JavaScript code to execute
callback = CustomJS(args=dict(source=source, slider=slider), code="""
const data = source.data;
const f = slider.value;
const x = data['x'];
const y = data['y'];
for (let i = 0; i < x.length; i++) {
y[i] = Math.sin(f * x[i]);
}
source.change.emit();
""")
slider.js_on_change('value', callback)
# Create layout
layout = column(slider, plot)
show(layout)
Integration with Web Frameworks
Flask Integration
# app.py - Flask + Bokeh
from flask import Flask, render_template
from bokeh.embed import components
from bokeh.plotting import figure
from bokeh.resources import INLINE
import numpy as np
app = Flask(__name__)
@app.route('/')
def index():
    # Create Bokeh plot
    x = np.linspace(0, 4*np.pi, 100)
    y = np.sin(x)
    plot = figure(title="Bokeh + Flask", width=800, height=400)
    plot.line(x, y, line_width=2, color='navy', alpha=0.8)
    # Get plot components
    script, div = components(plot)
    # Get Bokeh resources
    js_resources = INLINE.render_js()
    css_resources = INLINE.render_css()
    # Render template with plot
    return render_template(
        'index.html',
        plot_script=script,
        plot_div=div,
        js_resources=js_resources,
        css_resources=css_resources,
    )
if __name__ == '__main__':
    app.run(debug=True)
# templates/index.html
'''
<!DOCTYPE html>
<html>
<head>
    <title>Bokeh Flask App</title>
    {{ css_resources|safe }}
    {{ js_resources|safe }}
</head>
<body>
    <h1>Data Visualization Dashboard</h1>
    {{ plot_div|safe }}
    {{ plot_script|safe }}
</body>
</html>
'''
Embedding in Jupyter
# For Jupyter notebooks
from bokeh.io import output_notebook, push_notebook
from bokeh.plotting import figure, show
import numpy as np
# Enable notebook output
output_notebook()
# Create interactive plot
x = np.linspace(0, 10, 100)
y = np.sin(x)
p = figure(width=800, height=400, title="Interactive in Jupyter")
r = p.line(x, y, line_width=2)
# Show with notebook handle for updates
handle = show(p, notebook_handle=True)
# Update plot dynamically
for phase in np.linspace(0, 2*np.pi, 50):
    y = np.sin(x + phase)
    r.data_source.data['y'] = y
    push_notebook(handle=handle)
Building a Complete Application
# complete_app.py - Full Bokeh Server Application
from bokeh.plotting import figure, curdoc
from bokeh.layouts import column, row
from bokeh.models import (ColumnDataSource, Select, Slider,
TextInput, Button, Div, DataTable,
DateFormatter, TableColumn, HoverTool)
from bokeh.palettes import Category20_20
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
# Generate sample data
np.random.seed(42)
start_date = datetime.now() - timedelta(days=365)
dates = pd.date_range(start_date, periods=365, freq='D')
# Create dataset
data = pd.DataFrame({
'date': dates,
'sales': np.random.normal(1000, 200, 365) + np.sin(np.arange(365) * 2 * np.pi / 365) * 300,
'visitors': np.random.poisson(500, 365),
'conversion': np.random.uniform(0.02, 0.08, 365),
'category': np.random.choice(['Electronics', 'Clothing', 'Food', 'Books'], 365)
})
# Calculate additional metrics
data['revenue'] = data['sales'] * data['conversion'] * 100
# Create ColumnDataSource
source = ColumnDataSource(data)
filtered_source = ColumnDataSource(data)
# Create figures
# 1. Time series plot
time_plot = figure(
x_axis_type='datetime',
title='Sales Over Time',
width=900, height=300,
tools='pan,wheel_zoom,box_zoom,reset,save'
)
time_plot.line('date', 'sales', source=filtered_source,
line_width=2, color='navy', alpha=0.8)
time_plot.scatter('date', 'sales', source=filtered_source,
                  size=4, color='navy', alpha=0.5)
# Add hover tool
hover = HoverTool(tooltips=[
('Date', '@date{%F}'),
('Sales', '@sales{$0,0.00}'),
('Visitors', '@visitors{0,0}'),
('Revenue', '@revenue{$0,0.00}')
], formatters={'@date': 'datetime'})
time_plot.add_tools(hover)
# 2. Category breakdown
category_plot = figure(
x_range=['Electronics', 'Clothing', 'Food', 'Books'],
title='Sales by Category',
width=450, height=300
)
# Calculate category totals
category_data = data.groupby('category')['sales'].sum().reset_index()
category_source = ColumnDataSource(category_data)
category_plot.vbar(x='category', top='sales', source=category_source,
width=0.8, color='teal', alpha=0.8)
# 3. Scatter plot
scatter_plot = figure(
title='Sales vs Visitors',
width=450, height=300,
tools='pan,wheel_zoom,box_select,reset'
)
scatter_plot.scatter('visitors', 'sales', source=filtered_source,
size=8, color='orange', alpha=0.6)
# Create widgets
date_slider = Slider(
title="Filter by Day of Year",
start=1, end=365, value=365, step=1
)
category_select = Select(
title="Filter by Category",
value="All",
options=["All"] + list(data['category'].unique())
)
refresh_button = Button(label="Refresh Data", button_type="primary")
# Statistics display
stats_div = Div(text="", width=900, height=100)
def update_stats():
    """Update statistics display"""
    filtered_data = filtered_source.data
    df = pd.DataFrame(filtered_data)
    if len(df) > 0:
        total_sales = df['sales'].sum()
        avg_sales = df['sales'].mean()
        total_visitors = df['visitors'].sum()
        avg_conversion = df['conversion'].mean()
        # Div renders its text as HTML, so the summary is wrapped in tags
        stats_html = f"""
        <h3>Summary Statistics</h3>
        <p>
        Total Sales: ${total_sales:,.2f} |
        Average Daily Sales: ${avg_sales:,.2f} |
        Total Visitors: {total_visitors:,} |
        Avg Conversion Rate: {avg_conversion:.2%}
        </p>
        """
        stats_div.text = stats_html
# Callbacks
def filter_data():
    """Filter data based on widget values"""
    filtered = data.copy()
    # Filter by days
    days_to_show = date_slider.value
    filtered = filtered.head(days_to_show)
    # Filter by category
    if category_select.value != "All":
        filtered = filtered[filtered['category'] == category_select.value]
    # Update filtered source
    filtered_source.data = ColumnDataSource.from_df(filtered)
    # Update category breakdown
    if category_select.value == "All":
        category_data = data.groupby('category')['sales'].sum().reset_index()
    else:
        category_data = filtered.groupby('category')['sales'].sum().reset_index()
    category_source.data = ColumnDataSource.from_df(category_data)
    # Update statistics
    update_stats()
def refresh_data():
    """Simulate refreshing data from a database"""
    # Generate new random data
    new_sales = np.random.normal(1000, 200, 365) + np.sin(np.arange(365) * 2 * np.pi / 365) * 300
    new_visitors = np.random.poisson(500, 365)
    new_conversion = np.random.uniform(0.02, 0.08, 365)
    data['sales'] = new_sales
    data['visitors'] = new_visitors
    data['conversion'] = new_conversion
    data['revenue'] = data['sales'] * data['conversion'] * 100
    source.data = ColumnDataSource.from_df(data)
    filter_data()
# Attach callbacks
date_slider.on_change('value', lambda attr, old, new: filter_data())
category_select.on_change('value', lambda attr, old, new: filter_data())
refresh_button.on_click(refresh_data)
# Create data table
columns = [
TableColumn(field="date", title="Date", formatter=DateFormatter()),
TableColumn(field="sales", title="Sales"),
TableColumn(field="visitors", title="Visitors"),
TableColumn(field="category", title="Category"),
]
data_table = DataTable(source=filtered_source, columns=columns,
width=900, height=200)
# Layout
charts_row = row(category_plot, scatter_plot)
widgets_row = row(date_slider, category_select, refresh_button)
layout = column(
    Div(text="<h1>Sales Dashboard</h1>"),
    stats_div,
    widgets_row,
    time_plot,
    charts_row,
    Div(text="<h2>Data Table</h2>"),
    data_table
)
# Initialize
filter_data()
# Add to document
curdoc().add_root(layout)
curdoc().title = "Sales Dashboard"
# Run with: bokeh serve --show complete_app.py
Performance Optimization
- Use WebGL: Add output_backend="webgl" for large datasets
- Data Decimation: Reduce data points for initial view, load full data on zoom
- Server-side Filtering: Process data on server before sending to client
- Streaming: Use source.stream() for real-time data instead of full updates
- Patch Updates: Use source.patch() to update specific data points
- CDN Resources: Use CDN mode for Bokeh resources in production
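Data decimation can be done in many ways. The sketch below shows one possible strategy, assuming a line-style series where peaks matter: it splits the series into buckets and keeps the minimum and maximum of each, so short spikes survive the downsampling. `decimate_minmax` is a hypothetical helper written in plain Python, not a Bokeh API.

```python
def decimate_minmax(x, y, n_buckets):
    """Keep the min and max y of each bucket so peaks survive downsampling."""
    if n_buckets < 1:
        raise ValueError("n_buckets must be at least 1")
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) <= 2 * n_buckets:
        return list(x), list(y)  # already small enough, nothing to drop
    keep = []
    for b in range(n_buckets):
        # Integer arithmetic gives exact, non-overlapping bucket boundaries
        start = b * len(x) // n_buckets
        end = (b + 1) * len(x) // n_buckets
        indices = range(start, end)
        lo = min(indices, key=lambda i: y[i])
        hi = max(indices, key=lambda i: y[i])
        keep.extend(sorted({lo, hi}))  # preserve x order, drop duplicates
    return [x[i] for i in keep], [y[i] for i in keep]


xs = list(range(1000))
ys = [0.0] * 1000
ys[437] = 9.0  # a single spike that naive striding could miss
dx, dy = decimate_minmax(xs, ys, n_buckets=50)
```

The reduced `dx` and `dy` lists would be sent to the browser for the initial view, with the full-resolution data loaded only for the zoomed-in range.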
Deployment Options
Bokeh vs Other Libraries
| Feature | Bokeh | Plotly | Matplotlib |
|---|---|---|---|
| Interactivity | ✅ Excellent | ✅ Excellent | ❌ Limited |
| Web Integration | ✅ Native | ✅ Good | ⚠️ Basic |
| Server Apps | ✅ Built-in | ✅ Via Dash | ❌ No |
| Large Data | ✅ Excellent | ⚠️ Good | ⚠️ Moderate |
| Customization | ✅ High | ⚠️ Moderate | ✅ High |
Practice Exercises
Exercise 1: Interactive Dashboard
Create a dashboard with:
- Time series plot with range selector
- Category filter dropdown
- Statistics panel that updates
- Data table showing filtered results
Exercise 2: Real-time Monitor
Build a real-time monitoring app:
- Stream data every second
- Show last 100 data points
- Color code based on thresholds
- Add alert notifications
Exercise 3: Geographic Visualization
Create an interactive map with:
- Map tiles background
- Clickable markers
- Hover information
- Filter by region/category
Key Takeaways
- 🌐 Bokeh creates web-ready visualizations without JavaScript knowledge
- ⚡ Excellent performance with large datasets through WebGL
- 🔧 Bokeh Server enables full web applications with Python
- 🔄 Real-time streaming and updates are built-in features
- 🤝 Integrates well with Flask, Django, and Jupyter
- 🎨 Highly customizable with full control over appearance