Bokeh Python Data Visualization

Bokeh is an interactive Python data visualization library that targets modern web browsers for presentation. It aims to provide quick, concise and elegant construction of novel graphics, with high-performance interactivity over very large or even streaming datasets.

Bokeh Python

Bokeh offers simple, flexible and powerful features and provides two interface levels:

  • bokeh.models: A low-level interface that gives application developers the most flexibility.
  • bokeh.plotting: A higher-level interface centered on composing visual glyphs.

Bokeh Dependencies

Before beginning with Bokeh, we need to have NumPy installed on our machine.

Install Bokeh

The easiest way to install Bokeh and its dependencies is with conda or pip.

To install using conda, open the terminal and run the following command:

conda install bokeh

To install using pip, open the terminal and run the following command:

pip install bokeh

Verifying Installation of Bokeh Module

We could verify the installation with a few commands, but we will instead write a very small program that produces Bokeh output to confirm that it is working properly.


from bokeh.plotting import figure, output_file, show

output_file("test.html")
plot = figure()
plot.line([1, 2, 3, 4, 5], [6, 7, 2, 4, 5], line_width=2)
show(plot)

This should create a file named test.html locally and open it in the browser. If it does not open automatically, open the file yourself. The result looks like this:
[Figure: Bokeh Python example]
Notice how we were able to create a graph using just a few lines of code.
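
For the command-style check mentioned earlier, a minimal sketch could look like the following. It relies only on the standard library and on Bokeh's `__version__` attribute; the helper name `is_installed` is made up for this illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("bokeh"):
    import bokeh

    # report which version the interpreter picked up
    print("Bokeh version:", bokeh.__version__)
else:
    print("Bokeh is not installed in this environment")
```

Checking with find_spec first means the script reports a missing installation instead of stopping with an ImportError.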

Bokeh Examples

Now that we have verified Bokeh installation, we can get started with its examples of graphs and plots.

Plotting a simple line graph

Plotting a simple line graph is quite similar to what we did for verification, but we are going to add a few details to make the plot easy to read. Let’s look at a code snippet:


from bokeh.plotting import figure, output_file, show

# prepare some data
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

# output to static HTML file
output_file("lines.html")

# create a new plot with a title and axis labels
p = figure(title="simple line example", x_axis_label='x', y_axis_label='y')

# add a line renderer with legend and line thickness
# (recent Bokeh versions use legend_label; the older legend= keyword was removed)
p.line(x, y, legend_label="Temp.", line_width=2)

# show the results
show(p)

Let’s see the output for this program:
[Figure: Bokeh simple line graph]
With figure() and its parameters, we gave the plot a title and labelled both axes, and the legend passed to line() names the plotted series. Together these make it much clearer what data the graph presents.

Multiple Plots

Now that we know how to create a simple plot, let’s try drawing multiple plots on the same figure. Here is a sample program:


from bokeh.plotting import figure, output_file, show

# prepare some data
x = [0.1, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y0 = [i**2 for i in x]
y1 = [10**i for i in x]
y2 = [10**(i**2) for i in x]

# output to static HTML file
output_file("log_lines.html")

# create a new plot
p = figure(
   tools="pan,box_zoom,reset,save",
   y_axis_type="log", y_range=[0.001, 10**11], title="log axis example",
   x_axis_label='sections', y_axis_label='particles'
)

# add some renderers
# (recent Bokeh versions use legend_label; the older legend= keyword was removed)
p.line(x, x, legend_label="y=x")
p.circle(x, x, legend_label="y=x", fill_color="white", size=8)
p.line(x, y0, legend_label="y=x^2", line_width=3)
p.line(x, y1, legend_label="y=10^x", line_color="red")
p.circle(x, y1, legend_label="y=10^x", fill_color="red", line_color="red", size=6)
p.line(x, y2, legend_label="y=10^x^2", line_color="orange", line_dash="4 4")

# show the results
show(p)

Let’s see the output for this program:
[Figure: Bokeh multiple line graphs]
These plots are customized further with legends and line colors, which makes it easier to differentiate between multiple lines on the same graph.
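
A log axis cannot display zero or negative values, and an explicit y_range only helps if it actually covers the data. As a rough sanity check, the sketch below recomputes the same series and compares their extremes with the bounds passed to figure(). The helper name `data_bounds` is hypothetical and not part of Bokeh.

```python
import math


def data_bounds(*series):
    """Return the smallest and largest value across all given series."""
    flat = [value for values in series for value in values]
    return min(flat), max(flat)


# same data as in the program above
x = [0.1, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y0 = [i**2 for i in x]
y1 = [10**i for i in x]
y2 = [10**(i**2) for i in x]

lowest, highest = data_bounds(x, y0, y1, y2)
y_range = [0.001, 10**11]  # the bounds used in the program above

# every value must be positive to appear on a log axis
print("all positive:", lowest > 0)
print("inside y_range:", y_range[0] <= lowest and highest <= y_range[1])
print("decades spanned by the data:", math.log10(highest) - math.log10(lowest))
```

Running a check like this before plotting is one way to notice values that a log axis would silently leave out.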

Vectorized Colors And Sizes

Varying colors and sizes is very important when we plot large datasets, because there is a lot to visualize and little space to show it in. Here is a sample program:


import numpy as np
from bokeh.plotting import figure, output_file, show

# prepare some data
N = 4000
x = np.random.random(size=N) * 100
y = np.random.random(size=N) * 100
radii = np.random.random(size=N) * 1.5
colors = [
    "#%02x%02x%02x" % (int(r), int(g), 150) for r, g in zip(50+2*x, 30+2*y)
]

# output to static HTML file (with CDN resources)
output_file("color_scatter.html", title="color_scatter.py example", mode="cdn")
TOOLS = "crosshair,pan,wheel_zoom,box_zoom,reset,box_select,lasso_select"

# create a new plot with the tools above, and explicit ranges
p = figure(tools=TOOLS, x_range=(0,100), y_range=(0,100))

# add a circle renderer with vectorized colors and sizes
p.circle(x, y, radius=radii, fill_color=colors, fill_alpha=0.6, line_color=None)

# show the results
show(p)

Let’s see the output for this program:
[Figure: Bokeh vectorized colors]
Vectorized colors and sizes are useful in scenarios such as:

  • Showing heatmap-related data
  • Showing how densely the values of some parameters are distributed
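
The colors list in the program above packs three channel values into a "#rrggbb" string for every point. To make that formatting step concrete, here is a small standalone sketch on three hand-picked points. The helper name `to_hex` and the sample coordinates are made up for this illustration.

```python
def to_hex(red, green, blue):
    """Format three 0-255 channel values as a '#rrggbb' string."""
    return "#%02x%02x%02x" % (int(red), int(green), int(blue))


# three sample points instead of the 4000 random ones
xs = [0.0, 50.0, 100.0]
ys = [0.0, 50.0, 100.0]

# same mapping as above: red grows with x, green grows with y, blue is fixed
colors = [to_hex(50 + 2 * x, 30 + 2 * y, 150) for x, y in zip(xs, ys)]
print(colors)  # ['#321e96', '#968296', '#fae696']
```

Because x and y stay between 0 and 100, red stays within 50 to 250 and green within 30 to 230, so every channel fits in the two hex digits that %02x produces.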

Linked panning and brushing

Linking various aspects of different plots is a very useful technique for data visualization. Here is a sample program showing how this can be achieved with Bokeh:


import numpy as np
from bokeh.layouts import gridplot
from bokeh.plotting import figure, output_file, show

# prepare some data
N = 100
x = np.linspace(0, 4*np.pi, N)
y0 = np.sin(x)
y1 = np.cos(x)
y2 = np.sin(x) + np.cos(x)

# output to static HTML file
output_file("linked_panning.html")

# create a new plot
s1 = figure(width=250, height=250, title=None)
s1.circle(x, y0, size=10, color="navy", alpha=0.5)

# NEW: create a new plot and share both ranges
s2 = figure(width=250, height=250, x_range=s1.x_range, y_range=s1.y_range, title=None)
s2.triangle(x, y1, size=10, color="firebrick", alpha=0.5)

# NEW: create a new plot and share only one range
s3 = figure(width=250, height=250, x_range=s1.x_range, title=None)
s3.square(x, y2, size=10, color="olive", alpha=0.5)

# NEW: put the subplots in a gridplot
p = gridplot([[s1, s2, s3]], toolbar_location=None)

# show the results
show(p)

Let’s see the output for this program:
[Figure: Bokeh linked panning and brushing]
These kinds of plots are especially helpful when we need to show how one parameter varies with another.

Datetime axes

Plotting with datetime is a very common task. Let’s make an attempt with a sample program:


import numpy as np

from bokeh.plotting import figure, output_file, show
# AAPL comes from Bokeh's optional sample data, which must be installed separately
from bokeh.sampledata.stocks import AAPL

# prepare some data
aapl = np.array(AAPL['adj_close'])
aapl_dates = np.array(AAPL['date'], dtype=np.datetime64)
window_size = 30
window = np.ones(window_size)/float(window_size)
aapl_avg = np.convolve(aapl, window, 'same')

# output to static HTML file
output_file("stocks.html", title="stocks.py example")

# create a new plot with a datetime axis type
p = figure(width=800, height=350, x_axis_type="datetime")

# add renderers
# (recent Bokeh versions use legend_label; the older legend= keyword was removed)
p.circle(aapl_dates, aapl, size=4, color='darkgrey', alpha=0.2, legend_label='close')
p.line(aapl_dates, aapl_avg, color='navy', legend_label='avg')

# NEW: customize by setting attributes
p.title.text = "AAPL One-Month Average"
p.legend.location = "top_left"
p.grid.grid_line_alpha = 0
p.xaxis.axis_label = 'Date'
p.yaxis.axis_label = 'Price'
p.ygrid.band_fill_color = "olive"
p.ygrid.band_fill_alpha = 0.1

# show the results
show(p)

Let’s see the output for this program:
[Figure: Bokeh datetime axes]
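
The averaged line in this program comes from np.convolve with a window of equal weights in 'same' mode. To make that step concrete, here is a pure-Python sketch of what the call computes for such a window. It assumes that the series is at least as long as the window and that values outside the series count as zero. The function name `moving_average_same` is made up for this illustration.

```python
def moving_average_same(values, window_size):
    """Zero-padded moving average with one output value per input value."""
    n = len(values)
    offset = (window_size - 1) // 2
    result = []
    for k in range(n):
        hi = k + offset            # last index covered by the window
        lo = hi - window_size + 1  # first index covered by the window
        # indices outside the series contribute nothing (zero padding)
        total = sum(values[i] for i in range(max(lo, 0), min(hi, n - 1) + 1))
        result.append(total / window_size)
    return result


print(moving_average_same([1, 2, 3, 4, 5], 3))  # [1.0, 2.0, 3.0, 4.0, 3.0]
```

The last value drops to 3.0 because the window runs past the end of the series and the missing value counts as zero. The same effect pulls the averaged stock line down near both ends of the date range.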

Conclusion

In this tutorial, we have seen that Bokeh makes it easy to visualize large datasets and to create different kinds of plots, from simple line graphs to linked and datetime-based charts. The results are attractive and easy to read and understand.
