Python Altair tutorial: Creating Interactive Visualizations


Python Altair is a data visualization library that lets you create interactive charts from your data.

To become a good data scientist, it is important to be able to build plots that present complex data in an easily understandable way.

Visualizations are an ideal way to tell the underlying story of your data.

They illustrate the relationships within the data and expose information that is not apparent to the human eye from raw numbers alone.

But you know what’s even better for data processing than visualizations? Visualizations that are interactive!

Sadly, building them can seem like a daunting task for a beginner.

To help with that task, both Python and R offer a wide range of tools.

In this tutorial, we will introduce you to Altair.

With Altair, you’ll be able to build meaningful, beautiful, and effective visualizations in just a few lines of code and very little time. So let’s start now!

What is Python Altair?

Altair is a Python library intended for statistical visualization. It is declarative by nature: you describe what to plot (which data columns map to which visual channels, such as the x and y axes) rather than how to draw it.

It is based on Vega and Vega-Lite, both of which are visualization grammars that let you describe a visualization’s visual appearance and interactive behavior in a JSON format.

As a data scientist, you can use Altair to spend your time understanding, analyzing, and visualizing your data rather than writing the plotting code.

Working with the Python Altair Library

Let’s now work with the Altair library. We’ll use a dataset from the vega_datasets package here. I’ve shared the link in the datasets section.

1. Installing the Altair module

To install the Python Altair library and the vega_datasets package, we can use the pip package manager:

pip install altair
pip install vega_datasets

I’m using Google Colab, where both are already installed, so we can import them directly:

import pandas as pd
import altair as alt
from vega_datasets import data as vega_data

2. Preparing the dataset

Today we’ll be using the flights_2k dataset from the vega_datasets package. I chose it because it is small and doesn’t take much time to load, unlike the flights_3m dataset.

3. Fetching data with Pandas

We can fetch the data with the Python Pandas library by passing the dataset’s url attribute to read_json, as on the first line below:

flights_data = pd.read_json(vega_data.flights_2k.url)
flights_data.head(10)

This gives us our data:

Flights Dataset

4. Plotting a dataset using Python Altair

Altair is designed around the Pandas DataFrame, which means you can prepare and manipulate the data for a chart the same way you would work with any Pandas DataFrame.

And while Altair works with data in a Pandas DataFrame format internally, there are several ways to pass data in, for example a DataFrame or a URL that points to a JSON or CSV file.

We use alt.Chart to create the plot:

alt.Chart(flights_data).mark_point().encode(
    alt.X('delay'),
    alt.Y('distance')
)

Altair Plot

5. Making plots interactive with Altair

Now we’ll take it to the next level. Let’s add the ability to interact with the plot, including:

  • zooming into the plot
  • panning across the plot
  • viewing information while hovering

Add the tooltip option to the encoding and then call the interactive() method on the chart:

alt.Chart(flights_data).mark_point().encode(
    alt.X('delay'),
    alt.Y('distance'),
    tooltip=[
        alt.Tooltip('delay'),
        alt.Tooltip('distance'),
    ]
).interactive()

This will give us:

Visualization 1
Visualization 2
Visualization 3

As you can see, we can zoom into any region of the plot to examine the data more closely and draw inferences.

Complete implementation of an interactive plot in Python

And that’s all. I’ve made a few more plots in my Colab notebook using the code below, so try them out:

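import pandas as pd
import altair as alt
from vega_datasets import data as vega_data

# Load the flights_2k dataset into a DataFrame and preview it
flights_data = pd.read_json(vega_data.flights_2k.url)
flights_data.head(10)

# Scatter plot with tooltips, plus pan and zoom
alt.Chart(flights_data).mark_point().encode(
    alt.X('delay'),
    alt.Y('distance'),
    tooltip=[
        alt.Tooltip('delay'),
        alt.Tooltip('distance'),
    ]
).interactive()

# Delay per origin airport, with point size showing distance
alt.Chart(flights_data).mark_point(filled=True).encode(
    alt.X('origin'),
    alt.Y('delay'),
    alt.Size('distance')
)

# Median of the numeric columns per origin airport
# (numeric_only=True skips non-numeric columns such as destination)
median_delay = flights_data.groupby('origin').median(numeric_only=True)

# Origin versus destination, drawn as semi-transparent red points sized by distance
alt.Chart(flights_data).mark_point(filled=True).encode(
    alt.X('origin'),
    alt.Y('destination'),
    alt.Size('distance')
).configure_mark(
    opacity=0.2,
    color='red'
)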

Ending Note

If you enjoyed this article and want to read more, keep following the site! We have a lot of interesting articles coming up soon. To stay updated, don’t forget to follow us on Twitter and sign up for the newsletter!
