Median line plotting in Histograms using Altair in Python


In this tutorial, we will learn how to make a histogram with a median line using the Altair library in Python. Altair is a declarative, interactive data visualization library for Python, built on Vega and Vega-Lite.


Implementing Altair Median Line Plotting

First, we will load the libraries that will help to make a histogram using Altair. 

import altair as alt
import numpy as np
import pandas as pd

Next, we generate the data for the histogram. We use NumPy to draw 5,000 random values from a normal distribution with mean 1500 and standard deviation 100, and store them in a pandas DataFrame.

DATA = pd.DataFrame({'Bar Heights': np.random.normal(1500, 100, 5000)})
print(DATA)

The dataset looks something like the image below.

Figure: Histogram dataset, normal distribution
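Because the values are random, the printed numbers change on every run. As a minimal sketch, assuming a seeded NumPy generator is acceptable for this tutorial (the seed 42 and the lowercase variable names are choices made for this example, not part of the original code), the data can be made reproducible and its median checked with pandas before plotting:

```python
import numpy as np
import pandas as pd

# A seeded generator makes the sample reproducible (the seed 42 is arbitrary).
rng = np.random.default_rng(42)
data = pd.DataFrame({'Bar Heights': rng.normal(1500, 100, 5000)})

# For a symmetric distribution such as the normal, the median and the mean
# should land close to each other and close to the centre (1500 here).
median_height = data['Bar Heights'].median()
mean_height = data['Bar Heights'].mean()
print(f"median={median_height:.1f}, mean={mean_height:.1f}")
```

Knowing the median in advance gives a quick sanity check for where the line should appear on the chart.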

Let us draw a simple histogram for the dataset using the code below. We use the mark_bar function, bin the 'Bar Heights' column on the x-axis, and plot the number of values in each bin on the y-axis. Passing axis=None hides the x-axis.

alt.Chart(DATA).mark_bar().encode(
    x=alt.X('Bar Heights:Q', bin=alt.BinParams(), axis=None),
    y='count()'
)

Figure: Basic histogram plot

Next, we add the median line. We use the mark_rule function to draw a vertical rule at the median of the data. We store the histogram and the line in two separate variables and then layer them together with the + operator.

histogram = alt.Chart(DATA).mark_bar().encode(
    x=alt.X('Bar Heights:Q', bin=alt.BinParams(), axis=None),
    y='count()'
)

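# Use the median aggregate (not mean) so the rule marks the median.
median_line = alt.Chart(DATA).mark_rule().encode(
    x=alt.X('median(Bar Heights):Q', title='Height'),
    size=alt.value(5)  # line thickness in pixels
)

# Layer the rule on top of the histogram.
histogram + median_line

Figure: Final histogram plot with median line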

We now have a histogram with a median line drawn using Altair in Python. Next, let us see how to customize it.

Customizing Histogram with Median Line

By default, Altair draws the histogram bars in blue and chooses the number of bins automatically. It also draws the line in black.

But we can easily customize the histogram and the line using the code below.

import altair as alt
import numpy as np
import pandas as pd

DATA = pd.DataFrame({'Bar Heights': np.random.normal(1500, 100, 5000)})
print(DATA)

# maxbins sets an upper limit on the number of bins.
histogram = alt.Chart(DATA).mark_bar().encode(
    x=alt.X('Bar Heights:Q', bin=alt.BinParams(maxbins=100), axis=None),
    y='count()'
)

# Red rule at the median (not the mean) of the column.
median_line = alt.Chart(DATA).mark_rule(color='red').encode(
    x=alt.X('median(Bar Heights):Q', title='Height'),
    size=alt.value(5)
)

histogram + median_line

Figure: Final customized histogram plot with median line

The figure above shows the histogram with a maximum of 100 bins and a red median line.

Conclusion

You should now be able to plot a histogram with a median line using the Altair library in Python. Hope you liked the tutorial!
