Regression Line on Scatter Plot in Python Altair


In this tutorial, we will take a real-world dataset, draw scatter plots for pairs of its variables, and then add a regression line to each plot.

A scatter plot displays the relationship between two variables in a dataset. Adding a regression line to a scatter plot makes the direction and strength of the relationship between two numeric variables easier to see.

Altair is a Python library built on the Vega and Vega-Lite grammars. Its declarative style lets you spend more time on analyzing and studying the data and less on the mechanics of drawing it.

We only need two imports: `Altair` for the charts and `vega_datasets` for the sample data, which is returned as a `Pandas` DataFrame.


Implementing Regression Line on Scatter Plot using Python Altair

We will start by importing Altair, which draws the plots, and vega_datasets, which supplies the dataset used in the later sections.

import altair as alt
from vega_datasets import data

In this tutorial, we will use the Seattle weather dataset, which ships with vega_datasets and can be loaded using the code below.

seattle_weather_data = data.seattle_weather()
print(seattle_weather_data.head())

Figure: Vega Datasets Weather Data

We will start by plotting simple scatter charts with the mark_point method. We will look at three pairs of variables, and later fit a regression line to each of them:

Minimum Temp and Maximum Temp

alt.Chart(seattle_weather_data).mark_point().encode(
    x='temp_max',
    y='temp_min'
)

Figure: Scatter Plot Min Vs Max Temp

Wind and Minimum Temperature

alt.Chart(seattle_weather_data).mark_point().encode(
    x='temp_min',
    y='wind'
)

Figure: Scatter Plot Min Temp Vs Wind

Wind and Maximum Temperature

alt.Chart(seattle_weather_data).mark_point().encode(
    x='temp_max',
    y='wind'
)

Figure: Scatter Plot Max Temp Vs Wind

Plotting Regression Line using Altair

The final step is to plot a regression line on each of the scatter plots above. The transform_regression method computes the line from two fields, and the result can be added to the scatter plot as another layer with the + operator.
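To make it concrete what the regression line represents, here is a minimal sketch of an ordinary least-squares straight-line fit in plain Python, which is the kind of fit transform_regression performs with its default linear method. The helper name `fit_line` and the sample numbers are made up for illustration and are not taken from the Seattle dataset.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator) and sum of squared x deviations (denominator)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sample values (illustration only, not real weather data)
temp_max = [10.0, 12.0, 14.0, 16.0, 18.0]
temp_min = [3.0, 5.0, 6.0, 9.0, 10.0]

slope, intercept = fit_line(temp_max, temp_min)
print(f"temp_min is approximately {slope:.2f} * temp_max + {intercept:.2f}")
```

The red line in the charts that follow is this kind of line, evaluated across the range of the x-axis field.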

Minimum Temp and Maximum Temp

# Scatter plot of minimum vs. maximum temperature
scatter = alt.Chart(seattle_weather_data).mark_point().encode(
    x='temp_max',
    y='temp_min'
)
# Reuse the same chart, fit temp_min on temp_max, and draw the fit as a red line layer
scatter + scatter.transform_regression('temp_max', 'temp_min').mark_line(color='red')

Figure: Regression Line Min Vs Max Temp

Wind and Minimum Temperature

# Scatter plot of wind vs. minimum temperature
scatter = alt.Chart(seattle_weather_data).mark_point().encode(
    x='temp_min',
    y='wind'
)
# Reuse the same chart, fit wind on temp_min, and draw the fit as a red line layer
scatter + scatter.transform_regression('temp_min', 'wind').mark_line(color='red')

Figure: Regression Line Min Temp Vs Wind

Wind and Maximum Temperature

# Scatter plot of wind vs. maximum temperature
scatter = alt.Chart(seattle_weather_data).mark_point().encode(
    x='temp_max',
    y='wind'
)
# Reuse the same chart, fit wind on temp_max, and draw the fit as a red line layer
scatter + scatter.transform_regression('temp_max', 'wind').mark_line(color='red')

Figure: Regression Line Max Temp Vs Wind

Conclusion

I hope you are now clear on how to plot regression lines on basic scatter plots with Altair in Python. Thank you for reading!
