Python Seaborn Tutorial
Seaborn is a library for making statistical graphics in Python. It is built on top of matplotlib and works closely with NumPy and pandas data structures. It also draws on statistical routines from SciPy.
Visualization plays an important role when we explore and try to understand data, and Seaborn aims to make it a central, easier part of that process. To put this in perspective: if matplotlib tries to make easy things easy and hard things possible, seaborn tries to make a well-defined set of hard things easy too. Seaborn is not an alternative to matplotlib, though; think of it as a complement to it.
Because seaborn is built on top of matplotlib, we will often invoke matplotlib functions directly for simple plots, as matplotlib already provides efficient routines for them.
Seaborn's high-level interface, combined with matplotlib's customizability and wide range of backends, makes it easy to generate publication-quality figures.
Why Seaborn?
Seaborn offers a range of features that make it useful and easier to work with than other frameworks. Some of these features are:
- A function to plot statistical time series data with flexible estimation and representation of uncertainty around the estimate
- Functions for visualizing univariate and bivariate distributions or for comparing them between subsets of data
- Functions that visualize matrices of data and use clustering algorithms to discover structure in those matrices
- High-level abstractions for structuring grids of plots that let you easily build complex visualizations
- Several built-in themes for styling matplotlib graphics
- Tools for choosing color palettes to make beautiful plots that reveal patterns in your data
- Tools that fit and visualize linear regression models for different kinds of independent and dependent variables
Getting Started with Seaborn
To get started with Seaborn, we will install it on our machines.
Install Seaborn
Seaborn assumes you have a working Python installation (2.7 or above) with NumPy (1.8.2 or above), SciPy (0.13.3 or above), matplotlib, and pandas installed on your machine. These minimum versions apply to older seaborn releases; recent releases require Python 3.
Once these Python packages are installed, we can proceed with the installation. To install with pip, run the following command in the terminal:
pip install seaborn
If you prefer conda, you can install the package with the following command instead:
conda install seaborn
Alternatively, you can use pip to install the development version directly from GitHub:
pip install git+https://github.com/mwaskom/seaborn.git
Using Seaborn
Once you are done with the installation, you can use seaborn easily in your Python code by importing it:
import seaborn
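The examples in the rest of this tutorial refer to the library through the conventional alias sns. As a quick sanity check after installation, the following sketch imports seaborn under that alias and prints the installed version; the exact version string depends on your environment.

```python
# Import seaborn under its conventional alias and confirm it is available.
import seaborn as sns

version = sns.__version__  # version string of the installed release
print("seaborn version:", version)
```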
Controlling figure aesthetics
When it comes to visualization, drawing attractive figures is important.
Matplotlib is highly customizable, but that can also make it complicated, as it is hard to know which settings to tweak to achieve a good-looking plot. Seaborn comes with a number of themes and a high-level interface for controlling the look of matplotlib figures. Let’s see it working:
# In a Jupyter notebook, uncomment the next line to display plots inline:
# %matplotlib inline
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
np.random.seed(sum(map(ord, "aesthetics")))
# Define a simple plot function to plot offset sine waves
def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 7):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)
sinplot()
This is what the plot looks like with matplotlib defaults:
If you want to switch to the seaborn defaults, simply call the set() function:
sns.set()
sinplot()
This is how the plot looks now:
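Besides looking at the plot, one way to see what set() changed is to inspect matplotlib's rcParams before and after the call. The sketch below assumes we start from matplotlib's built-in defaults (restored here with mpl.rcdefaults()) and looks at a single parameter, axes.grid, as an illustration.

```python
import matplotlib as mpl
import seaborn as sns

mpl.rcdefaults()  # start from matplotlib's built-in defaults
grid_before = mpl.rcParams["axes.grid"]

sns.set()  # switch to the seaborn defaults (darkgrid style)
grid_after = mpl.rcParams["axes.grid"]

print("axes.grid before set():", grid_before)
print("axes.grid after set(): ", grid_after)
```

The grid is off under the matplotlib defaults and on after set(), which matches the darkgrid look of the second plot.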
Seaborn figure styles
Seaborn provides five preset themes: darkgrid, whitegrid, dark, white, and ticks. Each is suited to different applications and personal preferences.
The default is darkgrid. The whitegrid theme is similar but better suited to plots with heavy data elements. To switch to whitegrid:
sns.set_style("whitegrid")
data = np.random.normal(size=(20, 6)) + np.arange(6) / 2
sns.boxplot(data=data)
The output will be:
For many plots, the grid is less necessary. Remove it by switching to the dark style:
sns.set_style("dark")
sinplot()
The plot looks like:
Or try the white background:
sns.set_style("white")
sinplot()
This time, the background looks like:
Sometimes you might want to give a little extra structure to the plots, which is where ticks come in handy:
sns.set_style("ticks")
sinplot()
The plot looks like:
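To compare the five styles side by side, a minimal sketch along the following lines can help. It draws one sine wave per style in a single figure and records whether each style enables the grid. The names styles and grid_enabled are illustrative, and the Agg backend is selected only so the code runs without a display; drop that line in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

styles = ["darkgrid", "whitegrid", "dark", "white", "ticks"]
x = np.linspace(0, 14, 100)

fig = plt.figure(figsize=(15, 3))
for i, style in enumerate(styles, start=1):
    # A style takes effect when the axes is created, so create it inside the block.
    with sns.axes_style(style):
        ax = fig.add_subplot(1, len(styles), i)
    ax.plot(x, np.sin(x))
    ax.set_title(style)

# Record which styles switch the grid on.
grid_enabled = {style: sns.axes_style(style)["axes.grid"] for style in styles}
print(grid_enabled)
```

Only the two grid styles turn axes.grid on; dark, white, and ticks leave it off.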
Removing axes spines
You can call the despine() function to remove the top and right axes spines:
sinplot()
sns.despine()
The plot looks like:
Some plots benefit from offsetting the spines away from the data. When the ticks don’t cover the whole range of the axis, the trim parameter will limit the range of the surviving spines:
f, ax = plt.subplots()
sns.violinplot(data=data)
sns.despine(offset=10, trim=True)
The plot looks like:
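To check what the offset argument does to an axes, one option is to read the spine position back from matplotlib after the call. This is a small sketch on made-up data, not the violin plot above; the Agg backend is used only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import seaborn as sns

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0.2, 1.1, 3.9, 9.3])  # made-up illustrative data

# Move the remaining (left and bottom) spines 10 points away from the data
# and trim them to the range covered by the ticks.
sns.despine(ax=ax, offset=10, trim=True)

left_position = ax.spines["left"].get_position()
print(left_position)
```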
You can also control which spines are removed with additional arguments to despine:
sns.set_style("whitegrid")
sns.boxplot(data=data, palette="deep")
sns.despine(left=True)
The plot looks like:
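The effect of the despine() arguments can also be verified programmatically. The sketch below applies the default call to one axes and left=True to another, then lists the spines that are still visible. The helper visible_spines is hypothetical, written for this illustration, and the data are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style("white")
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.plot([0, 1, 2], [0, 1, 4])
ax2.plot([0, 1, 2], [0, 1, 4])

# Default call: only the top and right spines are removed.
sns.despine(ax=ax1)
# left=True removes the left spine as well.
sns.despine(ax=ax2, left=True)


def visible_spines(ax):
    """Return the sorted names of the spines that are still drawn."""
    return sorted(name for name, spine in ax.spines.items() if spine.get_visible())


default_result = visible_spines(ax1)
left_removed_result = visible_spines(ax2)
print(default_result)
print(left_removed_result)
```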
Temporarily setting figure style
The axes_style() function can be used in a with statement when you need to set the figure style temporarily:
with sns.axes_style("darkgrid"):
    plt.subplot(211)
    sinplot()
plt.subplot(212)
sinplot(-1)
The plot looks like:
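The temporary nature of the with block can be confirmed by reading matplotlib's rcParams inside and outside of it. This sketch assumes the global style is set to white beforehand, and uses axes.grid as the indicator.

```python
import matplotlib as mpl
import seaborn as sns

sns.set_style("white")  # global style: no grid

with sns.axes_style("darkgrid"):
    grid_inside = mpl.rcParams["axes.grid"]  # darkgrid is active here

grid_outside = mpl.rcParams["axes.grid"]  # the white style is back in effect

print("inside the with block: ", grid_inside)
print("outside the with block:", grid_outside)
```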
Overriding elements of the seaborn styles
A dictionary of parameters can be passed to the rc argument of axes_style() and set_style() in order to customize figures.
Note: Only the parameters that are part of the style definition can be overridden through this method. For other parameters, you should use set(), as it accepts all of them.
If you want to see which parameters are included, call the function without any arguments, and a dictionary of the current settings is returned:
sns.axes_style()
{'axes.axisbelow': True,
'axes.edgecolor': '.8',
'axes.facecolor': 'white',
'axes.grid': True,
'axes.labelcolor': '.15',
'axes.linewidth': 1.0,
'figure.facecolor': 'white',
'font.family': [u'sans-serif'],
'font.sans-serif': [u'Arial',
                    u'DejaVu Sans',
                    u'Liberation Sans',
                    u'Bitstream Vera Sans',
                    u'sans-serif'],
'grid.color': '.8',
'grid.linestyle': u'-',
'image.cmap': u'rocket',
'legend.frameon': False,
'legend.numpoints': 1,
'legend.scatterpoints': 1,
'lines.solid_capstyle': u'round',
'text.color': '.15',
'xtick.color': '.15',
'xtick.direction': u'out',
'xtick.major.size': 0.0,
'xtick.minor.size': 0.0,
'ytick.color': '.15',
'ytick.direction': u'out',
'ytick.major.size': 0.0,
'ytick.minor.size': 0.0}
You can then set different versions of these parameters:
sns.set_style("darkgrid", {"axes.facecolor": ".9"})
sinplot()
The plot looks like:
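The note about which parameters can be overridden is easy to check. In the sketch below, a style key passed through rc is honored by axes_style(), a non-style key (lines.linewidth, chosen here as an example) is dropped, and the same key passed to set() through its rc argument does take effect.

```python
import matplotlib as mpl
import seaborn as sns

# A key that belongs to the style definition is honored.
style = sns.axes_style("darkgrid", {"axes.facecolor": ".9"})
facecolor = style["axes.facecolor"]

# A key outside the style definition is silently dropped.
ignored = sns.axes_style("darkgrid", {"lines.linewidth": 5})
linewidth_in_style = "lines.linewidth" in ignored

# set() accepts any matplotlib rc parameter through its rc argument.
sns.set(rc={"lines.linewidth": 5})
linewidth_after_set = mpl.rcParams["lines.linewidth"]

print(facecolor, linewidth_in_style, linewidth_after_set)
```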
Scaling plot elements
Let’s try to manipulate the scale of the plot. First, reset the default parameters by calling set():
sns.set()
The four preset contexts, in order of relative size, are paper, notebook, talk, and poster. The notebook context is the default and was used in the plots above. To switch to the paper context:
sns.set_context("paper")
sinplot()
The plot looks like:
sns.set_context("talk")
sinplot()
The plot looks like:
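To see how the contexts differ without drawing anything, the parameters behind each one can be inspected with plotting_context(), a seaborn function not used elsewhere in this tutorial that returns the settings for a named context. The sketch below compares only the base font size; the exact values depend on the seaborn version, so none are shown here.

```python
import seaborn as sns

contexts = ["paper", "notebook", "talk", "poster"]

# Look up the base font size that each context would apply.
font_sizes = {name: sns.plotting_context(name)["font.size"] for name in contexts}

for name in contexts:
    print(name, font_sizes[name])
```

The sizes grow from paper to poster, which is why the same sinplot() looks progressively larger as the context changes.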
Conclusion
In this lesson, we have seen how Seaborn makes it easy to control the look of plots, with examples of switching figure styles, removing spines, and scaling plot elements through contexts.
Seaborn makes it easy to visualize data in an attractive manner, which in turn makes the data easier to read and understand.