Seaborn Scatter Plot – The Ultimate Guide


Hey, folks! In this part of the Data Visualization with Seaborn series, we will be focusing on Seaborn Scatter Plots.


What is a Scatter Plot?

A Scatter Plot represents the relationship between two continuous variables. Each observation in the data set is drawn as one point, positioned by its value on the x-axis and its value on the y-axis, so the plot shows how one variable changes as the other changes.

So, let us start plotting Scatter Plots using the Seaborn library.

We will be using the data set below throughout the article.

MTCARS Dataset

Getting started with Seaborn Scatter Plot

Before plotting, we need to install the Seaborn library using the command below:

pip install seaborn

After installing the library, we need to import it into the Python environment so that its plotting functions are available:

import seaborn
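A quick way to confirm that the installation worked is to import the library and print its version. This is only a minimal check and not part of the original walkthrough; the alias sns is the conventional one that the later examples also use, and installed_version is just an illustrative variable name.

```python
import seaborn as sns

# The version string shows which Seaborn release is installed.
installed_version = sns.__version__
print(installed_version)
```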

Creating a Scatter Plot

The seaborn.scatterplot() function draws a scatter plot of two variables.

Syntax:

seaborn.scatterplot(data=data, x=x, y=y)
  • x: The variable (typically a column name) to be plotted on the x-axis.
  • y: The variable to be plotted on the y-axis.
  • data: The data set, typically a pandas DataFrame, that holds the columns named by x and y.

Recent Seaborn releases accept x and y only as keyword arguments, so the keyword form shown here is the safe one to use.

Example 1:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Sample values for the two variables
Year = [1,3,5,2,12,5,65,12,4,76,45,23,98,67,32,12,90]
Profit = [80, 75.8, 74, 65, 99.5, 19, 33.6,23,45,12,86,34,567,21,80,34,54]

# Combine the two lists into a DataFrame with named columns
data_plot = pd.DataFrame({"Year": Year, "Profit": Profit})

# Plot Profit against Year and display the figure
sns.scatterplot(x="Year", y="Profit", data=data_plot)
plt.show()

In the above example, we have plotted ‘Profit’ against ‘Year’ as a scatter plot. The pyplot.show() function then displays the figure.

Output:

Scatter Plot Example 1
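Since scatterplot() returns the Matplotlib Axes it draws on, the plot can be labelled with ordinary Matplotlib calls. The sketch below assumes a shortened version of the Year/Profit data from Example 1; the title text is made up for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Shortened sample data (first five values of Example 1)
data_plot = pd.DataFrame({
    "Year": [1, 3, 5, 2, 12],
    "Profit": [80, 75.8, 74, 65, 99.5],
})

# Create a fresh figure and draw the scatter plot on its Axes
fig, ax = plt.subplots()
sns.scatterplot(data=data_plot, x="Year", y="Profit", ax=ax)

# The Axes object accepts the usual Matplotlib labelling calls
ax.set_title("Profit by year")
ax.set_xlabel("Year")
ax.set_ylabel("Profit")
```

Calling plt.show() afterwards displays the labelled figure, as in the example above.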

Example 2:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = pd.read_csv("C:/mtcars.csv")
# Set the style before plotting so that it applies to this figure
sns.set(style='darkgrid')
sns.scatterplot(x="drat", y="qsec", data=data)
plt.show()

In the above example, we have loaded the data set from a CSV file and plotted the relationship between two of its columns, ‘drat’ and ‘qsec’, by passing the column names and the DataFrame to the function.

Output:

Scatter Plot Example 2
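The path C:/mtcars.csv is specific to one machine, so the example only runs where that file exists. To try the same call without the file, a small stand-in DataFrame with the same column names can be built by hand. The sketch below does that; the values are made up for illustration and are NOT the real MTCARS values.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Stand-in rows with the same column names as the data set.
# These numbers are illustrative only, not the real MTCARS values.
cars = pd.DataFrame({
    "drat": [3.9, 3.1, 2.8, 4.1, 3.7, 3.2],
    "qsec": [16.5, 19.4, 20.2, 18.6, 17.0, 17.8],
    "am": [1, 0, 0, 1, 1, 0],
})

# Set the style first, then draw on a fresh figure
sns.set(style="darkgrid")
fig, ax = plt.subplots()
sns.scatterplot(data=cars, x="drat", y="qsec", ax=ax)
```

Calling plt.show() afterwards displays the figure.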

Grouping variables in Seaborn Scatter Plot

As seen above, a scatter plot depicts the relationship between two variables. We can go further and bring additional variables into the same plot, i.e. show how the relationship between the x and y variables differs across the values of a third variable.

In the upcoming sections, we will look at the following parameters, which let us depict such multivariable relationships:

  • hue
  • style
  • size

1. Using the parameter ‘hue’

The hue parameter groups the data points by the values of another variable and draws each group in a different marker color.

Syntax:

seaborn.scatterplot(data=data, x=x, y=y, hue=hue)
  • hue: The variable (column name) whose values determine the color of each point.

Example:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = pd.read_csv("C:/mtcars.csv")
# Set the style before plotting so that it applies to this figure
sns.set(style='whitegrid')
sns.scatterplot(x="drat", y="qsec", data=data, hue='am')
plt.show()

In the above example, we have plotted ‘qsec’ against ‘drat’ and colored each point by the value of the variable ‘am’. In this data set, ‘am’ is a categorical (binary) variable that takes only the values 0 and 1. Thus, using hue, the two groups are drawn in two different colors.

Output:

Scatter Plot-hue
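A legend that shows only 0 and 1 is hard to read. In MTCARS, ‘am’ encodes the transmission type (0 for automatic, 1 for manual), so one option is to map the codes to labels before plotting. The sketch below assumes a hand-made stand-in DataFrame with made-up values; the column name transmission is introduced only for this illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Stand-in rows; the numbers are illustrative, not the real MTCARS values.
cars = pd.DataFrame({
    "drat": [3.9, 3.1, 2.8, 4.1, 3.7, 3.2],
    "qsec": [16.5, 19.4, 20.2, 18.6, 17.0, 17.8],
    "am": [1, 0, 0, 1, 1, 0],
})

# Hypothetical helper column: readable labels for the 0/1 codes
cars["transmission"] = cars["am"].map({0: "automatic", 1: "manual"})

fig, ax = plt.subplots()
sns.scatterplot(data=cars, x="drat", y="qsec", hue="transmission", ax=ax)

# Collect the legend entries to confirm the labels are used
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
```

Calling plt.show() afterwards displays the figure with the labelled legend.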

2. Using the parameter ‘style’

The style parameter groups the data points by the values of another variable and draws each group with a different marker shape.

Syntax:

seaborn.scatterplot(data=data, x=x, y=y, style=style)
  • style: The variable (column name) whose values determine the marker shape of each point.

Example:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = pd.read_csv("C:/mtcars.csv")
# Set the style before plotting so that it applies to this figure
sns.set(style='whitegrid')
sns.scatterplot(x="drat", y="qsec", data=data, hue='am', style='am')
plt.show()

In the above example, the two groups of ‘am’ are drawn with different marker shapes, such as ‘o’ and ‘x’, in addition to different colors. This keeps the groups distinguishable even when the plot is viewed without color.

Output:

Scatter Plot - style
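hue and style do not have to point at the same column. Mapping them to two different variables puts four variables on one plot. The sketch below assumes a stand-in DataFrame with made-up values and uses ‘vs’, another 0/1 column found in the standard MTCARS data; whether your CSV contains it is an assumption.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Stand-in rows; the numbers are illustrative, not the real MTCARS values.
cars = pd.DataFrame({
    "drat": [3.9, 3.1, 2.8, 4.1, 3.7, 3.2],
    "qsec": [16.5, 19.4, 20.2, 18.6, 17.0, 17.8],
    "am": [1, 0, 0, 1, 1, 0],
    "vs": [0, 1, 1, 1, 0, 0],
})

fig, ax = plt.subplots()
# Color encodes 'am', marker shape encodes 'vs'
sns.scatterplot(data=cars, x="drat", y="qsec", hue="am", style="vs", ax=ax)
```

Calling plt.show() afterwards displays the figure; the legend then lists both variables.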

3. Using the parameter ‘size’

The size parameter groups the data points by the values of another variable and draws each group with markers of a different size.

Syntax:

seaborn.scatterplot(data=data, x=x, y=y, size=size)
  • size: The variable (column name) whose values determine the marker size of each point.

Example:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = pd.read_csv("C:/mtcars.csv")
# Set the style before plotting so that it applies to this figure
sns.set(style='whitegrid')
sns.scatterplot(x="drat", y="qsec", data=data, size='am', hue='am')
plt.show()

As seen in the output, the two groups of ‘am’ are drawn with markers of different sizes as well as different colors, which makes the groups easy to tell apart.

Output:

Scatter Plot - size
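The default marker sizes can be close together. scatterplot() also accepts a sizes argument that sets the range of marker sizes. The sketch below assumes a stand-in DataFrame with made-up values and picks the range (40, 200) arbitrarily for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Stand-in rows; the numbers are illustrative, not the real MTCARS values.
cars = pd.DataFrame({
    "drat": [3.9, 3.1, 2.8, 4.1, 3.7, 3.2],
    "qsec": [16.5, 19.4, 20.2, 18.6, 17.0, 17.8],
    "am": [1, 0, 0, 1, 1, 0],
})

fig, ax = plt.subplots()
# sizes=(smallest, largest) sets the marker size range used for 'am'
sns.scatterplot(data=cars, x="drat", y="qsec", size="am", sizes=(40, 200), ax=ax)

# Marker areas actually used for the plotted points
marker_sizes = ax.collections[0].get_sizes()
```

Calling plt.show() afterwards displays the figure with the wider size range.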

Seaborn Scatter Plot using “palette” parameter

We can control the colors of the plot using a Seaborn palette. The palette parameter sets the colors that the values of the hue variable are mapped to.

It accepts, among other things, the name of a Seaborn palette or of a Matplotlib colormap, such as ‘Spectral’ and ‘hot’ used in the examples below.

Example 1:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = pd.read_csv("C:/mtcars.csv")

# Set the style before plotting so that it applies to this figure
sns.set(style='whitegrid')
sns.scatterplot(x="drat", y="qsec", data=data, size='am', hue='am', palette='Spectral')
plt.show()

In the above example, we have made use of the palette ‘Spectral‘ to visualize the data.

Output:

Scatter Plot-palette

Example 2:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = pd.read_csv("C:/mtcars.csv")

# Set the style before plotting so that it applies to this figure
sns.set(style='whitegrid')
sns.scatterplot(x="drat", y="qsec", data=data, size='am', hue='am', palette='hot')
plt.show()

In this example, we have used the palette ‘hot’ together with the size parameter, so the two groups differ both in color and in marker size.

Output:

Scatter Plot palette 1
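Besides a palette name, palette also accepts a dictionary that assigns a specific color to each value of the hue variable. The sketch below assumes a stand-in DataFrame with made-up values; the two colors are chosen arbitrarily for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Stand-in rows; the numbers are illustrative, not the real MTCARS values.
cars = pd.DataFrame({
    "drat": [3.9, 3.1, 2.8, 4.1, 3.7, 3.2],
    "qsec": [16.5, 19.4, 20.2, 18.6, 17.0, 17.8],
    "am": [1, 0, 0, 1, 1, 0],
})

# One explicit color per value of 'am'
am_colors = {0: "tab:blue", 1: "tab:orange"}

fig, ax = plt.subplots()
sns.scatterplot(data=cars, x="drat", y="qsec", hue="am", palette=am_colors, ax=ax)

# Distinct face colors actually used for the plotted points
used_colors = {tuple(color) for color in ax.collections[0].get_facecolors()}
```

Calling plt.show() afterwards displays the figure. A dictionary is useful when the same group should keep the same color across several plots.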

Visualizing the Scatter Plot using ‘marker’

Markers are the symbols used to draw the data points. Choosing the markers explicitly can make the groups in a plot easier to tell apart.

Syntax:

seaborn.scatterplot(data=data, x=x, y=y, style=style, markers=markers)
  • markers: The list of marker symbols to use, one for each value of the style variable.

Example:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = pd.read_csv("C:/mtcars.csv")

# Set the style before plotting so that it applies to this figure
sns.set(style='dark')
sns.scatterplot(x="drat", y="qsec", data=data, hue='am', style='am', markers=['*', 'o'], palette='hot')
plt.show()

Output:

Scatter Plot marker
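With a list, the markers are assigned to the values of the style variable in order. markers also accepts a dictionary, which makes the assignment explicit. The sketch below assumes a stand-in DataFrame with made-up values and reuses the ‘*’ and ‘o’ symbols from the example above.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Stand-in rows; the numbers are illustrative, not the real MTCARS values.
cars = pd.DataFrame({
    "drat": [3.9, 3.1, 2.8, 4.1, 3.7, 3.2],
    "qsec": [16.5, 19.4, 20.2, 18.6, 17.0, 17.8],
    "am": [1, 0, 0, 1, 1, 0],
})

# One explicit marker symbol per value of 'am'
am_markers = {0: "*", 1: "o"}

fig, ax = plt.subplots()
sns.scatterplot(
    data=cars, x="drat", y="qsec",
    hue="am", style="am", markers=am_markers, ax=ax,
)
```

Calling plt.show() afterwards displays the figure.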

Seaborn Scatter Plot at a Glance!

Thus, in this article, we have seen what a scatter plot is, i.e. a depiction of the relationship between two data variables. Moreover, we can make use of parameters such as ‘hue’, ‘palette’, ‘style’, ‘size’ and ‘markers’ to bring additional variables into the plot and make it easier to read.

Important Note: The Seaborn library and its functions are built on top of the Matplotlib library. Thus, I recommend going through the Python Matplotlib tutorial.


Conclusion

Thus, we have understood and implemented Seaborn Scatter Plots in Python.

I strongly recommend going through the Seaborn tutorial for a better understanding of the topic.

