Hey, folks! Today we will look at a very interesting Python module, **Seaborn**, and understand its contribution to **data visualization**.

Table of Contents

- 1 Need of Seaborn module
- 2 Visualizing Data with Python Seaborn
- 3 Statistical Data Visualization with Seaborn
- 4 Categorical Data visualization with Seaborn and Pandas
- 5 Estimation of categorical data using Seaborn
- 6 Univariate distribution using Seaborn Distplot
- 7 Bivariate distribution using Seaborn Kdeplot
- 8 Setting different backgrounds using Seaborn
- 9 Conclusion
- 10 References

## Need of Seaborn module

**Data visualization** is the representation of data values in a pictorial format. Visualizing data helps us understand it better and draw sound conclusions from it.

The **Python Matplotlib library** provides a base for many of the data visualization modules available in Python. The Python Seaborn module is built on top of Matplotlib and provides higher-level plotting functions with statistical features and polished default styling built in.

With Seaborn, data can be presented with different visualizations and different features can be added to it to enhance the pictorial representation.

## Visualizing Data with Python Seaborn

In order to get started with data visualization with Seaborn, the following modules need to be installed in the Python environment, since Seaborn and the examples below rely on them:

- NumPy
- Pandas
- Matplotlib

Further, we need to install and load the Python Seaborn module into the environment.

```
# Run this in a terminal (not inside the Python interpreter):
#   pip install seaborn
# Then, in Python:
import seaborn
```

Now that we have installed and imported the Seaborn module in our working environment, let us get started with data visualization in Seaborn.

## Statistical Data Visualization with Seaborn

Python Seaborn module helps us visualize and depict the data in statistical terms i.e. understanding of the relationship between data values with the help of the following plots:

- **Line Plot**
- **Scatter Plot**

Let us understand each of them in detail in the upcoming sections.

### Seaborn Line Plot

Seaborn Line Plot depicts the relationship between data values across a set of data points. It shows how one data variable depends on another.

The `seaborn.lineplot()` function draws a line through the data points to visualize the dependence of one data variable on another.

**Syntax:**

```
seaborn.lineplot(x=x, y=y)
```

**Example 1:**

```
import seaborn as sn
import matplotlib.pyplot as plt
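import pandas

# Load the mtcars dataset; adjust the path to wherever your copy is stored
data = pandas.read_csv("C:/mtcars.csv")
# Seaborn 0.12 and later require x and y to be passed as keyword arguments
res = sn.lineplot(x=data['hp'], y=data['cyl'])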
plt.show()
```

**Output:**

**Example 2:**

```
import seaborn as sn
import matplotlib.pyplot as plt
import pandas

data = pandas.read_csv("C:/mtcars.csv")
# hue and style both group the lines by the 'am' column
res = sn.lineplot(x=data['hp'], y=data['cyl'], hue=data['am'], style=data['am'])
plt.show()
```

In the above example, the parameters `hue` and `style` split the data by the values of `am`, so each group is drawn with its own color and line style.

**Output:**

### Seaborn Scatter Plot

Seaborn Scatter plot also helps depict the relationship between data values, plotted against a continuous or categorical variable.

Scatter plots are extensively used to detect outliers during data visualization and data cleansing. Outliers are data values that lie far from the normal range of the rest of the data. A scatter plot shows every data point individually, which makes the outliers stand out.

**Syntax:**

```
seaborn.scatterplot()
```

The `seaborn.scatterplot()` function plots each observation as a point, so clusters of points reveal the relationship between the data variables. When visualizing a data model, we place the dependent (response) variable on the y-axis and the independent variable on the x-axis.
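As a minimal sketch of that axis convention, the snippet below treats `mpg` as the response and `hp` as the independent variable. The DataFrame is a hand-typed stand-in for `mtcars.csv` (an assumption made so the example is self-contained), and which column counts as the response is an illustrative choice.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sn

# Illustrative stand-in for mtcars.csv: same column names, hand-typed rows.
cars = pd.DataFrame({
    "hp":  [62, 93, 95, 105, 110, 110, 175, 245],
    "mpg": [24.4, 22.8, 22.8, 18.1, 21.0, 21.4, 18.7, 14.3],
})

# Independent variable on the x-axis, response variable on the y-axis.
ax = sn.scatterplot(data=cars, x="hp", y="mpg")
ax.set_title("mpg (response) against hp (independent)")
plt.show()
```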

**Example 1:**

```
import seaborn as sn
import matplotlib.pyplot as plt
import pandas

data = pandas.read_csv("C:/mtcars.csv")
# Seaborn 0.12 and later require x and y to be passed as keyword arguments
res = sn.scatterplot(x=data['hp'], y=data['cyl'])
plt.show()
```

**Output:**

**Example 2:**

```
import seaborn as sn
import matplotlib.pyplot as plt
import pandas

data = pandas.read_csv("C:/mtcars.csv")
# hue sets the point color and style sets the marker, both by the 'am' column
res = sn.scatterplot(x=data['hp'], y=data['cyl'], hue=data['am'], style=data['am'])
plt.show()
```

With the parameters `hue` and `style`, we can visualize additional data variables using different colors and marker styles.

**Output:**

## Categorical Data visualization with Seaborn and Pandas

Before getting started with the categorical data distribution, it is necessary for us to understand certain terms related to data analysis and visualization.

- **Continuous variable**: It is a data variable that contains continuous, numeric values. For example, age is a continuous variable whose value can lie between 1 and 100.
- **Categorical variable**: It is a data variable containing discrete values, i.e. groups or categories. For example, gender can be categorized into three groups: 'Male', 'Female' and 'Others'.

Having understood the basic terminologies, let us dive into the visualization of categorical data variables.

### Box Plot

Seaborn Boxplot is used to visualize the distribution of a numeric data variable, optionally split by category, and is extensively used to detect **outliers** in the data cleansing process.

The `seaborn.boxplot()` method is used to create a boxplot for a particular data variable. The box spans the interquartile range, from the first quartile to the third quartile, with a line marking the median.

**Syntax:**

```
seaborn.boxplot()
```

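The two lines (whiskers) represent the lower and the upper range. By default, they extend to the most extreme data points that lie within 1.5 times the interquartile range from the box. Any data point that lies below the lower range or above the upper range is considered an outlier.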
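The outlier rule can be checked by hand with Pandas. The sketch below assumes the commonly used 1.5 × IQR rule, which matches Seaborn's default `whis=1.5`. The values are hypothetical, with one deliberately extreme entry.

```python
import pandas as pd

# Hypothetical mpg-like values; 33.9 is deliberately far from the rest.
mpg = pd.Series([14.3, 18.1, 18.7, 21.0, 21.4, 22.8, 22.8, 24.4, 33.9])

# Quartiles and interquartile range (the extent of the box).
q1 = mpg.quantile(0.25)
q3 = mpg.quantile(0.75)
iqr = q3 - q1

# Points beyond these fences would be drawn as individual outlier markers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = mpg[(mpg < lower_fence) | (mpg > upper_fence)]

print(outliers.tolist())
```

For these values only `33.9` falls outside the fences, so it is the single point a boxplot would flag.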

**Example:**

```
import seaborn as sn
import matplotlib.pyplot as plt
import pandas

data = pandas.read_csv("C:/mtcars.csv")
# Pass the column as x to draw a single horizontal boxplot
res = sn.boxplot(x=data['mpg'])
plt.show()
```

**Output:**

In the above boxplot, the data point lying above the upper range is drawn as a separate point and is considered an outlier in the dataset.

### Boxen Plot

Seaborn Boxenplot resembles the boxplot but differs slightly in how the plot is presented.

The `seaborn.boxenplot()` function draws additional nested boxes for a larger number of quantiles, giving a more detailed representation of the data values, particularly in the tails of the distribution.

**Syntax:**

```
seaborn.boxenplot()
```

**Example:**

```
import seaborn as sn
import matplotlib.pyplot as plt
import pandas

data = pandas.read_csv("C:/mtcars.csv")
# Pass the column as x to draw a single horizontal boxen plot
res = sn.boxenplot(x=data['hp'])
plt.show()
```

**Output:**

### Violin Plot

**Seaborn Violin Plot** is used to represent the underlying distribution of a data variable. The width of the violin at each value reflects the estimated density of the data around that value.

**Syntax:**

```
seaborn.violinplot()
```

**Example:**

```
import seaborn as sn
import matplotlib.pyplot as plt
import pandas

data = pandas.read_csv("C:/mtcars.csv")
# Pass the column as x to draw a single horizontal violin
res = sn.violinplot(x=data['hp'])
plt.show()
```

**Output:**

### SwarmPlot

Seaborn Swarmplot gives a better picture of how the individual observations are distributed across the categories of a categorical data variable.

The `seaborn.swarmplot()` function creates a **swarm of data points**: it draws one point per observation and adjusts the points along the categorical axis so that they do not overlap.

**Syntax:**

```
seaborn.swarmplot()
```

**Example:**

```
import seaborn as sn
import matplotlib.pyplot as plt
import pandas

data = pandas.read_csv("C:/mtcars.csv")
# 'am' is the categorical axis; one point is drawn per row of the dataset
res = sn.swarmplot(x=data['am'], y=data['cyl'])
plt.show()
```

**Output:**

## Estimation of categorical data using Seaborn

In the field of data analysis and visualization, we often require plots that help us estimate the frequency or count of values in surveys, research data, etc. The following plots serve this purpose:

- **Barplot**
- **Pointplot**
- **Countplot**

### 1. Barplot

Seaborn Barplot represents an estimate of **central tendency** (the mean, by default) of a numeric variable for each category as the height of a bar, with error bars indicating the uncertainty around that estimate.
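To make the aggregation concrete, the sketch below computes the per-category means with Pandas and then draws the same data with `barplot`, whose default estimator is the mean. The DataFrame is a hand-typed stand-in with `mtcars`-style column names (an assumption, so the snippet runs without the CSV file).

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sn

# Illustrative stand-in data with mtcars-style column names.
cars = pd.DataFrame({
    "cyl":  [4, 4, 4, 6, 6, 6, 8, 8],
    "carb": [1, 2, 2, 4, 1, 1, 2, 4],
})

# The value each bar represents: the mean of 'carb' within each 'cyl' group.
means = cars.groupby("cyl")["carb"].mean()
print(means)

# barplot performs the same aggregation internally and adds error bars.
ax = sn.barplot(data=cars, x="cyl", y="carb")
plt.show()
```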

**Syntax:**

```
seaborn.barplot()
```

**Example:**

```
import seaborn as sn
import matplotlib.pyplot as plt
import pandas

data = pandas.read_csv("C:/mtcars.csv")
# One bar per 'cyl' value; the bar height is the mean of 'carb' in that group
res = sn.barplot(x=data['cyl'], y=data['carb'])
plt.show()
```

**Output:**

### 2. Pointplot

Seaborn Pointplot is a combination of the statistical Seaborn Line and Scatter Plots. The `seaborn.pointplot()` function represents the relationship between the data variables in the form of scatter points, each marking an estimate for one category, and lines joining them.

**Syntax:**

```
seaborn.pointplot()
```

**Example:**

```
import seaborn as sn
import matplotlib.pyplot as plt
import pandas

data = pandas.read_csv("C:/mtcars.csv")
# One point per 'carb' value, placed at the mean of 'cyl' in that group
res = sn.pointplot(x=data['carb'], y=data['cyl'])
plt.show()
```

**Output:**

### 3. Countplot

Seaborn Countplot represents the count or the frequency of the data variable passed to it. Thus it can be considered as a Univariate Data distribution plot.
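The counts behind such a plot can be reproduced with `value_counts()` in Pandas. The sketch below uses a small hand-typed `carb` column as a stand-in for the real dataset (an assumption made to keep the example self-contained).

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sn

# Illustrative stand-in for the 'carb' column of mtcars.csv.
cars = pd.DataFrame({"carb": [1, 2, 2, 4, 1, 1, 2, 4]})

# Number of rows per distinct value: the heights countplot will draw.
counts = cars["carb"].value_counts().sort_index()
print(counts)

# countplot does the counting itself; only the column is needed.
ax = sn.countplot(data=cars, x="carb")
plt.show()
```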

**Syntax:**

```
seaborn.countplot()
```

**Example:**

```
import seaborn as sn
import matplotlib.pyplot as plt
import pandas

data = pandas.read_csv("C:/mtcars.csv")
# One bar per distinct 'carb' value, showing how many rows have that value
res = sn.countplot(x=data['carb'])
plt.show()
```

**Output:**

## Univariate distribution using Seaborn Distplot

The Seaborn Distplot is extensively used for univariate data distribution and visualization, i.e. visualizing the data values of a single data variable.

The `seaborn.distplot()` function depicts the data distribution of a continuous variable. It is drawn as a histogram overlaid with a kernel density estimate (KDE) curve. Note that `distplot()` has been deprecated since Seaborn 0.11 in favor of `histplot()` and `displot()`.

**Syntax:**

```
seaborn.distplot()
```

**Example:**

```
import seaborn as sn
import matplotlib.pyplot as plt
import pandas

data = pandas.read_csv("C:/mtcars.csv")
# distplot is deprecated since Seaborn 0.11 and emits a warning on newer releases
res = sn.distplot(data['mpg'])
plt.show()
```

**Output:**

## Bivariate distribution using Seaborn Kdeplot

Seaborn Kdeplot draws a kernel density estimate of the data. Given two continuous variables, it depicts their joint probability distribution as a set of contours.

**Syntax:**

```
seaborn.kdeplot()
```

**Example:**

```
import seaborn as sn
import matplotlib.pyplot as plt
import pandas

data = pandas.read_csv("C:/mtcars.csv")
# Passing both x and y draws the joint density of the two columns as contours
res = sn.kdeplot(x=data['mpg'], y=data['qsec'])
plt.show()
```

**Output:**

## Setting different backgrounds using Seaborn

The `seaborn.set()` function can be used to set different backgrounds for the plots, such as '**dark**', '**whitegrid**', '**darkgrid**', etc.
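The sketch below shows two ways of applying a style, assuming Seaborn 0.11 or newer, where `set_theme()` is available as the preferred name for `set()`. The plotted numbers are arbitrary placeholders.

```python
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sn

# Apply a style globally; every figure created afterwards picks it up.
sn.set_theme(style="whitegrid")
grid_enabled = mpl.rcParams["axes.grid"]
print(grid_enabled)

# Apply a style to a single figure only, leaving the global setting alone.
with sn.axes_style("dark"):
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [2, 4, 3])  # placeholder data
plt.show()
```

The context-manager form is convenient when only one figure in a script should look different from the rest.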

**Syntax**:

```
seaborn.set(style='darkgrid')
```

**Example:**

```
import seaborn as sn
import matplotlib.pyplot as plt
import pandas

data = pandas.read_csv("C:/mtcars.csv")
# Set the style before plotting so that the figure picks it up
sn.set(style='darkgrid')
res = sn.lineplot(x=data['mpg'], y=data['qsec'])
plt.show()
```

**Output:**

## Conclusion

Thus, the Seaborn module helps in visualizing data with a range of plots, each suited to a different purpose of visualization.