In this tutorial, we’ll go over the steps to plot a histogram in R. **A histogram** is a graphical representation of how the values of a numeric variable are distributed. It is similar to a bar plot: each bar covers a range of values (a bin), and its height shows how many observations fall within that range.

R offers the standard function **hist()** to plot a histogram in base R. The ggplot2 package also offers the function **geom_histogram()** to plot a histogram.

## Advantages of Histograms

- A histogram shows the **distribution of the data**: the frequency of the values along with their range.
- It is an easier way to visualize **large data sets**.
- The histogram also shows the **skewness of the data**.
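
As a rough illustration of the first point, `hist()` can return the bins and counts it computes instead of drawing them. This sketch assumes the built-in `airquality` data set, and the variable name `h` is only illustrative.

```r
# Compute the histogram without drawing it
h <- hist(airquality$Temp, plot = FALSE)

# Bin edges (the ranges) and the number of observations in each bin
print(h$breaks)
print(h$counts)

# Every observation falls into exactly one bin
sum(h$counts) == length(airquality$Temp)
```

The `breaks` vector always has one more element than `counts`, because each bin needs a lower and an upper edge.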

## Types of Histogram plots in R

Based on the distribution of the data, a histogram exhibits many different shapes. In this section, we will try to understand the different types of histogram shapes and their meaning.

The major types of histogram distributions are:

- Normal distribution
- Right-skewed distribution
- Left-skewed distribution
- Bimodal distribution
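
To get a feel for these four shapes before looking at real data, the sketch below draws one simulated sample per shape. The distributions, sample size, and seed are assumptions chosen only for illustration.

```r
set.seed(42)  # arbitrary seed so the plots are reproducible
n <- 1000

# One simulated sample per shape (illustrative choices, not real data)
samples <- list(
  "Normal"       = rnorm(n, mean = 50, sd = 10),
  "Right skewed" = rexp(n, rate = 1),
  "Left skewed"  = -rexp(n, rate = 1),
  "Bimodal"      = c(rnorm(n / 2, mean = 0), rnorm(n / 2, mean = 6))
)

# Draw the four histograms in a 2 x 2 grid
old_par <- par(mfrow = c(2, 2))
for (name in names(samples)) {
  hist(samples[[name]], main = name, xlab = "Value", col = "grey")
}
par(old_par)  # restore the previous plot layout
```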

## Basic Histogram in R

In this section, we will plot a simple histogram using the ‘airquality’ data set.

Run the code below to plot this simple histogram.

```
# Load the built-in airquality data set
data("airquality")
# Create a simple histogram of the temperature column
hist(airquality$Temp, xlab = 'Temperature', ylab = 'Frequency', main = 'Simple histogram plot', col = 'yellow', border = 'black')
```
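
By default, `hist()` chooses the bins itself. If you want to control them, the `breaks` argument accepts a vector of bin edges. The sketch below assumes 5-degree bins from 55 to 100, a range picked here only because it covers every value in `airquality$Temp`.

```r
# Bin edges every 5 degrees; the range must cover all values in Temp
edges <- seq(55, 100, by = 5)

h <- hist(airquality$Temp,
          breaks = edges,
          xlab = "Temperature",
          ylab = "Frequency",
          main = "Histogram with 5-degree bins",
          col = "yellow", border = "black")

# The number of bins is one less than the number of edges
length(h$counts)
```

Narrower bins show more detail but make the plot noisier, so it is worth trying a few widths.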

## Normal distribution

A **normal distribution** in a histogram is the **ideal bell-shaped plot**, symmetric around its center with little or no skew.

This distribution shows that the majority of the values are concentrated in the center range.

The remaining data points form **a tail on both sides**, as you can see in the plot below.

Run the code below to create a histogram that shows a normal distribution.

```
# Load the iris data set, which ships with R
data("iris")
# Show the first five rows
head(iris, 5)
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# Create the histogram of sepal width
hist(iris$Sepal.Width, xlab = 'Sepal width', ylab = 'Frequency', main = 'Normal distribution of the data', col = 'brown')
```
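
One way to judge how close the data is to a bell shape is to overlay a normal curve on the histogram. The sketch below is one possible approach, not the only one: it switches the y-axis to density with `freq = FALSE` and draws a curve from the sample mean and standard deviation. The variable name `sw` is only illustrative.

```r
sw <- iris$Sepal.Width

# freq = FALSE puts density on the y-axis so the bars and curve share a scale
h <- hist(sw, freq = FALSE,
          xlab = "Sepal width",
          main = "Sepal width with normal curve",
          col = "brown")

# Normal curve built from the sample mean and standard deviation
curve(dnorm(x, mean = mean(sw), sd = sd(sw)), add = TRUE, lwd = 2)
```

On the density scale, the bar areas add up to 1, which is what makes the comparison with the curve meaningful.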

## Left or Negatively Skewed Histogram in R

In this section, we will plot a left or negatively skewed histogram.

**Negatively skewed:** If the histogram shows values concentrated on the right side, with the **tail on the left side** (the **negative value side**), it is called a **negatively or left-skewed distribution**.

Run the code below to create a negatively skewed histogram in RStudio.

**Dataset:** Google Play Store dataset from Kaggle

```
# Import the CSV file (assumed to be in the working directory)
df <- read.csv("googleplaystore.csv")
# Show the first rows of the data
head(df)
# Plot the histogram, which is negatively (left) skewed
hist(df$Rating, xlab = 'Ratings', ylab = 'Frequency', main = 'Negative or left-skewed distribution', col = 'brown')
```
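
If you do not have the CSV file at hand, a simulated sample can stand in for it. The sketch below is an assumption-heavy illustration: `ratings_sim` is hypothetical data drawn from a beta distribution and scaled to the 1 to 5 range, not real ratings. It also shows a quick numeric check, since in a left-skewed sample the mean usually sits below the median.

```r
set.seed(1)  # arbitrary seed for reproducibility

# Hypothetical stand-in for ratings: values between 1 and 5, piled up near 5
ratings_sim <- 1 + 4 * rbeta(2000, shape1 = 5, shape2 = 1)

hist(ratings_sim, xlab = "Simulated rating", ylab = "Frequency",
     main = "Simulated left-skewed data", col = "brown")

# The long left tail pulls the mean below the median
mean(ratings_sim) < median(ratings_sim)
```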

## Right or Positively Skewed Histogram in R

In this section, we will plot a right or positively skewed histogram.

**Positively skewed:** If the histogram shows values concentrated on the left side, with the **tail on the right side of the plot**, it is called a **positively or right-skewed distribution**.

Run the code below to plot a right or positively skewed histogram.

```
# Load the built-in attenu data set
data("attenu")
# Plot the right or positively skewed distribution of peak acceleration
hist(attenu$accel, xlab = 'Peak acceleration (g)', ylab = 'Frequency', main = 'Right or positively skewed distribution', col = 'brown')
```
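
The direction of the skew can also be checked with a number rather than by eye. The sketch below uses a moment-based skewness coefficient. `sample_skewness` is a hypothetical helper written for this example, not a base R function; a positive result points to a right tail and a negative one to a left tail.

```r
# Hypothetical helper: moment-based sample skewness
sample_skewness <- function(x) {
  x <- x[!is.na(x)]          # ignore missing values
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

accel_skew <- sample_skewness(attenu$accel)

# TRUE when the tail is on the right
accel_skew > 0
```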

## Bimodal Distribution of the data plotted using Histogram

In this section, we will plot a bimodal distribution of the data.

**Bimodal distribution:** A bimodal distribution is a type of histogram distribution in which you can see **two data peaks**.

In the graph below, the **x-axis, labeled ‘Quakes’, shows the depth values from the quakes data set**.

Run the code below to plot the bimodal distribution.

```
# Load the built-in quakes data set
data("quakes")
# Plot the bimodal distribution of earthquake depth
hist(quakes$depth, xlab = 'Quakes', ylab = 'Frequency', main = 'Bimodal distribution', col = 'brown')
```
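
Two peaks can be easier to see with a smooth density curve drawn over the bars. The sketch below is one way to do that, assuming the default bandwidth of `density()` is acceptable for this data.

```r
# freq = FALSE puts density on the y-axis so the curve matches the bars
h <- hist(quakes$depth, freq = FALSE,
          xlab = "Depth (km)",
          main = "Quake depth with density curve",
          col = "brown")

# Kernel density estimate with the default bandwidth
lines(density(quakes$depth), lwd = 2)
```

A different bandwidth (the `bw` argument of `density()`) can make the peaks look sharper or flatter, so treat the curve as a visual aid rather than proof.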

## Plotting a Histogram using ggplot2 in R

**ggplot2 is one of the most widely used visualization packages in R. It offers themes and functions for creating visually appealing graphs**.

In this section, we will plot a histogram of the values in the **‘diamonds’** data set, which ships with the ggplot2 package.

Run the code below to plot the histogram using ggplot2.

```
# Install the required package (only needed once)
install.packages('ggplot2')
# Load the library; the diamonds data set and theme_classic() both come from ggplot2
library(ggplot2)
# Show the first rows of the data
head(diamonds)
# Plot the histogram (ggplot2 uses 30 bins by default)
ggplot(diamonds, aes(carat)) + geom_histogram()
# Change the bin width
ggplot(diamonds, aes(carat)) + geom_histogram(binwidth = 0.01)
# Add a fill by cut, the x and y labels, and the main title
ggplot(diamonds, aes(carat, fill = cut)) + geom_histogram() + labs(x = 'Carats', y = 'Frequency of carats') + ggtitle("Distribution of diamond carat by cut")
# Change the theme for a cleaner-looking graph
ggplot(diamonds, aes(carat, fill = cut)) + geom_histogram() + labs(x = 'Carats', y = 'Frequency of carats') + ggtitle("Distribution of diamond carat by cut") + theme_classic()
```
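
Stacked fills can make it hard to compare the groups. As an alternative sketch, `facet_wrap()` draws one histogram per cut in its own panel. The bin width of 0.1 and the colors are assumptions picked for readability, not recommended values.

```r
library(ggplot2)

# One panel per cut, all sharing the same bin width
p <- ggplot(diamonds, aes(carat)) +
  geom_histogram(binwidth = 0.1, fill = "steelblue", colour = "white") +
  facet_wrap(~ cut) +
  labs(x = "Carats", y = "Frequency",
       title = "Carat distribution for each cut")

print(p)
```

Storing the plot in a variable such as `p` lets you add more layers or save it later with `ggsave()`.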

## Conclusion

A histogram is similar to a **bar plot** and represents the **distribution of the data along with its range**.

R offers the built-in function **hist()** to plot a histogram in base R, and ggplot2 offers **geom_histogram()** to plot one with ggplot2.

A histogram can take many shapes. The major ones are **normal, positively skewed, negatively skewed, and bimodal distributions**.

This tutorial explained each of these shapes and then showed how to plot a histogram using ggplot2.

I hope you now understand **how to plot a histogram and how to read its different shapes.**

Try practicing with different datasets. If you have any questions, post them in the comments section. **Keep going!**