In this tutorial, we’ll go over the steps to plot a histogram in R. **A histogram** is a graphical representation of how the values of a numeric variable are spread across ranges (bins). It looks similar to a bar plot: each bar covers one range of values, and its height shows how many values fall in that range.

R offers the standard function **hist()** to plot a histogram in base R. The ggplot2 package offers the function **geom_histogram()** to plot a histogram using ggplot2.

## Advantages of Histograms

- A histogram provides the **distribution of the data** and the frequency of the data along with its range.
- It is an easier way to visualize **large data sets**.
- The histogram also shows the **skewness of the data**.
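
To make "frequency of the data along with its range" concrete, here is a minimal sketch on a small made-up vector. The `scores` values are purely illustrative and are not taken from any data set used in this tutorial. Calling `hist()` with `plot = FALSE` returns the bin boundaries and counts instead of drawing them.

```r
# Made-up example values (an assumption for illustration only)
scores <- c(52, 55, 61, 64, 67, 68, 71, 73, 74, 88)

# plot = FALSE returns the histogram object without drawing it
h <- hist(scores, breaks = seq(50, 90, by = 10), plot = FALSE)

h$breaks  # bin boundaries: 50 60 70 80 90
h$counts  # number of values in each bin: 2 4 3 1
```

Each count is the height of one bar, and each pair of neighbouring breaks is the range that bar covers.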

## Types of Histogram plots in R

Based on the distribution of the data, a histogram exhibits many different shapes. In this section, we will try to understand the different types of histogram shapes and their meaning.

The major types of histogram distributions are:

- Normal distribution
- Right skewed distribution
- Left skewed distribution
- Bimodal distribution
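
Before moving to real data, the four shapes can be previewed with simulated values. This is only a sketch: the sample sizes, means, and rates below are arbitrary assumptions chosen to make each shape easy to see.

```r
set.seed(42)  # fixed seed so the sketch is reproducible

normal_x  <- rnorm(1000, mean = 50, sd = 10)    # symmetric, bell-shaped
right_x   <- rexp(1000, rate = 1)               # long tail to the right
left_x    <- 10 - rexp(1000, rate = 1)          # mirrored: long tail to the left
bimodal_x <- c(rnorm(500, mean = 30, sd = 5),
               rnorm(500, mean = 70, sd = 5))   # two separate peaks

# Draw the four histograms in a 2 x 2 grid
old_par <- par(mfrow = c(2, 2))
hist(normal_x,  main = "Normal",       xlab = "value")
hist(right_x,   main = "Right skewed", xlab = "value")
hist(left_x,    main = "Left skewed",  xlab = "value")
hist(bimodal_x, main = "Bimodal",      xlab = "value")
par(old_par)  # restore the previous plotting layout
```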

## Basic Histogram in R

In this section, we will plot a simple histogram using the ‘airquality’ data set.

Execute the below code to plot this simple histogram.

    # load the built-in airquality data set
    data("airquality")

    # create the simple histogram
    hist(airquality$Temp, xlab = 'Temperature', ylab = 'Frequency',
         main = 'Simple histogram plot', col = 'yellow', border = 'black')
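
The number and width of the bars are controlled by the `breaks` argument of `hist()`. The sketch below draws the same variable with three settings. The specific values (5, 20, and 5-degree edges) are arbitrary choices for illustration. A single number is treated by R as a suggestion, while a vector of edges is used exactly as given.

```r
# Same variable, three different bin settings, side by side
old_par <- par(mfrow = c(1, 3))

hist(airquality$Temp, breaks = 5,  main = "breaks = 5",  xlab = "Temperature")
hist(airquality$Temp, breaks = 20, main = "breaks = 20", xlab = "Temperature")

# Explicit bin edges every 5 degrees; the edges must cover all values
edges <- seq(55, 100, by = 5)
hist(airquality$Temp, breaks = edges, main = "5-degree bins", xlab = "Temperature")

par(old_par)  # restore the previous plotting layout
```

Too few bins can hide the shape of the data, and too many can make it look noisy, so it is worth trying a couple of settings.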

## Normal distribution

A **normal distribution** in a histogram is the **ideal bell-shaped plot**: roughly symmetric around its center, with few or no extreme values.

This distribution shows that the majority of the values are concentrated at the center range.

The remaining data points taper off as **tails on both sides**, as you can see in the plot below.

Execute the below code to create the histogram which shows the normal distribution.

    # load the built-in iris data set
    data("iris")

    # preview the first five rows
    head(iris, 5)
    #>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    #> 1          5.1         3.5          1.4         0.2  setosa
    #> 2          4.9         3.0          1.4         0.2  setosa
    #> 3          4.7         3.2          1.3         0.2  setosa
    #> 4          4.6         3.1          1.5         0.2  setosa
    #> 5          5.0         3.6          1.4         0.2  setosa

    # create the histogram of sepal width
    hist(iris$Sepal.Width, xlab = 'Sepal width', ylab = 'Frequency',
         main = 'Normal distribution of the data', col = 'brown')
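
One way to judge how close the bars are to a bell shape is to draw a normal curve over them. The sketch below is an illustration, not a formal test of normality. It assumes the curve should use the sample mean and standard deviation of `iris$Sepal.Width`, and the name `sw` is just a local helper.

```r
sw <- iris$Sepal.Width

# freq = FALSE puts density (not counts) on the y-axis,
# so the histogram and the curve share the same scale
hist(sw, freq = FALSE, xlab = "Sepal width",
     main = "Sepal width with a normal curve", col = "brown")

# Normal density built from the sample mean and standard deviation
curve(dnorm(x, mean = mean(sw), sd = sd(sw)), add = TRUE, lwd = 2)
```

If the bars follow the curve closely, the data is approximately normal.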

## Left or Negatively Skewed Histogram in R

In this section, we will plot the left or negatively skewed histogram.

**Negatively skewed**: If the histogram shows values concentrated on the right side with the **tail on the left side**, or on the **negative value side**, it is called a **negatively or left-skewed distribution**.

Execute the below code to create a negatively skewed histogram in R.

**Dataset:** Google Play Store dataset from Kaggle

    # import the CSV file (assumes googleplaystore.csv is in the working directory)
    df <- read.csv("googleplaystore.csv")

    # preview the data
    head(df)

    # plot the histogram, which is negatively or left skewed
    hist(df$Rating, xlab = 'Ratings', ylab = 'Frequency',
         main = 'Negative or left skewed distribution', col = 'brown')
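
Skewness can also be checked numerically. Because the CSV file is not bundled with R, the sketch below uses simulated values as a hypothetical stand-in for ratings on a 1 to 5 scale. `sample_skewness()` is a small helper written for this example (a simple estimator, not a function from base R), and `fake_ratings` is made-up data.

```r
# Simple skewness estimate: the average cubed z-score.
# Negative -> left skewed, positive -> right skewed, near 0 -> symmetric.
sample_skewness <- function(x) {
  x <- x[!is.na(x)]              # drop missing values
  z <- (x - mean(x)) / sd(x)
  mean(z^3)
}

set.seed(1)
# Hypothetical stand-in for ratings between 1 and 5, bunched near the top
fake_ratings <- 1 + 4 * rbeta(2000, shape1 = 5, shape2 = 1.5)

sample_skewness(fake_ratings)               # negative for this left-skewed sample
mean(fake_ratings) < median(fake_ratings)   # the left tail pulls the mean below the median
```

The same helper could be applied to a real column such as `df$Rating` once the file has been loaded.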

## Right or Positively skewed Histogram

In this section, we will plot the right or positively skewed histogram.

**Positively skewed:** If the histogram shows values concentrated on the left side with the **tail on the right side of the plot**, it is called a **positively or right-skewed distribution**.

Execute the below code to plot the right or positively skewed histogram.

    # load the built-in attenu data set
    data("attenu")

    # plot the right or positively skewed distribution
    hist(attenu$accel, xlab = 'Acceleration', ylab = 'Frequency',
         main = 'Right or positively skewed distribution', col = 'brown')
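
A quick visual check for right skew is to mark the mean and the median on the histogram. In a right-skewed distribution the long tail pulls the mean to the right of the median. The sketch below adds both lines with `abline()`. The line styles and legend position are arbitrary choices.

```r
accel <- attenu$accel

hist(accel, xlab = "Acceleration", ylab = "Frequency",
     main = "Right-skewed distribution with mean and median", col = "brown")

# Mark the two summary values on the plot
abline(v = mean(accel),   lwd = 2, lty = 1)  # solid line: mean
abline(v = median(accel), lwd = 2, lty = 2)  # dashed line: median
legend("topright", legend = c("mean", "median"), lty = c(1, 2), lwd = 2)
```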

## Bimodal Distribution of the data plotted using Histogram

In this section, we will plot a bimodal distribution of the data.

**Bimodal distribution:** Bimodal distribution is a type of histogram distribution, where you can witness **two data peaks**.

In the graph below, the **x-axis (labeled ‘Quakes’) shows the depth values from the quakes data set**.

Execute the below code to plot the bimodal distribution.

    # load the built-in quakes data set
    data("quakes")

    # plot the bimodal histogram distribution
    hist(quakes$depth, xlab = 'Quakes', ylab = 'Frequency',
         main = 'Bimodal distribution', col = 'brown')
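
The two peaks can be made easier to see by overlaying a smoothed density line, and by counting how many observations fall in a shallow, middle, and deep band. This is a rough sketch: the band edges (200 and 400) and the number of bins are assumptions picked for illustration, not values from the data set documentation.

```r
depth <- quakes$depth

# freq = FALSE so the bars and the density line share the same scale
hist(depth, freq = FALSE, breaks = 30, xlab = "Depth",
     main = "Earthquake depth with density estimate", col = "brown")
lines(density(depth), lwd = 2)  # kernel density estimate, default bandwidth

# Count observations in three depth bands; a dip in the middle band
# is what you would expect from a bimodal distribution
bands <- table(cut(depth, breaks = c(0, 200, 400, 700)))
bands
```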

## Plotting a Histogram using ggplot2 in R

**ggplot2 is one of the most widely used visualization packages in R. It offers themes and functions to create visually appealing graphs**.

In this section, we will plot the histogram of the values present in the **‘diamonds’** data set, which ships with the ggplot2 package.

Execute the below code to plot the histogram using ggplot2.

    # install the required packages (only needed once)
    install.packages('ggplot2')
    install.packages('dplyr')
    install.packages('ggthemes')

    # load the required libraries
    library(ggplot2)
    library(dplyr)
    library(ggthemes)

    # show the first rows of the data
    head(diamonds)

    # plot the histogram
    ggplot(diamonds, aes(carat)) + geom_histogram()

    # change the bin width
    ggplot(diamonds, aes(carat)) + geom_histogram(binwidth = 0.01)

    # add the fill element and the x, y and main labels of the graph
    ggplot(diamonds, aes(carat, fill = cut)) +
      geom_histogram() +
      labs(x = 'carats', y = 'Frequency of carats') +
      ggtitle("Distribution of diamond carat by cut")

    # change the theme for a more attractive graph
    ggplot(diamonds, aes(carat, fill = cut)) +
      geom_histogram() +
      labs(x = 'carats', y = 'Frequency of carats') +
      ggtitle("Distribution of diamond carat by cut") +
      theme_classic()
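
Stacked fill colours can make it hard to compare the shape of each group. As an alternative sketch, `facet_wrap()` draws one small histogram per cut. The choice of 40 bins and the colours are arbitrary assumptions, and the plot is stored in `p` only so it can be printed or reused.

```r
library(ggplot2)

# One panel per cut, all sharing the same x-axis
p <- ggplot(diamonds, aes(x = carat)) +
  geom_histogram(bins = 40, fill = "brown", colour = "black") +
  facet_wrap(~ cut) +
  labs(x = "Carat", y = "Frequency",
       title = "Carat distribution, one panel per cut") +
  theme_classic()

print(p)
```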

## Conclusion

A histogram is similar to a **bar plot** and represents the **distribution of the data along with its range**.

R offers the built-in function **hist()** to plot a histogram in base R, and the ggplot2 package offers **geom_histogram()** to plot one using ggplot2.

A histogram can take many shapes. The major ones are **normal distribution, positively skewed, negatively skewed, and bimodal distribution**.

In this tutorial all these plot types are explained and plotting using ggplot2 is also illustrated in the end.

I hope you now understand **histogram plotting and how to read the different types of histograms.**

Try practicing with different datasets. If you have any questions, post them in the comments section. **Keep going!**