How to Create an Area Plot in R using ggplot2


Area graphs are plots that display quantitative data as a filled region under a line. The ggplot2 package provides the geom_area() function to draw area charts and geom_line() to draw a line over the data points.

What is an area plot?

An area plot is a kind of line plot in which the region between the line and the x-axis is filled in. To build one, we first mark the data points and then join them with a line, so the filled height shows the value at each point, typically across different time periods.

In this tutorial, we are going to create area charts using the ggplot2 library. If you already know the geom_area() function, you are just a few steps away from creating a beautiful area chart in R.

Let’s roll!

Create a Simple Area Plot in R using ggplot2

Let’s plot a simple area chart. This is the basic area plot in R using ggplot2. Here the data is the cumulative sum of 50 values drawn from the normal distribution with rnorm(), which gives a random walk.

Execute the code below to plot the area chart.

#loads the ggplot2 library
library(ggplot2)

#creates the data frame from the cumulative sum of normal random values (rnorm)
xdata <- 1:50
ydata <- cumsum(rnorm(50))
data1 <- data.frame(xdata, ydata)

#plots the area chart (alpha must lie between 0 and 1)
ggplot(data1, aes(x = xdata, y = ydata)) +
  geom_area(fill = '#142F86', alpha = 0.4)

Figure: Basic area plot in R
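
Because rnorm() draws fresh random numbers on every call, the plot above looks different each time the code runs. Here is a minimal sketch of one way to make it reproducible. The seed value 42 is an arbitrary choice for illustration and is not part of the original example.

```r
# Fix the random number generator state so rnorm() returns the same draws each run.
# The seed value 42 is an arbitrary, illustrative choice.
set.seed(42)
ydata_first <- cumsum(rnorm(50))

# Resetting the same seed reproduces exactly the same series.
set.seed(42)
ydata_second <- cumsum(rnorm(50))

# Compare the two series element by element.
identical(ydata_first, ydata_second)
```

Calling set.seed() once at the top of a script is usually enough to keep every later random draw repeatable.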

Customizing the area plot using ggplot2 and hrbrthemes libraries

A simple area chart, as shown above, doesn’t look exciting, right? Let’s put some life into our area chart by adding colors, fonts, styles, and themes.

For this, you have to install the ggplot2 and hrbrthemes packages.

To install ggplot2 – install.packages('ggplot2')

To install hrbrthemes – install.packages('hrbrthemes')

This plot includes a line and points drawn over the area. The points and the line joining them make the individual values easier to read than in a plain area chart. Execute the code below to plot the customized area chart.

#loads the required visualization libraries
library(ggplot2)
library(hrbrthemes)

#creates the x and y data (cumulative sum of normal random values)
xdata <- 1:50
ydata <- cumsum(rnorm(50))
#puts the data into a data frame
data <- data.frame(xdata, ydata)

#plots the area chart with theme, title and labels
#note: in ggplot2 3.4.0 and later, geom_line() prefers linewidth over size
ggplot(data, aes(x = xdata, y = ydata)) +
  geom_area(fill = '#142F86', alpha = 1) +
  geom_line(color = 'skyblue', size = 1) +
  geom_point(size = 1, color = 'blue') +
  ggtitle("Area plot using ggplot2 in R") +
  labs(x = 'Index', y = 'Cumulative sum') +
  theme_ipsum()

Figure: Area plot using ggplot2 in R

A basic stacked area plot using ggplot in R

The stacked area graph is a variant of the area graph that shows the behavior of multiple groups in a single chart, with the areas of the groups stacked on top of each other.

The code below also loads the dplyr package, although the plot itself only needs ggplot2. To install dplyr, run the code below in RStudio.

install.packages('dplyr')

The code below illustrates a basic stacked area chart.

#loads the libraries
library(ggplot2)
library(dplyr)

#creates the values and data frame: 7 groups at each of 7 time points
time <- as.numeric(rep(seq(1, 7), each = 7))
value <- runif(49, 30, 100)
group <- rep(LETTERS[1:7], times = 7)
data1 <- data.frame(time, value, group)

#plots the stacked area chart
ggplot(data1, aes(x = time, y = value, fill = group)) + geom_area()

Figure: Stacked area graph
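
To see what stacking means numerically, here is a small sketch on hypothetical toy data (the toy data frame and its values are made up for illustration). It computes the height of the whole stack at each time point, which is simply the sum of the group values at that time.

```r
# Hypothetical toy data: two groups measured at three time points.
toy <- data.frame(
  time  = rep(1:3, each = 2),
  group = rep(c("A", "B"), times = 3),
  value = c(10, 5, 20, 10, 15, 25)
)

# In a stacked area chart, the top edge of the whole stack at each time
# equals the sum of all group values at that time.
stack_top <- aggregate(value ~ time, data = toy, FUN = sum)
stack_top
```

The top of the stack shows the total, so a stacked chart lets you read both the total and each group's contribution.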

Enhancing the area plot using Viridis library

In the same way that we enhanced the simple area chart in the section above, we are going to add fonts, colors, and styles to the stacked area chart, this time using viridis.

viridis is a package that provides color scales designed to stay readable in grayscale and for viewers with color vision deficiency. To install the viridis package, run the code below in RStudio.

install.packages('viridis')

#loads the required libraries
library(ggplot2)
library(viridis)
library(hrbrthemes)

#creates the values and data frame
time <- as.numeric(rep(seq(1, 7), each = 7))
value <- runif(49, 10, 100)
group <- rep(LETTERS[1:7], times = 7)
data <- data.frame(time, value, group)

#adds title, colors and styles to the plot
ggplot(data, aes(x = time, y = value, fill = group)) +
  geom_area(size = 0.5, alpha = 0.8, color = 'yellow') +
  scale_fill_viridis(discrete = TRUE) +
  theme_ipsum() +
  ggtitle("Customized area plot using viridis library")

Figure: Area plot using the viridis library in R
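
To preview the fill colors before plotting, you can ask viridis for the palette directly. This is a minimal sketch, assuming the default palette options. The count of 7 matches the seven groups A to G used above.

```r
library(viridis)

# viridis(n) returns n hex colour codes sampled along the default palette.
# With default options, these are the colours a discrete viridis fill scale
# hands out to n groups.
palette_7 <- viridis(7)
palette_7
```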

Plotting the area chart using plotly library

plotly is an open-source library for creating interactive graphs with hover tooltips, zooming, and panning.

In this section, we are going to draw a stacked area plot of the popularity of five American baby names over the years, using the babynames dataset.

The resulting graph is interactive and comes with a legend, thanks to plotly.

#loads the required libraries
library(ggplot2)
library(hrbrthemes)
library(viridis)
library(babynames)
library(tidyverse)
library(plotly)

#creates the data frame with the selected female baby names
data <- babynames %>%
  filter(name %in% c('Margaret', 'Anna', 'Emma', 'Bertha', 'Sarah')) %>%
  filter(sex == 'F')

#plots the stacked area chart with American baby names
p <- data %>%
  ggplot(aes(x = year, y = n, fill = name, text = name)) +
  geom_area() +
  scale_fill_viridis(discrete = TRUE) +
  theme_ipsum() +
  ggtitle('Yearwise American baby names popularity')

#converts the ggplot object into an interactive plotly chart
ggplotly(p, tooltip = 'text')

Figure: Stacked area plot in R
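
If you do not have the babynames package at hand, the same ggplot-to-plotly conversion can be tried on a few made-up rows. This is only a sketch. The toy data frame and its numbers are hypothetical stand-ins, not real name counts.

```r
library(ggplot2)
library(plotly)

# Hypothetical toy data standing in for the babynames counts.
toy <- data.frame(
  year = rep(2001:2004, times = 2),
  name = rep(c("Anna", "Emma"), each = 4),
  n    = c(120, 150, 140, 160, 90, 110, 130, 170)
)

# geom_area() does not draw the text aesthetic itself (ggplot2 may warn about it);
# ggplotly() uses it for the hover label when tooltip = "text" is given.
p_toy <- ggplot(toy, aes(x = year, y = n, fill = name, text = name)) +
  geom_area()

# Convert the static plot to an interactive widget; print interactive_toy to view it.
interactive_toy <- ggplotly(p_toy, tooltip = "text")
```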

Plotting the multiple area graphs using facet_wrap()

Facets split a chart into one panel per group, which makes the behavior of each group easier to see than in a single stacked chart. In this case, the popularity of each baby name is shown in its own panel using the facet_wrap() function.

With the facet_wrap() function, you can create multiple plot panels in R from a single ggplot call, as shown below.

Execute the code below to create a multiple-panel plot using the facet_wrap() function.

#loads the babynames data, filtered by name and sex 'F'
data <- babynames %>%
  filter(name %in% c('Margaret', 'Anna', 'Emma', 'Bertha', 'Sarah')) %>%
  filter(sex == 'F')

#plots the multiple area plots using the function facet_wrap()
data %>%
  ggplot(aes(x = year, y = n, group = name, fill = name)) +
  geom_area() +
  scale_fill_viridis(discrete = TRUE) +
  ggtitle("Individual American names popularity - yearwise") +
  theme_ipsum() +
  theme(legend.position = "none", panel.spacing = unit(0.1, "lines"),
        strip.text.x = element_text(size = 6)) +
  facet_wrap(~name, scales = 'free_y')

Figure: Multiple area plots in R
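
The scales argument of facet_wrap() decides whether the panels share one y-axis. Here is a small sketch on hypothetical toy data (the toy data frame is made up for illustration) that builds both variants so you can compare them.

```r
library(ggplot2)

# Hypothetical toy data: one large series and one small series.
toy <- data.frame(
  year = rep(1:5, times = 2),
  name = rep(c("big", "small"), each = 5),
  n    = c(1000, 1200, 900, 1500, 1300, 10, 12, 9, 15, 13)
)

base_plot <- ggplot(toy, aes(x = year, y = n, fill = name)) +
  geom_area()

# Shared y-axis: the small series looks nearly flat next to the large one.
p_fixed <- base_plot + facet_wrap(~name)

# Independent y-axis per panel: each series uses the full height of its panel.
p_free <- base_plot + facet_wrap(~name, scales = "free_y")

# Print p_fixed and p_free to compare the two layouts.
```

Free scales make the shape of each series easier to read, but the panels can no longer be compared by height.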

Finding the age distribution of the US population between 1900 and 2002 using the stacked area plot in R

In this section, we are going to draw a stacked area plot that shows the age distribution of the US population between the years 1900 and 2002.

For this, you have to install the gcookbook package, which includes the uspopage data. You can install it by running this code – install.packages('gcookbook').

Execute the code below to plot the stacked area chart that shows the age distribution of the population.

#installs the required package
install.packages('gcookbook')

#loads the libraries
library(gcookbook)
library(ggplot2)

#inspects the structure of the data
str(uspopage)

#basic stacked area chart
ggplot(uspopage, aes(x = Year, y = Thousands, fill = AgeGroup)) + geom_area()

#adds outlines and a blue palette, and reverses the legend order
ggplot(uspopage, aes(x = Year, y = Thousands, fill = AgeGroup)) +
  geom_area(color = 'black', size = 0.3, alpha = 1) +
  scale_fill_brewer(palette = 'Blues', breaks = rev(levels(uspopage$AgeGroup)))

#creates the stacked area chart with both the stacking order and the legend reversed
ggplot(uspopage, mapping = aes(x = Year, y = Thousands, fill = AgeGroup)) +
  geom_area(color = 'black', size = 0.5, alpha = 1, position = position_stack(reverse = TRUE)) +
  scale_fill_brewer(palette = 'Blues') +
  guides(fill = guide_legend(reverse = TRUE))

Figure: ggplot multiple plots
Figure: ggplot multiple area plots
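
The legend order of a fill scale follows the levels of the grouping factor. Here is a small sketch on a hypothetical factor (not the uspopage data) showing how rev(levels()) gives one entry per group in reversed order, which is the form a breaks argument expects.

```r
# Hypothetical factor standing in for an age-group column.
age <- factor(c("<5", "5-14", ">64", "<5", "5-14", ">64"),
              levels = c("<5", "5-14", ">64"))

# One entry per group, in reversed level order.
reversed_levels <- rev(levels(age))
reversed_levels
```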

Proportional stacked area plot in R using the dplyr library

In a proportional stacked area chart, the value of each group is shown as a percentage of the total rather than as an absolute number.

This makes it easy to compare the share of each group over time, and it can reveal patterns that are hidden when the total itself changes.

For this method, we first have to create an additional percentage column. To create it, we use the dplyr library.

dplyr is an R package that provides tools for data manipulation.

Execute the code below to plot the stacked area plot with groups represented as percentages.

#installs the required package
install.packages('dplyr')
#loads the library (ggplot2 and gcookbook are assumed to be loaded from the previous section)
library(dplyr)

#groups the data by year and adds a column with each group's percentage of that year's total
us_dplyr <- uspopage %>%
  group_by(Year) %>%
  mutate(percentage = Thousands / sum(Thousands) * 100)
View(us_dplyr)

#plots the chart
ggplot(us_dplyr, aes(x = Year, y = percentage, fill = AgeGroup)) +
  geom_area(color = 'black', size = 0.3, alpha = 1) +
  scale_fill_brewer(palette = 'Blues', breaks = rev(levels(uspopage$AgeGroup)))

This is the data frame which shows the added ‘percentage’ column.

Figure: Proportional area plot in R
Figure: Stacked area plot in R using dplyr
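
A quick way to check a percentage column is to confirm that the shares within every year add up to 100. Here is a minimal sketch on hypothetical toy data. The toy data frame reuses the column names of uspopage, but its values are made up.

```r
library(dplyr)

# Hypothetical toy data: two age groups over two years.
toy <- data.frame(
  Year      = c(2000, 2000, 2001, 2001),
  AgeGroup  = c("young", "old", "young", "old"),
  Thousands = c(30, 70, 45, 55)
)

# Same group_by() and mutate() pattern as in the tutorial code:
# share of each group within its year.
toy_pct <- toy %>%
  group_by(Year) %>%
  mutate(percentage = Thousands / sum(Thousands) * 100) %>%
  ungroup()

# Sanity check: the percentages within every year should add up to 100.
year_totals <- toy_pct %>%
  group_by(Year) %>%
  summarise(total = sum(percentage))
year_totals
```

The ungroup() call is optional here. It drops the grouping so that later operations work on the whole table again.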

Summing up

Area charts are like line charts with the region under the line filled in. They are used to show how a quantity evolves over time.

The ggplot2 package offers the geom_area() function to plot area charts.

Stacked area charts show several groups in one chart by stacking their areas on top of each other. They are among the most commonly used area chart types in analysis.

In this tutorial, we have gone through simple, stacked, faceted, and proportional area charts. R offers various libraries, such as ggplot2, viridis, hrbrthemes, and plotly, to enhance these graphs.

Hope you enjoyed the tutorial. For any queries don’t hesitate to hit the comments section. Happy plotting!!!
