“A picture can speak a thousand words”. Pictures and graphs communicate patterns in data more effectively than tables of numbers. I am sure you use many visualizations when you analyze data. But creating visualizations is not enough: the plots should be appealing, convincing, and to the point. R offers many plotting libraries such as ggplot2, lattice, leaflet, and more. Here, I am going to introduce ggpubr in R for effective and professional plotting. It will be great fun, hold tight!
What is ggpubr in R?
The ggpubr package in R is built to produce publication-ready visualizations. It creates ggplot2-based plots and adds helper functions that make them look polished with little effort.
With ggpubr –
- The syntax is simpler compared to ggplot2.
- You can create publication-ready plots with minimal code.
- It makes it easy to add p-values and significance levels to box plots and line plots.
- Annotating plots is straightforward.
- You can easily change the colors and labels of the plot.
Install ggpubr in R
So, we have learned something about ggpubr and its features. Now, let’s install ggpubr in R and load it into the environment to get started.
#Install required package
install.packages('ggpubr')
#Load the package
library(ggpubr)
Along with this package, we need some add-on packages to support our visualizations. Let’s install them too.
Counting ggpubr, we need to install and load 5 packages. So, I will use pacman’s p_load() function to install and load the remaining 4 at once.
You need to install and load the pacman package in R to continue with this.
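#Install pacman first if it is not already available
if (!requireNamespace('pacman', quietly = TRUE)) install.packages('pacman')
#Install (if missing) and load required packages
library(pacman)
pacman::p_load('colorspace','ggplot2','patchwork','wesanderson')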
You may see messages about the installation and loading of the specified packages in your RStudio console. If you are done with this, we are good to go!
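If you want to confirm that the setup worked, a quick check like the one below can help. It is only a sketch: pkgs, installed, and missing_pkgs are names I made up for illustration, and requireNamespace() comes with base R, so the check runs even when some packages are missing.

```r
# Packages this article relies on (ggpubr plus the four add-ons)
pkgs <- c('ggpubr', 'colorspace', 'ggplot2', 'patchwork', 'wesanderson')

# requireNamespace() returns TRUE when a package is installed and can be loaded
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Named logical vector: one TRUE/FALSE per package
print(installed)

# List anything that still needs install.packages()
missing_pkgs <- names(installed)[!installed]
if (length(missing_pkgs) > 0) {
  message('Still missing: ', paste(missing_pkgs, collapse = ', '))
}
```

If everything prints TRUE, you should be ready for the plots that follow.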
Import and Load Data
For this illustration, I will use the iris dataset, which ships with R and is widely used in examples.
#Import iris dataset
df <- datasets::iris
#Display the first 6 rows
head(df)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
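Before plotting, it can help to check the shape of the data. The snippet below is a small sketch using only base R functions; it shows how many rows, columns, and flowers per species the data frame holds.

```r
# Same data frame as in the article
df <- datasets::iris

# 150 rows (flowers) and 5 columns (4 measurements + Species)
dim(df)

# Number of flowers per species; iris has 50 of each
table(df$Species)

# Column types: four numeric columns and one factor
str(df)
```

Since Species is a factor with three levels, it works well for the color and fill arguments used in the plots.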
Let’s Go Charting!
Our data is ready, so let’s plot a histogram to understand its distribution.
#Plots the histogram of the data distribution
gghistogram(df, x = 'Sepal.Length',
            color = 'Species', fill = 'Species',
            palette = wes_palette('FantasticFox1'))
- You have to call the gghistogram function.
- Mention the data, the variable, and the color.
- Use Species as the fill argument.
- Palette – You can try many palettes; the one used here comes from the wesanderson package.
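The steps above can be extended with a few optional arguments. The sketch below is my own variation, not part of the original plot: it swaps in the built-in 'jco' palette so it does not depend on wesanderson, and the value bins = 20 is just a guess you should tune for your data.

```r
library(ggpubr)

df <- datasets::iris

# Same histogram, with a few optional arguments:
#   bins = 20     : number of bars (illustrative choice, tune for your data)
#   add = 'mean'  : vertical line at each group's mean
#   rug = TRUE    : small ticks on the x axis for individual observations
hist_plot <- gghistogram(df, x = 'Sepal.Length',
                         color = 'Species', fill = 'Species',
                         palette = 'jco',
                         bins = 20, add = 'mean', rug = TRUE)
hist_plot
```

Because the result is a regular ggplot object, you can keep it in a variable and reuse it later.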
Density Plots using ggpubr
As you may already know, density plots are great for understanding the distribution of a numerical variable. Let’s plot a density chart for the data.
You can use the ggdensity function offered by ggpubr for this purpose.
#Density plot for the data
ggdensity(df, x = 'Sepal.Length',
          fill = 'Species', color = 'Species',
          palette = wes_palette('FantasticFox1'),
          add = 'mean')
- Call the ggdensity function.
- Add the data and mention the variables.
- You can add color and fill arguments.
- Don’t forget to mention the palette for great visualizations.
- Add the mean line for a professional plot.
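To see which values the mean lines mark, you can compute the group means yourself. This is a small base R sketch; species_means is a name I picked for illustration.

```r
df <- datasets::iris

# Mean Sepal.Length for each species
species_means <- tapply(df$Sepal.Length, df$Species, mean)
print(round(species_means, 3))
#     setosa versicolor  virginica
#      5.006      5.936      6.588
```

Each vertical line in the density plot should sit at one of these values.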
Box Plots Using ggpubr
Box plots are useful for spotting outliers in the data. They also show the median and quartiles of the distribution.
Here, we will make use of the ggboxplot function.
The colorspace package will help us add more palettes to the plot. Let’s rock!!!
#Plots the boxplot of the data
ggboxplot(df, x = 'Species', y = 'Sepal.Length',
          color = 'Species',
          palette = qualitative_hcl(3, palette = 'harmonic'),
          add = 'jitter', shape = 'Species')
- Call the ggboxplot function.
- Add the X and Y axis variables.
- Make use of colorspace and add the palettes.
- You can also add the shape.
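The box plot is also a natural place to show p-values, one of the features listed earlier. Here is a minimal sketch using ggpubr's stat_compare_means(). The comparisons list and the label.y position are my own choices for illustration, and as far as I know the defaults are a Wilcoxon test for pairs and a Kruskal-Wallis test for the overall comparison, so please check which test suits your data.

```r
library(ggpubr)

df <- datasets::iris

# Pairs of species to compare (all three pairwise combinations)
comparisons <- list(c('setosa', 'versicolor'),
                    c('versicolor', 'virginica'),
                    c('setosa', 'virginica'))

box_p <- ggboxplot(df, x = 'Species', y = 'Sepal.Length',
                   color = 'Species', palette = 'jco', add = 'jitter') +
  # Pairwise p-values drawn as brackets above the boxes
  stat_compare_means(comparisons = comparisons) +
  # Overall p-value across the three groups, placed near the top
  stat_compare_means(label.y = 10)
box_p
```

If the overall label overlaps the brackets, adjust label.y until it sits clear of them.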
Regression Plot using ggpubr
The ggpubr package offers the ggscatter function, which can add a regression line to your scatter plots. It is a cool function to draw regression lines. Try this out.
#Add regression line to the plot
ggscatter(df, x = 'Sepal.Width', y = 'Sepal.Length',
          palette = 'jco', shape = 'Species',
          add = 'reg.line', color = 'Species',
          conf.int = TRUE)
- Call the ggscatter function.
- Add the variables.
- Make use of the palettes and add the shape with color.
- You have to set the add argument to 'reg.line' to draw the regression line.
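You can also print the correlation for each species next to the regression lines. The sketch below uses ggpubr's stat_cor() for that. The label position label.x = 3.5 is only a guess that fits this data, and Pearson correlation is, as far as I know, the default method.

```r
library(ggpubr)

df <- datasets::iris

reg_p <- ggscatter(df, x = 'Sepal.Width', y = 'Sepal.Length',
                   color = 'Species', palette = 'jco',
                   add = 'reg.line', conf.int = TRUE) +
  # One correlation label per species, colored like its regression line
  stat_cor(ggplot2::aes(color = Species), label.x = 3.5)
reg_p
```

The labels make it easier to judge how strong each fitted line really is.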
Merge Two Plots Using patchwork
The thing I enjoyed most is the patchwork library. It helps you merge two plots in seconds.
All you need to do is assign both plots to individual variables and then add the variables. That’s it. The plots are merged into a single figure that looks both beautiful and professional.
#Merge the plots
#Assign the first plot to a variable
plot1 <- ggboxplot(df, x = 'Species', y = 'Sepal.Length',
                   color = 'Species',
                   palette = qualitative_hcl(3, palette = 'harmonic'),
                   add = 'jitter', shape = 'Species')
#Assign the second plot to a variable
plot2 <- ggscatter(df, x = 'Sepal.Width', y = 'Sepal.Length',
                   palette = 'jco', shape = 'Species',
                   add = 'reg.line', color = 'Species',
                   conf.int = TRUE)
#Merge the plots
Merged_plots <- plot1 + plot2
Merged_plots
- You have to create 2 plots.
- Assign each plot to a variable.
- Make sure you have loaded the patchwork package.
- Then finally add the two plots as shown in the code.
- View the merged plot.
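Besides adding plots side by side, patchwork supports other layouts. The sketch below is one possible variation, assuming the same iris data: it stacks the plots with the / operator and adds a shared title with plot_annotation(). The title text and the variable name stacked are made up for illustration.

```r
library(ggpubr)
library(patchwork)

df <- datasets::iris

plot1 <- ggboxplot(df, x = 'Species', y = 'Sepal.Length',
                   color = 'Species', palette = 'jco')
plot2 <- ggscatter(df, x = 'Sepal.Width', y = 'Sepal.Length',
                   color = 'Species', palette = 'jco')

# '/' stacks plots vertically, while '+' or '|' place them side by side
stacked <- plot1 / plot2 +
  # Shared title and automatic panel tags (A, B)
  plot_annotation(title = 'Sepal length in iris', tag_levels = 'A')
stacked
```

A stacked layout tends to suit narrow pages, while the side-by-side layout suits wide slides.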
Ending Note – ggpubr in R
ggpubr is one of the best R packages that I have used for data visualization. The plots in this article show the professional standard it can reach.
I am sure that if you tried to create these plots in Tableau, you would be scratching your head. As I always say, R is not a language but a love. It offers hundreds of packages that will help you not only in visualization but in all your data-related work.
I hope you have enjoyed using ggpubr in R. That’s all for now. Happy R!!!
Further reading: R documentation