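The union() function in the R language combines two sets of values and removes the repeated values, so each value appears only once in the result. The function is useful for merging data without keeping duplicate records. Note that R is case-sensitive, so the function name is written in lowercase.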
Syntax of the union() function
union(): The union function returns the set union of two inputs, most commonly two vectors. For the row-wise union of data frames, this article uses the merge() function.
union(x,y)
Where:
x = First input data set or vector.
y = Second input data set or vector.
A basic example of the union() function
Now that we know what the union() function does, let’s see how it works in R.
#Creating vectors
x<-c(1,2,3,4,5)
y<-c(3,4,5,6,7)
#Combine the vectors and drop the duplicates
union(x,y)
[1] 1 2 3 4 5 6 7
As the output shows, we created two vectors and passed them as input to the union() function. It returned the combined values with the duplicates (3, 4, and 5) listed only once.
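To make the de-duplication behavior more concrete, here is a small sketch using base R only. The vector names a, b, u, and fruits are made up for illustration. It shows that repeated values are dropped even when they occur inside a single input, and that character vectors are handled the same way.

```r
# Duplicates are dropped even when they occur within a single input
a <- c(1, 1, 2, 3)
b <- c(3, 3, 4)
u <- union(a, b)
print(u)
# [1] 1 2 3 4

# Character vectors work the same way
fruits <- union(c("apple", "banana"), c("banana", "cherry"))
print(fruits)
# [1] "apple"  "banana" "cherry"
```

The values come back in the order of their first appearance, starting with the first input.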
The union() function with data frames
Let’s now unify two data frames and remove the duplicate rows they share.
We have to create two data frames for this. I am creating student marks data for this purpose.
#Creating a data frame
df_one <- data.frame(Student_ID = c(1,2,3,4,5,6),
                     Marks = c(81,80,78,85,91,94),
                     Subject = c('Maths','English','Science','Economics','Computers','Geography'))
df_one
  Student_ID Marks   Subject
1          1    81     Maths
2          2    80   English
3          3    78   Science
4          4    85 Economics
5          5    91 Computers
6          6    94 Geography
#Creating a second data frame
df_two <- data.frame(Student_ID = c(4,5,6,7,8,9),
                     Marks = c(85,91,94,93,80,83),
                     Subject = c('Economics','Maths','Computers','Science','Stats','Chemistry'))
df_two
  Student_ID Marks   Subject
1          4    85 Economics
2          5    91     Maths
3          6    94 Computers
4          7    93   Science
5          8    80     Stats
6          9    83 Chemistry
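Now that we have the data frames, let’s use the merge() function to take their union. With all = TRUE, merge() keeps every row from both data frames, and rows that match on all the common columns appear only once.
Let’s see how it works.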
#Unify the data frames
my_union <- merge(df_one, df_two, all = TRUE)
my_union
   Student_ID Marks   Subject
1           1    81     Maths
2           2    80   English
3           3    78   Science
4           4    85 Economics
5           5    91 Computers
6           5    91     Maths
7           6    94 Computers
8           6    94 Geography
9           7    93   Science
10          8    80     Stats
11          9    83 Chemistry
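With this method, you can combine the data frames and keep a single copy of any row that appears in both. Here, the row for Student_ID 4 exists in both data frames but appears only once in the result. Student_ID 5 and 6 each appear twice because their Subject values differ between the two data frames.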
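As an alternative to merge(), the same set of rows can be obtained in base R by stacking the two data frames with rbind() and then dropping repeated rows with unique(). This is a minimal sketch that recreates the student data from above so it runs on its own; the name stacked_union is made up for illustration.

```r
# Recreate the two data frames from the example above
df_one <- data.frame(Student_ID = c(1, 2, 3, 4, 5, 6),
                     Marks = c(81, 80, 78, 85, 91, 94),
                     Subject = c("Maths", "English", "Science",
                                 "Economics", "Computers", "Geography"))
df_two <- data.frame(Student_ID = c(4, 5, 6, 7, 8, 9),
                     Marks = c(85, 91, 94, 93, 80, 83),
                     Subject = c("Economics", "Maths", "Computers",
                                 "Science", "Stats", "Chemistry"))

# Stack all 12 rows, then keep only the first copy of each repeated row
stacked_union <- unique(rbind(df_one, df_two))

# Both approaches keep the same number of distinct rows
nrow(stacked_union)
nrow(merge(df_one, df_two, all = TRUE))
```

Unlike merge(), this approach does not sort the result: the rows stay in the order in which they were stacked.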
Where and how you can use the union() function
- When you are merging datasets.
- To remove duplicate values from the data.
- When using the ‘dplyr’ package.
- In exploratory data analysis and business analytics.
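For the dplyr item in the list above, here is a hedged sketch of a row-wise union of two data frames. It assumes the dplyr package is installed, and the code is skipped if it is not. The data frames df_a and df_b are small made-up examples, not the student data used earlier.

```r
# Run only if dplyr is available (assumption: it has been installed)
if (requireNamespace("dplyr", quietly = TRUE)) {
  # Hypothetical example data with one shared row (id 3, score 30)
  df_a <- data.frame(id = c(1, 2, 3), score = c(10, 20, 30))
  df_b <- data.frame(id = c(3, 4), score = c(30, 40))

  # union() keeps each distinct row once
  combined <- dplyr::union(df_a, df_b)
  print(combined)

  # union_all() keeps every row, including the repeated one
  all_rows <- dplyr::union_all(df_a, df_b)
  print(nrow(all_rows))
}
```

Both data frames need the same columns for this to work.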
Wrapping Up
The union() function in the R language is used to unify data, such as vectors, and to remove the duplicate values in it.
You can use the union() function along with the merge(), cbind(), and rbind() functions, and with the dplyr package as well.
That’s all for now. Happy Unifying!
Further reading: R documentation