The union() function in R – Eliminate Duplicate Values

THE UNION() FUNCTION IN R PROGRAMMING

The union() function in R combines two sets of values and drops the repeated ones, so each value appears only once in the result. This makes it useful for removing duplicate records when you bring data together.

Syntax of the union() function

union(): Returns the set union of two vectors. Base R's union() is meant for vectors; the row-wise union of data frames is done with merge() or the dplyr package.

union(x,y)

Where:

x = The first input vector.

y = The second input vector.

A basic example of the union() function

Now that we know what the union() function is, let’s see how it works in R.

#Creating vectors
x <- c(1, 2, 3, 4, 5)
y <- c(3, 4, 5, 6, 7)

#Combine the vectors and drop the repeated values
union(x, y)
[1] 1 2 3 4 5 6 7

As the output above shows, we created two vectors and passed them to the union() function, which returned the combined values free of duplicates.
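To make the behaviour a little more concrete, here is a minimal sketch with repeats inside each input vector as well as across them. It assumes only base R; the variable names u and by_hand are made up for illustration. For plain vectors like these, the result matches concatenating the inputs and calling unique().

```r
# Vectors with repeats inside each input as well as across inputs
x <- c(1, 2, 2, 3)
y <- c(3, 4, 4, 5)

# union() keeps one copy of each value, in order of first appearance
u <- union(x, y)
print(u)

# The same result built by hand: concatenate, then drop duplicates
by_hand <- unique(c(x, y))
print(identical(u, by_hand))
```

So union() removes duplicates that come from either input, not only the values the two vectors share.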

The union() function with the data frames

Let’s now take the union of two data frames and remove the rows they have in common.

We need two data frames for this, so I am creating student marks data.

#Creating the first data frame
df_one <- data.frame(Student_ID = c(1, 2, 3, 4, 5, 6),
                     Marks = c(81, 80, 78, 85, 91, 94),
                     Subject = c('Maths', 'English', 'Science', 'Economics', 'Computers', 'Geography'))
df_one
  Student_ID Marks   Subject
1          1    81     Maths
2          2    80   English
3          3    78   Science
4          4    85 Economics
5          5    91 Computers
6          6    94 Geography

#Creating the second data frame
df_two <- data.frame(Student_ID = c(4, 5, 6, 7, 8, 9),
                     Marks = c(85, 91, 94, 93, 80, 83),
                     Subject = c('Economics', 'Maths', 'Computers', 'Science', 'Stats', 'Chemistry'))
df_two
  Student_ID Marks   Subject
1          4    85 Economics
2          5    91     Maths
3          6    94 Computers
4          7    93   Science
5          8    80     Stats
6          9    83 Chemistry

Now that we have the data frames, let’s use the merge() function with all = TRUE to take their union.

Let’s see how it works.

#Unify the data frames
my_union <- merge(df_one, df_two, all = TRUE)
my_union
   Student_ID Marks   Subject
1           1    81     Maths
2           2    80   English
3           3    78   Science
4           4    85 Economics
5           5    91 Computers
6           5    91     Maths
7           6    94 Computers
8           6    94 Geography
9           7    93   Science
10          8    80     Stats
11          9    83 Chemistry

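With this method, you get the union of the two data frames. The row that is identical in both (Student_ID 4) appears only once, while rows that differ in any column, such as the two entries for Student_ID 5, are kept as separate rows.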
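As a cross-check on the merge() result, the same row-wise union can be sketched in base R by stacking the two data frames with rbind() and dropping repeated rows with unique(). This is only an alternative illustration, not a replacement for the approach above; the names stacked and row_union are made up here.

```r
# Recreate the two data frames from the example
df_one <- data.frame(Student_ID = c(1, 2, 3, 4, 5, 6),
                     Marks = c(81, 80, 78, 85, 91, 94),
                     Subject = c('Maths', 'English', 'Science',
                                 'Economics', 'Computers', 'Geography'))
df_two <- data.frame(Student_ID = c(4, 5, 6, 7, 8, 9),
                     Marks = c(85, 91, 94, 93, 80, 83),
                     Subject = c('Economics', 'Maths', 'Computers',
                                 'Science', 'Stats', 'Chemistry'))

# Stack all rows, then keep one copy of each fully identical row
stacked <- rbind(df_one, df_two)
row_union <- unique(stacked)

nrow(stacked)    # 12 rows before removing duplicates
nrow(row_union)  # 11 rows, the same count as merge(..., all = TRUE)
```

Unlike merge(), this keeps the rows in their original order rather than sorting them by the key columns.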

Where and how you can use the union() function

  • When merging datasets.
  • To remove duplicate values from the data.
  • When using the ‘dplyr’ package.
  • In exploratory data analysis and business analytics.
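For the dplyr case in the list above, here is a small sketch. It assumes the dplyr package is installed, which the article does not set up, so the call is guarded and simply skipped otherwise. dplyr provides a union() method for data frames that works row by row; the name via_dplyr is made up for illustration.

```r
# Same student data as in the data frame example
df_one <- data.frame(Student_ID = c(1, 2, 3, 4, 5, 6),
                     Marks = c(81, 80, 78, 85, 91, 94),
                     Subject = c('Maths', 'English', 'Science',
                                 'Economics', 'Computers', 'Geography'))
df_two <- data.frame(Student_ID = c(4, 5, 6, 7, 8, 9),
                     Marks = c(85, 91, 94, 93, 80, 83),
                     Subject = c('Economics', 'Maths', 'Computers',
                                 'Science', 'Stats', 'Chemistry'))

# Assumption: dplyr may or may not be installed, so check first
if (requireNamespace("dplyr", quietly = TRUE)) {
  # Row-wise union: rows present in either data frame, duplicates dropped
  via_dplyr <- dplyr::union(df_one, df_two)
  print(nrow(via_dplyr))
} else {
  message("dplyr is not installed; skipping this example")
}
```

Calling the function as dplyr::union() avoids attaching the package, so base R's union() stays available under its usual name.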

Wrapping Up

The union() function in R combines two vectors and removes the duplicate values. For data frames, merge() with all = TRUE gives the same kind of union row by row.

You can use the union() function along with the merge, cbind, and rbind functions, and with the dplyr package as well.

That’s all for now. Happy Unifying!

Further reading: R documentation
