Handling Missing Values in R – na.omit() function in R

In this article, we’ll work on handling missing values using the na.omit() function in R. If you are an analyst, you will be dealing with large amounts of data each day.

Much of that data arrives unstructured and contains missing values, which R represents as NA. The same task can also be handled with the replace() function in R, which substitutes missing values instead of removing them.

The syntax of na.omit() function in R

na.omit(): The na.omit() function in R removes the missing (NA) values from the data. For a vector it drops the NA elements; for a data frame it drops every row that contains at least one NA.

na.omit(data)

Where,

data = Input data.

By the end of this article, you will be better at handling missing values in your data using several methods and techniques. Let’s roll!!!

1. na.omit() function in R with a vector

As you know, a vector in R is a basic data structure whose elements all share the same data type.

Now, let’s see how we can eliminate the NA values present in the vector using the na.omit() function in R.

#Creates a vector with NA values in it
df <- c(34,25,67,NA,56,87,NA,34,56)

#Omits or Eliminates the NA values in the data
na.omit(df)
[1] 34 25 67 56 87 34 56

Fantastic!!! We have successfully omitted the missing values present in our vector. It is as simple as that. A single line of code can get the job done for you.

attr(,"na.action")
[1] 4 7
attr(,"class")
[1] "omit"

Here, the rest of the output tells you more about the action that was performed. The na.action attribute records the positions of the removed values: the NA values were at positions 4 and 7 in the vector. The class of this attribute is ‘omit’.
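The bookkeeping attribute can be read back or stripped when only the plain values are needed. The following is a minimal sketch using base R only; the variable names x, cleaned, removed, and plain are chosen here for illustration and are not part of the original example.

```r
# Vector with two missing values (same values as the vector example above)
x <- c(34, 25, 67, NA, 56, 87, NA, 34, 56)

cleaned <- na.omit(x)

# Positions of the removed NA values, as recorded by na.omit()
removed <- as.integer(attr(cleaned, "na.action"))
print(removed)

# as.vector() drops the extra attributes and leaves a plain numeric vector
plain <- as.vector(cleaned)
print(plain)
```

Stripping the attribute is optional, but it keeps later printouts free of the na.action lines.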

2. na.omit() function with a dataframe

Well, now we can continue our journey with the na.omit() function, but this time with a new guest: a data frame in R.

We are going to use ‘airquality’, a dataset that ships with R and contains NA values, for this purpose.

Let’s see how it works!!!

#Loading the dataset
df <- datasets::airquality

#Prints the first 12 rows of the data
head(df, 12)
      Ozone Solar.R  Wind Temp  Month Day
1      41     190    7.4   67     5   1
2      36     118    8.0   72     5   2
3      12     149   12.6   74     5   3
4      18     313   11.5   62     5   4
5      NA      NA   14.3   56     5   5
6      28      NA   14.9   66     5   6
7      23     299    8.6   65     5   7
8      19      99   13.8   59     5   8
9       8      19   20.1   61     5   9
10     NA     194    8.6   69     5  10
11      7      NA    6.9   74     5  11
12     16     256    9.7   69     5  12

Great! Now our data is ready, NA values included. Having already used na.omit() on a vector, we can go ahead and omit the NA values from the data frame.

#Omits every row that contains an NA (first 10 rows of the result shown)
head(na.omit(df), 10)

Again, a single line of R gets the job done.

     Ozone Solar.R Wind  Temp  Month Day
1      41     190   7.4   67     5   1
2      36     118   8.0   72     5   2
3      12     149  12.6   74     5   3
4      18     313  11.5   62     5   4
7      23     299   8.6   65     5   7
8      19      99  13.8   59     5   8
9       8      19  20.1   61     5   9
12     16     256   9.7   69     5  12
13     11     290   9.2   66     5  13
14     14     274  10.9   68     5  14

You can now see that the na.omit() function has omitted all the rows with NA values: rows 5, 6, 10, and 11 are gone.

Note: na.omit() is safe to use when the data set is large and only a small share of its rows contain NA values. Otherwise, you will lose a large part of the data, because this function eliminates the entire row even if only one value in it is missing. This may lead to poor modeling and accuracy as well.

Suppose your data has 100 rows and 40 columns. If the NA values are spread over 20-30 rows, na.omit() will discard 20-30% of the data, even when each of those rows is missing only one or two of its 40 values.

In such cases, the replace() function in R is often a better fit, as you can replace those specific missing entries with a substitute value.
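Before calling na.omit() on a data frame, it can help to measure how much data it would cost. The sketch below uses the built-in airquality data and base R functions only (complete.cases(), colSums(), is.na()); the variable names are illustrative.

```r
df <- datasets::airquality

total_rows    <- nrow(df)
complete_rows <- sum(complete.cases(df))  # rows that na.omit() would keep
lost_share    <- 1 - complete_rows / total_rows

# NA count per column shows where the missing values are concentrated
na_per_column <- colSums(is.na(df))
print(na_per_column)

cat(sprintf("na.omit() would drop %d of %d rows (%.1f%%)\n",
            total_rows - complete_rows, total_rows, 100 * lost_share))
```

If the share of dropped rows is large, that is a signal to look at column-wise handling or replacement instead of row removal.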
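As a rough illustration of the replacement approach, the sketch below fills the missing Ozone readings with the mean of the observed ones. Using the mean as the substitute is an assumption made here for the example; the article does not prescribe a particular substitute value, and the variable names are illustrative.

```r
# Ozone readings from the first 12 rows of airquality (copied from the table above)
ozone <- c(41, 36, 12, 18, NA, 28, 23, 19, 8, NA, 7, 16)

# Mean of the observed values, ignoring the NA entries
ozone_mean <- mean(ozone, na.rm = TRUE)

# replace(x, list, values): put the substitute value at the NA positions
ozone_filled <- replace(ozone, is.na(ozone), ozone_mean)

print(ozone_filled)
```

The vector keeps all 12 positions, so it still lines up with the other columns of the same rows.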

3. na.omit() function with specific column of the data

But don’t worry, we have some alternatives. Instead of eliminating the entire row, you can omit the NA values in a particular column.

Fantastic right? Now, I can see a relaxed “you” 🙂

Let’s see how it works!!!

Consider the same ‘airquality’ data for this purpose.

    Ozone Solar.R  Wind   Temp  Month Day
1      41     190    7.4   67     5   1
2      36     118    8.0   72     5   2
3      12     149   12.6   74     5   3
4      18     313   11.5   62     5   4
5      NA      NA   14.3   56     5   5
6      28      NA   14.9   66     5   6
7      23     299    8.6   65     5   7
8      19      99   13.8   59     5   8
9       8      19   20.1   61     5   9
10     NA     194    8.6   69     5  10
11      7      NA    6.9   74     5  11
12     16     256    9.7   69     5  12

You can see that there are 2 NA values in the column Ozone in these 12 rows. Now we are going to omit those 2 NA values without touching the rest of the data.

#Omits the NA values from the first 12 values of column Ozone
na.omit(head(df$Ozone, 12))
 [1] 41 36 12 18 28 23 19  8  7 16
attr(,"na.action")
[1]  5 10
attr(,"class")
[1] "omit"

WoW!!! The na.omit() function has eliminated the NA values from the column Ozone. It returns the cleaned column as a vector, and the data frame df itself is left unchanged.

You can use this method to eliminate the NA values from one column while avoiding data loss in the other columns. These methods will be really helpful in handling missing values in R.
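If the goal is to keep whole rows while filtering on a single column, one possible approach is logical indexing with is.na(). This is a sketch in base R rather than a feature of na.omit() itself; the names ozone_known and all_known are illustrative.

```r
df <- datasets::airquality

# Keep every row whose Ozone value is present; NAs in other columns are kept
ozone_known <- df[!is.na(df$Ozone), ]

# For comparison: na.omit() drops a row if ANY of its columns is NA
all_known <- na.omit(df)

cat("rows in original data:       ", nrow(df), "\n")
cat("rows with a known Ozone:     ", nrow(ozone_known), "\n")
cat("rows with no NA in any column:", nrow(all_known), "\n")
```

Filtering on one column keeps at least as many rows as na.omit() on the whole data frame, at the cost of leaving NA values in the other columns.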

Wrapping Up – Handling Missing Values in R

R language offers great features, functions, and tools for data manipulation, collection, storage, and analysis.

This article is dedicated to handling missing values in R using the na.omit() function.

I hope you got the idea of how to use the na.omit() function and its techniques. That’s all for today. Happy R!!!

More read: R documentation
