Factor functions in R – 5 functions to know!


Hello, readers! In this article, we will look in detail at 5 functions for working with factors in R programming.

So, let us begin!


So, what is a Factor in R?

R provides us with various data objects to store data and process them accordingly. Some of the objects include,

  1. Factors
  2. Data frames
  3. Vectors
  4. Matrices, etc.

Factors in R are a data structure for storing categorical data. Any variable whose values fall into a limited set of categories can be stored as a factor. A factor records each distinct category as a level, so the groups present in the data are easy to recognize.
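To make the idea concrete, here is a minimal sketch that builds a factor from a character vector and inspects it. The variable name `sizes` and its values are made up for this illustration.

```r
# Hypothetical data: every value belongs to one of three categories
sizes <- factor(c("small", "large", "medium", "small", "large"))

print(sizes)              # the values, followed by the set of levels
print(is.factor(sizes))   # TRUE
print(table(sizes))       # number of observations in each category
print(as.integer(sizes))  # the integer codes stored behind the labels
```

Internally, each value is stored as an integer code that points to its level, which is why a factor is a compact way to hold repeated categories.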

Factor functions in R

In this article, we will look at some of the most commonly used functions for factors:

  1. levels() function
  2. nlevels() function
  3. droplevels() function
  4. gl() function
  5. recode_factor() function

Let us have a look at them one by one in the upcoming section!


1. R levels() function

‘Levels’ in R factors describe each distinct group present in the categorical data variable. R levels() function enables us to retrieve the levels of a factor, and, when used on the left side of an assignment, to set or rename them.

Syntax:

levels(data_column)

Example:

Initially, we have created a vector of categories and converted it to a factor using factor() function. Further, we have retrieved the levels of the factor using levels() function.

rm(list = ls())
 
Poll <- factor(c("Yes", "No", "May BE","May be Yes", "May be NO")) 
levels(Poll)

Output:

When the factor is created, factor() finds the unique categories in the variable, sorts them, and stores them as levels. The levels() function then returns that set of levels.

> levels(Poll)
[1] "May BE"     "May be NO"  "May be Yes" "No"         "Yes" 
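Besides reading levels, levels() can also be used on the left side of an assignment to rename them. The following is a small sketch of that form; the factor values and the new labels "M", "N" and "Y" are chosen only for illustration.

```r
# A small factor; its levels are sorted as "Maybe", "No", "Yes"
Poll <- factor(c("Yes", "No", "Yes", "Maybe"))
print(levels(Poll))

# Assign new labels: position i of the new vector replaces level i
levels(Poll) <- c("M", "N", "Y")
print(Poll)   # the values are now Y N Y M
```

The replacement vector is matched to the existing levels by position, so it is worth printing levels() first to check the order.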

2. R nlevels() function

R nlevels() function returns the number of levels of a factor type data variable.

With nlevels() function, we can easily get the count of categorical groups defined for the data.

Example:

rm(list = ls())
 
Poll <- factor(c("Yes", "No", "May BE","May be Yes", "May be NO")) 
levels(Poll)
nlevels(Poll)

Output:

> levels(Poll)
[1] "May BE"     "May be NO"  "May be Yes" "No"         "Yes"       
> nlevels(Poll)
[1] 5
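One detail worth checking is that nlevels() counts the levels defined for the factor, not the observations and not only the categories that actually occur. The sketch below assumes a made-up factor `answers` whose level "Maybe" is declared but never observed.

```r
# "Maybe" is declared as a level but does not appear in the data
answers <- factor(c("Yes", "No", "Yes", "Yes"),
                  levels = c("No", "Yes", "Maybe"))

print(length(answers))          # 4 observations
print(nlevels(answers))         # 3 levels, including the unused "Maybe"
print(length(unique(answers)))  # 2 categories actually observed
```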

3. R droplevels() function

R droplevels() function helps us drop the unused levels from a factor. With droplevels() function, any level that no longer occurs among the values of the factor can be easily removed from the set of levels.

Example:

In this example, we have created factor type data. Further, we have deleted the value at the 8th position, i.e. 40. After the deletion, the value is gone from the variable, but we still see 40 listed as a level.

Then, we use droplevels() to remove the unused level (the one that is no longer part of the data variable) from the set of levels.

rm(list = ls())
 
data <- factor(c(10,20,30,10,10,20,30,40,50,50)) 

print("Factor values before deletion:") 
print(data) 

data <- data[-8]

print("Factor values after deletion:") 
print(data) 

print("Dropping unused level:") 
dta <- droplevels(data) 
print(dta) 

Output:

> print("Factor values before deletion:") 
[1] "Factor values before deletion:"
> print(data) 
 [1] 10 20 30 10 10 20 30 40 50 50
Levels: 10 20 30 40 50
> 
> data <- data[-8]
> 
> print("Factor values after deletion:") 
[1] "Factor values after deletion:"
> print(data) 
[1] 10 20 30 10 10 20 30 50 50
Levels: 10 20 30 40 50
> 
> print("Dropping unused level:") 
[1] "Dropping unused level:"
> dta <- droplevels(data) 
> print(dta) 
[1] 10 20 30 10 10 20 30 50 50
Levels: 10 20 30 50
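The same situation often comes up when rows are filtered out of a data frame. As a rough sketch, droplevels() can be applied to the whole data frame to clean up every factor column at once; the data frame `df` and its columns are invented for this example.

```r
# Hypothetical data frame with one factor column
df <- data.frame(
  group = factor(c("a", "b", "c", "a")),
  value = c(1, 2, 3, 4)
)

# Filtering out group "c" removes the rows but keeps the level
kept <- df[df$group != "c", ]
print(levels(kept$group))   # "a" "b" "c"

# droplevels() on the data frame drops unused levels in its factor columns
kept <- droplevels(kept)
print(levels(kept$group))   # "a" "b"
```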

4. R gl() function

R gl() function helps us generate factors by specifying a pattern of levels. Its arguments are listed below:

gl(n, k, length, labels, ordered)
  • n: Number of levels needed.
  • k: Number of times each level is repeated in a row.
  • length: Length of the end result (defaults to n * k; the pattern is recycled to fill it).
  • labels: Labels for the factor levels (optional).
  • ordered: If set to TRUE, the result is an ordered factor.

Example:

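In this example, we have created factor data with 2 levels, each level repeated 4 times in a row, and a total length of 12. Since one pass through the pattern gives only 8 values, the pattern is recycled until 12 values are produced.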

rm(list = ls())
 
data <- gl(2, 4, 12)
print(data)

Output:

> print(data)
 [1] 1 1 1 1 2 2 2 2 1 1 1 1
Levels: 1 2
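The labels and ordered arguments can be sketched in the same way. The example below assumes three made-up labels ("low", "medium", "high") and asks for an ordered factor, so the levels can be compared with < and >.

```r
# 3 levels, each repeated twice, with custom labels and an ordering
doses <- gl(3, 2, labels = c("low", "medium", "high"), ordered = TRUE)

print(doses)                 # Levels: low < medium < high
print(is.ordered(doses))     # TRUE
print(doses[1] < doses[6])   # TRUE, because "low" comes before "high"
```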

5. R recode_factor() function

R recode_factor() function, provided by the dplyr package, helps us to change selected values of an existing factor. That is, we can replace particular values of the factor with other values using recode_factor() method.

Example:

In this example, we have replaced ‘z’ with ‘a’ using recode_factor() method. Since ‘a’ is already a level of the factor, the two levels are merged into one.

rm(list = ls())
 
library(dplyr) 

dta <- as.factor(c("y", "z", "a")) 

print("Factor values before replacement:") 
print(dta) 

print("Factor after replacement:") 
print(recode_factor(dta, "z" = "a")) 

Output:

> print("Factor values before replacement:") 
[1] "Factor values before replacement:"
> print(dta) 
[1] y z a
Levels: a y z
 
> print("Factor after replacement:") 
[1] "Factor after replacement:"
> print(recode_factor(dta, "z" = "a")) 
[1] y a a
Levels: a y
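If dplyr is not installed, a similar replacement can be sketched in base R by assigning to the matching entry of levels(). This is an alternative approach, not what recode_factor() does internally, and the resulting level order may differ from recode_factor() in other cases. The name `merged` is introduced only for this example.

```r
# Same starting factor as above; its levels are "a", "y", "z"
dta <- as.factor(c("y", "z", "a"))

# Rename level "z" to "a"; R merges the two identical levels into one
merged <- dta
levels(merged)[levels(merged) == "z"] <- "a"

print(merged)   # values: y a a, levels: a y
```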

Conclusion

With this, we have come to the end of this topic. Feel free to comment below in case you have any questions.

For more such posts related to R programming, stay tuned with us.

Till then, happy learning!! 🙂
