Hello, readers! In this article, you’ll learn about an important part of R programming, the R data.table() function, in detail.
So, let us begin!!
Usage of R data.table() function
In any programming language, functions that work efficiently on large, complex data stand out from the rest.
R programming offers us one such function, built to deal with huge data sets and complex structures. Yes, you guessed it right!! 🙂
We are referring to the R data.table() function. The data.table library is widely regarded as one of the fastest R libraries for general-purpose data analysis and manipulation. In particular, the data.table package offers:
- fast selection and computation of columns and rows
- internal optimization of those operations
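This helps us achieve faster development with a simple and short syntax:
data[x, y, by]
- x: the rows to select (called i in the data.table documentation)
- y: the columns to select or compute (called j in the data.table documentation)
- by: the grouping condition. Further arguments such as with and roll can follow it.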
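To see the three parts working together, here is a minimal sketch. The table dt and its values are made up for illustration, and total_a is just a name chosen for the computed column.

```r
library(data.table)

# Illustrative data (values are an assumption made for this sketch)
dt = data.table(
  City = c("Pune", "Satara", "Pune", "Mumbai"),
  id   = 1:4,
  a    = c(2L, 3L, 4L, 5L)
)

# x (rows):    keep rows where id is greater than 1
# y (columns): compute the sum of column a
# by:          do that once per City
result = dt[id > 1, .(total_a = sum(a)), by = City]
print(result)
```

Reading the call from left to right mirrors the syntax above: filter first, then compute, then group.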
Having understood the functioning and structure of data.table() function, let us now focus on some practical examples of the same.
1. Creation of a data frame using R data.table()
data.table inherits from data.frame and provides a simple, optimized syntax for creating and working with data frames, as shown below.
In the example below, we create a data table with the columns ‘City’, ‘id’ and ‘a’.
rm(list = ls())
library(data.table)
data = data.table(
  City = c("Pune", "Satara", "Pune", "Mumbai", "Goa", "Gujarat"),
  id = 1:6,
  a = 2:7)
print(data)
      City id a
1:    Pune  1 2
2:  Satara  2 3
3:    Pune  3 4
4:  Mumbai  4 5
5:     Goa  5 6
6: Gujarat  6 7
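Because a data.table is also a data frame, it can be passed to functions that expect one, and an existing data frame can be converted. The following is a small sketch of both points; the data frame df is a made-up example.

```r
library(data.table)

data = data.table(
  City = c("Pune", "Satara", "Pune", "Mumbai", "Goa", "Gujarat"),
  id = 1:6,
  a = 2:7
)

# A data.table carries both classes, so data.frame functions accept it
print(class(data))
print(is.data.frame(data))

# An existing data.frame can be converted with as.data.table()
df = data.frame(City = c("Pune", "Goa"), id = 1:2)  # hypothetical data
dt = as.data.table(df)
print(is.data.table(dt))
```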
2. Selection of columns using data.table
Next, we select a column using the concise syntax data[ , .(column)].
In the example below, if we name the column without wrapping it in .(), we get a vector.
Thus, in order to obtain a data.table structure, we need to wrap the column name in .(), which is an alias for list().
info = data[ , City]     # returns a vector
print(info)
info = data[ , .(City)]  # returns a data.table
print(info)
[1] "Pune"    "Satara"  "Pune"    "Mumbai"  "Goa"     "Gujarat"
      City
1:    Pune
2:  Satara
3:    Pune
4:  Mumbai
5:     Goa
6: Gujarat
3. Selecting columns based on index position
In the example below, we select the columns at index positions 1 and 2, i.e. City and id. The with=FALSE argument tells data.table to treat the second argument as a set of column positions (or names) rather than as an expression to evaluate.
info = data[, c(1:2), with=FALSE]
print(info)
      City id
1:    Pune  1
2:  Satara  2
3:    Pune  3
4:  Mumbai  4
5:     Goa  5
6: Gujarat  6
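Column selection also works by name. The sketch below shows a few variants on the same table: several columns inside .(), and column names held in a character vector. The variable name cols is chosen just for this illustration.

```r
library(data.table)

data = data.table(
  City = c("Pune", "Satara", "Pune", "Mumbai", "Goa", "Gujarat"),
  id = 1:6,
  a = 2:7
)

# Several columns at once with .(); the result stays a data.table
city_a = data[, .(City, a)]

# Column names stored in a character vector (cols is a hypothetical name)
cols = c("City", "id")
by_name   = data[, cols, with = FALSE]
by_prefix = data[, ..cols]  # the .. prefix looks cols up outside the table

print(city_a)
print(by_name)
```

Selecting by name tends to be safer than selecting by position, since it keeps working if the column order changes.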
4. Use of the %in% and %like% operators with the data.table package
We can use operators such as %in% inside the row argument to filter rows by a particular set of values. %in% matches exact values, whereas %like%, which the data.table package provides, matches a pattern.
In the example below, we select every row whose ‘City’ value is ‘Pune’ or ‘Satara’. Since these are exact values, %in% is the right choice.
info = data[City %in% c("Pune", "Satara")]
print(info)
     City id a
1:   Pune  1 2
2: Satara  2 3
3:   Pune  3 4
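For comparison, here is a short sketch of %like%, which filters rows by a regular expression instead of a list of exact values. The pattern "^G" (cities starting with G) is picked only for illustration.

```r
library(data.table)

data = data.table(
  City = c("Pune", "Satara", "Pune", "Mumbai", "Goa", "Gujarat"),
  id = 1:6,
  a = 2:7
)

# %like% keeps the rows whose City matches the regular expression
g_cities = data[City %like% "^G"]
print(g_cities)
```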
5. Conditionally display rows with the R data.table() function
In the example below, we create a subset of the data using the == operator. That is, we select all the rows which have ‘Pune’ as the value for ‘City’ and 1 as the value for the column ‘id’. Since id is an integer column, we compare it with the number 1 rather than the string "1".
info = data[City == "Pune" & id == 1]
print(info)
   City id a
1: Pune  1 2
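Conditions can be combined in other ways too. As a rough sketch on the same table, the | operator keeps rows that satisfy either condition, and the special symbol .N counts the rows that match.

```r
library(data.table)

data = data.table(
  City = c("Pune", "Satara", "Pune", "Mumbai", "Goa", "Gujarat"),
  id = 1:6,
  a = 2:7
)

# Rows where City is "Pune" OR id is greater than 4
either = data[City == "Pune" | id > 4]
print(either)

# .N in the column argument returns the number of matching rows
pune_count = data[City == "Pune", .N]
print(pune_count)
```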
With this, we have come to the end of this topic. Feel free to comment below in case you have any questions.
For more such posts related to R programming, stay tuned and till then, Happy Learning!! 🙂