R offers dedicated functions that make it easy to read CSV files into data frames.
What is a CSV file?
CSV stands for comma-separated values. In a CSV file, each line holds one record, and the values in that record are separated by commas. Because the file is plain text, it is simple to create and to read.
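As a small illustration of the format, the sketch below parses a few lines of comma-separated text directly from a string. The column names and values are made up for this example and are not the data set used later in this tutorial.

```r
# A tiny CSV document held in a string (hypothetical sample data).
csv_text <- "Name,Department,Marks
Asha,chemistry,82
Ravi,physics,91"

# read.csv() can parse text directly through its 'text' argument.
students <- read.csv(text = csv_text)

print(students)
print(names(students))  # the first line of the CSV became the column names
```

Each line after the first becomes one row of the data frame, and each comma marks a column boundary.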
Why is CSV such a widely used format for storing data?
Many companies keep their data in Excel sheets. Many also store data as comma-separated values (CSV), because a plain-text CSV file is simpler to produce than a full spreadsheet. Later they can use R's built-in functions to read and analyze the data.
As a popular language for statistical analysis, R offers specific functions to read data from a CSV file into organized data frames.
Reading CSV File to Data Frame
In this short example, we will see how to read a CSV file into a data frame.
The first step is to check and set the working directory. It should be the folder that contains the CSV file.
1. Setting up the working directory
You can check the current working directory using the getwd() function and change it using the setwd() function. Write the path with forward slashes (or doubled backslashes), because a single backslash starts an escape sequence in an R string.
> getwd()  # shows the default working directory ----> "C:/Users/Dell/Documents"
> setwd("C:/Users/Dell/Documents/R-test data")  # sets the new working directory
> getwd()  # shows the updated working directory ----> "C:/Users/Dell/Documents/R-test data"
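One detail worth sketching: setwd() invisibly returns the previous working directory, so you can restore it afterwards. The example below uses R's temporary directory as a stand-in for a real data folder, and the file name is hypothetical.

```r
# tempdir() stands in for a real data folder (an assumption for this sketch).
data_dir <- tempdir()

# setwd() invisibly returns the directory that was current before the change.
old_dir <- setwd(data_dir)
print(getwd())

# Restore the original working directory when done.
setwd(old_dir)

# Alternative: leave the working directory alone and build a full path instead.
csv_path <- file.path(data_dir, "testdata.csv")  # hypothetical file name
print(csv_path)
```

Building the path with file.path() avoids changing global state, which can be handy in scripts that other people will run.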
2. Importing and Reading the dataset / CSV file
After setting the working directory, import the data set from the CSV file as shown below. read.csv() reads any comma-delimited text file, whatever its extension.
> readfile <- read.csv("testdata.txt")
Execute the above line of code in RStudio to get the data frame.
To check the class of the variable 'readfile', execute the code below.
> class(readfile)  # ---> "data.frame"
In the above image you can see the data frame, which holds the students' names, IDs, departments, genders and marks.
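To make the result of read.csv() easier to picture without the original file, here is a minimal sketch that writes a small CSV to a temporary file, reads it back, and inspects it. The rows are invented, and the header "Marks Scored" is an assumption about how the source file was labelled. With the default check.names = TRUE, read.csv() turns such a header into the syntactic name Marks.Scored.

```r
# Write a small, made-up CSV file to a temporary location.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Name,ID,Department,Gender,Marks Scored",
  "Asha,101,chemistry,Female,82",
  "Ravi,102,physics,Male,91",
  "Meena,103,chemistry,Female,77"
), csv_path)

# header = TRUE is already the default for read.csv(); it is spelled out here.
readfile <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)

print(class(readfile))    # the type of the object
print(dim(readfile))      # number of rows and columns
str(readfile)             # column names and types
print(head(readfile, 2))  # first two rows
```

Checking str() right after the import is a quick way to confirm that numeric columns were read as numbers and not as text.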
3. Extracting the student’s information from the CSV file
After getting the data frame, you can now analyse the data. You can extract particular information from the data frame.
To extract the highest marks scored by students,
> data <- read.csv("traindata.csv")
> marks <- max(data$Marks.Scored)  # this will give you the highest marks
To extract the details of the student who scored the highest marks,
> retval <- subset(data, Marks.Scored == max(Marks.Scored))  # this will extract the details of the student who secured the highest marks
> View(retval)
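Two details are easy to miss here: max() returns NA if any mark is missing, and several students can share the top mark. The sketch below uses an invented data frame with the same Marks.Scored column name to show both cases.

```r
# Made-up data: two students tie for the top mark and one mark is missing.
data <- data.frame(
  Name = c("Asha", "Ravi", "Meena", "John"),
  Marks.Scored = c(91, 91, NA, 77)
)

print(max(data$Marks.Scored))                # NA, because of the missing value
top <- max(data$Marks.Scored, na.rm = TRUE)  # ignore missing values
print(top)

# subset() treats NA in the condition as FALSE, so both tied students remain.
retval <- subset(data, Marks.Scored == top)
print(retval)

# which.max() returns only the first position of the maximum.
print(data[which.max(data$Marks.Scored), ])
```

If only one row is wanted, which.max() is enough. If ties matter, comparing against the maximum keeps every top scorer.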
To extract the details of the students who are studying in the 'chemistry' department,
> data <- read.csv("traindata.csv")
> retval <- subset(data, Department == "chemistry")  # this will extract the details of the students who are in the chemistry department
> View(retval)
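The same filter can also be written with logical indexing. Note that the comparison is case-sensitive, so "Chemistry" would not match "chemistry". A minimal sketch with invented rows:

```r
# Made-up data with inconsistent capitalisation in the Department column.
data <- data.frame(
  Name = c("Asha", "Ravi", "Meena"),
  Department = c("chemistry", "physics", "Chemistry"),
  stringsAsFactors = FALSE
)

# Exact, case-sensitive match; same rows as subset(data, Department == "chemistry").
exact <- data[data$Department == "chemistry", ]
print(exact)

# Case-insensitive match: lower-case the column before comparing.
any_case <- data[tolower(data$Department) == "chemistry", ]
print(any_case)

# Match several departments at once with %in%.
sciences <- subset(data, tolower(Department) %in% c("chemistry", "physics"))
print(sciences)
```

Normalising the case before comparing is a cheap safeguard when the file was typed in by hand.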
This is how you can read CSV files in R with the read.csv() function. This tutorial covered how to import a CSV file, read it into a data frame, and extract specific information from that data frame.
I used RStudio for this project. RStudio offers useful features such as a console, an editor, and an environment pane. You are free to use other editors such as Tinn-R or Crimson Editor. I hope this tutorial helps you understand how to read CSV files in R and extract information from a data frame.
For more, read: https://cran.r-project.org/manuals.html