Hello readers! In this article, we will focus on an important Machine Learning technique: **Linear Regression in R**.

So, let us get started!

## First, what is Linear Regression?

Supervised Machine Learning algorithms are broadly classified into **Regression** and **Classification** algorithms.

*Linear Regression is one such regression algorithm. The main task of a linear regression model is to predict a numeric value for a continuous target variable.*

*In other words, a linear regression model finds the best-fit line through the training data, and the values for the test data are predicted on the basis of this line.*

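It models the relation between two or more variables by fitting them to a linear equation. With a single input variable, the equation is:

**y = mx + c**

Here, y is the predicted value, x is the input variable, m is the slope of the line, and c is the intercept. With several input variables, each variable gets its own slope coefficient.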

In terms of modeling, a linear regression model attempts to extract a linear relationship between the input (independent) variables and a single response/dependent variable.
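
To make the equation concrete, here is a minimal sketch that generates synthetic data from a known line and lets `lm()` recover the slope and intercept. The values 3 and 10, the noise level, and the variable names are assumptions chosen purely for illustration.

```r
# Synthetic data from y = 3x + 10 plus random noise (illustrative values)
set.seed(42)
x <- seq(1, 50)
y <- 3 * x + 10 + rnorm(length(x), sd = 2)

# Fit a simple linear regression of y on x
fit <- lm(y ~ x)

# "(Intercept)" estimates c, and "x" estimates the slope m
coef(fit)
```

The estimated coefficients should land close to the values used to generate the data, which is what "finding the best-fit line" means in practice.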

## Linear Regression – A Practical Approach

Having understood what Linear Regression is, let us now walk through the steps of applying the model to a dataset.

In this example, we will use the **Bike Rental Count Prediction** problem, where the goal is to predict the number of bikes rented (the `cnt` column) depending on weather and other conditions. You can find the dataset here!

### 1. Load the dataset

It is time to load the dataset into the R environment. To do so, we use the `read.csv()` function.

    #Remove all the existing objects
    rm(list = ls())

    #Setting the working directory
    setwd("D:/Bike_Rental_Count/")
    getwd()

    #Load the dataset
    bike_data = read.csv("day.csv", header = TRUE)
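
After loading, it is worth checking that the columns and types came in as expected. The sketch below uses a small hypothetical sample passed to `read.csv()` through its `text` argument, so it runs without the actual file. The rows are made up for illustration and only borrow a few column names from the example.

```r
# Hypothetical three-row sample; the real day.csv has more rows and columns
csv_text <- "season,yr,temp,cnt
1,0,0.34,985
1,0,0.36,801
2,1,0.20,1349"

# header = TRUE treats the first line as column names
sample_data <- read.csv(text = csv_text, header = TRUE)

str(sample_data)             # column names and types
dim(sample_data)             # number of rows and columns
colSums(is.na(sample_data))  # missing values per column
```

The same three checks can be run on `bike_data` right after loading it.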

### 2. Splitting the dataset

Having loaded the dataset into the R environment, it is now time to separate the data into training and testing sets. Before splitting, the categorical columns are converted into dummy (one-hot) variables with `dummy.data.frame()` from the `dummies` package. We then use the `createDataPartition()` function from the `caret` package to put 80% of the rows into the training set and the rest into the testing set.

    ### SAMPLING OF DATA -- Splitting of Data columns into Training and Test dataset ###
    categorical_col_updated = c('season','yr','mnth','weathersit','holiday')

    #One-hot encode the categorical columns
    library(dummies)
    bike = bike_data
    bike = dummy.data.frame(bike, categorical_col_updated)
    dim(bike)

    #Separating the rows into training and testing dataframes.
    library(caret)
    set.seed(101)
    split_val = createDataPartition(bike$cnt, p = 0.80, list = FALSE)
    train_data = bike[split_val,]
    test_data = bike[-split_val,]
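
In case the `dummies` or `caret` packages are not available for your R installation, the same two steps can be approximated with base R alone. This is a sketch on hypothetical toy data, not the article's method: `model.matrix()` stands in for `dummy.data.frame()`, and a plain random `sample()` stands in for `createDataPartition()`, which additionally tries to keep the distribution of the outcome similar across both sets.

```r
# Hypothetical toy data standing in for the bike rental table
set.seed(101)
toy <- data.frame(
  season = factor(rep(1:4, each = 5)),
  temp   = runif(20),
  cnt    = round(runif(20, 500, 5000))
)

# One-hot encode 'season': "- 1" drops the intercept so that
# every level of the factor gets its own 0/1 column
season_dummies <- model.matrix(~ season - 1, data = toy)
toy_encoded <- cbind(toy[, c("temp", "cnt")], season_dummies)

# Simple random 80/20 split by row index
n <- nrow(toy_encoded)
train_idx <- sample(seq_len(n), size = round(0.8 * n))
train_toy <- toy_encoded[train_idx, ]
test_toy  <- toy_encoded[-train_idx, ]

dim(train_toy)
dim(test_toy)
```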

### 3. Error Metrics for Evaluation

Evaluation is one of the most crucial steps in data modelling. Since this is a regression problem, we use **MAPE** (mean absolute percentage error) to measure the error of the predictions.

    #Defining error metrics to check the error rate and accuracy of the Regression ML algorithms
    #1. MEAN ABSOLUTE PERCENTAGE ERROR (MAPE)
    #Note: the result is not finite when y_actual contains zeros.
    MAPE = function(y_actual,y_predict){
      mean(abs((y_actual-y_predict)/y_actual))*100
    }
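
To see what the metric computes, here is a small worked example with made-up numbers. The function is repeated so that the snippet runs on its own, and the two vectors are purely illustrative.

```r
MAPE <- function(y_actual, y_predict) {
  mean(abs((y_actual - y_predict) / y_actual)) * 100
}

# Illustrative values, not taken from the bike data
y_actual  <- c(100, 200, 400)
y_predict <- c(110, 180, 400)

# Per-row absolute percentage errors: 10%, 10% and 0%
abs((y_actual - y_predict) / y_actual) * 100

# MAPE is the mean of those three values, about 6.67
MAPE(y_actual, y_predict)
```

Each prediction is compared with its actual value in relative terms, so an error of 10 on a value of 100 counts the same as an error of 20 on a value of 200.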

### 4. Modelling – Linear Regression

To apply linear regression, we use the `lm()` function that ships with R (in the `stats` package). After fitting the model on the training data, we use the `predict()` function to predict the values for the testing dataset with the fitted model.

Finally, we evaluate the model using **MAPE** and an **Accuracy** figure computed as 100 minus MAPE.
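
The fit, predict, and evaluate sequence described above can be tried in isolation on `mtcars`, a dataset that ships with R. This is only a sketch under assumptions: the choice of `mpg` as the target, `wt` and `hp` as predictors, and the plain random split are illustrative and have nothing to do with the bike data.

```r
MAPE <- function(y_actual, y_predict) {
  mean(abs((y_actual - y_predict) / y_actual)) * 100
}

# Random 80/20 split of the built-in mtcars data
set.seed(101)
train_idx  <- sample(seq_len(nrow(mtcars)), size = round(0.8 * nrow(mtcars)))
train_cars <- mtcars[train_idx, ]
test_cars  <- mtcars[-train_idx, ]

# Fit on the training rows, predict on the held-out rows
car_model   <- lm(mpg ~ wt + hp, data = train_cars)
car_predict <- predict(car_model, newdata = test_cars)

# Compare the predictions with the actual values
car_mape <- MAPE(test_cars$mpg, car_predict)
print(car_mape)
```

Applied to the bike rental data, the same steps look as follows.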

    #Building the Linear Regression Model on our dataset
    linear_model = lm(cnt~., train_data)
    summary(linear_model)

    #Predictions on Testing data
    #The target column is dropped by name, which does not depend on its position
    linear_predict = predict(linear_model, test_data[, names(test_data) != "cnt"])

    #Using MAPE error metrics to check for the error rate and accuracy level
    LR_MAPE = MAPE(test_data$cnt, linear_predict)
    Accuracy_Linear = 100 - LR_MAPE

    print("MAPE: ")
    print(LR_MAPE)
    print('Accuracy of Linear Regression: ')
    print(Accuracy_Linear)

**Output:**

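As seen below, the linear regression model reaches a MAPE of about 17.6% on the test data, which gives roughly 82% accuracy when accuracy is computed as 100 minus MAPE. That is, the predictions deviate from the actual counts by about 17.6% on average. It does not mean that 82% of the values are predicted exactly.

    "MAPE: "
    17.61674
    "Accuracy of Linear Regression: "
    82.38326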

## Conclusion

With this, we have come to the end of this topic. Feel free to comment below if you have any questions.

Try implementing Linear Regression in R on different datasets, and let us know how it goes in the comment section.

Till then, stay tuned and happy learning! 🙂