Hello, readers! In this article, we will be focusing on the Pandas isna() and notna() functions for data pre-processing in Python, in detail.
So, let us begin!! 🙂
Relevance of Python in Data Pre-processing
Python offers us a huge number of modules and built-in functions to deal with data. In the domain of data science, data pre-processing plays a vital role. It is the process of cleaning the data and making it available for use and further processing. Through it, we understand the data much better, and it also enables us to eliminate unwanted values from the data.
Raw data contains elements in many different forms, since it usually comes from sources such as surveys, historical records, etc. In order to bring the data into an understandable format, we need functions to treat it.
One such important aspect of data pre-processing is missing value analysis. With missing value analysis, we check for the presence of missing or NULL values, and we either treat them or eliminate them from the dataset, because they can distort the distribution of the data.
When it comes to initial cleaning and missing value analysis of data, the Python Pandas module offers us two important functions:
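To make the "treat or eliminate" choice concrete, here is a minimal sketch. It assumes a tiny made-up DataFrame (the column names temp and count are hypothetical and are not taken from the bike dataset), and mean imputation is only one of several possible ways to treat missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are made up for illustration.
df = pd.DataFrame({
    "temp": [9.8, np.nan, 14.2, 11.0],
    "count": [16.0, 40.0, np.nan, 13.0],
})

# Option 1: eliminate every row that has at least one missing value.
dropped = df.dropna()

# Option 2: treat missing values by filling each column with its own mean.
filled = df.fillna(df.mean())

print(dropped)
print(filled)
```

Which option fits better depends on how much data is missing and on how the data will be used later.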
- isna() function
- notna() function
In this article, we will be having a look at the above functions in detail. For the examples that follow, we will be making use of the Bike Rental Count Prediction dataset.
You can find the dataset here!
1. Python isna() function
In the initial stages of data pre-processing and missing value analysis, the Pandas isna() function comes to our rescue by pointing out where the missing values are.
That is, with the isna() function, we can easily detect missing values, that is NULL or NA values, across the entire dataset. It is a boolean function: it returns an object of the same shape as the data, with True wherever a value is missing and False everywhere else.
Thus, for a quick and easy pre-processing check, the isna() function can be used to get an idea about the missing values in the dataset.
import pandas
data = pandas.read_csv("bike.csv")
data.isna()
When run on this dataset, the isna() function checks every single element for a missing value and returns False for each of them, which means the dataset is free from missing values.
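Since the bike dataset has no missing values, every result is False. To see isna() return True, here is a minimal sketch on a small made-up DataFrame (the column names and values are hypothetical and are not taken from the bike dataset). It also shows a common way to turn the boolean result into a count per column.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with deliberately missing entries.
df = pd.DataFrame({
    "season": [1, 2, None, 4],
    "humidity": [0.81, np.nan, 0.44, np.nan],
})

# Boolean DataFrame of the same shape: True marks a missing value.
mask = df.isna()

# True counts as 1 when summed, so this gives the missing count per column.
missing_per_column = mask.sum()

# A single yes/no answer: does the DataFrame contain any missing value at all?
any_missing = mask.any().any()

print(mask)
print(missing_per_column)
print(any_missing)
```

Summing the boolean mask is usually easier to read than scanning the full True/False table, especially on large datasets.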
2. Python notna() function
In contrast to the isna() function, the Pandas notna() function is a quick and easy way to mark those data elements that are not missing.
At times, we come across situations where we need to separate the valid data from the missing values and check it. At this point, the notna() function can be used.
The notna() function is a boolean function that returns True for every element that is not NULL or missing, and False otherwise.
import pandas
data = pandas.read_csv("bike.csv")
data.notna()
For this dataset, the notna() function returns True for every element, because the data does not contain any missing values.
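As a minimal sketch of how notna() can be used to separate out the valid data, the example below keeps only the rows where a column has a value. The DataFrame and its column names are hypothetical, made up for illustration, and are not taken from the bike dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with deliberately missing entries.
df = pd.DataFrame({
    "weekday": [0, 1, 2, 3],
    "windspeed": [0.16, np.nan, 0.25, np.nan],
})

# Boolean Series: True where windspeed has a value.
present = df["windspeed"].notna()

# Use the boolean Series as a filter to keep only the rows with a value.
complete_rows = df[present]

# Number of non-missing entries in the column.
non_missing_count = int(present.sum())

print(complete_rows)
print(non_missing_count)
```

Note that notna() is simply the element-wise opposite of isna(), so either one can be used depending on which reads more naturally.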
With this, we have come to the end of this topic. Feel free to comment below in case you come across any questions.
For more such posts related to Python programming, stay tuned with us.
Till then, Happy Learning!! 🙂