Hello, readers! In this article, we will be focusing on Pandas Conversion functions, in detail.
So, let us begin!! 🙂
Need of Pandas Conversion functions
Python has a special place in development when it comes to Data Science and Machine Learning! It offers us various modules to work with data and manipulate it.
One such module is the Pandas module.
The Pandas module offers us the DataFrame as a data structure to store and manipulate data. The beauty of it is its structure of rows and columns, which makes it an essential part of data pre-processing.
During data pre-processing and manipulation, we often need to check the values of a variable or change its data type for better cleaning and understanding of the data.
For this, we will be focusing on the below functions:
- Pandas isna() function
- Pandas astype() function
- Pandas copy() function
- Pandas notna() function
Let us begin!
1. Pandas isna() function
The Pandas isna() function is important in data pre-processing and cleaning of data values.
With the isna() function, we can easily detect the presence of missing values. It returns an object of the same shape as the data, holding True wherever a value is missing (NULL/NaN) and False wherever a value is present.
import pandas
info = pandas.read_csv("bike.csv")
info.isna()
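Since bike.csv may not be at hand, here is a minimal sketch of the same idea on a small, made-up DataFrame. The name sample and its values are assumptions for illustration and are not taken from the original dataset. It shows the boolean mask that isna() returns and one common follow-up step: counting the missing values per column.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for bike.csv.
# Both np.nan and None are treated as missing values.
sample = pd.DataFrame({
    "temp": [0.34, np.nan, 0.20],
    "hum": [0.80, 0.69, np.nan],
    "dteday": ["2011-01-01", None, "2011-01-03"],
})

# Element-wise mask: True where a value is missing, False otherwise.
missing_mask = sample.isna()
print(missing_mask)

# True counts as 1, so summing the mask gives the number of
# missing values in each column.
missing_per_column = missing_mask.sum()
print(missing_per_column)
```

Summing the mask is usually easier to read than the full True/False table when the dataset has many rows.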
2. The astype() function for conversion
The Pandas astype() function converts data from one data type to another, for example from int64 to category.
This makes astype() one of the most useful functions during data preparation.
In this example, we first examine the data types of the variables using the dtypes attribute:
info.dtypes
Output: Before data-type conversion
instant         int64
dteday         object
season          int64
yr              int64
mnth            int64
holiday         int64
weekday         int64
workingday      int64
weathersit      int64
temp          float64
atemp         float64
hum           float64
windspeed     float64
casual          int64
registered      int64
cnt             int64
dtype: object
Now, we convert the data type of the variable mnth from int64 to category type.
info.mnth = info.mnth.astype("category")
info.dtypes
Output: After data-type conversion
instant          int64
dteday          object
season           int64
yr               int64
mnth          category
holiday          int64
weekday          int64
workingday       int64
weathersit       int64
temp           float64
atemp          float64
hum            float64
windspeed      float64
casual           int64
registered       int64
cnt              int64
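The same conversion can be tried without bike.csv. The sketch below uses a small, made-up DataFrame (the name sample and its values are assumptions for illustration) and passes a dictionary to astype() so that several columns are converted in one call.

```python
import pandas as pd

# Hypothetical toy data: mnth holds integers, temp holds numbers stored as text.
sample = pd.DataFrame({
    "mnth": [1, 1, 2, 3],
    "temp": ["0.34", "0.36", "0.20", "0.22"],
})

# A dict maps column names to target types. astype() returns a new
# DataFrame and leaves the original one unchanged.
converted = sample.astype({"mnth": "category", "temp": "float64"})

print(sample.dtypes)     # data types before conversion
print(converted.dtypes)  # data types after conversion
```

Because astype() returns a new object, the result has to be assigned back (as done with info.mnth above) if the conversion should replace the original column.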
3. Pandas DataFrame.copy() function
When we make a lot of changes to the data, it is essential to keep a backup of the original data in the current working environment, so that we do not have to load the data all over again.
For this, we have the Pandas copy() function. The copy() function enables us to copy the entire data and store it in a new object in the current environment.
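As a minimal sketch of why the backup matters, the example below compares a plain assignment with copy() on a small, made-up DataFrame. The names original, alias and backup and the values are assumptions for illustration. By default, copy() makes a deep copy (deep=True), so later changes to the original do not reach the backup.

```python
import pandas as pd

# Hypothetical toy data standing in for the original dataset.
original = pd.DataFrame({"cnt": [985, 801, 1349]})

alias = original          # no copy: both names point to the same object
backup = original.copy()  # independent copy (deep=True is the default)

# Change a value in the original data.
original.loc[0, "cnt"] = 0

print(alias.loc[0, "cnt"])   # 0, the alias sees the change
print(backup.loc[0, "cnt"])  # 985, the backup keeps the old value
```

A plain assignment only creates a second name for the same data, which is why the backup has to be made with copy() before the manipulation starts.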
4. Pandas notna() function
In contrast to the isna() function, the Pandas notna() function lets us easily pick out the values that are not NULL or missing.
It checks every value and returns True where a value is present and False where it is missing.
import pandas
info = pandas.read_csv("bike.csv")
info.notna()
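A common use of the notna() mask is to keep only the rows where a value is present. The sketch below does this on a small, made-up DataFrame; the names sample, present_mask and complete_rows are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing temp value.
sample = pd.DataFrame({
    "temp": [0.34, np.nan, 0.20],
    "cnt": [985, 801, 1349],
})

# True where temp is present, False where it is missing.
present_mask = sample["temp"].notna()

# Boolean indexing keeps only the rows where the mask is True.
complete_rows = sample[present_mask]
print(complete_rows)

# notna() is the element-wise opposite of isna().
print(sample.notna().equals(~sample.isna()))
```

The last line is only a sanity check that the two functions mirror each other on this data.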
With this, we have come to the end of this topic. Feel free to comment below if you have any questions.
For more such posts related to Python programming, stay tuned with us.
Till then, Happy learning!! 🙂