Because exploratory data analysis (EDA) matters in almost every data project, developers keep releasing libraries that help us perform it. QuickDA is a recent addition to the list of libraries that automate EDA. In this article, we will focus on how you can use QuickDA for your data exploration.
Given how important EDA is, we typically spend anywhere from minutes to hours on it. You write some code and explore the data in every way you can to find insights that make sense. QuickDA shortens this work: it offers functions that let you explore the data inside and out within a few minutes.
QuickDA in Python
QuickDA is a Python data analysis library for performing EDA on structured datasets. It is easy to use and has a simple syntax.
All you need to do is install QuickDA and import it into Python to get started.
Installation of QuickDA
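First, we have to install the QuickDA library into the Python environment. The pip command below does that (run it in a terminal, or prefix it with ! in a notebook cell), and the imports that follow load the QuickDA modules along with pandas.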
#install required library
pip install quickda

#Explore the data
from quickda.explore_data import *
#data cleaning
from quickda.clean_data import *
#Explore numerical data
from quickda.explore_numeric import *
#Explore categorical data
from quickda.explore_categoric import *
#Data exploration
from quickda.explore_numeric_categoric import *
#Time series data
from quickda.explore_time_series import *
#Import pandas
import pandas as pd
We have installed the library and imported everything we need. Let’s get started.
Load the data
I will be using the Titanic dataset for this purpose. Let’s load the data, and then we can start exploring it.
#load the data
df = pd.read_csv('titanic.csv')
df
Our data is ready to undergo EDA!
As a first step, we will explore the statistical properties of the dataset. Use the explore function for this purpose, as shown below.
#Explore the data
explore(df)
The explore function returns a detailed statistical report of the variables in the data.
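To get a feel for the kind of per-column figures such a report contains, here is a minimal sketch in plain pandas. It is not QuickDA's implementation, and the exact columns of QuickDA's report may differ. The summarize helper and the small sample frame are hypothetical, made up for illustration.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: one row per column with basic statistics."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),   # data type of each column
        "count": df.count(),              # number of non-null values
        "missing": df.isnull().sum(),     # number of null values
        "unique": df.nunique(),           # distinct non-null values
    })


# Tiny made-up sample standing in for the Titanic data
sample = pd.DataFrame({
    "Age": [22.0, 38.0, None, 35.0],
    "Sex": ["male", "female", "female", "male"],
})

summary = summarize(sample)
print(summary)
```

For numeric columns, pandas' own df.describe() adds the mean, standard deviation, and quartiles on top of these counts.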
As mentioned earlier, QuickDA offers many methods to support EDA. You can preprocess the data using the ‘standardize’ method. Let’s see how it works.
#Data preprocessing
df1 = clean(df, method='standardize')
df1
Here, you can see that all the variable names have been changed to lowercase to keep them consistent.
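If you want the same lowercase effect without QuickDA, a rough pandas equivalent might look like the sketch below. It only covers the lowercasing described here; anything else the ‘standardize’ method may do is not reproduced. The lowercase_columns helper and the sample frame are hypothetical.

```python
import pandas as pd


def lowercase_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy with trimmed, lowercase column names."""
    out = df.copy()
    out.columns = [str(col).strip().lower() for col in out.columns]
    return out


# Made-up frame using two of the Titanic column names
sample = pd.DataFrame({"PassengerId": [1, 2], "Pclass": [3, 1]})

renamed = lowercase_columns(sample)
print(list(renamed.columns))
```

Working on a copy leaves the original frame untouched, which matches how the article assigns the cleaned result to a new variable (df1).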
You can also create an EDA report of the data with this library. Use the ‘profile’ method and give the report a name.
#EDA report
explore(df, method = 'profile', report_name = 'Report')
The EDA report will be saved in your working directory as a web page. You can open it anytime to see the detailed EDA report of your data.
It will save you a large chunk of the time you spend on EDA, so you can focus on other things.
Removing duplicate data is very important in EDA, as duplicates can lead to wrong interpretations of the data. QuickDA offers a method, 'duplicates', to eliminate all the duplicate values present in the data.
#Remove duplicates
df3 = clean(df, method = 'duplicates')
df3
The above returned the same input data, as there were no duplicates present in it. If your data has any duplicate values, it will detect and eliminate them for you.
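Since the Titanic data has no duplicates, the effect is easier to see on a small made-up frame. The sketch below uses plain pandas (duplicated and drop_duplicates) rather than QuickDA, so treat it as an illustration of the idea, not of the library's internals.

```python
import pandas as pd

# Made-up data: the third row repeats the first one exactly
sample = pd.DataFrame({
    "Name": ["Allen", "Braund", "Allen"],
    "Fare": [8.05, 7.25, 8.05],
})

# Count fully duplicated rows (every occurrence after the first)
n_dupes = int(sample.duplicated().sum())

# Drop them, keep the first occurrence, and renumber the index
deduped = sample.drop_duplicates().reset_index(drop=True)

print(n_dupes)
print(deduped)
```

Checking the duplicate count first tells you whether the cleaning step changed anything at all.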
Dealing with missing values is just as important for maintaining data quality, and it helps later in modeling. QuickDA offers a method, ‘fillmissing’, to handle this.
#Missing values
df4 = clean(df, method = 'fillmissing')
#Check the missing values now
df4.isnull().any()
PassengerId    False
Survived       False
Pclass         False
Name           False
Sex            False
Age            False
SibSp          False
Parch          False
Ticket         False
Fare           False
Cabin          False
Embarked       False
dtype: bool
This code fills the missing values in your data. In the initial data, about 19% of the values in the Age variable and 38% of the values in the Cabin variable were missing.
Now all the missing data has been filled in by QuickDA. Using this library in your next assignments can therefore be fruitful. Above all, it saves a lot of time and offers quality EDA functions and reports that you can use straight away.
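To check missing-value percentages like the ones quoted above on your own copy of the data, and to see one possible way of filling them, here is a minimal pandas sketch. The fill strategy (median for numeric columns, most frequent value for the rest) is an assumption made for illustration; the article does not say which strategy QuickDA uses. The helper names and sample data are hypothetical.

```python
import pandas as pd


def missing_percent(df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: share of missing values per column, in percent."""
    return (df.isnull().mean() * 100).round(1)


def fill_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: median for numeric columns, mode for the others.

    This strategy is an assumption, not QuickDA's documented behaviour.
    """
    out = df.copy()
    for col in out.columns:
        if not out[col].isnull().any():
            continue  # nothing to fill in this column
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            mode = out[col].mode()
            if not mode.empty:  # a column with only missing values has no mode
                out[col] = out[col].fillna(mode.iloc[0])
    return out


# Made-up sample using two Titanic column names
sample = pd.DataFrame({
    "Age": [22.0, None, 26.0, 35.0],
    "Cabin": ["C85", None, None, "C85"],
})

before = missing_percent(sample)
filled = fill_missing(sample)

print(before)
print(filled.isnull().any())
```

Running missing_percent(df) on the real data before cleaning is a quick way to confirm how much of each column was actually missing.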
Ending Note – QuickDA
Well, we have looked at QuickDA, a handy EDA library in Python that offers methods for many of your EDA needs. As mentioned earlier, it produces quality reports and comes with dedicated functions and methods that make your EDA work easier. I hope you enjoyed this.
And, that’s all for now! Happy Python 🙂
See you soon!
Further reading: Official QuickDA documentation