QuickDA in Python: Explore Your Data In Seconds


Exploratory data analysis (EDA) matters in almost every data project, so developers keep releasing libraries that help automate it. QuickDA is a recent addition to that list. In this article, we will focus on how you can use QuickDA to explore your data.

Because EDA is so important, we typically spend anywhere from minutes to hours on it. You write some code and explore the data in every way you can think of to find insights that make sense. QuickDA shortens this: it offers a set of functions that let you explore the data inside and out within a few minutes.


QuickDA in Python

QuickDA is a Python data analysis library for performing EDA on structured datasets. It is easy to use and has a simple syntax.

All you need to do is install QuickDA and import it into Python to get started.


Installation of QuickDA

First, install the QuickDA library into your Python environment. The pip command below runs in a terminal (in a notebook, prefix it with !); the import lines that follow are Python.

#install required library (terminal command, not Python)

pip install quickda

#Explore the data
from quickda.explore_data import *

#data cleaning
from quickda.clean_data import *

#Explore numerical data
from quickda.explore_numeric import *

#Explore categorical data
from quickda.explore_categoric import *

#Explore numerical and categorical data together
from quickda.explore_numeric_categoric import *

#Time series data
from quickda.explore_time_series import *

#Import pandas 
import pandas as pd

Cool!

We have installed the library and imported everything we need. Let’s get started.


Load the data

I will be using the Titanic dataset for this walkthrough. Let’s load the data so we can start exploring it.

#load the data

df = pd.read_csv('titanic.csv')

df

Our data is ready to undergo EDA!


Statistical Properties

As a first step, we will explore the statistical properties of the dataset. Use the explore function for this, as shown below.

#Explore the data

explore(df)

The explore function returns a detailed statistical report of the variables in the data.
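For comparison, here is a rough sketch of the kind of per-column summary you could assemble by hand with plain pandas. It is not QuickDA's implementation, and the small DataFrame is made-up sample data (an assumption for illustration), not the Titanic file.

```python
import pandas as pd

# Made-up sample data standing in for a real dataset (assumption for illustration)
sample = pd.DataFrame({
    "Age": [22.0, 38.0, None, 35.0],
    "Sex": ["male", "female", "female", None],
    "Fare": [7.25, 71.28, 7.92, 53.10],
})

# One row per column: data type, non-null count, missing count, unique values
summary = pd.DataFrame({
    "dtype": sample.dtypes.astype(str),
    "count": sample.count(),
    "missing": sample.isnull().sum(),
    "unique": sample.nunique(),
})

print(summary)

# Descriptive statistics (mean, std, quartiles) for the numeric columns only
print(sample.describe())
```

Writing this by hand for every dataset is exactly the repetitive work that a single explore call is meant to replace.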


Data Preprocessing

As mentioned earlier, QuickDA offers several methods to support EDA. You can preprocess the data using the clean function with the ‘standardize’ method. Let’s see how it works.

#Data preprocessing

df1 = clean(df, method='standardize')
df1

Here, you can see that all the variable names have been changed to lowercase to keep the naming consistent.
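To make the idea of standardized names concrete, here is a minimal sketch of a name-cleaning helper. The function standardize_names is hypothetical and is not part of QuickDA. Lowercasing matches what the output above shows, while trimming whitespace and replacing spaces with underscores are extra assumptions. "Ticket Number" is a made-up column name.

```python
def standardize_names(names):
    """Return lowercase, underscore-separated versions of the given column names."""
    cleaned = []
    for name in names:
        # Trim surrounding whitespace, lowercase, and join inner words with "_"
        cleaned.append("_".join(str(name).strip().lower().split()))
    return cleaned


# Hypothetical column names, including one with a stray space and one with two words
columns = ["PassengerId", "Survived", " Pclass", "Ticket Number"]
print(standardize_names(columns))
```

With a DataFrame, the same helper could be applied as df.columns = standardize_names(df.columns).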


EDA Report

Using this library, you can also create an EDA report of the data. Call explore with the ‘profile’ method and pass a name for the report as well.

#EDA report

explore(df, method = 'profile', report_name = 'Report')

The EDA report is saved in your working directory as a web page. You can open it at any time to see the detailed EDA report of your data.

It saves a large chunk of the time you would otherwise spend on EDA, so you can focus on other things.


Remove Duplicates

Removing duplicate data is very important in EDA because duplicates can lead to wrong interpretations of the data. QuickDA offers a ‘duplicates’ method to eliminate the duplicates present in the data.

#Remove duplicates

df3 = clean(df, method = 'duplicates')
df3

The call above returned the same data as the input because there were no duplicates in it. If your data has any duplicates, this method will detect and eliminate them for you.
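Since the Titanic data has no duplicates, the effect is easier to see on a toy example. The sketch below uses plain pandas on made-up rows (an assumption for illustration) to count and drop exact duplicate rows. It shows the general technique and does not claim to mirror QuickDA's internals.

```python
import pandas as pd

# Made-up rows; the second and third rows are exact duplicates
people = pd.DataFrame({
    "Name": ["Ann", "Bob", "Bob", "Cid"],
    "Age": [30, 41, 41, 25],
})

# Count rows that repeat an earlier row
n_duplicates = int(people.duplicated().sum())

# Keep the first occurrence of each row and renumber the index
deduped = people.drop_duplicates().reset_index(drop=True)

print(n_duplicates, len(people), len(deduped))
```

Checking the duplicate count first is a cheap way to confirm whether a cleaning step actually changed anything.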


Missing Values

Dealing with missing values is just as important for maintaining data quality, and it helps later in modeling. QuickDA offers a method, ‘fillmissing’, to handle this.

#Missing values

df4 = clean(df, method = 'fillmissing')


#Check the missing values now

df4.isnull().any()
PassengerId    False
Survived       False
Pclass         False
Name           False
Sex            False
Age            False
SibSp          False
Parch          False
Ticket         False
Fare           False
Cabin          False
Embarked       False
dtype: bool

This code fills the missing values in your data. In the initial data, about 19% of the values in the Age variable and 38% of the values in the Cabin variable were missing.

Now all the missing data has been filled in by QuickDA. Using this library in your next assignments can therefore be fruitful. Above all, it saves a lot of time and offers EDA functions and reports that you can use straight away.
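If you want to check the share of missing values yourself before filling, plain pandas is enough. The sketch below computes the percentage per column on made-up data (an assumption for illustration) and then applies one possible fill strategy: median for numeric columns and the most frequent value for the rest. That strategy is an assumption chosen for the example, not necessarily what QuickDA does internally.

```python
import pandas as pd

# Made-up sample data (assumption for illustration, not the Titanic file)
raw = pd.DataFrame({
    "Age": [22.0, None, 26.0, None, 30.0],
    "Embarked": ["S", "C", None, "S", "S"],
})

# Percentage of missing values per column, before any filling
missing_pct = raw.isnull().mean() * 100

# One possible fill strategy (an assumption, not necessarily QuickDA's):
# median for numeric columns, most frequent value for everything else
filled = raw.copy()
for col in filled.columns:
    if pd.api.types.is_numeric_dtype(filled[col]):
        filled[col] = filled[col].fillna(filled[col].median())
    else:
        filled[col] = filled[col].fillna(filled[col].mode().iloc[0])

print(missing_pct)
print(filled.isnull().any())
```

Running the percentage check on your own data before and after cleaning is a simple way to verify figures like the ones quoted above.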


Ending Note – QuickDA

Well, that was a tour of QuickDA, an EDA library for Python that covers a lot of ground with very little code. As mentioned earlier, it offers ready-made reports along with dedicated functions and methods for exploring and cleaning your data. I hope you enjoyed this.

And, that’s all for now! Happy Python 🙂

See you soon!

Further reading: Official QuickDA documentation
