Klib is an easy-to-use, open-source Python library for data cleaning, preprocessing, and visualization. Visualizations can summarize key insights and data distributions quickly and effectively. In this article, we will focus on data visualization using Klib in Python.
Installing Klib in Python
First things first!
Run the commands below to install the library, then import it in Python. The installation command for a conda environment is also given.
# Install klib with pip
pip install -U klib
# For a conda environment
conda install -c conda-forge klib
# Load the klib library
import klib
Once all the requirements are satisfied, you will see a success message in the terminal. If you can see this on your PC, perfect! Let's move forward and load the data on which we will create some visualizations.
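If you would rather confirm the installation from Python than read the terminal output, here is a minimal sketch using only the standard library. The helper name klib_status is made up for this example and is not part of Klib.

```python
from importlib import metadata, util


def klib_status() -> str:
    """Return a short message saying whether klib can be imported."""
    # find_spec returns None when the package is not on the import path.
    if util.find_spec("klib") is None:
        return "klib is not installed"
    try:
        return f"klib {metadata.version('klib')} is installed"
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata was found.
        return "klib is importable, but no package metadata was found"


print(klib_status())
```

If the message says klib is not installed, rerun the pip or conda command in the same environment your Python interpreter uses.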
Klib – Create Awesome Visualizations in Seconds
As mentioned earlier, Klib lets you visualize your data in seconds. The plots are clean and a pleasure to look at. Excited?!
The Klib library offers 5 functions for describing and visualizing data: cat_plot(), corr_mat(), corr_plot(), dist_plot(), and missingval_plot().
We will be discussing all of these in the following sections.
Load the Data
I will be using the Titanic dataset for this whole tutorial. You can download the dataset here.
import pandas as pd
df = pd.read_csv('titanic.csv')
df.head()
That’s good. Our data is ready to grill!
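Before plotting, it can help to check which columns are numeric and which are categorical, since the plots below treat them differently. The sketch below uses a tiny made-up frame in place of titanic.csv; the column names are assumptions based on the commonly used Titanic file, so adjust them to your copy.

```python
import pandas as pd

# Tiny made-up stand-in for titanic.csv; the column names are assumptions.
df_demo = pd.DataFrame({
    "Survived": [0, 1, 1, 0],
    "Age": [22.0, 38.0, None, 35.0],
    "Sex": ["male", "female", "female", "male"],
    "Embarked": ["S", "C", "S", None],
})

# Split the columns by dtype: numbers versus everything else.
numeric_cols = df_demo.select_dtypes(include="number").columns.tolist()
categorical_cols = df_demo.select_dtypes(exclude="number").columns.tolist()

print(df_demo.shape)
print("numeric:", numeric_cols)
print("categorical:", categorical_cols)
```

With the real data, replace df_demo with the df loaded above.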
1. Klib Categorical Plot
The categorical plot visualizes the number and frequency of the values in each categorical feature of the dataset. Let's see how we can do this.
# Categorical plot
klib.cat_plot(df)
Cool! You can see the categorical plot above. The cat_plot() function will visualize all the categorical data present in the dataset.
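To get a feel for the numbers behind a plot like this, here is a rough pandas-only sketch that counts the values in each categorical column. It is not how Klib builds its plot, and the small frame is made up for illustration.

```python
import pandas as pd

# Made-up categorical data; column names mimic the Titanic file.
df_demo = pd.DataFrame({
    "Sex": ["male", "female", "female", "male", "male"],
    "Embarked": ["S", "C", "S", "S", "Q"],
})

for col in df_demo.columns:
    # value_counts gives the frequency of each distinct value.
    counts = df_demo[col].value_counts()
    print(f"{col}: {df_demo[col].nunique()} unique values")
    print(counts.to_string())
```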
2. Klib – Correlation Matrix
The corr_mat() function creates the correlation matrix of the data in no time. It is simple and easy to use.
# Correlation matrix
klib.corr_mat(df)
This only displays the matrix as a table. To visualize it as a plot, we will use corr_plot().
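For comparison, plain pandas can produce the same kind of table with DataFrame.corr(). The sketch below uses made-up numbers and Pearson correlation; it only illustrates what a correlation matrix contains and says nothing about Klib's internals.

```python
import pandas as pd

# Made-up numeric data; column names mimic the Titanic file.
df_demo = pd.DataFrame({
    "Pclass": [3, 1, 3, 1, 2],
    "Fare": [7.25, 71.28, 7.92, 53.10, 13.00],
    "Survived": [0, 1, 1, 1, 0],
})

# Pairwise Pearson correlation between all numeric columns.
corr = df_demo.corr(method="pearson")
print(corr.round(2))
```

Each cell holds a value between -1 and 1, and the diagonal is always 1 because every column correlates perfectly with itself.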
3. Correlation Plot
In the above section, we created a correlation matrix, and now it's time to visualize it using the corr_plot() function. Besides the entire correlation plot, it can show only the positive or only the negative correlations through the split parameter, as shown below. It is a wonderful feature to use.
# Correlation plots
# Positive correlation plot
klib.corr_plot(df, split="pos")
# Negative correlation plot
klib.corr_plot(df, split="neg")
# Entire correlation plot
klib.corr_plot(df)
These are awesome graphs to watch out for! I hope you love these 🙂
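If you are wondering what splitting into positive and negative correlations means in practice, here is a hedged pandas sketch of the idea: keep only the cells above or below zero and blank out the rest. This is my own illustration on made-up data, and Klib's implementation may differ.

```python
import pandas as pd

# Made-up numeric data; column names mimic the Titanic file.
df_demo = pd.DataFrame({
    "Pclass": [3, 1, 3, 1, 2],
    "Fare": [7.25, 71.28, 7.92, 53.10, 13.00],
    "Survived": [0, 1, 1, 1, 0],
})

corr = df_demo.corr()

# where() keeps cells that satisfy the condition and sets the others to NaN.
positive = corr.where(corr > 0)
negative = corr.where(corr < 0)

print(positive.round(2))
print(negative.round(2))
```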
4. Dist plot
The dist plot, also called the distribution plot, describes how the values of a numeric feature are distributed. Let's see how we can do this using dist_plot().
# Dist plot
klib.dist_plot(df)
The plots include all the required details and look great!
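A distribution plot is easier to read alongside a few summary statistics. As a rough sketch, the pandas snippet below computes some common ones for a made-up Age column; which statistics Klib annotates on its own plots is not something I am asserting here.

```python
import pandas as pd

# Made-up ages for illustration only.
ages = pd.Series([22.0, 38.0, 26.0, 35.0, 54.0, 2.0], name="Age")

summary = {
    "mean": ages.mean(),      # central tendency
    "median": ages.median(),  # robust to outliers
    "std": ages.std(),        # spread (sample standard deviation)
    "skew": ages.skew(),      # asymmetry of the distribution
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```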
5. Missing Value plot
Finally, we have the missing value plot function, which visualizes the missing values in the dataset. Let's give it a try.
# Missing value plot
klib.missingval_plot(df)
This is how it looks. Pretty good!
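If you also want the raw numbers behind a missing value plot, a small pandas sketch like the one below counts the missing entries per column. The data is made up, and the table layout is my own choice rather than anything Klib produces.

```python
import pandas as pd

# Made-up data with gaps; column names mimic the Titanic file.
df_demo = pd.DataFrame({
    "Age": [22.0, None, 26.0, None],
    "Cabin": [None, "C85", None, None],
    "Fare": [7.25, 71.28, 7.92, 53.10],
})

# Count and percentage of missing values per column.
missing = pd.DataFrame({
    "count": df_demo.isna().sum(),
    "percent": df_demo.isna().mean() * 100,
})

# Keep only columns that actually have gaps, worst first.
missing = missing[missing["count"] > 0].sort_values("count", ascending=False)
print(missing)
```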
Klib is an awesome data analysis library with which you can create amazing visualizations like the ones shown above. All it takes is two lines of code.
I hope you love this library as much as I do and you can make use of it in your next assignments. That’s all for now! Happy Python 😛
Further reading: Klib library documentation