Klib in Python – Speed Up Your Data Visualization


Klib is an easy-to-use, open-source Python library for data cleaning, preprocessing, and visualization. Visualizations can quickly and effectively summarize key insights and data distributions, so in this article we will focus on data visualization using Klib in Python.


Installing Klib in Python

First things first!

Run the commands below to install the library and load it into Python. The install command for a conda environment is also given.

#Install klib (run in a terminal)

pip install -U klib

#For conda environment (run in a terminal)

conda install -c conda-forge klib

#Load the Klib library (run in Python)

import klib
[Image: Klib installation output]

Once all the requirements are satisfied, you will see a success message like the one shown above. If you see it on your machine, perfect! Let’s move on and load the data that we will visualize.
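If you would rather confirm the installation from inside Python than rely on the installer output, a minimal sketch like the following can help. It uses only the standard library, and the helper name klib_status is made up for this illustration.

```python
import importlib.util
from importlib import metadata


def klib_status() -> str:
    """Return a short message describing whether klib can be imported."""
    # find_spec returns None when no top-level module named "klib" is found.
    if importlib.util.find_spec("klib") is None:
        return "klib is not installed"
    try:
        # Look up the installed distribution's version string.
        return f"klib {metadata.version('klib')} is installed"
    except metadata.PackageNotFoundError:
        return "klib is importable, but no package metadata was found"


print(klib_status())
```

If the message says that klib is not installed, re-run the install command in the same environment that your Python interpreter or notebook uses.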


Klib – Create Awesome Visualizations in Seconds

With Klib, you can visualize your data in seconds, and the plots look polished right out of the box. Excited?!

The Klib library offers five functions for describing and visualizing data:

  • cat_plot()
  • corr_mat()
  • corr_plot()
  • dist_plot()
  • missingval_plot()

We will be discussing all of these in the following sections.


Load the Data

I will be using the Titanic dataset for this whole tutorial. You can download the dataset here.

import pandas as pd

df = pd.read_csv('titanic.csv')
df.head()
[Image: first rows of the Titanic dataset]

That’s good. Our data is ready to grill!


1. Klib Categorical Plot

The categorical plot visualizes the categorical columns in the dataset, showing the number and frequency of the values in each of them. Let’s see how we can do this.

#Categorical plot

klib.cat_plot(df)
[Image: Categorical plot]

Cool! You can see the categorical plot above. The cat_plot() function will visualize all the categorical data present in the dataset.
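To get a feel for what a categorical plot summarizes, the sketch below reproduces the underlying counts with plain pandas. It assumes that "categorical" means non-numeric columns and that the three most frequent values per column are the interesting ones. The small DataFrame is a made-up stand-in for the Titanic data, and this is not Klib’s own implementation.

```python
import pandas as pd

# Made-up stand-in for a few Titanic columns (illustrative values only).
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "male", "male"],
    "Embarked": ["S", "C", "S", "S", "Q"],
    "Age": [22.0, 38.0, 26.0, 35.0, None],
})

# Treat every non-numeric column as categorical (an assumption of this sketch).
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

# Count how often each value occurs and keep the three most frequent per column.
top_values = {col: df[col].value_counts().head(3) for col in categorical_cols}

for col, counts in top_values.items():
    print(col, counts.to_dict())
```

Numeric columns such as Age are skipped here, which is why a categorical plot shows fewer panels than the dataset has columns.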


2. Klib – Correlation Matrix

The corr_mat() function creates the correlation matrix of the data with a single call. It is a simple and easy-to-use way to inspect correlations.

#Correlation matrix

klib.corr_mat(df)
[Image: Correlation matrix]

This displays the matrix alone, as a table. To visualize it as a plot, we will use corr_plot().


3. Correlation Plot

In the section above, we created a correlation matrix; now it’s time to visualize it using the corr_plot() function. Besides the entire correlation plot, it can show only the positive or only the negative correlations through the split parameter, as shown below.

#Correlation plots

#Positive correlation plot
klib.corr_plot(df, split="pos")

#Negative correlation plot 
klib.corr_plot(df, split="neg")

#Entire correlation plot 
klib.corr_plot(df)
[Image: Positive correlation plot]
[Image: Negative correlation plot]
[Image: Entire correlation plot]

These are awesome graphs to look at! I hope you love these 🙂
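The idea behind splitting correlations into positive and negative parts can be sketched with plain pandas. The example below assumes Pearson correlation and simply masks the cells with the unwanted sign. The data values are made up, and this is an illustration of the concept, not Klib’s internal code.

```python
import pandas as pd

# Made-up numeric stand-in for a few Titanic columns.
df = pd.DataFrame({
    "Pclass": [3, 1, 3, 1, 3],
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05],
    "Survived": [0, 1, 1, 1, 0],
})

corr = df.corr()  # Pearson correlation is the pandas default

# Keep only cells with the wanted sign; everything else becomes NaN.
positive = corr.where(corr > 0)
negative = corr.where(corr < 0)

print(positive.round(2))
print(negative.round(2))
```

In this toy data, Pclass and Fare are negatively correlated, so that pair appears only in the negative view, while Fare and Survived appear only in the positive one.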


4. Dist Plot

The dist plot, also called the distribution plot, shows how the values of each numeric feature are distributed. Let’s see how we can do this using dist_plot().

#Dist plot

klib.dist_plot(df)
[Image: Klib dist plot 1]
[Image: Klib dist plot 2]
[Image: Klib dist plot 3]

The plots include all the required details and look great!
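A distribution plot is easier to read with a few summary numbers at hand. The sketch below computes the kind of statistics such a plot typically conveys (mean, median, standard deviation, skewness) with plain pandas. The data is made up, and the choice of statistics is an assumption of this example rather than a description of what Klib annotates.

```python
import pandas as pd

# Made-up numeric stand-in for two Titanic columns.
df = pd.DataFrame({
    "Age": [22.0, 38.0, 26.0, 35.0, 54.0],
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05],
})

# One row per statistic, one column per feature.
summary = df.agg(["mean", "median", "std", "skew"])

print(summary.round(2))
```

A mean well above the median, as for Fare here, hints at a right-skewed distribution with a few large values.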


5. Missing Value Plot

Finally, the missingval_plot() function visualizes where values are missing in the dataset. Let’s give it a try.

#Missing value plot

klib.missingval_plot(df)
[Image: Missing value plot]

This is how it looks. Pretty good!
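If you also want the numbers behind a missing value plot, a small pandas sketch like this one counts the missing entries per column. The DataFrame is a made-up stand-in for the Titanic data, and the table layout is just one possible choice.

```python
import pandas as pd

# Made-up stand-in with some missing entries (None becomes NaN/None in pandas).
df = pd.DataFrame({
    "Age": [22.0, None, 26.0, 35.0, 54.0],
    "Cabin": [None, "C85", None, "C123", None],
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05],
})

# Count and percentage of missing values per column, worst column first.
missing = pd.DataFrame({
    "count": df.isna().sum(),
    "percent": df.isna().mean() * 100,
}).sort_values("count", ascending=False)

print(missing)
```

Columns with a high percentage of missing values, such as Cabin in this toy data, are the ones to look at first when deciding whether to drop or impute.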


Conclusion

Klib is an awesome data analysis library with which you can create amazing visualizations like the ones shown above. All it takes is two lines of code: one import and one function call.

I hope you love this library as much as I do and that you can make use of it in your next assignments. That’s all for now! Happy Python 😛

Further reading: Klib library documentation
