5 Simple Python Techniques To Speed Up Data Analysis


Python is one of the most important and widely used data analysis tools. But what if everyone else in the competition uses Python? How can the analysis be sped up? How can you make your data analysis stand out from the crowd and get to the top of the points table?

So, here are some of my favorite tips and tricks, which I have used and gathered into this tutorial. Some may be well known, while others may be new to you, but I am confident they will come in handy the next time you work on a data analysis project.

1. Profiling using Pandas in Python

Profiling is a procedure that allows us to better understand our data, and Pandas Profiling is a Python library that does just that. It is a straightforward and quick way to perform exploratory data analysis (EDA) on a pandas DataFrame.

Normally, the pandas df.describe() and df.info() methods are used as the initial step in the EDA process. However, they only provide a very basic view of the data and are of limited help when dealing with big data sets.
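To make the comparison concrete, here is a minimal sketch of that usual first step. The rows are made-up values in the spirit of the Titanic data, not real records, and the column names are only assumptions for illustration.

```python
import pandas as pd

# Hypothetical rows for illustration only (not taken from the real dataset).
df = pd.DataFrame({
    "age": [22.0, 38.0, None, 35.0],
    "fare": [7.25, 71.28, 8.05, 53.10],
    "sex": ["male", "female", "female", "male"],
})

# info() prints column dtypes and non-null counts.
df.info()

# describe() returns count, mean, std, min, quartiles and max
# for the numeric columns only (here: age and fare).
summary = df.describe()
print(summary)
```

Both calls give a quick numeric overview, but no charts, correlations, or missing-value breakdowns.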

Pandas Profiling, on the other hand, extends the pandas DataFrame with df.profile_report() for rapid data analysis. It presents a lot of information in an interactive HTML report with a single line of code.

Implementation of Profiling 

I’ll be using Google Colab, and the command below installs the profiling library. To show what the profiler can do, we’ll use the classic Titanic dataset.

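# Install the latest pandas-profiling from GitHub (notebook shell command)
!pip install https://github.com/pandas-profiling/pandas-profiling/archive/master.zip
import pandas as pd
import pandas_profiling  # importing it adds df.profile_report() to DataFrames

# Assumes titanic.csv is in the working directory
df = pd.read_csv('titanic.csv')
df.profile_report()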

This is all the code you need to display the data profiling report in a notebook. The report is rather extensive, with charts used as needed.

Pandas Profiling Output

2. Interactive Pandas Plots in Python

Pandas’ DataFrame class includes a built-in .plot() method. However, the visuals produced by this function are not interactive, which makes them less attractive.

On the other hand, the ease with which charts can be plotted with the pandas DataFrame.plot() method cannot be overlooked either.

What if we could use pandas to create interactive plotly-like charts without making big changes to the code? You can do so with the help of the Cufflinks library.

For quick charting, the Cufflinks library combines the power of plotly with the flexibility of pandas. Let’s now look at how to install the library and get it to work in pandas.
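As a rough sketch of what that looks like in a notebook, assuming cufflinks and plotly are installed: the DataFrame below is made up for illustration, and build_demo_frame and interactive_line_chart are hypothetical helper names, not part of either library.

```python
import pandas as pd

# Install once per environment, e.g. in a notebook cell:
#     !pip install cufflinks plotly


def build_demo_frame():
    """Small made-up DataFrame so the sketch needs no external file."""
    return pd.DataFrame(
        {"visits": [120, 135, 160, 150], "signups": [12, 15, 22, 19]},
        index=["Mon", "Tue", "Wed", "Thu"],
    )


def interactive_line_chart(df):
    """Draw df as an interactive plotly chart via Cufflinks (notebook only)."""
    import cufflinks as cf  # importing it adds .iplot() to DataFrames

    cf.go_offline()  # render locally in the notebook, no plotly account needed
    df.iplot(kind="line", title="Visits and signups")


df = build_demo_frame()
# In a notebook cell, call: interactive_line_chart(df)
```

The only change from regular pandas plotting is calling df.iplot() where you would otherwise call df.plot().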

Magic of Python

Magic commands are a collection of useful commands in Jupyter Notebooks intended to handle some of the most common challenges in everyday data analysis. Run %lsmagic to list all available magics.

There are two types of magic commands: line magics, which are preceded by a single % character and work on a single line of input, and cell magics, which use the double %% prefix and operate on several lines of input.
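To illustrate the difference, the comments in the sketch below show the %timeit line magic and the %%timeit cell magic as they would be typed in a notebook. The executable lines use the standard-library timeit module, which is the plain-Python counterpart of those magics, so the snippet also runs outside Jupyter. The function loop_sum is a hypothetical stand-in for the body of the timed cell.

```python
import timeit

# In a notebook cell, the line magic times a single statement:
#     %timeit sum(range(1000))
#
# The cell magic goes on the first line of a cell and times the whole cell:
#     %%timeit
#     total = 0
#     for i in range(1000):
#         total += i


def loop_sum(n=1000):
    """Stand-in for the body of the cell timed by %%timeit above."""
    total = 0
    for i in range(n):
        total += i
    return total


# Plain-Python counterparts using the standard-library timeit module.
runs = 1000
line_seconds = timeit.timeit("sum(range(1000))", number=runs)
cell_seconds = timeit.timeit(loop_sum, number=runs)

print(f"sum(range(1000)): {line_seconds / runs:.2e} s per run")
print(f"explicit loop:    {cell_seconds / runs:.2e} s per run")
```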

3. Making the task of Eliminating Errors in Python Easy

The interactive debugger is also a magic command, but it deserves a category of its own. If a code cell raises an exception, open a new cell and run %debug.

This launches an interactive debugging environment that takes you to the location of the exception. You can also use it to inspect the values of variables assigned in the program and to evaluate expressions. Press q to exit the debugger.
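Outside a notebook, the same post-mortem idea can be sketched with the standard-library pdb module. The names average and run_with_post_mortem are hypothetical, and the interactive flag defaults to False so the snippet runs without waiting for keyboard input.

```python
import pdb


def average(values):
    # Deliberate bug: an empty list makes len(values) zero.
    return sum(values) / len(values)


def run_with_post_mortem(func, *args, interactive=False):
    """Call func; on failure, optionally open the debugger at the failing frame."""
    try:
        return func(*args)
    except Exception as exc:
        if interactive:
            # Same idea as running %debug in the next notebook cell.
            pdb.post_mortem(exc.__traceback__)
        return exc


result = run_with_post_mortem(average, [])
print(type(result).__name__)
```

With interactive=True, the debugger prompt opens inside average; typing p values prints the offending argument, and q quits.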

Implementation of Interactive Debugger

Interactive Debugging Python

4. Printing in Python Made Easier!

If you want to create visually appealing representations of your data structures, pprint is the module to use. It comes in handy when printing dictionaries or JSON data. Let’s look at an example that displays the results using both print and pprint.

Implementation of pprint

import pprint

students = {'S_ID': '101', 'Name': 'Terry', 'Sub_IDs': {'S1': 1308, 'S2': '66D4', 'S3': 2}}

print("NORMAL PRINTING")
print(students)
print()

print("PPRINT FUNCTION")
pprint.pprint(students, width=1)

Output:

NORMAL PRINTING
{'S_ID': '101', 'Name': 'Terry', 'Sub_IDs': {'S1': 1308, 'S2': '66D4', 'S3': 2}}

PPRINT FUNCTION
{'Name': 'Terry',
 'S_ID': '101',
 'Sub_IDs': {'S1': 1308,
             'S2': '66D4',
             'S3': 2}}
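Note that pprint sorts dictionary keys by default, which is why Name comes before S_ID in the output. A few other options are worth knowing. The sketch below reuses the same students dictionary and assumes Python 3.8 or later for sort_dicts.

```python
import pprint

students = {'S_ID': '101', 'Name': 'Terry',
            'Sub_IDs': {'S1': 1308, 'S2': '66D4', 'S3': 2}}

# sort_dicts=False keeps insertion order instead of sorting keys.
pprint.pprint(students, width=1, sort_dicts=False)

# depth=1 collapses anything nested below the top level to {...}.
pprint.pprint(students, depth=1)

# pformat returns the formatted string instead of printing it,
# which is handy for logging.
text = pprint.pformat(students, width=1, sort_dicts=False)
shallow = pprint.pformat(students, depth=1)
```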

5. Automatic Commenting in Python

Ctrl/Cmd + / immediately comments out the selected lines in the cell. Press the combination again to uncomment the same lines.

Conclusion

In this article, I’ve compiled a collection of the most useful tidbits I’ve learned while working with Python and Jupyter Notebooks. I am confident that these easy techniques will be useful to you, and that you will learn something from this article. Until next time, Happy Coding!
