How To Apply Functions To Columns In Python?


Pandas is a widely used, robust Python library for data manipulation and analysis. It offers hundreds of functions that make the analysis lifecycle both easier and more efficient.

We often update existing features or create new ones from existing data to get the results we need. Today, let’s look at how we can apply functions to columns (features).

Apply Functions to Columns in Python

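We will discuss four ways to apply functions to columns: the pandas apply function, functions with multiple conditions, ratios across columns, and NumPy vectorization.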


Load the data

Before we move forward, we need some data to work with. We will use the housing dataset for this tutorial. You can download it from the Kaggle website.

#loading dataset

import pandas as pd
data = pd.read_csv('housing.csv')

We are good to go!
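Before going further, it can help to confirm that the columns used in this tutorial are actually present. Here is a minimal sketch of that check. It assumes the file has price, bedrooms and bathrooms columns, and it uses a tiny made-up CSV (the csv_text variable is hypothetical) in place of housing.csv so it runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-in for housing.csv: three made-up rows so the sketch
# runs without the real file. Swap in pd.read_csv('housing.csv') in practice.
csv_text = """price,bedrooms,bathrooms
13300000,4,2
9100000,3,1
4200000,2,1
"""
data = pd.read_csv(io.StringIO(csv_text))

# Confirm the columns used later in the tutorial are present.
required = ["price", "bedrooms", "bathrooms"]
missing = [col for col in required if col not in data.columns]

print(data.head())
print("Missing columns:", missing)
```

If the list of missing columns is not empty, check the column names in your copy of the dataset before applying any functions.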

1. Pandas Apply function

The apply function in pandas calls a given function on every value of a particular column.

In our data, we have a column named price, which represents the price of the house based on many factors.

Now, let’s apply a function to those price values to convert them to millions, which makes them easier to read.

#Pandas apply

def measure_update(num):
    return num/1000000

data['price_in_millions'] = data['price'].apply(measure_update)


Comparing the data before and after applying our custom function, you can see that it converts the price to millions. For example, 13300000 becomes 13.3 million.

You can write any custom function your analysis needs, which saves time on repetitive transformations.
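To see the conversion end to end, here is a small self-contained sketch. The prices are made up for illustration (only 13300000 comes from the example above). It also shows that, for simple arithmetic like this, dividing the whole column gives the same numbers as apply.

```python
import pandas as pd

# Made-up prices for illustration; only 13300000 appears in the text above.
data = pd.DataFrame({"price": [13300000, 9100000, 4200000]})


def measure_update(num):
    # Convert a raw price to millions.
    return num / 1000000


# Element-wise: apply calls measure_update once per value in the column.
data["price_in_millions"] = data["price"].apply(measure_update)

# Whole-column arithmetic produces the same numbers in a single expression.
data["price_in_millions_direct"] = data["price"] / 1000000

print(data)
```

For a simple division like this, the direct column expression is usually the shorter option. apply is more useful when the logic does not fit into a single arithmetic expression.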

2. Complex Functions

Simple functions will not always do the job. To keep your code short and still get the results you want, I suggest using functions with multiple conditions.

Let’s walk through an example.

#multiple conditions

def price_range(price_in_millions):
    # 10 million and above
    if price_in_millions >= 10.0:
        return "High"
    # above 5 million but below 10 million
    elif price_in_millions > 5:
        return "Affordable"
    # everything else (5 million or less)
    else:
        return "Cheap"

data['price_range'] = data['price_in_millions'].apply(price_range)


The function takes each value in the price_in_millions column as input and assigns it to a group based on the conditions we set.

After applying the function, it’s good to cross-check the results by looking at just the relevant columns, which are easy to select using pandas.
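The grouping logic is easiest to verify on a few values that hit every branch, including the boundaries 5 and 10. The sketch below uses made-up numbers and assumes that anything at or below 5 million should count as "Cheap". It also shows one possible alternative, np.select, which expresses the same rules on the whole column at once.

```python
import numpy as np
import pandas as pd

# Made-up values chosen to hit each branch, including the boundaries 5 and 10.
data = pd.DataFrame({"price_in_millions": [13.3, 10.0, 7.5, 5.0, 4.2]})


def price_range(price_in_millions):
    if price_in_millions >= 10:
        return "High"
    elif price_in_millions > 5:
        return "Affordable"
    else:
        return "Cheap"


data["price_range"] = data["price_in_millions"].apply(price_range)

# Same rules with np.select: the first matching condition wins, and rows
# that match no condition get the default value.
p = data["price_in_millions"]
data["price_range_select"] = np.select(
    [p >= 10, p > 5],
    ["High", "Affordable"],
    default="Cheap",
)

print(data)
```

Both columns should contain identical labels. Which form reads better is largely a matter of taste and of how many conditions you have.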
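One way to do that cross-check is sketched below. The rows are made up, and the column names assume you followed the earlier steps. In your own session you would skip the DataFrame construction and use your existing data.

```python
import pandas as pd

# Made-up rows; in the tutorial these columns come from the earlier steps.
data = pd.DataFrame({
    "price": [13300000, 7500000, 4200000],
    "price_in_millions": [13.3, 7.5, 4.2],
    "price_range": ["High", "Affordable", "Cheap"],
})

# Select only the columns involved in the transformation and look at a few rows.
check = data[["price", "price_in_millions", "price_range"]]
print(check.head())

# Count how many rows landed in each group.
counts = check["price_range"].value_counts()
print(counts)
```

A quick look at value_counts can reveal a condition that never fires or a group that swallows almost every row.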

3. Ratios

The ratio of two columns can itself be a useful new feature for our analysis. So, let’s see how we can create a ratio column from our data using pandas.


def demo_ratio(bedrooms, bathrooms):
  return bedrooms / bathrooms 

# axis=1 passes each row to the lambda; the argument is named row so it does not shadow data
data['ratio'] = data[['bedrooms', 'bathrooms']].apply(lambda row: demo_ratio(row['bedrooms'], row['bathrooms']), axis=1)


That’s cool. Now we have the bedrooms-per-bathroom ratio. Based on our results, we have 1 bathroom for every 2 bedrooms, which corresponds to a ratio of 2.0.
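Here is a minimal runnable sketch of the row-wise approach. The bedroom and bathroom counts are made up for illustration.

```python
import pandas as pd

# Made-up counts for illustration.
data = pd.DataFrame({"bedrooms": [4, 3, 2], "bathrooms": [2, 1, 1]})


def demo_ratio(bedrooms, bathrooms):
    return bedrooms / bathrooms


# axis=1 hands each row to the lambda, so both columns are available at once.
data["ratio"] = data.apply(
    lambda row: demo_ratio(row["bedrooms"], row["bathrooms"]), axis=1
)

print(data)
```

With axis=1 the function receives one row at a time, which is what lets it combine values from several columns.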

4. Numpy Magic

Yes, you read it right. NumPy’s magic will never get old. You created a ratio attribute in the previous section.

Now, let’s see how we can get the same output using NumPy vectorization. When it comes to numbers, NumPy is unstoppable.


import numpy as np

# np.vectorize wraps demo_ratio so it can be called on whole columns
data['do_ratio'] = np.vectorize(demo_ratio)(data['bedrooms'], data['bathrooms'])


That’s nasty from Numpy 😛

We got the same output (the ratio) using the NumPy vectorization method. Now, you will believe in NumPy’s magic.
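One caveat worth knowing: the NumPy documentation describes np.vectorize as a convenience wrapper that is essentially a loop, not a performance tool. For plain arithmetic like a ratio, dividing the columns directly gives the same result. The sketch below compares the two on made-up data.

```python
import numpy as np
import pandas as pd

# Made-up counts for illustration.
data = pd.DataFrame({"bedrooms": [4, 3, 2], "bathrooms": [2, 1, 1]})


def demo_ratio(bedrooms, bathrooms):
    return bedrooms / bathrooms


# np.vectorize wraps the scalar function so it accepts whole columns.
data["do_ratio"] = np.vectorize(demo_ratio)(data["bedrooms"], data["bathrooms"])

# Direct column arithmetic: pandas and NumPy divide element by element.
data["direct_ratio"] = data["bedrooms"] / data["bathrooms"]

print(data)
```

np.vectorize remains handy when the function has branches or other logic that cannot be written as a single array expression.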

Apply Functions To Columns – Conclusion

It’s very easy to apply functions to columns using both pandas and NumPy, as shown here. These methods will come in handy whenever you work on data manipulation and analysis. I hope you learned something new. That’s all for now. Happy Python!!!

