Pandas Drop Columns and Rows

1. Pandas drop() Function Syntax

The Pandas DataFrame drop() function deletes columns and rows by label. The drop() function syntax is:


drop(
    self,
    labels=None,
    axis=0,
    index=None,
    columns=None,
    level=None,
    inplace=False,
    errors="raise"
)
  • labels: a single label or a list of labels to remove from the DataFrame. It’s used with ‘axis’ to say whether they are row labels or column names, and it cannot be combined with ‘index’ or ‘columns’.
  • axis: The possible values are {0 or ‘index’, 1 or ‘columns’}, default 0. It’s used with ‘labels’ to specify rows or columns.
  • index: row labels to drop from the DataFrame. Equivalent to passing ‘labels’ with axis=0.
  • columns: column labels to drop from the DataFrame. Equivalent to passing ‘labels’ with axis=1.
  • level: the level from which the labels are removed, in case of a MultiIndex DataFrame.
  • inplace: if True, the source DataFrame is changed and None is returned. The default value is False, in which case the source DataFrame remains unchanged and a new DataFrame object is returned.
  • errors: the possible values are {‘ignore’, ‘raise’}, default ‘raise’. With ‘raise’, a KeyError is raised if the DataFrame doesn’t have a specified label. With ‘ignore’, the error is suppressed and only the existing labels are removed.
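
The level parameter is the only one in the list above that the later examples don't exercise, so here is a minimal sketch of it. The two-level index (the 'Dept' and 'Name' levels and their values) is made up for illustration and is not part of the examples that follow.

```python
import pandas as pd

# Hypothetical two-level row index: (department, employee name)
index = pd.MultiIndex.from_tuples(
    [('Eng', 'Pankaj'), ('Eng', 'Meghna'), ('Docs', 'David')],
    names=['Dept', 'Name'],
)
multi_df = pd.DataFrame({'ID': [1, 2, 3]}, index=index)

# Drop every row whose 'Dept' level equals 'Eng'
by_dept = multi_df.drop(index='Eng', level='Dept')
print(by_dept)

# Drop the row whose 'Name' level equals 'David'
by_name = multi_df.drop(index='David', level='Name')
print(by_name)
```

Without level, a label passed to drop() on a MultiIndex is matched against the outermost level, so naming the level explicitly makes the intent clearer.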

Let’s look at some examples of using the Pandas DataFrame drop() function.

2. Pandas Drop Columns

We can drop a single column as well as multiple columns from the DataFrame.

2.1) Drop Single Column


import pandas as pd

d1 = {'Name': ['Pankaj', 'Meghna', 'David'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'Editor']}

source_df = pd.DataFrame(d1)

print(source_df)

# drop single column
result_df = source_df.drop(columns='ID')
print(result_df)

Output:


     Name  ID    Role
0  Pankaj   1     CEO
1  Meghna   2     CTO
2   David   3  Editor

     Name    Role
0  Pankaj     CEO
1  Meghna     CTO
2   David  Editor

2.2) Drop Multiple Columns


result_df = source_df.drop(columns=['ID', 'Role'])
print(result_df)

Output:


     Name
0  Pankaj
1  Meghna
2   David

3. Pandas Drop Rows

Let’s look at some examples of dropping a single row and multiple rows from the DataFrame object.

3.1) Drop Single Row


import pandas as pd

d1 = {'Name': ['Pankaj', 'Meghna', 'David'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'Editor']}

source_df = pd.DataFrame(d1)

result_df = source_df.drop(index=0)
print(result_df)

Output:


     Name  ID    Role
1  Meghna   2     CTO
2   David   3  Editor

3.2) Drop Multiple Rows


result_df = source_df.drop(index=[1, 2])
print(result_df)

Output:


     Name  ID Role
0  Pankaj   1  CEO

4. Drop DataFrame Columns and Rows in place

We can specify inplace=True to drop columns and rows from the source DataFrame itself. In this case, None is returned from the drop() function call.


import pandas as pd

d1 = {'Name': ['Pankaj', 'Meghna', 'David'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'Editor']}

source_df = pd.DataFrame(d1)

source_df.drop(columns=['ID'], index=[0], inplace=True)
print(source_df)

Output:


     Name    Role
1  Meghna     CTO
2   David  Editor
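
Because drop() returns None when inplace=True, assigning its result back to a variable is a common slip. The short sketch below captures the return value to make that visible; the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Pankaj', 'Meghna', 'David'], 'ID': [1, 2, 3]})

# With inplace=True the DataFrame itself is modified and nothing is returned
returned = df.drop(columns=['ID'], inplace=True)

print(returned)          # None
print(list(df.columns))  # ['Name']
```

Writing df = df.drop(..., inplace=True) would therefore replace the DataFrame with None. Use either inplace=True or reassignment, not both.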

5. Using labels and axis to drop columns and rows

Using ‘labels’ together with ‘axis’ is not the recommended way to delete rows and columns, because the ‘index’ and ‘columns’ parameters state the intent more clearly. It’s still good to know: ‘index’ and ‘columns’ were only added to the drop() function in pandas version 0.21.0, so you may encounter the ‘labels’ and ‘axis’ form in older code.


import pandas as pd

d1 = {'Name': ['Pankaj', 'Meghna', 'David'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'Editor']}

source_df = pd.DataFrame(d1)

# drop rows
result_df = source_df.drop(labels=[0, 1], axis=0)
print(result_df)

# drop columns
result_df = source_df.drop(labels=['ID', 'Role'], axis=1)
print(result_df)

Output:


    Name  ID    Role
2  David   3  Editor

     Name
0  Pankaj
1  Meghna
2   David
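
To check that the two calling styles give the same result, the sketch below compares them with DataFrame.equals(). The data is the same as in the earlier examples; the result variable names are only illustrative.

```python
import pandas as pd

d1 = {'Name': ['Pankaj', 'Meghna', 'David'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'Editor']}
source_df = pd.DataFrame(d1)

# Rows: labels + axis=0 versus index
rows_old = source_df.drop(labels=[0, 1], axis=0)
rows_new = source_df.drop(index=[0, 1])

# Columns: labels + axis=1 versus columns
cols_old = source_df.drop(labels=['ID', 'Role'], axis=1)
cols_new = source_df.drop(columns=['ID', 'Role'])

print(rows_old.equals(rows_new))  # True
print(cols_old.equals(cols_new))  # True
```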

6. Suppressing Errors in Dropping Columns and Rows

If the DataFrame doesn’t contain the given labels, KeyError is raised.


result_df = source_df.drop(columns=['XYZ'])

Output:


KeyError: "['XYZ'] not found in axis"

We can suppress this error by specifying errors='ignore' in the drop() function call.


result_df = source_df.drop(columns=['XYZ'], errors='ignore')
print(result_df)

Output:


     Name  ID    Role
0  Pankaj   1     CEO
1  Meghna   2     CTO
2   David   3  Editor
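
When the list mixes existing and missing labels, errors='ignore' still removes the ones that exist. A small sketch, rebuilding the same DataFrame so it runs on its own:

```python
import pandas as pd

d1 = {'Name': ['Pankaj', 'Meghna', 'David'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'Editor']}
source_df = pd.DataFrame(d1)

# 'ID' exists and is dropped; 'XYZ' does not exist and is silently skipped
result_df = source_df.drop(columns=['ID', 'XYZ'], errors='ignore')
print(list(result_df.columns))  # ['Name', 'Role']
```

Keep in mind that errors='ignore' also hides typos in label names, so it is best reserved for cases where a missing label is genuinely expected.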

7. Conclusion

Pandas DataFrame drop() is a very useful function to drop unwanted columns and rows by label. Two related functions cover other common cleanup tasks:

  1. drop_duplicates() to remove duplicate rows
  2. dropna() to remove rows and columns with missing values
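
As a brief sketch of these two functions, the example below uses a small made-up DataFrame with one duplicated row and one missing value. The data is an assumption for illustration only.

```python
import pandas as pd

# Hypothetical data: row 1 duplicates row 0, and row 3 has a missing Role
df = pd.DataFrame({
    'Name': ['Pankaj', 'Pankaj', 'Meghna', 'David'],
    'Role': ['CEO', 'CEO', 'CTO', None],
})

# Remove duplicate rows, keeping the first occurrence
deduped = df.drop_duplicates()
print(deduped)

# Remove rows that contain a missing value
complete_rows = df.dropna()
print(complete_rows)

# Remove columns that contain a missing value
complete_cols = df.dropna(axis=1)
print(complete_cols)
```

Like drop(), both functions return a new DataFrame by default and leave the original unchanged.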
