Pandas DataFrame loc[] to access a group of Rows and Columns


Pandas DataFrame loc[] is a label-based indexer that lets us access a group of rows and columns. We can pass labels as well as boolean values to select the rows and columns.

DataFrame loc[] inputs

Some of the allowed inputs are:

  • A single label – returns the row as a Series object.
  • A list of labels – returns a DataFrame of the selected rows.
  • A slice of labels – returns a DataFrame of the rows from the start label through the stop label, both included (or a Series when a single column is also selected).
  • A boolean array – returns a DataFrame of the rows flagged True; the array length must match the length of the axis being selected.
  • A boolean condition or a callable – a condition such as df['ID'] > 1 produces a boolean mask; a callable receives the DataFrame and must return one of the inputs above.
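
As a quick orientation, the sketch below runs each kind of input against a small DataFrame and prints the type and shape of what comes back. The variable names (single, several, sliced, masked, called) are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['John', 'Jane', 'Mary'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'CFO']}
)

single = df.loc[1]                       # single label -> Series
several = df.loc[[0, 2]]                 # list of labels -> DataFrame
sliced = df.loc[0:1]                     # label slice -> DataFrame, both ends included
masked = df.loc[[True, False, True]]     # boolean array -> rows flagged True
called = df.loc[lambda d: d['ID'] > 1]   # callable -> called with df, result used as a mask

for name, result in [('single', single), ('several', several), ('sliced', sliced),
                     ('masked', masked), ('called', called)]:
    print(name, type(result).__name__, result.shape)
```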

DataFrame loc[] Examples

Let’s look at some examples of using the loc attribute of the DataFrame object. First, we will create a sample DataFrame to work with.

import pandas as pd

d1 = {'Name': ['John', 'Jane', 'Mary'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'CFO']}

df = pd.DataFrame(d1)

print('DataFrame:\n', df)

Output:

DataFrame:
    Name  ID Role
0  John   1  CEO
1  Jane   2  CTO
2  Mary   3  CFO

1. loc[] with a single label

row_1_series = df.loc[1]
print(type(row_1_series))
print(row_1_series)

Output:

<class 'pandas.core.series.Series'>
Name    Jane
ID         2
Role     CTO
Name: 1, dtype: object

2. loc[] with a list of labels

row_0_2_df = df.loc[[0, 2]]
print(type(row_0_2_df))
print(row_0_2_df)

Output:

<class 'pandas.core.frame.DataFrame'>
   Name  ID Role
0  John   1  CEO
2  Mary   3  CFO

3. Getting a Single Value

We can specify both a row label and a column label to get a single value from the DataFrame object.

jane_role = df.loc[1, 'Role']
print(jane_role)  # CTO
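
The same row-and-column form also accepts lists on either axis. The following is a minimal sketch, assuming the sample DataFrame from above; subset and jane are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['John', 'Jane', 'Mary'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'CFO']}
)

# Lists on both axes -> DataFrame restricted to those rows and columns
subset = df.loc[[0, 2], ['Name', 'Role']]
print(subset)

# One row label with a list of columns -> Series indexed by column name
jane = df.loc[1, ['Name', 'Role']]
print(jane)
```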

4. Slice with loc[]

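We can pass a slice of labels too. Unlike regular Python slicing, both the start and stop labels are included in the result. Because a single column is selected here, the result is a Series; without the column label it would be a DataFrame.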

roles = df.loc[0:1, 'Role']
print(roles)

Output:

0    CEO
1    CTO
Name: Role, dtype: object
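
To make the inclusive stop label concrete, here is a small sketch that contrasts a label slice with a positional slice through iloc[]. It uses string row labels ('A', 'B', 'C') so that labels and positions cannot be confused; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['John', 'Jane', 'Mary'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'CFO']},
    index=['A', 'B', 'C'],
)

by_label = df.loc['A':'B', 'Role']   # label slice: stop label 'B' is included
by_position = df.iloc[0:1, 2]        # positional slice: stop position 1 is excluded
rows_only = df.loc['A':'B']          # no column given -> DataFrame

print(by_label.tolist())
print(by_position.tolist())
print(type(rows_only).__name__)
```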

5. loc[] with an array of Boolean values

row_1_df = df.loc[[False, True, False]]
print(row_1_df)

Output:

   Name  ID Role
1  Jane   2  CTO

Since the DataFrame has 3 rows, the boolean array must also have 3 elements. If its length doesn’t match the length of the axis, an IndexError is raised; the exact message depends on the pandas version.
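
A minimal sketch of the length mismatch, assuming a reasonably recent pandas release. The error text is printed rather than hard-coded, since its wording is not guaranteed across versions.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['John', 'Jane', 'Mary'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'CFO']}
)

try:
    df.loc[[True, False]]  # only 2 flags for a 3-row DataFrame
    raised = False
except IndexError as err:
    raised = True
    print('IndexError:', err)
```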

6. loc[] with Conditional Statements

data = df.loc[df['ID'] > 1]
print(data)

Output: A DataFrame of the rows where the ID is greater than 1.

   Name  ID Role
1  Jane   2  CTO
2  Mary   3  CFO
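
Conditions can be combined with & (and), | (or) and ~ (not), and a column label can be added after the mask. The sketch below is one way to write this; note that each comparison needs its own parentheses because of operator precedence.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['John', 'Jane', 'Mary'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'CFO']}
)

# Rows where ID > 1 AND Role is not 'CFO'
mask = (df['ID'] > 1) & (df['Role'] != 'CFO')

# Select only the Name column for the matching rows
names = df.loc[mask, 'Name']
print(names)
```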

7. DataFrame loc[] with Callable Function

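We can also pass a callable, such as a lambda function, to loc[]. The callable receives the DataFrame as its argument and must return a valid indexer, for example a boolean mask.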

id_2_row = df.loc[lambda df1: df1['ID'] == 2]
print(id_2_row)

Output:

   Name  ID Role
1  Jane   2  CTO
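
A callable is most useful in a method chain, where the intermediate DataFrame has no name to refer to. The following sketch assumes a hypothetical derived column, ID_doubled, that exists only inside the chain.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['John', 'Jane', 'Mary'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'CFO']}
)

result = (
    df.assign(ID_doubled=lambda d: d['ID'] * 2)   # add a derived column
      .loc[lambda d: d['ID_doubled'] >= 4,        # filter on the derived column
           ['Name', 'ID_doubled']]                # and keep two columns
)
print(result)
```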

Setting DataFrame Values using loc[] attribute

One of the special features of loc[] is that we can use it to set the DataFrame values. Let’s look at some examples to set DataFrame values using the loc[] attribute.

1. Setting a Single Value

We can specify the row and column labels to set the value of a specific cell.

import pandas as pd

d1 = {'Name': ['John', 'Jane', 'Mary'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'CFO']}

df = pd.DataFrame(d1, index=['A', 'B', 'C'])
print('Original DataFrame:\n', df)

# set a single value
df.loc['B', 'Role'] = 'Editor'
print('Updated DataFrame:\n', df)

Output:

Original DataFrame:
    Name  ID Role
A  John   1  CEO
B  Jane   2  CTO
C  Mary   3  CFO

Updated DataFrame:
    Name  ID    Role
A  John   1     CEO
B  Jane   2  Editor
C  Mary   3     CFO

2. Setting values of an entire row

If we specify only a row label, every value in that row is set to the given value. In the output below, None is stored as NaN in the numeric ID column, so that column is displayed as floats.

df.loc['B'] = None
print('Updated DataFrame with None:\n', df)

Output:

Updated DataFrame with None:
    Name   ID  Role
A  John  1.0   CEO
B  None  NaN  None
C  Mary  3.0   CFO

3. Setting values of an entire column

We can use a slice to select all the rows and specify a column to set its values to the specified one.

df.loc[:, 'Role'] = 'Employee'
print('Updated DataFrame Role to Employee:\n', df)

Output:

Updated DataFrame Role to Employee:
    Name   ID      Role
A  John  1.0  Employee
B  None  NaN  Employee
C  Mary  3.0  Employee
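
Instead of one value for the whole column, we can also assign a list with one value per selected row. This is a minimal sketch on a fresh copy of the sample DataFrame; the replacement values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['John', 'Jane', 'Mary'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'CFO']},
    index=['A', 'B', 'C'],
)

# One value per row for the whole column
df.loc[:, 'Role'] = ['Founder', 'Engineer', 'Accountant']

# One value per selected row for a subset of rows
df.loc[['A', 'C'], 'ID'] = [10, 30]

print(df)
```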

4. Setting Value based on a Condition

df.loc[df['ID'] == 1, 'Role'] = 'CEO'
print(df)

Output:

   Name   ID      Role
A  John  1.0       CEO
B  None  NaN  Employee
C  Mary  3.0  Employee
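
The condition used for setting can be as specific as the ones used for selecting. The sketch below, on a fresh copy of the sample DataFrame, combines isin() with a comparison; the 'Manager' value is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['John', 'Jane', 'Mary'], 'ID': [1, 2, 3], 'Role': ['CEO', 'CTO', 'CFO']},
    index=['A', 'B', 'C'],
)

# Rows whose Name is Jane or Mary AND whose ID is greater than 2
mask = df['Name'].isin(['Jane', 'Mary']) & (df['ID'] > 2)

# Only the matching rows are updated; the rest keep their values
df.loc[mask, 'Role'] = 'Manager'
print(df)
```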

Conclusion

The Pandas DataFrame loc[] attribute is useful because it can both read and set specific values. Its support for boolean conditions and callables such as lambda expressions makes it a flexible tool for selecting and updating data.
