Pandas melt() and unmelt using pivot() function

Filed Under: Python
Pandas Melt Function

Pandas melt() function is used to change the DataFrame format from wide to long. It’s used to create a specific format of the DataFrame object where one or more columns work as identifiers. All the remaining columns are treated as values and unpivoted to the row axis and only two columns – variable and value.

1. Pandas melt() Example

The use of melt() function is more clear when looked through an example.


import pandas as pd

d1 = {"Name": ["Pankaj", "Lisa", "David"], "ID": [1, 2, 3], "Role": ["CEO", "Editor", "Author"]}

df = pd.DataFrame(d1)

print(df)

df_melted = pd.melt(df, id_vars=["ID"], value_vars=["Name", "Role"])

print(df_melted)

Output:


     Name  ID    Role
0  Pankaj   1     CEO
1    Lisa   2  Editor
2   David   3  Author

   ID variable   value
0   1      Name  Pankaj
1   2      Name    Lisa
2   3      Name   David
3   1      Role     CEO
4   2      Role  Editor
5   3      Role  Author

We can pass the ‘var_name’ and ‘value_name’ parameters to change the column names of ‘variable’ and ‘value’.


df_melted = pd.melt(df, id_vars=["ID"], value_vars=["Name", "Role"], var_name="Attribute", value_name="Value")
Recommended Read: Python Pandas Tutorial

2. Multiple Columns as id_vars

Let’s see what happens when we pass multiple columns as the id_vars parameter.


df_melted = pd.melt(df, id_vars=["ID", "Name"], value_vars=["Role"])
print(df_melted)

Output:


   ID    Name variable   value
0   1  Pankaj     Role     CEO
1   2    Lisa     Role  Editor
2   3   David     Role  Author

3. Skipping Columns in melt() Function

It’s not required to use all the rows from the source DataFrame. Let’s skip the “ID” column in the next example.


df_melted = pd.melt(df, id_vars=["Name"], value_vars=["Role"])
print(df_melted)

Output:


     Name variable   value
0  Pankaj     Role     CEO
1    Lisa     Role  Editor
2   David     Role  Author

4. Unmelting DataFrame using pivot() function

We can use pivot() function to unmelt a DataFrame object and get the original dataframe. The pivot() function ‘index’ parameter value should be same as the ‘id_vars’ value. The ‘columns’ value should be passed as the name of the ‘variable’ column.


import pandas as pd

d1 = {"Name": ["Pankaj", "Lisa", "David"], "ID": [1, 2, 3], "Role": ["CEO", "Editor", "Author"]}

df = pd.DataFrame(d1)

# print(df)

df_melted = pd.melt(df, id_vars=["ID"], value_vars=["Name", "Role"], var_name="Attribute", value_name="Value")

print(df_melted)

# unmelting using pivot()

df_unmelted = df_melted.pivot(index='ID', columns='Attribute')

print(df_unmelted)

Output:


   ID Attribute   Value
0   1      Name  Pankaj
1   2      Name    Lisa
2   3      Name   David
3   1      Role     CEO
4   2      Role  Editor
5   3      Role  Author

            Value        
Attribute    Name    Role
ID                       
1          Pankaj     CEO
2            Lisa  Editor
3           David  Author

The unmelted DataFrame values are the same as the original DataFrame. But, the columns and index need some minor changes to make it exactly like the original data frame.


df_unmelted = df_unmelted['Value'].reset_index()
df_unmelted.columns.name = None
print(df_unmelted)

Output:


   ID    Name    Role
0   1  Pankaj     CEO
1   2    Lisa  Editor
2   3   David  Author

Reference: pandas melt() API Doc

Leave a Reply

Your email address will not be published. Required fields are marked *

close
Generic selectors
Exact matches only
Search in title
Search in content
Search in posts
Search in pages