Pandas melt() function is used to change the DataFrame format from wide to long. It’s used to create a specific format of the DataFrame object where one or more columns work as identifiers. All the remaining columns are treated as values and unpivoted to the row axis and only two columns – variable and value.
Table of Contents
1. Pandas melt() Example
The use of melt() function is more clear when looked through an example.
import pandas as pd
d1 = {"Name": ["Pankaj", "Lisa", "David"], "ID": [1, 2, 3], "Role": ["CEO", "Editor", "Author"]}
df = pd.DataFrame(d1)
print(df)
df_melted = pd.melt(df, id_vars=["ID"], value_vars=["Name", "Role"])
print(df_melted)
Output:
Name ID Role
0 Pankaj 1 CEO
1 Lisa 2 Editor
2 David 3 Author
ID variable value
0 1 Name Pankaj
1 2 Name Lisa
2 3 Name David
3 1 Role CEO
4 2 Role Editor
5 3 Role Author
We can pass the ‘var_name’ and ‘value_name’ parameters to change the column names of ‘variable’ and ‘value’.
df_melted = pd.melt(df, id_vars=["ID"], value_vars=["Name", "Role"], var_name="Attribute", value_name="Value")
2. Multiple Columns as id_vars
Let’s see what happens when we pass multiple columns as the id_vars parameter.
df_melted = pd.melt(df, id_vars=["ID", "Name"], value_vars=["Role"])
print(df_melted)
Output:
ID Name variable value
0 1 Pankaj Role CEO
1 2 Lisa Role Editor
2 3 David Role Author
3. Skipping Columns in melt() Function
It’s not required to use all the rows from the source DataFrame. Let’s skip the “ID” column in the next example.
df_melted = pd.melt(df, id_vars=["Name"], value_vars=["Role"])
print(df_melted)
Output:
Name variable value
0 Pankaj Role CEO
1 Lisa Role Editor
2 David Role Author
4. Unmelting DataFrame using pivot() function
We can use pivot() function to unmelt a DataFrame object and get the original dataframe. The pivot() function ‘index’ parameter value should be same as the ‘id_vars’ value. The ‘columns’ value should be passed as the name of the ‘variable’ column.
import pandas as pd
d1 = {"Name": ["Pankaj", "Lisa", "David"], "ID": [1, 2, 3], "Role": ["CEO", "Editor", "Author"]}
df = pd.DataFrame(d1)
# print(df)
df_melted = pd.melt(df, id_vars=["ID"], value_vars=["Name", "Role"], var_name="Attribute", value_name="Value")
print(df_melted)
# unmelting using pivot()
df_unmelted = df_melted.pivot(index='ID', columns='Attribute')
print(df_unmelted)
Output:
ID Attribute Value
0 1 Name Pankaj
1 2 Name Lisa
2 3 Name David
3 1 Role CEO
4 2 Role Editor
5 3 Role Author
Value
Attribute Name Role
ID
1 Pankaj CEO
2 Lisa Editor
3 David Author
The unmelted DataFrame values are the same as the original DataFrame. But, the columns and index need some minor changes to make it exactly like the original data frame.
df_unmelted = df_unmelted['Value'].reset_index()
df_unmelted.columns.name = None
print(df_unmelted)
Output:
ID Name Role
0 1 Pankaj CEO
1 2 Lisa Editor
2 3 David Author
Reference: pandas melt() API Doc