# Data Mapping Using Numpy and Pandas in Python

Data manipulation, or transformation, is a key part of any analysis, because raw data rarely yields insights that make sense on its own. You need to transform raw data into meaningful data: create new variables, bring the data into one consistent form, or rearrange it so it makes sense.

Doing so helps you identify anomalies and extract more insights than you might expect. In this article, we will discuss some pandas and NumPy functions that help with data mapping and replacement in Python.

## 1. Create a Data Set

For data mapping, let’s create a simple dataset using the pandas `DataFrame` constructor. This will be a simple student grade dataset.

It has two columns, one for the student name and another for the student grade.

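```
#Create a dataset

import pandas as pd

#Student names and grades, taken from the output shown below
student = {'Name': ['Mike', 'Julia', 'Trevor', 'Brooks', 'Murphy'],
           'Grade': [3.5, 4.0, 2.1, 4.6, 3.1]}

df = pd.DataFrame(student)

df

```
```
     Name  Grade
0    Mike    3.5
1   Julia    4.0
2  Trevor    2.1
3  Brooks    4.6
4  Murphy    3.1
```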

We now have a simple student dataset. Let’s see how we can map and replace values as part of the data transformation process.

## 2. Replacing Values in the data

So, we have data with 5 rows and 2 columns. Now, we got a message from the class teacher that Murphy actually secured a grade of 5.0 and is the topper in the class. We need to replace the old grade with the new one, as the teacher asked.

So, here we go…

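```
#Replacing data
#Swap Murphy's old grade (3.1) for the new one (5.0)
df['Grade'] = df['Grade'].replace(3.1, 5.0)

#Updated data
df
```
```
     Name  Grade
0    Mike    3.5
1   Julia    4.0
2  Trevor    2.1
3  Brooks    4.6
4  Murphy    5.0
```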

That’s great! We have successfully replaced the old grade with the new one. It is a small example, but it mirrors a real-world situation where a recorded value has to be corrected.
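Replacing by value changes every cell that equals 3.1, which is fine here but could also touch another student with the same grade. A minimal sketch of an alternative, assuming the same two-column layout as the table above, targets Murphy’s row by name with `.loc`. The frame name `demo` is only for illustration.

```python
import pandas as pd

# Illustrative copy of the student data (same values as the first table)
demo = pd.DataFrame({
    'Name': ['Mike', 'Julia', 'Trevor', 'Brooks', 'Murphy'],
    'Grade': [3.5, 4.0, 2.1, 4.6, 3.1],
})

# Select the rows where Name is 'Murphy' and overwrite only the Grade column
demo.loc[demo['Name'] == 'Murphy', 'Grade'] = 5.0

print(demo)
```

Selecting by name keeps the change tied to the student rather than to whatever value happens to be stored.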

### More Examples / Instances

• Now let’s look at some other requirements. Let’s see how we can replace multiple old values with a set of new values.
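```
#Replace multiple values with new set of values
#replace() returns a new DataFrame; df itself keeps the numeric grades
df.replace([3.5, 4.0, 2.1, 4.6, 5.0],
           ['Average', 'Good', 'Needs Improvement', 'Good', 'Excellent'])
```
```
     Name              Grade
0    Mike            Average
1   Julia               Good
2  Trevor  Needs Improvement
3  Brooks               Good
4  Murphy          Excellent
```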

That’s cool!

We have replaced multiple values with a set of new values. As you can see, all 5 values were replaced at once.
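When many values have to be replaced, a dictionary keeps each old value right next to its new one. Here is a minimal sketch using `Series.map`, assuming the grades and labels from the table above. The names `grades` and `grade_labels` are made up for this example.

```python
import pandas as pd

grades = pd.Series([3.5, 4.0, 2.1, 4.6, 5.0], name='Grade')

# Old value -> new value, one pair per entry
grade_labels = {
    3.5: 'Average',
    4.0: 'Good',
    2.1: 'Needs Improvement',
    4.6: 'Good',
    5.0: 'Excellent',
}

# map() looks each grade up in the dictionary;
# a grade missing from the dictionary would become NaN
labelled = grades.map(grade_labels)

print(labelled.tolist())
```

Keep in mind that `map` turns unmatched values into `NaN`, so it suits cases where every value is meant to be translated.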

• Replacing multiple values with a single new value.
```
#Replacing multiple values with a single new value
#Every listed grade is replaced by the same label
df.replace([3.5, 4.0, 2.1, 4.6, 5.0], 'Good')
```
```
     Name  Grade
0    Mike   Good
1   Julia   Good
2  Trevor   Good
3  Brooks   Good
4  Murphy   Good
```

That’s it. As simple as that. This is how you can replace multiple values with a new set of values or with a single new value.

## 3. Data Mapping Using Pandas Cut function

We have discussed replacing values in several scenarios. Now, we will see how we can do the same with the pandas cut function in Python.

In the above examples, we replaced the values manually. Here, we will create bins and assign a label to each grade based on the bin it falls into.

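```
#Pandas cut function

my_bins = [0,2,4,5]
#One label per bin (the list name is assumed)
my_labels = ['Poor','Satisfied','Good']
df['New_Grades'] = pd.cut(df['Grade'], my_bins, labels=my_labels)
df
```
```
     Name  Grade  New_Grades
0    Mike    3.5   Satisfied
1   Julia    4.0   Satisfied
2  Trevor    2.1   Satisfied
3  Brooks    4.6        Good
4  Murphy    5.0        Good
```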

Excellent! We have mapped new grades into the data.

• Define the bins and a label for each bin.
• Map the resulting labels into the data as a new variable.
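It helps to see which values each bin actually covers. The sketch below is only an illustration: it reuses `my_bins`, while the sample grades (including the edge values 0.0 and 2.0) are made up. Called without labels, `pd.cut` returns the interval each value falls into.

```python
import pandas as pd

grades = pd.Series([0.0, 2.0, 2.1, 4.0, 4.6, 5.0])
my_bins = [0, 2, 4, 5]

# Without labels, pd.cut returns the interval each value falls into.
# Intervals are closed on the right by default: (0, 2], (2, 4], (4, 5]
intervals = pd.cut(grades, bins=my_bins)

# 0.0 sits on the open left edge of (0, 2], so it gets NaN
# unless include_lowest=True is passed
with_lowest = pd.cut(grades, bins=my_bins, include_lowest=True)

print(intervals)
print(with_lowest)
```

So a grade of exactly 4.0 belongs to the middle bin, and a grade of exactly 0 needs `include_lowest=True` to be binned at all.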

## 4. Data Mapping using Numpy.digitize Function

This function does the same mapping as pandas cut. The difference is that `numpy.digitize` returns the bin index of each value, so we have to create a dictionary from index to label and map it onto the data.

Here, the bins and the bin names are the same as above.

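```
#Data mapping using numpy

import numpy as np

#Same bins as in the pandas cut example
my_bins = [0,2,4,5]
#Dictionary from bin index to label (the name is assumed)
my_dict = {1:'Poor', 2:'Satisfied', 3:'Good'}
#right=True closes each bin on the right, like pandas cut
df['Numpy.digitize'] = pd.Series(np.digitize(df['Grade'], my_bins, right=True)).map(my_dict)

df
```
```
     Name  Grade  New_Grades  Numpy.digitize
0    Mike    3.5   Satisfied       Satisfied
1   Julia    4.0   Satisfied       Satisfied
2  Trevor    2.1   Satisfied       Satisfied
3  Brooks    4.6        Good            Good
4  Murphy    5.0        Good            Good
```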

You can see that the `numpy.digitize` method produces the same result as the pandas cut function.
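To see what `numpy.digitize` returns before any labels are attached, here is a small sketch. It assumes the bins `[0, 2, 4, 5]` from the pandas cut section. The dictionary name `my_dict` and the `'Out of range'` fallback are only for illustration.

```python
import numpy as np

grades = np.array([3.5, 4.0, 2.1, 4.6, 5.0])
my_bins = [0, 2, 4, 5]

# right=True: bins[i-1] < x <= bins[i], matching pd.cut's right-closed bins
idx_right = np.digitize(grades, my_bins, right=True)

# Default right=False: bins[i-1] <= x < bins[i], so 4.0 and 5.0 move up one bin
idx_left = np.digitize(grades, my_bins)

# Translate the bin indices into labels with a dictionary lookup
my_dict = {1: 'Poor', 2: 'Satisfied', 3: 'Good'}
labels = [my_dict.get(i, 'Out of range') for i in idx_right]

print(idx_right, idx_left, labels)
```

With the default `right=False`, the grade 5.0 lands in index 4, which has no entry in the dictionary, so the choice of `right` matters whenever a value sits exactly on a bin edge.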

## 5. Numpy.select()

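If you use this method for data mapping, you have to define a list of conditions and a matching list of values. For each element, `numpy.select` returns the value paired with the first condition that is true, or a default when none of them is.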

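```
#Numpy.select method

import numpy as np

#One condition per label, checked in order; the first match wins
Numpy_select = [df['Grade'].between(0,2), df['Grade'].between(2,4), df['Grade'].between(4,5)]
values = ['Poor', 'Satisfied', 'Good']
#A string default avoids mixing numbers and text in the result
df['Numpy_select'] = np.select(Numpy_select, values, 'Unknown')

df
```
```
     Name  Grade  New_Grades  Numpy.digitize  Numpy_select
0    Mike    3.5   Satisfied       Satisfied     Satisfied
1   Julia    4.0   Satisfied       Satisfied     Satisfied
2  Trevor    2.1   Satisfied       Satisfied     Satisfied
3  Brooks    4.6        Good            Good          Good
4  Murphy    5.0        Good            Good          Good
```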

The code is largely self-explanatory: each condition in the list is paired with the value at the same position.
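Here is a small standalone sketch of how `numpy.select` resolves its conditions. The sample grades, including the out-of-range 7.0, and the `'Unknown'` default are assumptions made for this example.

```python
import numpy as np

grades = np.array([1.5, 2.0, 3.5, 4.6, 7.0])

# Conditions are checked in order; the first True one decides the label
conditions = [
    (grades >= 0) & (grades <= 2),
    (grades > 2) & (grades <= 4),
    (grades > 4) & (grades <= 5),
]
values = ['Poor', 'Satisfied', 'Good']

# 7.0 matches no condition, so it falls back to the default.
# A string default keeps the whole result as text.
labels = np.select(conditions, values, default='Unknown')

print(labels)
```

A visible default such as `'Unknown'` makes unexpected values easy to spot instead of quietly blending into one of the real labels.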

## 6. User-defined Function

Finally, we are going to create a custom function that does the same job as the pandas cut, numpy.digitize and numpy.select functions.

```
#User defined function

def user_defined(values):
    if values >=0 and values <=2:
        return 'Poor'
    elif values >2 and values <= 4:
        return 'Satisfied'
    else:
        return 'Good'

#Using the custom function
df['user_defined'] = df['Grade'].apply(user_defined)
df
```
```
     Name  Grade  New_Grades  Numpy.digitize  Numpy_select  user_defined
0    Mike    3.5   Satisfied       Satisfied     Satisfied     Satisfied
1   Julia    4.0   Satisfied       Satisfied     Satisfied     Satisfied
2  Trevor    2.1   Satisfied       Satisfied     Satisfied     Satisfied
3  Brooks    4.6        Good            Good          Good          Good
4  Murphy    5.0        Good            Good          Good          Good
```

Impressive!

We got the same output using different methods. You are free to use any of these methods when you are working on data transformation, data mapping or data replacement.
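If you want to confirm that the four approaches really agree, a quick check like the one below can help. It is only a sketch: the grades are the ones from the tables above, and the helper names (`bins`, `names`, `lookup`, `to_label`, `all_agree`) are made up for this example.

```python
import numpy as np
import pandas as pd

grades = pd.Series([3.5, 4.0, 2.1, 4.6, 5.0])
bins = [0, 2, 4, 5]
names = ['Poor', 'Satisfied', 'Good']

# 1. pandas cut (right-closed bins)
by_cut = pd.cut(grades, bins=bins, labels=names).astype(str).tolist()

# 2. numpy digitize plus a dictionary lookup (index 1 -> 'Poor', and so on)
lookup = dict(enumerate(names, start=1))
by_digitize = [lookup[i] for i in np.digitize(grades, bins, right=True)]

# 3. numpy select with explicit conditions
conds = [grades.between(0, 2), grades.between(2, 4), grades.between(4, 5)]
by_select = np.select(conds, names, default='Unknown').tolist()

# 4. plain Python function applied element-wise
def to_label(g):
    if 0 <= g <= 2:
        return 'Poor'
    elif 2 < g <= 4:
        return 'Satisfied'
    return 'Good'

by_apply = grades.apply(to_label).tolist()

# True only if every method produced the same labels
all_agree = by_cut == by_digitize == by_select == by_apply
print(all_agree)
```

The methods can drift apart at bin edges or for values outside the bins, so a check like this is worth repeating on your own data.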

## Ending Note – Data Mapping

Data mapping and transformation are a vital part of any analysis. They turn your raw data into something you can mine for patterns and meaningful insights. I hope you found this tutorial useful and enjoyed playing with the above methods.

That’s all for now! Happy Python 🙂