## 1. Pandas cut() Function

The pandas cut() function sorts the values of an array into discrete bins (intervals). It works only on one-dimensional array-like objects.

## 2. Usage of Pandas cut() Function

The cut() function is useful when we have a large amount of numeric data and want to group it into ranges before performing statistical analysis on it.

For example, let’s say we have an array of numbers between 1 and 20. We want to divide them into two bins of (1, 10] and (10, 20] and add labels such as “Lows” and “Highs”. We can easily perform this using the pandas cut() function.

Furthermore, once the values are binned, we can apply functions to the elements of a specific bin or label.
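As a minimal sketch of that idea, the snippet below bins a handful of values into "Lows" and "Highs" and then aggregates each bin with `groupby`. The sample values and the variable names (`nums`, `labels`, `summary`) are made up for illustration and are not part of the original example.

```python
import pandas as pd

# Hypothetical sample values between 1 and 20 (illustration only).
nums = pd.Series([2, 6, 10, 12, 16, 20], name="num")

# Two bins, (1, 10] and (10, 20], labelled "Lows" and "Highs".
labels = pd.cut(nums, bins=[1, 10, 20], labels=["Lows", "Highs"])

# Aggregate the original values per bin label.
# observed=True keeps only the labels that actually occur in the data.
summary = nums.groupby(labels, observed=True).agg(["count", "mean"])
print(summary)
```

Grouping by the result of cut() is one common way to compute per-bin statistics; any other aggregation (`sum`, `min`, `max`, and so on) can be swapped in for `count` and `mean`.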

## 3. Pandas cut() function syntax

The cut() function syntax is:

```
cut(
    x,
    bins,
    right=True,
    labels=None,
    retbins=False,
    precision=3,
    include_lowest=False,
    duplicates="raise",
)
```

- **x** is the input array to be binned. It must be one-dimensional.
- **bins** defines the bin edges for the segmentation.
- **right** indicates whether each bin includes its right edge. The default value is True.
- **labels** specifies the labels for the returned bins.
- **retbins** specifies whether to also return the bin edges.
- **precision** specifies the precision at which to store and display the bin labels.
- **include_lowest** specifies whether the first interval should be left-inclusive.
- **duplicates** specifies what to do if the bin edges are not unique: raise a ValueError or drop the duplicates.
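A few of these parameters are easier to understand by trying them out. The following sketch uses made-up sample data (`values`) and illustrative variable names; it shows `retbins`, `include_lowest`, and `duplicates` in action, and also passes an integer as `bins`, which asks pandas for that many equal-width bins.

```python
import pandas as pd

values = pd.Series([1, 5, 10, 15, 20])  # hypothetical sample data

# An integer for bins requests that many equal-width bins;
# retbins=True additionally returns the computed bin edges.
binned, edges = pd.cut(values, bins=2, retbins=True)
print(edges)

# With explicit edges, the first interval (1, 10] excludes 1
# unless include_lowest=True is passed.
default = pd.cut(values, bins=[1, 10, 20])
inclusive = pd.cut(values, bins=[1, 10, 20], include_lowest=True)
print(default.isna().tolist())    # the value 1 falls outside every bin
print(inclusive.isna().tolist())  # every value is binned

# Repeated edges raise a ValueError by default;
# duplicates="drop" removes the repeated edge instead.
deduped = pd.cut(values, bins=[1, 10, 10, 20], duplicates="drop")
print(deduped.cat.categories)
```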

## 4. Pandas cut() function examples

Let’s look at some examples of the pandas cut() function. I will use NumPy to generate random numbers to populate the `DataFrame` object.

### 4.1) Segment Numbers into Bins

```
import pandas as pd
import numpy as np
df_nums = pd.DataFrame({'num': np.random.randint(1, 100, 10)})
print(df_nums)
df_nums['num_bins'] = pd.cut(x=df_nums['num'], bins=[1, 25, 50, 75, 100])
print(df_nums)
print(df_nums['num_bins'].unique())
```

Output:

```
   num
0   80
1   40
2   25
3    9
4   66
5   13
6   63
7   33
8   20
9   60
   num   num_bins
0   80  (75, 100]
1   40   (25, 50]
2   25    (1, 25]
3    9    (1, 25]
4   66   (50, 75]
5   13    (1, 25]
6   63   (50, 75]
7   33   (25, 50]
8   20    (1, 25]
9   60   (50, 75]
[(75, 100], (25, 50], (1, 25], (50, 75]]
Categories (4, interval[int64]): [(1, 25] < (25, 50] < (50, 75] < (75, 100]]
```

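Notice that 25 is part of the bin (1, 25]. That is because each bin includes its right edge by default. If you don’t want that, pass `right=False` to the cut() function. Also note that a value of exactly 1 would not fall into any bin and would become NaN, because the first interval is open on the left; passing `include_lowest=True` avoids this.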
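To see the effect of `right` on a boundary value, here is a small sketch that bins three of the numbers from the output above with both settings. The variable names are illustrative only.

```python
import pandas as pd

nums = pd.Series([9, 25, 60])  # three values taken from the output above
edges = [1, 25, 50, 75, 100]

# Default (right=True): intervals are closed on the right, e.g. (1, 25].
right_closed = pd.cut(nums, bins=edges)

# right=False: intervals are closed on the left, e.g. [25, 50).
left_closed = pd.cut(nums, bins=edges, right=False)

print(pd.DataFrame({
    "num": nums,
    "right=True": right_closed,
    "right=False": left_closed,
}))
```

With `right=False`, 25 moves from the first bin into the second one, while 9 and 60 stay in the same range.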

### 4.2) Adding Labels to Bins

```
import pandas as pd
import numpy as np
df_nums = pd.DataFrame({'num': np.random.randint(1, 20, 10)})
print(df_nums)
df_nums['nums_labels'] = pd.cut(x=df_nums['num'], bins=[1, 10, 20], labels=['Lows', 'Highs'], right=False)
print(df_nums)
print(df_nums['nums_labels'].unique())
```

Since we want 10 to be part of Highs, we specify **right=False** in the cut() function call. The bins then become [1, 10) and [10, 20).

Output:

```
   num
0    5
1   16
2    6
3   13
4    2
5   10
6   18
7   10
8    2
9   18
   num  nums_labels
0    5         Lows
1   16        Highs
2    6         Lows
3   13        Highs
4    2         Lows
5   10        Highs
6   18        Highs
7   10        Highs
8    2         Lows
9   18        Highs
[Lows, Highs]
Categories (2, object): [Lows < Highs]
```
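As a possible follow-up, the labelled bins can be counted with `value_counts()`, and passing `labels=False` makes cut() return the integer position of each bin instead of a label. The sketch below reuses the numbers from the output above rather than random data, so the results are reproducible; the variable names are illustrative.

```python
import pandas as pd

# Values copied from the output above so the result is reproducible.
nums = pd.Series([5, 16, 6, 13, 2, 10, 18, 10, 2, 18])
labels = pd.cut(nums, bins=[1, 10, 20], labels=["Lows", "Highs"], right=False)

# Count how many values landed in each labelled bin
# (sort=False keeps the bins in their defined order).
counts = labels.value_counts(sort=False)
print(counts)

# labels=False returns the integer position of each bin instead of a label.
codes = pd.cut(nums, bins=[1, 10, 20], labels=False, right=False)
print(codes.tolist())
```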