In Python, we can use the **numpy.where()** function to select elements from a numpy array, based on a condition.

Not only that, but we can perform some operations on those elements if the condition is satisfied.

Let’s look at how we can use this function, using some illustrative examples!

## Syntax of Python numpy.where()

This function accepts an array-like object (e.g. a NumPy array of integers or booleans).

It returns a new numpy array, after filtering based on a **condition**, which is an array-like of boolean values.

For example, `condition` can take the value `array([[True, True, True]])`, which is a boolean array. (NumPy supports the `bool` dtype directly, and numeric arrays can also be cast to `bool`.)

For example, if `condition` is `array([[True, True, False]])` and our array is `a = np.array([[1, 2, 3]])`, then applying the condition as a mask (`a[condition]`) gives the array `[1 2]`. Note that the result is flattened to one dimension.

```
import numpy as np
a = np.arange(10)
print(a[a <= 2]) # Will only capture elements <= 2 and ignore others
```

**Output**

```
[0 1 2]
```

**NOTE**: The condition can be written directly as **a <= 2**. This is the recommended format for the condition array, as writing it out by hand as a boolean array is very tedious.

But what if we want to preserve the dimension of the result, and not lose out on elements from our original array? We can use **numpy.where()** for this.

```
numpy.where(condition [, x, y])
```

We have two more parameters, `x` and `y`. What are those?

Basically, what this says is that if `condition` holds true for some element in our array, the new array will choose the corresponding element from `x`.

Otherwise, if it’s false, the element from `y` will be taken.

With that, our final output array will have elements from `x` wherever `condition` is `True`, and elements from `y` wherever `condition` is `False`.
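As a minimal sketch of this selection rule, the snippet below uses three small hand-written arrays (the values are made up purely for illustration) so that it is easy to see which source each output element comes from.

```python
import numpy as np

# Illustrative values only: a mask and two candidate arrays of the same shape
condition = np.array([True, False, True, False])
x = np.array([1, 2, 3, 4])
y = np.array([10, 20, 30, 40])

# Take from x where condition is True, from y where it is False
result = np.where(condition, x, y)
print(result.tolist())  # [1, 20, 3, 40]
```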

Note that although `x` and `y` are optional, if you specify `x`, you **MUST** also specify `y`. This is because, **in this case**, every position in the output needs a value, whether the condition there is true or false. The output shape is the broadcast shape of `condition`, `x` and `y`.
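The short sketch below illustrates this rule on a small made-up array. It assumes a NumPy version that rejects the one-value call with a `ValueError`, which is the behaviour I am aware of. The exact message may differ between versions.

```python
import numpy as np

a = np.array([-1, 2, -3])

# Passing x without y is rejected
try:
    np.where(a > 0, a)
    error_raised = False
except ValueError as exc:
    error_raised = True
    print("ValueError:", exc)

# Passing both x and y works as expected
both = np.where(a > 0, a, 0)
print(both.tolist())  # [0, 2, 0]
```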

**NOTE**: The same logic applies to both single and multi-dimensional arrays. In both cases, we filter based on the condition. Also remember that `x`, `y` and `condition` are broadcast together.
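One simple case of this broadcasting is passing plain scalars for `x` and `y`. The sketch below uses a small illustrative 2D array and shows that the scalars are stretched to the shape of the condition.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)  # [[0 1 2], [3 4 5]]

# Scalars 1 and -1 are broadcast against the (2, 3) condition
signs = np.where(m % 2 == 0, 1, -1)
print(signs.tolist())  # [[1, -1, 1], [-1, 1, -1]]
print(signs.shape)     # (2, 3)
```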

Now, let us look at some examples, to understand this function properly.

## Using Python numpy.where()

Suppose we want to keep only the positive elements of a numpy array and set all negative elements to 0. Let’s write the code using `numpy.where()`.

### 1. Replace Elements with numpy.where()

We’ll use a 2-dimensional random array here, keeping the positive elements and replacing everything else with 0.

```
import numpy as np
# Random initialization of a (2D array)
a = np.random.randn(2, 3)
print(a)
# b will be all elements of a whenever the condition holds true (i.e only positive elements)
# Otherwise, set it as 0
b = np.where(a > 0, a, 0)
print(b)
```

**Possible Output**

```
[[-1.06455975 0.94589166 -1.94987123]
[-1.72083344 -0.69813711 1.05448464]]
[[0. 0.94589166 0. ]
[0. 0. 1.05448464]]
```

As you can see, only the positive elements are now retained!
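Because the random example prints different numbers on every run, here is a reproducible sketch of the same replacement using a fixed array (the values are chosen only for illustration). It also shows that the output keeps the shape of `a` and that `a` itself is left unchanged.

```python
import numpy as np

# Fixed values instead of random ones, so the result is reproducible
a = np.array([[-1.5, 0.9, -2.0],
              [-0.7, -0.3, 1.1]])

b = np.where(a > 0, a, 0)

print(b.tolist())          # [[0.0, 0.9, 0.0], [0.0, 0.0, 1.1]]
print(b.shape == a.shape)  # True: the shape is preserved
print(a[0, 0])             # -1.5: the original array is not modified
```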

### 2. Using numpy.where() with only a condition

There may be some confusion regarding the above code, as some of you may think that the more intuitive way would be to simply write the condition like this:

```
import numpy as np
a = np.random.randn(2, 3)
b = np.where(a > 0)
print(b)
```

If you now try running the above code, with this change, you’ll get an output like this:

```
(array([0, 1]), array([2, 1]))
```

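If you observe closely, `b` is now a **tuple** of numpy arrays, one per dimension of `a`. The first array holds the row indices and the second holds the column indices of the positive elements, so in the output above the positive elements sit at positions `(0, 2)` and `(1, 1)`. What does this mean?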

Whenever we provide just a condition, this function is equivalent to `np.asarray(condition).nonzero()`.

In our example, `np.asarray(a > 0)` will return a boolean array after applying the condition, and `np.nonzero(arr_like)` will return the indices of the non-zero elements of `arr_like`.
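To make the index tuple more concrete, the sketch below uses a fixed array (values chosen for illustration), compares the condition-only call with `np.nonzero()`, and then uses the returned indices to pull the matching elements back out of the array.

```python
import numpy as np

a = np.array([[-1.0, 2.0, -3.0],
              [4.0, -5.0, 6.0]])

# Condition-only form: one index array per dimension
rows, cols = np.where(a > 0)
print(rows.tolist(), cols.tolist())  # [0, 1, 1] [1, 0, 2]

# Same indices as calling nonzero on the boolean array
same = np.nonzero(a > 0)

# The index arrays can be used to fetch the selected elements
positives = a[rows, cols]
print(positives.tolist())  # [2.0, 4.0, 6.0]
```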

So, we’ll now look at a simpler example, that shows us how flexible we can be with numpy!

```
import numpy as np
a = np.arange(10)
b = np.where(a < 5, a, a * 10)
print(a)
print(b)
```

**Output**

```
[0 1 2 3 4 5 6 7 8 9]
[ 0 1 2 3 4 50 60 70 80 90]
```

Here, the condition is `a < 5`, which will be the boolean array `[True True True True True False False False False False]`, `x` is the array `a`, and `y` is the array `a * 10`. So, we choose from `a` only if `a < 5`, and from `a * 10` if `a >= 5`.

So, this transforms all elements >= 5, by multiplication with 10. This is what we get indeed!
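One way to check this reading is to write the same per-element choice as an ordinary Python loop and compare the results. This is only a sketch for intuition: the loop version is much slower on large arrays.

```python
import numpy as np

a = np.arange(10)

# Element-by-element version of the same rule
manual = np.array([v if v < 5 else v * 10 for v in a])

# Vectorized version using np.where
vectorized = np.where(a < 5, a, a * 10)

print(np.array_equal(manual, vectorized))  # True
```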

### 3. Broadcasting with numpy.where()

If we provide all of the `condition`, `x`, and `y` arrays, numpy will broadcast them together.

```
import numpy as np
a = np.arange(12).reshape(3, 4)
b = np.arange(4).reshape(1, 4)
print(a)
print(b)
# Broadcasts (a < 5, a, and b * 10)
# of shape (3, 4), (3, 4) and (1, 4)
c = np.where(a < 5, a, b * 10)
print(c)
```

**Output**

```
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
[[0 1 2 3]]
[[ 0 1 2 3]
[ 4 10 20 30]
[ 0 10 20 30]]
```

Again, the output is selected based on the condition, but here `b * 10` is broadcast to the shape of `a`. (Its first dimension has only one element, so there will be no errors during broadcasting.)

So, `b` effectively becomes `[[0 1 2 3] [0 1 2 3] [0 1 2 3]]`, and we can select elements even from this broadcast array.

So the shape of the output is the same as the shape of `a`.
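To see the broadcast step explicitly, the sketch below expands `b * 10` by hand with `np.broadcast_to()` and checks that passing the expanded array gives the same result as letting `np.where()` broadcast on its own. This only illustrates the effect and makes no claim about how NumPy implements it internally.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b = np.arange(4).reshape(1, 4)

# Manually expand the (1, 4) array to the (3, 4) shape of a
b_expanded = np.broadcast_to(b * 10, a.shape)
print(b_expanded.shape)  # (3, 4)

c_explicit = np.where(a < 5, a, b_expanded)
c_implicit = np.where(a < 5, a, b * 10)

print(np.array_equal(c_explicit, c_implicit))  # True
```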

## Conclusion

In this article, we learned how to use the Python **numpy.where()** function to select elements from arrays based on a condition array.

## References

- SciPy Documentation on Python numpy.where() function