In Python, we can use the **numpy.where()** function to select elements from a numpy array, based on a condition.

Not only that, but we can perform some operations on those elements if the condition is satisfied.

Let’s look at how we can use this function, using some illustrative examples!

## Syntax of Python numpy.where()

This function accepts an array-like `condition` (e.g. a NumPy array of booleans, or of integers, where nonzero values count as true).

It returns a new NumPy array whose elements are selected based on this **condition**, which is treated as an array of boolean values.

For example, `condition` can take the value `array([[True, True, True]])`, which is a boolean array. (NumPy supports the `bool` dtype directly; numeric arrays also work as conditions, with nonzero values treated as `True`.)

If `condition` is `array([[True, True, False]])` and our array is `a = np.array([[1, 2, 3]])`, then applying the condition to the array (`a[condition]`) gives the one-dimensional array `array([1, 2])`.

```
import numpy as np
a = np.arange(10)
print(a[a <= 2]) # Will only capture elements <= 2 and ignore others
```

**Output**

```
[0 1 2]
```

**NOTE**: The condition is written here as the expression **a <= 2**, which NumPy evaluates to a boolean array. This is the recommended format for the condition, as writing out the boolean array by hand is very tedious.
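To make the effect of plain boolean indexing on a multi-dimensional array concrete, here is a minimal sketch. The array values and the names `mask` and `selected` are chosen purely for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])
mask = a % 2 == 0        # 2-D boolean array with the same shape as a

selected = a[mask]       # boolean indexing keeps only the True positions
print(mask.shape)        # (2, 3)
print(selected)          # [2 4 6]
print(selected.shape)    # (3,)
```

The selected values come back as a flat, one-dimensional array, so the original 2-D layout is gone.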

But what if we want to preserve the dimension of the result, and not lose out on elements from our original array? We can use **numpy.where()** for this.

```
numpy.where(condition [, x, y])
```

We have two more parameters, `x` and `y`. What are those?

Basically, what this says is that if `condition` holds true for some element in our array, the new array will take the corresponding element from `x`.

Otherwise, if it’s false, the element from `y` will be taken.

With that, our final output array will be an array with elements from `x` wherever `condition` is `True`, and elements from `y` wherever `condition` is `False`.
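As a small sketch of this element-wise selection, the example below uses three hand-written 1-D arrays (the values are illustrative only) and compares the result against a plain Python loop that makes the same choice one element at a time.

```python
import numpy as np

condition = np.array([True, False, True, False])
x = np.array([1, 2, 3, 4])
y = np.array([10, 20, 30, 40])

# Take from x where condition is True, otherwise from y
result = np.where(condition, x, y)
print(result)  # [ 1 20  3 40]

# Equivalent element-by-element choice for 1-D inputs
manual = [xv if c else yv for c, xv, yv in zip(condition, x, y)]
print(np.array_equal(result, manual))  # True
```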

Note that although `x` and `y` are optional, if you specify `x`, you **MUST** also specify `y`. This is because, **in this case**, the output array has the same shape as the (broadcast) input, so NumPy needs a value for every position, whether the condition is true or false there.
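The following sketch shows what happens when only one of the two is passed, assuming a reasonably recent NumPy version that raises a `ValueError` here. It also shows that `x` or `y` can be a plain scalar. The name `clipped` is illustrative.

```python
import numpy as np

a = np.arange(5)

try:
    np.where(a > 2, a)   # x given without y
except ValueError as exc:
    print("ValueError:", exc)

# Supplying both works; a scalar is allowed for x or y
clipped = np.where(a > 2, a, -1)
print(clipped)  # [-1 -1 -1  3  4]
```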

**NOTE**: The same logic applies to both single and multi-dimensional arrays. In both cases, we select based on the condition. Also remember that `x`, `y` and `condition` are broadcast together to a common shape.

Now, let us look at some examples, to understand this function properly.

## Using Python numpy.where()

Suppose we want to keep only the positive elements of a NumPy array and set all negative elements to 0. Let’s write the code using `numpy.where()`.

### 1. Replace Elements with numpy.where()

We’ll use a 2 dimensional random array here, and only output the positive elements.

```
import numpy as np
# Random initialization of a (2D array)
a = np.random.randn(2, 3)
print(a)
# b will be all elements of a whenever the condition holds true (i.e only positive elements)
# Otherwise, set it as 0
b = np.where(a > 0, a, 0)
print(b)
```

**Possible Output**

```
[[-1.06455975 0.94589166 -1.94987123]
[-1.72083344 -0.69813711 1.05448464]]
[[0. 0.94589166 0. ]
[0. 0. 1.05448464]]
```

As you can see, only the positive elements are now retained!

### 2. Using numpy.where() with only a condition

There may be some confusion regarding the above code, as some of you may think that the more intuitive way would be to simply write the condition like this:

```
import numpy as np
a = np.random.randn(2, 3)
b = np.where(a > 0)
print(b)
```

If you now try running the above code, with this change, you’ll get an output like this:

```
(array([0, 1]), array([2, 1]))
```

If you observe closely, `b` is now a **tuple** of NumPy arrays, one array per dimension of `a`. Together they give the locations of the positive elements: the first array holds the row indices and the second holds the column indices, so the output above refers to positions `(0, 2)` and `(1, 1)`. Why does this happen?

Whenever we provide just a condition, this function is equivalent to `np.asarray(condition).nonzero()`.

In our example, `np.asarray(a > 0)` will return a boolean array after applying the condition, and `np.nonzero(arr_like)` will return the indices of the non-zero (here, `True`) elements of `arr_like`.
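A short sketch of the condition-only form follows. It uses a fixed array instead of a random one so that the result is reproducible; the values and the names `rows`, `cols` and `same` are illustrative.

```python
import numpy as np

a = np.array([[-1.0, 0.5, -2.0],
              [-0.3, -0.7, 1.5]])

# One index array per dimension: row indices, then column indices
rows, cols = np.where(a > 0)
print(rows, cols)  # [0 1] [1 2]

# Same result as calling nonzero on the boolean array
same = np.nonzero(a > 0)
print(all(np.array_equal(p, q) for p, q in zip((rows, cols), same)))  # True

# The index arrays can be used to pull out the matching values
print(a[rows, cols])  # [0.5 1.5]

# Pairing them up gives (row, column) coordinates
print(list(zip(rows.tolist(), cols.tolist())))  # [(0, 1), (1, 2)]
```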

So, we’ll now look at a simpler example, that shows us how flexible we can be with numpy!

```
import numpy as np
a = np.arange(10)
b = np.where(a < 5, a, a * 10)
print(a)
print(b)
```

**Output**

```
[0 1 2 3 4 5 6 7 8 9]
[ 0 1 2 3 4 50 60 70 80 90]
```

Here, the condition is `a < 5`, which evaluates to the boolean array `[True True True True True False False False False False]`, `x` is the array `a`, and `y` is the array `a * 10`. So, we choose from `a` only if `a < 5`, and from `a * 10` if `a >= 5`.

So, this transforms all elements >= 5 by multiplying them by 10. This is indeed what we get!
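For comparison, the same transformation can be sketched with boolean-mask assignment on a copy of the array. This is an alternative formulation rather than anything `numpy.where()` requires, and the name `b_masked` is illustrative.

```python
import numpy as np

a = np.arange(10)

# numpy.where() builds a new array in one step
b = np.where(a < 5, a, a * 10)

# Alternative: copy the array, then update only the masked positions
b_masked = a.copy()
b_masked[a >= 5] *= 10

print(b_masked)                     # [ 0  1  2  3  4 50 60 70 80 90]
print(np.array_equal(b, b_masked))  # True
```

The `numpy.where()` version leaves `a` untouched without an explicit copy, which is often the more convenient choice.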

### 3. Broadcasting with numpy.where()

If we provide all of the `condition`, `x`, and `y` arrays, NumPy will broadcast them together.

```
import numpy as np
a = np.arange(12).reshape(3, 4)
b = np.arange(4).reshape(1, 4)
print(a)
print(b)
# Broadcasts (a < 5, a, and b * 10)
# of shape (3, 4), (3, 4) and (1, 4)
c = np.where(a < 5, a, b * 10)
print(c)
```

**Output**

```
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
[[0 1 2 3]]
[[ 0 1 2 3]
[ 4 10 20 30]
[ 0 10 20 30]]
```

Again, the output is selected based on the condition, but here `b` is broadcast to the shape of `a`. (Its first dimension has only one element, so it can be stretched to match without errors.)

So, `b` effectively becomes `[[0 1 2 3] [0 1 2 3] [0 1 2 3]]`, and we can select elements from this broadcast array as well.

The shape of the output is therefore the same as the shape of `a`.
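To see the broadcast version of `b` explicitly, one option is `np.broadcast_to`, as in the sketch below. The names `b_full` and `c_explicit` are illustrative, and the explicit broadcast is only for inspection, since `numpy.where()` already does this internally.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b = np.arange(4).reshape(1, 4)

# Read-only view of b stretched to the shape of a
b_full = np.broadcast_to(b, a.shape)
print(b_full)

# Passing the (1, 4) array or its explicit (3, 4) view gives the same result
c = np.where(a < 5, a, b * 10)
c_explicit = np.where(a < 5, a, b_full * 10)
print(np.array_equal(c, c_explicit))  # True
print(c.shape)                        # (3, 4)
```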

## Conclusion

In this article, we learned how to use the Python **numpy.where()** function to select elements from arrays based on a condition array.

## References

- SciPy Documentation on Python numpy.where() function