Hello readers, today we will look at what vectorization in Python actually is. If you ask me, I would say vectorization is an art: the art of avoiding explicit for loops in your code. You can use it in any kind of coding work, but it matters most in deep learning, where you work with tons of data and your code needs to run as fast as possible. You would normally reach for loops there, right? If so, here is some good news: you do not need explicit loops to process your data. Instead, you can vectorize the computation over your data points for faster execution. Let's see how this works.
Vectorization in Python
We are going to understand vectorization in the context of logistic regression. Vectorization speeds up code by avoiding explicit for loops. This not only makes execution faster but also reduces errors and produces neat code that is easier to read.
NumPy is a Python library widely used for numerical computation, and it will help us with vectorization. We will compare two approaches:
- Non-vectorized approach
- Vectorized approach
Let's look at the math behind both, as well as their implementation.
In logistic regression, we need to compute $z = w^T x + b$, where $w$ and $x$ are column vectors with one entry per feature. In other words, both $w$ and $x$ are $n_x$-dimensional vectors. For a non-vectorized implementation, the code looks like the following.
z = 0
for i in range(n_x):
    z += w[i] * x[i]
z += b
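Written out as math, the loop above is just the dot product expanded term by term. This is only a restatement of the formula, with the index $i$ running from 1 to $n_x$ (the Python loop counts from 0 instead):

```latex
z = w^{T} x + b = \sum_{i=1}^{n_x} w_i x_i + b
```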
In the vectorized approach, we do not use a for loop in our code. Instead, we use the NumPy library to operate on whole arrays at once. The vectorized computation for logistic regression is shown below.
z = np.dot(w, x) + b
In this vectorized approach, NumPy computes the dot product (inner product): it multiplies the vectors element-wise and sums the results. In the line above, the term np.dot(w, x) is the $w^T x$ discussed earlier. It is computed directly, without an explicit Python-level for loop, which makes the code run faster and look cleaner.
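To check that the two approaches really compute the same thing, here is a minimal sketch. The values of w, x, and b are made up purely for illustration; they do not come from any real model.

```python
import numpy as np

# Hypothetical weights, features, and bias, chosen only for illustration
w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 2.0, 3.0])
b = 0.5

# Non-vectorized: accumulate the products one element at a time
z_loop = 0.0
for i in range(len(w)):
    z_loop += w[i] * x[i]
z_loop += b

# Vectorized: multiply element-wise, sum, then add the bias
z_sum = np.sum(w * x) + b

# Vectorized: np.dot does the multiply-and-sum in a single call
z_dot = np.dot(w, x) + b

print(z_loop, z_sum, z_dot)
```

All three values come out as 5.0 for these inputs (0.5 - 2.0 + 6.0 + 0.5), so the loop and both vectorized forms agree.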
Implementing Vectorization in Python
The sections above covered the vectorized and non-vectorized approaches along with some of the math intuition. Now that you know the idea behind them, let's see how they differ in practice and how long each takes to execute.
# Let's check NumPy first
import numpy as np

a = np.array([1, 2, 3, 4])
print(a)
Output – [1 2 3 4]
NumPy is ready to go.
# Execution time for both vectorized and non-vectorized versions
import time

a = np.random.rand(10000000)
b = np.random.rand(10000000)

# Vectorized: a single call to np.dot
x = time.time()
c = np.dot(a, b)
y = time.time()
print(c)
print("Vectorized version: " + str(1000 * (y - x)) + "ms")

# Non-vectorized: explicit Python loop over every element
c = 0
x = time.time()
for i in range(10000000):
    c += a[i] * b[i]
y = time.time()
print(c)
print("Non-vectorized version: " + str(1000 * (y - x)) + "ms")
Vectorized version: 10.077238082885742ms
Non-vectorized version: 6419.343948364258ms
Fantastic, you can see the results. The vectorized version is roughly 640 times faster than the non-vectorized version in this run (about 10 ms versus about 6,419 ms); the exact numbers will vary from machine to machine. That is the tremendous power of vectorization.
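A single time.time() measurement can be noisy, so if you want steadier numbers, one option is the standard-library timeit module. The sketch below is only an illustration: the array size, the seed, the repeat counts, and the helper names dot_loop and dot_vectorized are all assumptions made for this example, and the array is much smaller than the one above to keep the run short.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)  # seed picked arbitrarily for repeatability
n = 100_000                     # assumed size, far smaller than 10,000,000
a = rng.random(n)
b = rng.random(n)

def dot_loop(u, v):
    """Dot product with an explicit Python loop (hypothetical helper)."""
    total = 0.0
    for i in range(len(u)):
        total += u[i] * v[i]
    return total

def dot_vectorized(u, v):
    """Dot product delegated to NumPy (hypothetical helper)."""
    return np.dot(u, v)

# Take the minimum over a few repeats; it is less affected by background noise
loop_s = min(timeit.repeat(lambda: dot_loop(a, b), number=1, repeat=3))
# Run the fast version 10 times per repeat and divide to get a per-call time
vec_s = min(timeit.repeat(lambda: dot_vectorized(a, b), number=10, repeat=3)) / 10

print(f"Non-vectorized version: {1000 * loop_s:.3f} ms")
print(f"Vectorized version:     {1000 * vec_s:.3f} ms")
```

The ratio you see will depend on your hardware and NumPy build, so treat the output as a rough comparison and not a benchmark.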
Wrapping Up – Vectorization in Python
Vectorization in Python is the practice of avoiding explicit loops in your code to reduce execution time. It is particularly useful in deep learning, where you deal with unstructured data such as images and audio, because it can cut the training time of the model.
It also makes code much cleaner, so anybody can easily understand what is going on. Finally, as the saying goes, less code = fewer errors, and vectorization makes that possible. That's all for now. Don't forget to use vectorization in your next coding assignment. Happy Python!!!