Hey fellow coder! Today we are going to explore a dataset about Netflix, the very popular streaming platform. The dataset lists the size of the Netflix library and the monthly subscription costs for many of the countries where Netflix is available.
Let’s start off by understanding the dataset.
Netflix Subscription Dataset Description
You can download the dataset from Kaggle. It contains the following attributes:
- Country: A country where Netflix is available.
- Total Library Size: Total number of movies and TV series available in that country.
- No. of TV Shows: Number of TV series available in that country.
- No. of Movies: Number of movies available in that country.
- Cost Per Month – Basic: The monthly price of the “basic package”.
- Cost Per Month – Standard: The monthly price of the “standard package”.
- Cost Per Month – Premium: The monthly price of the “premium package”.
Code Implementation for Netflix Subscription Data Study
Let’s now get into studying the dataset for Netflix subscriptions using Python.
Importing Libraries
import numpy as np
import pandas as pd
import os
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
import pandas_profiling
Loading Dataset
The dataset comes as a CSV file: each line holds one row of data, and the column values within a line are separated by commas. Pandas makes reading this format simple, so we use the pandas module to load the dataset with the code below.
data = pd.read_csv('netflix_subscription.csv')  # hypothetical filename; use the path of the CSV you downloaded from Kaggle
data.head()
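Before plotting anything, it can help to check what was actually loaded. The snippet below is a minimal sketch of such a check: `quick_overview` is a hypothetical helper (not part of pandas), and the small `sample` frame holds made-up values so the snippet runs on its own.

```python
import pandas as pd


def quick_overview(df: pd.DataFrame) -> dict:
    """Return basic facts about a DataFrame: size, column types, missing values."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        "dtypes": df.dtypes.astype(str).to_dict(),
        "missing": df.isna().sum().to_dict(),
    }


# Tiny stand-in frame with made-up values (NOT taken from the Kaggle file).
sample = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "Total Library Size": [5000, 4000, None],
})

overview = quick_overview(sample)
print(overview["rows"], "rows,", overview["columns"], "columns")
print("Missing values per column:", overview["missing"])
```

On the real dataset you would call `quick_overview(data)` right after `pd.read_csv`. A non-zero count under `missing` is a hint to clean that column before visualizing it.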

Visualizing some basic Histograms
We will plot histograms for some of the columns in the dataset using the code below. A histogram shows how the values of a column are distributed: it splits the range of values into bins and counts how many rows (here, countries) fall into each bin.
sns.set_theme()  # seaborn's default look; the 'seaborn' Matplotlib style name was deprecated in Matplotlib 3.6
plt.figure(figsize=(20,7),facecolor='w')
plt.subplot(1,3,1)
plt.hist(data['Total Library Size'],edgecolor='black',color='pink')
plt.xlabel("Size of the Library")
plt.ylabel("Number of Countries")
plt.title("Histogram for Library Size")
plt.subplot(1,3,2)
plt.hist(data['No. of TV Shows'],edgecolor='black',color="lightgreen")
plt.xlabel("No. of TV Shows")
plt.ylabel("Number of Countries")
plt.title("Histogram for No. of TV Shows")
plt.subplot(1,3,3)
plt.hist(data['No. of Movies'],edgecolor='black',color="cyan")
plt.xlabel("No. of Movies")
plt.ylabel("Number of Countries")
plt.title("Histogram for No. of Movies")
plt.show()
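If you want to see the numbers behind the bars, you can compute the bins yourself. Below is a rough sketch using `np.histogram`; `histogram_table` is a hypothetical helper name, and the input list is made up purely for illustration.

```python
import numpy as np
import pandas as pd


def histogram_table(values, bins: int = 10) -> pd.DataFrame:
    """Return the bin edges and counts of an equal-width histogram."""
    clean = pd.Series(values).dropna()  # ignore missing values
    counts, edges = np.histogram(clean, bins=bins)
    return pd.DataFrame({
        "bin_start": edges[:-1],
        "bin_end": edges[1:],
        "count": counts,
    })


# Made-up values for illustration only (not from the Netflix dataset).
sample_sizes = [1, 2, 2, 3, 9, 10]
table = histogram_table(sample_sizes, bins=3)
print(table)
```

Calling `histogram_table(data['Total Library Size'])` should give the same ten equal-width bins that `plt.hist` draws by default, as long as you keep the default `bins` setting in both places.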

Visualizing Monthly Subscription Cost of the countries
We can also visualize the subscription cost of the basic, standard, and premium Netflix packages for all the countries in the dataset. For this tutorial, we will plot the basic monthly cost as a bar chart, a pie chart, and a scatter plot using the code below.
The Plotly charts look great, and they are interactive as well, which is a nice plus!
fig = px.bar(data, x='Country', y='Cost Per Month - Basic ($)', color = "Cost Per Month - Basic ($)",
title="Country vs Cost per Month")
fig.show()
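With many countries on the x-axis, a bar chart is often easier to read when the bars are ordered by price. Here is one possible sketch: `rank_by_cost` is a hypothetical helper, the column name is the one used in the plot above, and the `sample` frame contains made-up prices.

```python
import pandas as pd

BASIC_COL = "Cost Per Month - Basic ($)"  # column name used in the plots


def rank_by_cost(df: pd.DataFrame, cost_col: str = BASIC_COL, top=None) -> pd.DataFrame:
    """Sort rows from most to least expensive; optionally keep only the top N."""
    ranked = df.sort_values(cost_col, ascending=False).reset_index(drop=True)
    return ranked if top is None else ranked.head(top)


# Made-up prices for illustration only (not from the Netflix dataset).
sample = pd.DataFrame({
    "Country": ["A", "B", "C"],
    BASIC_COL: [7.99, 9.99, 3.50],
})

ranked = rank_by_cost(sample, top=2)
print(ranked)
```

You could then pass the sorted frame to Plotly, for example `px.bar(rank_by_cost(data), x='Country', y=BASIC_COL)`, so the most expensive countries appear first.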

fig = px.pie(data, values='Cost Per Month - Basic ($)', names='Country',title = "Cost Per Month - Basic ($)")
fig.update_traces(textposition='inside')
fig.update_layout(uniformtext_minsize=12, uniformtext_mode='hide')
fig.show()

fig = px.scatter(data, x="Country", y="Cost Per Month - Basic ($)",title = "Cost Per Month - Basic ($)")
fig.show()

All Subscription costs in one plot
Next, we can plot the costs of all three subscription types (Basic, Standard, and Premium) for every country in a single figure using the code below.
plt.figure(figsize=(20,10),facecolor='w')
plt.plot(data["Country"],data["Cost Per Month - Basic ($)"],color="maroon",label="Basic Subscription")
plt.plot(data["Country"],data["Cost Per Month - Standard ($)"],color="darkblue",label="Standard Subscription")
plt.plot(data["Country"],data["Cost Per Month - Premium ($)"],color="orchid",label="Premium Subscription")
plt.xticks(rotation=90)
plt.title("All Subscription Costs in Various Countries",size=14)
plt.legend(title = "Subscription Type")
plt.show()
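If you would rather draw this combined chart with Plotly, the three cost columns first need to be stacked into one "long" table with a column that says which plan each price belongs to. The following is a minimal sketch of that reshaping with `DataFrame.melt`; `to_long_format` is a hypothetical helper, the column names are the ones used in the code above, and the `sample` prices are made up.

```python
import pandas as pd

COST_COLS = [
    "Cost Per Month - Basic ($)",
    "Cost Per Month - Standard ($)",
    "Cost Per Month - Premium ($)",
]


def to_long_format(df: pd.DataFrame) -> pd.DataFrame:
    """Stack the three cost columns into one row per (country, plan) pair."""
    long_df = df.melt(
        id_vars="Country",
        value_vars=COST_COLS,
        var_name="Plan",
        value_name="Cost ($)",
    )
    # Shorten "Cost Per Month - Basic ($)" to just "Basic".
    long_df["Plan"] = long_df["Plan"].str.extract(r"- (\w+) \(", expand=False)
    return long_df


# Made-up prices for illustration only (not from the Netflix dataset).
sample = pd.DataFrame({
    "Country": ["A", "B"],
    "Cost Per Month - Basic ($)": [7.99, 3.50],
    "Cost Per Month - Standard ($)": [9.99, 5.50],
    "Cost Per Month - Premium ($)": [12.99, 7.50],
})

long_df = to_long_format(sample)
print(long_df)
```

With the data in this shape, something like `px.line(to_long_format(data), x="Country", y="Cost ($)", color="Plan")` should give one interactive line per subscription type.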

Conclusion
Congratulations! This tutorial covered some basic visualizations of the Netflix subscription dataset available on Kaggle. I hope you learned a lot from it and will be able to apply the same code snippets to other datasets as well.
Thank you for reading!