Keras Deep Learning Tutorial

Filed Under: Machine Learning

What is Keras?

Keras is a high-level neural networks API. It is written in Python and can run on top of Theano, TensorFlow, or CNTK. It was developed around one guiding idea:

Being able to go from idea to result with the least possible delay is key to doing good research.

Keras is a user-friendly, extensible, and modular library that makes prototyping easy and fast. It supports convolutional networks, recurrent networks, and combinations of the two.

Keras was initially developed as part of the research effort of project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System).

Why Keras?

There are countless deep-learning frameworks available today, but there are some areas in which Keras has proved better than the alternatives.

Keras minimizes the number of user actions required for common use cases, and when the user makes an error it provides clear and actionable feedback. This makes Keras easy to learn and use.

When you want to use your Keras models in an application, you need to deploy them on other platforms, which is comparatively easy with Keras. It also supports multiple backends and is portable across them, i.e. you can train a model using one backend and load it with another.

It also has strong built-in support for multiple GPUs and for distributed training.

Keras Tutorial

Installing Keras

We need to install one of the backend engines before installing Keras itself. Let’s install any one of the TensorFlow, Theano, or CNTK modules.
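Before picking a backend to install, it can help to see which ones are already present. The following is a minimal sketch using only the Python standard library. It assumes the backends are importable under the module names `tensorflow`, `theano`, and `cntk`; the helper name `available_backends` is ours for illustration and is not part of Keras.

```python
import importlib.util

# Import names assumed for the three backends mentioned above.
BACKEND_MODULES = ("tensorflow", "theano", "cntk")


def available_backends(candidates=BACKEND_MODULES):
    """Return a dict mapping each candidate module name to True or False."""
    found = {}
    for name in candidates:
        try:
            # find_spec returns None when a top-level module cannot be located.
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat a broken or half-installed package as unavailable.
            found[name] = False
    return found


for name, present in available_backends().items():
    print(name, "installed" if present else "not installed")
```

If none of the three is reported as installed, install one of them first and then continue with the steps that follow.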

Now we are ready to install Keras. We can either install it with pip or clone the repository from git. To install using pip, open the terminal and run the following command:

pip install keras

If the pip installation doesn’t work or you prefer another method, you can clone the git repository using:

git clone

Once cloned, move to the cloned directory and run:

sudo python setup.py install

Using Keras

To use Keras in any of your Python scripts, we simply need to import it using:

import keras

Densely Connected Network

A Sequential model is probably the better choice for building such a network, but since we are just getting started, let’s begin with something really simple:

from keras.layers import Input, Dense
from keras.models import Model
# This returns a tensor
inputs = Input(shape=(784,))
# a layer instance is callable on a tensor, and returns a tensor
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)
# This creates a model that includes
# the Input layer and three Dense layers
model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

Now that you have seen how to create a simple Densely Connected Network model, you can train it with your training data and use it in your deep learning module.
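To make it clearer what the three Dense layers compute, here is a rough NumPy sketch of the same forward pass (784 inputs, two hidden layers of 64 ReLU units, and a 10-way softmax). It is an illustration under our own assumptions, not how Keras implements the layers: the weights are random and untrained, and the helper names `dense`, `relu`, `softmax`, and `forward` are ours.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def softmax(z):
    # Subtract the row maximum for numerical stability.
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def dense(x, weights, bias, activation):
    """A dense layer: activation(x @ W + b)."""
    return activation(x @ weights + bias)


rng = np.random.default_rng(0)
sizes = [784, 64, 64, 10]
# One (weights, bias) pair per Dense layer, randomly initialised.
params = [
    (rng.normal(scale=0.05, size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]


def forward(x):
    hidden = dense(x, *params[0], relu)
    hidden = dense(hidden, *params[1], relu)
    return dense(hidden, *params[2], softmax)


batch = rng.random((5, 784))   # five fake flattened 28x28 images
probs = forward(batch)
print(probs.shape)             # one row of 10 class probabilities per image
```

Each row of `probs` sums to 1, which is what makes the softmax output usable with the categorical cross-entropy loss.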

Sequential Model

The model is the core data structure of Keras. The simplest type of model is a linear stack of layers, called the Sequential model. Let’s get our hands on some code and build one:

# import required modules
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
# Create a model
model = Sequential()
# Stack Layers
model.add(Dense(units=64, activation='relu', input_dim=100))
model.add(Dense(units=10, activation='softmax'))
# Configure learning
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
# Create Numpy arrays with random values, use your training or test data here
x_train = np.random.random((64,100))
y_train = np.random.random((64,10))
x_test = np.random.random((64,100))
y_test = np.random.random((64,10))
# Train using numpy arrays
model.fit(x_train, y_train, epochs=5, batch_size=32)
# evaluate on existing data
loss_and_metrics = model.evaluate(x_test, y_test, batch_size=128)
# Generate predictions on new data
classes = model.predict(x_test, batch_size=128)

Let’s run the program to see the results:
[Image: output of the Sequential model program]
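As a small worked example, the number of trainable parameters in the Sequential model above can be computed by hand: each Dense layer has one weight per input-unit pair plus one bias per unit. The helper `dense_param_count` below is a name of our own, used only for this sketch.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


# Layer widths of the model above: 100 inputs -> 64 units -> 10 units.
layer_sizes = [100, 64, 10]
per_layer = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total = sum(per_layer)
print(per_layer, total)  # prints: [6464, 650] 7114
```

The total should match the parameter count that `model.summary()` reports for this model.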

Let’s look at a few more models and how to create them, starting with a Residual Connection on a Convolution Layer:

import keras
from keras.layers import Conv2D, Input

# input tensor for a 3-channel 256x256 image
x = Input(shape=(256, 256, 3))
# 3x3 conv with 3 output channels (same as input channels)
y = Conv2D(3, (3, 3), padding='same')(x)
# this returns x + y.
z = keras.layers.add([x, y])
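The residual addition only works because `padding='same'` and 3 output channels keep the convolution output the same shape as its input. The following NumPy sketch illustrates this on a small 8x8 image. It is a naive loop-based convolution written for clarity, assuming stride 1, zero padding, odd kernel sizes, and no bias; the function name `conv2d_same` is ours.

```python
import numpy as np


def conv2d_same(image, kernels):
    """Naive 'same'-padded convolution.

    image:   array of shape (H, W, C_in)
    kernels: array of shape (kH, kW, C_in, C_out), kH and kW odd
    returns: array of shape (H, W, C_out)
    """
    k_h, k_w, _, c_out = kernels.shape
    pad_h, pad_w = k_h // 2, k_w // 2
    # Zero-pad height and width so the output keeps the input size.
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w), (0, 0)))
    height, width, _ = image.shape
    out = np.zeros((height, width, c_out))
    for i in range(height):
        for j in range(width):
            patch = padded[i:i + k_h, j:j + k_w, :]
            # Sum over kernel height, kernel width and input channels.
            out[i, j] = np.tensordot(patch, kernels, axes=([0, 1, 2], [0, 1, 2]))
    return out


rng = np.random.default_rng(0)
x = rng.random((8, 8, 3))                        # small 3-channel image
kernels = rng.normal(scale=0.1, size=(3, 3, 3, 3))
y = conv2d_same(x, kernels)                      # same shape as x
z = x + y                                        # the residual connection
print(x.shape, y.shape, z.shape)
```

With the default `'valid'` padding the output would be two pixels smaller in each spatial dimension, and the element-wise addition would fail.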

Shared Vision Model

The Shared Vision Model classifies whether two MNIST digits are the same digit or different digits by reusing the same image-processing module on both inputs. Let’s create one as shown below.

from keras.layers import Conv2D, MaxPooling2D, Input, Dense, Flatten
from keras.models import Model
import keras
# First, define the vision modules
digit_input = Input(shape=(27, 27, 1))
x = Conv2D(64, (3, 3))(digit_input)
x = Conv2D(64, (3, 3))(x)
x = MaxPooling2D((2, 2))(x)
out = Flatten()(x)
vision_model = Model(digit_input, out)
# Then define the tell-digits-apart model
digit_a = Input(shape=(27, 27, 1))
digit_b = Input(shape=(27, 27, 1))
# The vision model will be shared, weights and all
out_a = vision_model(digit_a)
out_b = vision_model(digit_b)
concatenated = keras.layers.concatenate([out_a, out_b])
out = Dense(1, activation='sigmoid')(concatenated)
classification_model = Model([digit_a, digit_b], out)
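The key idea in the listing above is weight sharing: both digits pass through the same encoder, so there is only one set of encoder weights to learn. Here is a deliberately simplified NumPy sketch of that idea. A single random matrix stands in for the convolutional `vision_model`, and the names `encoder_weights`, `classifier_weights`, `encode`, and `same_digit_probability` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# One set of encoder weights, standing in for the shared vision model.
encoder_weights = rng.normal(scale=0.05, size=(27 * 27, 16))
# Classifier weights applied to the two concatenated encodings.
classifier_weights = rng.normal(scale=0.05, size=(2 * 16, 1))


def encode(image):
    """Encode a (27, 27, 1) image with the shared weights and a ReLU."""
    flat = image.reshape(-1)
    return np.maximum(flat @ encoder_weights, 0.0)


def same_digit_probability(image_a, image_b):
    """Concatenate both encodings and apply a sigmoid unit."""
    features = np.concatenate([encode(image_a), encode(image_b)])
    logit = features @ classifier_weights
    return float(1.0 / (1.0 + np.exp(-logit[0])))


image_a = rng.random((27, 27, 1))
image_b = rng.random((27, 27, 1))
print(same_digit_probability(image_a, image_b))
```

Because `encode` is used for both inputs, any update to `encoder_weights` during training would affect how both digits are processed, which is the behaviour the shared `vision_model` provides.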

Visual Question Answering Model

Let’s create a model which can choose the correct one-word answer to a natural-language question about a picture.

This can be done by encoding the question and the image into two separate vectors, concatenating them, and training a logistic regression on top over some vocabulary of potential answers. Let’s try the model:

from keras.layers import Conv2D, MaxPooling2D, Flatten
from keras.layers import Input, LSTM, Embedding, Dense
from keras.models import Model, Sequential
import keras
# First, let's define a vision model using a Sequential model.
# This model will encode an image into a vector.
vision_model = Sequential()
vision_model.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(224, 224, 3)))
vision_model.add(Conv2D(64, (3, 3), activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
vision_model.add(Conv2D(128, (3, 3), activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
vision_model.add(Conv2D(256, (3, 3), activation='relu'))
vision_model.add(Conv2D(256, (3, 3), activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
# Flatten the feature maps so the image is encoded as a single vector
vision_model.add(Flatten())
# Now let's get a tensor with the output of our vision model:
image_input = Input(shape=(224, 224, 3))
encoded_image = vision_model(image_input)
# Next, let's define a language model to encode the question into a vector.
# Each question will be at most 100 words long,
# and we will index words as integers from 1 to 9999.
question_input = Input(shape=(100,), dtype='int32')
embedded_question = Embedding(input_dim=10000, output_dim=256, input_length=100)(question_input)
encoded_question = LSTM(256)(embedded_question)
# Let's concatenate the question vector and the image vector:
merged = keras.layers.concatenate([encoded_question, encoded_image])
# And let's train a logistic regression over 1000 words on top:
output = Dense(1000, activation='softmax')(merged)
# This is our final model:
vqa_model = Model(inputs=[image_input, question_input], outputs=output)
# The next stage would be training this model on actual data.
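It can be useful to trace how the image shape changes through the convolution and pooling stack, since the image encoding has to be a flat vector before it can be concatenated with the 256-dimensional question vector. The sketch below does the arithmetic by hand, assuming stride-1 convolutions, `'valid'` padding where none is specified, and pooling that rounds down. The helper `trace_shapes` and its tuple-based layer description are our own illustration, not a Keras API.

```python
def trace_shapes(input_shape, layers):
    """Return the (height, width, channels) shape after each layer."""
    height, width, channels = input_shape
    shapes = []
    for layer in layers:
        kind = layer[0]
        if kind == "conv":
            _, filters, kernel, padding = layer
            if padding == "valid":
                # Without padding, each side loses (kernel - 1) pixels in total.
                height, width = height - kernel + 1, width - kernel + 1
            elif padding != "same":
                raise ValueError("unknown padding: %r" % (padding,))
            channels = filters
        elif kind == "pool":
            _, size = layer
            height, width = height // size, width // size
        else:
            raise ValueError("unknown layer kind: %r" % (kind,))
        shapes.append((height, width, channels))
    return shapes


# The vision model above, written as (kind, ...) tuples.
vision_layers = [
    ("conv", 64, 3, "same"), ("conv", 64, 3, "valid"), ("pool", 2),
    ("conv", 128, 3, "same"), ("conv", 128, 3, "valid"), ("pool", 2),
    ("conv", 256, 3, "same"), ("conv", 256, 3, "valid"),
    ("conv", 256, 3, "valid"), ("pool", 2),
]

shapes = trace_shapes((224, 224, 3), vision_layers)
final_height, final_width, final_channels = shapes[-1]
flat_size = final_height * final_width * final_channels
merged_size = flat_size + 256  # image vector plus the LSTM question vector
print(shapes[-1], flat_size, merged_size)  # prints: (25, 25, 256) 160000 160256
```

So the final Dense layer sees a 160256-dimensional input, which is large; in practice you might reduce the image encoding further before merging.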

If you want to learn more about Visual Question Answering (VQA), check out this beginner’s guide to VQA.

Training Neural Network

Now that we have seen how to build different models using Keras, let’s put things together and work on a complete example. The following example trains a Neural Network on MNIST data set:

import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop
batch_size = 128
num_classes = 10
epochs = 20
# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dense(512, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
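# Compile model
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])
# Train model
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, y_test))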
score = model.evaluate(x_test, y_test, verbose=0)
# Print the results
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Let’s run this example and wait for results:
[Image: console output of the MNIST training example]
The output shows only the final part. Depending on your machine, the program might take a few minutes to finish.
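To see what the reported test loss and test accuracy mean, here is a NumPy sketch of the three ingredients used in the example: one-hot encoding of the labels (the job of `keras.utils.to_categorical`), categorical cross-entropy, and accuracy. These are simplified re-implementations under our own names (`one_hot`, `categorical_crossentropy`, `accuracy`) and are only meant to show the arithmetic, not to reproduce Keras exactly.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1."""
    return np.eye(num_classes, dtype="float32")[np.asarray(labels)]


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))


def accuracy(y_true, y_pred):
    """Fraction of samples whose most probable class is the true class."""
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1)))


labels = [3, 0, 7]
y_true = one_hot(labels, 10)

# A model that has learned nothing predicts 1/10 for every class.
uniform = np.full((3, 10), 0.1)
print(categorical_crossentropy(y_true, uniform))  # about ln(10), roughly 2.30

# A perfect model puts all probability on the true class.
print(categorical_crossentropy(y_true, y_true), accuracy(y_true, y_true))
```

A test loss well below ln(10) therefore indicates that the trained network is doing much better than guessing uniformly among the ten digits.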


In this tutorial, we saw that Keras is a powerful framework that makes it easy to create prototypes quickly. We also saw how different models can be created using Keras; these models can be used for feature extraction, fine-tuning, and prediction. Finally, we saw how to train a neural network using Keras.

Keras has grown in popularity alongside other frameworks, and it is one of the most popular frameworks on Kaggle.
