Model Development: AI End-to-End Series (Part — 3)

INSAID
Dec 21, 2021

By Hiren Rupchandani, Abhinav Jangir, and Ashish Lepcha

In our previous article, we saw how to preprocess our image data using several different techniques. Now it is time to build a model using this preprocessed data. So, let’s get started:

The Dataset

We are going to classify whether a person is wearing a mask, based on an input image of the person’s face. The dataset contains two types of images: people wearing a face mask and people not wearing one. Let’s take a glimpse of images from both classes:

Images from our Dataset — Mask wearers
Images from our Dataset — Non-Mask wearers

Data Pre-processing

Data Augmentation

  • We perform data augmentation by randomly applying various transformations to the training images.
  • This increases the diversity of data available for training models, without actually collecting new data.
  • Some of the common data augmentation transformations are:
    - Random Rotation
    - Horizontal Shift
    - Vertical Shift
    - Random Flipping
    - Shearing
    - Random Zooming
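To make the idea concrete, here is a minimal sketch of two of the transformations listed above (random flipping and horizontal/vertical shifts), written with plain NumPy. The function names `shift_image` and `random_augment` and the `max_shift` parameter are illustrative assumptions, not code from this series; in a real pipeline you would more likely rely on the augmentation utilities of your deep learning framework.

```python
import numpy as np


def shift_image(image, dy, dx):
    """Shift an (H, W, C) image by dy rows and dx columns, padding with zeros."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    # Source and destination windows always have matching sizes.
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = image[src_y, src_x]
    return out


def random_augment(image, rng, max_shift=0.1):
    """Randomly flip an image left-right, then shift it by up to max_shift of its size."""
    if rng.random() < 0.5:
        image = image[:, ::-1]  # random horizontal flip
    h, w = image.shape[:2]
    max_dy, max_dx = int(h * max_shift), int(w * max_shift)
    dy = int(rng.integers(-max_dy, max_dy + 1))  # vertical shift in pixels
    dx = int(rng.integers(-max_dx, max_dx + 1))  # horizontal shift in pixels
    return shift_image(image, dy, dx)


rng = np.random.default_rng(seed=0)
face = rng.random((224, 224, 3))       # placeholder standing in for a real face image
augmented = random_augment(face, rng)
print(augmented.shape)                 # same shape as the input image
```

Calling `random_augment` once per image per epoch means the model rarely sees exactly the same pixels twice, which is where the extra diversity comes from.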

Model Building

  • Our model consists of 3 convolutional layers, followed by max pooling layers and dropout.
  • After the convolutional layers, there is a fully connected layer with 128 units and a ReLU activation function.
  • Here we are using binary cross-entropy loss with the Adam optimizer for our binary classification problem.
  • According to the model summary, there are 6,446,369 trainable parameters.
Representation of our Face Mask Detection Model
  • The training/validation loss and accuracy graphs for the model are as follows:
Loss/Accuracy of our model
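The parameter count reported in the model summary can be checked by hand. The sketch below is only an illustration: the article does not state the filter counts, kernel sizes, or input size, so the values used here (a 224x224 RGB input, 3x3 kernels with "same" padding, 16/32/64 filters, and 2x2 max pooling after each convolution) are assumptions, chosen because they happen to reproduce the reported total. Pooling and dropout layers have no trainable parameters, so they only affect the feature-map size.

```python
def conv2d_params(kernel, in_channels, out_channels):
    """Weights plus biases of a 2D convolution with a square kernel."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(in_units, out_units):
    """Weights plus biases of a fully connected layer."""
    return in_units * out_units + out_units


# Assumed configuration (not stated in the article).
size, in_channels = 224, 3
filters = [16, 32, 64]

total = 0
for out_channels in filters:
    total += conv2d_params(3, in_channels, out_channels)
    size //= 2                      # 2x2 max pooling halves height and width
    in_channels = out_channels

flattened = size * size * in_channels   # 28 * 28 * 64 = 50176 features
total += dense_params(flattened, 128)   # fully connected layer with 128 units
total += dense_params(128, 1)           # single sigmoid output unit

print(total)  # 6446369
```

Almost all of the parameters sit in the first fully connected layer (50,176 x 128 weights), which is typical for small convolutional networks that flatten a large feature map.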

Using Transfer Learning

  • Another way to build a model is to use transfer learning. But what is it? It is like learning to ride a bicycle and then using that experience to learn to ride a motorbike or scooter.
Transfer Learning Interpretation
  • Transfer learning consists of taking features learned on one problem and leveraging them on a new, similar problem.
  • For instance, features from a model that has learned to identify raccoons may be useful to kick-start a model meant to identify tanukis.
Raccoon and Japanese Tanukis
  • Transfer learning is usually done for tasks where your dataset has too little data to train a full-scale model from scratch.
  • The most common incarnation of transfer learning in the context of deep learning is the following workflow:
    - Take layers from a previously trained model.
    - Freeze them, so as to avoid destroying any of the information they contain during future training rounds.
    - Add some new, trainable layers on top of the frozen layers. They will learn to turn the old features into predictions on a new dataset.
    - Train the new layers on your dataset.
  • A last, optional step, is fine-tuning, which consists of unfreezing the entire model you obtained above (or part of it) and re-training it on the new data with a very low learning rate.
  • This can potentially achieve meaningful improvements, by incrementally adapting the pretrained features to the new data.
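The workflow above can be sketched in Keras roughly as follows. This is a minimal illustration, not the code used for this series: the tiny `base` network is a hypothetical stand-in for a real pretrained model, the learning rates are example values, and the `fit` calls are commented out because `train_data` and `val_data` are placeholders that are not defined here.

```python
import tensorflow as tf
from tensorflow.keras import layers

# 1. Take layers from a previously trained model. This small network is only
#    a stand-in (hypothetical); in practice you would load pretrained weights.
base = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(64, 64, 3)),
        layers.Conv2D(8, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
    ],
    name="pretrained_base",
)

# 2. Freeze them so that training does not overwrite what they have learned.
base.trainable = False

# 3. Add new, trainable layers on top of the frozen layers.
inputs = tf.keras.Input(shape=(64, 64, 3))
features = base(inputs, training=False)
outputs = layers.Dense(1, activation="sigmoid")(features)
model = tf.keras.Model(inputs, outputs)

# 4. Train the new layers on your dataset.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
frozen_count = len(model.trainable_weights)  # only the Dense kernel and bias
# model.fit(train_data, validation_data=val_data, epochs=10)

# Optional fine-tuning: unfreeze the base and recompile with a very low
# learning rate so the pretrained features are adapted only incrementally.
base.trainable = True
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
fine_tune_count = len(model.trainable_weights)  # base weights are included again
# model.fit(train_data, validation_data=val_data, epochs=5)
```

Note that the model is recompiled after changing `trainable`; Keras only takes the new setting into account for training once `compile` has been called again.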

MobileNet

  • MobileNetV2 is a convolutional neural network that is 53 layers deep.
  • You can load a pretrained version of the network trained on more than a million images from the ImageNet database.
  • The pretrained network can classify images into 1000 object categories, such as keyboard, mouse, pencil, and many animals.
MobileNetV2
  • As a result, the network has learned rich feature representations for a wide range of images.
  • The network has an image input size of 224-by-224.
  • MobileNets are small, low-latency, low-power models parameterized to meet the resource constraints of a variety of use cases.
  • The architecture delivers high accuracy results while keeping the parameters and mathematical operations as low as possible to bring deep neural networks to mobile devices.
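  • MobileNetV2 is built from inverted residual blocks with linear bottlenecks, which come in two variants:
    - A block with a stride of 1, which has a residual (skip) connection
    - A block with a stride of 2, which downsizes the feature map and has no skip connection
  • Both variants have 3 layers: a 1x1 expansion convolution, a 3x3 depthwise convolution, and a 1x1 linear projection convolution.
Block Types in MobileNetV2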
  • There are two types of Convolution layers in MobileNet V2 architecture:
    - 1x1 Convolution
    - 3x3 Depthwise Convolution
  • A MobileNetV2 network looks like this:
  • We have connected the output of a base MobileNetV2 network to a new model head.
  • This head consists of an average pooling layer, followed by a flattening layer, and finally a fully connected (dense) layer.
  • The output layer uses a sigmoid activation to perform the binary classification.
  • After training this network, we achieve an accuracy of 0.99.
Loss/Accuracy using Transfer Learning
  • If we fine-tune the model by re-training the entire network on our data, we can even achieve a test accuracy of 1.0.
  • You can visualize the model’s training and validation accuracy and loss using the TensorBoard extension.
  • You can observe from the graphs that the model we built from scratch reached a validation accuracy of around 88%, whereas with transfer learning we achieved a validation accuracy of more than 97%.
  • There we have it: a model ready to be deployed, with great accuracy.
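The head described above (average pooling, flattening, a fully connected layer, and a sigmoid output on top of a MobileNetV2 base) could look roughly like this in Keras. Treat it as a sketch under assumptions: the function name `build_mask_classifier`, the 7x7 pooling window, and the 128-unit dense layer are illustrative choices, not values taken from the article.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_mask_classifier(weights="imagenet"):
    """MobileNetV2 base with a small binary-classification head (illustrative)."""
    # Pretrained feature extractor without its 1000-class ImageNet classifier.
    base = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3), include_top=False, weights=weights
    )
    base.trainable = False  # keep the pretrained features frozen

    # Inputs are expected to be scaled to [-1, 1], for example with
    # tf.keras.applications.mobilenet_v2.preprocess_input.
    inputs = tf.keras.Input(shape=(224, 224, 3))
    x = base(inputs, training=False)                  # 7 x 7 x 1280 feature map
    x = layers.AveragePooling2D(pool_size=(7, 7))(x)  # 1 x 1 x 1280
    x = layers.Flatten()(x)
    x = layers.Dense(128, activation="relu")(x)       # assumed layer width
    outputs = layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"]
    )
    return model
```

Calling `build_mask_classifier()` downloads the ImageNet weights on first use; passing `weights=None` builds the same architecture with random weights, which is handy for checking shapes offline but gives up the benefit of transfer learning.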

What’s Next?

In the next article of this series, we will deploy our model using Flask.

Follow us for upcoming articles related to Data Science, Machine Learning, and Artificial Intelligence.

Also, do give us a clap 👏 if you find this article useful; your encouragement inspires us and helps us create more content like this.

Visit us on https://www.insaid.co/
