Model Deployment using GCP: AI End-to-End Series (Part 5)
- ML models require deployment to a production environment to provide business value.
- But the unfortunate reality is that many models never make it to production, or if they do, the deployment process takes much longer than necessary.
- Even successfully deployed models will require domain-specific upkeep that can create new engineering and operations challenges.
- ML models are software. Deploying and maintaining any software is a serious task, and ML adds complexities of its own.
- In our previous article, we performed the first deployment of our model using Flask.
- This article will cover how we can upload our model into a production environment using the Google Cloud Platform.
Google Compute Engine
- Google Compute Engine offers virtual machines running in Google’s data centers connected to its worldwide fiber network.
- Its tooling and workflow support scaling from single instances to global, load-balanced cloud computing.
- These VMs boot quickly, come with persistent disk storage and deliver consistent performance.
- The machines are available in many configurations including predefined sizes and can also be created with Custom Machine Types optimized for your specific needs.
- Finally, Compute Engine virtual machines also underpin several other Google Cloud products, such as Kubernetes Engine, Cloud Dataproc, and Cloud Dataflow.
Google App Engine
- Google App Engine lets you build and run your own custom applications on Google’s servers.
- App Engine applications are easy to create, maintain, and scale as your traffic and data storage needs change.
- You simply upload your application source code and it’s ready to go.
- It is a fully managed, serverless platform for developing and hosting web applications at scale.
- You can choose from several popular languages, libraries, and frameworks to develop your apps, and then let App Engine take care of provisioning servers and scaling your app instances based on demand.
Services provided by App Engine
- Platform as a Service (PaaS) to build and deploy scalable applications
- Hosting facility in fully-managed data centers.
- A fully managed, flexible environment that takes care of the application servers and infrastructure.
Some cool features of App Engine
- Scales down to zero instances when there is no traffic (in the standard environment), which saves cost.
- You focus on writing the code; everything else is taken care of.
- Application version deployment: you can roll back to the previous version within seconds.
- Traffic Splitting: Split your traffic between two different versions. This helps to do A/B testing and incremental feature rollout.
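To make the traffic-splitting idea concrete, here is a minimal Python sketch of weighted routing between two versions. It is an illustration only, not App Engine's implementation: the version names `v1` and `v2`, the 90/10 weights, and the `pick_version` helper are all hypothetical. Hashing the user ID keeps each user on the same version across requests, which is what makes an A/B comparison meaningful.

```python
import hashlib

# Hypothetical split: 90% of users stay on v1, 10% see the new v2.
SPLIT = {"v1": 0.9, "v2": 0.1}


def pick_version(user_id: str, split: dict) -> str:
    """Map a user to a version deterministically, in proportion to the weights."""
    # Hash the user ID to a stable number in [0, 1).
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000
    # Walk the cumulative weights until the bucket falls inside one of them.
    cumulative = 0.0
    for version, weight in split.items():
        cumulative += weight
        if bucket < cumulative:
            return version
    # Fallback in case floating-point rounding leaves a tiny gap below 1.0.
    return list(split)[-1]


if __name__ == "__main__":
    counts = {"v1": 0, "v2": 0}
    for i in range(10_000):
        counts[pick_version(f"user-{i}", SPLIT)] += 1
    print(counts)  # roughly a 90/10 split
```

An incremental rollout then amounts to raising the weight of the new version step by step while watching its error rate.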
Now that we are a bit familiar with some tools that we will use, let’s see how we can use them to build a production environment and deploy our model.
1. Create a project in Google Cloud Platform
- To deploy any application, we first need to create a project in the GCP Console.
- Create a new project from the GCP dashboard. Give it a unique name and note down the project ID for future reference.
- Creating a new project will lead you to the dashboard of that particular project.
2. Create an APP in App Engine
- Navigate to App Engine in the console and create an app.
- Select the location for your app. In our case, we’ll use asia-south1 (Mumbai). The region cannot be changed once the app is created.
3. Importing and Installing Packages
Download the required files using:
- Create a new directory to extract our files for deployment.
- Change the default working directory to the new directory so that later commands can find the app files.
!unzip '/content/app engine.zip'
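The directory, extraction, and working-directory steps above can also be done in plain Python instead of shell commands. The sketch below is one possible way to do it; the `extract_for_deployment` helper and the `/content/deploy` target directory are assumptions, while the archive path is the one used in this article.

```python
import os
import zipfile


def extract_for_deployment(archive_path: str, target_dir: str) -> list:
    """Extract the archive into target_dir and make it the working directory."""
    # Create the deployment directory if it does not exist yet.
    os.makedirs(target_dir, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
        names = archive.namelist()
    # Later gcloud commands resolve app.yaml relative to the working directory.
    os.chdir(target_dir)
    return names


# Usage in Colab (the target directory name is an assumption):
# extract_for_deployment("/content/app engine.zip", "/content/deploy")
```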
- Authenticate the Google user with our gcloud account.
from google.colab import auth
auth.authenticate_user()
Download Cloud SDK in Our System
- To deploy our app, we need to install the Cloud SDK on our system.
- The Cloud SDK provides tools and libraries for interacting with Google Cloud products and services.
!curl https://sdk.cloud.google.com | bash
4. Loading and Creating Necessary Files
- Create app.yaml Configuration File in the project root folder.
- We need to configure the App Engine app’s settings in the app.yaml file. The app.yaml file also contains information about your app’s code, such as the runtime, instance, and the latest version identifier.
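As an illustration of what such a configuration can look like, the sketch below writes a minimal app.yaml from a notebook cell. The settings are assumptions, since the article's actual file is not shown: `python39` as the runtime, `F2` as the instance class, and a Gunicorn entrypoint that serves an object named `app` from main.py. The `write_app_yaml` helper is hypothetical.

```python
from pathlib import Path

# Assumed settings; adjust them to match your own application.
APP_YAML = """\
# Python 3.9 runtime in the standard environment (assumption).
runtime: python39
# Instance size (assumption).
instance_class: F2
# Serve the WSGI object `app` defined in main.py.
entrypoint: gunicorn -b :$PORT main:app
"""


def write_app_yaml(project_root: str = ".") -> Path:
    """Write app.yaml into the project root and return its path."""
    path = Path(project_root) / "app.yaml"
    path.write_text(APP_YAML)
    return path


# Usage: write_app_yaml(".") from inside the project root folder.
```

If you keep the Gunicorn entrypoint, gunicorn also needs to be listed in requirements.txt.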
Creating main.py file
- The main.py file is our controller where the core application logic goes.
- App Engine needs a main.py file to deploy our app: unless app.yaml specifies a different entrypoint, the Python runtime looks for a WSGI application object named app in main.py.
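The following is a minimal sketch of a main.py that exposes an `app` object. To stay dependency-free it uses a plain WSGI callable in place of the Flask app from the previous article (a Flask application object is itself a WSGI callable, so the same idea applies). The `/predict` route and the `predict` placeholder are hypothetical; a real app would load the trained model and run inference there.

```python
import json


def predict(payload: dict) -> dict:
    """Hypothetical placeholder; a real app would call the trained model here."""
    return {"received_keys": sorted(payload)}


def app(environ, start_response):
    """WSGI entry point, exposed under the name `app` in main.py."""
    method = environ.get("REQUEST_METHOD")
    path = environ.get("PATH_INFO")
    if method == "POST" and path == "/predict":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length)
        try:
            payload = json.loads(body or b"{}")
        except ValueError:
            payload = None
        if isinstance(payload, dict):
            status, response = "200 OK", predict(payload)
        else:
            status, response = "400 Bad Request", {"error": "expected a JSON object"}
    else:
        status, response = "404 Not Found", {"error": "not found"}
    data = json.dumps(response).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(data))),
    ]
    start_response(status, headers)
    return [data]
```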
Creating requirements.txt file
- This file contains information on all necessary libraries used in our project.
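One way to produce this file is to pin each library to the version installed in the environment where the model was tested, so the deployed app uses the same versions. The sketch below assumes that approach; the `build_requirements` helper and the package names in the usage comment are hypothetical.

```python
from importlib import metadata


def build_requirements(packages: list) -> str:
    """Return requirements.txt content, pinning each package to its installed version."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Not installed locally: leave it unpinned rather than guess a version.
            lines.append(name)
    return "\n".join(lines) + "\n"


# Hypothetical package list; replace it with the libraries your main.py imports.
# with open("requirements.txt", "w") as f:
#     f.write(build_requirements(["flask", "gunicorn"]))
```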
5. Deploying the model
- As a first step, we’ll use the bash command gcloud init.
- This will set up and configure Cloud SDK for deployment.
- gcloud will ask a few questions in this step:
- Pick configuration to use: We’ll choose the default configuration.
- Choose Account: We’ll choose the account on which we want to perform operations.
- Choose Project: We’ll choose the project that was created by us.
- Now we’ll use the following command to deploy our model on App Engine.
!gcloud app deploy app.yaml --project face-mask-deployment
- After the deployment is successful we can visit our deployed app using the link provided.
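The interactive prompts of `gcloud init` can be awkward inside a notebook. As a hedged alternative, the sketch below builds a non-interactive version of the same steps: `gcloud config set project` selects the project, and the `--quiet` flag accepts the deployment confirmation automatically. It assumes the Cloud SDK is installed and the account is already authenticated; the `deploy_commands` and `run_all` helpers are hypothetical.

```python
import shlex
import subprocess


def deploy_commands(project_id: str, config: str = "app.yaml") -> list:
    """Build the gcloud commands for a non-interactive deployment."""
    return [
        # Select the project without the interactive `gcloud init` prompts.
        ["gcloud", "config", "set", "project", project_id],
        # --quiet accepts the confirmation prompt automatically.
        ["gcloud", "app", "deploy", config, "--project", project_id, "--quiet"],
    ]


def run_all(commands: list) -> None:
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        print("$", shlex.join(command))
        subprocess.run(command, check=True)


# Usage (requires an installed and authenticated Cloud SDK):
# run_all(deploy_commands("face-mask-deployment"))
```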
6. Making Predictions
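Once the app is live, predictions are made by sending requests to the deployed URL. The sketch below shows one way to do that from Python with the standard library. The `/predict` route, the JSON payload shape, and the `build_predict_request` helper are assumptions, since the deployed app's actual interface depends on how main.py is written.

```python
import json
import urllib.request


def build_predict_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request to the hypothetical /predict route."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage: paste the URL printed by `gcloud app deploy`, then send the request.
# request = build_predict_request("https://<your-app-url>", {"image": "<base64 data>"})
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```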
7. Disabling APP
- After deploying and testing successfully, we recommend disabling the app to avoid further charges.
- Navigate to Settings in App Engine and disable the application.
If your deployment fails, make sure the Cloud Build API is enabled in your project.
App Engine enables this API automatically the first time you deploy an app, but if someone has since disabled it, deployments will fail. It can be re-enabled with !gcloud services enable cloudbuild.googleapis.com
- We have successfully deployed our project to GCP and simulated a production environment.
In the next article of this series, we will containerize our application using Docker.