Scaling the Docker container using Kubernetes: AI End-to-End Series (Part — 7)
Now it is time to prepare the containerized app for scaling. And how will we achieve that? Using K8s!
Kubernetes, also known as K8s, is an open-source system for automating deployment, scaling, and management of containerized applications.
Features of Kubernetes
Automated rollouts and rollbacks
- Kubernetes progressively rolls out changes to your application or its configuration, while monitoring application health to ensure it doesn’t kill all your instances at the same time.
- If something goes wrong, Kubernetes will roll back the change for you.
- Take advantage of a growing ecosystem of deployment solutions.
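To make "doesn't kill all your instances at the same time" concrete, here is a minimal Python sketch of the arithmetic behind a rolling update. It assumes the Deployment's default RollingUpdate strategy, where maxSurge and maxUnavailable are both 25% (maxSurge rounds up, maxUnavailable rounds down). The function name and the sample replica counts are illustrative and do not come from this series.

```python
import math


def rolling_update_bounds(replicas, max_surge=0.25, max_unavailable=0.25):
    """Return (min_available, max_total) Pod counts during a rolling update.

    Percentages are converted to Pod counts the way Deployments do it:
    maxSurge rounds up, maxUnavailable rounds down.
    """
    surge = math.ceil(replicas * max_surge)
    unavailable = math.floor(replicas * max_unavailable)
    return replicas - unavailable, replicas + surge


if __name__ == "__main__":
    for n in (4, 10):
        low, high = rolling_update_bounds(n)
        print(f"{n} replicas: at least {low} available, at most {high} in total")
```

With 4 replicas, at least 3 Pods stay available and at most 5 exist at once, so the rollout replaces the app one Pod at a time.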
Service discovery and load balancing
- No need to modify your application to use an unfamiliar service discovery mechanism.
- Kubernetes gives Pods their own IP addresses and a single DNS name for a set of Pods and can load-balance across them.
Storage orchestration
- Automatically mount the storage system of your choice, whether local storage or a public cloud provider such as GCP or AWS.
Secret and configuration management
- Deploy and update secrets and application configuration without rebuilding your image and without exposing secrets in your stack configuration.
Automatic bin packing
- Automatically places containers based on their resource requirements and other constraints, while not sacrificing availability.
- Mix critical and best-effort workloads in order to drive up utilization and save even more resources.
Batch execution
- In addition to services, Kubernetes can manage your batch and CI workloads, replacing containers that fail, if desired.
Horizontal scaling
- Scale your application up and down with a simple command, with a UI, or automatically based on CPU usage.
Self-healing
- Restarts containers that fail, replaces and reschedules containers when nodes die, kills containers that don’t respond to your user-defined health check, and doesn’t advertise them to clients until they are ready to serve.
Designed for extensibility
- Add features to your Kubernetes cluster without changing upstream source code.
Working of Kubernetes
- In order to validate that our containerized application works well on Kubernetes, we’ll use Docker Desktop’s built-in Kubernetes environment right on our development machine to deploy our application.
- This will be followed by handing off the container to run on a full Kubernetes cluster in production.
- The Kubernetes environment created by Docker Desktop is fully featured, meaning it has all the Kubernetes features your app will enjoy on a real cluster, accessible from the convenience of your development machine.
Describing apps using Kubernetes YAML
- All containers in Kubernetes are scheduled as pods, which are groups of co-located containers that share some resources.
- Furthermore, in a realistic application, we almost never create individual pods; instead, most of our workloads are scheduled as deployments, which are scalable groups of pods maintained automatically by Kubernetes.
- Lastly, all Kubernetes objects can and should be described in manifests called Kubernetes YAML files.
- These YAML files describe all the components and configurations of your Kubernetes app and can be used to easily create and destroy your app in any Kubernetes environment.
- Download and install Docker Desktop.
- Make sure that Kubernetes is enabled on your Docker Desktop:
- Mac: Click the Docker icon in your menu bar, navigate to Preferences and make sure there’s a green light beside ‘Kubernetes’.
- Windows: Click the Docker icon in the system tray and navigate to Settings and make sure there’s a green light beside ‘Kubernetes’.
Creating Necessary Files
- The Kubernetes YAML file for our app, flaskapp.yaml, contains two objects, separated by ---:
1. A Deployment, describing a scalable group of identical pods. In this case, you’ll get just one replica, or copy, of your pod, and that pod (described under the template: key) has just one container in it, based on the Docker image created in the last article.
2. A NodePort service, which will route traffic from port 30001 on your host to port 3000 inside the pods it routes to, allowing you to reach your app from the network.
- Also, notice that while Kubernetes YAML can appear long and complicated at first, it almost always follows the same pattern:
- The apiVersion, which indicates the Kubernetes API that parses this object.
- The kind indicating what sort of object this is.
- Some metadata, applying things like names to your objects.
- The spec, specifying all the parameters and configurations of your object.
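As a rough illustration of that pattern, the following Python sketch renders a two-object manifest (one Deployment and one NodePort Service) as YAML text. It is a hypothetical reconstruction, not the exact file used in this series: the Deployment name flask-demo, the app label, the image name, and the imagePullPolicy are assumptions, while the Service name flask-entrypoint and the ports 3000 and 30001 are the values mentioned in this article.

```python
from string import Template

# Hypothetical manifest: names, labels and pull policy are assumptions.
MANIFEST = Template("""\
apiVersion: apps/v1        # API that parses this object
kind: Deployment           # what sort of object this is
metadata:                  # names and labels
  name: $app
spec:                      # parameters and configuration
  replicas: $replicas
  selector:
    matchLabels:
      app: $app
  template:
    metadata:
      labels:
        app: $app
    spec:
      containers:
      - name: $app
        image: $image
        imagePullPolicy: IfNotPresent  # assumption: reuse a locally built image
        ports:
        - containerPort: $container_port
---
apiVersion: v1
kind: Service
metadata:
  name: $service
spec:
  type: NodePort
  selector:
    app: $app
  ports:
  - port: $container_port
    targetPort: $container_port
    nodePort: $node_port
""")


def render_manifest(app="flask-demo", service="flask-entrypoint",
                    image="flask_app", replicas=1,
                    container_port=3000, node_port=30001):
    """Fill in the template and return the manifest as YAML text."""
    return MANIFEST.substitute(
        app=app, service=service, image=image, replicas=replicas,
        container_port=container_port, node_port=node_port,
    )


if __name__ == "__main__":
    print(render_manifest())
```

Redirecting the script's output to flaskapp.yaml gives a file you can pass to kubectl apply. Adjust container_port to whatever port your Flask container actually listens on.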
- You can download the flask app on your system/colab runtime using the following command:
Containerization using Docker
- Deploy and check your application.
- In a terminal, navigate to where you created flaskapp.yaml and deploy your application to Kubernetes:
kubectl apply -f flaskapp.yaml
- You should see output that looks like the following, indicating your Kubernetes objects were created successfully:
- Make sure everything worked by listing your deployments:
kubectl get deployments
- Do check your services as well:
kubectl get services
- If all is well, your deployment should be listed as follows:
- In addition to the default Kubernetes service, we see our flask-entrypoint service, accepting traffic on its node port.
- Open a browser and visit your app at localhost:30001; you should see your face mask model.
- You can also delete your application using:
kubectl delete -f flaskapp.yaml
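If you would rather check the rollout from a script than read the kubectl tables by eye, here is a minimal Python sketch. It assumes kubectl is on your PATH and pointed at the Docker Desktop cluster, and it relies on the spec.replicas and status.readyReplicas fields of a Deployment object. The function names are illustrative, and the Deployment name used in the usage note is hypothetical.

```python
import json
import subprocess


def deployment_ready(deployment):
    """Return True when every desired replica of a Deployment reports ready."""
    desired = deployment.get("spec", {}).get("replicas", 1)
    # readyReplicas is omitted from the status while no Pod is ready yet.
    ready = deployment.get("status", {}).get("readyReplicas", 0)
    return ready >= desired


def fetch_deployment(name):
    """Read one Deployment as JSON through kubectl (needs a running cluster)."""
    result = subprocess.run(
        ["kubectl", "get", "deployment", name, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Against a running cluster, deployment_ready(fetch_deployment("flask-demo")) should return True once the Pod is up, where flask-demo stands in for whatever name your manifest gives the Deployment.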
- Now that we have warmed up, let’s move on to a bigger tool: Google Kubernetes Engine (GKE).
Deploying App on Google Kubernetes Engine
- Install the Cloud SDK, which includes the gcloud command-line tool.
- Using the gcloud tool, install kubectl, the Kubernetes command-line tool. kubectl is used to communicate with Kubernetes, the cluster orchestration system of GKE clusters.
The deployment process is as follows:
Create a GKE cluster
- Select or Create a New Project.
- Make sure that billing is enabled for your Cloud project. Go through this link to confirm whether billing is enabled for your project.
- Enable the Artifact Registry and Google Kubernetes Engine APIs.
- Go through this link with your Cloud Account to enable both APIs.
- Select Kubernetes Engine Application from Cloud Dashboard.
- Click on Create New Cluster.
- Choose Standard or Autopilot mode and click Configure.
- In the Name field, enter a name for the cluster.
- Select a zone or region.
- For an Autopilot cluster, select a Compute Engine region from the Region drop-down list, such as asia-south1.
- Click Create. This creates a GKE cluster.
- Wait for the cluster to be created. When the cluster is ready, a green checkmark appears next to the cluster name.
Storing our local Image to Artifact Registry:
- Artifact Registry is an artifact management service from Google Cloud.
- You can use it to manage your container images and language packages.
- You must upload the container image to a registry so that your GKE cluster can download and run the container image.
- In order to store our Docker image in Artifact Registry, we need to create a repository in Google Artifact Registry.
- Navigate to Artifact Registry from Google Console.
- Click on Create New Repository and fill out the necessary details as follows:
Tagging the local image
- Make sure that you are authenticated to the repository.
- Determine the name of the image. The format of a full image name is:
LOCATION-docker.pkg.dev/PROJECT-ID/REPOSITORY/IMAGE
- Replace the following values:
- LOCATION is the regional or multi-regional location of the repository where the image is stored.
- PROJECT-ID is your Google Cloud Console project ID.
- REPOSITORY is the name of the repository where the image is stored.
- IMAGE is the image’s name. It can be different from the image’s local name.
- For example, in our case the image has the following characteristics:
- Repository location: asia-south1
- Repository name: flaskapp
- Project ID: face-mask-kubernates
- Local image name: flask_app
- Target image name: flask_app
- The full image name for this example is:
asia-south1-docker.pkg.dev/face-mask-kubernates/flaskapp/flask_app
- Tag the local image with the repository name with the command:
docker tag flask_app asia-south1-docker.pkg.dev/face-mask-kubernates/flaskapp/flask_app
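Because a typo in any of the four name parts only surfaces later as a failed push, it can help to assemble the image name in code. The sketch below follows the LOCATION-docker.pkg.dev/PROJECT-ID/REPOSITORY/IMAGE format and reuses the values from the docker tag command above. The function names, the optional tag argument, and the whitespace check are illustrative assumptions.

```python
def artifact_registry_image(location, project_id, repository, image, tag=None):
    """Build LOCATION-docker.pkg.dev/PROJECT-ID/REPOSITORY/IMAGE[:TAG]."""
    parts = {"location": location, "project_id": project_id,
             "repository": repository, "image": image}
    for label, value in parts.items():
        # Minimal sanity check only: reject empty parts and stray whitespace.
        if not value or any(ch.isspace() for ch in value):
            raise ValueError(f"invalid {label}: {value!r}")
    name = f"{location}-docker.pkg.dev/{project_id}/{repository}/{image}"
    return f"{name}:{tag}" if tag else name


def docker_tag_command(local_image, target):
    """Return the argument list for `docker tag`, ready for subprocess.run."""
    return ["docker", "tag", local_image, target]


if __name__ == "__main__":
    target = artifact_registry_image(
        "asia-south1", "face-mask-kubernates", "flaskapp", "flask_app")
    print(" ".join(docker_tag_command("flask_app", target)))
```

Run as a script, it prints the same docker tag command shown above, which makes it easy to swap in your own project ID and repository.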
Pushing the Docker image to Artifact Registry
- Configure the Docker command-line tool to authenticate to Artifact Registry repository:
gcloud auth configure-docker asia-south1-docker.pkg.dev
- Push the tagged image with the command:
docker push LOCATION-docker.pkg.dev/PROJECT-ID/REPOSITORY/IMAGE
- So in our case, the command will look like this:
docker push asia-south1-docker.pkg.dev/face-mask-kubernates/flaskapp/flask_app
Deploy the sample app to the cluster
- You are now ready to deploy the Docker image you built to your GKE cluster.
- Kubernetes represents applications as Pods, which are scalable units holding one or more containers.
- The Pod is the smallest deployable unit in Kubernetes.
- Usually, you deploy Pods as a set of replicas that can be scaled and distributed together across your cluster.
- Go to the Workloads page in Cloud Console.
- Click Deploy.
- In the Container section, select Existing container image.
- In the Image path field, click Select.
- In the Select container image pane, select the flask_app image you pushed to Artifact Registry and click Select.
- In the Container section, click Done, then click Continue.
- In the Configuration section, under Labels, enter app for Key and flask_app for Value.
- Under Configuration YAML, click View YAML.
- This opens a YAML configuration file representing the two Kubernetes API resources about to be deployed into your cluster: one Deployment, and one HorizontalPodAutoscaler for that Deployment.
- Click Close, then click Deploy.
- When the Deployment Pods are ready, the Deployment details page opens.
- Under Managed pods, note the three running Pods for the flask-app Deployment.
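The HorizontalPodAutoscaler created alongside the Deployment decides how many Pods to run by comparing the observed metric with its target. The Python sketch below reproduces the documented scaling rule, desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric), including the default 10% tolerance. The min_replicas and max_replicas defaults and the 80% CPU target in the example are illustrative assumptions, not values read from this article's cluster.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=5, tolerance=0.1):
    """Apply the HorizontalPodAutoscaler scaling rule to one metric."""
    ratio = current_metric / target_metric
    # Within the tolerance band the autoscaler leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # The result is always clamped to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # Three Pods averaging 40% CPU against an assumed 80% target.
    print(desired_replicas(3, current_metric=40, target_metric=80))
```

In this example the three Pods use half of the target CPU, so the rule asks for ceil(3 * 0.5) = 2 Pods.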
Expose the sample app to the internet
- While Pods do have individually-assigned IP addresses, those IPs can only be reached from inside your cluster.
- Also, GKE Pods are designed to be ephemeral, starting or stopping based on scaling needs. And when a Pod crashes due to an error, GKE automatically redeploys that Pod, assigning a new Pod IP address each time.
- What this means is that for any Deployment, the set of IP addresses corresponding to the active set of Pods is dynamic.
- We need a way to group Pods together into one static hostname, and expose a group of Pods outside the cluster, to the internet.
- To do so, follow the given steps:
- Go to Workloads
- Click flask-app.
- From the Deployment details page, click Actions > Expose.
- In the Expose dialog, set the Target port to 8000. This is the port the flask-app container listens on.
- From the Service type drop-down list, select Load balancer.
- Click Expose to create a Kubernetes Service for flask-app.
- When the Load Balancer is ready, the Service details page opens.
- Scroll down to the External endpoints field, and copy the IP address.
Testing App using provided IP Address
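With the external IP copied in the previous step, you can open it in a browser or probe it from a script. The following is a minimal Python sketch of such a probe using only the standard library. The function name is illustrative, the opener parameter exists only so the check can be exercised without a network, and EXTERNAL-IP in the usage note is a placeholder for the address shown on your Service details page.

```python
import urllib.request


def endpoint_is_up(url, opener=urllib.request.urlopen, timeout=5):
    """Return True if an HTTP GET to `url` answers with a 2xx status."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, timeouts and refused connections
        # are all subclasses of OSError.
        return False
```

For example, endpoint_is_up("http://EXTERNAL-IP") should return True once the load balancer is serving traffic. A False result right after exposing the app usually just means the load balancer is still being provisioned.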
- Now that we have seen a successful deployment of our application on GKE, we are almost done with the AI E2E productization.
- We just need to make sure that our future changes are updated properly and without any hassle.
- CI/CD is a solution to the problems integrating new code can cause for development and operations teams.
In the next article, we will perform continuous integration and continuous delivery using Jenkins.