In this tutorial, we will look at 3 methods to horizontally scale pods up and down based on CPU utilization in Kubernetes. Scaling is used extensively in Kubernetes to increase or decrease resources depending on the current workload. There are basically two types of scaling: horizontal and vertical. Here we will concentrate on horizontal scaling; vertical scaling will be covered in a later article.
Imagine a scenario where you are running a 3 node Kubernetes cluster with 20 different applications running on 20 different pods. You might have sized the cluster for the maximum workload you expected it to handle, but think of a situation where the workload suddenly increases beyond what the cluster can handle. It might result in a server crash, application downtime, or even a production loss. Horizontal scaling helps to avoid this situation. More details can be found in the official Kubernetes documentation.
What is Horizontal Scaling
The process of adding more resources (pods or nodes) to an existing cluster to share the workload is known as horizontal scaling. It is usually preferred over vertical scaling.
Horizontal Scale Up/Down the Pods Based on CPU Utilization in Kubernetes
Method 1: Horizontal Scale Up/Down the Pods Based on CPU Utilization Using YAML File
In the first method, we will use a YAML file to horizontally scale the pods up and down. This is also the recommended way, since you can specify the minimum replicas, the maximum replicas, and the CPU utilization percentage that triggers scaling, all in a single YAML file.
[root@localhost ~]# vi autoscale.yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-scaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: ReplicaSet
    name: web-app
  minReplicas: 2
  maxReplicas: 8
  targetCPUUtilizationPercentage: 60
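To get a feel for how the 60% target in this manifest translates into a replica count, the sketch below applies the replica formula from the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), and clamps the result between minReplicas and maxReplicas. It is a simplified illustration only: the function name and the sample utilization figures are made up for this example, and the real controller also applies a tolerance and stabilization behavior that are ignored here.

```shell
#!/bin/sh
# Simplified sketch of the HPA replica calculation (illustrative helper,
# not part of kubectl). Arguments:
#   $1 current replicas   $2 current average CPU utilization (%)
#   $3 target utilization (%)   $4 minReplicas   $5 maxReplicas
desired_replicas() {
  current=$1; cpu=$2; target=$3; min=$4; max=$5
  # Integer ceiling of (current * cpu / target)
  desired=$(( (current * cpu + target - 1) / target ))
  # Clamp the result to the [min, max] range from the manifest
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# Sample figures (assumed), using minReplicas=2, maxReplicas=8, target=60%
desired_replicas 2 90 60 2 8    # prints 3
desired_replicas 4 30 60 2 8    # prints 2
desired_replicas 6 120 60 2 8   # prints 8 (12 is capped at maxReplicas)
```

With 2 replicas running at an average of 90% CPU, for example, the sketch asks for 3 replicas, which brings the average back under the 60% target.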
To create the HPA, run the kubectl apply -f autoscale.yaml command as shown below. This will create the web-app-scaler HPA.
[root@localhost ~]# kubectl apply -f autoscale.yaml
horizontalpodautoscaler.autoscaling/web-app-scaler created
To list all the HPAs that have been created, use the kubectl get hpa command as shown below. As you can see, there is only one HPA for now.
[root@localhost ~]# kubectl get hpa
NAME             REFERENCE            TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
web-app-scaler   ReplicaSet/web-app   <unknown>/60%   2         8         0          76s
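A TARGETS value of <unknown> generally means the HPA has not received CPU metrics yet, for example because the Metrics Server is not installed or the target pods do not declare CPU requests. As a small sketch, the hypothetical helper below scans kubectl get hpa output and prints the names of HPAs still in that state. It assumes the column layout shown in this tutorial, with TARGETS as the third column; other kubectl versions may format that column differently.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get hpa` table output on stdin and
# print the NAME of every HPA whose TARGETS column starts with <unknown>.
unknown_targets() {
  # NR > 1 skips the header row; index(...) == 1 means "starts with"
  awk 'NR > 1 && index($3, "<unknown>") == 1 { print $1 }'
}

# Demo using the sample output from this tutorial instead of a live cluster
printf '%s\n' \
  'NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE' \
  'web-app-scaler ReplicaSet/web-app <unknown>/60% 2 8 0 76s' |
  unknown_targets   # prints web-app-scaler
```

Against a live cluster the helper would be used as kubectl get hpa | unknown_targets, and kubectl describe hpa web-app-scaler shows the events that explain why metrics are missing.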
Once you are done with an HPA, you can delete it with the kubectl delete hpa <hpa_name> command. Here we are deleting the web-app-scaler HPA by running the kubectl delete hpa web-app-scaler command as shown below.
[root@localhost ~]# kubectl delete hpa web-app-scaler
horizontalpodautoscaler.autoscaling "web-app-scaler" deleted
Once it is deleted, you can verify the HPA list by using the kubectl get hpa command.
[root@localhost ~]# kubectl get hpa
No resources found in cyberithub namespace.
NOTE: The commands in this tutorial are run as the root user. You can use any user with sudo access to run all these commands. For more information, please check Step by Step: How to Add User to Sudoers to provide sudo access to the user.
Method 2: Horizontal Scale Up/Down the Pods Based on CPU Utilization Using JSON File
The second method is useful when you work in an environment that expects .json files instead of .yaml files. kubectl identifies the manifest format from the contents of the file rather than from its extension, so you can rename autoscale.yaml to autoscale.json with the mv autoscale.yaml autoscale.json command, as shown below, and use the file as it is. Note that renaming does not convert the contents, which are still written in YAML syntax.
[root@localhost ~]# mv autoscale.yaml autoscale.json
You can verify the contents of autoscale.json by opening the file with the vi editor or by using the cat autoscale.json command. You will find no difference.
[root@localhost ~]# vi autoscale.json
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-scaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: ReplicaSet
    name: web-app
  minReplicas: 2
  maxReplicas: 8
  targetCPUUtilizationPercentage: 60
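Renaming the file leaves its contents in YAML syntax. If a tool in your environment needs actual JSON, the same manifest could be written as in the sketch below. The file name autoscale-native.json is a hypothetical choice, used here so that the renamed file is not overwritten, and the apiVersion under scaleTargetRef is spelled out explicitly (ReplicaSet belongs to apps/v1).

```shell
#!/bin/sh
# Write the HPA manifest in JSON syntax.
# autoscale-native.json is an assumed file name, not one used elsewhere
# in this tutorial.
cat > autoscale-native.json <<'EOF'
{
  "apiVersion": "autoscaling/v1",
  "kind": "HorizontalPodAutoscaler",
  "metadata": {
    "name": "web-app-scaler"
  },
  "spec": {
    "scaleTargetRef": {
      "apiVersion": "apps/v1",
      "kind": "ReplicaSet",
      "name": "web-app"
    },
    "minReplicas": 2,
    "maxReplicas": 8,
    "targetCPUUtilizationPercentage": 60
  }
}
EOF
```

This file can be applied in the same way as the others, with kubectl apply -f autoscale-native.json.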
Then you can use the same kubectl command as in the previous method, with the only exception of changing the file name to the json file instead of the yaml file.
[root@localhost ~]# kubectl apply -f autoscale.json
horizontalpodautoscaler.autoscaling/web-app-scaler created
You will then see that the web-app-scaler HPA was created successfully with the CPU utilization target set to 60%, as shown below.
[root@localhost ~]# kubectl get hpa
NAME             REFERENCE            TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
web-app-scaler   ReplicaSet/web-app   <unknown>/60%   2         8         0          10s
Once you are done with it, you can delete the created HPA just as in the previous method, using the kubectl delete hpa web-app-scaler command.
[root@localhost ~]# kubectl delete hpa web-app-scaler
horizontalpodautoscaler.autoscaling "web-app-scaler" deleted
Method 3: Horizontal Scale Up/Down the Pods Based on CPU Utilization Using kubectl command
The third method is to use the kubectl command directly from the CLI. You can create an HPA in a single command, specifying the maximum and minimum number of pods with the --max and --min options and the CPU utilization target with the --cpu-percent option, as shown below.
[root@localhost ~]# kubectl autoscale rs web-app --max=8 --min=2 --cpu-percent=60
horizontalpodautoscaler.autoscaling/web-app autoscaled
Once the Horizontal Pod Autoscaler is created, you can verify it by using the kubectl get hpa command. This will show the list of HPAs currently available along with the options set for each of them.
[root@localhost ~]# kubectl get hpa
NAME      REFERENCE            TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
web-app   ReplicaSet/web-app   <unknown>/60%   2         8         2          35m
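In this output the HPA took the name of the ReplicaSet, web-app, rather than web-app-scaler as in the earlier methods, because kubectl autoscale names the HPA after its target unless the --name option is given. The sketch below wraps the command in a hypothetical helper, create_hpa, that sets an explicit name and rejects a minimum greater than the maximum. The helper and its DRY_RUN variable are assumptions made for illustration, not part of kubectl.

```shell
#!/bin/sh
# Hypothetical wrapper around `kubectl autoscale rs`.
# Arguments: $1 ReplicaSet name, $2 HPA name, $3 min, $4 max, $5 CPU percent
create_hpa() {
  target=$1; name=$2; min=$3; max=$4; cpu=$5
  if [ "$min" -gt "$max" ]; then
    echo "min replicas ($min) must not exceed max replicas ($max)" >&2
    return 1
  fi
  # Build the command as positional parameters so quoting is preserved
  set -- kubectl autoscale rs "$target" --name="$name" \
    --min="$min" --max="$max" --cpu-percent="$cpu"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"   # only print the command for review
  else
    "$@"        # run it against the cluster (requires kubectl)
  fi
}

# Print the command without touching the cluster
DRY_RUN=1 create_hpa web-app web-app-scaler 2 8 60
# prints: kubectl autoscale rs web-app --name=web-app-scaler --min=2 --max=8 --cpu-percent=60
```

Calling create_hpa without DRY_RUN=1 runs the printed command, after which the HPA appears in kubectl get hpa under the chosen name.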
Just like in the above methods, you can delete this autoscaler by using the same kubectl delete hpa web-app command.
[root@localhost ~]# kubectl delete hpa web-app
horizontalpodautoscaler.autoscaling "web-app" deleted
This will delete the autoscaler, as you can confirm from the output below.
[root@localhost ~]# kubectl get hpa
No resources found in cyberithub namespace.
Conclusion
In this tutorial, we learnt what scaling means and the two types of scaling. We looked at horizontal scaling and three methods of scaling pods horizontally based on CPU utilization. We also went through the kubectl commands used to create, list, and delete an HPA. Hopefully this tutorial was helpful.