k8s Horizontal Scaling

Posted on 2021-06-01

This article records how to use the HPA (HorizontalPodAutoscaler) resource in k8s. It links to the official tutorials on horizontal scaling, describes the two ways to configure an HPA and recommends the yaml file method, explains the configuration, and covers its dependencies, such as the Metrics API, whose deployment is documented on the official website.

The k8s resource covered here is one I learned about a long time ago, but I used it again recently, so I’m making a note.

I won’t explain horizontal scaling (adding and removing replicas) here. Anyone interested in reading this article should already know what it is.
The best tutorials are on the official website. Here are the relevant links:

A high-level introduction to horizontal scaling (HPA)
For a more complete picture of how HPA works, read this article:
https://kubernetes.io/zh/docs/tasks/run-application/horizontal-pod-autoscale/
Practical examples
This walkthrough includes sample yaml files that you can modify and use directly:
https://kubernetes.io/zh/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/

HPA Configuration Methods

There are two ways to configure HPA in k8s:

Create it directly from the command line:

kubectl autoscale deployment <deployment-name> --cpu-percent=75 --min=1 --max=5

Use a yaml configuration file
Write the detailed configuration in a yaml file and create the HPA with kubectl apply.

I recommend the second method, especially in a production environment. With the first method, after a while you may forget that you ever created the HPA, or no longer remember its exact settings. With the second method, you can keep the yaml file in git and use a pipeline (for example, a PR that triggers CI/CD) to keep the yaml file in the repo consistent with the production environment.
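As a rough sketch of the second method, the commands below assume the manifest from the next section has been saved as hpa.yaml (a hypothetical filename) and that kubectl already points at the right cluster and namespace. The last command is one possible way to recover a yaml file from an HPA that was originally created with kubectl autoscale.

```shell
# Create or update the HPA from the version-controlled manifest
# (hpa.yaml is an assumed filename used only for illustration)
kubectl apply -f hpa.yaml

# Check current vs. target utilization and the replica count
kubectl get hpa <hpa-name>

# Inspect conditions and recent scaling events
kubectl describe hpa <hpa-name>

# Export an existing HPA so it can be committed to git
kubectl get hpa <hpa-name> -o yaml > hpa-exported.yaml
```

The exported file also contains the status section and server-populated metadata, which you would normally trim before committing it.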

HPA Configuration Instructions

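# autoscaling/v2beta2 was current when this was written; Kubernetes 1.23+ serves
# autoscaling/v2, and v2beta2 is no longer served as of 1.26
apiVersion: autoscaling/v2beta2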
kind: HorizontalPodAutoscaler
metadata:
  name: <hpa-name>
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    # A Deployment is used here for demonstration; other scalable resources such as
    # ReplicationController, ReplicaSet, and StatefulSet are also supported
    kind: Deployment
    name: <name of the resource to autoscale>
  # Define the minimum and maximum number of replicas
  minReplicas: 1
  maxReplicas: 4
  # Metrics to monitor; when several are listed, the HPA uses the largest
  # replica count that any one of them calls for
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 75
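
To give a feel for how a target such as averageUtilization: 75 turns into a replica count, here is a minimal sketch of the scaling rule described in the official documentation, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), clamped to minReplicas and maxReplicas. The function name desired_replicas and its argument order are my own, and the sketch leaves out the tolerance band, Pod readiness handling, and stabilization windows that the real controller applies.

```shell
#!/bin/sh
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
# Hypothetical helper: integer form of
#   ceil(current_replicas * current_util / target_util), clamped to [MIN, MAX].
desired_replicas() {
  current=$1; util=$2; target=$3; min=$4; max=$5
  # Ceiling division with integers: ceil(a / b) = (a + b - 1) / b
  want=$(( (current * util + target - 1) / target ))
  if [ "$want" -lt "$min" ]; then want=$min; fi
  if [ "$want" -gt "$max" ]; then want=$max; fi
  echo "$want"
}

# 2 Pods averaging 150% CPU against a 75% target, limits 1..4
desired_replicas 2 150 75 1 4   # prints 4
# 4 Pods averaging 30% CPU against a 75% target, limits 1..4
desired_replicas 4 30 75 1 4    # prints 2
```

With the two metrics in the manifest above, this calculation would run once for CPU and once for memory, and the larger result would be used.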

Dependencies for Configuring HPA

The configuration above uses two metrics: the CPU and memory utilization of the Pods. Utilization is calculated as a percentage of each container's resource requests, so the target Pods need CPU and memory requests set. k8s also has to expose an interface for collecting this data, namely the Metrics API. It is not part of a default k8s installation, so you have to deploy it yourself. For the deployment process, refer to the official documentation.

Deploying Metrics Server
https://kubernetes.io/zh/docs/tasks/debug-application-cluster/resource-metrics-pipeline/#metrics-server
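
Once Metrics Server is installed, a quick sanity check might look like the following. This is only a sketch: it assumes the Metrics API is registered under the usual metrics.k8s.io/v1beta1 group and that your kubectl context has read access. <namespace> is a placeholder.

```shell
# Confirm the Metrics API is registered and reported as available
kubectl get apiservice v1beta1.metrics.k8s.io

# Show per-Pod CPU and memory usage served by the Metrics API
kubectl top pods -n <namespace>

# Query the Metrics API directly
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/<namespace>/pods"
```

If kubectl get hpa shows <unknown> as the current value, the usual causes are a missing Metrics API or Pods without resource requests.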