k8s Horizontal Scaling

This article covers a k8s resource I have known about for a long time. I used it again recently, so I'm writing down some notes.

I won't explain what horizontal scaling (adding or removing replicas) is here. Readers interested in this article should already be familiar with it.
The best tutorials are the ones in the official documentation. Here are the relevant links:

HPA Configuration Methods

There are two ways to configure HPA in k8s:

  1. Create it directly from the command line:
    kubectl autoscale deployment <deployment-name> --cpu-percent=75 --min=1 --max=5
  2. Through a yaml configuration file
    Write the detailed configuration in the yaml file and create the hpa through kubectl apply.

The second method is recommended, especially in a production environment. With the first method, after a while you may forget that you created the HPA at all, or no longer remember its exact settings. With the second method you can keep the yaml file in git and use a workflow (for example, a PR that triggers CI/CD) to keep the yaml file in the repo consistent with the production environment.

HPA Configuration Instructions

# autoscaling/v2beta2 was removed in Kubernetes 1.26; on 1.23 and later use autoscaling/v2
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: <hpa-name>
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    # A Deployment is used here for demonstration, but other types such as
    # ReplicationController, ReplicaSet, and StatefulSet are also supported
    kind: Deployment
    name: <name of the resource to autoscale>
  # Minimum and maximum number of replicas
  minReplicas: 1
  maxReplicas: 4
  # Multiple metrics can be monitored to decide whether to scale
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 75
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75
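
How the two metrics in this configuration combine is easier to see in code. The following is a minimal Python sketch, not the controller's actual implementation, of the replica calculation described in the Kubernetes documentation: each metric proposes ceil(currentReplicas * currentValue / targetValue), the largest proposal wins, and the result is clamped to minReplicas and maxReplicas. The function name desired_replicas and its parameters are made up for illustration. The 0.1 tolerance is the documented default, and details such as stabilization windows, pod readiness, and missing metrics are left out.

```python
import math


def desired_replicas(current_replicas, current_utilizations, target_utilization=75,
                     min_replicas=1, max_replicas=4, tolerance=0.1):
    """Simplified model of the HPA replica calculation (illustrative only)."""
    proposals = []
    for current in current_utilizations.values():
        ratio = current / target_utilization
        if abs(ratio - 1.0) <= tolerance:
            # Close enough to the target: this metric proposes no change.
            proposals.append(current_replicas)
        else:
            proposals.append(math.ceil(current_replicas * ratio))
    # With several metrics, the largest proposed replica count is used.
    wanted = max(proposals)
    # The result never leaves the [min_replicas, max_replicas] range.
    return max(min_replicas, min(max_replicas, wanted))


# 2 replicas, CPU at 150% of requests, memory at 60% of requests.
print(desired_replicas(2, {"cpu": 150, "memory": 60}))
```

In this example CPU proposes 4 replicas and memory proposes 2, so the sketch returns 4, which is also the maxReplicas limit from the configuration.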

Dependencies for Configuring HPA

In the configuration above, the two metrics I set are the CPU and memory utilization of the Pods. This means k8s needs an interface that collects this data, namely the metrics API. It is not deployed by default in k8s, so you need to deploy it yourself (usually by installing Metrics Server). For the deployment process, refer to the official documentation.
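
One more dependency is worth noting: a Utilization target is a percentage of the resource requests declared on the Pods' containers, so the target Pods need CPU and memory requests set, or the metric cannot be evaluated. Below is a rough Python sketch of that relationship. It assumes usage and requests are simply summed over the Pods. The function name, pod names, and numbers are hypothetical, and the real controller handles more cases.

```python
def average_utilization(usage_by_pod, request_by_pod):
    """Return usage as an integer percentage of the requested amount."""
    if not usage_by_pod:
        raise ValueError("no usage samples; is the metrics API available?")
    missing = [pod for pod in usage_by_pod if pod not in request_by_pod]
    if missing:
        # Without a request there is nothing to compute a percentage against.
        raise ValueError("no resource request set for: " + ", ".join(sorted(missing)))
    total_usage = sum(usage_by_pod.values())
    total_request = sum(request_by_pod[pod] for pod in usage_by_pod)
    return total_usage * 100 // total_request


# CPU in millicores: usage as the metrics API would report it, requests from the pod spec.
cpu_usage = {"web-1": 300, "web-2": 150}
cpu_request = {"web-1": 200, "web-2": 200}
print(average_utilization(cpu_usage, cpu_request))
```

Here the Pods use 450m against 400m requested, which gives 112% and is well above the 75% target.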