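# A Complete Guide to Kubernetes HPA: Automatic Elastic Scaling of Pods
## Background
Traffic to cloud-native applications is rarely steady. The daytime peak may need 20 Pods to absorb requests, while 2 or 3 are enough in the late-night trough. Provisioning for the peak wastes resources; provisioning for the trough brings the service down at peak.

The Horizontal Pod Autoscaler (HPA) is the autoscaling mechanism built into Kubernetes. It adjusts the number of Pod replicas according to configured metrics, so the application follows traffic changes while using fewer resources.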
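## Problem Statement
Traditional containerized deployments rely on people watching dashboards and adjusting replica counts by hand. That causes several problems:
1. **Slow response**: by the time someone notices a traffic peak and scales out, it is already too late
2. **Waste**: large amounts of spare capacity are reserved, so CPU and memory utilization stays low for long periods
3. **Operational toil**: someone has to watch the metrics and perform frequent manual operations
4. **No burst handling**: when traffic surges suddenly, manual scaling simply cannot keep up

HPA addresses these problems. It adjusts the number of Pods automatically and continuously based on resource usage (the controller re-evaluates metrics every 15 seconds by default), keeping the application's resource allocation reasonable.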
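## Step-by-Step Walkthrough
### 1. Prepare the environment
Make sure you have a running Kubernetes cluster and a configured kubectl. The HPA manifest in step 4 uses the `autoscaling/v2` API, which is generally available from Kubernetes 1.23; on clusters from 1.18 to 1.22 the same fields, including `behavior`, are served under `autoscaling/v2beta2`.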
```bash
# Verify the connection to the cluster
kubectl cluster-info
# List the cluster nodes
kubectl get nodes
```
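Since the HPA manifest later in this guide depends on the `autoscaling/v2` API version, it may be worth checking that the API server actually serves it. A minimal sketch, assuming a working kubectl context; the function name `has_autoscaling_v2` is made up for illustration:

```bash
# Hypothetical helper: succeeds only if the API server lists autoscaling/v2.
# `kubectl api-versions` prints one group/version per line.
has_autoscaling_v2() {
  kubectl api-versions | grep -qx 'autoscaling/v2'
}

# Usage (needs cluster access):
#   has_autoscaling_v2 && echo "autoscaling/v2 is available"
```

The `-x` flag makes grep match whole lines only, so `autoscaling/v2beta2` does not count as a hit.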
### 2. Deploy a test application
Start by deploying a simple Nginx workload as the scaling target:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-hpa-demo
  labels:
    app: nginx-hpa-demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-hpa-demo
  template:
    metadata:
      labels:
        app: nginx-hpa-demo
    spec:
      containers:
        - name: nginx
          image: nginx:1.21
          ports:
            - containerPort: 80
          resources:
            requests:
              memory: "64Mi"
              cpu: "100m"
            limits:
              memory: "128Mi"
              cpu: "200m"
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-hpa-demo-service
spec:
  selector:
    app: nginx-hpa-demo
  ports:
    - port: 80
      targetPort: 80
  type: ClusterIP
```
Save it as `nginx-deployment.yaml`, then apply it:
```bash
kubectl apply -f nginx-deployment.yaml
```
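Before moving on, it can help to confirm that both replicas are actually ready. A small sketch, assuming the Deployment name and label from the manifest above; the helper name `wait_for_demo` and the 120-second default timeout are arbitrary choices for illustration:

```bash
# Hypothetical helper: block until the demo Deployment has finished rolling out,
# then list its Pods. $1 = timeout in seconds (assumed default: 120).
wait_for_demo() {
  local timeout="${1:-120}"
  kubectl rollout status deployment/nginx-hpa-demo --timeout="${timeout}s" &&
    kubectl get pods -l app=nginx-hpa-demo
}

# Usage (needs cluster access):
#   wait_for_demo 60
```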
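### 3. Deploy Metrics Server
HPA needs Metrics Server to supply resource usage data for Pods. Metrics Server is a cluster add-on that serves the `metrics.k8s.io` API; some distributions ship it preinstalled, while others require installing it separately.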
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:aggregated-metrics-reader
rules:
  - apiGroups: ["metrics.k8s.io"]
    resources: ["pods", "nodes"]
    verbs: ["get", "list", "watch"]
---
# Added: the original manifest had no role granting Metrics Server access to
# node and Pod data. This mirrors the role in the upstream manifest.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:metrics-server
rules:
  - apiGroups: [""]
    resources: ["nodes/metrics"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["pods", "nodes"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
  - kind: ServiceAccount
    name: metrics-server
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
  - kind: ServiceAccount
    name: metrics-server
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
  - kind: ServiceAccount
    name: metrics-server
    namespace: kube-system
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  service:
    name: metrics-server
    namespace: kube-system
  group: metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      containers:
        - name: metrics-server
          image: bitnami/metrics-server:0.6.3
          command:
            - /metrics-server
            # Skips kubelet certificate verification; suitable for test clusters only
            - --kubelet-insecure-tls
            - --kubelet-preferred-address-types=InternalIP
          resources:
            requests:
              memory: "64Mi"
              cpu: "50m"
            limits:
              memory: "128Mi"
              cpu: "100m"
---
apiVersion: v1
kind: Service
metadata:
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    k8s-app: metrics-server
  ports:
    - port: 443
      targetPort: 443
```
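Save it as `metrics-server.yaml` and deploy it:
```bash
kubectl apply -f metrics-server.yaml
```
Verify that Metrics Server is running:
```bash
# The AVAILABLE column should read True once Metrics Server is serving
kubectl get apiservice v1beta1.metrics.k8s.io
kubectl top nodes
```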
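Metrics Server usually needs a short while after startup before it has collected any data, so `kubectl top` can fail at first. A retry loop along these lines may help; the function name `wait_for_metrics` and its defaults of 12 attempts, 5 seconds apart, are assumptions rather than recommended values:

```bash
# Hypothetical helper: retry `kubectl top nodes` until it succeeds.
# $1 = max attempts (assumed default 12), $2 = seconds between attempts (assumed default 5).
wait_for_metrics() {
  local attempts="${1:-12}" delay="${2:-5}" i=1
  while [ "$i" -le "$attempts" ]; do
    if kubectl top nodes >/dev/null 2>&1; then
      echo "metrics available after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "metrics still unavailable after $attempts attempts" >&2
  return 1
}

# Usage (needs cluster access):
#   wait_for_metrics 12 5
```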
### 4. Create the HPA policy
Create an HPA resource that defines the autoscaling policy:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa-demo
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-hpa-demo
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
      selectPolicy: Max
```
Save it as `hpa.yaml` and apply it:
```bash
kubectl apply -f hpa.yaml
```
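To make the targets in the manifest concrete: the Kubernetes documentation gives the controller's core calculation as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), where utilization is measured relative to the containers' resource requests. The sketch below reproduces that arithmetic for a single metric and clamps the result to `minReplicas` and `maxReplicas`. It deliberately leaves out the controller's tolerance band, the `behavior` rate limits, and multi-metric handling, and the function name `desired_replicas` is made up for this example:

```bash
# Illustrative only: single-metric version of the documented HPA formula.
# Args: current replicas, current utilization %, target utilization %,
#       min replicas, max replicas.
desired_replicas() {
  local current="$1" util="$2" target="$3" min="$4" max="$5"
  # Integer ceiling of current * util / target
  local desired=$(( (current * util + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}

desired_replicas 2 75 50 2 10   # ceil(2 * 75 / 50) = 3
desired_replicas 4 200 50 2 10  # ceil(16) exceeds maxReplicas, clamped to 10
desired_replicas 2 10 50 2 10   # ceil(0.4) = 1, raised to minReplicas 2
```

With the 50% CPU target and two replicas averaging 75%, the formula asks for three replicas. When several metrics are configured, as in the manifest, the controller evaluates each one and uses the largest result.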
### 5. Check the HPA status
View the current HPA status:
```bash
kubectl get hpa nginx-hpa-demo
# Detailed description, including conditions and recent events
kubectl describe hpa nginx-hpa-demo
```
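For scripting, the same information can be read from the object's status instead of the human-readable table. A sketch, assuming the `autoscaling/v2` status fields `currentReplicas` and `desiredReplicas`; the function name `hpa_replicas` is hypothetical:

```bash
# Hypothetical helper: print "<current> <desired>" replica counts for an HPA.
# $1 = HPA name (defaults to the demo HPA from this guide).
hpa_replicas() {
  local name="${1:-nginx-hpa-demo}"
  kubectl get hpa "$name" \
    -o jsonpath='{.status.currentReplicas} {.status.desiredReplicas}'
}

# Usage (needs cluster access):
#   hpa_replicas nginx-hpa-demo
```

When the two numbers differ, the HPA has decided on a new replica count that the Deployment has not reached yet.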
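### 6. Simulate load
Use the hey tool to generate load and check that HPA reacts. The Service is of type ClusterIP, so the hostname `nginx-hpa-demo-service` resolves only inside the cluster; run hey from a Pod in the same namespace, or expose the Service and adjust the URL.
```bash
# Install hey (if not already installed)
go install github.com/rakyll/hey@latest
# Run the load test: 100000 requests in total, 100 concurrent workers
hey -n 100000 -c 100 http://nginx-hpa-demo-service/
```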
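If hey is not available inside the cluster, a throwaway Pod running a shell loop can generate load against the ClusterIP Service instead. This swaps hey for BusyBox `wget`, in the style of the load generator used in the upstream HPA walkthrough; the Pod name `load-generator`, the `busybox:1.28` image tag, and the function name `start_load` are assumptions:

```bash
# Hypothetical helper: start an interactive BusyBox Pod that requests the demo
# Service in a tight loop. Press Ctrl+C to stop; --rm deletes the Pod afterwards.
start_load() {
  kubectl run load-generator --rm -i --tty \
    --image=busybox:1.28 --restart=Never -- \
    /bin/sh -c 'while sleep 0.01; do wget -q -O- http://nginx-hpa-demo-service/ >/dev/null; done'
}

# Usage (needs cluster access, same namespace as the Service):
#   start_load
```

Whether a single loop like this pushes average CPU above the 50% target depends on the cluster; running several generators in parallel is one option if it does not.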
## Results
### HPA status changes
Once the load test is running, watch the HPA status:
```bash
# Monitor the HPA continuously, refreshing every 2 seconds
watch -n 2 kubectl get hpa nginx-hpa-demo
```
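During the load test the output looks similar to this:
```
NAME             REFERENCE                   TARGETS                   MINPODS   MAXPODS   REPLICAS   AGE
nginx-hpa-demo   Deployment/nginx-hpa-demo   75%/50%, 280Mi/64Mi/70%   2         10        4          5m
```
Treat the memory entry as schematic: with a `Utilization` target, kubectl reports memory as a current/target percentage of the 64Mi request, in the same form as the CPU entry, and the exact layout of the TARGETS column varies between kubectl versions.

As load rises, REPLICAS climbs from 2 to 4, 6, 8 or more, up to the `maxReplicas` limit of 10. After the load stops, HPA waits out the 300-second stabilization window and then scales in gradually, limited by the `scaleDown` policy of 10% per 60 seconds, until the replica count is back at `minReplicas`.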
### Inspect the Pods
```bash
kubectl get pods -l app=nginx-hpa-demo
# Show resource usage for one Pod (replace nginx-hpa-demo-xxxxx with a real Pod name)
kubectl top pod nginx-hpa-demo-xxxxx
```
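`kubectl top pod` reports CPU in millicores, whereas the HPA target is a percentage of the CPU request (`100m` in the demo Deployment). The conversion is simple arithmetic; the helper name `cpu_utilization_pct` and the sample readings below are made up for illustration:

```bash
# Illustrative only: convert a millicore reading such as "75m" into
# utilization as a percentage of the CPU request (integer division).
cpu_utilization_pct() {
  local usage="${1%m}" request="${2%m}"
  echo $(( usage * 100 / request ))
}

cpu_utilization_pct 75m 100m   # prints 75, above the 50% target
cpu_utilization_pct 30m 100m   # prints 30, below the 50% target
```

HPA averages this value over all Pods of the Deployment before comparing it with the target.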
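## Summary
This article walked through a complete Kubernetes HPA setup. The core value of HPA lies in:
1. **Automation**: scaling out and in is handled automatically, with no one watching
2. **Cost savings**: resources are adjusted dynamically to match actual load
3. **Stability**: the application scales out automatically on traffic spikes, keeping the service available
4. **Flexibility**: custom metrics and multiple scaling policies are supported

A few points to keep in mind in production:
- Set a sensible range for minReplicas and maxReplicas
- Choose metrics that fit the characteristics of the workload
- Tune the stabilization window to avoid frequent scaling back and forth
- Monitor HPA status regularly and refine the policy parameters

Used well, HPA makes resource usage in a Kubernetes cluster more efficient and keeps applications stable across different traffic patterns.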