How does the Kubernetes Horizontal Pod Autoscaler (HPA) work?
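HPA automatically scales the number of pod replicas in a workload (such as a Deployment or StatefulSet) based on observed metrics. CPU utilization, measured as a percentage of the pods' requested CPU, is the most commonly used metric, but HPA also supports memory as well as custom and external metrics served through the Kubernetes metrics APIs. CPU and memory metrics require a metrics source such as metrics-server running in the cluster. For example, the following command creates an HPA for a Deployment named my-app that keeps between 2 and 10 replicas and targets 60% average CPU utilization: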
kubectl autoscale deployment my-app --min=2 --max=10 --cpu-percent=60
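The HPA controller evaluates metrics every 15 seconds by default and adjusts the replica count, within the configured minimum and maximum, to keep the observed value close to the target. For event-driven scaling, you can use KEDA (Kubernetes Event-Driven Autoscaling), which builds on HPA by feeding it external metrics and can scale on signals such as Kafka consumer lag, SQS queue depth, and more.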
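To make the adjustment step concrete, here is a minimal Python sketch of the documented core calculation, desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), with the default 10% tolerance and the min/max bounds applied. The function name and parameters are hypothetical and chosen for illustration. The real controller also accounts for missing metrics, pods that are not yet ready, and scale-down stabilization, all of which this sketch leaves out.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the HPA replica calculation (illustrative helper only)."""
    # Ratio of the observed metric to the target, e.g. 90% CPU vs. a 60% target.
    ratio = current_value / target_value

    # Within the tolerance band (assumed default 0.1), no scaling happens.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    # Clamp to the bounds, matching --min and --max in the kubectl example.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 replicas averaging 90% CPU against a 60% target, bounds 2..10.
    print(desired_replicas(4, 90, 60, min_replicas=2, max_replicas=10))
```

With the settings from the kubectl command above, 4 replicas averaging 90% CPU would scale to 6, while 4 replicas averaging 63% would stay at 4 because the ratio falls inside the tolerance band.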