In-Depth and Practical AKS Cluster Autoscaling Strategies: Comprehensive Guide and Best Practices
Azure Kubernetes Service (AKS) offers powerful capabilities to automatically adjust the number of cluster nodes based on workload demand through the Cluster Autoscaler. Leveraging autoscaling ensures your applications maintain performance and availability while optimizing infrastructure costs by dynamically scaling node pools.
This comprehensive article dives into AKS cluster autoscaling strategies, from enabling and configuring the autoscaler on node pools to fine-tuning autoscaler profiles and monitoring autoscaling events. We’ll cover practical examples, best practices, and advanced tips to help intermediate to advanced Kubernetes operators maximize the value of AKS autoscaling.
Why Use Cluster Autoscaler in AKS?
The cluster autoscaler component is a Kubernetes-native mechanism that continuously monitors your workloads and the cluster’s resource availability. When pods cannot be scheduled due to insufficient resources, the autoscaler increases the node count in the affected node pool. Conversely, when nodes are underutilized or idle, it scales down to reduce cost.
This dynamic adjustment helps:
- Maintain application responsiveness under variable workloads.
- Optimize resource utilization and reduce cloud infrastructure expenses.
- Simplify cluster management by automating node scaling.
Before You Start: Prerequisites
- Azure CLI version 2.0.76 or later is required. Run `az --version` to check your version.
- Ensure you have appropriate permissions to manage AKS resources and node pools.
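If you want to script the prerequisite check, a simple version comparison works. This is a minimal sketch; the `installed` value is hardcoded for illustration and would normally be captured from the output of `az --version`.

```shell
#!/bin/sh
# Sketch: compare an installed Azure CLI version against the 2.0.76 minimum.
# "installed" is a hypothetical value; on a real machine, capture it from
# the output of `az --version`.
required="2.0.76"
installed="2.53.0"

# sort -V orders version strings numerically; if the minimum sorts first,
# the installed version is new enough.
oldest=$(printf '%s\n' "$required" "$installed" | sort -V | head -n1)
if [ "$oldest" = "$required" ]; then
  echo "Azure CLI is new enough"
else
  echo "Please upgrade the Azure CLI"
fi
```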
Enabling Cluster Autoscaler on AKS Clusters
1. On New Clusters
When creating a new AKS cluster, you can enable the cluster autoscaler directly on the node pool. Specify the minimum and maximum node counts to define the autoscaling boundaries.
az group create --name myResourceGroup --location eastus
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 1 \
--vm-set-type VirtualMachineScaleSets \
--load-balancer-sku standard \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 3 \
--generate-ssh-keys
This example creates a cluster with an initial single node, automatically scaling between one and three nodes as needed.
2. On Existing Clusters
You can enable autoscaling on an existing cluster’s node pool using the update command:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 3
This will activate autoscaling with the specified node count limits.
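After the update completes, you can read the settings back to confirm they took effect. A sketch using `az aks show` with a JMESPath query, reusing the placeholder resource names from the examples above (requires a live cluster):

```shell
# Sketch: read back per-pool autoscaler settings on a live cluster.
# Resource names are the placeholders used throughout this article.
az aks show \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --query "agentPoolProfiles[].{name:name, autoscale:enableAutoScaling, min:minCount, max:maxCount}" \
  --output table
```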
3. On Multiple Node Pools
AKS supports multiple node pools, allowing you to tailor autoscaling rules for different workloads or VM sizes. Enable autoscaling per node pool:
az aks nodepool update \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--name nodepool1 \
--update-cluster-autoscaler \
--min-count 1 \
--max-count 5
You can independently adjust autoscaling for each node pool based on workload characteristics.
4. Disabling or Re-Enabling Autoscaling
To disable:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--disable-cluster-autoscaler
Note that disabling autoscaling does not remove nodes immediately; manual scaling is needed if desired.
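Once autoscaling is disabled, the node count stays wherever the autoscaler left it. If you then want fewer nodes, scale the pool manually. A sketch with the placeholder names used above (requires a live cluster):

```shell
# Sketch: manually scale the cluster after disabling the autoscaler.
az aks scale \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --node-count 2
```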
To re-enable, run the same update command with `--enable-cluster-autoscaler`, specifying `--min-count` and `--max-count` again.
Best Practices for Configuring Cluster Autoscaler
Define Appropriate Min/Max Counts
- Minimum nodes ensure baseline availability and prevent the cluster from scaling below critical capacity.
- Maximum nodes limit cost exposure and control cluster size.
Tune these numbers based on expected workload peaks and budget constraints.
Use Multiple Node Pools
Separate workloads by node pool to optimize resource allocation:
- Critical workloads on dedicated pools with autoscaling.
- Batch or ephemeral workloads on separate pools.
This separation allows distinct autoscaling policies and VM types.
Zone-Aware Autoscaling
If using node pools spanning multiple availability zones, consider creating one node pool per zone and enabling the `balance-similar-node-groups` autoscaler profile option. This helps maintain balanced node counts across zones, improving fault tolerance and scheduling.
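A sketch of that layout, assuming three zones and the placeholder names used throughout this article (requires a live cluster in a region that supports availability zones):

```shell
# Sketch: one autoscaled node pool per availability zone, then enable
# balance-similar-node-groups so the autoscaler keeps their sizes even.
# Pool names ("zonepool1" etc.) are hypothetical.
for zone in 1 2 3; do
  az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name "zonepool${zone}" \
    --zones "$zone" \
    --enable-cluster-autoscaler \
    --min-count 1 \
    --max-count 4
done

az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --cluster-autoscaler-profile balance-similar-node-groups=true
```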
Advanced: Configuring Cluster Autoscaler Profile
The cluster autoscaler profile fine-tunes autoscaling behavior globally across all node pools with autoscaling enabled. Use it to adapt scaling sensitivity, timing, and eviction behavior.
Important Profile Settings
| Setting | Description | Default |
|---|---|---|
| `scan-interval` | Frequency of autoscaler reevaluation | 10s |
| `scale-down-unneeded-time` | Time a node should be underutilized before scale-down | 10m |
| `scale-down-delay-after-add` | Delay before scale-down evaluation after a scale-up | 10m |
| `scale-down-utilization-threshold` | Resource utilization threshold below which a node is eligible for scale-down | 0.5 |
| `ignore-daemonsets-utilization` | Whether DaemonSet pods are ignored when calculating utilization | false |
| `balance-similar-node-groups` | Balances node counts among similar node pools | false |
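To make the utilization threshold concrete, here is a small sketch of the eligibility test the autoscaler applies per node: when the ratio of requested to allocatable resources falls below `scale-down-utilization-threshold`, the node becomes a scale-down candidate. The CPU figures are hypothetical.

```shell
#!/bin/sh
# Sketch: scale-down eligibility check for a single node (hypothetical figures).
# A node is a candidate when requested / allocatable < the threshold (default 0.5).
requested_millicores=900
allocatable_millicores=2000
threshold=0.5

util=$(awk -v r="$requested_millicores" -v a="$allocatable_millicores" \
  'BEGIN { printf "%.2f", r / a }')
eligible=$(awk -v u="$util" -v t="$threshold" \
  'BEGIN { if (u < t) print "yes"; else print "no" }')
echo "utilization=$util eligible_for_scale_down=$eligible"
```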
Setting Autoscaler Profile on Cluster Creation
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 1 \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 3 \
--cluster-autoscaler-profile scan-interval=30s \
--generate-ssh-keys
Updating Autoscaler Profile on Existing Clusters
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--cluster-autoscaler-profile scale-down-unneeded-time=15m,balance-similar-node-groups=true
Example: Aggressive Scale Down Profile
For workloads that experience frequent scaling, you can configure the autoscaler to act more aggressively:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--cluster-autoscaler-profile scan-interval=30s,scale-down-delay-after-add=0m,scale-down-unneeded-time=3m,max-empty-bulk-delete=1000
Note: Aggressive scaling can cause instability if scale-outs and scale-ins happen too rapidly. Adjust with caution.
Reset Profile to Defaults
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--cluster-autoscaler-profile ""
Monitoring and Troubleshooting Cluster Autoscaler
Retrieving Autoscaler Logs
AKS runs the cluster autoscaler inside the managed control plane. To diagnose issues:
- Enable resource logs for the cluster autoscaler to send logs to Azure Log Analytics.
- Use queries like:
AzureDiagnostics
| where Category == "cluster-autoscaler"
- Use `kubectl` to check autoscaler events:
kubectl get events --field-selector source=cluster-autoscaler,reason=NotTriggerScaleUp
kubectl get events --field-selector source=cluster-autoscaler,type=Warning
- Check autoscaler status configmap:
kubectl get configmap -n kube-system cluster-autoscaler-status -o yaml
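The status ConfigMap carries plain text inside a YAML field; the health and activity lines are the quickest signal. A sketch that filters a hypothetical excerpt of that status (on a live cluster you would pipe the `kubectl` output into the same filter):

```shell
#!/bin/sh
# Sketch: pull the key lines out of a hypothetical cluster-autoscaler-status
# excerpt. On a real cluster, pipe the configmap output into the same grep.
status='Cluster-wide:
  Health:      Healthy (ready=3 unready=0)
  ScaleUp:     NoActivity
  ScaleDown:   NoCandidates'

echo "$status" | grep -E 'Health|ScaleUp|ScaleDown'
```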
Azure Portal Insights
Navigate to your AKS cluster’s Node pools section to view autoscaler events, warnings, and scale-up triggers.
Practical Scenario: Scaling a Bursty Web Application
Imagine hosting a web application with highly variable traffic spikes every few minutes. To efficiently handle this:
- Enable autoscaling on node pools running the web frontend.
- Adjust the autoscaler profile for bursty workloads:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--cluster-autoscaler-profile scan-interval=20s,scale-down-delay-after-add=10m,scale-down-unneeded-time=5m
- Set minimum nodes to handle baseline traffic and maximum nodes to cap costs.
- Use multiple node pools if different backend services have distinct scaling needs.
Complementary Scaling: Horizontal and Vertical Pod Autoscalers
While the cluster autoscaler adjusts node counts, scaling the application itself means adjusting pod replicas and resource requests. Combine it with:
- Horizontal Pod Autoscaler (HPA): Automatically scale pods based on CPU, memory, or custom metrics.
- Vertical Pod Autoscaler (VPA): Adjust pod resource requests and limits based on observed usage.
Together, these autoscalers enable a fully automated, resource-efficient Kubernetes environment.
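As a concrete illustration of how the HPA chooses replica counts (which in turn produces the pending pods that trigger the cluster autoscaler), its core rule is desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). A sketch with hypothetical CPU figures:

```shell
#!/bin/sh
# Sketch: the HPA replica calculation with hypothetical numbers.
# desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue)
current_replicas=3
current_cpu=90   # observed average CPU utilization (%), hypothetical
target_cpu=70    # HPA target utilization (%), hypothetical

desired=$(awk -v r="$current_replicas" -v c="$current_cpu" -v t="$target_cpu" \
  'BEGIN { d = r * c / t; i = int(d); if (d > i) i += 1; print i }')
echo "desired_replicas=$desired"
```

Here 3 × 90 / 70 ≈ 3.86, which rounds up to 4 replicas; if those 4 pods no longer fit on existing nodes, the cluster autoscaler adds a node.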
Summary and Best Practices
| Best Practice | Description |
|---|---|
| Define clear min/max node counts | Prevent over-scaling and ensure baseline capacity |
| Use multiple node pools | Tailor autoscaling for workload types and VM SKUs |
| Configure autoscaler profiles | Fine-tune scale-up/down sensitivity and timing |
| Monitor autoscaler logs/events | Proactively diagnose scaling issues |
| Combine with HPA and VPA | Achieve pod-level scaling and resource optimization |
Conclusion
AKS cluster autoscaler is a foundational tool for dynamically managing node resources in response to application demand. By understanding and applying the detailed configuration options and profiles, Kubernetes operators can create resilient, cost-effective clusters that adapt fluidly to workload variations.
Use the practical CLI examples and best practices shared in this article to confidently implement autoscaling strategies tailored to your applications’ needs.
References
- AKS Cluster Autoscaler Documentation
- Kubernetes Cluster Autoscaler GitHub
- Azure CLI for AKS
- Scale applications in AKS
- Vertical Pod Autoscaler
Author: Joseph Perez