In-Depth and Practical AKS Cluster Autoscaling Strategies: Comprehensive Guide and Best Practices
Azure Kubernetes Service (AKS) offers powerful capabilities to automatically adjust the number of cluster nodes based on workload demand through the Cluster Autoscaler. Leveraging autoscaling ensures your applications maintain performance and availability while optimizing infrastructure costs by dynamically scaling node pools.
This comprehensive article dives into AKS cluster autoscaling strategies, from enabling and configuring the autoscaler on node pools to fine-tuning autoscaler profiles and monitoring autoscaling events. We’ll cover practical examples, best practices, and advanced tips to help intermediate to advanced Kubernetes operators maximize the value of AKS autoscaling.
Why Use Cluster Autoscaler in AKS?
The cluster autoscaler component is a Kubernetes-native mechanism that continuously monitors your workloads and the cluster’s resource availability. When pods cannot be scheduled due to insufficient resources, the autoscaler increases the node count in the affected node pool. Conversely, when nodes are underutilized or idle, it scales down to reduce cost.
This dynamic adjustment helps:
- Maintain application responsiveness under variable workloads.
- Optimize resource utilization and reduce cloud infrastructure expenses.
- Simplify cluster management by automating node scaling.
Before You Start: Prerequisites
- Azure CLI version 2.0.76 or later is required. Run `az --version` to check your version.
- Ensure you have appropriate permissions to manage AKS resources and node pools.
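If you want to script the prerequisite check, a simple version comparison works. This is a minimal sketch; the `installed` value is hardcoded for illustration and would normally be captured from the output of `az --version`.

```shell
#!/bin/sh
# Sketch: compare an installed Azure CLI version against the 2.0.76 minimum.
# "installed" is a hypothetical value; on a real machine, capture it from
# the output of `az --version`.
required="2.0.76"
installed="2.53.0"

# sort -V orders version strings numerically; if the minimum sorts first,
# the installed version is new enough.
oldest=$(printf '%s\n' "$required" "$installed" | sort -V | head -n1)
if [ "$oldest" = "$required" ]; then
  echo "Azure CLI is new enough"
else
  echo "Please upgrade the Azure CLI"
fi
```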
Enabling Cluster Autoscaler on AKS Clusters
1. On New Clusters
When creating a new AKS cluster, you can enable the cluster autoscaler directly on the node pool. Specify the minimum and maximum node counts to define the autoscaling boundaries.
az group create --name myResourceGroup --location eastus
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 1 \
--vm-set-type VirtualMachineScaleSets \
--load-balancer-sku standard \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 3 \
--generate-ssh-keys
This example creates a cluster with an initial single node, automatically scaling between one and three nodes as needed.
2. On Existing Clusters
You can enable autoscaling on an existing cluster’s node pool using the update command:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 3
This will activate autoscaling with the specified node count limits.
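After the update completes, you can read the settings back to confirm they took effect. A sketch using `az aks show` with a JMESPath query, reusing the placeholder resource names from the examples above (requires a live cluster):

```shell
# Sketch: read back per-pool autoscaler settings on a live cluster.
# Resource names are the placeholders used throughout this article.
az aks show \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --query "agentPoolProfiles[].{name:name, autoscale:enableAutoScaling, min:minCount, max:maxCount}" \
  --output table
```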
3. On Multiple Node Pools
AKS supports multiple node pools, allowing you to tailor autoscaling rules for different workloads or VM sizes. Enable autoscaling per node pool:
az aks nodepool update \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--name nodepool1 \
--update-cluster-autoscaler \
--min-count 1 \
--max-count 5
You can independently adjust autoscaling for each node pool based on workload characteristics.
4. Disabling or Re-Enabling Autoscaling
To disable:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--disable-cluster-autoscaler
Note that disabling autoscaling does not remove nodes immediately; manual scaling is needed if desired.
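Once autoscaling is disabled, the node count stays wherever the autoscaler left it. If you then want fewer nodes, scale the pool manually. A sketch with the placeholder names used above (requires a live cluster):

```shell
# Sketch: manually scale the cluster after disabling the autoscaler.
az aks scale \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --node-count 2
```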
To re-enable, run the same update command with `--enable-cluster-autoscaler`, specifying `--min-count` and `--max-count` again.
Best Practices for Configuring Cluster Autoscaler
Define Appropriate Min/Max Counts
- Minimum nodes ensure baseline availability and prevent the cluster from scaling below critical capacity.
- Maximum nodes limit cost exposure and control cluster size.
Tune these numbers based on expected workload peaks and budget constraints.
Use Multiple Node Pools
Separate workloads by node pool to optimize resource allocation:
- Critical workloads on dedicated pools with autoscaling.
- Batch or ephemeral workloads on separate pools.
This separation allows distinct autoscaling policies and VM types.
Zone-Aware Autoscaling
If using node pools spanning multiple availability zones, consider creating one node pool per zone and enabling the `balance-similar-node-groups` autoscaler profile option. This helps maintain balanced node counts across zones, improving fault tolerance and scheduling.
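A sketch of that layout, assuming three zones and the placeholder names used throughout this article (requires a live cluster in a region that supports availability zones):

```shell
# Sketch: one autoscaled node pool per availability zone, then enable
# balance-similar-node-groups so the autoscaler keeps their sizes even.
# Pool names ("zonepool1" etc.) are hypothetical.
for zone in 1 2 3; do
  az aks nodepool add \
    --resource-group myResourceGroup \
    --cluster-name myAKSCluster \
    --name "zonepool${zone}" \
    --zones "$zone" \
    --enable-cluster-autoscaler \
    --min-count 1 \
    --max-count 4
done

az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --cluster-autoscaler-profile balance-similar-node-groups=true
```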
Advanced: Configuring Cluster Autoscaler Profile
The cluster autoscaler profile fine-tunes autoscaling behavior globally across all node pools with autoscaling enabled. Use it to adapt scaling sensitivity, timing, and eviction behavior.
Important Profile Settings
| Setting | Description | Default |
|---|---|---|
| `scan-interval` | Frequency of autoscaler reevaluation | 10s |
| `scale-down-unneeded-time` | Time a node should be underutilized before scale-down | 10m |
| `scale-down-delay-after-add` | Delay before scale-down evaluation after a scale-up | 10m |
| `scale-down-utilization-threshold` | Resource utilization threshold below which a node is eligible for scale-down | 0.5 |
| `ignore-daemonsets-utilization` | Whether DaemonSet pods are ignored when calculating utilization | false |
| `balance-similar-node-groups` | Balances node counts among similar node pools | false |
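To make the utilization threshold concrete, here is a small sketch of the eligibility test the autoscaler applies per node: when the ratio of requested to allocatable resources falls below `scale-down-utilization-threshold`, the node becomes a scale-down candidate. The CPU figures are hypothetical.

```shell
#!/bin/sh
# Sketch: scale-down eligibility check for a single node (hypothetical figures).
# A node is a candidate when requested / allocatable < the threshold (default 0.5).
requested_millicores=900
allocatable_millicores=2000
threshold=0.5

util=$(awk -v r="$requested_millicores" -v a="$allocatable_millicores" \
  'BEGIN { printf "%.2f", r / a }')
eligible=$(awk -v u="$util" -v t="$threshold" \
  'BEGIN { if (u < t) print "yes"; else print "no" }')
echo "utilization=$util eligible_for_scale_down=$eligible"
```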
Setting Autoscaler Profile on Cluster Creation
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 1 \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 3 \
--cluster-autoscaler-profile scan-interval=30s \
--generate-ssh-keys
Updating Autoscaler Profile on Existing Clusters
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--cluster-autoscaler-profile scale-down-unneeded-time=15m,balance-similar-node-groups=true
Example: Aggressive Scale Down Profile
For workloads that experience frequent scaling, you can configure the autoscaler to act more aggressively:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--cluster-autoscaler-profile scan-interval=30s,scale-down-delay-after-add=0m,scale-down-unneeded-time=3m,max-empty-bulk-delete=1000
Note: Aggressive scaling can cause instability if scale-outs and scale-ins happen too rapidly. Adjust with caution.
Reset Profile to Defaults
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--cluster-autoscaler-profile ""
Monitoring and Troubleshooting Cluster Autoscaler
Retrieving Autoscaler Logs
AKS runs the cluster autoscaler inside the managed control plane. To diagnose issues:
- Enable resource logs for the cluster autoscaler to send logs to Azure Log Analytics.
- Use queries like:
AzureDiagnostics
| where Category == "cluster-autoscaler"
- Use `kubectl` to check autoscaler events:
kubectl get events --field-selector source=cluster-autoscaler,reason=NotTriggerScaleUp
kubectl get events --field-selector source=cluster-autoscaler,type=Warning
- Check autoscaler status configmap:
kubectl get configmap -n kube-system cluster-autoscaler-status -o yaml
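The status ConfigMap carries plain text inside a YAML field; the health and activity lines are the quickest signal. A sketch that filters a hypothetical excerpt of that status (on a live cluster you would pipe the `kubectl` output into the same filter):

```shell
#!/bin/sh
# Sketch: pull the key lines out of a hypothetical cluster-autoscaler-status
# excerpt. On a real cluster, pipe the configmap output into the same grep.
status='Cluster-wide:
  Health:      Healthy (ready=3 unready=0)
  ScaleUp:     NoActivity
  ScaleDown:   NoCandidates'

echo "$status" | grep -E 'Health|ScaleUp|ScaleDown'
```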
Azure Portal Insights
Navigate to your AKS cluster’s Node pools section to view autoscaler events, warnings, and scale-up triggers.
Practical Scenario: Scaling a Bursty Web Application
Imagine hosting a web application with highly variable traffic spikes every few minutes. To efficiently handle this:
- Enable autoscaling on node pools running the web frontend.
- Adjust the autoscaler profile for bursty workloads:
az aks update \
--resource-group myResourceGroup \
--name myAKSCluster \
--cluster-autoscaler-profile scan-interval=20s,scale-down-delay-after-add=10m,scale-down-unneeded-time=5m
- Set minimum nodes to handle baseline traffic and maximum nodes to cap costs.
- Use multiple node pools if different backend services have distinct scaling needs.
Complementary Scaling: Horizontal and Vertical Pod Autoscalers
While the cluster autoscaler adjusts node counts, scaling the application itself means adjusting pod replicas and resource requests. Combine it with:
- Horizontal Pod Autoscaler (HPA): Automatically scale pods based on CPU, memory, or custom metrics.
- Vertical Pod Autoscaler (VPA): Adjust pod resource requests and limits based on observed usage.
Together, these autoscalers enable a fully automated, resource-efficient Kubernetes environment.
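As a concrete illustration of how the HPA chooses replica counts (which in turn produces the pending pods that trigger the cluster autoscaler), its core rule is desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). A sketch with hypothetical CPU figures:

```shell
#!/bin/sh
# Sketch: the HPA replica calculation with hypothetical numbers.
# desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue)
current_replicas=3
current_cpu=90   # observed average CPU utilization (%), hypothetical
target_cpu=70    # HPA target utilization (%), hypothetical

desired=$(awk -v r="$current_replicas" -v c="$current_cpu" -v t="$target_cpu" \
  'BEGIN { d = r * c / t; i = int(d); if (d > i) i += 1; print i }')
echo "desired_replicas=$desired"
```

Here 3 × 90 / 70 ≈ 3.86, which rounds up to 4 replicas; if those 4 pods no longer fit on existing nodes, the cluster autoscaler adds a node.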
Summary and Best Practices
| Best Practice | Description |
|---|---|
| Define clear min/max node counts | Prevent over-scaling and ensure baseline capacity |
| Use multiple node pools | Tailor autoscaling for workload types and VM SKUs |
| Configure autoscaler profiles | Fine-tune scale-up/down sensitivity and timing |
| Monitor autoscaler logs/events | Proactively diagnose scaling issues |
| Combine with HPA and VPA | Achieve pod-level scaling and resource optimization |
Conclusion
AKS cluster autoscaler is a foundational tool for dynamically managing node resources in response to application demand. By understanding and applying the detailed configuration options and profiles, Kubernetes operators can create resilient, cost-effective clusters that adapt fluidly to workload variations.
Use the practical CLI examples and best practices shared in this article to confidently implement autoscaling strategies tailored to your applications’ needs.
References
- AKS Cluster Autoscaler Documentation
- Kubernetes Cluster Autoscaler GitHub
- Azure CLI for AKS
- Scale applications in AKS
- Vertical Pod Autoscaler
Author: Joseph Perez