Kubernetes provides several mechanisms for scaling workloads. The three primary ones are the Vertical Pod Autoscaler (VPA), the Horizontal Pod Autoscaler (HPA), and the Cluster Autoscaler (CA).
Cluster Autoscaler automatically adapts the number of Kubernetes nodes in your cluster to your requirements. When the number of pods that are pending or “unschedulable” increases, indicating there are insufficient resources in the cluster, CA adds new nodes to the cluster. It can also scale down nodes if they have been underutilized for a long period of time.
The Cluster Autoscaler is typically installed as a Deployment object in a cluster. Only one replica is active at a time; when several replicas are deployed for high availability, leader election determines which one makes scaling decisions.
How Cluster Autoscaler Works
For simplicity, we’ll explain the Cluster Autoscaler process in a scale out scenario. When the number of pending (unschedulable) pods in the cluster increases, indicating a lack of resources, CA automatically starts new nodes.
This occurs in four steps:
CA checks for pending pods, scanning at an interval of 10 seconds (configurable using the --scan-interval flag).
If there are pending pods, CA spins up new nodes to scale out the cluster, within the constraints configured by the administrator. CA integrates with public cloud platforms such as AWS and Azure, using their autoscaling capabilities to add more virtual machines.
Kubernetes registers the new virtual machines as nodes in the control plane, allowing the Kubernetes scheduler to run pods on them.
The Kubernetes scheduler assigns the pending pods to the new nodes.
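The four steps above can be sketched as a toy simulation. This is a minimal illustration, not CA's actual algorithm: the `NodeGroup` class, its fields, and the idea of counting node capacity in pods are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class NodeGroup:
    """Toy stand-in for a cloud node group (all fields are hypothetical)."""
    name: str
    size: int           # current number of nodes
    max_size: int       # upper bound configured by the administrator
    pods_per_node: int  # simplification: capacity counted in pods


def nodes_to_add(pending_pods: int, group: NodeGroup) -> int:
    """Return how many nodes to request, capped by the group's max_size."""
    if pending_pods <= 0:
        return 0
    # Ceiling division: enough nodes to hold every pending pod.
    wanted = -(-pending_pods // group.pods_per_node)
    headroom = group.max_size - group.size
    return max(0, min(wanted, headroom))


def scan_once(pending_pods: int, group: NodeGroup) -> int:
    """One scan iteration: scale out, then return pods still pending."""
    added = nodes_to_add(pending_pods, group)   # steps 1-2: detect and scale out
    group.size += added                         # step 3: new nodes register
    scheduled = min(pending_pods, added * group.pods_per_node)  # step 4
    return pending_pods - scheduled


group = NodeGroup(name="workers", size=3, max_size=5, pods_per_node=4)
remaining = scan_once(pending_pods=10, group=group)
print(group.size, remaining)
```

In this example the group can only grow by two nodes before reaching its maximum, so some pods stay pending, which mirrors the "within the constraints configured by the administrator" caveat in step 2.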
Cluster Autoscaler is a useful mechanism, but it can sometimes work differently than expected. Here are the primary ways to diagnose an issue with CA:
Logs on control plane nodes
Kubernetes control plane nodes create logs of Cluster Autoscaler activity in the following path: /var/log/cluster-autoscaler.log
Events on control plane nodes
The kube-system/cluster-autoscaler-status ConfigMap emits the following events:
ScaledUpGroup—this event means CA increased the size of the node group (provides previous size and current size)
ScaleDownEmpty—this event means CA removed a node that did not have any user pods running on it (only system pods)
ScaleDown—this event means CA removed a node that had user pods running on it. The event will include the names of all pods that are rescheduled as a result.
Events on nodes
ScaleDown—this event means CA is scaling down the node. There can be multiple events, indicating different stages of the scale-down operation.
ScaleDownFailed—this event means CA tried to remove the node but did not succeed. It provides the resulting error message.
Events on pods
TriggeredScaleUp—this event means CA scaled up the cluster to enable this pod to schedule.
NotTriggerScaleUp—this event means CA was not able to scale up a node group to allow this pod to schedule.
ScaleDown—this event means CA tried to evict this pod from a node, in order to drain it and then scale it down.
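To pick these events out of a larger event stream, a short filter can help. The sketch below assumes the events have been exported as JSON (for example with `kubectl get events --all-namespaces -o json`) and relies only on the `reason` and `involvedObject` fields of each event; the function name and sample data are made up for illustration.

```python
import json

# Event reasons named in the lists above.
CA_REASONS = {
    "ScaledUpGroup", "ScaleDownEmpty", "ScaleDown",
    "ScaleDownFailed", "TriggeredScaleUp", "NotTriggerScaleUp",
}


def summarize_ca_events(events_json: str) -> list:
    """Return (kind, name, reason) for each event whose reason is CA-related."""
    items = json.loads(events_json).get("items", [])
    summary = []
    for event in items:
        reason = event.get("reason", "")
        if reason in CA_REASONS:
            obj = event.get("involvedObject", {})
            summary.append((obj.get("kind", "?"), obj.get("name", "?"), reason))
    return summary


# Hypothetical sample data standing in for real kubectl output.
sample = json.dumps({"items": [
    {"reason": "TriggeredScaleUp",
     "involvedObject": {"kind": "Pod", "name": "web-1"}},
    {"reason": "Pulled",
     "involvedObject": {"kind": "Pod", "name": "web-1"}},
    {"reason": "ScaleDownFailed",
     "involvedObject": {"kind": "Node", "name": "node-a"}},
]})

for row in summarize_ca_events(sample):
    print(row)
```

Note that a reason such as ScaleDown appears on ConfigMaps, nodes, and pods, so the object kind in each row tells you which of the meanings above applies.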
Cluster Autoscaler: Troubleshooting for Specific Error Scenarios
Here are specific error scenarios that can occur with the Cluster Autoscaler and how to perform initial troubleshooting.
These instructions will allow you to debug simple error scenarios, but for more complex errors involving multiple moving parts in the cluster, you might need automated troubleshooting tools.
Nodes with Low Utilization are Not Scaled Down
Here are reasons why CA might fail to scale down a node, and what you can do about them.
| Reason Cluster Doesn’t Scale Down | What You Can Do |
| --- | --- |
| Pod specs indicate it should not be evicted from the node. | Check the pod for the cluster-autoscaler.kubernetes.io/safe-to-evict: "false" annotation or a restrictive PodDisruptionBudget, and remove or relax it if the pod can safely be moved. |
| Node group already has the minimum size. | Reduce minimum size in CA configuration. |
| The node has “scale-down disabled” annotation. | Remove the annotation from the node. |
| CA is waiting for the duration specified in one of its scale-down delay flags. | Reduce the time specified in the relevant flag, or wait the specified time after the relevant event. |
| Failed attempt to remove the node (CA will wait another 5 minutes before trying again). | Wait 5 minutes and see if the issue is resolved. |
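Some of these conditions can be checked mechanically. The following is a rough sketch, assuming node and pod objects are available as dictionaries in the shape returned by `kubectl get ... -o json`; the function name and the sample objects are hypothetical, and the sketch covers only the annotation and minimum-size reasons, not the timing-related ones.

```python
# Annotations that CA honors when deciding whether a node may be removed.
SCALE_DOWN_DISABLED = "cluster-autoscaler.kubernetes.io/scale-down-disabled"
SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"


def scale_down_blockers(node, pods, group_size, group_min_size):
    """List reasons (from the scale-down cases above) that would block removing this node."""
    reasons = []
    node_annotations = node.get("metadata", {}).get("annotations") or {}
    if node_annotations.get(SCALE_DOWN_DISABLED) == "true":
        reasons.append("node has the scale-down-disabled annotation")
    if group_size <= group_min_size:
        reasons.append("node group is at minimum size")
    for pod in pods:
        meta = pod.get("metadata", {})
        if (meta.get("annotations") or {}).get(SAFE_TO_EVICT) == "false":
            reasons.append(f"pod {meta.get('name', '?')} is marked not safe to evict")
    return reasons


# Hypothetical sample objects.
node = {"metadata": {"name": "node-a",
                     "annotations": {SCALE_DOWN_DISABLED: "true"}}}
pods = [
    {"metadata": {"name": "cache-0", "annotations": {SAFE_TO_EVICT: "false"}}},
    {"metadata": {"name": "web-1"}},
]

for reason in scale_down_blockers(node, pods, group_size=3, group_min_size=3):
    print(reason)
```

An empty result does not guarantee that the node will be scaled down; it only rules out the reasons this sketch knows about.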
Pending Pods Exist But Cluster Does Not Scale Up
Here are reasons why CA might fail to scale up the cluster, and what you can do about them.
| Reason Cluster Doesn’t Scale Up | What You Can Do |
| --- | --- |
| Pending pods have resource requests too high to be satisfied by any new node. | Enable CA to add large nodes, or reduce resource requests by pods. |
| All suitable node groups are at maximum size. | Increase the maximum size of the relevant node group. |
| Pending pods are not able to schedule on new nodes due to selectors or other settings. | Modify pod manifests to enable some pods to schedule on the new nodes. Learn more in our guide to node affinity. |
| NoVolumeZoneConflict error: this indicates that a StatefulSet needs to run in the same zone with a PersistentVolume (PV), but that zone has already reached its scaling limit. | From Kubernetes 1.13 onwards, you can run separate node groups per zone and use the --balance-similar-node-groups flag to keep them balanced across zones. |
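The first three reasons can be illustrated with a small feasibility check. This is a simplified sketch rather than CA's real scheduling simulation: `NodeTemplate`, `PendingPod`, and `scale_up_blockers` are hypothetical names, resources are expressed as plain integers, and only exact-match node selectors are considered.

```python
from dataclasses import dataclass, field


@dataclass
class NodeTemplate:
    """Hypothetical description of the node a group would add."""
    cpu_millicores: int
    memory_mib: int
    labels: dict = field(default_factory=dict)


@dataclass
class PendingPod:
    """Hypothetical summary of a pending pod's scheduling needs."""
    name: str
    cpu_millicores: int
    memory_mib: int
    node_selector: dict = field(default_factory=dict)


def scale_up_blockers(pod, template, group_size, group_max_size):
    """List reasons why adding a node from this group would not help the pod."""
    reasons = []
    if (pod.cpu_millicores > template.cpu_millicores
            or pod.memory_mib > template.memory_mib):
        reasons.append("requests exceed what a new node offers")
    if group_size >= group_max_size:
        reasons.append("node group is at maximum size")
    # Selector keys whose required value the new node would not carry.
    mismatched = sorted(k for k, v in pod.node_selector.items()
                        if template.labels.get(k) != v)
    if mismatched:
        reasons.append("nodeSelector not satisfied: " + ", ".join(mismatched))
    return reasons


template = NodeTemplate(cpu_millicores=4000, memory_mib=16384, labels={"zone": "a"})
pod = PendingPod("batch-1", cpu_millicores=8000, memory_mib=4096,
                 node_selector={"disktype": "ssd"})

for reason in scale_up_blockers(pod, template, group_size=5, group_max_size=5):
    print(reason)
```

Running a check like this against each node group shows quickly whether the fix is on the pod side (requests, selectors) or the group side (node size, maximum size).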
Cluster Autoscaler Stops Working
If CA appears to have stopped working, follow these steps to debug the problem:
Check if CA is running: look at the last update time reported by the kube-system/cluster-autoscaler-status ConfigMap. It should be no more than 3 minutes old.
Check if the cluster and node groups are in a healthy state: the ConfigMap reports this.
Check if there are unready nodes (CA version 1.24 and later): if some nodes appear unready, check the resourceUnready count. If any nodes are marked as resourceUnready, the problem is likely with a device driver failing to install a required hardware resource.
If both cluster and CA are healthy, check:
Nodes with low utilization: if these nodes are not being scaled down, see the scale-down troubleshooting section above.
Pending pods that do not trigger a scale up: see the scale-up troubleshooting section above.
Control plane CA logs: these can indicate what is preventing CA from scaling up or down, why it cannot remove a pod, or what the scale-up plan was.
CA events on the pod object: these can provide clues as to why CA could not reschedule the pod.
Cloud provider resource quota: if there are failed attempts to add nodes, the problem could be a resource quota with the public cloud provider.
Networking issues: if the cloud provider is managing to create nodes but they are not connecting to the cluster, this could indicate a networking issue.
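The order of these checks can be captured in a small triage helper. The sketch below is only an illustration: it assumes you have already extracted the status timestamp and health state from the ConfigMap (the extraction is not shown because the exact format may depend on your CA version), and the function name and return strings are invented for the example.

```python
from datetime import datetime, timedelta, timezone

# The status is expected to be refreshed at least this often.
MAX_STATUS_AGE = timedelta(minutes=3)


def next_check(last_updated: datetime, now: datetime, cluster_healthy: bool) -> str:
    """Suggest what to look at next, following the debugging order above."""
    if now - last_updated > MAX_STATUS_AGE:
        return "CA may not be running: inspect the CA pod and its logs"
    if not cluster_healthy:
        return "CA is running: investigate unhealthy cluster or node groups"
    return ("CA and cluster look healthy: review underutilized nodes, "
            "pending pods, CA logs, pod events, quotas, and networking")


# Hypothetical timestamps for illustration.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
stale = now - timedelta(minutes=10)
fresh = now - timedelta(seconds=15)

print(next_check(stale, now, cluster_healthy=True))
print(next_check(fresh, now, cluster_healthy=False))
print(next_check(fresh, now, cluster_healthy=True))
```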
Cluster Autoscaler Troubleshooting with Komodor
Kubernetes troubleshooting relies on the ability to quickly contextualize the problem with what’s happening in the rest of the cluster. More often than not, you will be conducting your investigation during fires in production. The major challenge is correlating service-level incidents with other events happening in the underlying infrastructure.
When using Cluster Autoscaler, there can be a variety of issues related to existing nodes or new nodes added to the cluster by CA automation. Komodor can help with our new ‘Node Status’ view, built to pinpoint correlations between service or deployment issues and changes in the underlying node infrastructure. With this view you can rapidly:
See service-to-node associations
Correlate service and node health issues
Gain visibility over node capacity allocations, restrictions, and limitations
Identify “noisy neighbors” that use up cluster resources
Keep track of changes in managed clusters
Get fast access to historical node-level event data
Beyond node error remediations, Komodor can help troubleshoot a variety of Kubernetes errors and issues, acting as a single source of truth (SSOT) for all of your K8s troubleshooting needs. Komodor provides:
Change intelligence: Every issue is a result of a change. Within seconds we can help you understand exactly who did what and when.
In-depth visibility: A complete activity timeline showing all code and config changes, deployments, alerts, code diffs, pod logs, and more, all within one pane of glass with easy drill-down options.
Insights into service dependencies: An easy way to understand cross-service changes and visualize their ripple effects across your entire system.
Seamless notifications: Direct integration with your existing communication channels (e.g., Slack) so you’ll have all the information you need, when you need it.