Auto-scaling
This guide covers auto-scaling for CAPL clusters. The recommended tool for auto-scaling on Cluster API is Cluster Autoscaler.
Flavor
The auto-scaling feature is provided by an add-on as part of the Cluster Autoscaler flavor.
Configuration
By default, the Cluster Autoscaler add-on runs in the management cluster, managing an external workload cluster.
+------------+             +----------+
|    mgmt    |             | workload |
| ---------- | kubeconfig  |          |
| autoscaler +------------>|          |
+------------+             +----------+
A separate Cluster Autoscaler is deployed for each workload cluster, configured to only monitor node groups for the specific namespace and cluster name combination.
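The per-cluster scoping described above can be sketched as the arguments one autoscaler instance might receive. This is a minimal illustration, assuming the upstream Cluster Autoscaler flags `--cloud-provider=clusterapi`, `--kubeconfig`, and `--node-group-auto-discovery`. The helper name `autoscaler_args` and the kubeconfig path are hypothetical, and the arguments the CAPL add-on actually renders may differ.

```python
def autoscaler_args(namespace: str, cluster_name: str, kubeconfig_path: str) -> list[str]:
    """Build args that scope one autoscaler to a single workload cluster.

    Hypothetical helper: it only illustrates how the namespace and cluster
    name combination narrows node group discovery.
    """
    if not namespace or not cluster_name:
        raise ValueError("namespace and cluster_name are both required")
    return [
        "--cloud-provider=clusterapi",
        # Kubeconfig used to reach the workload cluster (path is an assumption).
        f"--kubeconfig={kubeconfig_path}",
        # Only node groups in this namespace that belong to this cluster are watched.
        "--node-group-auto-discovery="
        f"clusterapi:namespace={namespace},clusterName={cluster_name}",
    ]


if __name__ == "__main__":
    for arg in autoscaler_args("default", "test-cluster", "/mnt/kubeconfig/value"):
        print(arg)
```

Because each instance receives its own namespace and cluster name, two autoscalers running side by side in the management cluster do not act on each other's node groups.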
Role-based Access Control (RBAC)
Management Cluster
Due to constraints with the Kubernetes RBAC system (i.e. roles cannot be subdivided beyond namespace-granularity), the Cluster Autoscaler add-on is deployed on the management cluster to prevent leaking Cluster API data between workload clusters.
Workload Cluster
Currently, the Cluster Autoscaler reuses the ${CLUSTER_NAME}-kubeconfig Secret generated by the bootstrap provider to interact with the workload cluster. The kubeconfig contents must be stored in a key named "value". Due to this, all Cluster Autoscaler actions in the workload cluster are performed as the cluster-admin role.
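To make the Secret layout concrete, the following sketch reads the kubeconfig from a Secret shaped like the JSON returned by the Kubernetes API. It assumes the usual base64 encoding of Secret data. The function name `kubeconfig_from_secret` and the example Secret contents are hypothetical and exist only for illustration.

```python
import base64


def kubeconfig_from_secret(secret: dict) -> str:
    """Return the decoded kubeconfig stored under the "value" key of a Secret."""
    data = secret.get("data", {})
    if "value" not in data:
        raise KeyError('Secret has no "value" key, so no kubeconfig can be read')
    # Secret data is base64-encoded when served by the Kubernetes API.
    return base64.b64decode(data["value"]).decode("utf-8")


# Hypothetical Secret for a cluster named "test-cluster" (contents are made up).
example_secret = {
    "kind": "Secret",
    "metadata": {"name": "test-cluster-kubeconfig", "namespace": "default"},
    "data": {
        "value": base64.b64encode(b"apiVersion: v1\nkind: Config\n").decode("ascii")
    },
}

if __name__ == "__main__":
    print(kubeconfig_from_secret(example_secret))
```

A Secret that stores the kubeconfig under any other key fails fast here, which mirrors the requirement that the key be named "value".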
Scale Down
Cluster Autoscaler decreases the size of the cluster when some nodes are consistently unneeded for a significant amount of time. A node is unneeded when it has low utilization and all of its important pods can be moved elsewhere.
By default, Cluster Autoscaler scales down a node after it is marked as unneeded for 10 minutes. This can be adjusted with the --scale-down-unneeded-time setting.
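The timing rule can be modeled in a few lines. This is a simplified sketch of the behavior described above, not the Cluster Autoscaler's actual implementation. The function name `ready_for_scale_down` and its parameters are hypothetical, and the 10-minute default is taken from the text.

```python
from datetime import datetime, timedelta
from typing import Optional

# Default taken from the text: a node must stay unneeded for 10 minutes.
DEFAULT_UNNEEDED_TIME = timedelta(minutes=10)


def ready_for_scale_down(
    unneeded_since: Optional[datetime],
    now: datetime,
    unneeded_time: timedelta = DEFAULT_UNNEEDED_TIME,
) -> bool:
    """Return True once a node has been continuously unneeded for long enough.

    unneeded_since is None while the node is still needed; in this model the
    timer restarts whenever the node becomes needed again.
    """
    if unneeded_since is None:
        return False
    return now - unneeded_since >= unneeded_time


if __name__ == "__main__":
    marked = datetime(2024, 1, 1, 12, 0, 0)
    print(ready_for_scale_down(marked, marked + timedelta(minutes=5)))
    print(ready_for_scale_down(marked, marked + timedelta(minutes=10)))
```

Passing a shorter `unneeded_time` corresponds to lowering --scale-down-unneeded-time, which makes scale-down more aggressive at the cost of more node churn.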
Kubernetes Cloud Controller Manager for Linode (CCM)
The Kubernetes Cloud Controller Manager for Linode is deployed on workload clusters and reconciles Kubernetes Node objects with their backing Linode infrastructure. When scaling down a node group, the Cluster Autoscaler also deletes the Kubernetes Node object on the workload cluster. This step preempts the Node-deletion in Kubernetes triggered by the CCM.