Cluster API Provider Linode


PLEASE NOTE: This project is considered ALPHA quality and should NOT be used for production, as it is currently in active development. Use at your own risk. APIs, configuration file formats, and functionality are all subject to change frequently. That said, please try it out in your development and test environments and let us know how it works for you. Contributions welcome! Thanks!


What is Cluster API Provider Linode (CAPL)

This is a Cluster API implementation for Linode to create, configure, and manage Kubernetes clusters.


Compatibility

Cluster API Versions

CAPL is compatible only with the v1beta1 version of CAPI (v1.x).

Kubernetes Versions

CAPL is able to install and manage the versions of Kubernetes supported by the Cluster API (CAPI) project.


Documentation

Please see our Book for in-depth user and developer documentation.

Topics

This section contains information about enabling and configuring various features for Cluster API Provider Linode.

Getting started with CAPL

Prerequisites

For more information please see the Linode Guide.

Setting up your cluster environment variables

Once you have provisioned your PAT, save it in an environment variable along with other required settings:

export LINODE_REGION=us-ord
export LINODE_TOKEN=<your linode PAT>
export LINODE_CONTROL_PLANE_MACHINE_TYPE=g6-standard-2
export LINODE_MACHINE_TYPE=g6-standard-2

Info

This project uses linodego for Linode API interaction. Please refer to it for more details on environment variables used for client configuration.

Warning

For Regions and Images that do not yet support Akamai's cloud-init datasource, CAPL will automatically use a stackscript shim to provision the node. If you are using a custom image, ensure the cloud_init flag is set correctly on it.

Warning

By default, clusters are provisioned within a VPC with disk encryption enabled. For Regions that do not yet have VPC support, use the VPCLess flavor to provision clusters. To disable disk encryption, set spec.template.spec.diskEncryption=disabled in your generated LinodeMachineTemplate resources when creating a CAPL cluster.
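For example, a minimal snippet of a generated LinodeMachineTemplate with disk encryption disabled might look like the following sketch (the surrounding fields are illustrative; only the diskEncryption setting comes from the text above):

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: LinodeMachineTemplate
metadata:
  name: test-cluster-control-plane
spec:
  template:
    spec:
      region: us-ord
      type: g6-standard-2
      # Disable disk encryption, e.g. for Regions without support
      diskEncryption: disabled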

Install CAPL on your management cluster

Warning

The linode-linode infrastructure provider requires clusterctl version 1.7.2 or higher

Install CAPL and enable the helm addon provider, which is used by the majority of the CAPL flavors:
clusterctl init --infrastructure linode-linode --addon helm
# Fetching providers
# Installing cert-manager Version="v1.14.5"
# Waiting for cert-manager to be available...
# Installing Provider="cluster-api" Version="v1.7.3" TargetNamespace="capi-system"
# Installing Provider="bootstrap-kubeadm" Version="v1.7.3" TargetNamespace="capi-kubeadm-bootstrap-system"
# Installing Provider="control-plane-kubeadm" Version="v1.7.3" TargetNamespace="capi-kubeadm-control-plane-system"
# Installing Provider="infrastructure-linode-linode" Version="v0.4.0" TargetNamespace="capl-system"
# Installing Provider="addon-helm" Version="v0.2.4" TargetNamespace="caaph-system"

Deploying your first cluster

Please refer to the default flavor section for creating your first Kubernetes cluster on Linode using Cluster API.

Troubleshooting Guide

This guide covers common issues users might run into when using Cluster API Provider Linode. This list is a work in progress; please feel free to open a PR and add to this guide if you find that useful information is missing.

Examples of common issues

No Linode resources are getting created

This could be due to the LINODE_TOKEN either not being set in your environment or being expired. If expired, provision a new token and optionally set the "Expiry" to "Never" (the default expiry is 6 months).
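As a quick sanity check, you can verify the token against the Linode API (this assumes curl is available; a 401 response indicates an invalid or expired token):

curl -H "Authorization: Bearer $LINODE_TOKEN" https://api.linode.com/v4/profile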

One or more control plane replicas are missing

Take a look at the KubeadmControlPlane controller logs and look for any potential errors:

kubectl logs deploy/capi-kubeadm-control-plane-controller-manager -n capi-kubeadm-control-plane-system manager

In addition, make sure all pods on the workload cluster are healthy, including pods in the kube-system namespace.

Otherwise, ensure that the linode-ccm is installed on your workload cluster via CAAPH.

Nodes are in NotReady state

Make sure a CNI is installed on the workload cluster and that all the pods on the workload cluster are in running state.

If the Cluster is labeled with cni: <cluster-name>-cilium, check that the <cluster-name>-cilium HelmChartProxy is installed in the management cluster and that the HelmChartProxy is in a Ready state:

kubectl get cluster $CLUSTER_NAME --show-labels
kubectl get helmchartproxies

Checking CAPI and CAPL resources

To check the progression of all CAPI and CAPL resources on the management cluster you can run:

kubectl get cluster-api

Looking at the CAPL controller logs

To check the CAPL controller logs on the management cluster, run:

kubectl logs deploy/capl-controller-manager -n capl-system manager

Checking cloud-init logs (Debian / Ubuntu)

Cloud-init logs can provide more information on any issues that happened when running the bootstrap script.

Warning

Not all Debian and Ubuntu images available from Linode support cloud-init! Please see the Availability section of the Linode Metadata Service Guide.

You can also see which images have cloud-init support via the linode-cli:

linode-cli images list | grep cloud-init

Please refer to the Troubleshoot Metadata and Cloud-Init section of the Linode Metadata Service Guide.

Overview

This section documents addons for self-managed clusters.

Note

Currently, all addons are installed via Cluster API Addon Provider Helm (CAAPH).

CAAPH is installed by default in the KIND cluster created by make tilt-cluster.

For more information, please refer to the CAAPH Quick Start.

Note

The Linode Cloud Controller Manager and Linode CSI Driver addons require the ClusterResourceSet feature flag to be set on the management cluster.

This feature flag is enabled by default in the KIND cluster created by make tilt-cluster.

For more information, please refer to the ClusterResourceSet page in The Cluster API Book.
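If you are initializing a management cluster yourself (outside of make tilt-cluster), the flag is typically enabled by exporting the corresponding Cluster API experimental-feature variable before running clusterctl init; treat the exact variable name as an assumption for your CAPI version:

export EXP_CLUSTER_RESOURCE_SET=true
clusterctl init --infrastructure linode-linode --addon helm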

Contents

CNI

In order for pod networking to work properly, a Container Network Interface (CNI) must be installed.

Cilium

Installed by default

To install Cilium on a self-managed cluster, simply apply the cni: <cluster-name>-cilium label on the Cluster resource if not already present.

kubectl label cluster $CLUSTER_NAME cni=$CLUSTER_NAME-cilium --overwrite

Cilium will then be automatically installed via CAAPH into the labeled cluster.

Enabled Features

By default, Cilium's BGP Control Plane is enabled when using Cilium as the CNI.

CCM

In order for the InternalIP and ExternalIP of the provisioned Nodes to be set correctly, a Cloud Controller Manager (CCM) must be installed.

Linode Cloud Controller Manager

Installed by default

To install the linode-cloud-controller-manager (linode-ccm) on a self-managed cluster, simply apply the ccm: <cluster-name>-linode label on the Cluster resource if not already present.

kubectl label cluster $CLUSTER_NAME ccm=$CLUSTER_NAME-linode --overwrite

The linode-ccm will then be automatically installed via CAAPH into the labeled cluster.

Container Storage

In order for stateful workloads to create PersistentVolumes (PVs), a storage driver must be installed.

Linode CSI Driver

Installed by default

To install the csi-driver-linode on a self-managed cluster, simply apply the csi: <cluster-name>-linode label on the Cluster resource if not already present.

kubectl label cluster $CLUSTER_NAME csi=$CLUSTER_NAME-linode --overwrite

The csi-driver-linode will then be automatically installed via CAAPH into the labeled cluster.

Flavors

This section contains information about supported flavors in Cluster API Provider Linode.

In clusterctl, infrastructure provider authors can provide different types of cluster templates, referred to as "flavors". You can use the --flavor flag to specify which flavor to use for a cluster, e.g.:

clusterctl generate cluster test-cluster --flavor clusterclass-kubeadm

To use the default flavor, omit the --flavor flag.

See the clusterctl flavors docs for more information.



Supported CAPL flavors

Control Plane | Flavor | Notes
--- | --- | ---
kubeadm | default | Installs Linode infra resources, kubeadm resources, CNI, CSI driver, CCM and ClusterResourceSet
kubeadm | kubeadm-cluster-autoscaler | Installs default along with the cluster autoscaler add-on
kubeadm | kubeadm-etcd-disk | Installs default along with the disk configuration for etcd disk
kubeadm | kubeadm-etcd-backup-restore | Installs default along with the etcd-backup-restore addon
kubeadm | kubeadm-vpcless | Installs default without a VPC
kubeadm | kubeadm-dualstack | Installs vpcless and enables IPv6 along with IPv4
kubeadm | kubeadm-self-healing | Installs default along with the machine-health-check add-on
kubeadm | kubeadm-konnectivity | Installs and configures konnectivity within the cluster
kubeadm | kubeadm-full | Installs all non-vpcless based flavor combinations
kubeadm | kubeadm-fullvpcless | Installs all vpcless based flavor combinations
k3s | k3s | Installs Linode infra resources, k3s resources and cilium network policies
k3s | k3s-cluster-autoscaler | Installs default along with the cluster autoscaler add-on
k3s | k3s-etcd-backup-restore | Installs default along with the etcd-backup-restore addon
k3s | k3s-vpcless | Installs default without a VPC
k3s | k3s-dualstack | Installs vpcless and enables IPv6 along with IPv4
k3s | k3s-self-healing | Installs default along with the machine-health-check add-on
k3s | k3s-full | Installs all non-vpcless based flavor combinations
k3s | k3s-fullvpcless | Installs all vpcless based flavor combinations
rke2 | rke2 | Installs Linode infra resources, rke2 resources, cilium and cilium network policies
rke2 | rke2-cluster-autoscaler | Installs default along with the cluster autoscaler add-on
rke2 | rke2-etcd-disk | Installs default along with the disk configuration for etcd disk
rke2 | rke2-etcd-backup-restore | Installs default along with the etcd-backup-restore addon
rke2 | rke2-vpcless | Installs default without a VPC
rke2 | rke2-self-healing | Installs default along with the machine-health-check add-on
rke2 | rke2-full | Installs all non-vpcless based flavor combinations
rke2 | rke2-fullvpcless | Installs all vpcless based flavor combinations

Default

Specification

Control Plane | CNI | Default OS | Installs ClusterClass | IPv4 | IPv6
--- | --- | --- | --- | --- | ---
Kubeadm | Cilium | Ubuntu 22.04 | No | Yes | No

Prerequisites

Quickstart completed

Usage

  1. Generate cluster yaml
    clusterctl generate cluster test-cluster \
        --kubernetes-version v1.29.1 \
        --infrastructure linode-linode > test-cluster.yaml
    
  2. Apply cluster yaml
    kubectl apply -f test-cluster.yaml
    

Dual-Stack

Specification

Supported Control Plane | CNI | Default OS | Installs ClusterClass | IPv4 | IPv6
--- | --- | --- | --- | --- | ---
kubeadm, k3s | Cilium | Ubuntu 22.04 | No | Yes | Yes

Prerequisites

Quickstart completed

Usage

  1. Generate cluster yaml
    clusterctl generate cluster test-cluster \
        --kubernetes-version v1.29.1 \
        --infrastructure linode-linode \
        --flavor <controlplane>-dual-stack > test-cluster.yaml
    
  2. Apply cluster yaml
    kubectl apply -f test-cluster.yaml
    

Etcd-disk

This flavor configures etcd to be on a separate disk from the OS disk. By default it configures the size of the disk to be 10 GiB and sets the quota-backend-bytes to 8589934592 (8 GiB) per recommendation from the etcd documentation.

Specification

Supported Control Plane | CNI | Default OS | Installs ClusterClass | IPv4 | IPv6
--- | --- | --- | --- | --- | ---
kubeadm, rke2 | Cilium | Ubuntu 22.04 | No | Yes | Yes

Prerequisites

Quickstart completed

Usage

  1. Generate cluster yaml
    clusterctl generate cluster test-cluster \
        --kubernetes-version v1.29.1 \
        --infrastructure linode-linode \
        --flavor <controlplane>-etcd-disk > test-cluster.yaml
    
  2. Apply cluster yaml
    kubectl apply -f test-cluster.yaml
    

etcd-backup-restore

Specification

Supported Control Plane | CNI | Default OS | Installs ClusterClass | IPv4 | IPv6
--- | --- | --- | --- | --- | ---
kubeadm, k3s, rke2 | Cilium | Ubuntu 22.04 | No | Yes | Yes

Prerequisites

Quickstart completed

Usage

  1. Generate cluster yaml
    clusterctl generate cluster test-cluster \
        --kubernetes-version v1.29.1 \
        --infrastructure linode-linode \
        --flavor <controlplane>-etcd-backup-restore > test-cluster.yaml
    
  2. Apply cluster yaml
    kubectl apply -f test-cluster.yaml
    

Notes

This flavor is identical to the default flavor with the addon etcd-backup-restore enabled.

Usage

Refer to backups.md.

Kubeadm ClusterClass

Specification

Control Plane | CNI | Default OS | Installs ClusterClass | IPv4 | IPv6
--- | --- | --- | --- | --- | ---
Kubeadm | Cilium | Ubuntu 22.04 | Yes | Yes | No

Prerequisites

Quickstart completed

Usage

Create clusterClass and first cluster

  1. Generate the ClusterClass and cluster manifests
    clusterctl generate cluster test-cluster \
        --kubernetes-version v1.29.1 \
        --infrastructure linode-linode \
        --flavor clusterclass-kubeadm > test-cluster.yaml
    
  2. Apply cluster manifests
    kubectl apply -f test-cluster.yaml
    

(Optional) Create a second cluster using the existing ClusterClass

  1. Generate cluster manifests
    clusterctl generate cluster test-cluster-2 \
        --kubernetes-version v1.29.1 \
        --flavor clusterclass-kubeadm > test-cluster-2.yaml
    
    apiVersion: cluster.x-k8s.io/v1beta1
    kind: Cluster
    metadata:
      labels:
        ccm: test-cluster-2-linode
        cni: test-cluster-2-cilium
        crs: test-cluster-2-crs
      name: test-cluster-2
      namespace: default
    spec:
      clusterNetwork:
        pods:
          cidrBlocks:
          - 10.192.0.0/10
      topology:
        class: kubeadm
        controlPlane:
          replicas: 1
        variables:
        - name: region
          value: us-ord
        - name: controlPlaneMachineType
          value: g6-standard-2
        - name: workerMachineType
          value: g6-standard-2
        version: v1.29.1
        workers:
          machineDeployments:
          - class: default-worker
            name: md-0
            replicas: 1
    
  2. Apply cluster manifests
    kubectl apply -f test-cluster-2.yaml
    

Cluster Autoscaler

This flavor adds auto-scaling via Cluster Autoscaler.

Specification

Supported Control Plane | CNI | Default OS | Installs ClusterClass | IPv4 | IPv6
--- | --- | --- | --- | --- | ---
kubeadm, k3s, rke2 | Cilium | Ubuntu 22.04 | No | Yes | Yes

Prerequisites

Quickstart completed

Usage

  1. Set up autoscaling environment variables

    We recommend using Cluster Autoscaler with the Kubernetes control plane ... version for which it was meant.

    -- Releases · kubernetes/autoscaler

    export CLUSTER_AUTOSCALER_VERSION=v1.29.0
    # Optional: If specified, these values must be explicitly quoted!
    export WORKER_MACHINE_MIN='"1"'
    export WORKER_MACHINE_MAX='"10"'
    
  2. Generate cluster yaml

    clusterctl generate cluster test-cluster \
        --kubernetes-version v1.29.1 \
        --infrastructure linode-linode \
        --flavor <controlplane>-cluster-autoscaler > test-cluster.yaml
    
  3. Apply cluster yaml

    kubectl apply -f test-cluster.yaml
    

Cilium BGP Load-Balancing

This flavor creates special labeled worker nodes for ingress which leverage Cilium's BGP Control Plane and LB IPAM support.

With this flavor, Services exposed via type: LoadBalancer automatically get assigned an ExternalIP provisioned as a shared IP through the linode-CCM, which is deployed with the necessary settings to perform shared IP load-balancing.

Warning

There are a couple important caveats to load balancing support based on current Linode networking and API limitations:

  1. Ingress traffic will not be split between BGP peer nodes

    Equal-Cost Multi-Path (ECMP) is not supported on the BGP routers so ingress traffic will not be split between each BGP Node in the cluster. One Node will be actively receiving traffic and the other(s) will act as standby(s).

  2. Customer support is required to use this feature at this time

    Since this uses additional IPv4 addresses on the nodes participating in Cilium's BGPPeeringPolicy, you need to contact our Support team to be permitted to add extra IPs.

Note

Dual-stack support is enabled for clusters using this flavor since IPv6 is used for router and neighbor solicitation.

Without enabling dual-stack support, the IPv6 traffic is blocked if the Cilium host firewall is enabled (which it is by default in CAPL), even if there are no configured CiliumClusterWideNetworkPolicies or the policy is set to audit (default) instead of enforce (see https://github.com/cilium/cilium/issues/27484). More information about firewalling can be found on the Firewalling page.

Specification

Control Plane | CNI | Default OS | Installs ClusterClass | IPv4 | IPv6
--- | --- | --- | --- | --- | ---
Kubeadm | Cilium | Ubuntu 22.04 | No | Yes | Yes

Prerequisites

  1. Quickstart completed

Usage

  1. (Optional) Set up environment variable

    # Optional
    export BGP_PEER_MACHINE_COUNT=2
    
  2. Generate cluster yaml

    clusterctl generate cluster test-cluster \
        --kubernetes-version v1.29.1 \
        --infrastructure linode-linode \
        --flavor kubeadm-cilium-bgp-lb > test-cluster.yaml
    
  3. Apply cluster yaml

    kubectl apply -f test-cluster.yaml
    

After the cluster exists, you can create a Service exposed with type: LoadBalancer and it will automatically get assigned an ExternalIP. It's recommended to set up an ingress controller (e.g. https://docs.cilium.io/en/stable/network/servicemesh/ingress/) to avoid needing to expose multiple LoadBalancer Services within the cluster.
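As an illustration, a minimal Service of this kind might look like the following (the name, selector, and ports are arbitrary):

apiVersion: v1
kind: Service
metadata:
  name: example-lb
spec:
  type: LoadBalancer
  selector:
    app: example
  ports:
    - port: 80
      targetPort: 8080

Once created, kubectl get service example-lb should show the assigned shared IP under EXTERNAL-IP.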

K3s

Specification

Control Plane | CNI | Default OS | Installs ClusterClass | IPv4 | IPv6
--- | --- | --- | --- | --- | ---
k3s | Cilium | Ubuntu 22.04 | No | Yes | No

Prerequisites

  • Quickstart completed
  • Select a k3s kubernetes version to set for the kubernetes version
  • Installed k3s bootstrap provider into your management cluster
    • Add the following to ~/.cluster-api/clusterctl.yaml for the k3s bootstrap/control plane providers
      providers:
        - name: "k3s"
          url: https://github.com/k3s-io/cluster-api-k3s/releases/latest/bootstrap-components.yaml
          type: "BootstrapProvider"
        - name: "k3s"
          url: https://github.com/k3s-io/cluster-api-k3s/releases/latest/control-plane-components.yaml
          type: "ControlPlaneProvider"
          
      
    • Install the k3s provider into your management cluster
      clusterctl init --bootstrap k3s --control-plane k3s
      

Usage

  1. Generate cluster yaml
    clusterctl generate cluster test-cluster \
        --kubernetes-version v1.29.1+k3s2 \
        --infrastructure linode-linode \
        --flavor k3s > test-k3s-cluster.yaml
    
  2. Apply cluster yaml
    kubectl apply -f test-k3s-cluster.yaml
    

RKE2

This flavor uses RKE2 as the Kubernetes distribution. By default it configures the cluster with the CIS profile:

Using the generic cis profile will ensure that the cluster passes the CIS benchmark (rke2-cis-1.XX-profile-hardened) associated with the Kubernetes version that RKE2 is running. For example, RKE2 v1.28.XX with the profile: cis will pass the rke2-cis-1.7-profile-hardened in Rancher.

Warning

Until this upstream PR is merged, CIS profile enabling will not work for RKE2 versions >= v1.29.

Specification

Control Plane | CNI | Default OS | Installs ClusterClass | IPv4 | IPv6
--- | --- | --- | --- | --- | ---
rke2 | Cilium | Ubuntu 22.04 | No | Yes | No

Prerequisites

Usage

  1. Generate cluster yaml
    clusterctl generate cluster test-cluster \
        --kubernetes-version v1.29.1+rke2r1 \
        --infrastructure linode-linode \
        --flavor rke2 > test-rke2-cluster.yaml
    
  2. Apply cluster yaml
    kubectl apply -f test-rke2-cluster.yaml
    

VPCLess

This flavor supports provisioning k8s clusters outside of VPC. It uses kubeadm for setting up the control plane and Cilium with VXLAN for pod networking.

Specification

Supported Control Plane | CNI | Default OS | Installs ClusterClass | IPv4 | IPv6
--- | --- | --- | --- | --- | ---
kubeadm, k3s, rke2 | Cilium | Ubuntu 22.04 | No | Yes | No

Prerequisites

Quickstart completed

Notes

This flavor is identical to the default flavor with the exception that it provisions k8s clusters without a VPC. Since it runs outside of VPC, native routing is not supported in this flavor and it uses VXLAN for pod-to-pod communication.

Usage

  1. Generate cluster yaml
    clusterctl generate cluster test-cluster \
        --infrastructure linode-linode \
        --flavor <controlplane>-vpcless > test-cluster.yaml
    
  2. Apply cluster yaml
    kubectl apply -f test-cluster.yaml
    

Konnectivity

This flavor supports provisioning k8s clusters with konnectivity configured. It uses kubeadm for setting up the control plane and Cilium with native routing for pod networking.

Specification

Supported Control Plane | CNI | Default OS | Installs ClusterClass | IPv4 | IPv6
--- | --- | --- | --- | --- | ---
kubeadm | Cilium | Ubuntu 22.04 | No | Yes | No

Prerequisites

Quickstart completed

Notes

This flavor configures the apiserver with konnectivity. Traffic from the apiserver to the cluster flows over the tunnels created between the konnectivity-server and konnectivity-agent.

Usage

  1. Generate cluster yaml
    clusterctl generate cluster test-cluster \
        --infrastructure linode-linode \
        --flavor <controlplane>-konnectivity > test-cluster.yaml
    
  2. Apply cluster yaml
    kubectl apply -f test-cluster.yaml
    

DNS based apiserver Load Balancing

This flavor configures DNS records that resolve to the public (IPv4 and/or IPv6) IPs of the control plane nodes where the apiserver pods are running. No NodeBalancer will be created. The following needs to be set in the LinodeCluster spec under network:

kind: LinodeCluster
metadata:
    name: test-cluster
spec:
    network:
        loadBalancerType: dns
        dnsRootDomain: test.net
        dnsUniqueIdentifier: abc123

We support DNS management with both Linode Cloud Manager and Akamai Edge DNS. The linode provider is the default; to use Akamai, you'll need to set dnsProvider: akamai:

kind: LinodeCluster
metadata:
    name: test-cluster
spec:
    network:
        loadBalancerType: dns
        dnsRootDomain: test.net
        dnsUniqueIdentifier: abc123
        dnsProvider: akamai

Along with this, the test.net domain needs to be registered and pre-configured as a domain on Linode or as a zone on Akamai. With these changes, the controlPlaneEndpoint is set to test-cluster-abc123.test.net, which is also set as the server in the KUBECONFIG. If users wish to override the subdomain format with something custom, they can pass in the override using the env var DNS_SUBDOMAIN_OVERRIDE.

kind: LinodeCluster
metadata:
    name: test-cluster
spec:
    network:
        loadBalancerType: dns
        dnsRootDomain: test.net
        dnsProvider: akamai
        dnsSubDomainOverride: my-special-override

This replaces the generated subdomain, so instead of test-cluster-abc123.test.net the URL becomes my-special-override.test.net.

The controller will create A/AAAA and TXT records under the Domains tab in the Linode Cloud Manager or in Akamai Edge DNS, depending on the provider.

Linode Domains:

Using the LINODE_DNS_TOKEN env var, you can pass the API token of a different account if the Domain has been created in another account under Linode CM:

export LINODE_DNS_TOKEN=<your Linode PAT>

Optionally, provide an alternative Linode API URL and root CA certificate.

export LINODE_DNS_URL=custom.api.linode.com
export LINODE_DNS_CA=/path/to/cacert.pem

Akamai Domains:

For the controller to authenticate with the Edge DNS API, you'll need to set the following env vars when creating the management cluster.

AKAMAI_ACCESS_TOKEN=""
AKAMAI_CLIENT_SECRET=""
AKAMAI_CLIENT_TOKEN=""
AKAMAI_HOST=""

You can read about how you can create these here.

Specification

Supported Control Plane | CNI | Default OS | Installs ClusterClass | IPv4 | IPv6
--- | --- | --- | --- | --- | ---
kubeadm | Cilium | Ubuntu 22.04 | No | Yes | Yes

Prerequisites

Quickstart completed

Usage

  1. Generate cluster yaml
    clusterctl generate cluster test-cluster \
        --kubernetes-version v1.29.1 \
        --infrastructure linode-linode \
        --control-plane-machine-count 3 --worker-machine-count 3 \
        --flavor <controlplane>-dns-loadbalancing > test-cluster.yaml
    
  2. Apply cluster yaml
    kubectl apply -f test-cluster.yaml
    

Check

Within a few moments you should see the records created, and running nslookup against the server endpoint should return a multi-answer DNS record.
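For example, using the controlPlaneEndpoint from the earlier example (test-cluster-abc123.test.net):

nslookup test-cluster-abc123.test.net

The lookup should return one A/AAAA record per control plane node.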

Flatcar

This flavor supports provisioning k8s clusters outside of VPC using Flatcar as the base OS. It uses kubeadm for setting up the control plane and Cilium with VXLAN for pod networking.

Specification

Supported Control Plane | CNI | Default OS | Installs ClusterClass | IPv4 | IPv6
--- | --- | --- | --- | --- | ---
kubeadm | Cilium | Flatcar | No | Yes | No

Notes

This flavor is identical to the default flavor with the exception that it provisions k8s clusters without a VPC, using Flatcar as the base OS. Since it runs outside of VPC, native routing is not supported in this flavor and it uses VXLAN for pod-to-pod communication.

Usage

Initialization

Before generating the cluster configuration, it is required to initialize the management cluster with Ignition support to provision Flatcar nodes:

export EXP_KUBEADM_BOOTSTRAP_FORMAT_IGNITION=true
clusterctl init --infrastructure linode-linode --addon helm

Import the Flatcar image

Flatcar is not officially provided by Akamai/Linode, so it is required to import a Flatcar image. Akamai support is available in Flatcar since release 4012.0.0: any release equal to or greater than this will work.

To import the image, it is recommended to follow this documentation: https://www.flatcar.org/docs/latest/installing/community-platforms/akamai/#importing-an-image

By following this import step, you will get the Flatcar image ID stored in IMAGE_ID.

Configure and deploy the workload cluster

  1. Set the Flatcar image name from the previous step:

    export FLATCAR_IMAGE_NAME="${IMAGE_ID}"
    
  2. Generate cluster yaml

    clusterctl generate cluster test-cluster \
        --kubernetes-version v1.29.1 \
        --infrastructure linode-linode \
        --flavor kubeadm-flatcar > test-cluster.yaml
    
  3. Apply cluster yaml

    kubectl apply -f test-cluster.yaml
    

Etcd

This guide covers etcd configuration for the control plane of provisioned CAPL clusters.

Default configuration

The quota-backend-bytes for etcd is set to 8589934592 (8 GiB) per recommendation from the etcd documentation.

By default, etcd is configured to be on the same disk as the root filesystem on control plane nodes. If users prefer etcd to be on a separate disk, see the etcd-disk flavor.

ETCD Backups

By default, etcd is not backed-up. To enable backups, users need to choose the etcd-backup-restore flavor.

To begin with, this will deploy a Linode Object Storage (OBJ) bucket, which serves as the S3-compatible target to store backups.

Next, on provisioning the cluster, etcd-backup-restore is deployed as a StatefulSet. The pod needs the bucket details (name, region, endpoints and access credentials), which are passed via the bucket-details secret that is created when the OBJ bucket is created.

Enabling SSE

Users can also enable SSE (Server-Side Encryption) by passing an SSE AES-256 key as an env var. All env vars on this pod can be controlled during the provisioning process.

Warning

This is currently under development and will be available for use once the upstream PR is merged and an official image is made available.

For example:

export CLUSTER_NAME=test
export OBJ_BUCKET_REGION=us-ord
export ETCDBR_IMAGE=docker.io/username/your-custom-image:version
export SSE_KEY=cdQdZ3PrKgm5vmqxeqwQCuAWJ7pPVyHg
clusterctl generate cluster $CLUSTER_NAME \
  --kubernetes-version v1.29.1 \
  --infrastructure linode-linode \
  --flavor etcd-backup-restore \
  | kubectl apply -f -

Backups

CAPL supports performing etcd backups by provisioning an Object Storage bucket and access keys. This feature is not enabled by default and can be configured as an addon.

Warning

Enabling this addon requires enabling Object Storage in the account where the resources will be provisioned. Please refer to the Pricing information in Linode's Object Storage documentation.

Enabling Backups

To enable backups, select the etcd-backup-restore flavor during provisioning:

clusterctl generate cluster $CLUSTER_NAME \
  --kubernetes-version v1.29.1 \
  --infrastructure linode-linode \
  --flavor etcd-backup-restore \
  | kubectl apply -f -

For more fine-grained control and to learn more about etcd backups, refer to the backups section of the etcd page.

Object Storage

Additionally, CAPL can be used to provision Object Storage buckets and access keys for general purposes by configuring LinodeObjectStorageBucket and LinodeObjectStorageKey resources.

Warning

Using this feature requires enabling Object Storage in the account where the resources will be provisioned. Please refer to the Pricing information in Linode's Object Storage documentation.

Bucket Creation

The following is the minimal required configuration needed to provision an Object Storage bucket.

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: LinodeObjectStorageBucket
metadata:
  name: <unique-bucket-label>
  namespace: <namespace>
spec:
  region: <object-storage-region>

Upon creation of the resource, CAPL will provision a bucket in the region specified using the .metadata.name as the bucket's label.

Warning

The bucket label must be unique within the region across all accounts. Otherwise, CAPL will populate the resource status fields with errors to show that the operation failed.

Bucket Status

Upon successful provisioning of a bucket, the LinodeObjectStorageBucket resource's status will resemble the following:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: LinodeObjectStorageBucket
metadata:
  name: <unique-bucket-label>
  namespace: <namespace>
spec:
  region: <object-storage-region>
status:
  ready: true
  conditions:
    - type: Ready
      status: "True"
      lastTransitionTime: <timestamp>
  hostname: <hostname-for-bucket>
  creationTime: <bucket-creation-timestamp>

Access Key Creation

The following is the minimal required configuration needed to provision an Object Storage key.

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: LinodeObjectStorageKey
metadata:
  name: <unique-key-label>
  namespace: <namespace>
spec:
  bucketAccess:
    - bucketName: <unique-bucket-label>
      permissions: read_only
      region: <object-storage-region>
  generatedSecret:
    type: Opaque

Upon creation of the resource, CAPL will provision an access key in the region specified using the .metadata.name as the key's label.

The credentials for the provisioned access key will be stored in a Secret. By default, the Secret is generated in the same namespace as the LinodeObjectStorageKey:

apiVersion: v1
kind: Secret
metadata:
  name: <unique-bucket-label>-obj-key
  namespace: <same-namespace-as-object-storage-bucket>
  ownerReferences:
    - apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
      kind: LinodeObjectStorageBucket
      name: <unique-bucket-label>
      controller: true
      uid: <unique-uid>
data:
  access_key: <base64-encoded-access-key>
  secret_key: <base64-encoded-secret-key>

The secret is owned and managed by CAPL during the life of the LinodeObjectStorageBucket.

Access Key Status

Upon successful provisioning of a key, the LinodeObjectStorageKey resource's status will resemble the following:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: LinodeObjectStorageKey
metadata:
  name: <unique-key-label>
  namespace: <namespace>
spec:
  bucketAccess:
    - bucketName: <unique-bucket-label>
      permissions: read_only
      region: <object-storage-region>
  generatedSecret:
    type: Opaque
status:
  ready: true
  conditions:
    - type: Ready
      status: "True"
      lastTransitionTime: <timestamp>
  accessKeyRef: <object-storage-key-id>
  creationTime: <key-creation-timestamp>
  lastKeyGeneration: 0

Access Key Rotation

The following configuration with keyGeneration set to a new value (different from .status.lastKeyGeneration) will instruct CAPL to rotate the access key.

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: LinodeObjectStorageKey
metadata:
  name: <unique-key-label>
  namespace: <namespace>
spec:
  bucketAccess:
    - bucketName: <unique-bucket-label>
      permissions: read_only
      region: <object-storage-region>
  generatedSecret:
    type: Opaque
  keyGeneration: 1
# status:
#   lastKeyGeneration: 0

Resource Deletion

When deleting a LinodeObjectStorageKey resource, CAPL will deprovision the access key and delete the managed secret. However, when deleting a LinodeObjectStorageBucket resource, CAPL will retain the underlying bucket to avoid unintended data loss.

Multi-Tenancy

CAPL can manage multi-tenant workload clusters across Linode accounts. Custom resources may reference an optional Secret containing their Linode credentials (i.e. API token) to be used for the deployment of Linode resources (e.g. Linodes, VPCs, NodeBalancers, etc.) associated with the cluster.

The following example shows a basic credentials Secret:

apiVersion: v1
kind: Secret
metadata:
  name: linode-credentials
stringData:
  apiToken: <LINODE_TOKEN>

Warning

The Linode API token data must be put in a key named apiToken!

Which may be optionally consumed by one or more custom resource objects:

# Example: LinodeCluster
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: LinodeCluster
metadata:
  name: test-cluster
spec:
  credentialsRef:
    name: linode-credentials
  ...
---
# Example: LinodeVPC
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: LinodeVPC
metadata:
  name: test-vpc
spec:
  credentialsRef:
    name: linode-credentials
  ...
---
# Example: LinodeMachine
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: LinodeMachine
metadata:
  name: test-machine
spec:
  credentialsRef:
    name: linode-credentials
  ...
---
# Example: LinodeObjectStorageBucket
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: LinodeObjectStorageBucket
metadata:
  name: test-bucket
spec:
  credentialsRef:
    name: linode-credentials
  ...
---
# Example: LinodeObjectStorageKey
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: LinodeObjectStorageKey
metadata:
  name: test-key
spec:
  credentialsRef:
    name: linode-credentials
  ...

Secrets from other namespaces may be referenced by additionally specifying an optional .spec.credentialsRef.namespace value.
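For example (the namespace value is illustrative):

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: LinodeCluster
metadata:
  name: test-cluster
spec:
  credentialsRef:
    name: linode-credentials
    namespace: other-namespace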

Warning

If .spec.credentialsRef is set for a LinodeCluster, it should also be set for adjacent resources (e.g. LinodeVPC).

LinodeMachine

For LinodeMachines, credentials set on the LinodeMachine object will override any credentials supplied by the owner LinodeCluster. This allows cross-account deployment of the Linodes for a cluster.

Disks

This section contains information about OS and data disk configuration in Cluster API Provider Linode.

OS Disk

This section describes how to configure the root disk for provisioned Linodes. By default, the OS disk is dynamically sized to use any space in the Linode plan that is not taken up by data disks.

Setting OS Disk Size

Use the osDisk section to specify the exact size the OS disk should be. If this is not set, the OS disk is dynamically sized to the maximum allowed by the Linode plan, taking any data disk sizes into account.

---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: LinodeMachineTemplate
metadata:
  name: ${CLUSTER}-control-plane
spec:
  template:
    spec:
      region: us-ord
      type: g6-standard-4
      osDisk:
        size: 100Gi



Setting OS Disk Label

The default label on the root OS disk can be overridden by specifying a label in the osDisk field. The label can only be set if an explicit size is also set, since size is a required field.

---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: LinodeMachineTemplate
metadata:
  name: ${CLUSTER}-control-plane
  namespace: default
spec:
  template:
    spec:
      image: ""
      region: us-ord
      type: g6-standard-4
      osDisk:
        label: root-disk
        size: 10Gi

Data Disks

This section describes how to specify additional data disks for a Linode instance. These disks can use devices sdb through sdh, for a total of 7 disks.

Warning

There are a couple caveats with specifying disks for a linode instance:

  1. The total size of these disks + the OS Disk cannot exceed the linode instance plan size.
  2. Instance disk configuration is currently immutable via CAPL after the instance is booted.

Warning

Currently sdb is used by a swap disk; replacing it with a data disk will slow down Linode creation by up to 90 seconds. This will be resolved when the disk creation refactor is finished in PR #216.

Specify a data disk

A LinodeMachine can be configured with additional data disks, with the key being the device to mount the disk as, and optionally including a label and size.

  • size Required field. A resource.Quantity for the size of the disk. The sum of all data disks must not exceed what the Linode plan allows.
  • label Optional field. The label for the disk; defaults to the device name.
  • diskID Optional field used by the controller to track disk IDs; this should not be set unless a disk is created outside CAPL.
  • filesystem Optional field used to specify the filesystem type of the disk to provision; the default is ext4 and valid options are any supported Linode filesystem.
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: LinodeMachineTemplate
metadata:
  name: ${CLUSTER}-control-plane
spec:
  template:
    spec:
      region: us-ord
      type: g6-standard-4
      dataDisks:
        sdc:
          label: etcd_disk
          size: 16Gi
        sdd:
          label: data_disk
          size: 10Gi

Use a data disk for an explicit etcd data disk

The following configuration can be used to configure a separate disk for etcd data on control plane nodes.

---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: LinodeMachineTemplate
metadata:
  name: ${CLUSTER}-control-plane
spec:
  template:
    spec:
      region: us-ord
      type: g6-standard-4
      dataDisks:
        sdc:
          label: etcd_disk
          size: 16Gi

---
kind: KubeadmControlPlane
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
metadata:
  name: "${CLUSTER_NAME}-control-plane"
spec:
    diskSetup:
      filesystems:
        - label: etcd_data
          filesystem: ext4
          device: /dev/sdc
    mounts:
      - - LABEL=etcd_data
        - /var/lib/etcd_data

Machine Health Checks

CAPL supports auto-remediation of workload cluster Nodes considered to be unhealthy via MachineHealthChecks.

Enabling Machine Health Checks

While it is possible to manually create and apply a MachineHealthCheck resource into the management cluster, using the self-healing flavor is the quickest way to get started:

clusterctl generate cluster $CLUSTER_NAME \
  --kubernetes-version v1.29.1 \
  --infrastructure linode-linode \
  --flavor self-healing \
  | kubectl apply -f -

This flavor deploys a MachineHealthCheck for the workers and another MachineHealthCheck for the control plane of the cluster. It also configures the remediation strategy of the kubeadm control plane to prevent unnecessary load on the infrastructure provider.
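For reference, a MachineHealthCheck similar to what the flavor deploys for workers might look like the following sketch (the selector labels and timeouts are illustrative, not the exact values rendered by the flavor):

apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: test-cluster-worker-hc
spec:
  clusterName: test-cluster
  # Select the worker Machines of this cluster (label is an assumption)
  selector:
    matchLabels:
      cluster.x-k8s.io/deployment-name: test-cluster-md-0
  unhealthyConditions:
    - type: Ready
      status: "False"
      timeout: 300s
    - type: Ready
      status: Unknown
      timeout: 300s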

Configuring Machine Health Checks

Refer to the Cluster API documentation for further information on configuring and using MachineHealthChecks.

Auto-scaling

This guide covers auto-scaling for CAPL clusters. The recommended tool for auto-scaling on Cluster API is Cluster Autoscaler.

Flavor

The auto-scaling feature is provided by an add-on as part of the Cluster Autoscaler flavor.

Configuration

By default, the Cluster Autoscaler add-on runs in the management cluster, managing an external workload cluster.

+------------+             +----------+
|    mgmt    |             | workload |
| ---------- | kubeconfig  |          |
| autoscaler +------------>|          |
+------------+             +----------+

A separate Cluster Autoscaler is deployed for each workload cluster, configured to only monitor node groups for the specific namespace and cluster name combination.

Role-based Access Control (RBAC)

Management Cluster

Due to constraints with the Kubernetes RBAC system (i.e. roles cannot be subdivided beyond namespace-granularity), the Cluster Autoscaler add-on is deployed on the management cluster to prevent leaking Cluster API data between workload clusters.

Workload Cluster

Currently, the Cluster Autoscaler reuses the ${CLUSTER_NAME}-kubeconfig Secret generated by the bootstrap provider to interact with the workload cluster. The kubeconfig contents must be stored in a key named value. Due to this, all Cluster Autoscaler actions in the workload cluster are performed as the cluster-admin role.
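A sketch of the Secret shape the Cluster Autoscaler consumes (only the key name value is prescribed above; the metadata values are placeholders):

apiVersion: v1
kind: Secret
metadata:
  name: ${CLUSTER_NAME}-kubeconfig
  namespace: <cluster-namespace>
data:
  value: <base64-encoded-kubeconfig>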

Scale Down

Cluster Autoscaler decreases the size of the cluster when some nodes are consistently unneeded for a significant amount of time. A node is unneeded when it has low utilization and all of its important pods can be moved elsewhere.

By default, Cluster Autoscaler scales down a node after it is marked as unneeded for 10 minutes. This can be adjusted with the --scale-down-unneeded-time setting.
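For example, to wait 15 minutes before scaling down a node, the flag can be passed in the Cluster Autoscaler container args (a sketch; how the args are surfaced depends on the add-on's configuration):

containers:
  - name: cluster-autoscaler
    args:
      - --cloud-provider=clusterapi
      - --scale-down-unneeded-time=15m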

Kubernetes Cloud Controller Manager for Linode (CCM)

The Kubernetes Cloud Controller Manager for Linode is deployed on workload clusters and reconciles Kubernetes Node objects with their backing Linode infrastructure. When scaling down a node group, the Cluster Autoscaler also deletes the Kubernetes Node object on the workload cluster. This step preempts the Node-deletion in Kubernetes triggered by the CCM.

Additional Resources

VPC

This guide covers how VPC is used with CAPL clusters. By default, CAPL clusters are provisioned within VPC.

Default configuration

Each linode within a cluster gets provisioned with two interfaces:

  1. eth0 (connected to VPC, for pod-to-pod traffic and public traffic)
  2. eth1 (for nodebalancer traffic)

Key facts about VPC network configuration:

  1. VPCs are provisioned with a private subnet 10.0.0.0/8.
  2. All pod-to-pod communication happens over the VPC interface (eth0).
  3. We assign a pod CIDR of range 10.192.0.0/10 for pod-to-pod communication.
  4. By default, Cilium is configured with native routing (see the sketch below).
  5. Kubernetes host-scope IPAM mode is used to assign pod CIDRs to nodes. We run the Linode CCM with the route-controller enabled, which automatically adds/updates routes within the VPC when pod CIDRs are added/updated by Kubernetes. This enables pod-to-pod traffic to be routable within the VPC.
  6. kube-proxy is disabled by default.
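As an illustration of what points 4-6 imply for the CNI, a Cilium Helm values sketch consistent with this setup might look like the following (these values are assumptions based on the description above, not the exact values CAPL renders):

# Native routing within the VPC instead of VXLAN tunneling
routingMode: native
ipv4NativeRoutingCIDR: 10.0.0.0/8
# Kubernetes host-scope IPAM assigns pod CIDRs to nodes
ipam:
  mode: kubernetes
# kube-proxy is disabled, so Cilium replaces its functionality
kubeProxyReplacement: true
k8sServiceHost: <control-plane-endpoint>
k8sServicePort: 6443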

Configuring the VPC interface

In order to configure the VPC interface beyond the default above, an explicit interface can be configured in the LinodeMachineTemplate. When the LinodeMachine controller finds an interface with purpose: vpc, it will automatically inject the SubnetID from the VPCRef.

Example template where the VPC interface is not the primary interface

---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: LinodeMachineTemplate
metadata:
  name: test-cluster-md-0
  namespace: default
spec:
  template:
    spec:
      region: "us-mia"
      type: "g6-standard-4"
      image: linode/ubuntu22.04
      interfaces:
      - purpose: vpc
        primary: false
      - purpose: public
        primary: true

How VPC is provisioned

A VPC is tied to a region. CAPL generates a LinodeVPC manifest which contains the VPC name, region and subnet information. By default, the VPC name is set to the cluster name but can be overridden by setting the relevant environment variable (VPC_NAME):

---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: LinodeVPC
metadata:
  name: ${VPC_NAME:=${CLUSTER_NAME}}
  labels:
    cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
spec:
  region: ${LINODE_REGION}
  subnets:
    - ipv4: 10.0.0.0/8
      label: default

A reference to the LinodeVPC object is added to the LinodeCluster object, which then uses the specified VPC to provision resources.

Troubleshooting

If pod-to-pod connectivity is failing

If a pod can't ping pod IPs on a different node, check that the pod CIDRs are added to the ip_ranges of the VPC interface:

curl --header "Authorization: Bearer $LINODE_API_TOKEN" -X GET https://api.linode.com/v4/linode/instances/${LINODEID}/configs | jq '.data[0].interfaces[].ip_ranges'

Note

The CIDR returned in the output of the above command should match the pod CIDR present in the node's spec: kubectl get node <nodename> -o yaml | yq .spec.podCIDRs

Running cilium connectivity tests

One can also run cilium connectivity tests to make sure networking works fine within VPC. Follow the steps defined in cilium e2e tests guide to install cilium binary, set the KUBECONFIG variable and then run cilium connectivity tests.

Firewalling

This guide covers how Cilium and Cloud Firewalls can be used for firewalling CAPL clusters.

Cilium Firewalls

Cilium provides cluster-wide firewalling via Host Policies which enforce access control over connectivity to and from cluster nodes. Cilium's host firewall is responsible for enforcing the security policies.

Default Cilium Host Firewall Configuration

By default, the following Host Policies are set to audit mode (without any enforcement) on CAPL clusters:

  • Kubeadm cluster allow rules

    Ports | Use-case | Allowed clients
    --- | --- | ---
    ${APISERVER_PORT:=6443} | API Server Traffic | World
    * | In Cluster Communication | Intra Cluster Traffic

Note

For kubeadm clusters running outside of VPC, ports 2379 and 2380 are also allowed for etcd-traffic.

  • k3s cluster allow rules

    Ports | Use-case | Allowed clients
    --- | --- | ---
    6443 | API Server Traffic | World
    * | In Cluster Communication | Intra Cluster and VPC Traffic
  • RKE2 cluster allow rules

    Ports | Use-case | Allowed clients
    --- | --- | ---
    6443 | API Server Traffic | World
    * | In Cluster Communication | Intra Cluster and VPC Traffic

Enabling Cilium Host Policy Enforcement

In order to turn the Cilium Host Policies from audit to enforce mode, set the environment variable FW_AUDIT_ONLY=false when generating the cluster. This disables policy-audit-mode on the Cilium deployment.
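For example:

export FW_AUDIT_ONLY=false
clusterctl generate cluster test-cluster \
    --kubernetes-version v1.29.1 \
    --infrastructure linode-linode > test-cluster.yaml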

Adding Additional Cilium Host Policies

Additional rules can be added to the default-policy:

apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: "default-external-policy"
spec:
  description: "allow cluster intra cluster traffic along api server traffic"
  nodeSelector: {}
  ingress:
    - fromEntities:
        - cluster
    - fromCIDR:
        - 10.0.0.0/8
    - fromEntities:
        - world
      toPorts:
        - ports:
            - port: "22" # added for SSH Access to the nodes
            - port: "${APISERVER_PORT:=6443}"

Alternatively, additional rules can be added by creating a new policy:

apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: "ssh-access-policy"
spec:
  description: "allows ssh access to nodes"
  nodeSelector: {}
  ingress:
    - fromEntities:
        - world
      toPorts:
        - ports:
            - port: "22"

Cloud Firewalls

Cloud firewalls are provisioned with all flavors that use VPCs. They are provisioned in disabled mode but can be enabled with the environment variable LINODE_FIREWALL_ENABLED=true. The default rules allow for all intra-cluster VPC traffic along with any traffic going to the API server.
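For example, to generate a cluster with the Cloud Firewall enabled:

export LINODE_FIREWALL_ENABLED=true
clusterctl generate cluster test-cluster \
    --kubernetes-version v1.29.1 \
    --infrastructure linode-linode > test-cluster.yaml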

Creating Cloud Firewalls

For controlling firewalls via Linode resources, a Cloud Firewall can be defined and provisioned via the LinodeFirewall resource in CAPL. Any updates to the LinodeFirewall resource are propagated to the Cloud Firewall, overwriting any changes made outside of the CAPL resource.

Example LinodeFirewall:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: LinodeFirewall
metadata:
  name: sample-fw
spec:
  enabled: true
  inboundPolicy: DROP
  inboundRules:
    - action: ACCEPT
      label: intra-cluster
      ports: "1-65535"
      protocol: "TCP"
      addresses:
        ipv4:
          - "10.0.0.0/8"
    - action: ACCEPT
      addresses:
        ipv4:
          - 0.0.0.0/0
        ipv6:
          - ::/0
      ports: "6443"
      protocol: TCP
      label: inbound-api-server

Cloud Firewall Machine Integration

The created Cloud Firewall can be used on a LinodeMachine or a LinodeMachineTemplate by setting the firewallRef field. Alternatively, the provisioned Cloud Firewall's ID can be used in the firewallID field.

Note

The firewallRef and firewallID fields are currently immutable for LinodeMachines and LinodeMachineTemplates. This will be addressed in a later release.

Example LinodeMachineTemplate:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: LinodeMachineTemplate
metadata:
  name: test-cluster-control-plane
  namespace: default
spec:
  template:
    spec:
      firewallRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
        kind: LinodeFirewall
        name: sample-fw
      image: linode/ubuntu22.04
      interfaces:
        - purpose: public
      region: us-ord
      type: g6-standard-4

Placement Groups

This guide covers how to configure placement groups within a CAPL cluster. Placement groups are currently provisioned with any of the *-full flavors, in the LinodeMachineTemplate for the control plane machines only.

Note

Currently only 5 nodes are allowed in a single placement group.

Placement Group Creation

For controlling placement groups via Linode resources, a placement group can be defined and provisioned via the LinodePlacementGroup resource in CAPL.

Example PlacementGroup:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: LinodePlacementGroup
metadata:
  name: test-cluster
spec:
  region: us-ord

PlacementGroup Machine Integration

In order to use a placement group with a machine, a placementGroupRef can be used in the LinodeMachineTemplate spec to assign any nodes using that template to the placement group. Due to the limited size of the placement group, our templates currently only integrate with this for control plane nodes.

Example LinodeMachineTemplate:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
kind: LinodeMachineTemplate
metadata:
  name: test-cluster-control-plane
  namespace: default
spec:
  template:
    spec:
      image: linode/ubuntu22.04
      interfaces:
        - purpose: public
      placementGroupRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha2
        kind: LinodePlacementGroup
        name: test-cluster
      region: us-ord
      type: g6-standard-4

Developing Cluster API Provider Linode

Contents

Setting up

Base requirements

Warning

Ensure you have your LINODE_TOKEN set as outlined in the getting started prerequisites section.

There are no strict requirements, since development dependencies are fetched as needed via the make targets, but it is recommended to install Devbox.

Optional Environment Variables

export LINODE_URL= # Default unset. Set this to talk to a specific linode api endpoint
export LINODE_CA= # Default unset. Set this to use a specific CA when talking to the linode API
export CAPL_DEBUG=false # Default false. Set this to true to enable delve integration
export INSTALL_K3S_PROVIDER=false # Default false. Set this to true to enable k3s capi provider installation
export INSTALL_RKE2_PROVIDER=false # Default false. Set this to true to enable the RKE2 capi provider installation
export INSTALL_HELM_PROVIDER=true # Default true. Set this to true to enable CAAPH provider installation
export INSTALL_KUBEADM_PROVIDER=true # Default true. Set this to true to enable kubeadm CAPI provider installation
export SKIP_DOCKER_BUILD=false # Default false. Set this to true to skip local docker builds of CAPL images
export CAPL_MONITORING=false # Default false. Set this to true to install the kube-prometheus-stack and capl serviceMonitor

Clone the source code

git clone https://github.com/linode/cluster-api-provider-linode
cd cluster-api-provider-linode

Enable git hooks

To enable automatic code validation on code push, execute the following commands:

PATH="$PWD/bin:$PATH" make husky && husky install

If you would like to temporarily disable the git hook, set the SKIP_GIT_PUSH_HOOK value:

SKIP_GIT_PUSH_HOOK=1 git push
  1. Install dependent packages in your project

    devbox install
    

    This will take a while; go and grab a drink of water.

  2. Use devbox environment

    devbox shell
    

From this point you can use the devbox shell like a regular shell. The rest of the guide assumes a devbox shell is used, but the make target dependencies will install any missing dependencies if needed when running outside a devbox shell.

Get familiar with basic concepts

This provider is based on the Cluster API project. It's recommended to familiarize yourself with Cluster API resources, concepts, and conventions outlined in the Cluster API Book.

Developing

This repository uses Go Modules to track and vendor dependencies.

To pin a new dependency, run:

go get <repository>@<version>

Code Overview

The code in this repo is organized across the following packages:

  • /api contains the custom resource types managed by CAPL.
  • /cmd contains the main entrypoint for registering controllers and running the controller manager.
  • /controller contains the various controllers that run in CAPL for reconciling the custom resource types.
  • /cloud/scope contains all Kubernetes client interactions scoped to each resource reconciliation loop. Each "scope" object is expected to store both a Kubernetes client and a Linode client.
  • /cloud/services contains all Linode client interactions. Functions defined in this package all expect a "scope" object which contains a Linode client to use.
  • /mock contains gomock clients generated from /cloud/scope/client.go.
  • /util/ contains general-use helper functions used in other packages.
  • /util/reconciler contains helper functions and constants used within the /controller package.

When adding a new controller, it is preferable that controller code only use the Kubernetes and Linode clients via functions defined in /cloud/scope and /cloud/services. This ensures each separate package can be tested in isolation using mock clients.

Using tilt

Note

If you want to create RKE2 and/or K3s clusters, make sure to set the following env vars first:

export INSTALL_RKE2_PROVIDER=true
export INSTALL_K3S_PROVIDER=true

Additionally, if you want to skip the docker build step for CAPL to instead use the latest image on main from Dockerhub, set the following:

export SKIP_DOCKER_BUILD=true

To build a kind cluster and start Tilt, simply run:

make local-deploy

Once your kind management cluster is up and running, you can deploy a workload cluster.

To tear down the tilt-cluster, run

kind delete cluster --name tilt

Deploying a workload cluster

After your kind management cluster is up and running with Tilt, you should be ready to deploy your first cluster.

Generating local cluster templates

For local development, templates should be generated via:

make local-release

This creates infrastructure-local-linode/v0.0.0/ with all the cluster templates:

infrastructure-local-linode/v0.0.0
├── cluster-template-clusterclass-kubeadm.yaml
├── cluster-template-etcd-backup-restore.yaml
├── cluster-template-k3s.yaml
├── cluster-template-rke2.yaml
├── cluster-template.yaml
├── clusterclass-kubeadm.yaml
├── infrastructure-components.yaml
└── metadata.yaml

This can then be used with clusterctl by adding the following to ~/.cluster-api/clusterctl.yaml:

providers:
  - name: local-linode
    url: ${HOME}/cluster-api-provider-linode/infrastructure-local-linode/v0.0.0/infrastructure-components.yaml
    type: InfrastructureProvider

Customizing the cluster deployment

Here is a list of required configuration parameters:

## Cluster settings
export CLUSTER_NAME=capl-cluster

## Linode settings
export LINODE_REGION=us-ord
# Multi-tenancy: This may be changed for each cluster to deploy to different Linode accounts.
export LINODE_TOKEN=<your linode PAT>
export LINODE_CONTROL_PLANE_MACHINE_TYPE=g6-standard-2
export LINODE_MACHINE_TYPE=g6-standard-2

Tip

You can also use clusterctl generate to see which variables need to be set:

clusterctl generate cluster $CLUSTER_NAME --infrastructure local-linode:v0.0.0 [--flavor <flavor>] --list-variables

Creating the workload cluster

Using the default flavor

Once you have all the necessary environment variables set, you can deploy a workload cluster with the default flavor:

clusterctl generate cluster $CLUSTER_NAME \
  --kubernetes-version v1.29.1 \
  --infrastructure local-linode:v0.0.0 \
  | kubectl apply -f -

This will provision the cluster within VPC with the CNI defaulted to cilium and the linode-ccm installed.

Using ClusterClass (alpha)

The ClusterClass experimental feature is enabled by default in the KIND management cluster created via make tilt-cluster.

You can use the clusterclass flavor to create a workload cluster as well, assuming the management cluster has the ClusterTopology feature gate set:

clusterctl generate cluster $CLUSTER_NAME \
  --kubernetes-version v1.29.1 \
  --infrastructure local-linode:v0.0.0 \
  --flavor clusterclass-kubeadm \
  | kubectl apply -f -

For any issues, please refer to the troubleshooting guide.

Cleaning up the workload cluster

To delete the cluster, simply run:

kubectl delete cluster $CLUSTER_NAME

Warning

VPCs are not deleted when a cluster is deleted using kubectl. Run kubectl delete linodevpc <vpcname> to clean up the VPC once the cluster is deleted.

For any issues, please refer to the troubleshooting guide.

Debugging CAPL Controllers

CAPL supports using Delve to attach a debugger to the CAPL controllers. This starts Delve in the CAPL container on port 40000 and uses Tilt live_reload to rebuild the CAPL controller on your host and insert it into the container, without needing to rebuild the container.

CAPL_DEBUG=true make tilt-cluster

Automated Testing

E2E Testing

To run E2E tests locally, run:

# Required env vars to run e2e tests
export INSTALL_K3S_PROVIDER=true
export INSTALL_RKE2_PROVIDER=true
export LINODE_REGION=us-sea
export LINODE_CONTROL_PLANE_MACHINE_TYPE=g6-standard-2
export LINODE_MACHINE_TYPE=g6-standard-2

make e2etest

This command creates a KIND cluster and executes all the defined tests.

For more details on E2E tests, please refer to the E2E Testing section.

CAPL Releases

Release Cadence

CAPL currently has no set release cadence.

Bug Fixes

Any significant user-facing bug fix that lands in the main branch should be backported to the current and previous release lines.

Versioning Scheme

CAPL follows the semantic versioning specification.

Example versions:

  • Pre-release: v0.1.1-alpha.1
  • Minor release: v0.1.0
  • Patch release: v0.1.1
  • Major release: v1.0.0

Release Process

Update metadata.yaml (skip for patch releases)

  • Make sure metadata.yaml is up-to-date and contains the new release with the correct Cluster API contract version.
    • If not, open a PR to add it.

Release in GitHub

  • Create a new release.
    • Enter the tag and select create tag on publish.
    • Make sure to click "Generate Release Notes".
    • Review the generated release notes and make any necessary changes.
    • If the tag is a pre-release, make sure to check the "Set as a pre-release" box.

Expected artifacts

  • An infrastructure-components.yaml file containing the resources needed to deploy to Kubernetes
  • A cluster-templates-*.yaml file for each supported flavor
  • A metadata.yaml file which maps release series to the Cluster API contract version

Communication

  1. Announce the release in the Kubernetes Slack on the #linode channel

CAPL Testing

Unit Tests

Executing Tests

To run the unit tests, run the following command:

make test

Creating Tests

General unit tests of functions follow the usual conventions for Go's testing standard library, along with the testify toolkit for making assertions.
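
For example, a small table-driven test in this style might look like the following (clampPort is a hypothetical helper used only to illustrate the conventions):

package example

import (
  "testing"

  "github.com/stretchr/testify/assert"
)

// clampPort is a hypothetical helper under test.
func clampPort(p int) int {
  if p < 1 {
    return 1
  }
  if p > 65535 {
    return 65535
  }
  return p
}

func TestClampPort(t *testing.T) {
  t.Parallel()

  tests := []struct {
    name string
    in   int
    want int
  }{
    {name: "in range", in: 8080, want: 8080},
    {name: "too low", in: 0, want: 1},
    {name: "too high", in: 70000, want: 65535},
  }

  for _, tc := range tests {
    tc := tc
    t.Run(tc.name, func(t *testing.T) {
      t.Parallel()
      assert.Equal(t, tc.want, clampPort(tc.in))
    })
  }
}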

Unit tests that require API clients use mock clients generated using gomock. To simplify the usage of mock clients, this repo also uses an internal library defined in mock/mocktest.

mocktest is usually imported as a dot import along with the mock package:

import (
  "github.com/linode/cluster-api-provider-linode/mock"

  . "github.com/linode/cluster-api-provider-linode/mock/mocktest"
)

Using mocktest involves creating a test suite that specifies the mock clients to be used within each test scope, then running the test suite using a DSL for defining test nodes that belong to one or more test paths.

Example

The following is a contrived example using the mock Linode machine client.

Let's say we've written an idempotent function EnsureInstanceNotOffline that 1) gets an instance, creating it if it doesn't exist, and 2) boots the instance if it's offline. Testing this function would mean writing test cases for all permutations, i.e.

  • instance exists and is not offline
  • instance exists but is offline, and is able to boot
  • instance exists but is offline, and is not able to boot
  • instance does not exist, and is not able to be created
  • instance does not exist, and is able to be created, and is able to boot
  • instance does not exist, and is able to be created, and is not able to boot

While writing test cases for each scenario, we'd likely find a lot of overlap between them. mocktest provides a DSL for defining each unique test case without needing to spell out all required mock client calls for each case. Here's how we could test EnsureInstanceNotOffline using mocktest:

func TestEnsureInstanceNotOffline(t *testing.T) {
  suite := NewSuite(t, mock.MockLinodeMachineClient{})
  
  suite.Run(
    OneOf(
      Path(
        Call("instance exists and is not offline", func(ctx context.Context, mck Mock) {
          mck.MachineClient.EXPECT().GetInstance(ctx, /* ... */).Return(&linodego.Instance{Status: linodego.InstanceRunning}, nil)
        }),
        Result("success", func(ctx context.Context, mck Mock) {
          inst, err := EnsureInstanceNotOffline(ctx, /* ... */)
          require.NoError(t, err)
          assert.Equal(t, inst.Status, linodego.InstanceRunning)
        }),
      ),
      Path(
        Call("instance does not exist", func(ctx context.Context, mck Mock) {
          mck.MachineClient.EXPECT().GetInstance(ctx, /* ... */).Return(nil, linodego.Error{Code: 404})
        }),
        OneOf(
          Path(Call("able to be created", func(ctx context.Context, mck Mock) {
            mck.MachineClient.EXPECT().CreateInstance(ctx, /* ... */).Return(&linodego.Instance{Status: linodego.InstanceOffline}, nil)
          })),
          Path(
            Call("not able to be created", func(ctx context.Context, mck Mock) {/* ... */})
            Result("error", func(ctx context.Context, mck Mock) {
              inst, err := EnsureInstanceNotOffline(ctx, /* ... */)
              require.ErrorContains(t, err, "instance was not booted: failed to create instance: reasons...")
              assert.Empty(inst)
            }),
          )
        ),
      ),
      Path(Call("instance exists but is offline", func(ctx context.Context, mck Mock) {
        mck.MachineClient.EXPECT().GetInstance(ctx, /* ... */).Return(&linodego.Instance{Status: linodego.InstanceOffline}, nil)
      })),
    ),
    OneOf(
      Path(
        Call("able to boot", func(ctx context.Context, mck Mock) {/*  */})
        Result("success", func(ctx context.Context, mck Mock) {
          inst, err := EnsureInstanceNotOffline(ctx, /* ... */)
          require.NoError(t, err)
          assert.Equal(t, inst.Status, linodego.InstanceBooting)
        })
      ),
      Path(
        Call("not able to boot", func(ctx context.Context, mck Mock) {/* returns API error */})
        Result("error", func(ctx context.Context, mck Mock) {
          inst, err := EnsureInstanceNotOffline(/* options */)
          require.ErrorContains(t, err, "instance was not booted: boot failed: reasons...")
          assert.Empty(inst)
        })
      )
    ),
  )
}

In this example, the nodes passed into Run are used to describe each permutation of the function being called with different results from the mock Linode machine client.

Nodes

  • Call describes the behavior of method calls by mock clients. A Call node can belong to one or more paths.
  • Result invokes the function with mock clients and tests the output. A Result node terminates each path it belongs to.
  • OneOf is a collection of diverging paths that will be evaluated in separate test cases.
  • Path is a collection of nodes that all belong to the same test path. Each child node of a Path is evaluated in order. Note that Path is only needed for logically grouping and isolating nodes within different test cases in a OneOf node, as shown in the sketch below.
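
For illustration, here is a compact sketch, using the same DSL as the example above with function bodies elided, of how these nodes compose into two test cases:

suite.Run(
  Call("get instance", func(ctx context.Context, mck Mock) { /* shared EXPECT() call */ }),
  OneOf(
    Path(
      Call("api returns instance", func(ctx context.Context, mck Mock) { /* ... */ }),
      Result("success", func(ctx context.Context, mck Mock) { /* call the function, assert no error */ }),
    ),
    Path(
      Call("api returns error", func(ctx context.Context, mck Mock) { /* ... */ }),
      Result("error", func(ctx context.Context, mck Mock) { /* call the function, assert the error */ }),
    ),
  ),
)
// Renders two test cases:
//   "get instance > api returns instance > success"
//   "get instance > api returns error > error"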

Setup, tear down, and event triggers

Setup and tear down nodes can be scheduled before and after each run. suite.BeforeEach receives a func(context.Context, Mock) function that will run before each path is evaluated. Likewise, suite.AfterEach will run after each path is evaluated.

In addition to the path nodes listed in the section above, a special node type Once may be specified to inject a function that will only be evaluated one time across all paths. It can be used to trigger side effects outside of mock client behavior that can impact the output of the function being tested.
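
The following is a minimal sketch, assuming the suite API described above, of how setup, tear down, and a Once node fit together (function bodies elided):

func TestWithSetupAndTeardown(t *testing.T) {
  suite := NewSuite(t, mock.MockLinodeMachineClient{})

  // Runs before each path is evaluated.
  suite.BeforeEach(func(ctx context.Context, mck Mock) { /* seed fixtures, set default expectations */ })
  // Runs after each path is evaluated.
  suite.AfterEach(func(ctx context.Context, mck Mock) { /* clean up fixtures */ })

  suite.Run(
    // Evaluated only one time across all paths, e.g. to trigger a side effect.
    Once("create shared resource", func(ctx context.Context, _ Mock) { /* ... */ }),
    Call("get instance", func(ctx context.Context, mck Mock) { /* EXPECT() calls on mck.MachineClient */ }),
    Result("success", func(ctx context.Context, mck Mock) { /* call the function, assert the output */ }),
  )
}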

Control flow

When Run is called on a test suite, paths are evaluated in parallel using t.Parallel(). Each path will be run with a separate t.Run call, and each test run will be named according to the descriptions specified in each node.

To help with visualizing the paths that will be rendered from nodes, a DescribePaths helper function can be called which returns a slice of strings describing each path. For instance, the following shows the output of DescribePaths on the paths described in the example above:

DescribePaths(/* nodes... */) /* [
  "instance exists and is not offline > success",
  "instance does not exist > not able to be created > error",
  "instance does not exist > able to be created > able to boot > success",
  "instance does not exist > able to be created > not able to boot > error",
  "instance exists but is offline > able to boot > success",
  "instance exists but is offline > not able to boot > error"
] */

Testing controllers

CAPL uses controller-runtime's envtest package which runs an instance of etcd and the Kubernetes API server for testing controllers. The test setup uses ginkgo as its test runner as well as gomega for assertions.

mocktest is also recommended when writing tests for controllers. The following is another contrived example of how to use its controller suite:

var _ = Describe("linode creation", func() {
  // Create a mocktest controller suite.
  suite := NewControllerSuite(GinkgoT(), mock.MockLinodeMachineClient{})

  obj := infrav1alpha2.LinodeMachine{
    ObjectMeta: metav1.ObjectMeta{/* ... */},
    Spec:       infrav1alpha2.LinodeMachineSpec{/* ... */},
  }

  suite.Run(
    Once("create resource", func(ctx context.Context, _ Mock) {
      // Use the EnvTest k8sClient to create the resource in the test server
      Expect(k8sClient.Create(ctx, &obj)).To(Succeed())
    }),
    Call("create a linode", func(ctx context.Context, mck Mock) {
      mck.MachineClient.EXPECT().CreateInstance(ctx, gomock.Any(), gomock.Any()).Return(&linodego.Instance{/* ... */}, nil)
    }),
    Result("update the resource status after linode creation", func(ctx context.Context, mck Mock) {
      reconciler := LinodeMachineReconciler{
        // Configure the reconciler to use the mock client for this test path
        LinodeClient: mck.MachineClient,
        // Use a managed recorder for capturing events published during this test
        Recorder: mck.Recorder(),
        // Use a managed logger for capturing logs written during the test
        // Note: This isn't a real struct field in LinodeMachineReconciler. A logger is configured elsewhere.
        Logger: mck.Logger(),
      }

      _, err := reconciler.Reconcile(ctx, reconcile.Request{/* ... */})
      Expect(err).NotTo(HaveOccurred())
      
      // Fetch the updated object in the test server and confirm it was updated
      Expect(k8sClient.Get(ctx, client.ObjectKeyFromObject(&obj), &obj)).To(Succeed())
      Expect(obj.Status.Ready).To(BeTrue())

      // Check for expected events and logs
      Expect(mck.Events()).To(ContainSubstring("Linode created!"))
      Expect(mck.Logs()).To(ContainSubstring("Linode created!"))
    }),
  )
})

E2E Tests

For e2e tests, CAPL uses the Chainsaw project, which leverages kind and Tilt to spin up a cluster with the CAPL controllers installed, and then uses chainsaw-test.yaml files to drive e2e testing.

All tests live in the e2e folder, with a directory structure of e2e/${COMPONENT}/${TEST_NAME}.

Environment Setup

The e2e tests use the local-linode infrastructure provider, which is registered by adding the following to ~/.cluster-api/clusterctl.yaml:

providers:
  - name: local-linode
    url: ${HOME}/cluster-api-provider-linode/infrastructure-local-linode/v0.0.0/infrastructure-components.yaml
    type: InfrastructureProvider

Running Tests

To run the e2e tests, run the following commands:

# Required env vars to run e2e tests
export INSTALL_K3S_PROVIDER=true
export INSTALL_RKE2_PROVIDER=true
export LINODE_REGION=us-sea
export LINODE_CONTROL_PLANE_MACHINE_TYPE=g6-standard-2
export LINODE_MACHINE_TYPE=g6-standard-2

make e2etest

Note: By default, make e2etest runs all the e2e tests defined under the /e2e directory.

To run a specific test, pass a selector to chainsaw by setting the E2E_SELECTOR env var.

Additional settings can be passed to chainsaw by setting the E2E_FLAGS env var.

Example: Only running e2e tests for flavors (default, k3s, rke2)

make e2etest E2E_SELECTOR='flavors' E2E_FLAGS='--assert-timeout 10m0s'

Note: The assert timeout is bumped to 10 minutes to allow the cluster to finish building and become available.

There are other selectors you can use to invoke specific tests. Please see the table below for all available selectors:

Tests                             | Selector
All Tests                         | all
All Controllers                   | quick
All Flavors (default, k3s, rke2)  | flavors
K3S Cluster                       | k3s
RKE2 Cluster                      | rke2
Default (kubeadm) Cluster         | kubeadm
Linode Cluster Controller         | linodecluster
Linode Machine Controller         | linodemachine
Linode Obj Controller             | linodeobj
Linode Obj Key Controller         | linodeobjkey
Linode VPC Controller             | linodevpc

Note: For any flavor e2e tests, please set the required env variables.

Adding Tests

  1. Create a new directory under the controller you are testing with the naming scheme of e2e/${COMPONENT}/${TEST_NAME}
  2. Create a minimal chainsaw-test.yaml file in the new test dir
    # yaml-language-server: $schema=https://raw.githubusercontent.com/kyverno/chainsaw/main/.schemas/json/test-chainsaw-v1alpha1.json
    apiVersion: chainsaw.kyverno.io/v1alpha1
    kind: Test
    metadata:
      name: $TEST_NAME
    spec:
      template: true # set to true if you are going to use any chainsaw templating
      steps:
      - name: step-01
        try:
        - apply:
            file: ${resource_name}.yaml
        - assert:
            file: 01-assert.yaml
    
  3. Add any resources to create or assert on in the same directory

Reference

For reference documentation for CAPL API types, please refer to the godocs