Adopt Karpenter Consolidation without Disrupting Critical Workloads
Introduction
Autoscaling in Kubernetes, particularly in cloud-hosted Kubernetes like Amazon EKS, comes in two flavors:
- Autoscale the app/workload within the cluster, using Horizontal / Vertical Pod Autoscalers (HPA / VPA)
- Autoscale the cluster itself, by adding / removing worker nodes automatically as needed
The Kubernetes Cluster Autoscaler is the go-to solution for the second kind of autoscaling. Karpenter is a more flexible alternative that provisions nodes directly instead of resizing predefined node groups. Both solutions watch for pods pending due to lack of resources & provision nodes to meet pod requirements.
Karpenter can also ensure your cluster runs at max efficiency & min cost at all times by:
- Auto-deleting empty nodes
- Deleting nodes whose workloads can run on other nodes
- Replacing nodes with lower-priced variants when possible
This is called “consolidation”.
Although great in theory, Karpenter consolidation should not be enabled unless your workloads can tolerate ad hoc disruptions, because Karpenter may evict pods at any time to consolidate the cluster. Karpenter does, however, provide several mechanisms to control its disruption behavior.
Time-Bound Consolidation
Karpenter disruption budgets can be used to rate limit Karpenter’s disruption of nodes in 3 ways:
- Define a max percentage of nodes that can be disrupted at a time
- Define a max count of nodes that can be disrupted at a time
- Define a max % or count of nodes that can be disrupted in a time window
Disruption budgets are defined in the Karpenter NodePool manifest. For example, this budget pauses all disruption from 3 to 6 AM UTC on Saturdays:
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: Never
    budgets:
      - nodes: "0"
        schedule: "0 3 * * sat"
        duration: 3h
Here, schedule is a cron expression that defines the start of the time window, and duration defines how long the budget stays active after that start.
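The three forms of rate limiting listed above could look like the following fragment. This is a minimal sketch: the values 20%, 5 and the weekday window are illustrative assumptions, not recommendations, and each entry can also be used on its own.

```yaml
# Fragment of spec.disruption in a NodePool (values are illustrative)
budgets:
  # 1. Percentage: at most 20% of this NodePool's nodes disrupted at a time
  - nodes: "20%"
  # 2. Count: at most 5 nodes disrupted at a time
  - nodes: "5"
  # 3. Time window: no disruption for 8 hours starting 9 AM UTC, Mon to Fri
  - nodes: "0"
    schedule: "0 9 * * mon-fri"
    duration: 8h
```

Budgets without a schedule are always active, while a budget with a schedule applies only during its window.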
When multiple budgets are defined, the most restrictive one takes effect. Budgets can therefore be used to define when the cluster should not consolidate. So if you prefer to consolidate your cluster only once a week, between 3 and 6 AM UTC on Saturdays, your budget should define a time window covering all hours of the week except those 3 (168 - 3 = 165 hours):
budgets:                       # of the 168 hours in a week
  - nodes: "0"                 # pause consolidation
    duration: 165h             # for 165 hours
    schedule: "0 6 * * sat"    # starting at 6 AM UTC every Saturday
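Because the most restrictive active budget wins, the weekly window can be combined with an always-on cap so that consolidation is also rate limited while it is allowed. The sketch below assumes a 10% cap, which is an illustrative value:

```yaml
# Fragment of spec.disruption in a NodePool (the 10% cap is an assumption)
budgets:
  # Always active: never disrupt more than 10% of nodes at a time
  - nodes: "10%"
  # Active for 165 of the 168 hours in a week: block all disruption
  - nodes: "0"
    schedule: "0 6 * * sat"
    duration: 165h
```

With this pair, "0" is the most restrictive budget for most of the week, so nothing is disrupted. Between 3 and 6 AM UTC on Saturdays only the 10% budget is active, so consolidation proceeds gradually.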
Protect Critical Workloads
Depending on your workload characteristics & availability requirements, you can choose to let the cluster consolidate nightly, weekly, etc. For pods that must not be disrupted even when consolidation is allowed, such as nightly batch jobs, annotate them with:
karpenter.sh/do-not-disrupt: "true"
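The annotation belongs on the pod itself, so for a workload managed by a controller it goes in the pod template rather than on the controller object. As a rough sketch for a nightly batch job, assuming a hypothetical CronJob named nightly-report with a placeholder image and command:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report               # hypothetical name
spec:
  schedule: "0 1 * * *"              # illustrative: 1 AM daily
  jobTemplate:
    spec:
      template:
        metadata:
          annotations:
            # Set on the pod template so every pod of the job carries it
            karpenter.sh/do-not-disrupt: "true"
        spec:
          restartPolicy: Never
          containers:
            - name: report           # placeholder container
              image: busybox:1.36
              command: ["sh", "-c", "echo running nightly batch && sleep 3600"]
```

While such a pod is running, Karpenter should not voluntarily disrupt the node it runs on. Once the pod finishes, the node becomes eligible for consolidation again.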
The same annotation can be applied to a node that should not be disrupted.
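One way to apply the annotation to nodes is through the NodePool template, so that every node launched from that NodePool carries it. The excerpt below is a sketch only: the NodePool name critical is hypothetical, and the other fields a real NodePool needs are omitted.

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: critical                     # hypothetical, dedicated to protected workloads
spec:
  template:
    metadata:
      annotations:
        # Applied to all nodes launched by this NodePool
        karpenter.sh/do-not-disrupt: "true"
    # spec (requirements, nodeClassRef, etc.) omitted in this excerpt
```

Since this protects every node in the NodePool, it probably fits best on a separate NodePool reserved for critical workloads, with consolidation left enabled elsewhere.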
Conclusion
This article was an introduction to Karpenter & its consolidation feature. Since not all workloads are “built for the cloud”, disrupting them ad hoc may cause availability issues. Here we learned how to reap the cost & efficiency benefits of consolidation, while still avoiding end-user interruptions during business hours.
About the Author ✍🏻
Harish KM is a Principal DevOps Engineer at QloudX & a top-ranked AWS Ambassador since 2020. 👨🏻‍💻
With over a decade of industry experience as everything from a full-stack engineer to a cloud architect, Harish has built many world-class solutions for clients around the world! 👷🏻‍♂️
With over 20 certifications in cloud (AWS, Azure, GCP), containers (Kubernetes, Docker) & DevOps (Terraform, Ansible, Jenkins), Harish is an expert in a multitude of technologies. 📚
These days, his focus is on the fascinating world of DevOps & how it can transform the way we do things! 🚀