nOps recently launched Episode 3 of nCast in collaboration with AWS Partner Solution Architects Young Seok Jeong and Andrew Park from the Container Services Team. The episode focused on Karpenter, AWS's open-source node management solution for Amazon EKS. James Wilson, VP of Engineering & Product Development Leader at nOps, shared best practices for a successful migration to Karpenter, including the use of Blueprints and EKS add-ons. You can listen to the audio version of the podcast here.
This blog walks you through the core concepts of Karpenter and the step-by-step process of migrating to it.
Cluster Autoscaler vs Karpenter: What Are The Differences?
AWS Karpenter is an open-source autoscaling solution that brings significant advancements in node management to the Kubernetes community. If you are migrating from the cluster autoscaler, you most likely rely on node groups to scale your cluster. Node groups use autoscaling groups to handle node scaling, with the cluster autoscaler controller running inside your cluster and monitoring your workloads.
This controller manages the autoscaling groups and increases their desired size based on the pending workloads in the cluster. Because each autoscaling group is tied to a fixed instance type and size, resources are wasted whenever the pending workload does not need the full capacity of that instance.
Karpenter, in contrast, does not use the concept of node groups. Instead, it introduces Provisioners as an alternative. Provisioners offer more flexibility than autoscaling groups, letting you define how you want your workloads to be scaled. Because Karpenter does not rely on autoscaling groups, it has more freedom when scaling, choosing the instance size and type to launch based on the needs of the pending workload.
For example, if you have pending pods that only require 2 CPU and 2 GB of memory to be scheduled, Karpenter will look for the instance type that best matches those needs, instead of simply launching a larger instance that could schedule the pending workload but at the expense of wasted resources.
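To make this concrete, here is a hypothetical pod spec with those requests; the name is illustrative only, and the image is the same pause image used in the test deployment later in this post:

apiVersion: v1
kind: Pod
metadata:
  name: small-workload # illustrative name
spec:
  containers:
    - name: app
      image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
      resources:
        requests:
          cpu: "2"     # Karpenter sizes the launched node to these requests
          memory: 2Gi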
With Provisioners in Karpenter, you can:
- Define taints
- Label nodes
- Annotate nodes
- Customize the kubelet args (see the sketch after this list)
Therefore, when using Karpenter, you can think of Provisioners as the rough equivalent of node groups, at least at a very high level, providing similar capabilities while offering additional flexibility for scaling your workloads.
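As an example of the last point, kubelet args can be tuned directly on a Provisioner through its kubeletConfiguration block. The following is a minimal sketch; the name and values are illustrative assumptions, not recommendations:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: custom-kubelet # illustrative name
spec:
  kubeletConfiguration:
    maxPods: 110       # illustrative value
    systemReserved:
      cpu: 100m        # illustrative value
      memory: 100Mi    # illustrative value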
Defining your provisioners
Unlike node groups, which are a resource of your cloud provider, Provisioners are managed through a Kubernetes CRD, similar to how you manage other Kubernetes resources like Deployments or DaemonSets. During the installation of Karpenter, a CRD specifically for Provisioners is created in the cluster. Additionally, the Karpenter controller, which is also installed via the Helm chart, watches for events associated with Provisioner resources and takes the appropriate actions. Here's an example of what a Provisioner looks like. For the purposes of this blog post, only the minimum fields required to run have been included; for more advanced configuration, refer to the Karpenter documentation.
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  taints:
    - key: example.com/special-taint
      effect: NoSchedule
  labels:
    billing-team: my-team
  annotations:
    example.com/owner: "my-team"
  requirements:
    - key: "karpenter.k8s.aws/instance-category"
      operator: In
      values: ["c", "m", "r"]
    - key: "karpenter.k8s.aws/instance-cpu"
      operator: In
      values: ["4", "8", "16", "32"]
    - key: "karpenter.k8s.aws/instance-hypervisor"
      operator: In
      values: ["nitro"]
    - key: "karpenter.k8s.aws/instance-generation"
      operator: Gt
      values: ["2"]
    - key: "topology.kubernetes.io/zone"
      operator: In
      values: ["us-west-2a", "us-west-2b"]
    - key: "kubernetes.io/arch"
      operator: In
      values: ["arm64", "amd64"]
    - key: "karpenter.sh/capacity-type" # If not included, the webhook for the AWS cloud provider will default to on-demand
      operator: In
      values: ["spot", "on-demand"]
  ttlSecondsUntilExpired: 2592000 # 30 Days = 60 * 60 * 24 * 30 Seconds
  # If omitted, the feature is disabled, nodes will never scale down due to low utilization
  ttlSecondsAfterEmpty: 30
  weight: 10
If you have been using node groups, you are probably familiar with most of the configuration above. The requirements field is where you specify which instance types Karpenter is allowed to launch.
To list all provisioners using kubectl, use `kubectl get provisioners`. As with other Kubernetes resources, you can describe them or output them to YAML to view more details.
How Many Provisioners Should You Create?
If you are currently running a single node group in your EKS cluster, the migration is more straightforward: you only need to create a single provisioner in Karpenter with a configuration similar to your node group but with a much larger instance type pool. Since Karpenter is smarter when selecting instances for scaling and doesn't suffer from the limitations of the cluster autoscaler, the larger the instance type pool the better.
If you are currently using multiple managed node groups to scale your cluster, you will want to keep the Karpenter setup as close as possible to your existing one, at least for some time. In this case, start by creating one provisioner per node group, carrying over any instance type considerations you need.
You also need to consider whether your applications have particular requirements, such as high-IO workloads that need local NVMe storage instead of EBS. In cases like these, you will want multiple provisioners with the appropriate instance types so that you don't impact the performance of your workloads, as in the sketch below.
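For illustration, a second provisioner dedicated to high-IO workloads might constrain scheduling to storage-optimized instance families with local NVMe. The name, taint key, and instance families below are assumptions you would adapt to your own workloads:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: high-io # illustrative name
spec:
  taints:
    - key: example.com/high-io # illustrative taint so only tolerating pods land here
      effect: NoSchedule
  requirements:
    - key: karpenter.k8s.aws/instance-family
      operator: In
      values: ["i3", "i3en", "i4i"] # storage-optimized families with local NVMe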
Deploying Karpenter
To start using Karpenter you need nodes to run the Karpenter controller on. You can either:
– Create a managed node group and taint it so that only Karpenter runs on it
– Use Fargate to run Karpenter and CoreDNS so that you don't have to create any node groups
We are going to cover the second approach using Terraform, following the Terraform EKS Karpenter blueprint from the AWS blueprints repository. The following code assumes you have created the cluster using the official EKS module.
Deploying Karpenter To Run On Fargate
1. Create the Fargate profiles required for Karpenter
We need a Fargate profile for the Karpenter namespace. Add the Fargate profiles to your EKS module declaration in Terraform as shown below.
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 19.13"
cluster_name = local.name
cluster_version = "1.27"
cluster_endpoint_public_access = true
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
# Fargate profiles use the cluster primary security group so these are not utilized
create_cluster_security_group = false
create_node_security_group = false
manage_aws_auth_configmap = true
aws_auth_roles = [
# We need to add in the Karpenter node IAM role for nodes launched by Karpenter
{
rolearn = module.eks_blueprints_addons.karpenter.node_iam_role_arn
username = "system:node:{{EC2PrivateDNSName}}"
groups = [
"system:bootstrappers",
"system:nodes",
]
},
]
fargate_profiles = {
karpenter = {
selectors = [
{ namespace = "karpenter" }
]
}
kube_system = {
name = "kube-system"
selectors = [
{ namespace = "kube-system" }
]
}
}
tags = merge(local.tags, {
# NOTE - if creating multiple security groups with this module, only tag the
# security group that Karpenter should utilize with the following tag
# (i.e. - at most, only one security group should have this tag in your account)
"karpenter.sh/discovery" = local.name
})
}
2. Deploy the Karpenter helm chart and other AWS resources
First we need to deploy the Karpenter Helm chart. This installs the Karpenter CRDs, such as the Provisioner, and the Karpenter controller, which takes care of executing the scaling actions.
module "eks_blueprints_kubernetes_addons" {
source = "git@github.com:aws-ia/terraform-aws-eks-blueprints//modules/kubernetes-addons?ref=v4.31.0"
eks_cluster_id = module.eks.cluster_name
eks_cluster_endpoint = module.eks.cluster_endpoint
eks_oidc_provider = module.eks.oidc_provider
eks_cluster_version = module.eks.cluster_version
# Wait on the `kube-system` profile before provisioning addons
data_plane_wait_arn = join(",", [for prof in module.eks.fargate_profiles : prof.fargate_profile_arn])
enable_karpenter = true
karpenter_helm_config = {
repository_username = data.aws_ecrpublic_authorization_token.token.user_name
repository_password = data.aws_ecrpublic_authorization_token.token.password
}
karpenter_node_iam_instance_profile = module.karpenter.instance_profile_name
karpenter_enable_spot_termination_handling = true
}
The other module that we will be using is the Karpenter module from the terraform-aws-modules repo. Karpenter needs some supporting infrastructure in AWS, and this module takes care of creating the IAM roles and the node instance profile that Karpenter nodes will use. It also creates the interruption SQS queue that Karpenter uses to capture node interruption events from AWS. For more information on the interruption queue and how it works, check Karpenter's Interruption documentation.
module "karpenter" {
source = "terraform-aws-modules/eks/aws//modules/karpenter"
version = "~> 19.12"
cluster_name = module.eks.cluster_name
irsa_oidc_provider_arn = module.eks.oidc_provider_arn
create_irsa = false # IRSA will be created by the kubernetes-addons module
}
3. Subnets and Security group tagging
Lastly, you will need to tag the security groups and subnets that you want Karpenter to use for its nodes; Karpenter will auto-discover them. If you are using the AWS VPC module, you can apply the tags in Terraform. This is the tag that you need on both the subnets and the security groups:
karpenter.sh/discovery: <your cluster name>
4. Creating the provisioners and the node template
Now that the supporting infrastructure is in place, the last step is to create the node template and the provisioner.
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  labels:
    karpenter-migration: "true"
  taints:
    - key: nops.io/testing
      effect: NoSchedule
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot"]
  limits:
    resources:
      cpu: 1000
  providerRef:
    name: default
  consolidation:
    enabled: true
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  subnetSelector:
    karpenter.sh/discovery: ${CLUSTER_NAME}
  securityGroupSelector:
    karpenter.sh/discovery: ${CLUSTER_NAME}
5. Testing that it worked
Create a test deployment by applying the following YAML:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      nodeSelector:
        karpenter-migration: "true"
      tolerations:
        - key: "nops.io/testing"
          operator: "Exists"
          effect: "NoSchedule"
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1
The deployment initially has 0 replicas. To trigger Karpenter to launch nodes, scale it up and follow the controller logs:
kubectl scale deployment inflate --replicas 5
kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller
Once the pods leave the "Pending" state, you should see new nodes created by Karpenter in the cluster.
Next Steps
As you may have noticed, we used a taint and a node selector on the deployment. The goal is to avoid impacting your existing workloads: the taint and the selector make sure that your existing workloads keep running on your node groups for now.
The next step is to start incrementally moving your workloads to Karpenter. You can do this either by removing the taints from the provisioner entirely, or by adding the node selector that matches the provisioner, along with a matching toleration, to your deployments, as in the snippet below.
In theory, removing the taints from the provisioner should be enough, but for production you may want to take a less risky approach, such as moving your deployments over incrementally.
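For example, assuming the provisioner label and taint from the earlier examples, opting an existing Deployment into Karpenter only requires adding a nodeSelector and a matching toleration to its pod template:

spec:
  template:
    spec:
      nodeSelector:
        karpenter-migration: "true"   # label set by the provisioner above
      tolerations:
        - key: "nops.io/testing"      # tolerate the provisioner's taint
          operator: "Exists"
          effect: "NoSchedule"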
Final Thoughts
That covers the migration to Karpenter. While Karpenter offers several benefits, it also has limitations, such as not reconsidering Spot prices and giving only short notice for Spot terminations. To address these shortcomings, nOps launched the nOps Karpenter Solution (nKS).
Here’s how nKS is an easier and more effective approach:
- nKS takes into account your entire AWS ecosystem, ensuring that node scheduling is optimized while managing your reserved instance and savings plan commitments.
- It uses machine learning algorithms to predict node termination up to 60 minutes in advance, allowing sufficient time to address potential issues and minimize any service disruptions.
- It offers a user-friendly interface for easy configuration and management of Karpenter, reducing the complexity associated with Kubernetes autoscaling.
Upgrade to nOps Karpenter Solution (nKS) and start automatically optimizing your environment for spot, RI, and savings plans today. Reduce your EKS infrastructure costs by 50% or more with nKS.
Explore more about nOps Karpenter Solution (nKS) here!