Karpenter is an open-source autoscaler for Kubernetes nodes, which can improve the efficiency and cost-effectiveness of running workloads on Kubernetes clusters. It was originally developed by Amazon Web Services (AWS), is licensed under the permissive Apache License 2.0, and has over 300 GitHub contributors.
Unlike traditional Kubernetes autoscalers that manage cluster node operations based on pre-defined metrics, Karpenter proactively adjusts compute resources to ensure applications have as many Kubernetes nodes as they need. Karpenter simplifies cluster management challenges such as over-provisioning and underutilization by dynamically provisioning the right size and type of resources based on application needs.
The original Karpenter project works in the AWS environment. However, forks of the project are available for Azure and other cloud environments.
Get Karpenter at the official GitHub repo
Get Karpenter for Azure here
This is part of a series of articles about Kubernetes management.
Karpenter offers the following capabilities.
Karpenter can quickly scale nodes in a Kubernetes cluster by launching additional nodes as soon as they are needed. It watches for pods that the Kubernetes scheduler has marked as unschedulable and provisions capacity for them directly, reducing latency in resource allocation. This capacity to respond rapidly to workload demands helps maintain application performance and ensures service reliability.
Karpenter focuses on deploying the most appropriate compute resources based on workload characteristics. By evaluating the resource requests and scheduling constraints of pending pods, it selects instance types that fit them, which minimizes waste and reduces costs. It keeps re-evaluating these decisions as workloads change, consolidating or replacing nodes when a better fit is available.
Karpenter offers flexible resource provisioning, allowing users to specify requirements such as instance types, zones, and procurement options. The flexibility extends to adjusting these specifications dynamically, accommodating the changing needs of applications. This allows organizations to optimize their infrastructure in real time, aligning application requirements and cost-efficiency.
Node lifecycle management with Karpenter is automated to handle various tasks, including node creation, updating, and eventual decommissioning. This process minimizes manual interventions and lowers the risk of human error. Karpenter’s intelligent lifecycle management also helps in maintaining cluster health and efficiency, ensuring nodes are only operational when needed and gracefully decommissioning them after their useful life.
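Karpenter operates by integrating tightly with the Kubernetes API, continuously monitoring cluster state and workload demands. The process can be simplified as follows:
1. Watch for pods that the Kubernetes scheduler has marked as unschedulable.
2. Evaluate the scheduling constraints requested by those pods, such as resource requests, node selectors, affinities, and tolerations.
3. Provision nodes that meet the requirements of the pods.
4. Remove nodes when they are no longer needed.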
Itiel Shwartz
Co-Founder & CTO
In my experience, here are tips that can help you better utilize Karpenter:
Integrate priority classes with Karpenter to ensure that critical workloads receive resources before less important ones. This can help in maintaining the performance of essential services during high demand.
Configure Karpenter to use a mix of instance types and sizes to optimize costs and availability. This strategy reduces dependency on specific instance types that might be in short supply.
Configure Karpenter to fall back to on-demand instances when spot instances are unavailable. This ensures that your applications remain resilient and available even during spot market fluctuations.
Implement predictive scaling models to anticipate future workloads and scale nodes proactively. This can reduce latency in resource allocation and improve overall application performance.
Configure Karpenter to optimize pod distribution across nodes to minimize latency and maximize resource utilization. This can be achieved by fine-tuning node selectors and affinity rules.
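Several of the tips above translate into short manifests. The following is a minimal sketch, not a recommended production setup: it writes a PriorityClass and a NodePool to a temporary file for review and applies nothing. The resource names (critical-services, flexible), the priority value, and the CPU limit are assumptions chosen for illustration, and the NodePool assumes an EC2NodeClass named default already exists. Listing both spot and on-demand as capacity types lets Karpenter use on-demand capacity when spot is unavailable.

```shell
#!/usr/bin/env bash
# Sketch: write example manifests to a local file for review; nothing is applied.
# Names such as critical-services and flexible are hypothetical.
MANIFEST="$(mktemp)"

cat > "${MANIFEST}" <<'EOF'
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-services
value: 1000000
globalDefault: false
description: "Pods that should be scheduled ahead of batch work."
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: flexible
spec:
  template:
    spec:
      requirements:
        # Allow both capacity types so on-demand can be used when spot is unavailable.
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        # Allow several instance families to reduce dependency on any one type.
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
  limits:
    cpu: 100
EOF

echo "Wrote example manifests to ${MANIFEST}"
# After reviewing the file, it could be applied with: kubectl apply -f "${MANIFEST}"
```

Pods opt in to the priority class by setting priorityClassName: critical-services in their pod spec.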
While both Karpenter and the traditional Kubernetes Cluster Autoscaler aim to manage node scaling, they differ in their approaches and capabilities:
Cluster Autoscaler scales predefined node groups in response to pod scheduling events. It typically adds nodes when there are pending pods that cannot be scheduled due to resource constraints.
Karpenter also reacts to unschedulable pods, but it provisions instances directly instead of resizing node groups, and it chooses instance types based on what the pending pods request. It can also remove or replace nodes as demand changes, resulting in faster and more accurate scaling decisions.
Cluster Autoscaler is limited to adding and removing nodes based on predefined policies and thresholds.
Karpenter offers greater flexibility with dynamic provisioning options, allowing users to specify instance types, zones, and procurement strategies. It adapts to changing application needs in real time.
Cluster Autoscaler focuses on ensuring pods are scheduled but might not always optimize resource utilization efficiently, potentially leading to over-provisioning.
Karpenter continuously optimizes resources through consolidation, removing or replacing underutilized nodes to minimize waste and reduce costs.
Cluster Autoscaler handles basic node management tasks but may require manual interventions for complex scenarios.
Karpenter automates the entire node lifecycle, from creation to decommissioning, reducing the need for manual management and minimizing the risk of human error.
When evaluating Karpenter, it’s important to be aware of the following limitations.
Karpenter is a relatively new solution, and the version used in this guide (0.37) still exposes beta APIs, so users might encounter bugs and feature gaps. The community and development team are actively working on improvements, but early adopters should be prepared for potential stability issues and may need to contribute to testing and feedback processes to help mature the project.
Karpenter’s efficiency is heavily reliant on its configuration process. While it automates many aspects of cluster management, the initial setup and ongoing adjustments require deep knowledge and understanding of both Karpenter and the underlying Kubernetes framework. New users might find the learning curve steep, and misconfigurations can lead to inefficiencies.
While Karpenter is intended to optimize costs by matching resource use to demand accurately, it requires careful configuration and understanding of workload patterns to achieve the expected financial benefits. Users must precisely define parameters to avoid over-provisioning, which can otherwise negate cost savings.
In the event of misconfiguration, costs might escalate quickly due to unnecessary scaling actions. Thus, continuous monitoring and adjustment of configurations are critical when using Karpenter.
Karpenter’s ability to integrate real-time spot pricing information allows for cost-effective provisioning decisions. This awareness helps in selecting the most cost-efficient compute resources available, reducing expenses when market conditions are favorable.
However, reliance on spot instances can introduce risks, especially in volatile markets where availability may suddenly change. Implementing fallback strategies and understanding cloud provider pricing models is essential.
Here’s an overview of how to get started with Karpenter. Code and instructions are adapted from the Karpenter documentation.
Utilities that need to be installed include the AWS CLI, kubectl, eksctl (v0.180.0 or later), and Helm:
AWS CLI (macOS installer shown):
curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
sudo installer -pkg AWSCLIV2.pkg -target /

kubectl (Linux amd64 binary shown):
curl -LO "https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl

eksctl:
curl --silent --location "https://github.com/weaveworks/eksctl/releases/download/latest_release/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
sudo mv /tmp/eksctl /usr/local/bin

Helm:
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

Verify that the AWS CLI can authenticate:
aws sts get-caller-identity
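Once the tools are installed, a quick check can confirm that all four are on the PATH. The following is a minimal sketch; missing_tools is a hypothetical helper name introduced here for illustration. It only confirms that each binary exists, not that it meets a version requirement such as eksctl v0.180.0 or later.

```shell
#!/usr/bin/env bash
# missing_tools (hypothetical helper): print the names of any given
# commands that cannot be found on the PATH, separated by spaces.
missing_tools() {
  local tool missing=""
  for tool in "$@"; do
    if ! command -v "${tool}" >/dev/null 2>&1; then
      missing="${missing:+${missing} }${tool}"
    fi
  done
  printf '%s' "${missing}"
}

MISSING="$(missing_tools aws kubectl eksctl helm)"
if [ -n "${MISSING}" ]; then
  echo "Missing tools: ${MISSING}"
else
  echo "All required tools are installed."
fi
```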
After installing the tools, proceed with these steps:
export KARPENTER_NAMESPACE="kube-system"
export KARPENTER_VERSION="0.37.0"
export K8S_VERSION="1.30"

export AWS_PARTITION="aws"
export MY_CLUSTER="${USER}-karpenter-demo"
export AWS_DEFAULT_REGION="us-west-2"
export MY_AWS_ACCOUNT="$(aws sts get-caller-identity --query Account --output text)"
export TEMPOUT="$(mktemp)"
export ARM_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2-arm64/recommended/image_id --query Parameter.Value --output text)"
export AMD_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2/recommended/image_id --query Parameter.Value --output text)"
export GPU_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2-gpu/recommended/image_id --query Parameter.Value --output text)"
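Several of these variables are filled in by AWS CLI calls, so an expired credential or a wrong region can leave one empty without any obvious error. The sketch below reports which exported variables are unset or empty before the cluster is created; unset_vars is a hypothetical helper name, not part of any tool used in this guide.

```shell
#!/usr/bin/env bash
# unset_vars (hypothetical helper): print the names of any given
# environment variables that are unset or empty, separated by spaces.
# It relies on printenv, so it only sees exported variables.
unset_vars() {
  local name unset_list=""
  for name in "$@"; do
    if [ -z "$(printenv "${name}")" ]; then
      unset_list="${unset_list:+${unset_list} }${name}"
    fi
  done
  printf '%s' "${unset_list}"
}

UNSET="$(unset_vars KARPENTER_NAMESPACE KARPENTER_VERSION K8S_VERSION \
  AWS_PARTITION MY_CLUSTER AWS_DEFAULT_REGION MY_AWS_ACCOUNT \
  ARM_AMI_ID AMD_AMI_ID)"
if [ -n "${UNSET}" ]; then
  echo "Unset or empty variables: ${UNSET}"
else
  echo "All required variables are set."
fi
```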
Create a basic cluster with eksctl:
# Deploy the IAM roles and policies that Karpenter needs.
curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml > "${TEMPOUT}" \
&& aws cloudformation deploy \
  --stack-name "Karpenter-${MY_CLUSTER}" \
  --template-file "${TEMPOUT}" \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=${MY_CLUSTER}"

# Create the cluster with a small managed node group for Karpenter itself.
eksctl create cluster -f - <<EOF
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ${MY_CLUSTER}
  region: ${AWS_DEFAULT_REGION}
  version: "${K8S_VERSION}"
  tags:
    karpenter.sh/discovery: ${MY_CLUSTER}

iam:
  withOIDC: true
  podIdentityAssociations:
  - namespace: "${KARPENTER_NAMESPACE}"
    serviceAccountName: karpenter
    roleName: ${MY_CLUSTER}-karpenter
    permissionPolicyARNs:
    - arn:${AWS_PARTITION}:iam::${MY_AWS_ACCOUNT}:policy/KarpenterControllerPolicy-${MY_CLUSTER}

iamIdentityMappings:
- arn: "arn:${AWS_PARTITION}:iam::${MY_AWS_ACCOUNT}:role/KarpenterNodeRole-${MY_CLUSTER}"
  username: system:node:{{EC2PrivateDNSName}}
  groups:
  - system:bootstrappers
  - system:nodes

managedNodeGroups:
- instanceType: m5.large
  amiFamily: AmazonLinux2
  name: ${MY_CLUSTER}-ng
  desiredCapacity: 2
  minSize: 1
  maxSize: 10

addons:
- name: eks-pod-identity-agent
EOF

# Capture the values needed by later steps.
export CLUSTER_ENDPOINT="$(aws eks describe-cluster --name "${MY_CLUSTER}" --query "cluster.endpoint" --output text)"
export KARPENTER_IAM_ROLE_ARN="arn:${AWS_PARTITION}:iam::${MY_AWS_ACCOUNT}:role/${MY_CLUSTER}-karpenter"

echo "${CLUSTER_ENDPOINT} ${KARPENTER_IAM_ROLE_ARN}"
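The final echo command prints the cluster endpoint followed by the Karpenter IAM role ARN.
If this account has never used EC2 Spot, create the service-linked role for it. The trailing "|| true" ignores the error returned when the role already exists:
aws iam create-service-linked-role --aws-service-name spot.amazonaws.com || true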
You can install Karpenter using Helm:
helm registry logout public.ecr.aws

helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version "${KARPENTER_VERSION}" --namespace "${KARPENTER_NAMESPACE}" --create-namespace \
  --set "settings.clusterName=${MY_CLUSTER}" \
  --set "settings.interruptionQueue=${MY_CLUSTER}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait
To create a default NodePool that can handle different pod shapes, use the following script. This script creates a NodePool and an EC2NodeClass for Karpenter to manage.
The NodePool specifies requirements such as architecture, operating system, and instance types. It sets a limit on the CPU resources and defines policies for node consolidation to optimize resource usage. The EC2NodeClass includes configurations for Amazon Machine Images (AMIs), roles, and security groups, which ensure the new nodes meet the cluster’s security and operational standards.
cat <<EOF | envsubst | kubectl apply -f -
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["2"]
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
  limits:
    cpu: 1000
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h # 30 * 24h = 720h
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2 # Amazon Linux 2
  role: "KarpenterNodeRole-${MY_CLUSTER}"
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${MY_CLUSTER}"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${MY_CLUSTER}"
  amiSelectorTerms:
    - id: "${ARM_AMI_ID}"
    - id: "${AMD_AMI_ID}"
EOF
Use the following script to deploy and scale up the application. The Deployment resource defines an application called “scaleup” with a placeholder container image.
Initially, the number of replicas is set to zero. The kubectl scale command then increases the number of replicas to three, prompting Karpenter to provision additional nodes if needed. The final command retrieves logs from the Karpenter controller to monitor the scaling activities and ensure that the deployment and node provisioning are functioning as expected.
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scaleup
spec:
  replicas: 0
  selector:
    matchLabels:
      app: scaleup
  template:
    metadata:
      labels:
        app: scaleup
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: scaleup
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1
EOF

kubectl scale deployment scaleup --replicas 3
kubectl logs -f -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter -c controller
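After the test, the demo workload and the cluster can be removed so they stop incurring cost. The following is a sketch under stated assumptions: run, scale_down, and teardown are hypothetical helper names, and the script defaults to a dry run that only prints the commands. It assumes the variables exported earlier (KARPENTER_NAMESPACE, MY_CLUSTER) are still set and that the resources were created with the names used in this guide. It may not cover every resource in your account, so check the AWS console afterwards.

```shell
#!/usr/bin/env bash
# run (hypothetical helper): print each command, and only execute it
# when DRY_RUN is set to 0. The default is a dry run, so nothing is deleted.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    echo "+ $*"
  fi
}

# Delete the demo workload so Karpenter can remove the nodes it added.
scale_down() {
  run kubectl delete deployment scaleup
}

# Remove Karpenter, its IAM stack, and the cluster created in this guide.
teardown() {
  run helm uninstall karpenter --namespace "${KARPENTER_NAMESPACE}"
  run aws cloudformation delete-stack --stack-name "Karpenter-${MY_CLUSTER}"
  run eksctl delete cluster --name "${MY_CLUSTER}"
}

scale_down
teardown
```

To execute the commands for real, set DRY_RUN=0 before running the script.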
Komodor is the Continuous Kubernetes Reliability Platform, designed to democratize K8s expertise across the organization and enable engineering teams to leverage its full value.
Komodor’s platform empowers developers to confidently monitor and troubleshoot their workloads while allowing cluster operators to enforce standardization and optimize performance. Specifically when working in a hybrid environment, Komodor reduces the complexity by providing a unified view of all your services and clusters.
By leveraging Komodor, companies of all sizes significantly improve reliability, productivity, and velocity. Or, to put it simply – Komodor helps you spend less time and resources on managing Kubernetes, and more time on innovating at scale.
If you are interested in checking out Komodor, use this link to sign up for a Free Trial.