More Efficient Scaling and Safer Savings Using Karpenter on EKS

Reliable cost reduction on EKS by making Karpenter’s high-speed autoscaling continuously production-safe.

In an AWS-managed Kubernetes environment like EKS, “cost optimization” is not a side quest for FinOps. It’s an SRE problem because the same changes that reduce spend can also change scheduling latency, rollout safety, and failure modes under load.

At Komodor, our goal for cost optimization is straightforward: reduce waste without turning your cluster into a fragile system that only works on calm days. That means aligning supply with real demand, and applying changes with the same discipline you’d apply to any reliability-impacting rollout.

The path to lower spend is well understood. The challenge is making it work safely in production.

Why EKS Costs Balloon (and Why “More Nodes” is a Symptom)

For most EKS clusters, the bill grows because small “safe decisions” accumulate:

  • Outdated resource guesses — Requests and limits set months ago no longer reflect reality. These are often copied from old templates, sized for peak events like launch days, seasonal surges, or Black Friday traffic, and never revisited.
  • Overprovisioning for safety — Headroom inside nodes, plus extra nodes, accumulates over time and is never removed or reassessed.
  • Fragmentation — Scale-down becomes hard because workloads can’t move cleanly across nodes during consolidation. What’s more, teams often hesitate to remove capacity. No one wants to be the person who changed something that “works” and accidentally broke it, especially when the downside is slower scheduling, noisier rollouts, and potential SLO hits.

AWS explicitly calls out that failing to continuously adjust resource allocations leads not only to higher costs, but also to worse performance and reliability over time.

The AWS Priority Stack

AWS frames EKS cost optimization around a simple practical sequence:

  1. Right-size workloads
  2. Reduce unused capacity
  3. Optimize capacity types (Spot, GPUs, instance families)

Everything else builds on that foundation, and the order is intentional. If your requests are wrong, every node-level optimization downstream becomes less effective, because the scheduler is making decisions based on inaccurate inputs.

Foundation: Right-Sizing That Protects Performance

The goal of right-sizing is not aggressive cost-cutting. It’s about removing waste while ensuring requests still reflect the behavior and performance boundaries your organization has set.

AWS emphasizes that inaccurate requests and limits are one of the biggest drivers of unused capacity in Kubernetes clusters. When requests are inflated, the scheduler has no choice but to reserve space that never gets used, pushing clusters to scale out unnecessarily. 

These requests should track actual utilization, not guesses. And every container in the pod, including sidecars, matters. In short, any scaling logic should be tied to real data about your system’s health and saturation.
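As a concrete illustration, a right-sized pod spec accounts for every container, sidecars included. The workload name, images, and values below are hypothetical; the requests are meant to track observed usage rather than old templates:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-api        # illustrative workload
spec:
  replicas: 3
  selector:
    matchLabels:
      app: checkout-api
  template:
    metadata:
      labels:
        app: checkout-api
    spec:
      containers:
        - name: app
          image: example.com/checkout-api:1.0   # placeholder image
          resources:
            requests:
              cpu: 250m        # sized from observed p95 usage, not a template
              memory: 512Mi
            limits:
              memory: 512Mi    # memory limit matches request to avoid surprises
        - name: log-forwarder  # sidecars count toward the pod's footprint too
          image: example.com/log-forwarder:1.0
          resources:
            requests:
              cpu: 50m
              memory: 64Mi
```

The scheduler reserves the sum of all containers’ requests, so an unaccounted sidecar quietly inflates every placement decision.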

To address this safely, AWS recommends what’s often called the autoscaling trio:

  • HPA to scale replicas based on demand.
  • VPA to adjust requests and limits per replica (starting in audit or recommendation mode).
  • A node autoscaler like Cluster Autoscaler or Karpenter to adjust total node count based on scheduling. 

Used together, these components continuously balance your workload demand with cluster capacity.
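A minimal sketch of the trio’s first two legs, using a hypothetical `checkout-api` deployment: the HPA scales replicas on CPU utilization, while the VPA runs in recommendation-only mode (`updateMode: "Off"`), surfacing suggested requests without applying them:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: checkout-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout-api
  minReplicas: 3
  maxReplicas: 12
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU passes 70% of requests
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: checkout-api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout-api
  updatePolicy:
    updateMode: "Off"   # recommendation mode: suggest requests, change nothing
```

Keeping the VPA in `"Off"` mode also sidesteps the known conflict of letting HPA and VPA both act on the same CPU metric at once.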

Where Komodor Offers Reliability Assurance 

This is where theory often breaks down in production. Changing requests, limits, or scaling behavior can have a real blast radius.

Komodor acts as the “make this safe in production” layer, by providing:

  • Automated workload right-sizing
  • Intelligent bin packing that addresses unevictable pods
  • Smart headroom management that reserves capacity for bursts and deployments
  • Autoscaler enrichment for faster, smarter, and more reliable scaling
  • GPU optimization that detects health risks and provides RCA for failures
  • Guardrails and auditability around what changes are allowed
  • Review and approval modes before changes go live
  • Clear visibility into how scaling decisions affect performance and reliability

Right-sizing becomes something teams can operationalize and trust.

Reduce Unused Capacity: Where Karpenter Shines on EKS

Once workloads are right-sized, the next step is reducing provisioned compute dynamically. This is a capacity problem: ideally, you add and remove nodes on demand, without leaving idle or fragmented resources.

AWS supports both Cluster Autoscaler and Karpenter on EKS, but explicitly notes that Karpenter’s model, which provisions nodes directly based on pod scheduling needs, can reduce costs and optimize cluster-wide usage more effectively in many scenarios. 

At a high level, Karpenter watches unschedulable pods, provisions right-sized nodes on demand, and removes nodes when they’re no longer needed.  When it works well, Karpenter enables faster scaling and fewer nodes.
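That loop is configured through a NodePool. The sketch below (assuming the Karpenter v1 API and an existing `EC2NodeClass` named `default`) enables consolidation so Karpenter removes or replaces nodes when a cheaper layout exists:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default          # assumes this EC2NodeClass already exists
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized  # drain nodes when a cheaper layout exists
    consolidateAfter: 1m
  limits:
    cpu: "200"   # cap the total CPU this pool may provision
```

The `disruption` block is where the savings actually happen; without consolidation enabled, Karpenter only ever adds capacity.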

The Hidden Blockers: Why Autoscalers Don’t Always Scale Down

This is the part that surprises teams: enabling autoscaling doesn’t guarantee savings.

Scale-down fails when the cluster cannot safely drain nodes. Typical causes include:

  • Restrictive PDBs (Pod Disruption Budgets) that effectively prevent evictions (or are set inconsistently with replica counts).
  • Strong affinity / anti-affinity / topology constraints that make rescheduling impossible.
  • Workloads that are “unevictable” in practice (stateful services, long-running jobs without safe disruption handling, or policies that block movement).

A good rule of thumb for SREs: autoscaling is only as good as the cluster’s ability to move workloads. If workloads can’t move, nodes can’t disappear, and costs stay high.
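The PDB case is the easiest to illustrate. Both budgets below (names and selectors are illustrative) protect the same hypothetical 3-replica service, but the first silently blocks every node drain:

```yaml
# maxUnavailable: 0 means no voluntary disruption is ever allowed:
# every pod behind this selector becomes unevictable, and consolidation stalls.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: checkout-api-pdb-blocking
spec:
  maxUnavailable: 0
  selector:
    matchLabels:
      app: checkout-api
---
# Aligned with a 3-replica deployment: availability is still protected,
# but the autoscaler can move one pod at a time during a drain.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: checkout-api-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: checkout-api
```

Auditing for `maxUnavailable: 0` (or a `minAvailable` equal to the replica count) is a quick first pass when scale-down mysteriously stops.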

Where Komodor Adds More Value Than Just Savings

EKS + Karpenter gets you the baseline mechanics. Komodor doesn’t replace EKS or Karpenter. It amplifies them.

Komodor’s added value lies in operationalizing cost optimization with reliability assurance.

Intelligent bin-packing that unlocks consolidation

Komodor can help reduce fragmentation by driving placements that maximize later consolidation, identifying unevictable pods, and isolating them so they don’t block scale-down. The objective is simple: make it easier for node autoscalers to remove nodes safely, instead of leaving savings trapped in “almost-empty” capacity.

Performance-first cost optimization with Smart Headroom

Aggressive consolidation can increase scheduling latency during spikes and rollouts. Komodor’s Smart Headroom concept is a pragmatic counterbalance: maintain a controlled, dynamic buffer so pods can schedule immediately during bursts, without defaulting to permanent overprovisioning.
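Smart Headroom is Komodor’s own mechanism, but the underlying idea can be approximated in plain Kubernetes with low-priority placeholder pods: they hold capacity warm, then get preempted the moment real workloads need the space. A hedged sketch (image and sizes are illustrative):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: headroom
value: -10                  # below the default priority, so real pods preempt these
globalDefault: false
description: "Placeholder pods evicted as soon as real workloads need capacity"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: headroom
spec:
  replicas: 2               # buffer size: tune to your expected burst
  selector:
    matchLabels:
      app: headroom
  template:
    metadata:
      labels:
        app: headroom
    spec:
      priorityClassName: headroom
      terminationGracePeriodSeconds: 0   # vacate instantly when preempted
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: "1"      # each placeholder reserves one CPU and 1Gi of burst room
              memory: 1Gi
```

The static version of this pattern trades a fixed cost for burst latency; the “smart” part is sizing that buffer dynamically rather than permanently.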

Safe automation with guardrails

Cost optimization is a sequence of changes, and SRE teams need control over how those changes are applied. Komodor offers both autonomous and co-pilot modes, enabling real change management: you can keep the automation bounded, reviewed, and aligned to your risk posture.

Savings that hold up over time

The difference between a short-term win and a durable one is whether the savings create reliability debt. In Kubernetes, cost and reliability are tied together more tightly than most teams want to admit. The moment performance gets shaky, the “fix” is usually more capacity: bump requests, add nodes, widen headroom, relax consolidation, or freeze changes. It’s a rational response under pressure. It’s also how savings disappear, and how overprovisioning becomes the default.

Komodor is built to stop that loop. It ties optimizations to operational behavior by tracking what blocks workload movement, what causes regressions, and how autoscaling and consolidation behave under real conditions (spikes, rollouts, noisy neighbors). Because recommendations are connected to change history, autoscaling behavior, and incident context, teams can answer the questions that actually determine whether an optimization is safe: What changed? What’s the blast radius? Did scheduling latency increase? Did rollouts get noisier? Did error rates move?

The result is fewer incidents caused by “unsafe savings,” faster troubleshooting, fewer tickets, and lower MTTR. But just as important: teams gain confidence to keep optimizing instead of reverting to permanent “just-in-case” capacity. These gains don’t show up as a neat line item on the AWS bill—but over time, they’re what make cost optimization sustainable instead of cyclical.

A Practical EKS Playbook SREs Can Actually Run

  1. Baseline
    Profile workload resource patterns and define a cost visibility baseline.
  2. Right-size
    Bring requests/limits back to reality (including sidecars). Introduce HPA/VPA as a coordinated strategy, using conservative rollout practices for VPA.
  3. Adopt Karpenter by workload class
    Start with one NodePool for a well-understood slice of workloads. Expand gradually as you validate scheduling and consolidation behavior.
  4. Remove scale-down blockers
    Audit PDBs and eviction behavior. Fix “can’t move” constraints and identify the workloads that require special handling.
  5. Operationalize savings with Komodor
    Improve consolidation outcomes, preserve burst performance with Smart Headroom, and apply guardrails so optimization becomes continuous and not a one-time campaign.
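Step 3 of the playbook (“adopt Karpenter by workload class”) might look like the sketch below: a NodePool scoped to a hypothetical interruption-tolerant batch tier, tainted so only opted-in pods land on it, and allowed to use Spot (assuming the Karpenter v1 API and an existing `EC2NodeClass` named `default`):

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: batch-spot
spec:
  template:
    metadata:
      labels:
        workload-class: batch     # illustrative label for this workload slice
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default             # assumes this EC2NodeClass already exists
      taints:
        - key: workload-class
          value: batch
          effect: NoSchedule      # only pods that tolerate this taint land here
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]  # Karpenter generally favors Spot when both are allowed
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
```

Starting with a tainted, well-understood slice like this lets you validate consolidation and Spot-interruption behavior before expanding Karpenter’s footprint.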

Conclusion: The Best Way to Maximize EKS

The most effective EKS cost optimization strategy isn’t chasing fewer nodes. It’s building a system where right-sizing, autoscaling, and reliability reinforce each other.

AWS provides the primitives. Karpenter provides the scaling engine. Komodor makes it all work safely in production.

Coming next in this blog series: How to amplify your cost optimization savings for AKS and GKE. 

Stay tuned! If you’re building a durable cost optimization program, our ebook, Optimizing the Budget: Cost Management for Kubernetes Applications, offers a step-by-step guide to right-sizing, reducing unused capacity, and keeping performance and reliability intact.