The Two-Sided Scheduling Problem: Reaching the Next Layer of Cloud Savings

You’ve deployed Karpenter or Cluster Autoscaler and tightened your resource requests, but while you saw an initial dip in your cloud bill, your savings have flatlined.

Organizations that thought they had the fundamentals of cloud cost under control are now seeing stagnation. The problem isn’t that they need another FinOps tool or better visibility. The problem is that most enterprise cloud cost optimization strategies are fundamentally reactive.

The remaining 20% to 30% of your cluster waste isn’t hiding inside your workloads; it’s structurally locked into the cluster by Kubernetes itself. To understand the causes of this idle capacity, it’s important to first understand a fundamental tension in Kubernetes architecture. We call it the two-sided problem.  

The Two-Sided Problem: Why Autoscalers Fail to Consolidate

Put simply, your Kubernetes scheduler and autoscaler operate in two entirely different realities. 

This is not a bug; both systems were designed to behave like this, but their reactive nature has the unintended consequence of creating cloud waste.

On one side, you have the scheduler that optimizes for what’s happening now. When a pod needs to be placed, it finds a node that fits right now, and puts it there. It has no awareness of future cluster state, no knowledge of which nodes the autoscaler is planning to drain. Pods land wherever they fit, including on nodes that should be emptied in the near future.

On the other side, you have the autoscaler. It’s trying to consolidate, but it inherits the placement decisions the scheduler already made. Its consolidation attempts run into obstacles the scheduler quietly created. And critically, the autoscaler is reactive by design. It can only act on the world as it is. It cannot fix what the scheduler got wrong upstream.

The expensive outcome of this tension is capacity that can never be fully utilized. For example, short-lived and long-lived workloads land on the same node, so the node cannot be drained once the short-lived pods finish; or memory-heavy workloads are stacked together on general-purpose instances, leaving the node’s CPU capacity idle. As these cases accumulate, the autoscaler’s effective optimization surface approaches zero.
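To make the stranded-CPU case concrete, here is a small illustrative calculation. The node size and pod requests are hypothetical numbers chosen for the example, not measurements from a real cluster.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    cpu: float      # requested vCPUs
    mem_gib: float  # requested memory in GiB


def stranded(node_cpu, node_mem_gib, pods):
    """Report how much of a node its pods' requests actually claim."""
    used_cpu = sum(p.cpu for p in pods)
    used_mem = sum(p.mem_gib for p in pods)
    return {
        "cpu_utilization": used_cpu / node_cpu,
        "mem_utilization": used_mem / node_mem_gib,
        "stranded_cpu": node_cpu - used_cpu,
    }


# Hypothetical 16 vCPU / 64 GiB general-purpose node running
# four memory-heavy pods that each request 1 vCPU and 15 GiB.
report = stranded(16, 64, [Pod(1, 15)] * 4)
print(report)
```

Memory is almost fully requested, so nothing else of meaningful size fits on the node, yet three quarters of the CPU you are paying for sits idle.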

The fix isn’t a better autoscaler. It is to proactively manage what happens upstream, at scheduling time.

Predictive Placement: Proactive Consolidation

This is where Predictive Placement comes in. It does not replace your existing scheduler or your autoscaler. It complements your existing setup by sitting in front of the scheduler at admission time, acting proactively. There are no CI/CD changes and no deployment workflow changes required.

Predictive Placement works through a continuous optimization loop, utilizing four core mechanisms:

1. Continuous Simulation

Every 60 seconds, Komodor simulates a full cluster drain, which keeps its picture of the cluster accurate and up to date. It uses the result to define the ideal state of the cluster and to sort nodes into actionable drain horizons. Each node is classified into one of three states:

  • Removable (the autoscaler should drain this)
  • Prioritized for Removal (ranked 1 to 100 based on readiness for removal)
  • Locked (running long-lived or unevictable pods)
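The three states above can be sketched as a toy drain simulation. This is not Komodor’s implementation: the first-fit placement, the CPU-only model, and the way the 1 to 100 rank is derived (the share of requested CPU that could be rehomed) are all assumptions made for illustration, and the sketch treats only unevictable pods as locking a node.

```python
from dataclasses import dataclass, field
from enum import Enum


class DrainState(Enum):
    REMOVABLE = "removable"
    PRIORITIZED = "prioritized for removal"
    LOCKED = "locked"


@dataclass
class Pod:
    name: str
    cpu: float
    evictable: bool = True


@dataclass
class Node:
    name: str
    cpu_capacity: float
    pods: list = field(default_factory=list)

    @property
    def free_cpu(self):
        return self.cpu_capacity - sum(p.cpu for p in self.pods)


def classify(node, others):
    """Simulate draining `node` onto `others`; return (state, rank or None)."""
    if any(not p.evictable for p in node.pods):
        return DrainState.LOCKED, None
    free = [o.free_cpu for o in others]
    placed, moved_cpu = 0, 0.0
    # First-fit decreasing: try to rehome the largest pods first.
    for pod in sorted(node.pods, key=lambda p: p.cpu, reverse=True):
        for i, room in enumerate(free):
            if pod.cpu <= room:
                free[i] -= pod.cpu
                placed += 1
                moved_cpu += pod.cpu
                break
    if placed == len(node.pods):
        return DrainState.REMOVABLE, None
    total_cpu = sum(p.cpu for p in node.pods)
    # Hypothetical rank: percentage of requested CPU that found a new home.
    rank = max(1, round(100 * moved_cpu / total_cpu)) if total_cpu else 1
    return DrainState.PRIORITIZED, rank


a = Node("a", 4, [Pod("x", 1), Pod("y", 1)])
b = Node("b", 4, [Pod("z", 1)])
c = Node("c", 4, [Pod("db", 2, evictable=False)])
e = Node("e", 8, [Pod("p1", 2), Pod("p2", 6)])

print(classify(a, [b, c]))  # both pods fit elsewhere
print(classify(c, [a, b]))  # holds an unevictable pod
print(classify(e, [b]))     # only part of the node can be rehomed
```

A real simulation would also have to respect memory, affinity rules, PodDisruptionBudgets, and topology constraints; the point here is only the shape of the classification.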

2. Intelligent Pod Placement (Steering)

Using the simulation output, an admission webhook steers new pods away from drain candidates at scheduling time, before bad placement happens, and explicitly guides pods toward nodes that are not slated for removal.
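As a rough sketch of what steering at admission time can look like, the snippet below builds the JSON Patch a Kubernetes mutating admission webhook could return to add a soft node-affinity preference against drain candidates. The article does not say how Komodor’s webhook expresses its steering, so the use of a preferred (not required) affinity term, the `kubernetes.io/hostname` label as the selector, and the function names are assumptions.

```python
import base64
import json

PREFERRED = "preferredDuringSchedulingIgnoredDuringExecution"


def steering_patch(pod, drain_candidates, weight=100):
    """Return JSON Patch ops adding a soft 'avoid these nodes' preference."""
    if not drain_candidates:
        return []
    term = {
        "weight": weight,  # Kubernetes accepts weights from 1 to 100
        "preference": {
            "matchExpressions": [{
                "key": "kubernetes.io/hostname",
                "operator": "NotIn",
                "values": sorted(drain_candidates),
            }]
        },
    }
    base = "/spec/affinity"
    affinity = pod.get("spec", {}).get("affinity")
    if affinity is None:
        return [{"op": "add", "path": base,
                 "value": {"nodeAffinity": {PREFERRED: [term]}}}]
    node_affinity = affinity.get("nodeAffinity")
    if node_affinity is None:
        return [{"op": "add", "path": f"{base}/nodeAffinity",
                 "value": {PREFERRED: [term]}}]
    if node_affinity.get(PREFERRED) is None:
        return [{"op": "add", "path": f"{base}/nodeAffinity/{PREFERRED}",
                 "value": [term]}]
    # Append to the pod's existing preferences instead of replacing them.
    return [{"op": "add", "path": f"{base}/nodeAffinity/{PREFERRED}/-",
             "value": term}]


def admission_response(uid, patch):
    """Wrap the patch in an admission.k8s.io/v1 AdmissionReview response."""
    response = {"uid": uid, "allowed": True}
    if patch:
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(
            json.dumps(patch).encode()).decode()
    return {"apiVersion": "admission.k8s.io/v1",
            "kind": "AdmissionReview",
            "response": response}


pod = {"metadata": {"name": "web"}, "spec": {"containers": []}}
patch = steering_patch(pod, {"node-b", "node-a"})
print(json.dumps(admission_response("example-uid", patch), indent=2))
```

A preference rather than a hard requirement keeps the pod schedulable even if every remaining node is full, which matches the idea of guiding the scheduler instead of overriding it.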

3. Unevictable Pod Consolidation

To solve the issue of scattered blockers, Komodor detects unevictable pods at admission. It automatically groups these unevictable pods onto designated “keeper nodes”. By clustering the blockers together, the remaining nodes are kept clean. The autoscaler can now drain them freely, maximizing the number of nodes that become viable candidates for removal.
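For a sense of what detecting an unevictable pod at admission might involve, here is a minimal sketch that checks a few well-known signals on a pod manifest. Which signals Komodor actually inspects is not stated in this article; the list below is an assumption based on documented Cluster Autoscaler and Karpenter behavior, and it is far from exhaustive (it ignores PodDisruptionBudgets and local storage, for example).

```python
def unevictable_reasons(pod):
    """Return the reasons a pod manifest is likely to block node removal."""
    meta = pod.get("metadata", {})
    annotations = meta.get("annotations") or {}
    reasons = []
    # Cluster Autoscaler honors this annotation when choosing nodes to remove.
    if annotations.get("cluster-autoscaler.kubernetes.io/safe-to-evict") == "false":
        reasons.append("annotated safe-to-evict=false")
    # Karpenter will not voluntarily disrupt a node running such a pod.
    if annotations.get("karpenter.sh/do-not-disrupt") == "true":
        reasons.append("annotated do-not-disrupt=true")
    # By default, Cluster Autoscaler does not remove nodes that run pods
    # without a controller, since nothing would recreate them.
    if not meta.get("ownerReferences"):
        reasons.append("no controller owner")
    return reasons


def is_unevictable(pod):
    return bool(unevictable_reasons(pod))


pinned = {
    "metadata": {
        "name": "batch-job",
        "annotations": {"karpenter.sh/do-not-disrupt": "true"},
        "ownerReferences": [{"kind": "Job", "name": "batch"}],
    }
}
print(unevictable_reasons(pinned))
```

Once a pod is flagged this way, a webhook could steer it toward a keeper node with the same kind of affinity mutation used for ordinary steering.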

4. AI Pattern Recognition

Simulations map the current state, but AI maps the future. Using a proprietary pattern recognition model, Komodor learns workload and scheduling patterns over time. It groups workloads with complementary resource profiles, for example by placing CPU-heavy pods on the same nodes as memory-hungry pods to maximize utilization. It also learns the expected lifespan of pods and groups long-lived workloads together, so that a long-lived pod doesn’t anchor a node that would otherwise empty out as its short-lived pods finish. This turns reactive simulation into proactive guidance.
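The complementary-packing idea can be illustrated with a plain heuristic. To be clear, this is a stand-in for the proprietary model described above, not a reconstruction of it: it simply picks the node whose CPU and memory utilization end up closest to each other after placement. All node names and figures are hypothetical.

```python
def pick_node(pod, nodes):
    """Choose the node whose CPU/memory utilization is most balanced
    after placing `pod`; return None if the pod fits nowhere."""
    best, best_gap = None, None
    for name, n in nodes.items():
        cpu = n["cpu_used"] + pod["cpu"]
        mem = n["mem_used"] + pod["mem"]
        if cpu > n["cpu_cap"] or mem > n["mem_cap"]:
            continue  # pod does not fit on this node
        gap = abs(cpu / n["cpu_cap"] - mem / n["mem_cap"])
        if best_gap is None or gap < best_gap:
            best, best_gap = name, gap
    return best


# Two hypothetical 16 vCPU / 64 GiB nodes with lopsided usage.
nodes = {
    "mem-heavy": {"cpu_cap": 16, "mem_cap": 64, "cpu_used": 2, "mem_used": 48},
    "cpu-heavy": {"cpu_cap": 16, "mem_cap": 64, "cpu_used": 12, "mem_used": 8},
}
cpu_bound_pod = {"cpu": 4, "mem": 4}
print(pick_node(cpu_bound_pod, nodes))
```

The CPU-bound pod lands next to the memory-hungry workloads, using CPU that would otherwise stay stranded on that node. A learned model can go further by also predicting how long each pod will live, which a static heuristic like this cannot do.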

By continuously steering pods towards the optimal nodes for consolidation, Predictive Placement proactively fixes the Kubernetes scheduling gap for incoming workloads.

Capacity Intelligence: Clearing the Board of Blockers 

Optimizing the placement of new pods is only half the battle, though. To achieve maximum consolidation, you must also address the existing optimization blockers already present in your environment.

In large-scale clusters, these blockers are incredibly difficult to isolate manually. Workload specs and configurations that made sense on Day 1 can have compounding, unintended consequences at scale. A single parameter can effectively paralyze your autoscaler, stranding capacity and costing thousands of dollars a month. 

To clear the board of these hidden traps, Komodor has introduced Capacity Intelligence. This AI-powered capability has a rich contextual understanding of every workload running in the cluster and of how those workloads interact with each other. It continuously scans the cluster for optimization blockers and misconfigurations, with reliability always paramount. It chains the evidence from the workload spec or the autoscaler configuration directly to the real-world outcome: idle nodes with wasted capacity.

When a misconfiguration is detected, Capacity Intelligence attaches a quantified, human-readable dollar-per-month impact and provides the full root-cause context.
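The arithmetic behind a dollar-per-month figure can be as simple as the sketch below. The formula, the 730-hour month, and the example price are assumptions for illustration; the article does not describe how Capacity Intelligence computes its impact numbers.

```python
HOURS_PER_MONTH = 730  # 8,760 hours per year divided by 12


def monthly_waste_usd(stranded_nodes, hourly_price_usd):
    """Estimate the monthly cost of nodes a blocker keeps from being removed."""
    return round(stranded_nodes * hourly_price_usd * HOURS_PER_MONTH, 2)


# Hypothetical: one misconfiguration pins 3 nodes priced at $0.40 per hour.
print(monthly_waste_usd(3, 0.40))
```

Tying the figure to a specific blocker, instead of to the cluster as a whole, is what lets a team rank fixes by payoff.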

Every potential remediation path is tested against a set of simulations prior to recommendation to ensure there’s no risk of availability or performance degradation. Human-in-the-loop approval is the default, while also allowing the configuration of guardrails and policies for autonomous remediation.

The findings are translated into one-click, reliability-aware fixes that are suggested and applied through the platform. Each fix is generated by Klaudia, Komodor’s Agentic AI, which provides the exact instructions needed. Crucially, every recommendation is validated against reliability checks before it is surfaced, to guarantee zero reliability degradation.

A Continuous Optimization Loop

Capacity Intelligence and Predictive Placement are designed to operate together as a continuous optimization cycle. Detection without prevention means waste reappears the next time a pod is scheduled. Prevention without detection means existing waste sits in your cluster forever, hidden beneath different configurations and setups.

By injecting deep contextual awareness and proactive intelligence into the gap between the scheduler and the autoscaler, Komodor moves Kubernetes cost control away from typically reactive FinOps approaches. It delivers a reliability-first optimization suite that allows platform teams to safely automate cost actions. With support for unattended optimization modes, teams can achieve up to 80% cloud infrastructure cost savings without compromising performance.

Ready to start reclaiming the cluster waste that other solutions leave behind? Discover how Komodor drives deeper savings in the most complex cloud environments while actively safeguarding performance SLAs. Find out more.