Komodor Blog

All articles
Welcome to Komodor's blog, your go-to resource for insights on all things Kubernetes. Stay tuned for expert advice, in-depth tutorials, and the latest industry trends to help you throughout your K8s journey.

Komodor Triples Revenue as AI-Driven Site Reliability Engineering (SRE) Reshapes Cloud-Native Operations

2 min read

Company doubled its share of Fortune 500 customers with surging demand for AI-powered reliability and cost control.

Kubernetes Cost Optimization Done Right

8 min read

Overprovisioning is draining your cloud budget. Kubernetes cost optimization done right means fixing root causes, not just reading dashboards.

Rightsizing & Handling Resource Allocation in Kubernetes

8 min read

Pods crashing? Resources wasted? Master resource allocation in Kubernetes with proven rightsizing strategies that work in production.

When AI Writes the Code, Who Keeps Production Running?

6 min read

The acceleration of AI-assisted development has created an asymmetric problem. Developers got their force multiplier. SREs are still using the same playbook they had five years ago, except now they're responsible for exponentially more code, written by tools that prioritize speed over operational clarity.

AI SRE in Practice: Accelerating Engineer Onboarding with Contextual Expertise

6 min read

Part 7 of our AI SRE in Practice Series. This scenario walks through how AI-augmented knowledge transfer changes the onboarding experience, using a real example from a containers team implementing changes to HiveMQ infrastructure.

AI SRE in Practice: Diagnosing AWS CNI IP Exhaustion Before Widespread Outage

6 min read

Part 6 of our AI SRE in Practice Series. In this scenario, we walk through an AWS CNI IP exhaustion incident in which 15 services experienced outages before platform teams identified the root cause.

Contextualizing AI SRE: How Klaudia Leverages Organizational Knowledge

5 min read

For an AI SRE to be safe and effective, it cannot rely on generic training data alone. It needs context. Klaudia solves this through a dual-layer approach to context engineering: the Organization Blueprint and the Knowledge Base Integration.

AI SRE in Practice: Tracing Policy Changes to Widespread Pod Failures

6 min read

Part 5 of our AI SRE in Practice Series. This scenario walks through a policy enforcement incident in which a seemingly minor configuration change caused widespread pod failures, and understanding the scope and root cause required deep investigation across the cluster.

From Blueprint to Production: Building a Kubernetes MCP Server

3 min read

This post details how to build an MCP server that connects AI agents (like Claude Desktop or Cursor) to a Kubernetes cluster, enabling natural language control over kubectl operations.