Speakers Deck available for download
Udi: Hi everyone, and welcome to Kubernetes Health Management with Komodor. Today, we’re going to break down the concept of Kubernetes health—what it actually means to manage it—and then we’ll show you how we do it at Komodor. And there’s no better person to walk us through this topic than Danielle Inbar. So, welcome, Danielle!
Danielle: Thanks, happy to be here!
Udi: Danielle is the Director of Product at Komodor. She’s worked extensively in cloud-native environments, previously at Snyk, where she focused on container security and open-source products. She also worked at Spot.io, which was later acquired by NetApp, as a product manager specializing in cloud cost optimization, specifically for Kubernetes.
Danielle: That’s right. And way back, I was a software engineer at Motorola and started my career in QA at a company called Vingh. Outside of work, I’m a mom to two wonderful kids—Arel and Yuval—and our office dog, Charlie, who’s basically a celebrity at Komodor.
Udi: That’s awesome. Charlie is definitely the office star! So, let’s jump right into it. Kubernetes observability is a hot topic right now. Where do things stand in the industry, and why do people say that traditional monitoring solutions are broken when it comes to Kubernetes?
Danielle: Yeah, great question. So, we all know Kubernetes is complex. And it’s not just us saying this—Tim Hockin, co-founder of Kubernetes, has also said that its complexity is only increasing.
The problem is that organizations are struggling to move forward. They want to release features faster, but they’re spending more and more time managing Kubernetes itself. The complexity becomes a tax on innovation, velocity, and scale. If engineers are constantly firefighting Kubernetes issues, they’re not building new features or improving their products.
Udi: Right, it’s a double-edged sword. Kubernetes is powerful and flexible, but that power comes with a cost. You get all these capabilities, but you also have to pay the price of maintaining and troubleshooting it. Some companies even have entire teams dedicated just to managing Kubernetes.
Danielle: Exactly. And traditionally, organizations thought about infrastructure monitoring in two layers: the application layer and the infrastructure layer.
But Kubernetes changes everything because it doesn’t fit neatly into either layer. It sits in between. It has one foot in the application layer (with workloads like pods, deployments, and jobs) and another in the infrastructure layer. And it introduces new layers—like configuration, networking, storage, add-ons, operators, and CRDs—that make troubleshooting much harder.
Udi: And this problem only gets worse as organizations scale. If you’re running just one cluster, maybe you can manage it. But when you scale to dozens or even hundreds of clusters—especially in a hybrid or multi-cloud environment—troubleshooting becomes exponentially harder.
Danielle: Absolutely. That’s why so many companies struggle to maintain Kubernetes health. Once you scale up, small misconfigurations or issues that seem minor at first can cascade into major incidents. You start seeing pods restarting unexpectedly, jobs failing, nodes under high pressure—all because something upstream wasn’t configured properly.
And this is why the traditional approach to monitoring Kubernetes doesn’t work. Engineers spend hours correlating logs, metrics, and configurations across multiple layers, trying to piece together what went wrong. It’s overwhelming.
Udi: Right. And this brings us to the big question: What does Kubernetes health really mean?
Danielle: Kubernetes health isn’t just about whether a pod is running or a node is online. It’s about understanding the bigger picture—how all the components interact and how small misconfigurations can lead to cascading failures.
Think of Kubernetes as the operating system of the cloud. It’s a platform that runs other platforms. But most engineers only focus on what’s visible—the tip of the iceberg. Underneath, there’s a complex ecosystem of configurations, dependencies, and resources that all need to be in sync.
Udi: And if something is off, the whole system can become unstable.
Danielle: Exactly. Some of the biggest challenges in Kubernetes health include misconfigurations that cascade into larger failures, hidden dependencies between resources, and the sheer number of layers—configuration, networking, storage, add-ons—that all have to stay in sync.
Udi: Let’s walk through an example. Say a developer sees a web service is down. What’s the traditional troubleshooting process?
Danielle: First, the developer inspects the Kubernetes deployment and sees that all the pods are failing. They check the logs—something’s preventing the service from connecting to the database. The database itself looks fine, but the connections dropped to zero.
After an hour of frustration, they escalate the issue to DevOps. The DevOps engineer retraces all the steps, then checks the network policies—everything seems fine. Finally, they inspect the certificates and realize that a TLS certificate expired, causing authentication failures.
This entire process can take two hours or more. And all of it could have been avoided if they had better visibility into certificate health.
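The certificate-visibility gap Danielle describes can be sketched in a few lines of Python. This is a minimal, illustrative expiry check using only the standard library—not Komodor's implementation, and the hostname in the usage comment is a placeholder:

```python
# Hedged sketch: proactively checking a TLS certificate's remaining
# lifetime, the kind of visibility that would have caught the expired
# certificate in this example before the service went down.
import ssl
import socket
from datetime import datetime, timezone

def days_until_expiry(not_after: str) -> int:
    """Parse the 'notAfter' field as returned by ssl.getpeercert(),
    e.g. 'Jun 01 12:00:00 2030 GMT', and return days remaining
    (negative if the certificate has already expired)."""
    expires = datetime.strptime(not_after, "%b %d %H:%M:%S %Y %Z")
    expires = expires.replace(tzinfo=timezone.utc)
    return (expires - datetime.now(timezone.utc)).days

def check_endpoint(host: str, port: int = 443, warn_days: int = 14) -> None:
    """Connect to an endpoint, read its certificate, and warn if it
    expires within `warn_days` days."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    remaining = days_until_expiry(cert["notAfter"])
    if remaining < warn_days:
        print(f"WARNING: {host} certificate expires in {remaining} days")
    else:
        print(f"OK: {host} certificate valid for {remaining} days")

# Usage (placeholder host): check_endpoint("api.example.com")
```

Running a check like this on a schedule surfaces an expiring certificate days in advance, instead of discovering it mid-outage after the database connections drop to zero.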
Udi: So, how does Komodor approach this differently?
Danielle: With Komodor, this investigation would take five minutes instead of two hours. Our Kubernetes Health Management platform automatically detects issues like failed certificate renewals and alerts you before they expire.
Instead of waiting for a service outage, teams get an early warning and can fix issues before they impact users.
Udi: And that’s the key difference. Most monitoring tools react after something breaks. Komodor helps teams be proactive, reducing downtime and improving reliability.
Udi: So, to wrap things up, we covered why Kubernetes complexity makes traditional monitoring fall short, what Kubernetes health actually means, and how a proactive approach turns hours of troubleshooting into minutes.
Danielle: Exactly. And Komodor isn’t just about reliability—it also includes cost optimization, user management, and role-based access control to streamline Kubernetes operations.
Udi: Awesome. Thanks, Danielle, for the deep dive! And thanks to everyone for joining. We’ll now open it up for Q&A.