KubeCon 2022 just concluded with plenty of exciting announcements, including our new open-source project, Helm-Dashboard, and we have more news to share.
Closing the Troubleshooting Loop Using Komodor Actions
Komodor’s mission is to take the complexity out of Kubernetes troubleshooting. Setting out, this meant building a platform that would streamline root cause analysis: one that quickly and intuitively identifies when anything goes wrong, answers the “what was changed, by whom, and when?” questions by taking inventory of all changes in their relevant context, and pinpoints the cause of fires in production.
However, once the root cause was detected, users still had to make a few hops and run manual commands in order to remediate. Although our users saw a decrease of over 70% in MTTR, this still fell short of our vision of seamless troubleshooting and smooth K8s operations.
Today, we’ve added the missing piece to close the troubleshooting loop of detection, investigation, and remediation, all within the Komodor platform. This new feature includes actions such as Restart, Scale, Delete, Edit Resource, Rollback, Drain/Cordon Node, and more.
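To make the "manual commands" concrete, here is a minimal sketch of what some of these actions look like when typed by hand. The kubectl subcommands are standard; the mapping from action names to commands, the `manual_command` helper, and the sample names are our own illustration and say nothing about how Komodor executes actions internally.

```python
# Hypothetical mapping from an action name to the kubectl command a user
# would otherwise run by hand. The kubectl subcommands are real; the
# mapping itself is an assumption made for illustration only.
KUBECTL_TEMPLATES = {
    "restart": ["kubectl", "rollout", "restart", "deployment/{name}", "-n", "{namespace}"],
    "scale": ["kubectl", "scale", "deployment/{name}", "--replicas={replicas}", "-n", "{namespace}"],
    "rollback": ["kubectl", "rollout", "undo", "deployment/{name}", "-n", "{namespace}"],
    "cordon": ["kubectl", "cordon", "{name}"],
    "drain": ["kubectl", "drain", "{name}", "--ignore-daemonsets"],
}


def manual_command(action, **params):
    """Return the argv list for the manual equivalent of an action."""
    return [part.format(**params) for part in KUBECTL_TEMPLATES[action]]


if __name__ == "__main__":
    # "checkout" and "prod" are made-up names for the example.
    print(" ".join(manual_command("scale", name="checkout", namespace="prod", replicas=4)))
```

Each of these is one more thing to remember and type correctly in the middle of an incident, which is the context switching a one-click action is meant to remove.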
Our goal with “Actions”? Simple. We want to limit the context switching between tools and remove the current K8s knowledge barrier for non-K8s experts by making everything available in one place, only a click away.
Read our documentation for a complete guide to using Komodor Actions.
Whenever one of our Monitors detects an issue, be it on the infra or app level, Komodor Playbooks automatically runs a series of checks and provides actionable insights for remediation. Now, with Komodor Actions, those insights come ready with a suggested action that can be executed directly with a single click.
For instance, following an alert on Slack, you can jump directly to the relevant service on Komodor’s platform, see an availability issue on the timeline, and click on it to discover that only 2/4 replicas are available and that Pods are crashing due to an OOMKilled error. Komodor then not only suggests increasing memory resources but also lets you act immediately by clicking the ‘Configure Resource’ button.
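For readers curious about what such a check involves, here is a rough sketch of how an OOMKilled crash can be spotted in a Pod's status. The field names (`containerStatuses`, `lastState.terminated.reason`) follow the Kubernetes Pod status schema; the sample Pod, the helper functions, and the 1.5x sizing heuristic are assumptions for illustration, not Komodor's actual logic.

```python
def oom_killed_containers(pod):
    """Return names of containers whose last termination reason was OOMKilled."""
    names = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        terminated = status.get("lastState", {}).get("terminated") or {}
        if terminated.get("reason") == "OOMKilled":
            names.append(status["name"])
    return names


def suggest_memory_limit_mib(current_mib, factor=1.5):
    """Hypothetical heuristic: raise the memory limit by a fixed factor."""
    return int(current_mib * factor)


# Made-up Pod object, trimmed to the fields the check reads.
SAMPLE_POD = {
    "metadata": {"name": "checkout-7d9f"},
    "status": {
        "containerStatuses": [
            {
                "name": "app",
                "restartCount": 5,
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            },
            {"name": "sidecar", "restartCount": 0, "lastState": {}},
        ]
    },
}

if __name__ == "__main__":
    for name in oom_killed_containers(SAMPLE_POD):
        print(f"{name}: OOMKilled, consider raising the limit to "
              f"{suggest_memory_limit_mib(256)}Mi")
```

Doing this by hand means fetching the Pod, reading its status, and then editing the workload's resource limits, which is the sequence the suggested action collapses into one click.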
RBAC & Komodor Actions
Allowing non-experts to easily manipulate and change K8s resources empowers them to troubleshoot independently and reduces toil on DevOps and SRE teams. But, as the saying goes, “with great power comes great responsibility”, and you can’t grant full access without some restrictions.
We’re aware of the fine line between giving too many permissions and giving too few. DevOps and SREs don’t want to do everything alone and become a bottleneck, but they are also reluctant to give non-K8s experts too much autonomy and direct shell access, which could result in a ‘bull in a china shop’ scenario.
Read our documentation for an overview of RBAC in Komodor.
We also know that, in most organizations, too many changes to K8s resources go unaudited. DevOps teams lack visibility into developer access and cannot fully track actions taken within the cluster itself.
Hence, we’ve added RBAC support. It allows DevOps, SRE, and Platform teams to customize policies and access control for different teams, as they see fit, without losing track of changes across their Kubernetes infrastructure. This solves the pain of not knowing if and when someone made a change to the cluster, who did it, and what the ramifications were.
Thanks to Komodor Actions and RBAC, the troubleshooting loop can finally be closed: both DevOps and non-K8s experts can detect, investigate, and remediate issues independently, directly through Komodor’s platform and with minimal friction.
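As a simplified illustration of the idea, the sketch below pairs a per-role policy check with an audit trail, so every attempted action is both gated and recorded. The role names, policy layout, and helper functions are hypothetical and do not reflect Komodor's actual policy format; see the documentation for the real thing.

```python
from datetime import datetime, timezone

# Hypothetical policies: which actions each role may run, and where.
POLICIES = {
    "developer": {"actions": {"restart", "scale"}, "namespaces": {"staging"}},
    "sre": {
        "actions": {"restart", "scale", "rollback", "drain"},
        "namespaces": {"staging", "prod"},
    },
}

# In-memory audit trail; a real system would persist this.
AUDIT_LOG = []


def is_allowed(role, action, namespace):
    """Check whether a role may run an action in a namespace."""
    policy = POLICIES.get(role)
    if policy is None:
        return False
    return action in policy["actions"] and namespace in policy["namespaces"]


def run_action(user, role, action, namespace):
    """Record every attempt (allowed or denied) and return the decision."""
    allowed = is_allowed(role, action, namespace)
    AUDIT_LOG.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "namespace": namespace,
        "allowed": allowed,
    })
    return allowed


if __name__ == "__main__":
    run_action("dana", "sre", "rollback", "prod")
    run_action("lee", "developer", "restart", "prod")
    for entry in AUDIT_LOG:
        print(entry["user"], entry["action"], entry["namespace"], entry["allowed"])
```

Logging denied attempts alongside allowed ones is a deliberate choice here: the audit trail then answers "who tried what, and when" as well as "who changed what".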