How a Fortune 500 Company Used Komodor to Migrate to K8s and Scale-up Operations
501 - 1,000 employees
13 clusters (6 prod)
About the company
The Fortune 500 company (referred to here as F500) is the leader in cloud-delivered smart manufacturing solutions, empowering the world’s manufacturers to make awesome products. Their platform gives manufacturers the ability to connect, automate, track, and analyze every aspect of their business to drive transformation.
The Smart Manufacturing Platform includes solutions for manufacturing execution (MES), ERP, quality, supply chain planning and management, Industrial IoT, and analytics to connect people, systems, machines, and supply chains, enabling them to lead with precision, efficiency, and agility.
The Platform Team at F500 went cloud-native to enjoy the speed, scale, and resiliency of containerized environments, and started gradually migrating workloads to K8s over several years. They’ve created a sprawling infrastructure using Azure DevOps and Flux for CI/CD, internal CLI tools and Rancher for cluster management, Prometheus and Grafana for monitoring, GitHub for version control, OpsGenie for alerting, LaunchDarkly for feature flags, and a massive SQL server to store business data.
Despite being early adopters of cloud-native, they maintained a traditional infrastructure and were reluctant to migrate mission-critical workloads due to the scarcity of internal K8s expertise, the steep learning curve for devs, and the intricate nature of their existing infra.
Thousands of factories in 37 different countries rely on F500’s Smart Manufacturing Platform to keep their production lines running reliably and securely. Going fully cloud-native without proper guardrails for developers posed a risk to the company’s SLAs, as delayed incident response could have cascading effects of global proportions.
As F500 was in the process of being acquired by a huge conglomerate, the Platform Team was looking for ways to modernize their legacy infrastructure and migrate more and more workloads to K8s in order to support the inevitable post-acquisition scale-up.
Part of the modernization and scale-up also meant shifting some ops responsibilities left to F500’s 150+ developers. That required the platform engineers to abstract away some of the complexity baked into K8s by providing devs with tools that make it easy to deploy a cloud-native app without knowing the ins and outs of K8s.
On top of that, the team wanted to gain broader visibility that would allow them to fully understand the impact of every event in the system and trace back how they got there. They needed a comprehensive unified platform for K8s maintenance, monitoring, and troubleshooting.
Komodor provided F500 with a single unified platform for K8s operations, with multi-cluster 360 observability. This enabled F500 to:
- Gain a system-wide view of all events, including health changes, alerts, code deployments, infra modifications, config updates, and third-party services, to fully understand changes that impact the application’s state or behavior, in the right context. Komodor automatically adjusts timestamps to different time zones and helps global teams quickly correlate distributed events.
- Quickly spot unhealthy services. Komodor offers a filterable, multi-cluster view of all deployments, which boosts confidence that a workload is ready to receive traffic or helps pinpoint the root cause of an unhealthy state.
- Visualize when Secrets and ConfigMaps are updated in the event/timeline interface (including Consul Template rendered changes that are stored as a ConfigMap). Every Secret and ConfigMap change is tracked and audited by Komodor during the deployment that updates the ConfigMap.
- Empower developers to independently troubleshoot K8s and own their code end-to-end. Komodor presents alerts on each service’s timeline based on internal events (K8s events such as deploys or health changes) and on external services such as Datadog, Grafana, New Relic, and OpsGenie, consolidating all the relevant data in one place with an extra layer of contextual insights and remediation instructions.