13 Clusters, ~150 Nodes, 2,861 Services
The company, part of a Fortune 500 conglomerate, has served restaurants for over 25 years, building a network of more than 55,000 restaurants, bars, and wineries around the world. For restaurants, it’s an all-in-one solution that helps streamline front-of-house (FOH) and back-of-house (BOH) operations, puts them in front of new guests, and backs it all up with 24/7 customer support in 240 languages. For diners, it’s a way to discover new restaurants through personalized recommendations, data-backed lists, and curated guides, and to manage all reservations through an intuitive app.
F500, a company known for its robust online reservation services, faced significant challenges in visualizing and troubleshooting its Kubernetes-based applications. After transitioning from Singularity to Kubernetes, product engineers at F500 struggled with the lack of a user-friendly interface to view and manage deployments. The native Kubernetes Dashboard, although functional, proved to be verbose and cumbersome, impeding efficient application management.
The primary challenge for F500 was to provide their engineers with a tool that could simplify the visualization of deployments, making it easier to understand and manage applications in the Kubernetes environment. This was critical for efficient deployment, scaling, balancing, and restarting of applications. Furthermore, the company sought to enhance its troubleshooting capabilities, particularly during incidents, by reducing triage time and offering deeper insights into deployment failures.
Komodor emerged as the ideal solution to F500’s challenges. It significantly improved the engineers’ ability to visualize deployments and troubleshoot effectively. The most notable benefits included:
- Reduced Triage Time: Using Komodor led to a substantial decrease in triage time during incidents. Its ability to provide quick access to logs and insights into failed deployments allowed engineers to resolve issues more rapidly.
- Insightful Troubleshooting: Komodor offered advanced features like root cause analysis within log streams, pinpointing the exact line potentially responsible for deployment failures.
- Enhanced Infra Visibility: The platform provided detailed insights into node activity, helping engineers correlate events such as disk and memory pressure with pod evictions.
- Event Timeline: The event timeline feature was particularly valuable, overlaying node events with deployment events to aid in swift troubleshooting.
- Efficient Change Management: Komodor’s ability to highlight differences between deployments, including changes in application code or base images, proved crucial in identifying breaking changes quickly.
- Log Retention: Unlike the native Kubernetes dashboard, Komodor retained logs for a period, even for deleted pods, aiding postmortem analysis.
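
To make the idea of overlaying node events with deployment events concrete, here is a minimal sketch in Python. It is not Komodor's implementation or API: the `Event` record, the `build_timeline` and `correlate_evictions` helpers, the five-minute window, and the sample data are all hypothetical. The reason strings are illustrative samples modeled on common kubelet event reasons.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified event record (not a Komodor or Kubernetes type)."""
    time: datetime
    source: str  # "node" or "workload" (illustrative labels)
    node: str    # node the event relates to
    reason: str  # e.g. "NodeHasDiskPressure", "Evicted"


# Illustrative set of reasons treated as "node under pressure".
PRESSURE_REASONS = {"NodeHasDiskPressure", "NodeHasInsufficientMemory"}


def build_timeline(node_events, workload_events):
    """Overlay both event streams into one list ordered by time."""
    return sorted([*node_events, *workload_events], key=lambda e: e.time)


def correlate_evictions(timeline, window=timedelta(minutes=5)):
    """Pair each eviction with the latest pressure event on the same node,
    if that pressure event happened within `window` before the eviction."""
    last_pressure = {}
    pairs = []
    for event in timeline:
        if event.reason in PRESSURE_REASONS:
            last_pressure[event.node] = event
        elif event.reason == "Evicted":
            cause = last_pressure.get(event.node)
            if cause is not None and event.time - cause.time <= window:
                pairs.append((cause, event))
    return pairs


# Sample data (made up for illustration).
t0 = datetime(2024, 1, 1, 12, 0)
node_events = [Event(t0, "node", "node-a", "NodeHasDiskPressure")]
workload_events = [
    Event(t0 + timedelta(minutes=2), "workload", "node-a", "Evicted"),
    Event(t0 + timedelta(minutes=3), "workload", "node-b", "Evicted"),
]

timeline = build_timeline(node_events, workload_events)
pairs = correlate_evictions(timeline)
for cause, eviction in pairs:
    print(f"{eviction.time:%H:%M} eviction on {eviction.node} follows {cause.reason}")
```

In this sample, only the eviction on `node-a` is paired with a pressure event; the eviction on `node-b` has no preceding pressure event on that node, so it is left uncorrelated.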
“My advice would be don’t overthink it. It was so easy to onboard. It’s definitely worth it in the long run. I was heavily involved at the beginning, getting us onboarded with Komodor, but the great thing about it, and this speaks volumes about the organization and the product they have made, is that it does a great job running itself, and the users are very happy with it.”