AKS monitoring is the process of overseeing and managing the performance and availability of Azure Kubernetes Service (AKS) clusters. It involves the collection and analysis of metrics, logs, and traces from AKS to gain insights into the system’s health and performance.
AKS monitoring allows administrators to keep track of the status of their applications, detect anomalies, troubleshoot issues, and plan for capacity. It can provide visibility into various aspects of AKS, including node performance, pod status, network traffic, and more.
Azure provides a range of tools you can use to monitor AKS, including Azure Monitor, Container Insights, and Azure Log Analytics. These tools provide visibility into your AKS environment, allowing you to diagnose and resolve issues in your AKS clusters.
This is part of a series of articles about Kubernetes monitoring.
Monitoring your AKS clusters is vital for several reasons:
Itiel Shwartz
Co-Founder & CTO
In my experience, here are tips that can help you better monitor AKS clusters:
Configure Azure Monitor to collect and analyze metrics and logs for real-time operational insights.
Enable Prometheus metrics scraping for detailed performance data and alerting capabilities.
Collect and analyze control plane logs for deeper insights into cluster operations and troubleshooting.
Use Azure Monitor for Networks to track and analyze network traffic and identify potential bottlenecks.
Implement traffic analytics to monitor network flows and optimize performance and security.
Azure provides a range of tools for AKS monitoring. Let’s explore a few of them.
Azure Monitor is a comprehensive service that collects, analyzes, and visualizes metrics and logs from your Azure resources, including AKS. It provides real-time operational insights, allowing you to diagnose issues and understand trends.
Azure Monitor integrates with AKS, enabling you to collect metrics and logs from your AKS clusters. It also supports querying and alerting, allowing you to set up alerts based on specific conditions and send notifications when these conditions are met.
Prometheus is an open source observability tool built for containerized and Kubernetes environments. Managed Prometheus with Azure Monitor is a fully managed service that provides Prometheus-as-a-Service for AKS. It enables you to collect Prometheus metrics from your AKS clusters and analyze them using Azure Monitor.
Managed Prometheus integrates seamlessly with AKS, allowing you to monitor your clusters using the same Prometheus queries and dashboards you are familiar with. It also supports alerting, allowing you to set up Prometheus alert rules and receive notifications when these rules are triggered.
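As a rough illustration of what a Prometheus alert rule contains, the sketch below builds a rule group in the open-source Prometheus rule-file shape (groups, rules, alert, expr, for, labels, annotations) and prints it as JSON, which is also valid YAML. The group name, alert name, and threshold are hypothetical. The expression assumes node exporter metrics such as node_cpu_seconds_total are being scraped. The managed service may expect rule groups in its own Azure resource format, so treat this as a picture of the rule's content rather than a deployable artifact.

```python
import json


def node_cpu_alert_group(threshold_percent: float = 80.0, hold_for: str = "10m") -> dict:
    """Build a Prometheus-style rule group with one node CPU alert.

    All names here are illustrative, not taken from any Azure template.
    """
    # Common PromQL idiom: 100 minus the idle share of CPU time per instance.
    expr = (
        "100 - (avg by (instance) "
        '(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) '
        f"> {threshold_percent:g}"
    )
    return {
        "groups": [
            {
                "name": "example-node-alerts",  # hypothetical group name
                "rules": [
                    {
                        "alert": "ExampleNodeHighCpu",  # hypothetical alert name
                        "expr": expr,
                        # The condition must hold this long before the alert fires.
                        "for": hold_for,
                        "labels": {"severity": "warning"},
                        "annotations": {
                            "summary": "High CPU on {{ $labels.instance }}",
                        },
                    }
                ],
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(node_cpu_alert_group(), indent=2))
```

The "for" field is what keeps a short CPU spike from paging anyone: the expression has to stay true for the whole window before the alert fires.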
Microsoft Defender for Cloud is a security management tool that integrates with AKS to provide threat protection. It monitors your AKS clusters for potential security threats and provides recommendations for improving your security posture.
Defender for Cloud collects security-related logs and metrics from your AKS clusters and analyzes them using advanced analytics and threat intelligence. It also supports automated responses, allowing you to take quick action when a threat is detected.
Related content: Read our guide to Kubernetes monitoring tools
Here are the main steps that will allow you to monitor your Azure Kubernetes clusters.
Container Insights is a feature of Azure Monitor that provides deep insights into the performance and health of your AKS clusters. It collects metrics, logs, and events from your AKS clusters and visualizes them in Azure Monitor.
To enable Container Insights for your AKS cluster, you need to deploy the Azure Monitor agent to your cluster nodes, which is typically done by enabling the monitoring add-on on the cluster. This agent collects the necessary data and sends it to Azure Monitor for analysis. Once enabled, you can view the collected data in the Azure Monitor dashboard, where you can analyze it and set up alerts.
Monitoring the performance of your AKS cluster involves tracking key metrics such as CPU usage, memory usage, network traffic, and more. These metrics can provide insights into the health and performance of your cluster.
You can monitor these metrics using Azure Monitor, which collects performance metrics from your AKS clusters and visualizes them in a dashboard. You can also set up alerts based on these metrics, enabling you to receive notifications when certain conditions are met.
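To make the idea of tracking CPU usage concrete, here is a minimal sketch that turns raw usage and capacity samples into a per-node utilization percentage. It assumes the samples have already been exported as (node, counter, value) tuples. The counter names cpuUsageNanoCores and cpuCapacityNanoCores are the ones Container Insights is generally understood to record for nodes, but verify them against your own workspace. The node names and values are made up.

```python
from collections import defaultdict
from statistics import mean


def node_cpu_utilization(samples):
    """Return average CPU utilization (percent) per node.

    `samples` is an iterable of (node, counter_name, value) tuples, where
    values are in nanocores. Nodes without a known capacity are skipped.
    """
    usage = defaultdict(list)
    capacity = {}
    for node, counter, value in samples:
        if counter == "cpuUsageNanoCores":
            usage[node].append(value)
        elif counter == "cpuCapacityNanoCores":
            capacity[node] = value
    return {
        node: round(100.0 * mean(values) / capacity[node], 1)
        for node, values in usage.items()
        if capacity.get(node)
    }


# Hypothetical samples: two usage readings for node-a, one for node-b.
example_samples = [
    ("node-a", "cpuCapacityNanoCores", 2_000_000_000),
    ("node-a", "cpuUsageNanoCores", 500_000_000),
    ("node-a", "cpuUsageNanoCores", 1_500_000_000),
    ("node-b", "cpuCapacityNanoCores", 4_000_000_000),
    ("node-b", "cpuUsageNanoCores", 3_000_000_000),
]

if __name__ == "__main__":
    print(node_cpu_utilization(example_samples))
```

Expressing usage as a share of capacity makes nodes of different sizes comparable, which raw nanocore counts do not.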
Container Insights supports alerting, allowing you to create alert rules based on specific conditions. These alert rules can help you detect and respond to issues in your AKS cluster.
To create an alert rule, you need to specify the condition that triggers the alert, the action to take when the alert is triggered, and the recipients of the alert notification. For example, you could create an alert rule that triggers when the CPU usage of a node exceeds a certain threshold and sends an email notification to your operations team.
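The three parts of an alert rule described above (condition, action, recipients) can be modeled in a few lines. This is a conceptual sketch only. It does not call any Azure API, and the class, the metric key node_cpu_percent, the threshold, and the email address are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AlertRule:
    """Conceptual model of an alert rule: condition + action + recipients."""

    name: str
    condition: Callable[[Dict[str, float]], bool]
    recipients: List[str]
    action: Callable[[str, List[str]], None]

    def evaluate(self, metrics: Dict[str, float]) -> bool:
        """Run the condition; if it holds, run the action for all recipients."""
        fired = self.condition(metrics)
        if fired:
            self.action(f"{self.name} fired: {metrics}", self.recipients)
        return fired


# Stand-in for an email integration: record what would have been sent.
sent_messages = []


def record_notification(message: str, recipients: List[str]) -> None:
    for recipient in recipients:
        sent_messages.append((recipient, message))


high_cpu_rule = AlertRule(
    name="node-high-cpu",
    condition=lambda m: m.get("node_cpu_percent", 0.0) > 80.0,  # hypothetical threshold
    recipients=["ops-team@example.com"],  # placeholder address
    action=record_notification,
)

if __name__ == "__main__":
    high_cpu_rule.evaluate({"node_cpu_percent": 92.0})
    print(sent_messages)
```

Keeping the condition separate from the action means the same threshold can later be routed to a different channel without touching the detection logic.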
It’s crucial that you enable scraping of Prometheus metrics for your cluster, as this will provide you with a wealth of information about your system’s performance. Prometheus metrics can provide:
There are two templates available in the Managed Prometheus service, which can help you automatically set up alerts for your cluster:
Control plane logs provide detailed records of the operations performed by the Kubernetes control plane, which can help you understand how your cluster is functioning and troubleshoot cluster-level issues.
To collect control plane logs for AKS clusters, you need to create diagnostic settings. These settings allow you to specify which logs you want to collect and where you want to store them. You can choose to send the logs to a Log Analytics workspace, a storage account, or an event hub.
Creating diagnostic settings for control plane logs also enables you to set up alerts based on specific log events. For example, you can create an alert that triggers whenever there’s a failed API request, indicating a potential issue with your cluster.
Control plane logs also let you track changes made to your cluster, such as the creation or deletion of pods. This can help you understand the impact of these changes on your cluster’s performance and stability.
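A diagnostic setting is essentially a list of log categories plus a destination. The sketch below builds such a request body as a plain dictionary, following the general shape of an Azure Monitor diagnostic settings resource (a properties object with workspaceId and a logs list). The workspace ID is a placeholder. The category names are the control plane categories commonly listed for AKS and should be checked against what your cluster actually offers. Nothing here is sent to Azure.

```python
import json

# Commonly listed AKS control plane log categories (verify for your cluster).
CONTROL_PLANE_CATEGORIES = [
    "kube-apiserver",
    "kube-audit-admin",
    "kube-controller-manager",
    "kube-scheduler",
    "cluster-autoscaler",
]


def diagnostic_setting_body(workspace_id: str, categories=None) -> dict:
    """Build a diagnostic-settings body that sends logs to Log Analytics."""
    if categories is None:
        categories = CONTROL_PLANE_CATEGORIES
    return {
        "properties": {
            "workspaceId": workspace_id,
            "logs": [{"category": c, "enabled": True} for c in categories],
        }
    }


if __name__ == "__main__":
    # Placeholder resource ID; substitute your own Log Analytics workspace.
    placeholder_workspace = (
        "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
        "/providers/Microsoft.OperationalInsights/workspaces/<workspace-name>"
    )
    print(json.dumps(diagnostic_setting_body(placeholder_workspace), indent=2))
```

Selecting categories explicitly matters because audit logs in particular can be high volume, and each enabled category adds to ingestion cost.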
Network observability is a crucial aspect of AKS monitoring. It gives you visibility into your cluster’s network traffic, allowing you to understand how your applications are communicating with each other and with external services.
To enable network observability, you can use tools like Azure Monitor for Networks, which provides real-time network flow data for your AKS clusters. This data can help you identify network bottlenecks, troubleshoot connectivity issues, and ensure that your applications are communicating efficiently.
In addition, network observability can help you detect potential security threats. For example, if you see an unusually high amount of traffic coming from a specific IP address, this could indicate a potential DDoS attack. By monitoring your network traffic, you can detect such threats early and take action to protect your cluster.
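The "unusually high amount of traffic from a specific IP address" check can be sketched as a simple share-of-traffic test. This assumes flow records have already been exported as (source IP, destination IP) pairs. The thresholds and addresses are illustrative, and a real detector would also account for time windows and known-good sources.

```python
from collections import Counter


def flag_heavy_sources(flows, min_requests=100, share_threshold=0.5):
    """Return source IPs that send a large share of all observed flows.

    `flows` is an iterable of (source_ip, destination_ip) pairs. A source is
    flagged when it has at least `min_requests` flows and accounts for at
    least `share_threshold` of the total.
    """
    counts = Counter(src for src, _dst in flows)
    total = sum(counts.values())
    return sorted(
        ip
        for ip, n in counts.items()
        if n >= min_requests and n / total >= share_threshold
    )


# Hypothetical flow sample using documentation-reserved IP ranges.
example_flows = (
    [("203.0.113.7", "10.0.0.4")] * 300
    + [("198.51.100.2", "10.0.0.4")] * 50
    + [("198.51.100.3", "10.0.0.5")] * 50
)

if __name__ == "__main__":
    print(flag_heavy_sources(example_flows))
```

Requiring both a minimum count and a minimum share avoids flagging the only client of a quiet service, which would otherwise hold 100% of a handful of flows.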
Lastly, an effective AKS monitoring strategy should include the use of traffic analytics. Traffic analytics is an Azure solution that provides visibility into user and application activity in your Azure networks. It analyzes Azure Network Watcher flow logs and provides insights into traffic flows in your Azure subscriptions.
In the context of AKS, traffic analytics provides detailed insights into the network traffic to and from your cluster, helping you understand your application’s network behavior and optimize its performance. With traffic analytics, you can monitor the volume of network traffic, the source and destination of the traffic, and the protocols and ports being used. This information can help you identify potential bottlenecks or inefficiencies in your network.
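The kind of breakdown described above (volume by protocol and port) amounts to a group-and-sum over flow records. The sketch below shows that aggregation on a hypothetical in-memory record type. The field names are assumptions for illustration and do not mirror the actual flow log schema.

```python
from collections import defaultdict
from typing import NamedTuple


class Flow(NamedTuple):
    """Hypothetical, simplified flow record."""

    src: str
    dst: str
    protocol: str
    dst_port: int
    bytes_sent: int


def top_ports_by_volume(flows, n=3):
    """Return the n (protocol, port) pairs carrying the most bytes."""
    totals = defaultdict(int)
    for flow in flows:
        totals[(flow.protocol, flow.dst_port)] += flow.bytes_sent
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


# Made-up records for illustration.
example_records = [
    Flow("10.0.0.4", "10.0.1.9", "TCP", 443, 12_000),
    Flow("10.0.0.5", "10.0.1.9", "TCP", 443, 8_000),
    Flow("10.0.0.4", "10.0.2.3", "TCP", 5432, 15_000),
    Flow("10.0.0.6", "10.0.3.1", "UDP", 53, 900),
]

if __name__ == "__main__":
    for (protocol, port), total in top_ports_by_volume(example_records):
        print(protocol, port, total)
```

Swapping the grouping key for the source or destination address gives the corresponding "top talkers" view.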
Komodor’s platform streamlines the day-to-day operations and troubleshooting process of your Kubernetes apps. Regardless of which Kubernetes Managed Service provider you may be using (and you may be using multiple!), Komodor acts as your single pane of glass for monitoring your Kubernetes workloads, providing enhanced visibility into your clusters and integrating with popular monitoring tools like Datadog, Prometheus or Grafana for clear metric and event visualization. Additionally, it features static monitors that enforce best practices and prevent misconfigurations, and historical data retention that lets you see a complete timeline of events leading up to the current state.
Moreover, Komodor’s Workspace view feature reduces the cognitive load on K8s non-experts by filtering out irrelevant data, ensuring that they stay informed about their app’s performance data and can take swift action when issues arise. By reducing the overwhelming flow of data that emerges from various dashboards and APMs, Komodor helps end users own their apps end to end and operate them independently.
To learn more about how Komodor can help you and your teams troubleshoot K8s, sign up for our free trial.