Kubernetes CPU limits define the maximum CPU resources a container is allowed to use on the host machine. When you create a template for a pod, you can optionally specify how many resources each container is allowed to use on a Kubernetes node. The most common resources are CPU and memory (RAM), but you can also specify others.
You can specify a resource request, which indicates the minimum resources the containers in a pod need. The kube-scheduler uses this information to decide which node to schedule the pod on, and the node reserves at least the requested amount of the resource for that container to use. When you specify a resource limit for a container, the kubelet enforces this limit, making sure that the running container does not use more than the specified amount.
This is part of a series of articles about Kubernetes Management.
Each node in a Kubernetes cluster is allocated memory (RAM) and compute power (CPU) that can be used to run containers. Kubernetes groups one or more containers into a logical unit called a pod. You can then deploy and manage pods on top of your nodes.
When you create a pod, you typically specify the storage and networking that containers share within that pod. The Kubernetes scheduler finds a node that has the required resources to run the pod.
You can give the scheduler more information using two parameters, requests and limits, which can be set for both RAM and CPU.
Itiel Shwartz
Co-Founder & CTO
In my experience, here are tips that can help you better manage Kubernetes CPU limits and throttling:
Use monitoring tools to keep track of CPU usage and identify throttling instances.
Configure CPU limits based on the actual needs of your applications to avoid unnecessary throttling.
Define resource requests to ensure your pods get the necessary CPU resources.
Regularly review performance metrics to adjust CPU limits and requests appropriately.
Ensure your application code is efficient and not excessively consuming CPU.
If you do not specify a CPU limit, the container can use all the CPU resources available on the node. A container with high CPU utilization can then slow down other containers on the same node, and may even cause Kubernetes components such as the kubelet to become unresponsive. The node then enters a NotReady state, causing its pods to be rescheduled on another node.
By setting limits on all containers, you can avoid most of these problems.
Applying limits on CPU also has several potential drawbacks. In fact, some have suggested that CPU limits are an antipattern. Here are a few reasons:
Implementing CPU limits within a Kubernetes environment can often lead to resource underutilization. This occurs when containers are restricted by CPU limits that are set lower than the potential peak usage they might achieve under optimal conditions. As a result, even if additional CPU cycles are available on the node, they remain unused.
Setting and managing CPU limits introduces additional complexity into the resource management strategy of a Kubernetes cluster. Administrators must use meticulous planning to define appropriate CPU limits that reflect the needs of each container while avoiding resource contention. This balancing act requires continuous monitoring and adjustment of CPU limits.
CPU starvation occurs when the limits set are too restrictive, causing processes to receive insufficient CPU time for their execution needs. This is particularly problematic for compute-intensive applications like AI, big data processing, or real-time applications.
Strict CPU limits can also lead to potential disruptions in service delivery. When containers reach their CPU capacity, they are throttled, meaning they are temporarily restricted from using CPU resources beyond their set limit. This throttling can increase the time it takes for the container to complete its tasks, leading to bottlenecks and decreasing application performance.
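To make the cost of throttling concrete, here is a minimal Python sketch of how a CPU limit turns into a time budget. It assumes the limit is enforced through the Linux CFS quota mechanism with its default 100 ms period, and it uses a deliberately simplified model in which work spreads evenly across threads. The function name and the numbers are illustrative, not taken from a real workload.

```python
CFS_PERIOD_MS = 100  # assumed default CFS enforcement period


def wall_time_ms(cpu_work_ms, threads, limit_cores=None, period_ms=CFS_PERIOD_MS):
    """Estimate wall-clock time needed to finish cpu_work_ms of CPU work.

    Simplified model: work spreads evenly across threads, and a CPU limit
    caps total CPU time per period at limit_cores * period_ms.
    """
    # CPU time the threads could consume in one period if nothing held them back
    budget_per_period = threads * period_ms
    if limit_cores is not None:
        # The quota caps the budget; once it is spent, the container is throttled
        budget_per_period = min(budget_per_period, limit_cores * period_ms)
    return cpu_work_ms / budget_per_period * period_ms


if __name__ == "__main__":
    # 300 ms of CPU work on 2 threads, without and with a 1-CPU limit
    print(wall_time_ms(300, threads=2))                 # 150.0
    print(wall_time_ms(300, threads=2, limit_cores=1))  # 300.0
```

In this model the same request takes twice as long under a 1-CPU limit, even if the node has idle cores, which is the bottleneck effect described above.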
CPU limits are sometimes useful, but it’s important to understand the context and determine if they are the best solution for your needs. Key considerations include:
Benchmarking involves executing the application under various operational scenarios to measure the actual CPU usage across different states of application load. This data provides a baseline that helps in setting CPU limits that are neither excessively high, which would lead to wasted resources, nor too low, which might trigger CPU throttling.
In environments where Kubernetes hosts multiple tenants — different teams or applications sharing the same cluster resources — CPU limits prevent any single tenant from consuming disproportionate CPU resources. Such limits ensure that all tenants have equitable access to CPU resources, preventing one application’s excessive consumption from degrading the others.
CPU limits enhance the predictability of application performance by ensuring a stable allocation of CPU resources. This stability is crucial for applications that need to guarantee a specific level of performance or for those operating under stringent service level agreements (SLAs). By defining clear CPU boundaries, administrators can better manage the behavior of applications.
If you use CPU limits, it’s important to identify containers without any limits set. Here is how to do it.
Use this PromQL query to count the containers without CPU limits in each namespace.
sum by (namespace)(count by (namespace,pod,container)(kube_pod_container_info{container!=""}) unless sum by (namespace,pod,container)(kube_pod_container_resource_limits{resource="cpu"}))
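As a rough illustration of what this query computes, the Python sketch below mimics the `unless` operator as a set difference: all known containers minus those that report a CPU limit, counted per namespace. The function name and sample data are hypothetical.

```python
from collections import Counter


def containers_without_cpu_limit(all_containers, with_limit):
    """Count containers lacking a CPU limit, grouped by namespace.

    Both arguments are iterables of (namespace, pod, container) tuples.
    """
    # Set difference plays the role of PromQL's `unless`
    missing = set(all_containers) - set(with_limit)
    return Counter(namespace for namespace, _pod, _container in missing)


if __name__ == "__main__":
    all_containers = [
        ("shop", "web-1", "nginx"),
        ("shop", "web-1", "sidecar"),
        ("batch", "job-1", "worker"),
    ]
    with_limit = [("shop", "web-1", "nginx")]
    print(dict(containers_without_cpu_limit(all_containers, with_limit)))
```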
This technique aims to avoid CPU throttling by identifying containers that have CPU limits close to their actual utilization.
Use this query to find containers with CPU utilization close to the limit:
(sum by(namespace,pod,container)(rate(container_cpu_usage_seconds_total{container!=""}[5m])) / sum by(namespace,pod,container)(kube_pod_container_resource_limits{resource="cpu"})) > 0.8
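The comparison behind this query is a simple ratio. The following Python sketch applies the same 0.8 threshold to per-container usage and limit values expressed in CPU cores. The function name and sample figures are hypothetical, and containers with no limit are skipped, much as the division in the query yields no result for them.

```python
def near_cpu_limit(usage_cores, limit_cores, threshold=0.8):
    """Return containers whose CPU usage exceeds threshold * limit.

    usage_cores and limit_cores map a container key to CPU cores.
    """
    flagged = {}
    for key, used in usage_cores.items():
        limit = limit_cores.get(key)
        if not limit:
            continue  # no limit set (or zero): nothing to compare against
        ratio = used / limit
        if ratio > threshold:
            flagged[key] = ratio
    return flagged


if __name__ == "__main__":
    usage = {"api": 0.9, "worker": 0.2, "cron": 0.5}
    limits = {"api": 1.0, "worker": 1.0}  # "cron" has no CPU limit
    print(near_cpu_limit(usage, limits))  # {'api': 0.9}
```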
Kubernetes makes sure that a pod is only scheduled on a node if that node has enough resources for the aggregate requests of all the pod's containers. This also means that the node commits to each container the CPU and memory resources specified in its resource request.
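This scheduling rule can be sketched as a simple capacity check. The Python below is a simplified illustration rather than the scheduler's actual logic: it considers only CPU requests in millicores, and the function name and figures are hypothetical.

```python
def fits_on_node(pod_container_requests_m, node_allocatable_m, already_requested_m):
    """Return True if the pod's total CPU request fits in the node's free capacity.

    All values are in millicores (1000m = 1 CPU).
    """
    pod_request = sum(pod_container_requests_m)       # aggregate across containers
    free = node_allocatable_m - already_requested_m   # what the node can still commit
    return pod_request <= free


if __name__ == "__main__":
    # Node with 4 CPUs allocatable, 3.2 CPUs already requested by other pods
    print(fits_on_node([500, 250], 4000, 3200))  # True: 750m <= 800m
    print(fits_on_node([500, 500], 4000, 3200))  # False: 1000m > 800m
```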
Consider a Kubernetes cluster where the sum of all resource limits is greater than the resources available in the cluster. This is known as "overcommitting". When the cluster is overcommitted, pods might work well in normal circumstances, but under high load, containers can start using resources up to their limits. This can cause certain pods to be evicted, and in extreme cases, nodes can fail due to resource starvation in the cluster.
To check for CPU overcommits in the cluster, use the following query:
100 * sum(kube_pod_container_resource_limits{container!="",resource="cpu"}) / sum(kube_node_status_capacity{resource="cpu"})
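For intuition, the same ratio can be computed by hand. The Python sketch below mirrors the query: it sums CPU limits across containers, divides by total node CPU capacity, and expresses the result as a percentage, so any value above 100 means the limits exceed capacity. The function name and inputs are made up for illustration.

```python
def cpu_limit_commitment_pct(container_limits_cores, node_capacity_cores):
    """Return total CPU limits as a percentage of total node CPU capacity."""
    total_limits = sum(container_limits_cores)
    total_capacity = sum(node_capacity_cores)
    return 100 * total_limits / total_capacity


if __name__ == "__main__":
    limits = [1, 2, 2, 4, 3]  # CPU limits of five containers, in cores
    capacity = [4, 4]         # two nodes with 4 cores each
    print(cpu_limit_commitment_pct(limits, capacity))  # 150.0
```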
This is based on an example from the official Kubernetes documentation.
Step 1: Create a separate namespace
First, we'll create a separate namespace so that resources created in the tutorial are isolated from the rest of your cluster.
kubectl create namespace cpu-example
Step 2: Create a pod with one container and a resource request
Here is a pod template with one container. The container has a resources:requests field that specifies a request of 0.5 CPU and a resources:limits field that specifies a limit of 1 CPU.
Note that the args section of the template passes arguments to the container when it starts. In the template below, the -cpus "2" argument tells the container to attempt to use 2 CPUs, which is more than its limit allows.
apiVersion: v1
kind: Pod
metadata:
  name: cpu-demo
  namespace: cpu-example
spec:
  containers:
  - name: cpu-demo-ctr
    image: vish/stress
    resources:
      limits:
        cpu: "1"
      requests:
        cpu: "0.5"
    args:
    - -cpus
    - "2"
Step 3: Create the pod
Create the pod in your namespace using this command:
kubectl apply -f https://k8s.io/examples/pods/resource/cpu-request-limit.yaml --namespace=cpu-example
Step 4: View pod requests and limits
Run this command:
kubectl get pod cpu-demo --output=yaml --namespace=cpu-example
The output shows that the pod running in the cluster has a request of 0.5 CPU and a limit of 1 CPU.
resources:
  limits:
    cpu: "1"
  requests:
    cpu: 500m
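The request was written as "0.5" in the template but is reported as 500m, because Kubernetes expresses fractional CPUs in millicores (1 CPU = 1000m). The Python sketch below shows the conversion. It is a simplified helper that handles only plain decimal values and the m suffix, and the function name is hypothetical.

```python
from decimal import Decimal


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "0.5", "1" or "500m" to millicores.

    Simplified: supports only plain decimals and the "m" (milli) suffix.
    """
    q = quantity.strip()
    if q.endswith("m"):
        return int(q[:-1])          # already in millicores
    return int(Decimal(q) * 1000)   # whole or fractional CPUs


if __name__ == "__main__":
    print(cpu_to_millicores("0.5"))   # 500
    print(cpu_to_millicores("500m"))  # 500
    print(cpu_to_millicores("1"))     # 1000
```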
Run this command to get actual runtime metrics for the pod:
kubectl top pod cpu-demo --namespace=cpu-example
The output will look something like this. The example below shows that the pod is using 974 millicores (0.974 CPU), which is slightly less than its limit of 1 CPU. In this example, the application in the container is being throttled, because we configured it to attempt to use 2 CPUs but its limit allows it to use only one.
NAME       CPU(cores)   MEMORY(bytes)
cpu-demo   974m         [something]
Troubleshooting Kubernetes CPU issues requires visibility into Kubernetes cluster nodes, and the ability to correlate node status with what's happening in the rest of the cluster. More often than not, you will be conducting your investigation during fires in production.
Komodor can help with its 'Node Status' view, built to pinpoint correlations between service or deployment issues and changes in the underlying node infrastructure.
Beyond node error remediations, Komodor can help troubleshoot a variety of Kubernetes errors and issues. As the leading Continuous Kubernetes Reliability Platform, Komodor is designed to democratize K8s expertise across the organization and enable engineering teams to leverage its full value.
Komodor’s platform empowers developers to confidently monitor and troubleshoot their workloads while allowing cluster operators to enforce standardization and optimize performance. Specifically when working in a hybrid environment, Komodor reduces the complexity by providing a unified view of all your services and clusters.
By leveraging Komodor, companies of all sizes significantly improve reliability, productivity, and velocity. Or, to put it simply – Komodor helps you spend less time and resources on managing Kubernetes, and more time on innovating at scale.
Related content: Read our guide to Kubernetes RBAC
If you are interested in checking out Komodor, use this link to sign up for a Free Trial.