Kubernetes CPU Limits and Throttling

What Is the Kubernetes CPU Limit Used For

When you create a template for a pod, you can optionally specify how many resources each container is allowed to use on a Kubernetes node. The most common resources are CPU and memory (RAM), but you can also specify others.

You can specify a resource request which indicates the minimal resources needed for containers in a pod—the kube-scheduler uses this information to decide which node to schedule the pod on and reserves at least the requested amount of the resource specifically for that container to use. When you specify a resource limit for a container, the kubelet enforces this limit, making sure that the running container does not use more than the resources specified.

How Requests and Limits Work

Each node in a Kubernetes cluster is allocated memory (RAM) and compute power (CPU) that can be used to run containers. A Kubernetes cluster defines a logical grouping of one or more containers into pods. You can then deploy and manage pods on top of your nodes.

When you create a pod, you typically specify the storage and networking that containers share within that pod. The Kubernetes scheduler finds a node that has the required resources to run the pod.

You can provide more information for the scheduler using two parameters that specify RAM and CPU utilization:

  • Request—sets the minimum amount of RAM or CPU required for the container. Kubernetes aggregates all container requests into a single pod request. The scheduler uses this pod request to ensure that pods are deployed to nodes with sufficient resources.
  • Limit—you can set a maximum amount of allows RAM or CPU utilization by specifying a limit on the container. Kubernetes translates and enforces restrictions by interacting with container engines, such as Docker or containerd. When a container exceeds its memory limit, the kubelet typically kills and restarts it. CPU limits are more lenient and can be exceeded for long periods of time.

Which Issues Can Occur if You Don’t Specify the CPU Limit in Kubernetes?

If you do not specify a CPU limit, the container can use all the CPU resources available on the node. This can cause containers with high CPU utilization to slow down other containers on the same node and use all available CPU, and may even cause Kubernetes components such as the kubelet to become unresponsive. The node then enters a NotReady state, causing its pods to be rescheduled on another node.

By setting limits on all containers, you can avoid most of the following problems:

  • Out of Memory (OOM) issues—can cause a node to go down, affecting the stability of the cluster. For example, applications with memory leaks can cause OOM problems. However, memory limits on containers can prevent memory leaks within a container from affecting the node.
  • CPU starvation—applications that are too CPU-intensive can affect all applications on the same node. Other applications can slow down or become unresponsive.
  • Pod eviction—when a node runs out of resources, the node initiates an eviction process that terminates pods. The first pods evicted are those that have no resource requests.
  • Financial waste—if there is no need for resource requests or limits, and there are no errors, this probably means you have over-provisioned the cluster and are overpaying for hardware resources.

What is CPU Throttling

CPU throttling means that applications are granted more constrained resources when they are near to the container’s CPU limit. In some cases, container throttling occurs even when CPU utilization is not close to the limit due to bugs in the Linux kernel.

Consider a single-threaded application running on a limited CPU with a processing time of 200ms per operation. The following diagram shows an application that completes the request:

Constrained CPU Limits

Now consider an application with a CPU limit of 0.4 CPUs. The application will only receive about 40ms of runtime for each 100ms. This means that instead of completing the request in 200ms, it will take a total of 440ms. This means the application is experiencing CPU throttling.

Unconstrained CPU Limits

Preventing Errors by Detecting Containers Without CPU Limits

The first step in setting appropriate Kubernetes resource limits is to discover containers without limits.

Finding containers without CPU limits by namespace

Use this query to discover containers without CPU limits in a specific namespace.

sum by (namespace)
(count by (namespace,pod,container)(kube_pod_container_info{container!=""})
unless sum by (namespace,pod,container)(kube_pod_container_resource_limits{resource="cpu"}))

Finding containers with tight CPU limits

This technique aims to avoid CPU throttling by identifying containers that have CPU limits close to their actual utilization.

Use this query to find containers with CPU utilization close to the limit:

(sum by
(namespace,pod,container)(rate(container_cpu_usage_seconds_total{container!=""}[5m])) / 
sum by(namespace,pod,container)(kube_pod_container_resource_limits{resource="cpu"})) > 0.8

Checking if the cluster has enough capacity

Kubernetes makes sure that pods are only scheduled on a node if that node has enough resources for the aggregate requests of all the container’s pods. This also means that the node commits to each container the CPU and memory resources specified in its resource request.

Consider a Kubernetes cluster where the sum of all resource requests is greater than the resources available in the cluster. This is known as “overcommitting”. When the cluster is overcommitted, pods might work well in normal circumstances, but when there are high loads, containers can start using resources up to the limit. This will cause certain pods to evict, and in extreme cases, nodes can die due to resource starvation in the cluster.

To check for CPU overcommits in the cluster, use the following query:

100 * sum(kube_pod_container_resource_limits{container!="",resource="cpu"} ) /

Quick Tutorial: How to Assign CPU Resources to Containers and Pods

This is based on an example from the official Kubernetes documentation.

Step 1: Create a separate namespace
First, we’ll create a separate Namespace so that resources created in the tutorial are isolated from the rest of your cluster.

kubectl create namespace cpu-example

Step 2: Create a pod with one container and a resource request
Here is a pod template with one container. The container has a resources:requests field that specifies a request of 0.5 CPU and a resources:limits field that specifies a limit of 1 CPU.

Note that the pod template can also specify how many CPUs the container should be allowed to use. The args section in the template below indicates that the container should attempt to use 2 CPUs.

apiVersion: v1
kind: Pod
  name: cpu-demo
  namespace: cpu-example
 —name: cpu-demo-ctr
    image: vish/stress
        cpu: "1"
        cpu: "0.5"

Step 3: Create the pod
Create the pod in your namespace using this command:

kubectl apply -f https://k8s.io/examples/pods/resource/cpu-request-limit.yaml --namespace=cpu-example

Step 4: View pod requests and limits
Run this command:

kubectl get pod cpu-demo --output=yaml --namespace=cpu-example

The output shows that the pod running in the cluster has a request of 0.5 CPU and a limit of 1 CPU.

    cpu: "1"
    cpu: 500m

Run this command to get actual runtime metrics for the pod:

kubectl top pod cpu-demo --namespace=cpu-example

The output will look something like this. The example below shows that the pod is actually using 0.974 of the CPU, which is slightly less than the limit. In this example, the application on the container is throttled by Kubernetes because we configured it to use 2 CPUs, but its limit allows it to use only one.

NAME                        CPU(cores)   MEMORY(bytes)
cpu-demo                    974m         [something]

Solving Kubernetes Node Errors with Komodor

Troubleshooting Kubernetes CPU issues requires visibility into Kubernetes cluster node, and the ability to correlate node status with what’s happening in the rest of the cluster. More often than not, you will be conducting your investigation during fires in production.

Komodor can help with our new ‘Node Status’ view, built to pinpoint correlations between service or deployment issues and changes in the underlying node infrastructure. With this view you can rapidly:

  • See service-to-node associations
  • Correlate service and node health issues
  • Gain visibility over node capacity allocations, restrictions, and limitations
  • Identify “noisy neighbors” that use up cluster resources
  • Keep track of changes in managed clusters
  • Get fast access to historical node-level event data


Beyond node error remediations, Komodor can help troubleshoot a variety of Kubernetes errors and issues, acting as a single source of truth (SSOT) for all of your K8s troubleshooting needs. Komodor provides:

  • Change intelligence: Every issue is a result of a change. Within seconds we can help you understand exactly who did what and when.
  • In-depth visibility: A complete activity timeline, showing all code and config changes, deployments, alerts, code diffs, pod logs and etc. All within one pane of glass with easy drill-down options.
  • Insights into service dependencies: An easy way to understand cross-service changes and visualize their ripple effects across your entire system.
  • Seamless notifications: Direct integration with your existing communication channels (e.g., Slack) so you’ll have all the information you need, when you need it.

Related content: Read our guide to Kubernetes RBAC

If you are interested in checking out Komodor, use this link to sign up for a Free Trial.

How useful was this post?

Click on a star to rate it!

Average rating 3 / 5. Vote count: 6

No votes so far! Be the first to rate this post.