Pod in Pending State? Top 6 Causes and How to Resolve

Your deployment is done. The YAML is clean. But the pod isn’t running. It’s just sitting there, pending. No error, no container, no progress.

A pod stuck in pending state means Kubernetes wants to run your workload but can’t find a suitable place for it. The reasons range from exhausted node resources to a single misconfigured label. Here’s how to diagnose and fix every one of them.

What Is a Kubernetes Pod Pending State? 

In Kubernetes, a pod is the basic unit of deployment. A pod moves through several phases; the "Pending" phase indicates that the pod has been accepted by the cluster but is not yet running on a node, meaning it is still waiting to be scheduled. This situation typically arises when the scheduler has not yet found a node with the necessary free resources, or no available node meets the pod's requirements.

During this phase, the pod’s containers have not started, and any issues preventing scheduling need resolution. A pod often enters the pending state due to constraints such as insufficient compute resources, node selection criteria, or unmet storage demands.

This is part of a series of articles about Kubernetes troubleshooting.

Teams dealing with recurring Pending pods across multiple clusters may also want to explore how an AI SRE Platform helps platform teams troubleshoot Kubernetes environments at scale.

Common Causes of Pods Remaining in Pending State 

1. Insufficient Node Resources

A common cause of pods remaining in the pending state is insufficient node resources. Kubernetes requires adequate CPU and memory on nodes to launch new pods. When these resources are depleted, the scheduler cannot place pods, leaving them in pending status. It's critical to monitor resource utilization across the cluster to ensure that nodes can accommodate new workloads as needed.

Resource overcommitment can also lead to this state. If pods collectively request more resources than the nodes can supply, competition for those resources intensifies. Operators need to manage workload distribution carefully, possibly reconfiguring resource requests or scaling up cluster capacity to handle the increased demand.
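As a minimal sketch, the scheduler compares a pod's resource requests against each node's allocatable capacity; the names and values below are illustrative, not taken from any real workload:

```yaml
# Hypothetical pod spec: the scheduler only considers nodes whose
# allocatable CPU/memory can satisfy these requests.
apiVersion: v1
kind: Pod
metadata:
  name: web-app            # illustrative name
spec:
  containers:
    - name: web
      image: nginx:1.25
      resources:
        requests:
          cpu: "500m"      # pod stays Pending if no node has 0.5 CPU free
          memory: "256Mi"
        limits:
          cpu: "1"
          memory: "512Mi"
```

If requests are set far above what containers actually use, pods may stay Pending even though the cluster has plenty of real headroom.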

If Pending pods are piling up because requests and actual usage have drifted apart, see our Kubernetes rightsizing at scale and the broader Kubernetes cost optimization guides.

2. Node Not Ready or Unschedulable

Pods may remain pending if nodes become unschedulable due to health issues or if administrators intentionally cordon nodes for maintenance. A node marked as "NotReady" cannot host additional workloads, typically because of kubelet failures, resource pressure (memory, disk, or PID), or lost connectivity to the control plane. Identifying and resolving the underlying issues is crucial to restoring node functionality and freeing up resources for pod scheduling.

Network disruptions or component failures on nodes can also contribute to this scenario, rendering nodes temporarily unschedulable. Regular node health checks and automated recovery processes help mitigate these risks.
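For reference, cordoning a node with `kubectl cordon <node-name>` simply sets the `spec.unschedulable` field on the Node object; the node is then reported as `SchedulingDisabled` and receives no new pods (the node name below is illustrative):

```yaml
# What `kubectl cordon` effectively does to the Node object.
apiVersion: v1
kind: Node
metadata:
  name: node-2            # illustrative node name
spec:
  unschedulable: true     # scheduler skips this node until it is uncordoned
```

Running `kubectl uncordon <node-name>` clears the field and makes the node schedulable again.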

When node health issues repeatedly turn into manual escalations, platform teams can reduce handoff friction in TicketOps.

3. Node Selectors and Affinity Constraints

Pods can specify node selectors or affinity rules to control which nodes they run on. If these criteria are too restrictive, the scheduler may struggle to find an eligible node, extending the pending state. Node selectors use labels to designate nodes, while affinity constraints express more complex relationships, such as co-locating or separating pods.

Misconfigured labels or overly stringent rules may limit the scheduler’s choices, exacerbating resource scarcity and prolonging delays. Evaluating and adjusting these constraints allows for a more flexible deployment strategy.
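As an illustrative sketch (the labels and names below are hypothetical), a pod can combine a simple nodeSelector with a required node affinity rule; if no node carries both labels, the pod stays Pending with a "didn't match" scheduling event:

```yaml
# Pod that only schedules onto nodes labeled disktype=ssd AND gpu=true.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job            # hypothetical name
spec:
  nodeSelector:
    disktype: ssd          # node must have label disktype=ssd
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: gpu
                operator: In
                values: ["true"]
  containers:
    - name: worker
      image: my-worker:latest
```

Comparing these fields against `kubectl get nodes --show-labels` output quickly reveals whether any node is actually eligible.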

4. Taints and Tolerations

Another reason pods might remain pending is the use of taints and tolerations. Taints are applied to nodes to repel pods, while pods intended for tainted nodes must carry corresponding tolerations. Mismatched or missing tolerations leave pods pending, unable to land on any node. Correctly pairing these attributes is crucial for predictable scheduling.

If taints and tolerations are overused or misaligned with the intended workloads, they can inadvertently restrict node availability. Keeping these configurations simple and consistent balances node preferences with workload needs.
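As a hypothetical example, a node tainted with `kubectl taint nodes node-1 dedicated=batch:NoSchedule` only accepts pods carrying a matching toleration (key, value, and names below are illustrative):

```yaml
# Pod that tolerates the dedicated=batch:NoSchedule taint and can
# therefore land on the tainted node; pods without this toleration
# are repelled and may stay Pending if no other node fits.
apiVersion: v1
kind: Pod
metadata:
  name: batch-job          # illustrative name
spec:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "batch"
      effect: "NoSchedule"
  containers:
    - name: batch
      image: my-batch:latest
```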

5. PersistentVolumeClaim Issues

PersistentVolumeClaims (PVCs) are used by pods to request persistent storage. When the requested volumes are unavailable or improperly configured, pods remain in the pending state. This can arise from storage class misconfigurations, unbound PersistentVolumes (PVs), or temporarily unavailable storage backends, all of which directly impact the pod's ability to schedule.

Addressing PVC issues requires checking the existence and status of associated PVs, ensuring they are bound to the PVCs correctly. Verifying storage class parameters can also identify discrepancies that prevent successful bindings.
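A minimal PVC sketch (the claim name and storage class are hypothetical): if no PV can satisfy the claim, because the class name is wrong, no capacity exists, or the backend is down, the claim stays Pending and so does any pod that mounts it:

```yaml
# PVC requesting 10Gi from the "standard" StorageClass. Check its
# status with `kubectl get pvc`; "Pending" here blocks pod scheduling.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim              # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard    # must match an existing StorageClass
  resources:
    requests:
      storage: 10Gi
```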

6. Image Pull Errors

Image pull errors can keep a pod from transitioning out of the pending phase: the pod may already be assigned to a node, but its containers cannot start because their images fail to load. Such errors often stem from incorrect image names or tags, or authentication issues with private registries. Network issues can also disrupt access to external container registries.

Diagnosing these errors starts with verifying image details in the pod specification and credentials set up for accessing the registry. Network connectivity checks can identify infrastructure-induced delays in image downloads.
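For private registries, the pod spec must reference a valid image and pull secret; as an illustrative sketch (registry URL, secret name, and image are hypothetical):

```yaml
# Pod pulling from a private registry. `regcred` must be an existing
# docker-registry Secret in the same namespace, e.g. created with
# `kubectl create secret docker-registry regcred ...`.
apiVersion: v1
kind: Pod
metadata:
  name: private-app        # illustrative name
spec:
  imagePullSecrets:
    - name: regcred        # hypothetical secret name
  containers:
    - name: app
      image: registry.example.com/team/app:1.0.0
```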

Related content: Read our guide to pod status


Tips from the expert


Itiel Shwartz

Co-Founder & CTO

Itiel is the CTO and co-founder of Komodor. He’s a big believer in dev empowerment and moving fast, has worked at eBay, Forter and Rookout (as the founding engineer). Itiel is a backend and infra developer turned “DevOps”, an avid public speaker that loves talking about things such as cloud infrastructure, Kubernetes, Python, observability, and R&D culture.

In my experience, here are tips that can help you troubleshoot and prevent Kubernetes pods from remaining in the Pending state:

  • Enable resource reservations for system components: Reserve CPU and memory for critical system components like the kubelet and the API server using the --system-reserved or --kube-reserved kubelet flags. This prevents resource starvation that might delay pod scheduling.
  • Use LimitRange to set resource defaults: Define default resource requests and limits using LimitRange in namespaces to ensure that pods always specify appropriate resource allocations. This helps prevent overcommitment or under-requesting of resources.
  • Overprovision nodes: Deploy a small number of low-priority placeholder pods so there is always some buffer capacity for higher-priority pods when needed. This approach works well in clusters with variable workloads.
  • Audit PVC binding modes: Use the correct volumeBindingMode in the storage class (Immediate or WaitForFirstConsumer) to control how PersistentVolumeClaims are bound to PersistentVolumes, avoiding unnecessary delays in pod scheduling.
  • Optimize cluster autoscaler configurations: Tune the cluster autoscaler to handle specific workload patterns by configuring features like scale-down thresholds and custom resource limits, ensuring quick reactions to pending pods caused by resource shortages.
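To make the LimitRange tip concrete, here is a minimal sketch (the namespace and values are illustrative defaults, not recommendations):

```yaml
# Default requests/limits applied to containers that omit them in this
# namespace, so every pod ends up with bounded, schedulable resources.
apiVersion: v1
kind: LimitRange
metadata:
  name: default-resources
  namespace: dev           # hypothetical namespace
spec:
  limits:
    - type: Container
      defaultRequest:      # applied when a container sets no requests
        cpu: "250m"
        memory: "128Mi"
      default:             # applied when a container sets no limits
        cpu: "500m"
        memory: "256Mi"
```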

Troubleshooting Pods Stuck in Pending State 

When pods remain in the pending state, identifying and addressing the root cause is crucial. Kubernetes offers various tools and methods to diagnose and resolve issues. Below are common troubleshooting steps with explanations and code examples.

1. Describe the Pod

The kubectl describe pod command provides detailed information about the Pod, including its status, conditions, and recent events.

Command:

kubectl describe pod <pod-name>

Example output (Events section):

Events:
  Type     Reason            Age               From               Message
  ----     ------            ----              ----               -------
  Warning  FailedScheduling  30s (x2 over 1m)  default-scheduler  0/5 nodes are available: 2 Insufficient cpu, 3 Insufficient memory.

The “Events” section in the output often indicates why the pod is not being scheduled. For instance, “FailedScheduling” could highlight resource constraints or node-related issues. Use this information to adjust resource requests or evaluate node availability.

Common FailedScheduling messages and what to check next

In many cases, the fastest way to understand why a pod is still Pending is to read the FailedScheduling message in the Events section and map it to the underlying constraint.

Event message: 0/n nodes are available: Insufficient cpu or Insufficient memory
Likely cause: The pod's resource requests do not fit on any currently schedulable node.
What to check next: Compare the pod's resources.requests with node Allocatable capacity and current usage, using kubectl describe pod <pod-name>, kubectl describe node <node-name>, and kubectl top nodes.

Event message: node(s) had untolerated taint
Likely cause: A node taint is blocking scheduling because the pod has no matching toleration.
What to check next: List node taints with kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints and confirm the pod spec includes the needed tolerations.

Event message: node(s) didn't match Pod's node affinity/selector
Likely cause: The pod's nodeSelector or required node affinity does not match available node labels.
What to check next: Review the pod's nodeSelector and .spec.affinity.nodeAffinity, then compare them with actual node labels using kubectl get nodes --show-labels.

Event message: pod has unbound immediate PersistentVolumeClaims
Likely cause: The pod depends on storage that is not yet bound or available.
What to check next: Check the PVC and PV status with kubectl get pvc and kubectl describe pvc <pvc-name>, and confirm the StorageClass and binding behavior are correct.

Event message: node(s) didn't match pod anti-affinity rules
Likely cause: Required anti-affinity rules are too restrictive for the current cluster layout.
What to check next: Inspect .spec.affinity.podAntiAffinity and verify whether your required rules leave any eligible node at all.

Event message: node(s) didn't have free ports for the requested pod ports
Likely cause: A requested hostPort is already in use on candidate nodes.
What to check next: Review the pod spec for hostPort usage and check whether another pod or process is already using that port on the target nodes.
How to Read FailedScheduling Messages

Start with the exact event text first, then move to cluster-wide events and scheduler logs only if the pod-level message is still inconclusive.

2. Check Events Related to the Pod

Developers can list all events in the cluster and filter them to find those associated with the pod. This can provide additional context, especially if certain events are not captured in kubectl describe pod.

Command:

kubectl get events | grep <pod-name>

This approach supplements the describe output, helping to uncover environment-wide factors affecting pod scheduling, such as delays or conflicts in resource allocation.

3. Analyze Scheduler Logs

The Kubernetes scheduler logs provide a detailed view of scheduling operations, offering insights into why pods remain pending. This is particularly helpful for debugging complex scheduling scenarios.

Command:

kubectl -n kube-system logs $(kubectl -n kube-system get pods | grep scheduler | awk '{print $1}')

By reviewing scheduler logs, teams can pinpoint scheduling errors or constraints, such as affinity conflicts or node taints. This is an advanced troubleshooting step when basic methods fail to identify the issue.

4. Use Pod Priority or Daemonsets

Pod priority and preemption

Kubernetes makes it possible to assign higher priorities to critical pods to preempt less important workloads. This can prevent pending pod issues.

Example:

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000
globalDefault: false
description: "This priority class is used for critical workloads."

Pods reference this class via spec.priorityClassName: high-priority and can preempt lower-priority pods if resources are scarce.

DaemonSets

For system-critical agents, DaemonSets ensure one pod runs on every eligible node. The DaemonSet controller automatically adds tolerations for several node conditions (such as not-ready and unreachable), which lets these pods schedule where ordinary pods cannot.

Example:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: critical-service
spec:
  selector:
    matchLabels:
      app: critical-service
  template:
    metadata:
      labels:
        app: critical-service
    spec:
      containers:
        - name: critical-service-container
          image: my-critical-service:latest

5. Check Node Status and Capacity

Inspecting the status and capacity of nodes is essential for identifying resource constraints that could prevent pods from being scheduled. Use the following commands to analyze node conditions, resource availability, and allocations.

Command: Check Node Status

kubectl get nodes

Example output:

NAME     STATUS     ROLES    AGE   VERSION
node-1   Ready      worker   15d   v1.26.0
node-2   Ready      worker   10d   v1.26.0
node-3   NotReady   worker   12d   v1.26.0

The STATUS column shows the readiness of nodes. Nodes marked as NotReady are unavailable for scheduling. Investigate these nodes by examining their conditions.

Command: Describe a Node

kubectl describe node <node-name>

Example output (truncated):

Conditions:
  Type             Status   LastHeartbeatTime      Reason
  ----             ------   -----------------      ------
  MemoryPressure   False    2024-12-30T10:32:45Z   KubeletHasSufficientMemory
  DiskPressure     False    2024-12-30T10:32:45Z   KubeletHasNoDiskPressure
  PIDPressure      False    2024-12-30T10:32:45Z   KubeletHasSufficientPID
  Ready            True     2024-12-30T10:32:45Z   KubeletReady

Allocatable:
  cpu:      4
  memory:   16Gi
  pods:     110

The Conditions section provides insights into issues like memory, disk, or PID pressure. Allocatable shows the resources available for scheduling.

Command: View Node Resource Usage

To evaluate resource usage and detect overcommitment, check node metrics (requires Metrics Server):

kubectl top nodes

Example output:

NAME     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node-1   1200m        30%    8Gi             50%
node-2   900m         22%    6Gi             37%
node-3   -            -      -               -

The percentage of resource usage indicates how heavily loaded a node is. Nodes nearing 100% utilization might not accommodate new pods.

Command: List Node Taints

kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

Example output:

NAME     TAINTS
node-1   <none>
node-2   key=value:NoSchedule
node-3   key=value:NoExecute

Taints applied to nodes can restrict scheduling. Pods require matching tolerations to be scheduled on such nodes.

Addressing node issues

  • Resolve NotReady status: Investigate the node’s health by checking system logs, verifying connectivity, or restarting the kubelet.
  • Scale resources: Add nodes to the cluster or resize existing nodes to provide additional capacity.
  • Modify taints and tolerations: Adjust taints or configure pod tolerations to align with workload requirements.

Kubernetes Troubleshooting with Komodor

Komodor is the AI SRE Platform designed to maximize Kubernetes reliability. It helps democratize K8s expertise across the organization and enable engineering teams to leverage the full value of Kubernetes.

Komodor’s platform empowers developers to confidently monitor and troubleshoot their workloads while allowing cluster operators to enforce standardization and optimize performance. 

By leveraging Komodor, companies of all sizes significantly improve reliability, productivity, and velocity. Or, to put it simply – Komodor helps you spend less time and resources on managing Kubernetes, and more time on innovating at scale.

FAQs About Pod in Pending State

What does it mean when a Kubernetes pod is in Pending state?

A Kubernetes pod in Pending state has not yet been assigned to a node and its containers haven't started. The scheduler is either waiting for sufficient CPU/memory resources, or the pod's requirements, such as node selectors, affinity rules, taints/tolerations, or PersistentVolumeClaims, haven't been met yet. Resolving the underlying constraint moves the pod to the Running state.

What are the most common causes of a pod stuck in Pending?

The most common causes are: insufficient CPU or memory on available nodes, nodes marked as NotReady or cordoned, misconfigured node selectors or affinity rules, mismatched taints and tolerations, unbound PersistentVolumeClaims, and image pull errors. Running kubectl describe pod and checking the Events section is the fastest way to identify the root cause.

How do I fix a pod stuck in Pending state?

Start by running kubectl describe pod and reviewing the Events section. Common fixes include: adding cluster capacity or scaling nodes, adjusting resource requests, fixing node selectors or affinity rules, aligning taints with pod tolerations, resolving PVC binding issues, and correcting image names or registry credentials. Scheduler logs can help debug complex scenarios.

How do I check whether my nodes have capacity for new pods?

Run kubectl get nodes to check node readiness, kubectl describe node to review allocatable resources and pressure conditions, and kubectl top nodes (requires Metrics Server) to see live CPU and memory usage. Nodes approaching 100% utilization won't accept new pods, requiring you to either scale the cluster or rebalance existing workloads.

What is the difference between node selectors and taints?

Node selectors use labels to restrict which nodes a pod can schedule on. If no node matches the label, the pod stays Pending. Taints actively repel pods unless the pod has a matching toleration. Both can cause Pending states, but a taint on a node repels every pod that lacks a matching toleration, making misconfigured tolerations particularly disruptive.