SIGKILL: Fast Termination Of Linux Containers | Signal 9

It starts with a scale-down event or a memory spike. Kubernetes sends SIGTERM. The grace period ticks down. The container doesn’t exit. Thirty seconds later, SIGKILL fires and whatever state that container was holding is gone.

This scenario plays out across production clusters every day. Understanding the full signal lifecycle is how you engineer around it.

What is SIGKILL (signal 9)

SIGKILL is a type of communication, known as a signal, used in Unix and Unix-like operating systems such as Linux to immediately terminate a process. Linux operators use it directly, and container orchestrators like Kubernetes use it when they need to shut down a container or pod.

Linux commands 44.8% of the server operating system market, making the signals it uses to manage processes anything but an edge case.

A signal is a standardized message sent to a running program that triggers a specific action, such as terminating a process or handling an error. This form of Inter-Process Communication (IPC) is how operating systems maintain control: when a signal is sent to a target process, the OS waits for any in-flight atomic instruction to complete, then interrupts the process's execution and handles the signal accordingly.

SIGKILL instructs the process to terminate immediately. It cannot be caught, ignored, or blocked. The process is killed, and any threads it is running are killed as well. If the SIGKILL signal fails to terminate a process and its threads, this indicates an operating system malfunction.

This is the strongest way to kill a process and can have unintended consequences, because there is no way to know whether the process completed its cleanup operations. Because it can result in data loss or corruption, it should only be used when there is no other option. In Kubernetes, a SIGTERM signal is always sent before SIGKILL, to give containers a chance to shut down gracefully.

This is part of a series of articles about Exit Codes.

What are SIGTERM (Signal 15) and SIGKILL (Signal 9)? Options for Killing a Process in Linux

In Linux and other Unix-like operating systems, there are several operating system signals that can be used to kill a process.

The most common types are:

  • SIGKILL (also known as Unix signal 9)—kills the process abruptly, with no chance to handle the termination. It is always effective at terminating the process, but can have unintended consequences.
  • SIGTERM (also known as Unix signal 15)—asks the process to terminate, but can be blocked or handled in various ways. It is a gentler way to kill a process.
| Term | What it means | Can the app clean up? | Usual exit code | What to check first |
|------|---------------|-----------------------|-----------------|---------------------|
| SIGTERM | Graceful stop request sent to the process during normal shutdown | Yes. The app can catch it and shut down cleanly | 143 in many container/Kubernetes cases | Was the pod deleted, scaled down, or replaced during a rollout? Did the app handle shutdown correctly? |
| SIGKILL | Immediate forced termination | No. The app cannot catch, block, or finish cleanup after it arrives | 137 | Did the container fail to exit before terminationGracePeriodSeconds ended? Was it manually killed? |
| OOMKilled | Container was killed after hitting memory pressure or its memory limit | No meaningful cleanup at kill time | Often 137 | Check pod status for OOMKilled, container memory limits/requests, recent spikes, and node memory pressure |
| Exit Code 137 | A process exit code commonly associated with signal 9 because shells use 128 + signal number | Not by itself. It is a result, not the cause | 137 | First determine why signal 9 happened: grace period expired, manual kill, or OOMKilled |

SIGTERM vs SIGKILL vs OOMKilled vs Exit Code 137
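
A quick way to see the difference in practice is to start a process that explicitly ignores SIGTERM and watch what each signal does. This is a minimal shell sketch; the throwaway loop stands in for a stuck application:

# Start a process that ignores SIGTERM
bash -c 'trap "" TERM; while true; do sleep 1; done' &
PID=$!

kill -15 "$PID"          # SIGTERM: ignored, the process keeps running
sleep 1; ps -p "$PID"    # still alive

kill -9 "$PID"           # SIGKILL: cannot be trapped, blocked, or ignored
sleep 1; ps -p "$PID"    # gone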

Tips from the expert

Itiel Shwartz

Co-Founder & CTO

Itiel is the CTO and co-founder of Komodor. He's a big believer in dev empowerment and moving fast, and has worked at eBay, Forter, and Rookout (as the founding engineer). Itiel is a backend and infra developer turned “DevOps,” and an avid public speaker who loves talking about cloud infrastructure, Kubernetes, Python, observability, and R&D culture.

In my experience, here are tips that can help you better manage SIGKILL (Signal 9) in Linux containers:

Grace period adjustment

Customize the SIGTERM grace period in Kubernetes to suit your application’s shutdown requirements, ensuring clean termination before SIGKILL is sent.
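
As a minimal sketch, assuming a hypothetical pod named shutdown-demo and a placeholder image, the grace period is set per pod via terminationGracePeriodSeconds:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: shutdown-demo                 # hypothetical pod name
spec:
  terminationGracePeriodSeconds: 60   # default is 30; raise it if shutdown needs longer
  containers:
  - name: app
    image: my-registry/my-app:1.0     # placeholder image
EOF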

Signal trapping

Implement signal trapping in your containerized applications to handle SIGTERM and perform necessary cleanup.
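
A minimal entrypoint sketch, assuming a hypothetical binary at /usr/local/bin/my-app; the cleanup body is a placeholder for whatever your application actually needs to do:

#!/bin/bash
# Hypothetical container entrypoint that handles SIGTERM

cleanup() {
  echo "SIGTERM received, cleaning up..."
  # flush buffers, close connections, etc. (application-specific)
  exit 143   # 128 + 15, the conventional exit code after SIGTERM
}
trap cleanup TERM

# Run the workload in the background and wait on it,
# so the shell can receive signals while the app runs
/usr/local/bin/my-app &   # placeholder binary
wait $!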

Health checks

Configure robust health checks and readiness probes to prevent sending SIGKILL to healthy containers.
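
A sketch of HTTP probes, assuming the app exposes a /healthz endpoint on port 8080 (the pod name, path, and port are placeholders):

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo                # hypothetical
spec:
  containers:
  - name: app
    image: my-registry/my-app:1.0 # placeholder
    readinessProbe:               # stop routing traffic to pods that aren't ready
      httpGet:
        path: /healthz            # assumes the app exposes a health endpoint
        port: 8080
      periodSeconds: 5
    livenessProbe:                # restart only when the app is truly unresponsive
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 10
      failureThreshold: 3
EOF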

Resource limits

Set appropriate resource limits and rightsize workloads to reduce OOMKilled errors, which often surface as SIGKILL.

For teams trying to balance reliability with spend, Kubernetes cost optimization is closely tied to setting requests and limits based on actual usage, not guesswork.
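
A minimal sketch with placeholder numbers; in practice the values should come from observed usage:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: limits-demo               # hypothetical
spec:
  containers:
  - name: app
    image: my-registry/my-app:1.0 # placeholder
    resources:
      requests:
        memory: "256Mi"           # base these numbers on observed usage
        cpu: "250m"
      limits:
        memory: "512Mi"           # exceeding this triggers the OOM killer (SIGKILL)
EOF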

Log analysis

Regularly analyze logs for SIGTERM and SIGKILL signals to identify patterns and improve system reliability.

At scale, this kind of repetitive investigation is one of the clearest use cases for AI SRE and for an AI SRE agent that reduces MTTR and operational toil.

Using the Kill -9 Command

If you are a Unix/Linux user, here is how to kill a process directly:

  1. List currently running processes. The command ps aux shows a detailed list of all running processes belonging to all users and system daemons.
  2. Identify the process ID (PID) of the process you need to kill.
  3. Do one of the following (see the example below):
    Use the kill [PID] command to try killing the process with the SIGTERM signal
    Use the kill -9 [PID] command to kill the process immediately with the SIGKILL signal
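
For example, using a placeholder PID of 1234:

sleep 1000 &              # stand-in for a stuck process
ps aux | grep '[s]leep'   # find its PID in the second column
kill 1234                 # try SIGTERM first (1234 is a placeholder PID)
kill -9 1234              # escalate to SIGKILL only if the process won't exit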

When Should you Use SIGKILL as a Unix/Linux User?

SIGKILL kills a running process instantly. For simple programs this can be safe, but most programs are complex and made up of multiple procedures. Even seemingly insignificant programs perform transactional operations that must be cleaned up before the program exits.

If a program hasn't completed its cleanup when it receives the SIGKILL signal, data may be lost or corrupted. You should use SIGKILL only in the following cases:

  • A process has a bug or issue during its cleanup process
  • You don’t want the process to clean itself up, to retain data for troubleshooting or forensic investigation
  • The process is suspicious or known to be malicious
 

How Can You Send SIGKILL to a Container in Kubernetes?

If you are a Kubernetes user, you can send a SIGKILL to a container by terminating a pod using the kubectl delete command.

Kubernetes will first send the containers in the pod a SIGTERM signal. By default, Kubernetes gives containers a 30-second grace period, and afterwards sends a SIGKILL to terminate them immediately.
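
For example, for a hypothetical pod named web-1:

kubectl delete pod web-1                            # SIGTERM, then SIGKILL after the grace period
kubectl delete pod web-1 --grace-period=0 --force   # skip the wait; SIGKILL is sent almost immediately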

If your team is tracing shutdowns like this across many clusters and workloads, Komodor’s AI SRE Platform is built for visualizing, troubleshooting, and optimizing Kubernetes environments at scale.

The Kubernetes Pod Termination Process and SIGKILL

When Kubernetes performs a scale-down event or updates pods as part of a Deployment, it terminates containers in three stages:

  1. The kubelet asks the container runtime to send the container's stop signal, usually SIGTERM, to let the application shut down gracefully.
  2. By default, Kubernetes gives containers a grace period of 30 seconds to shut down. This value is customizable.
  3. If the container does not exit before the grace period ends, the kubelet sends a SIGKILL signal, which shuts the container down immediately.

It is important to realize that while you can capture a SIGTERM in the container's logs, by definition you cannot capture a SIGKILL, because it terminates the container immediately.

Which Kubernetes Errors are Related to SIGKILL?

Not every pod shutdown follows the same path in Kubernetes. In the normal termination flow, Kubernetes starts with a graceful shutdown and only escalates to SIGKILL if the container is still running after the termination grace period expires. By default, that grace period is 30 seconds, but it can be changed with terminationGracePeriodSeconds.

1) Graceful pod termination

In a standard pod deletion, rollout, or scale-down event, the kubelet begins shutdown by running any preStop hook first, if one is defined, and then asking the container runtime to send the stop signal to PID 1 in each container. In most cases that is SIGTERM, which gives the application a chance to finish in-flight work and clean up before exiting.
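
A sketch of that sequence, assuming a hypothetical drain script at /usr/local/bin/drain.sh; the preStop hook runs first, then the stop signal is sent:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: prestop-demo                  # hypothetical
spec:
  terminationGracePeriodSeconds: 45   # must cover the hook plus app shutdown
  containers:
  - name: app
    image: my-registry/my-app:1.0     # placeholder
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "/usr/local/bin/drain.sh"]  # hypothetical drain script
EOF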

2) Forced termination after the grace period

If the container is still running when the grace period ends, Kubernetes escalates to forcible shutdown. At that point, the container runtime sends SIGKILL to any remaining processes in the pod. This is the path most teams mean when they say a container was “killed by Kubernetes” after failing to stop cleanly.

3) OOMKilled is a separate path

OOMKilled should be treated as a separate failure mode, not as a normal graceful-then-forced shutdown sequence. In practice, teams often see it as exit code 137, which is commonly associated with SIGKILL, but the troubleshooting flow is different: instead of asking “why didn’t the app exit during termination?”, the better question is “why did the container exceed available memory or hit its memory limit?”

To troubleshoot the difference:

  • look for a pod in Terminating during normal shutdown flow
  • check whether a preStop hook or a short grace period delayed clean termination
  • verify whether the container exited gracefully with 143 or was force-killed with 137
  • if the pod status shows OOMKilled, investigate memory pressure and limits first rather than treating it like a standard rollout or delete event.
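
These checks map onto a few kubectl commands; the pod name my-pod is a placeholder:

kubectl describe pod my-pod | grep -A 5 'Last State'   # reason (e.g. OOMKilled) and exit code
kubectl get pod my-pod -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
kubectl get events --field-selector involvedObject.name=my-pod   # kill and OOM events for the pod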

When SIGKILL is tied to memory pressure or badly sized requests and limits, the fix is usually not just incident response but better Kubernetes rightsizing and broader Kubernetes cost optimization practices.

If you are tuning managed clusters, see our guides to EKS cost optimization and GKE cost optimization for cloud-specific ways to reduce waste without trading away reliability.

How Does SIGKILL Impact NGINX Ingress Controllers?

Just like Kubernetes can send a SIGTERM or SIGKILL signal to shut down a regular container, it can send these signals to an NGINX Ingress Controller pod. However, NGINX handles signals in an unusual way:

  • When receiving SIGTERM or SIGINT – NGINX performs a fast shutdown. The master process instructs the worker processes to exit, waits only 1 second, and then sends them a SIGKILL signal.
  • When receiving QUIT – NGINX performs a graceful shutdown. It closes the listening port to avoid receiving more requests, closes idle connections, and only exits after all worker processes exit.

And so, in a sense, NGINX treats SIGTERM and SIGINT like SIGKILL. If the controller is processing requests when the signal arrives, it drops those connections, resulting in HTTP server errors. To prevent this, you should always shut down the NGINX Ingress Controller by sending it the QUIT signal.

Shutting down the NGINX Ingress Controller with QUIT instead of SIGTERM

In the standard nginx-ingress-controller image (version 0.24.1), there is a command that can send NGINX the appropriate termination signal. Run this script to shut down NGINX gracefully by sending it a QUIT signal:

/usr/local/openresty/nginx/sbin/nginx -c /etc/nginx/nginx.conf -s quit
while pgrep -x nginx; do
    sleep 1
done
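
One way to wire this in automatically, sketched below, is to run the same quit-and-wait sequence as a preStop hook in the controller's pod spec, so Kubernetes triggers the graceful QUIT path before any stop signal reaches NGINX. The pod name, grace period, and image tag here are illustrative:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: ingress-demo                  # hypothetical; in practice this lives in the controller Deployment
spec:
  terminationGracePeriodSeconds: 60   # must outlast the drain loop below
  containers:
  - name: nginx-ingress-controller
    image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.24.1  # assumed tag
    lifecycle:
      preStop:
        exec:
          command:
          - /bin/sh
          - -c
          - /usr/local/openresty/nginx/sbin/nginx -c /etc/nginx/nginx.conf -s quit; while pgrep -x nginx; do sleep 1; done
EOF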

Under the Hood: How the SIGKILL Signal Works

SIGKILL is handled entirely by the operating system (the kernel). When a SIGKILL is sent to a particular process, the kernel scheduler immediately stops giving the process CPU time to execute user-space code. If the process has threads executing code on different CPUs or cores, those threads are stopped as well.

What happens when a process is killed while executing kernel code?

When a SIGKILL signal is delivered while a process or thread is executing system calls or I/O operations, the kernel switches the process to a “dying” state. The kernel continues to schedule CPU time so the dying process can finish its in-flight kernel work.

Non-interruptible operations run until they complete, and the kernel checks for the “dying” status before letting the process execute more user-space code. Interruptible operations terminate early once they detect the process is “dying”. When all operations are complete, the process is given “dead” status.

What happens when a process is marked “dead”?

When kernel operations complete, the kernel starts cleaning up the process, just as when a program exits normally. The process is given an exit status above 128 (128 plus the signal number), indicating that it was killed by a signal. A process killed by SIGKILL has no chance to handle the signal it received.
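
You can observe the 128 + 9 = 137 convention from any bash shell:

sleep 30 &     # throwaway background process
kill -9 $!     # kill it with signal 9
wait $!        # collect its exit status
echo $?        # prints 137, i.e. 128 + 9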

At this stage the process transitions to “zombie” status, and the parent process is notified with the SIGCHLD signal. Zombie status means that the process has been killed, but the parent process can still read the dead process's exit code using the wait(2) system call. The only resource a zombie process consumes is a slot in the process table, which stores the process ID, exit code, and other statistics useful for troubleshooting.

If a zombie process persists for more than a few minutes, this probably indicates an issue with its parent process, which is failing to reap it.

Troubleshooting Kubernetes Pod Termination with Komodor

SIGKILL issues are rarely just about one signal. In real environments, they sit at the intersection of application shutdown behavior, resource tuning, and platform-team response time. That is where AI SRE and a Kubernetes platform become useful, because they connect troubleshooting, optimization, and operational scale instead of treating them as separate problems.

This is the reason why we created Komodor, a tool that helps dev and ops teams stop wasting their precious time looking for needles in (hay)stacks every time things go wrong.

Acting as a single source of truth (SSOT) for all of your k8s troubleshooting needs, Komodor offers:

  • Change intelligence: Every issue is a result of a change. Within seconds we can help you understand exactly who did what and when.
  • In-depth visibility: A complete activity timeline, showing all code and config changes, deployments, alerts, code diffs, and pod logs. All within one pane of glass with easy drill-down options.
  • Insights into service dependencies: An easy way to understand cross-service changes and visualize their ripple effects across your entire system.
  • Seamless notifications: Direct integration with your existing communication channels (e.g., Slack) so you’ll have all the information you need, when you need it.

FAQs About SIGKILL (Signal 9) in Linux and Kubernetes

What is SIGKILL (Signal 9)?

SIGKILL is a Unix/Linux signal that immediately and forcibly terminates a process. Unlike other signals, it cannot be ignored, blocked, or handled by the process itself. The operating system kernel executes it directly. It kills the process and all its threads instantly. Because it bypasses graceful shutdown, it can cause data loss or corruption and should only be used as a last resort.

What is the difference between SIGTERM and SIGKILL?

SIGTERM (Signal 15) is a gentle termination request that a process can catch, handle, or delay, allowing it to clean up before exiting. SIGKILL (Signal 9) is an immediate, unconditional kill that cannot be intercepted or ignored. Kubernetes always sends SIGTERM first, giving containers a 30-second grace period by default, and only sends SIGKILL if the container hasn't exited by then.

What happens when Kubernetes terminates a pod?

When Kubernetes terminates a pod, it sends SIGTERM to all containers in the pod first. Containers have a default 30-second grace period to shut down cleanly. If a container is still running after that window expires, the kubelet sends a SIGKILL signal, which forces immediate termination. This grace period is configurable via terminationGracePeriodSeconds in the pod spec.

What exit code indicates a container was killed by SIGKILL?

A container forcefully terminated by SIGKILL exits with exit code 137 (128 + signal number 9). If a container shut down gracefully in response to SIGTERM, it exits with exit code 143 (128 + 15). Checking the exit code via kubectl describe pod is the fastest way to determine whether a container was killed forcefully or terminated on its own terms.

When should you use SIGKILL?

Use SIGKILL only when: a process is stuck in its cleanup routine and won't respond to SIGTERM; you intentionally want to preserve the process state for forensic investigation; or the process is suspected to be malicious. For normal shutdowns, always attempt SIGTERM first. Skipping SIGTERM and jumping straight to SIGKILL risks data corruption or incomplete transactional operations.

Can a container catch or log SIGKILL?

No. Because SIGKILL immediately terminates the container at the kernel level, there is no opportunity for the container to log the event. You can capture SIGTERM in logs and should handle it in your application code, but SIGKILL leaves no application-level trace. To investigate SIGKILL events, check pod-level events using kubectl describe pod or host-level process monitoring.