Troubleshooting Unhealthy DaemonSets

What Is a Kubernetes DaemonSet?

A Kubernetes DaemonSet is a type of Kubernetes object that ensures all nodes in a cluster, or a specific subset of nodes, run exactly one copy of a pod. When new eligible nodes are added to the cluster, the DaemonSet automatically runs the pod on them.

Typically, Kubernetes users don’t care where their pods run. But in some cases, it is important to have a pod running on every node. For example, this makes it possible to run a logging component on all nodes of a cluster. A DaemonSet makes this easy—you define a pod with the logging component and create the DaemonSet in the cluster, and the DaemonSet controller ensures the pod is running on every node.

This is part of our series of articles about Kubernetes troubleshooting.

How Do DaemonSets Work?

A DaemonSet is an active Kubernetes object managed by a controller. You declare the desired state, namely that a particular pod should exist on every eligible node. The controller's reconciliation loop compares that desired state with the currently observed state. If an eligible node does not have a matching pod, the DaemonSet controller creates one.

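This automated process covers existing nodes and all newly created nodes. In older Kubernetes versions, the DaemonSet controller bound its pods to nodes itself and bypassed the Kubernetes scheduler. In current versions, the controller adds a node affinity for the target node to each pod, and the default scheduler binds the pod to that node.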

By default, a DaemonSet creates pods on every node. If desired, you can use a node selector to limit the set of nodes it runs on. The DaemonSet controller then only creates pods on nodes whose labels match the nodeSelector field in the pod template of the YAML file.
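
As a rough sketch of the node selector mechanism, the snippet below builds a JSON merge patch that sets nodeSelector in the pod template and applies it with kubectl patch. The helper names and the label disktype=ssd are assumptions made for illustration; they do not come from the manifest in this article.

```shell
# Build a JSON merge patch that sets spec.template.spec.nodeSelector.
# The helper names and the example label are hypothetical.
node_selector_patch() {
  # $1 = label key, $2 = label value
  printf '{"spec":{"template":{"spec":{"nodeSelector":{"%s":"%s"}}}}}' "$1" "$2"
}

# Apply the patch so the DaemonSet only targets nodes carrying that label.
limit_daemonset_to_nodes() {
  # $1 = DaemonSet name, $2 = namespace, $3 = label key, $4 = label value
  kubectl patch daemonset "$1" -n "$2" --type merge \
    -p "$(node_selector_patch "$3" "$4")"
}

# Example (requires cluster access):
#   kubectl label node [node-name] disktype=ssd
#   limit_daemonset_to_nodes fluentd-elasticsearch kube-system disktype ssd
```

After the patch, pods on nodes without the label should be removed by the controller, and only labeled nodes keep a copy.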

DaemonSet vs. StatefulSet vs. Deployment

DaemonSets, StatefulSets and Deployments are three ways to deploy workloads in Kubernetes. All three of these are defined via YAML configuration, are created as an object in the cluster, and are then managed on an ongoing basis by a Kubernetes controller. There is a separate controller responsible for each of these objects.

The key differences between these three objects can be described as follows:

  • DaemonSets allow you to run one copy of a pod on every node in the cluster or on a certain subset of nodes. The pods do not have a persistent ID and do not necessarily have persistent storage.
  • StatefulSets allow you to run one or more pods with a persistent ID and persistent storage, suitable for stateful applications.
  • Deployments allow you to run one or more pods in a flexible configuration, defining how many replicas of the pods need to run, on which types of nodes they should schedule (for example via taints and tolerations), and the deployment strategy (Recreate or RollingUpdate).

How to Create a DaemonSet

To create a DaemonSet, you need to define a YAML manifest file and apply it to the cluster using kubectl apply.

The DaemonSet YAML file specifies the pod template that should be used to run pods on each node. It can also specify node selectors, affinity rules, or tolerations that determine which nodes DaemonSet pods can schedule on.

Here is an example of a DaemonSet manifest file. The example was shared in the Kubernetes documentation.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: fluentd-elasticsearch
        image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

A few important points about this code:

  • The DaemonSet has the name fluentd-elasticsearch and is running in the kube-system namespace.
  • The DaemonSet pods are defined in the spec.template field. Every pod will run the image quay.io/fluentd_elasticsearch/fluentd:v2.5.2.
  • The pod template must have labels that match spec.selector, in this case name: fluentd-elasticsearch.
  • DaemonSet pods must have RestartPolicy set to Always or unspecified (in this case it is not specified).
  • This DaemonSet has a toleration, defined in spec.template.spec.tolerations, which allows the pod to run on nodes tainted with node-role.kubernetes.io/master (control plane nodes).
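
To apply a manifest like the one above and wait for it to roll out, something along these lines may work. This is a sketch: the helper name deploy_daemonset, the file name daemonset.yaml, and the 120-second timeout are assumptions, not part of the original example.

```shell
# Apply a DaemonSet manifest, then wait until its pods have rolled out.
# The helper name, file name, and timeout are illustrative choices.
deploy_daemonset() {
  # $1 = manifest file, $2 = DaemonSet name, $3 = namespace
  kubectl apply -f "$1" &&
    kubectl rollout status "daemonset/$2" -n "$3" --timeout=120s
}

# Example (requires cluster access):
#   deploy_daemonset daemonset.yaml fluentd-elasticsearch kube-system
```

If the rollout does not finish within the timeout, kubectl rollout status exits with a non-zero status, which is a reasonable cue to start the diagnosis steps in the next section.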

Diagnosing Unhealthy DaemonSets

A DaemonSet is unhealthy if it doesn’t have one pod running per eligible node. Use the following steps to diagnose and resolve the most common DaemonSet issues.

However, note that DaemonSet troubleshooting can get complex and issues can involve multiple parts of your Kubernetes environment. For complex troubleshooting scenarios, you will need to use specialized tools to diagnose and resolve the problem.
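
One quick way to check the "one pod per eligible node" condition is to compare the desiredNumberScheduled and numberReady fields of the DaemonSet status. The helper below is a minimal sketch; the function name and output format are assumptions.

```shell
# Compare desired vs. ready pod counts from the DaemonSet status.
# Returns 0 when they match, 1 when they differ, 2 if kubectl fails.
daemonset_health() {
  # $1 = DaemonSet name, $2 = namespace
  counts="$(kubectl get daemonset "$1" -n "$2" \
    -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}')" || return 2
  desired="${counts% *}"   # text before the space
  ready="${counts#* }"     # text after the space
  if [ -n "$desired" ] && [ "$desired" = "$ready" ]; then
    echo "healthy: $ready/$desired pods ready"
  else
    echo "unhealthy: $ready/$desired pods ready"
    return 1
  fi
}

# Example (requires cluster access):
#   daemonset_health fluentd-elasticsearch kube-system
```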

1. List Pods in the DaemonSet

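Run this command to see all the pods in the DaemonSet. Use the label from the DaemonSet's pod template (for the manifest above, name=fluentd-elasticsearch) and the DaemonSet's namespace. If the DaemonSet is not in your current namespace, add -n [namespace] to the other kubectl commands in this section as well.

kubectl get pod -l [label-key]=[label-value] -n [namespace]

Identify which of the pods has a status of CrashLoopBackOff, Pending, or Evicted.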

For any pods that seem to be having issues, run this command to get more information about the pod:

kubectl describe pod [pod-name]

Or use this command to get logs for the pod:

kubectl logs [pod-name]
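
The steps above can be combined into a small script. The sketch below filters the pod list for anything that is not Running or Completed, then describes each problem pod and prints logs from its previous container run. The function names are hypothetical, and the filter assumes the default column layout of kubectl get pods (NAME, READY, STATUS, ...).

```shell
# Print "name status" for pods whose STATUS is not Running or Completed.
# Reads `kubectl get pods --no-headers` output on stdin.
problem_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Describe each problem pod and show recent logs from its previous run.
inspect_problem_pods() {
  # $1 = label selector (for example name=fluentd-elasticsearch), $2 = namespace
  kubectl get pods -l "$1" -n "$2" --no-headers | problem_pods |
    while read -r pod status; do
      echo "== $pod ($status) =="
      kubectl describe pod "$pod" -n "$2" </dev/null
      # --previous fails if the container never restarted; ignore that case.
      kubectl logs "$pod" -n "$2" --previous --tail=50 </dev/null || true
    done
}

# Example (requires cluster access):
#   inspect_problem_pods name=fluentd-elasticsearch kube-system
```

For a pod in CrashLoopBackOff, the logs of the previous container run are often more useful than those of the current one, which may have only just started.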

2. Check if Pods are Running Out of Resources

A common cause of CrashLoopBackOff or scheduling issues is a lack of resources on the node to run the pod.

To identify which node the pod is running on, run this command:

kubectl get pod [pod-name] -o wide

To view current resource usage on the node, get the node name from the previous command and run the following (this requires the Metrics Server to be installed in the cluster):

kubectl top node [node-name]
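
The two commands can be chained as in the sketch below, which also prints the "Allocated resources" section of kubectl describe node to show how much CPU and memory is already requested on that node. The function names are assumptions, and the number of lines kept by grep is a rough guess at the size of that section.

```shell
# Print the node a pod is scheduled on (empty if the pod is still Pending).
pod_node() {
  # $1 = pod name, $2 = namespace
  kubectl get pod "$1" -n "$2" -o jsonpath='{.spec.nodeName}'
}

# Show live usage (needs the Metrics Server) and the requests and limits
# already allocated on the pod's node.
node_pressure() {
  # $1 = pod name, $2 = namespace
  node="$(pod_node "$1" "$2")"
  [ -n "$node" ] || { echo "pod $1 is not assigned to a node" >&2; return 1; }
  kubectl top node "$node"
  kubectl describe node "$node" | grep -A 8 'Allocated resources'
}

# Example (requires cluster access):
#   node_pressure [pod-name] kube-system
```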

Use the following strategies to resolve the issue:

  • Reduce the requested CPU and memory of the DaemonSet.
  • Move some pods off the relevant nodes to free up resources.
  • Scale nodes vertically, for example by upgrading them to a bigger compute instance.
  • Taint the nodes that do not have sufficient resources to run the pod, and leave the matching toleration out of the DaemonSet manifest, so the DaemonSet does not run on those nodes.
  • If it is not essential to run exactly one pod per node, consider using a Deployment object instead. This will give you more control over the number and locations of pods running.
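
As an illustration of the first strategy, resource requests can be lowered in place with kubectl set resources. This is a sketch under assumptions: the helper name is hypothetical, and the values in the example are placeholders rather than recommendations. In practice you would usually also update the manifest file so the change is not lost on the next kubectl apply.

```shell
# Lower the resource requests of one container in a DaemonSet's pod template.
# The values passed in the example below are placeholders.
reduce_daemonset_requests() {
  # $1 = DaemonSet, $2 = namespace, $3 = container, $4 = CPU, $5 = memory
  kubectl set resources daemonset "$1" -n "$2" -c "$3" \
    --requests="cpu=$4,memory=$5"
}

# Example (requires cluster access):
#   reduce_daemonset_requests fluentd-elasticsearch kube-system \
#     fluentd-elasticsearch 50m 100Mi
```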

3. Debug Container Issues

If pods are running properly, there may be an issue with an individual container inside the pod. The first step is to check which image is specified in the DaemonSet manifest and make sure it is the right image.

If it is, gain shell access to the node and start a fresh container from the same image to inspect it interactively. For a node running Docker, and assuming the image includes bash, use this command:

docker run -ti --rm ${image} /bin/bash

Try to identify if there are application errors or configuration issues preventing the container from running properly.
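
If you prefer to stay within kubectl, the sketch below prints the images configured in the DaemonSet's pod template and opens a shell in a container of an existing pod. The function names are assumptions, and kubectl exec only works while the container is running.

```shell
# Print the name and image of every container in the DaemonSet's pod template.
daemonset_images() {
  # $1 = DaemonSet name, $2 = namespace
  kubectl get daemonset "$1" -n "$2" \
    -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{"\t"}{.image}{"\n"}{end}'
}

# Open a shell in a running container of an existing pod.
# Uses /bin/sh because many images do not ship bash.
shell_into_pod() {
  # $1 = pod, $2 = namespace, $3 = container
  kubectl exec -it "$1" -n "$2" -c "$3" -- /bin/sh
}

# Example (requires cluster access):
#   daemonset_images fluentd-elasticsearch kube-system
#   shell_into_pod [pod-name] kube-system fluentd-elasticsearch
```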

Kubernetes troubleshooting relies on the ability to quickly contextualize the problem with what’s happening in the rest of the cluster. More often than not, you will be conducting your investigation during fires in production. DaemonSet issues can involve issues related to pods, nodes, storage volumes, the underlying infrastructure, or a combination of these.

This is the reason why we created Komodor, a tool that helps dev and ops teams stop wasting their precious time looking for needles in (hay)stacks every time things go wrong.

Acting as a single source of truth (SSOT) for all of your k8s troubleshooting needs, Komodor offers:

  • Change intelligence: Every issue is a result of a change. Within seconds we can help you understand exactly who did what and when.
  • In-depth visibility: A complete activity timeline, showing all code and config changes, deployments, alerts, code diffs, pod logs, and more, all within one pane of glass with easy drill-down options.
  • Insights into service dependencies: An easy way to understand cross-service changes and visualize their ripple effects across your entire system.
  • Seamless notifications: Direct integration with your existing communication channels (e.g., Slack) so you’ll have all the information you need when you need it.

If you are interested in checking out Komodor, use this link to sign up for a Free Trial.
