How to Scale Kubernetes Pods with Kubectl Scale Deployment

What is Kubectl Scale Deployment? 

If you are working with Kubernetes, you will need to scale resources up or down to meet the changing demands of your workloads. One way to do this is with the kubectl scale deployment command, a tool that enables administrators and operators to manage the number of replicas for a specific deployment in a Kubernetes cluster.

kubectl is a command-line interface (CLI) that allows users to interact with their Kubernetes clusters. The kubectl scale deployment command lets you adjust the capacity of your application based on demand: you can increase or decrease the number of replicas for a deployment to ensure optimal resource utilization and high availability.

This is part of a series of articles about Kubectl Cheat Sheet

Kubectl Scale Command: Basic Usage 

The most basic usage of the kubectl scale command is as follows:

$ kubectl scale deployment --replicas=5 my-app

This command sets the number of replicas of a deployment named my-app to five. You can target other Kubernetes resources by replacing deployment with the relevant resource type:

$ kubectl scale replicaset --replicas=5 dep-nginx-6ffcc9c74d

$ kubectl scale --replicas=5 replicationcontroller/my-replication-controller

$ kubectl scale --replicas=5 statefulset.apps/my-statefulset

If you don’t have a replication controller running in your cluster, you can use the following YAML to create one:

apiVersion: v1
kind: ReplicationController
metadata:
  name: my-replication-controller
spec:
  replicas: 3
  selector:
    app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-container
        image: nginx:latest
        ports:
        - containerPort: 80

Save the above YAML code in a file called replication_controller.yaml and then issue the following command to create it:

$ kubectl apply -f replication_controller.yaml
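To confirm that the ReplicationController was created and has reached its desired replica count, you can query it directly; the DESIRED and CURRENT columns should both show 3 (the count from the YAML above) once all pods are up:

$ kubectl get replicationcontroller my-replication-controller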

Tips from the expert

Itiel Shwartz

Co-Founder & CTO

Itiel is the CTO and co-founder of Komodor. He’s a big believer in dev empowerment and moving fast, and has worked at eBay, Forter, and Rookout (as the founding engineer). Itiel is a backend and infra developer turned “DevOps”, and an avid public speaker who loves talking about cloud infrastructure, Kubernetes, Python, observability, and R&D culture.

In my experience, here are tips that can help you better scale Kubernetes pods with kubectl scale deployment:

Understand Your Workload Requirements

Before scaling, analyze your application’s workload patterns and resource requirements. This ensures efficient scaling and optimal resource utilization.

Use Autoscaling for Dynamic Workloads

Implement Horizontal Pod Autoscaler (HPA) for dynamic scaling based on CPU/memory usage or custom metrics. This automates scaling and maintains performance during varying loads.
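For example, assuming the metrics-server add-on is available in your cluster, the kubectl autoscale command can attach an HPA to the my-app deployment used earlier. The 70% CPU target and the replica bounds below are illustrative values, not recommendations:

$ kubectl autoscale deployment my-app --cpu-percent=70 --min=2 --max=10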

Set Resource Requests and Limits

Define appropriate resource requests and limits for your pods. This helps the scheduler make informed decisions and prevents resource contention or over-provisioning.
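As a minimal sketch, requests and limits are declared per container in the pod spec; the values below are placeholders that should be tuned to your application’s measured usage:

spec:
  containers:
  - name: my-container
    image: nginx:latest
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 256Mi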

Monitor Resource Usage

Continuously monitor resource usage using tools like Prometheus and Grafana. Set up alerts for high or low resource utilization to proactively manage scaling.
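For a quick command-line check alongside Prometheus and Grafana (this assumes the metrics-server add-on is installed), kubectl can report live resource usage per pod:

$ kubectl top pods

$ kubectl top pods -l app=my-app

The -l flag filters by label, matching the app: my-app label used in the examples above.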

Perform Load Testing

Conduct load testing to understand how your application behaves under different loads. Use tools like Locust or JMeter to simulate traffic and determine optimal scaling thresholds.
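As a lightweight alternative, you can generate test traffic from inside the cluster with a temporary busybox pod; my-app-service below is a hypothetical Service name exposing your deployment, so substitute your own:

$ kubectl run load-generator --rm -it --image=busybox:1.36 --restart=Never -- /bin/sh -c "while true; do wget -q -O- http://my-app-service; done"

Press Ctrl+C to stop; the --rm flag deletes the pod when it exits.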

Kubectl Scale Deployment Use Cases 

Scaling deployments is a crucial aspect of managing containerized applications in production environments. Here are some use cases where kubectl scale deployment can be helpful:

Handling Traffic Spikes

During peak demand, such as sales events or holiday seasons, your application may experience sudden traffic spikes. With kubectl scale deployment, you can quickly increase the number of replicas for your application’s pods to handle the additional load without any downtime.

Maintaining High Availability

By scaling up the number of replicas with kubectl scale deployment, you can minimize the impact of node failures or other issues affecting individual pod instances. This helps maintain uninterrupted service availability and enhances reliability.

Adjusting Horizontal Auto-Scaling

Kubernetes supports horizontal auto-scaling through its Horizontal Pod Autoscaler (HPA). HPA automatically adjusts the number of pod replicas based on predefined metrics like CPU usage, or custom metrics defined by developers or operators. However, if you need to make a quick adjustment in response to an unexpected event, or to test the impact of scaling on your application’s performance, using kubectl scale deployment can be helpful.
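Before scaling manually, it is worth checking whether an HPA is already managing the deployment, since an active HPA may revert a manual kubectl scale shortly afterwards:

$ kubectl get hpa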

Optimizing Resource Utilization

By monitoring the resource utilization of individual pods and adjusting their replica count with kubectl scale deployment, you can better balance resources within your cluster and optimize overall efficiency.

Performing Load Testing

Increasing the number of replicas for your application’s pods with kubectl scale deployment can simulate a higher load on your cluster and test its performance under stress. This can help you identify potential bottlenecks and optimize your cluster’s configuration for better scalability.

Learn more in our detailed guide to Kubectl restart pod

Quick Tutorial: Working With the kubectl Scale Deployment Command

Here is a step-by-step tutorial that shows how to work with the kubectl scale deployment command.

Step 1: Create a Deployment YAML File

The first part of the process involves creating a Kubernetes Deployment. The Deployment configuration is usually defined in a YAML file. Here is a sample configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: nginx
          image: nginx:latest

In this YAML file, we define a Deployment called my-deployment that runs a single pod (as specified by replicas: 1), containing one container based on the nginx:latest Docker image.

Step 2: Apply the Deployment

Save the above YAML in a file called my-deployment.yaml. To apply the Deployment, use the kubectl apply command, which creates the Deployment as per the configuration defined in the YAML file. The -f flag specifies the file that contains the Deployment configuration.

$ kubectl apply -f my-deployment.yaml

Related content: Read our guide to kubectl apply

Step 3: Check the Deployment

The kubectl get pods command can be used to check the status of your pods and see if the Deployment has been successfully created.

$ kubectl get pods
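With replicas: 1, you should see a single pod, with output similar to the following; the pod name suffix is generated by Kubernetes and will differ in your cluster:

NAME                             READY   STATUS    RESTARTS   AGE
my-deployment-5f8c7d9b6d-x2k4p   1/1     Running   0          30s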

Step 4: Scale Up the Deployment

To increase the number of pods in your Deployment, use the kubectl scale command, followed by --replicas, and specify the desired number of replicas. The deployment.apps/my-deployment at the end specifies the resource type and name of the Deployment that you want to scale.

$ kubectl scale --replicas=6 deployment.apps/my-deployment

After running this command, Kubernetes will adjust the Deployment to have six replicas.
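You can also observe the scaling at the Deployment level rather than per pod; kubectl rollout status blocks until all replicas are available:

$ kubectl get deployment my-deployment

$ kubectl rollout status deployment/my-deployment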

Step 5: Check the Deployment Again

You can check the status of your pods again to confirm that the Deployment has been successfully scaled up.

$ kubectl get pods

Step 6: Scale Down the Deployment

Scaling down the Deployment works the same way as scaling up. To reduce the number of pods in your Deployment, use the kubectl scale command, followed by --replicas, and specify the new desired number of replicas.

$ kubectl scale --replicas=4 deployment.apps/my-deployment

After running this command, Kubernetes will adjust the Deployment to have four replicas.
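When scaling from scripts or automation, kubectl scale also accepts a --current-replicas precondition; the change is applied only if the current size matches, which guards against racing another operator or an autoscaler:

$ kubectl scale --current-replicas=6 --replicas=4 deployment.apps/my-deployment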
And that’s it! This is how you can work with the kubectl scale deployment command to adjust the number of pods in a Deployment.

Scale Deployment with Komodor

Komodor gives developers full control over their applications, services, and deployments, allowing them to scale services, follow autoscaling events, and analyze their causes. Controlling services has never been easier:

  • Scale a deployment – Users can scale their deployment with a single click on the scale operation at the top of the services view.


  • Follow your scaling using Komodor – After you scale a deployment, Komodor tracks the scale event and indicates that all pods spawned successfully and were marked as ready by Kubernetes.


  • Track autoscaling events – Developers can easily see autoscaling events on the timeline. Each HPA change is tracked and can be accessed to view the full history of scaling changes, together with the relevant metrics from the time of the scaling, for analysis.
