maxSurge and maxUnavailable in Kubernetes

maxSurge determines how many additional pods can be created beyond the desired number of replicas during a rolling update. It lets Kubernetes temporarily exceed the target replica count so that new pods can start before old ones are removed, which helps keep the application available while a new version rolls out. New pods are guaranteed to be ready before any old pod is terminated only when maxUnavailable is 0; otherwise Kubernetes may take old pods offline right away, up to the maxUnavailable limit. If maxSurge is omitted, it defaults to 25%.
maxSurge can be specified as:

  • An absolute number (e.g., 2 means up to 2 extra pods).
  • A percentage of the desired replicas (e.g., 50% for half of the desired replicas). The computed value is rounded up.

maxUnavailable defines the maximum number of pods that can be unavailable (not ready) during a rolling update, that is, how many pods can be taken offline at once while Kubernetes replaces them with new ones. Kubernetes ensures that no more than this number of the desired pods are unavailable at any point in the update, so a lower value protects capacity while a higher value lets the update progress faster. If maxUnavailable is omitted, it defaults to 25%. maxSurge and maxUnavailable cannot both be 0.
maxUnavailable can be specified as:

  • An absolute number (e.g., 1 means only one pod can be unavailable).
  • A percentage of the desired replicas (e.g., 25% for one-fourth of the replicas). The computed value is rounded down.
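
Both settings live under spec.strategy.rollingUpdate in a Deployment manifest. The fragment below is a sketch that shows the percentage form and the absolute form side by side; the replica count and the values are assumptions chosen for illustration, not recommendations.

```yaml
# Fragment of a Deployment spec (not a complete manifest).
spec:
  replicas: 10               # assumed replica count, for illustration only
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%          # percentage form: 25% of 10 = 2.5, rounded up to 3 extra pods
      maxUnavailable: 1      # absolute form: at most 1 pod unavailable at a time
```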

Let’s explore how these parameters work together using an example.

We have a Deployment with 4 desired replicas, maxSurge set to 2, and maxUnavailable set to 1.
Here’s the YAML file for the Deployment:
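
A minimal manifest matching that description might look like the sketch below. The replica count, the two rolling update values, and the nginx:1.20 image come from the example; the Deployment name, labels, container name, port, and readiness probe are hypothetical and only there to make the manifest complete.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment          # hypothetical name
spec:
  replicas: 4                     # desired replicas
  selector:
    matchLabels:
      app: nginx                  # hypothetical label; must match the template labels
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2                 # up to 2 pods above the desired count (6 in total)
      maxUnavailable: 1           # at most 1 pod unavailable (at least 3 available)
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx             # hypothetical container name
          image: nginx:1.20       # image before the update
          ports:
            - containerPort: 80
          readinessProbe:         # assumed probe; readiness gates the rollout
            httpGet:
              path: /
              port: 80
```

Changing the image field to a newer tag and re-applying the manifest is what triggers the rolling update described next.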

Rolling Update Process
  1. Initial State: 4 pods are running with the image nginx:1.20.
  2. First Step: Kubernetes creates 2 new pods with the updated image (maxSurge: 2 allows up to 6 pods in total). Because maxUnavailable: 1 permits one pod to be unavailable, it can also terminate 1 old pod right away, without waiting for the new pods to become ready. That leaves 3 old pods and 2 new pods.
  3. Second Step: As the new pods pass their readiness checks, Kubernetes terminates more old pods, always keeping at least 3 pods available (4 desired replicas minus maxUnavailable: 1).
  4. Subsequent Steps: This process repeats: Kubernetes creates new pods as long as the total stays at or below 6, and removes old pods as long as at least 3 pods remain available. The exact interleaving depends on how quickly pods start and become ready. Gradually, all old pods are replaced with new ones.
  5. Final State: Once all old pods are terminated, the Deployment stabilizes with 4 new pods running the updated image.

When to Use Custom maxSurge and maxUnavailable Values
Use Case 1: High Availability Applications

For applications that must remain fully available (e.g., e-commerce websites or payment gateways), set:

  • High maxSurge (e.g., 50%) to ensure extra capacity during updates.
  • Low maxUnavailable (e.g., 0 or 1) to minimize downtime.
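
As a sketch of this profile, the strategy fragment below uses the values mentioned above. It assumes the cluster has enough spare capacity to schedule the surge pods; without that headroom the rollout can stall, because no old pod may be removed until a new one is ready.

```yaml
# Fragment of a Deployment spec (not a complete manifest).
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 50%          # allow up to half again as many pods during the update
      maxUnavailable: 0      # never drop below the desired number of available pods
```
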
Use Case 2: Resource-Constrained Environments

In environments with limited resources (e.g., small clusters or test environments), set:

  • Low maxSurge (e.g., 1) to avoid creating too many additional pods.
  • Higher maxUnavailable (e.g., 50%) to allow faster updates without consuming extra resources.
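
A corresponding sketch for this profile, again using the values mentioned above, could look like the fragment below. The trade-off is reduced serving capacity while the update is in progress.

```yaml
# Fragment of a Deployment spec (not a complete manifest).
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1            # at most 1 extra pod at any time
      maxUnavailable: 50%    # up to half of the desired pods may be unavailable
```
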
Conclusion

maxSurge and maxUnavailable are the key parameters for managing rolling updates in Kubernetes. By controlling the number of additional pods created (maxSurge) and the number of pods that can go offline (maxUnavailable), you can tailor the update process to your application’s requirements. Whether you’re working with a high-traffic production app or a resource-constrained test environment, understanding these parameters helps you strike the right balance between availability, update speed, and resource usage.

That’s all for now.
Thank you for reading!!

Stay tuned for more articles on Cloud and DevOps. Don’t forget to follow me for regular updates and insights.
