
Kubernetes Resource Exhaustion: How to Secure Your Clusters

Matthias Luft
February 27, 2025

Kubernetes is Everywhere, But Are You Protecting Your Resources?

Kubernetes has become the de facto standard for container orchestration, powering mission-critical workloads across industries. However, resource exhaustion is a growing challenge: without effective control of resource consumption, both stability and security are at risk.

Kubernetes shares critical resources—such as CPU, memory, and storage—between the host and its pods. Unfortunately, Kubernetes does not enforce resource limits by default, meaning any pod can potentially consume all available resources—bringing down critical workloads, causing service disruptions, or leaving the door open for attackers to exploit these gaps.

The Risk: Pods Can Exhaust Critical Resources

Without explicit resource limits, Kubernetes environments are vulnerable to various resource exhaustion attacks:

  • Disk Space Exhaustion: A pod can fill up the entire disk, preventing other processes from writing to it.
  • Memory Exhaustion: Unrestricted memory usage can lead to OOM (Out of Memory) conditions, killing critical processes.
  • Process ID (PID) Exhaustion: A pod can spawn processes indefinitely (fork bombing), exhausting the system’s PID limit and preventing new services from being started or restarted.
  • Inode Exhaustion: A lesser-known but serious issue where excessive file creation fills up inode tables, rendering the node unresponsive and preventing new file creation. Two inode pools matter here: the host filesystem and the overlay filesystem. If a pod has a host volume mounted, it can create files on the host filesystem and exhaust the host’s inodes. The native pod filesystem is an overlay filesystem whose inode count is shared among all pods on the host, making exhaustion there almost as critical.

These attacks can render nodes unusable, forcing pods to be rescheduled and potentially cascading failures throughout the cluster. Unrestricted resource access also exposes Kubernetes clusters to cryptojacking, allowing attackers to exploit your infrastructure for cryptocurrency mining.

Demonstrating Kubernetes DoS Attacks

The open-source project dostainer demonstrates denial-of-service (DoS) techniques in Kubernetes. For instance, it highlights how unbounded resource consumption can take down nodes via CPU, memory, fork bombing, and inode exhaustion attacks. Notably, inode exhaustion is a form of Kubernetes resource exhaustion that is particularly hard to mitigate, as there is no built-in mechanism to restrict inode usage per pod.

Preventing Kubernetes Resource Exhaustion: Hardening Strategies

By default, Kubernetes lacks enforced resource limits, leaving clusters vulnerable to resource exhaustion attacks. However, you can—and should—explicitly define them in your deployment configurations. Many modern linters and misconfiguration checkers flag missing resource requests and limits as a security risk, reinforcing the importance of setting them proactively.
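In addition to per-pod settings, a namespace-level ResourceQuota can cap aggregate consumption; once compute quotas are in place, Kubernetes also rejects new pods in that namespace that omit the corresponding requests or limits. A minimal sketch (the quota name and values below are illustrative and not part of the dostainer examples):

apiVersion: v1
kind: ResourceQuota
metadata:
  name: exhaustion-guard          # illustrative name
  namespace: dostainer-testing
spec:
  hard:
    requests.memory: 1Gi          # sum of memory requests across all pods in the namespace
    limits.memory: 2Gi            # sum of memory limits across the namespace
    requests.ephemeral-storage: 10Gi
    limits.ephemeral-storage: 20Gi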

To secure your cluster, apply the following hardening measures, as outlined in dosploy.yaml.

1. Memory Restrictions to Prevent Kubernetes Resource Exhaustion

Kubernetes allows you to control how much memory a container can request and consume, preventing pods from monopolizing resources. Memory requests define the minimum guaranteed allocation for a container, while memory limits cap its maximum usage. If a container exceeds its memory limit, it is terminated and may be restarted, ensuring that runaway processes do not degrade cluster stability.

Below is an example configuration applying explicit memory requests and limits for each pod to prevent excessive consumption:

apiVersion: v1
kind: Pod
metadata:
  name: dostainer-demo
  namespace: dostainer-testing
spec:
  containers:
    - name: dostainer-demo-ctr
      image: uchimata/dostainer
      resources:
        requests:
          memory: "100Mi"
          ephemeral-storage: "2Gi"
        limits:
          memory: "200Mi"
          ephemeral-storage: "4Gi"
      command: ["/app/fill-memory.sh"]
      args: ["5"]

2. Ephemeral Storage Limits in Kubernetes Resource Management

Kubernetes provides mechanisms to control how much local ephemeral storage a pod can consume. Without these limits, runaway processes can exhaust node storage, leading to system instability. To effectively restrict storage usage, you need to manage:

  • Pod filesystem (“disk”) – Temporary storage used by the container.
  • Log storage – Space consumed by container logs.
  • Mounted volumes – Ephemeral storage volumes (emptyDir) used within the pod.

All of these can be controlled in the pod configuration through resource requests and limits:

apiVersion: v1
kind: Pod
metadata:
  name: dostainer-demo
  namespace: dostainer-testing
spec:
  containers:
    - name: dostainer-demo-ctr
      image: uchimata/dostainer
      resources:
        requests:
          memory: "100Mi"
          ephemeral-storage: "2Gi"
        limits:
          memory: "200Mi"
          ephemeral-storage: "4Gi"
      command: ["/app/fill-memory.sh"]
      args: ["5"]
      volumeMounts:
        - name: ephemeral
          mountPath: "/tmp"
  volumes:
    - name: ephemeral
      emptyDir:
        sizeLimit: 500Mi

Best Practice: If you have a more complex disk or volume setup, review the local ephemeral storage documentation to ensure proper enforcement.

3. Pod PID Limits to Avoid Kubernetes Resource Exhaustion

Kubernetes provides process ID (PID) limits to prevent a single pod from exhausting system PIDs. There are two key controls:

  • Node PID Limit – Reserve PIDs for the host system, ensuring that one pod cannot starve other workloads of system resources.
  • Per Pod PID Limit – Restrict the number of PIDs a single pod can spawn, preventing excessive process creation (e.g., fork bombs).

Challenges with Per-Pod Limits

Kubernetes enforces per-pod PID limits at the kubelet level (unlike CPU or memory limits, which are set per container in the pod spec), applying the same PID cap (podPidsLimit) to every pod on a node. This one-size-fits-all approach can be challenging because different applications have different concurrency models and, therefore, different PID requirements. Fine-tuning PID limits to match your application’s needs goes beyond resource management: it can also act as a security measure, restricting an attacker’s ability to spawn additional processes – so we will keep our eyes open for future per-pod configuration options.

The limit is configured via the kubelet configuration parameter podPidsLimit (or the --pod-max-pids command-line flag).
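A minimal kubelet configuration sketch is shown below; the PID values are illustrative and need to be tuned to your workloads’ concurrency requirements:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Cap the number of PIDs any single pod may use (illustrative value)
podPidsLimit: 4096
# Reserve PIDs for system daemons and Kubernetes components so pods
# cannot starve the host of process IDs (illustrative values)
systemReserved:
  pid: "1000"
kubeReserved:
  pid: "1000"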

4. Mitigating Inode Exhaustion in Kubernetes Resource Allocation

Currently, Kubernetes lacks a built-in method to restrict inode usage per pod. However, there are a few workarounds:

  • Use loopback mounts per container – This can effectively limit file creation, but adds complexity and operational overhead.
  • Implement strict logging and monitoring – Detecting unusual file creation patterns can help identify inode exhaustion attempts early.

Implementing these solutions requires significant operational effort and may not be feasible in all environments. You can, however, set a node-level threshold on inode usage to protect the functionality of the node as a whole – see the next section for details.

5. Pod Eviction to Manage Kubernetes Resource Exhaustion

The limits above can cause processes within a pod to hit restrictions, leading to application failure. Once you have tuned the hardening options above for your workloads, also consider the kubelet’s node-pressure eviction mechanism. When a node’s configured eviction thresholds (for memory, disk space, inodes, or PIDs) are crossed, the kubelet evicts pods, moving them to a failed state. This can serve as a second layer of protection against resource exhaustion, should the per-pod limits not be sufficient.
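For example, hard eviction thresholds can be set in the kubelet configuration; the signals below include the node-level inode threshold mentioned in the previous section (values are illustrative):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "200Mi"    # evict when available node memory drops below this
  nodefs.available: "10%"      # free disk space on the node filesystem
  nodefs.inodesFree: "5%"      # free inodes on the node filesystem
  pid.available: "10%"         # free process IDs on the node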

Understand Your Isolation and Tenancy Requirements

Kubernetes is designed for multi-tenancy, but proper isolation requires a strong understanding of its underlying principles. Evaluate the protection needs of your different workloads to design a reasonable tenancy model.

How Can Averlon Help?

Managing multiple Kubernetes clusters in a complex cloud environment is an immense challenge for human reviewers. Each cluster can contain hundreds or thousands of workloads, each with unique configurations, resource limits, and dependencies. As the number of clusters scales, maintaining visibility into resource allocations and ensuring strong tenancy boundaries becomes increasingly difficult.

Averlon provides automated visualization and analysis, offering much-needed clarity by:

  • Detecting missing resource limits across all clusters to prevent overconsumption.
  • Mapping workload relationships to identify dependencies and enforce tenancy boundaries.
  • Providing real-time insights instead of relying on static documentation or fragmented data.

With interactive dashboards and automated insights, teams can proactively manage resource exhaustion risks, optimize infrastructure, and prevent costly misconfigurations—without the manual overhead. Averlon ensures that Kubernetes environments remain resilient, cost-efficient, and secure at scale.

Learn more about the Averlon Platform and see how it simplifies cloud security at scale.

Ready to Reduce Cloud Security Noise and Act Faster?

Discover the power of Averlon’s AI-driven insights. Identify and prioritize real threats faster and drive a swift, targeted response to regain control of your cloud. Shrink the time to resolution for critical risk by up to 90%.
