
Resource Limits and Requests in Kubernetes

Author: Pyland
Published: 30.06.2026
Level: Advanced

Without resource constraints, a single “greedy” container can consume all the memory on a node and starve or crash the other pods running there. Requests and limits are the mechanism Kubernetes uses to distribute CPU and memory fairly across the cluster.

requests: Guaranteed Resources

A request is the minimum amount of resources that K8s guarantees to a container. The Scheduler uses requests to decide which node to place a Pod on.

If a Pod requests 500m CPU and 256Mi RAM →
the Scheduler looks for a node whose allocatable capacity, minus the requests of Pods already placed there, still has at least 500m CPU and 256Mi RAM

A request is a reservation. Even if the container uses less, those resources are “reserved” for it.
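A minimal sketch of the scheduling arithmetic, assuming a hypothetical two-container Pod (the Pod name, container names, and images are made up for illustration). For regular containers the Scheduler works with the sum of their requests, so this Pod needs a node with 600m CPU and 320Mi of memory still unreserved:

```yaml
# request-sum-demo.yaml (hypothetical example)
apiVersion: v1
kind: Pod
metadata:
  name: request-sum-demo
spec:
  containers:
    - name: app
      image: my-app:1.0
      resources:
        requests:
          cpu: "500m"
          memory: "256Mi"
    - name: helper
      image: my-helper:1.0
      resources:
        requests:
          cpu: "100m"
          memory: "64Mi"
# Effective Pod request used for placement:
#   cpu:    500m  + 100m = 600m
#   memory: 256Mi + 64Mi = 320Mi
```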

limits: Maximum Resources

A limit is the consumption ceiling. A container cannot exceed the limit.

  • CPU Limit: if the process wants more, it gets throttled (slowed down). The container is not killed.
  • Memory Limit: if the process exceeds the memory limit, the kernel kills the container for running out of memory. Its status shows OOMKilled, and the container is restarted according to the Pod's restartPolicy.
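To see the memory limit in action, the following sketch is adapted from the example in the Kubernetes documentation on assigning memory resources. It assumes the third-party polinux/stress image can be pulled by your cluster; the Pod name oom-demo is hypothetical. The container tries to allocate about 250 MB while its limit is 100Mi, so it should be killed for exceeding the limit and then restarted:

```yaml
# oom-demo.yaml (hypothetical name)
apiVersion: v1
kind: Pod
metadata:
  name: oom-demo
spec:
  containers:
    - name: stress
      image: polinux/stress   # assumption: third-party image, as used in the Kubernetes docs
      command: ["stress"]
      # One worker that allocates roughly 250 MB and holds on to it
      args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]
      resources:
        requests:
          memory: "50Mi"
        limits:
          memory: "100Mi"     # well below what the process tries to allocate
```

After the first restart, `kubectl get pod oom-demo -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'` is a convenient way to check the reason recorded for the last termination.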

CPU and Memory: Units of Measurement

CPU

CPU is measured in millicores (m):

Notation      Value
1 or 1000m    1 CPU core
500m          0.5 core
100m          0.1 core
250m          0.25 core (25% of one core)

resources:
  requests:
    cpu: "100m"   # 0.1 core guaranteed
  limits:
    cpu: "500m"   # maximum 0.5 core

Memory

Notation    Value
128Mi       128 mebibytes
1Gi         1 gibibyte (≈ 1.07 GB)
512M        512 megabytes (prefer Mi over M)

resources:
  requests:
    memory: "128Mi"
  limits:
    memory: "256Mi"
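The suffix matters more than it looks. Here is a short sketch of the arithmetic, written as comments on a resources fragment (the values are illustrative only):

```yaml
resources:
  limits:
    memory: "512Mi"    # 512 × 1024 × 1024 = 536,870,912 bytes
    # memory: "512M"   # 512 × 1000 × 1000 = 512,000,000 bytes, about 4.6% less than 512Mi
    # memory: "512m"   # lowercase m means "milli": 0.512 bytes, almost certainly a typo
```

The lowercase variant is accepted by the API, which is why it is worth double-checking the case of memory suffixes in reviews.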

Full YAML with Resources

# deployment-with-resources.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api
          image: my-api:1.0
          ports:
            - containerPort: 8000
          resources:
            requests:
              cpu: "200m"
              memory: "256Mi"
            limits:
              cpu: "1000m"
              memory: "512Mi"

        - name: sidecar-proxy
          image: envoy:v1.28
          resources:
            requests:
              cpu: "50m"
              memory: "64Mi"
            limits:
              cpu: "200m"
              memory: "128Mi"

QoS Classes: Guaranteed, Burstable, BestEffort

K8s automatically assigns a QoS (Quality of Service) class to each Pod. This determines the eviction order when a node runs out of resources.

Guaranteed

Every container in the Pod has CPU and memory requests and limits set, and each request equals its limit. Highest priority: evicted last.

resources:
  requests:
    cpu: "500m"
    memory: "256Mi"
  limits:
    cpu: "500m"      # equals requests
    memory: "256Mi"  # equals requests
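One detail worth knowing: if a container sets limits but omits requests, Kubernetes sets the requests equal to the limits, so the Pod still qualifies as Guaranteed. A minimal sketch, with a hypothetical Pod name and image:

```yaml
# guaranteed-limits-only.yaml (hypothetical example)
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-limits-only
spec:
  containers:
    - name: app
      image: my-app:1.0
      resources:
        limits:
          cpu: "500m"
          memory: "256Mi"
        # requests omitted on purpose: they default to the limits above
```

The assigned class is recorded in the Pod's status.qosClass field, so `kubectl get pod guaranteed-limits-only -o jsonpath='{.status.qosClass}'` should print Guaranteed for this Pod.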

Burstable

At least one container has a CPU or memory request or limit, but the Pod does not meet the Guaranteed criteria (for example, requests are lower than limits). Medium priority.

resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"

BestEffort

Neither requests nor limits are specified. Lowest priority — evicted first when resources are scarce.

# No resources section at all
containers:
  - name: app
    image: my-app:1.0

Never use BestEffort in production.

Checking Resource Usage

# Pod resource consumption (kubectl top needs the Metrics Server installed in the cluster)
kubectl top pods
# NAME                    CPU(cores)   MEMORY(bytes)
# api-server-abc123       185m         243Mi
# api-server-def456       210m         198Mi

# Node resource consumption
kubectl top nodes
# NAME       CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
# node-1     1250m        31%    4192Mi          52%
# node-2     890m         22%    3100Mi          38%

# Detailed resource allocation on a node
kubectl describe node node-1
# Allocated resources:
#   Resource  Requests      Limits
#   --------  --------      ------
#   cpu       1250m (31%)   3500m (87%)
#   memory    4096Mi (51%)  8192Mi (102%)

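LimitRange: Default Limits for a Namespace

A LimitRange applies default requests and limits to containers that do not declare their own, and rejects containers whose values fall outside its min and max bounds.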

# limitrange.yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
    - type: Container
      default:          # default limits
        cpu: "500m"
        memory: "256Mi"
      defaultRequest:   # default requests
        cpu: "100m"
        memory: "128Mi"
      max:              # maximum allowed limits
        cpu: "2"
        memory: "2Gi"
      min:              # minimum allowed requests
        cpu: "50m"
        memory: "32Mi"
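As a sketch of the effect, assume the LimitRange above is already applied and a Pod with no resources section is created in the production namespace (the Pod name and image are hypothetical). The comments show the values the LimitRange is expected to fill in at admission time:

```yaml
# defaults-demo.yaml (hypothetical example)
apiVersion: v1
kind: Pod
metadata:
  name: defaults-demo
  namespace: production
spec:
  containers:
    - name: app
      image: my-app:1.0
      # No resources declared. After admission the container is expected to carry:
      #   requests: cpu 100m, memory 128Mi   (from defaultRequest)
      #   limits:   cpu 500m, memory 256Mi   (from default)
      # Requests differ from limits, so the Pod's QoS class is Burstable.
```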

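ResourceQuota: Namespace-Level Limits

A ResourceQuota caps the combined requests, limits, and object counts of everything running in a namespace.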

# resourcequota.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: namespace-quota
  namespace: production
spec:
  hard:
    requests.cpu: "10"        # total CPU requests must not exceed 10 cores
    requests.memory: "20Gi"   # total memory requests must not exceed 20 Gi
    limits.cpu: "40"
    limits.memory: "80Gi"
    pods: "50"                # no more than 50 pods in the namespace

kubectl apply -f resourcequota.yaml

kubectl describe resourcequota namespace-quota -n production
# Name:            namespace-quota
# Resource         Used    Hard
# --------         ----    ----
# limits.cpu       3500m   40
# limits.memory    6Gi     80Gi
# pods             12      50
# requests.cpu     1250m   10
# requests.memory  3Gi     20Gi
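One consequence is easy to miss: once a quota covers requests.cpu, requests.memory, limits.cpu, or limits.memory, every new Pod in that namespace has to declare those values, either itself or through LimitRange defaults. A sketch, assuming a hypothetical staging namespace that has the same quota but no LimitRange:

```yaml
# rejected-pod.yaml (hypothetical example)
apiVersion: v1
kind: Pod
metadata:
  name: no-resources
  namespace: staging        # assumption: same ResourceQuota as above, no LimitRange
spec:
  containers:
    - name: app
      image: my-app:1.0
      # No requests or limits: the API server is expected to reject this Pod
      # with a Forbidden error, because the quota cannot account for it.
```

Pairing a ResourceQuota with a LimitRange in the same namespace avoids this, since the defaults are applied before the quota is checked.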
