Kubernetes QoS with Resource Requests and Limits

Raju Dawadi
Jul 25, 2020

Quality of Service (QoS), as the name suggests, describes how well a service is provided, both under normal conditions and in adverse situations. Here, quality is measured in terms of resources (CPU, memory, etc.). In this post, we will talk about the QoS of pods in a Kubernetes cluster.

While configuring a pod, we pass two parameters that determine its usage of node resources: Requests and Limits. These parameters are set for each container in the pod and mainly deal with CPU and memory (RAM). Requests specifies the minimum amount of a resource reserved for the container, whether or not the container actually uses it. Limits is the cap up to which the container can use the resource. Let’s say we have 100m CPU as Requests and 200m CPU as Limits. The container is scheduled on a node that has at least 100m of unreserved CPU, and that much is reserved for it. When traffic arrives or any work is performed in the container, it can use up to 200m of CPU; beyond that it is throttled.
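As a rough illustration, the 100m/200m example above could be written as the Pod manifest below. The pod name, container name and image are placeholders chosen for this sketch, and the memory values are assumptions added only to make the manifest complete.

```yaml
# Sketch only: names, image and memory values are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: foo
spec:
  containers:
  - name: foo
    image: nginx         # placeholder image
    resources:
      requests:
        cpu: 100m        # reserved on the node at scheduling time
        memory: 128Mi    # assumed value
      limits:
        cpu: 200m        # the container is throttled above this
        memory: 256Mi    # assumed value
```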

Neither parameter is mandatory, but what happens if we leave one of them out?

  • If Requests is not set for a container but Limits is, then Requests defaults to the Limits value
  • If Limits is not set, then no limit is applied (unless the namespace defines a default, for example through a LimitRange)

Kubernetes categorizes resources as compressible and incompressible; currently CPU is the compressible resource and memory is the incompressible one. This categorization also affects QoS. A container is throttled when it tries to use more of a compressible resource than its limit, but it can be killed when it exceeds its limit for an incompressible resource. That’s why a pod restarts (the process inside is killed) when it uses more memory than its limit, but only responds slowly when it reaches its CPU limit.
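To see the difference in practice, a container killed for exceeding its memory limit usually shows up in the pod status roughly like the excerpt below (for example in the output of `kubectl get pod foo -o yaml`). This is an illustrative fragment with a placeholder container named foo. CPU throttling leaves no comparable entry, because the container keeps running.

```yaml
# Illustrative excerpt of a pod's status after an out-of-memory kill.
status:
  containerStatuses:
  - name: foo            # placeholder container name
    restartCount: 1
    lastState:
      terminated:
        reason: OOMKilled
        exitCode: 137    # 128 + 9 (SIGKILL)
```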

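Kubernetes assigns every pod one of 3 QoS classes: Guaranteed, Burstable and Best-Effort (written BestEffort in the API). The class is derived from the Requests and Limits of the pod’s containers, and it helps decide which pods are sacrificed first when the K8s nodes are under resource pressure.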

Best-Effort

A pod with no Requests or Limits defined on any of its containers falls in this class and is the most likely to get killed when there is contention for an incompressible resource. Low-priority pods, such as those in a development namespace or those used for stress testing to study resource consumption, are good candidates for this class; we get it by leaving Requests and Limits out of the pod spec.
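A minimal sketch of such a pod is shown below. The pod name, namespace and image are hypothetical, and the sketch assumes the namespace has no LimitRange that would inject default Requests or Limits.

```yaml
# Sketch only: name, namespace and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: load-test
  namespace: dev
spec:
  containers:
  - name: foo
    image: nginx
    # No resources block: no container sets Requests or Limits,
    # so the pod is classed as BestEffort.
```

Once the pod is created, `kubectl get pod load-test -n dev -o jsonpath='{.status.qosClass}'` should report the class that Kubernetes assigned.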

Burstable

While a Burstable pod runs within the Limits of the resources assigned to it, the system tries its best not to kill it, but when pods have to be killed, this class comes second. A Burstable pod is one that has Requests or Limits defined for at least one container but does not qualify as Guaranteed; typically both are defined and the Limits value is higher than Requests. For example, let’s see this manifest:

containers:
- name: foo
  resources:
    limits:
      cpu: 50m
      memory: 2Gi
    requests:
      cpu: 20m
      memory: 1Gi

Burstable pods are the next candidates to be killed once the Best-Effort class is emptied and there is still contention for resources.
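The manifest above is not the only shape a Burstable pod can take. As a sketch (again assuming no LimitRange defaults in the namespace), a container that sets only Requests also lands in this class, and can then use spare CPU and memory on the node beyond what it requested:

```yaml
containers:
- name: foo
  resources:
    requests:
      cpu: 20m
      memory: 1Gi
    # No limits: usage is capped only by what is free on the node.
```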

Guaranteed

When every container in a pod has the same value of Requests and Limits for both CPU and memory, OR only the Limits are specified (so Requests default to Limits), the pod gets the highest priority when the node is under pressure and pods have to be evicted. That means the pod is the least likely to get terminated in adverse conditions. Example manifests:

# No Requests value is specified
containers:
- name: foo
  resources:
    limits:
      cpu: 100m
      memory: 1Gi
---
# Requests is same as Limits
containers:
- name: foo
  resources:
    limits:
      cpu: 10m
      memory: 1Gi
    requests:
      cpu: 10m
      memory: 1Gi
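Note that the class is decided for the pod as a whole, not per container. As a hedged sketch (the second container, named sidecar here, is hypothetical), the pod below would be classed as Burstable rather than Guaranteed, because one of its containers does not set Limits equal to Requests for both CPU and memory:

```yaml
containers:
- name: foo
  resources:
    limits:
      cpu: 10m
      memory: 1Gi
    requests:
      cpu: 10m
      memory: 1Gi
- name: sidecar          # hypothetical second container
  resources:
    requests:
      cpu: 10m           # Requests only, no Limits and no memory values
```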

Which one to choose?

It’s a best practice to set Requests and Limits on resources for every pod. Let’s say we have none of the resources defined and a vulnerability or bug in the code keeps increasing the pod’s consumption of node resources. If auto-scaling of nodes is enabled, this can trigger runaway scaling that costs a lot of money.

Pods of the highest priority, such as an ingress load balancer or an API gateway, without which none of the other services matter, can be set as Guaranteed.

For normal workloads, Burstable is a good choice, combined with pod autoscaling between a minimum and maximum number of replicas.
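One way to express that min-max autoscaling is a HorizontalPodAutoscaler, sketched below. The Deployment name foo, the replica counts and the 70% target are assumptions for illustration. The manifest uses the autoscaling/v2 API; older clusters may only offer a beta version such as autoscaling/v2beta2.

```yaml
# Sketch only: target name, replica counts and threshold are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: foo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: foo
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # percentage of the containers' CPU Requests
```

CPU utilization here is measured relative to the CPU Requests of the pod’s containers, which is one more reason to set Requests on Burstable workloads.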

Best-Effort is preferred for low-value pods. If dev and production pods run in the same Kubernetes cluster, we can give this class to pods in the dev namespace so that production workloads are not starved of resources. Another use is stress testing, when we need to study how a pod behaves or consumes resources under a certain load.

Do you have any interesting experience using the QoS classes? Drop a few words in the comments. Connect with me on Twitter and LinkedIn, where I keep sharing interesting updates.
