Resource Limits
Specifying Limits
By default, Pods have no resource limits, which means their containers can use as much CPU and memory as the node has available. To prevent one Pod from monopolizing all available resources, you should set resource limits for Pods: both a memory limit and a CPU limit. If a container uses more memory than its limit, it is killed and restarted (a new container inside the same Pod). If it keeps failing, it goes into CrashLoopBackOff, just like with failing liveness probes.
You can specify resource limits in the spec.containers[].resources.limits
section of a Pod configuration. The following example sets a memory limit of 50 MiB for a container:
memory-allocator-with-limit.yaml
spec:                              # The Pod spec in the Deployment
  containers:
    - image: kiamol/ch12-memory-allocator
      resources:
        limits:                    # Resource limits constrain compute power
          memory: 50Mi             # for the container; this limits RAM to 50 MiB.
When you create this Pod, the container runs with a memory limit of 50 MiB, and it is killed and restarted when it exceeds that limit.
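You can watch that happen by checking the Pod's restart count; a quick sketch, assuming the Deployment labels its Pods with app=memory-allocator (use whatever labels your spec actually applies):
# the RESTARTS count climbs each time the container exceeds its memory limit
% kubectl get pods -l app=memory-allocator --watch
# describe the Pod to see the last container state, which will be OOMKilled
% kubectl describe pod -l app=memory-allocator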
Namespace Quotas
You can also set resource limits at the namespace level with a ResourceQuota. This is useful if you want to prevent a single team or application from monopolizing all the resources on a cluster: the quota caps the total amount of CPU and memory that can be used by all Pods in the namespace. Note that once a quota is in place, every Pod in that namespace must specify resource limits, or it will be rejected.
namespace-with-quota/02-memory-quota.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: memory-quota
  namespace: kiamol-ch12-memory
spec:
  hard:
    limits.memory: 150Mi
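A ResourceQuota is a standard object, so you apply it and check its usage like anything else; a sketch, assuming the kiamol-ch12-memory namespace already exists:
% kubectl apply -f namespace-with-quota/02-memory-quota.yaml
# shows the hard limit and how much of it is currently in use
% kubectl get resourcequota memory-quota -n kiamol-ch12-memory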
Now, if a Pod specifies a limit that would take the namespace over its quota, it will not be created. For example:
03-memory-allocator.yaml
apiVersion: apps/v1
kind: Deployment
...
spec:
  containers:
    - name: api
      image: kiamol/ch12-memory-allocator
      resources:
        limits:
          memory: 200Mi
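The Deployment itself is accepted, but its ReplicaSet cannot create the Pod; the failure shows up in the ReplicaSet's events, with a message along the lines of "forbidden: exceeded quota":
% kubectl describe replicaset -n kiamol-ch12-memory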
CPU Limits
CPU limits can be applied to containers and enforced by quotas too, but they work in a slightly different way. A container with a CPU limit runs with a fixed amount of processing power; it can use as much of that allowance as it likes, and it is throttled rather than restarted if it hits the limit. You can use whole multiples to give your app container access to many cores, or divide a single core into "millicores," where one millicore is one-thousandth of a core.
web-with-cpu-limit.yaml
spec:
  containers:
    - image: kiamol/ch05-pi
      command: ["dotnet", "Pi.Web.dll", "-m", "web"]
      resources:
        limits:
          cpu: 250m    # 250 millicores limits the container to 0.25 cores.
Include both CPU and memory limits in your Pod specs.
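Putting that together, a container spec with both limits set might look like this; a minimal sketch, where the image and the values are just placeholders:
spec:
  containers:
    - image: kiamol/ch05-pi          # placeholder image
      resources:
        limits:
          cpu: 250m                  # throttled above a quarter of a core
          memory: 50Mi               # killed and restarted above 50 MiB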
Resource Constraint Failures
What if you have a namespace CPU quota of 500m, but your Deployment asks for two replicas with a limit of 300m each? Only one Pod fits within the quota, so only one is created. You can see it like this:
# after deploying your app with two replicas
% kubectl get deploy -n kiamol-ch12-cpu
NAME     READY   UP-TO-DATE   AVAILABLE   AGE
pi-web   1/2     1            1           44s
# Debug what happened (It will tell you it ran out of quota)
% kubectl describe replicaset -n kiamol-ch12-cpu
# Check the resource quota and its current usage
% kubectl get resourcequota -n kiamol-ch12-cpu
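For reference, the CPU quota in that scenario would look something like this; the object name is an assumption:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: cpu-quota
  namespace: kiamol-ch12-cpu
spec:
  hard:
    limits.cpu: 500m    # two replicas at 300m each need 600m, so only one fits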
Checking Available Resources
You can check the allocatable CPU on a node like so:
% kubectl get nodes -o jsonpath='{.items[*].status.allocatable.cpu}'
Also, when you scale up, it is helpful to check the ReplicaSets to see whether the new Pods are actually being created; if they are not, the ReplicaSet is probably blocked by a resource quota.
% kubectl get rs -l app=pi-web --all-namespaces
Don’t Use CPU Limits?
This blog post is controversial; Michal thinks it's bad advice, and I tend to agree with Michal.
Requests vs. Limits
No CPU limit
If you do not specify a CPU limit for a Container, then one of these situations applies:
- The Container has no upper bound on the CPU resources it can use. The Container could use all of the CPU resources available on the Node where it is running.
- The Container is running in a namespace that has a default CPU limit, and the Container is automatically assigned the default limit. Cluster administrators can use a LimitRange to specify a default value for the CPU limit, as sketched below.
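A LimitRange that sets a default CPU limit for containers in a namespace looks roughly like this; the name, namespace, and values here are assumptions, not from the book:
apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-defaults
  namespace: kiamol-ch12-cpu
spec:
  limits:
    - type: Container
      default:
        cpu: 500m          # limit applied when a container doesn't set one
      defaultRequest:
        cpu: 250m          # request applied when a container doesn't set one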
Limits create requests by default
If you specify a CPU limit for a Container but do not specify a CPU request, Kubernetes automatically assigns a CPU request that matches the limit. Similarly, if a Container specifies its own memory limit but does not specify a memory request, Kubernetes automatically assigns a memory request that matches the limit.
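So a container spec that only declares limits, like this sketch (placeholder image and values), ends up with matching requests:
spec:
  containers:
    - image: kiamol/ch12-memory-allocator
      resources:
        limits:
          cpu: 250m      # the CPU request is automatically set to 250m
          memory: 50Mi   # the memory request is automatically set to 50Mi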