DevOps
DevOps integrates and automates the work of software development (Dev) and IT operations (Ops) as a means for improving and shortening the systems development life cycle.
That seems like a problem with the application, no? If the workloads have memory leaks or are too eager to claim memory for themselves, then no cluster configuration will make them perform better.
Others are correct: the problem is the software. You are right to use memory requests and limits. The limit is the maximum a pod will use, and hopefully not all pods are using their full limits at once.
All of the pods' memory requests on a given node must sum to less than 100% of the node's available memory. You can of course set your pod's request to the highest amount of RAM it will ever need, but that memory is then reserved for that pod and won't be available to anything else, even when the pod is idle.
K8s will allow overprovisioning of RAM for the limits, though, because it assumes pods won't all need their full limit at the same time, which is exactly what you are seeing.
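To make the requests-vs-limits distinction concrete, here's a minimal sketch of a pod spec (the names, image, and numbers are made up for illustration):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app            # hypothetical name
spec:
  containers:
  - name: app
    image: example/app:latest  # hypothetical image
    resources:
      requests:
        memory: "256Mi"  # reserved on the node at scheduling time
      limits:
        memory: "1Gi"    # hard cap; exceeding it gets the container OOM-killed
```

The scheduler only places the pod on a node whose unreserved memory covers the 256Mi request; the 1Gi limit can be oversubscribed across pods on the same node.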
What you can do is set a priority class on the pod, so that when it spikes and the node doesn't have enough RAM, Kubernetes will evict some other pod instead of yours. That makes the other pods more volatile, of course.
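A priority class setup might look like this (the class name, value, and pod name are illustrative):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: memory-critical   # hypothetical name
value: 1000000            # higher value = higher priority
globalDefault: false
description: "Pods that should survive node memory pressure"
---
apiVersion: v1
kind: Pod
metadata:
  name: important-app     # hypothetical name
spec:
  priorityClassName: memory-critical
  containers:
  - name: app
    image: example/app:latest  # hypothetical image
```

Under resource pressure, lower-priority pods are preempted or evicted before pods in this class.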
There are many options at your disposal; you'll have to decide what works best for your use case.
https://home.robusta.dev/blog/stop-using-cpu-limits
Okay, it's actually more complex than that. On self-managed nodes, Kubernetes is not the only thing running, so it can make sense to set limits to protect non-Kubernetes workloads hosted on those nodes. And memory behaves a bit differently from CPU. You'll have to do some testing, and YMMV, but keep the difference between requests and limits in mind.
But my suggestion would be to see if you can get away with setting only requests, or with setting very high limits. See: https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/#if-you-do-not-specify-a-memory-limit
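Concretely, per the linked docs on omitting a memory limit, that could mean a container resources section like this (the request value is illustrative):

```yaml
resources:
  requests:
    memory: "512Mi"  # guaranteed baseline for scheduling
  # no memory limit set: the container can burst to whatever
  # free memory the node has, instead of being OOM-killed at a cap
```

The trade-off is that a runaway container can then pressure the whole node, so this works best when you trust the workload's memory behavior.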
In order for them not to be OOM-killed, you have to set their memory requests above their highest spike, which means most of the time they're only using around 25% of their memory allocation.
Are you sure? Only limits should cap the total memory usage of a pod; requests will happily let pods use more memory than the request size.
One thing I am curious about is whether your pods actually need that much memory. I have heard (horror) stories where people had an application in Kubernetes with a memory leak, and instead of fixing the leak, they just regularly killed pods and started new ones that hadn't leaked yet. :/
To answer your actual question about memory optimization: no. Even Google still "wastes" memory by setting requests and limits higher than what pods usually use. It is very difficult to prune down and be ultra-efficient. If an outage due to OOM kills costs more than paying for extra resources would, then people just pay for the resources.