I have a home server running Docker for all my self-hosted apps. But sometimes I accidentally trigger earlyoom by remotely starting expensive Docker builds, and it kills Docker.
I don't have access to my server outside of my home network, so I can't manually restart docker in those situations.
What would be the best way to restart it automatically? I don't mind doing a full system restart if needed.
Structuring the available CPU and Memory reservations for containers is LITERALLY the entire reason containers exist. Just because you're only familiar with the "dumb" way of using them doesn't mean you should be dismissive when someone offers you advice when you come here asking for it.
You're also seemingly just a dick for being lazy, because I looked, and wuddyaknow. So now you're just rude, dickish, and lazy.
Take the advice from the original responder, and then go and learn how to use the things you're asking for help with, along with some manners.
Alright, sorry for calling it a "bandaid fix". It just wasn't the right term for what I wanted to say. I was referring more to how it would only fix issues during builds, and not at actual runtime, which can also be an issue if I'm not careful.
So yeah, it's the fix for the issue in the post, but this solution made me realise that this isn't the only thing I want.
But the second part is... Just chill. It's a home server. Not a high availability cluster. I can afford stupid things. Heck, I'm only asking this question because I was stupid and didn't limit the job count of a cargo build, which took down my server. I don't care that my build crashes. I just want to not have to manually restart it, because when I'm not here I can't do it.
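For the cargo case specifically, one way to not forget the limit is to compute it instead of typing it. A rough sketch in Python, assuming a Linux host with `/proc/meminfo`. The 2 GiB per job budget (`MEM_PER_JOB_KIB`) is a guess, not a measured number, and `pick_jobs` and `cargo_command` are hypothetical helper names. The only real CLI piece here is `cargo build --jobs N`.

```python
import os

# Assumption: rough memory budget per parallel rustc job. Tune for your crates.
MEM_PER_JOB_KIB = 2 * 1024 * 1024  # 2 GiB


def mem_available_kib(meminfo_text):
    """Return the MemAvailable value (in KiB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    raise ValueError("MemAvailable not found")


def pick_jobs(available_kib, cpu_count, mem_per_job_kib=MEM_PER_JOB_KIB):
    """Cap the job count by both CPU count and memory budget, never below 1."""
    by_memory = available_kib // mem_per_job_kib
    return max(1, min(cpu_count, by_memory))


def cargo_command(jobs):
    """Build the argv for a job-limited cargo build."""
    return ["cargo", "build", "--jobs", str(jobs)]


def main():
    try:
        with open("/proc/meminfo") as f:
            available = mem_available_kib(f.read())
    except OSError:
        print("could not read /proc/meminfo; this sketch assumes Linux")
        return
    jobs = pick_jobs(available, os.cpu_count() or 1)
    # Only prints the command; run it yourself or hand it to subprocess.run.
    print(" ".join(cargo_command(jobs)))


if __name__ == "__main__":
    main()
```

This only protects builds that go through the wrapper, so it is a convenience on top of a real limit, not a replacement for one.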
As for the link that you sent, it covers container limits, not image build limits.
And I have already set some up on my hungriest container, but the stats showed that it blew past them, so idk what's going on there.
Edit: NVM. This is a bandaid fix. What if you forget to add the flag? Like it's been 5 months since the last time and you forget to apply the same fix? Or you accidentally remove it while editing the command?
I'm actually looking for a solution that fixes my problem fully, not a partial solution.
Then you didn't explain the issue very well, because what you're asking for was given to you exactly. Builds also have flags, and you should know that if you're complaining about advice given to you. I'm not saying that to admonish you, just giving you the info.
The next step down is that you're using Portainer, and having user-error issues somehow. So another solution is renaming these actions to something with a very obvious prefix like "BUILD ACTION", while also setting memory limits.
The very last step is making sure your swap is in order. Allocate 2x your system memory to swap, and this will help alleviate OOM issues to a point, but especially during builds.
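If you want to check where you stand against that 2x figure, here is a small Python sketch. It assumes Linux's `/proc/meminfo`, where `MemTotal` and `SwapTotal` are reported in kB. The helper names `parse_meminfo` and `swap_shortfall_kib` are made up for illustration, and the factor of 2 is just the rule of thumb from above, not a hard requirement.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value} dict of integers."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def swap_shortfall_kib(meminfo, factor=2):
    """KiB of swap still missing to reach factor x RAM (0 if already enough)."""
    target = factor * meminfo["MemTotal"]
    return max(0, target - meminfo["SwapTotal"])


def main():
    try:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
    except OSError:
        print("could not read /proc/meminfo; this sketch assumes Linux")
        return
    missing = swap_shortfall_kib(info)
    if missing:
        print(f"swap is short by about {missing // 1024} MiB")
    else:
        print("swap is at least 2x RAM")


if __name__ == "__main__":
    main()
```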
If you come back and say this is a band-aid solution, get a better machine and stop asking questions to solve the impossible in here. It's your fault this is an issue to begin with, you don't know how to run your machines (regardless of it just being a home server or whatever), and you're just being rude.
The other person may have responded with a fair amount of hostility, but they're absolutely correct. I run Kubernetes clusters hosting millions of containers across hundreds of thousands of VMs at my job, and OOMKills are just a fact of life. Apps will leak memory, and you're powerless to fix it unless you're willing to debug the app and fix the leak. It's better for the container to run out of memory and trigger a cgroup-scoped OOM kill. A system-wide OOM kill will murder the things you love, shit in your hat, and lick your face like David Tennant licked Krysten Ritter.
Oh, it's not a problem to let a container get killed. It's perfectly fine. What I want is just to not cripple my whole server because one container did a funny.
If it keeps Docker and the Portainer VM up, I'll be 100% ok, because I can just restart it. I don't want to have remote access to my server outside of my home for security reasons, so this is just the bare minimum.
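For the "restart it automatically" part, a watchdog could look roughly like this Python sketch. It assumes a systemd host where Docker runs as the unit `docker` (the `SERVICE` constant is that assumption), and it would need to run as root from cron or a systemd timer. `is_active` and `check_and_restart` are hypothetical helper names. The commands used are `systemctl is-active` and `systemctl restart`.

```python
import subprocess

SERVICE = "docker"  # assumption: Docker runs as the systemd unit "docker"


def is_active(service, run=subprocess.run):
    """True if `systemctl is-active` reports the unit as active."""
    result = run(
        ["systemctl", "is-active", service],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip() == "active"


def check_and_restart(service=SERVICE, run=subprocess.run):
    """Restart the unit if it is not active. Returns True if a restart was attempted."""
    if is_active(service, run=run):
        return False
    run(["systemctl", "restart", service], check=False)
    return True


def main():
    try:
        restarted = check_and_restart()
    except OSError:
        print("could not run systemctl; this sketch assumes a systemd host")
        return
    print("restart attempted" if restarted else f"{SERVICE} is active")


if __name__ == "__main__":
    main()
```

The `run` parameter is only there so the logic can be tried with a fake runner before pointing it at the real service.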