Docker Backup Strategy

Hello Self-Hosters,

What is the best practice for backing up Docker data as a self-hoster who wants ease of maintenance and foolproof backups? (pick only one :D )

Assume directories with user data are mapped to a NAS share via NFS and backups are handled separately.

My bigger concern is how you handle all the other stuff that is stored locally on the server, like caches, databases, etc. The backup target will eventually be the NAS, and from there everything gets backed up again to external drives.

  1. Is it better to run cp -a /var/lib/docker/volumes/* /backupLocation as root every once in a while, or is it preferable to define bind mounts for everything under /home/user/Containers and then use a script to sync that to wherever you keep backups (rough sketch after this list)? What pros and cons have you seen or experienced with these approaches?

  2. How do you test your backups? I'm thinking about digging up an old PC to use for testing them. I assume I can just edit the IP addresses in the docker-compose file, mount my NFS dirs, and fail over to it to see if everything runs (second sketch below).

  3. I started documenting my system in my notes and making a checklist of what I need to back up and where it's stored. Currently I'm trying to figure out whether I want to move some directories for consistency. Can I just run docker-compose down, move the directories, edit the mountpoints in docker-compose.yml, and run docker-compose up to get a working system again (third sketch below)?
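
For option two in question 1, this is roughly what I have in mind (all paths are made up):

```bash
#!/usr/bin/env bash
# Rough sketch of the bind-mount + sync idea; both locations are hypothetical.
set -euo pipefail

SRC=/home/user/Containers          # every container bind-mounts its data under here
DEST=/mnt/nas/backups/containers   # NFS share from the NAS, mounted on the server

# -a preserves permissions/ownership/timestamps, --delete mirrors removals
rsync -a --delete "$SRC/" "$DEST/"
```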
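
The failover test in question 2 would look something like this on the spare PC (the NAS address and project path are made up):

```bash
# Mount the same NFS share the real server uses
sudo mount -t nfs 192.168.1.50:/export/userdata /mnt/userdata

# Copy the compose project and the restored container data onto the test box,
# adjust IP addresses and paths in docker-compose.yml, then bring the stack up.
cd ~/Containers/myapp
docker-compose up -d
docker-compose logs -f
```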
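
For question 3, the sequence I'm imagining (service name and paths are made up):

```bash
# Stop the stack before touching anything
docker-compose down

# Move the data directory to its new, consistent location
mv /home/user/docker-data/nextcloud /home/user/Containers/nextcloud

# Edit docker-compose.yml so the bind mount points at the new path, e.g.
#   volumes:
#     - /home/user/Containers/nextcloud:/var/www/html

# Bring it back up against the new paths
docker-compose up -d
```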

25 comments
  • You want your Docker containers' persistent data bind-mounted to real locations on the host; I only use named volumes for non-persistent stuff.

    You want your real locations to have a file system that can snapshot (ZFS, BTRFS).
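
    A minimal sketch of that layout, assuming BTRFS and made-up paths/names (with ZFS it would be zfs create / zfs snapshot instead):

    ```bash
    # One subvolume holds all container persistent data so it can be snapshotted
    sudo btrfs subvolume create /srv/containers
    sudo mkdir -p /srv/containers/nextcloud

    # Bind-mount the real location into the container instead of a named volume
    docker run -d --name nextcloud \
      -v /srv/containers/nextcloud:/var/www/html \
      nextcloud
    ```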

    Then you can dump the Postgres databases (pg_dump works while they are running), and for all other databases and data you stop the containers, snapshot, and start the containers again (which keeps downtime short), then back up that snapshot. Thanks to the snapshot, you don't need to wait until the backup is done before bringing the containers back up to keep the data consistent.

    For the backup itself I use restic; it works well and has self-check functions, which is nice. I chose restic over just sending snapshots because of its built-in encryption and integrity checks, which give you reliable data integrity on unreliable media (anyone, even the giant providers, could blackhole bits of your backup!). I also copy over the exact restic binary that made the backup, using encrypted rclone; the encryption there stops anyone (the baddies? idk who'd target me, but it doesn't matter now!) from mucking with the binary in case you ever need that exact version to restore.
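
    Roughly, one cycle of that looks like this (paths, repo address and password file are placeholders; BTRFS shown, ZFS would use zfs snapshot instead):

    ```bash
    #!/usr/bin/env bash
    set -euo pipefail
    today=$(date +%F)

    # 1. Postgres can be dumped while it is running
    docker exec postgres pg_dumpall -U postgres > "/srv/containers/dumps/postgres-$today.sql"

    # 2. Stop the containers, take a read-only snapshot, start them again (seconds of downtime)
    docker-compose -f /srv/compose/docker-compose.yml stop
    sudo btrfs subvolume snapshot -r /srv/containers "/srv/snapshots/containers-$today"
    docker-compose -f /srv/compose/docker-compose.yml start

    # 3. Back up the snapshot with restic (encrypted, deduplicated, self-checking)
    restic -r sftp:backup-host:/srv/restic-repo --password-file /root/.restic-pass \
      backup "/srv/snapshots/containers-$today"

    # 4. Keep a copy of the exact restic binary with the backups, via an encrypted rclone remote
    rclone copy /usr/local/bin/restic crypt-remote:tools/
    ```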

    Note that I do not dump the other SQL databases (MySQL and the like); those containers are stopped and get snapshotted in a stable state. Their dump tooling was nasty, especially compared to Postgres' amazingly straightforward pg_dump (which works while the database is running), so I never bothered figuring out their dump and restore.

    All of your containers should have their own dedicated users; specify the UID/GID explicitly so they're easily recreatable in a restore scenario. (The database containers get their own users too.)
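
    For example (the UID, username and image are arbitrary; write them down so a restore can recreate them exactly):

    ```bash
    # Dedicated system user for one container; UID/GID chosen by you and documented
    sudo useradd --system --uid 4001 --shell /usr/sbin/nologin svc-myapp
    sudo chown -R 4001:4001 /srv/containers/myapp

    # Run the container as that user instead of root
    docker run -d --name myapp \
      --user 4001:4001 \
      -v /srv/containers/myapp:/data \
      ghcr.io/example/myapp
    ```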

    Addendum on the dedicated users: if a Docker container is written by F-tier security peeps and hard-requires root, make an LXC container that runs as a specific user and put the Docker container inside it. Or use Podman, which is competent and can successfully lie to those containers: with user namespaces the process thinks it is root while it's really just your unprivileged user on the host.
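
    With rootless Podman that looks roughly like this (image name is a placeholder):

    ```bash
    # Run podman as the dedicated unprivileged user. Inside the container the
    # process believes it is root (UID 0), but user namespaces map that back to
    # this host user, so images that hard-require root still work.
    podman run -d --name stubborn-app \
      -v /srv/containers/stubborn-app:/data \
      ghcr.io/example/stubborn-app
    ```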

    I don't do full restore tests because the time needed is stupidly high thanks to my very slow internet. I tested restoring specific files with restic when setting it up, and now I rely on its integrity checks (reading back 2 GB a day) to spot-check that everything is reliable. I have a local backup as well as the remote one: the local copy is the snapshot the remote restic backup is made from, and since the snapshot is directly traversable I don't need to scrutinize it hard. If I had faster internet I'd probably test a full restore from the remote restic repo once a year; for now I just restore a random file or small directory once a year.
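
    The checks and spot restores look roughly like this (repo and paths are placeholders; the size form of --read-data-subset needs a reasonably recent restic, the percentage form also works):

    ```bash
    # Daily integrity check: verify repo structure and read back a subset of the data
    restic -r sftp:backup-host:/srv/restic-repo --password-file /root/.restic-pass \
      check --read-data-subset=2G

    # Occasional spot restore of one file or small directory to a scratch location
    restic -r sftp:backup-host:/srv/restic-repo --password-file /root/.restic-pass \
      restore latest --target /tmp/restore-test --include /srv/containers/myapp/config
    ```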

    Hope the rant helps
