hi peeps.
I'm early in my self-hosting journey and managed to set up a few things as a test in Docker. The main service I'm looking to run is ERPNext. After a while I got it up and running, so now I want to back it up, including all the customization I did in the app.
Where do I start with this? Any intuitive guide?
My Docker files, configs, and volumes are all kept in a structure like:

    /srv
      docker/
        syncthing/
          compose.yml
          sync-volume/
        traefik/
          compose.yml
        [...]

I just back up /srv/docker, but I blacklist some subfolders, e.g. for databases, for which regular dumps are created instead. Currently the compressed/deduplicated repos consume ~350GB.
I use borgmatic because you do 1 full backup and thereafter everything is incremental, so minimal bandwidth.
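For concreteness, the blacklist approach boils down to borg excludes; a minimal sketch of the equivalent plain borg invocation (repo path and exclude patterns are made-up examples; borgmatic expresses the same thing in its YAML config):

```shell
#!/bin/sh
# Sketch: back up /srv/docker while excluding raw database data dirs,
# whose contents are covered by separate dumps. Paths are examples only.
REPO=/srv/backup/docker-repo
EXCLUDES="--exclude /srv/docker/erpnext/db-data --exclude /srv/docker/*/cache"

# Guard so the sketch is a no-op where borg isn't installed.
if command -v borg >/dev/null 2>&1; then
    borg create --compression zstd $EXCLUDES "$REPO::{hostname}-{now}" /srv/docker
fi
```

Note that `{hostname}` and `{now}` are borg's own archive-name placeholders, not shell variables.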
I keep one backup repo on the server itself in /srv/backup. Yes, this one goes down with the server, but it's super handy to be able to restore from a local repo if you just mess up a configuration or a version upgrade or something.
I keep two other backup repos in two other physical locations, and one repo air gapped.
For example I rent a server from OVH in a Sydney data centre, there’s one repo in /srv/backup on that server, one on OVH’s storage service, one kept on my home server, and one on a removable drive I update periodically.
All repos are encrypted except for the air-gapped one. That one has instructions intended for someone to use if I die or am incapacitated, so it has my master password for my password database, SSH keys, everything. We have a physical safe at home, so that's where it lives.
Do you recommend moving an existing volume to this new structure?
In general, you back up everything that cannot be recreated from external sources. That would be the configuration files and all the volumes you added, and maybe log files as well.
If databases are involved they usually offer some method of dumping all data to some kind of text file. Usually relying on their binary data is not recommended.
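For example, ERPNext runs on MariaDB, where a consistent plain-SQL dump might look like this (database name and output path are made-up):

```shell
#!/bin/sh
# Sketch: dump a MariaDB/MySQL database to a compressed SQL text file
# before the file-level backup runs. Names and paths are examples.
DB=erpnext
OUT="/srv/docker/erpnext/dumps/$DB-$(date +%F).sql.gz"

if command -v mysqldump >/dev/null 2>&1; then
    # --single-transaction gives a consistent snapshot of InnoDB tables
    # without locking the database for the duration of the dump.
    mysqldump --single-transaction --routines "$DB" | gzip > "$OUT"
fi
```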
Borg is a great tool to manage backups. It only backs up changed data and you can instruct it to only keep weekly, monthly, yearly data, so you can go back later.
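The keep-weekly/monthly/yearly behavior is borg's prune policy. A sketch, with made-up retention numbers:

```shell
#!/bin/sh
# Sketch: thin out old archives on a schedule. Repo path and the
# retention counts are examples; tune them to taste.
REPO=/srv/backup/docker-repo
KEEP="--keep-daily 7 --keep-weekly 4 --keep-monthly 12 --keep-yearly 2"

if command -v borg >/dev/null 2>&1; then
    borg prune $KEEP "$REPO"
    borg compact "$REPO"   # actually reclaim the freed space (borg >= 1.2)
fi
```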
Of course, just flat out backing up everything is good to be able to quickly get back to a working system without any thought. And it guarantees that you don’t forget anything.
> If databases are involved they usually offer some method of dumping all data to some kind of text file. Usually relying on their binary data is not recommended.
It’s not so much text or binary. It’s because a normal backup program that just treats a live database file as a file to back up is liable to have the DBMS software write to the database while it’s being backed up, resulting in a backed-up file that’s a mix of old and new versions, and may be corrupt.
Either:
- The DBMS needs to have a way to create a dump — possibly triggered by the backup software, if it’s aware of the DBMS — that won’t change during the backup
or:
- One needs to have filesystem-level support to grab an atomic snapshot (e.g. one takes an atomic snapshot using something like btrfs and then backs up the snapshot rather than the live filesystem). This avoids the issue of the database file changing while the backup runs.
In general, if this is a concern, I’d tend to favor #2 as an option, because it’s an all-in-one solution that deals with all of the problems of files changing while being backed up: DBMSes are just a particularly thorny example of that.
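A sketch of option #2 on btrfs (subvolume path and repo are made-up; the same idea works with LVM or ZFS snapshots):

```shell
#!/bin/sh
# Sketch: take a read-only snapshot, back up the snapshot instead of the
# live filesystem, then drop it. Paths are examples; needs root.
SRC=/srv
SNAP=/srv/.backup-snap

if command -v btrfs >/dev/null 2>&1 && command -v borg >/dev/null 2>&1; then
    btrfs subvolume snapshot -r "$SRC" "$SNAP"
    borg create "/srv/backup/repo::{hostname}-{now}" "$SNAP"
    btrfs subvolume delete "$SNAP"
fi
```

Because the snapshot is atomic, every file in it reflects the same instant, which is exactly the consistency property a live DBMS file needs.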
Full disclosure: I mostly use ext4 myself, rather than btrfs. But I also don’t run live DBMSes.
EDIT: Plus, #2 also provides consistency across different files on the filesystem, though that's usually less critical. You won't run into a situation where software on your computer updates File A, then does a sync(), then updates File B, but your backup program grabs the new version of File B and the old version of File A. Absent help from the filesystem, your backup program can't know where write barriers spanning different files are happening.

In practice, that's not usually a huge issue, since fewer software packages are impacted by this than by write ordering internal to a single file. But a program is permitted, under Unix filesystem semantics, to expect that write order to persist, and to kerplode if it doesn't… and a traditional backup won't preserve it the way a backup with help from the filesystem can.
By everything, does this mean the docker file and its volume?
The whole drive. The docker file and volumes are the bare minimum.
Sweet, I'll get to it.
In addition to daily backups, once a month I image the drive. I wrote a simple script, triggered by a cron job, to image the drive to a NAS. The daily backups go to three different offsite storage locations, and two go to separate NAS drives. All drive images are kept both locally and off-premises as well. So for individual files, etc., I can restore from the daily backups, and if the wheels fall off, I can restore the whole drive from an image. It might be a bit over-engineered, but I've been caught a few times, so I just decided that won't happen again.
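I don't know the exact script, but a cron-driven image job of that shape might look like this (disk device and NAS mount are made-up examples; note an image of a live disk has the same consistency caveat discussed above):

```shell
#!/bin/sh
# Sketch: image the whole disk to a NAS-mounted path, compressed.
# Device and destination are examples; run from root's crontab.
DISK=/dev/sda
DEST="/mnt/nas/images/$(hostname)-$(date +%Y-%m).img.gz"

# Only proceed if the device exists and the NAS share is mounted.
if [ -b "$DISK" ] && [ -d "$(dirname "$DEST")" ]; then
    dd if="$DISK" bs=4M status=progress | gzip > "$DEST"
fi
```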
Interesting.
Yep, I agree there’s 2 types of backups:
- data
- OS image
Out of curiosity, how are you doing the drive imaging?
My setup is easy and reliable:
A Bash script that runs restic to back up to Backblaze with a 90-day retention policy, plus a systemd service + timer.
It runs every day, everything is backed up to B2, and I don't need to bother with it.
Pros:
- easy
- quick
- reliable
- private (restic encrypts before sending)
- don’t need to worry about multiple backups as Backblaze does it for me (3-2-1 system)
Cons:
- costs (very little) money (Backblaze is basically the cheapest provider)
- long restore time as it would be slow to download
- restore costs (pay per gb downloaded)
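The shape of that setup, sketched out (bucket name and paths are made-up; restic reads the credentials from the environment, e.g. a systemd EnvironmentFile):

```shell
#!/bin/sh
# Sketch: restic to Backblaze B2 with a 90-day retention window.
# Repo/bucket name is an example; RESTIC_PASSWORD, B2_ACCOUNT_ID and
# B2_ACCOUNT_KEY are expected to come from the environment.
export RESTIC_REPOSITORY="b2:my-bucket:server1"
KEEP="--keep-within 90d"

if command -v restic >/dev/null 2>&1; then
    restic backup /srv/docker
    restic forget --prune $KEEP
fi
```

Hooked to systemd, that's a `.service` unit running the script plus a `.timer` with something like `OnCalendar=daily`.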
Check the app's own docs first; there's something here about automating backups:
https://docs.frappe.io/erpnext/user/manual/en/download-backup
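From the command line, Frappe's bench CLI can produce a database dump plus the site's files in one go (the site name below is a made-up example; on a Docker install, run it inside the backend container):

```shell
#!/bin/sh
# Sketch: trigger ERPNext/Frappe's built-in backup, including public and
# private files. Site name is an example.
SITE=erp.example.com

if command -v bench >/dev/null 2>&1; then
    bench --site "$SITE" backup --with-files
fi
```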
Oh this is smart! Thanks
I back up the whole / with borg.
It has insane deduplication and compression, and it creates a structure kind of like git, so you can have lots of incremental versions without using much space.
Borg is a solid choice, but over the last couple of years I've transitioned to restic, which I prefer slightly. It seems a lot faster, has better usability/ergonomics, and it's easier to configure an append-only setup (so your server can write snapshots, but is incapable of deleting them and such).
I’ve never tried restic.
I’m happy with borg and no real reason to switch.
Just wanted to add that borgmatic is like a configuration manager for borg backup. Still CLI & config file, and just running borg commands on the back end, but adds some nice features like notifications while really simplifying the configuration required.