this post was submitted on 12 Aug 2023


Hey guys,

I would like to set up some backups.

I have a Raspberry Pi at home and two VPSs. I’m trying to set up borgmatic on the Pi to back up both the Pi itself and the two VPSs, but I’m not sure this can be done.

Right now I’m planning to back up the Pi directly, and to use rclone to mount one of the VPSs and back it up from there. The issue is the second VPS: it runs MariaDB and I can’t see how to back it up remotely (the port is not exposed publicly). I can’t find anything about tunneling in borgmatic. Am I forced to install borgmatic on the VPS to back it up? If I do this, how can I merge that backup with the other ones?

Actually, should I merge them at all, or have 3 separate borg repositories?

Lastly, my raspberry uses rclone to push to S3 and I don’t want the keys to be accessible on the VPS’s, that’s why I’m trying to have borgmatic only on my raspberry.

Thanks for your help!

[–] [email protected] 9 points 1 year ago (1 children)

Before committing to borg, I would recommend taking a look at restic. In my opinion it is better than borg in every respect.

As for how to back up the database, my advice is to export the database to a SQL file and back up that file. That will always be easier than having to deal with agents that connect to the database.

As for the number of repositories, if you use restic, a single repository is enough. Besides, as restic does deduplication, if you have the same files between your machines, they will only occupy the space of one. ;-)
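A minimal sketch of what sharing one restic repository between machines could look like. The function name and the paths in the usage line are hypothetical, and it assumes the repository password is supplied through restic's usual environment variables (for example RESTIC_PASSWORD_FILE).

```shell
# Hypothetical helper: back up some paths into a shared restic repository,
# recording them under an explicit host name so that snapshots from
# different machines can be told apart later.
# Assumption: the password comes from RESTIC_PASSWORD_FILE or similar.
restic_backup_host() {
    local repo=$1 host=$2
    shift 2
    # The remaining arguments are the paths to back up.
    restic --repo "$repo" backup --host "$host" "$@"
}
```

It could be called as `restic_backup_host /srv/restic vps1 /mnt/vps1/etc`, where both paths are placeholders.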

I hope I have helped you with some of my ideas.

Best regards.

[–] [email protected] 1 points 1 year ago

Thanks, indeed, restic looks interesting!

For now I don’t have much use for the deduplication, as each server hosts different data.

One drawback I can see is the memory usage. My Raspberry Pi only has 2 GB of RAM, so I’m not sure whether this is going to be an issue. Currently the data I need to back up is around 100 GB.

[–] [email protected] 5 points 1 year ago (1 children)

For the database, consider a script that does a "mysqldump" of the entire database, scheduled to run on the system daily or weekly. Also consider using gpg in the same script to encrypt the plain-text dump and then delete the original, so that you don't leave an unencrypted copy of the data anywhere outside the database. You can then either copy the encrypted file to a local folder that you're backing up or, if you've set things up to back up directly on the remote, do that instead. Bringing it local gives you a staged copy outside the archive and off the original host, in case you need an immediately available backup of your database.
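A minimal sketch of the dump-and-encrypt step described above. The function name, the output file name and the recipient address are hypothetical. It assumes mysqldump can read its credentials from a client option file such as ~/.my.cnf, and that the recipient's public key is already in the local gpg keyring.

```shell
# Hypothetical helper: dump all databases, encrypt the dump with gpg,
# and remove the plaintext file whether or not encryption succeeds.
# Assumptions: credentials come from ~/.my.cnf, the public key for
# "$recipient" is already imported, and "$out_dir" is a private directory.
dump_and_encrypt() {
    local out_dir=$1 recipient=$2
    local plain="$out_dir/all-databases.sql"

    # --single-transaction takes a consistent snapshot of InnoDB tables.
    if ! mysqldump --all-databases --single-transaction > "$plain"; then
        rm -f "$plain"
        return 1
    fi

    # Encrypt to "$plain.gpg"; --batch and --yes keep gpg non-interactive.
    if ! gpg --batch --yes --encrypt --recipient "$recipient" \
            --output "$plain.gpg" "$plain"; then
        rm -f "$plain"
        return 1
    fi

    rm -f "$plain"
}
```

A cron job could then call something like `dump_and_encrypt /var/backups/db backup@example.org` (both values are placeholders) and leave only the .gpg file behind for the backup to pick up.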

With respect to the 3 separate repos, I would say keep them separate unless you have a large amount of duplicated data. Borg does not deduplicate across different repos as far as I'm aware. The downside of a single repo is that it is locked during backups. If you're running different scripts from each host, the lock files borg creates can become stale when a script doesn't complete. Then one day (probably the day you're trying to restore) you'll find that borg hasn't been backing your stuff up, because a lock file left behind by a backup that was cut short by an untimely reboot months ago is still holding the repo locked. I don't recall now why this occurs and doesn't self-correct, but I do remember concluding that if deduplication isn't a major factor, it's easier and safer to keep the borg repos separate by host. Deduplication is the only reason to combine them as far as I can tell.

When it comes to backup scripts, try to keep everything foolproof and use checks where you can to make sure the script is seeing the expected data, completes successfully and so on. Setting up automatic backups isn't a trivial task, although maybe tools like rclone and borgmatic simplify it - I haven't used those, just borg command line and scp/gpg in shell scripts. Have fun!

[–] [email protected] 1 points 1 year ago (2 children)

Thanks for the clarification about the repositories!

If I understood correctly, I should run a cron job on the VPS that dumps the DB and encrypts it. Then borg only has to fetch the dump and archive it.

Also what is the reason not to use the mariadb tool provided by borg? It looked interesting because of the data stream.

[–] [email protected] 1 points 1 year ago (1 children)

Also what is the reason not to use the mariadb tool provided by borg?

I don’t know what tool you mean and can’t find any references online. I do see that Borgmatic allows hooks to run a program like mysqldump before a backup run, but it’s neither part of Borg itself nor has anything to do with streaming data, so I’m still confused about what tool you’ve found.

The advice you’ve gotten is good and it’s what I do. A cron job runs mysqldump, a different cron job runs borg, and I do error checking on both of those as well as occasional test restores.

[–] [email protected] 1 points 1 year ago (1 children)

I was talking about this: https://torsion.org/borgmatic/docs/how-to/backup-your-databases/

If you're using MariaDB, use the MariaDB database hook instead of mysql_databases: as the MariaDB hook calls native MariaDB commands instead of the deprecated MySQL ones.

But I guess this is basically a mysql dump reading from stdin

[–] [email protected] 1 points 1 year ago

Yeah, the main "innovation" is that it streams directly from MySQL/MariaDB to your encrypted Borg repository without hitting disk.

[–] [email protected] 1 points 1 year ago

Yes, you understand the suggested approach. I don't know about the mariadb tool, and if it looks good, by all means use it. But I would offer that the fastest, simplest way I can think of to restore a reasonably small database is with a SQL dump. Any additional complexity just seems like it's adding potential failure points. You don't want to be messing around with borg or any other tools to replay transactions when all you want to do is get your database rebuilt. Also, if you have an encrypted local copy of the dump, then restoring from borg is the last resort, because most of the time you'll just need the latest backup. I would bring the data local and back it up there if feasible. Then you only need a remote connection to grab the encrypted file, and you'll always have a recent local copy if your server goes kaput. Borg will back it up incrementally.

[–] [email protected] 3 points 1 year ago* (last edited 1 year ago) (1 children)

Personally I would create one borg repo for every server I back up. But I also use borgbase, which encourages that by default.

About the backup of the database: use mysqldump and store that. You can also do it like this:

mysqldump [...] | borg create [...] -

The - tells borg to use stdin as the content you want to store.

Lastly, there is a pull mode in borg; that way you could run one script on one host to back up all your servers. You would need to run mysqldump over ssh then.
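A minimal sketch of the "mysqldump over ssh, piped into borg's stdin" idea from the two suggestions above. This is only the database part, not borg's full pull mode for files. The function name and archive naming are hypothetical. It assumes bash (for pipefail), Borg 1.2 or later (for --stdin-name), key-based ssh access to the host, a Borg passphrase supplied through the environment, and a database name without shell metacharacters.

```shell
# Hypothetical helper, run on the trusted machine: dump one database on a
# remote host over ssh and stream it straight into a local Borg repository.
# Assumptions: bash, Borg >= 1.2, ssh key access to "$host", and
# BORG_PASSPHRASE (or similar) already set in the environment.
pull_remote_dump() {
    local host=$1 db=$2 repo=$3
    (
        # Without pipefail only borg's exit status would be reported,
        # so a failed remote dump could go unnoticed.
        set -o pipefail
        ssh "$host" "mysqldump --single-transaction $db" |
            borg create --stdin-name "$db.sql" \
                "${repo}::${host}-${db}-{now}" -
    )
}
```

Because the dump runs on the VPS itself, the database port does not need to be exposed, and the repository credentials stay on the machine that runs the script. One caveat: if the dump dies halfway, borg may still store a truncated archive, so the non-zero exit status is worth checking.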

Edit:

You can also use one repo for all servers. If you do, give each server's archives a distinct name prefix so that you can use --glob-archives with borg prune; otherwise pruning might clean out old backups that you want to keep.
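A minimal sketch of pruning per host in a shared repo, as described above. The function name and the retention numbers are hypothetical. It assumes Borg 1.x and archive names that start with the host name followed by a dash.

```shell
# Hypothetical helper: prune only the archives belonging to one host.
# Assumptions: Borg 1.x, archives named "<host>-<something>", and
# retention values that are placeholders to adjust.
prune_host_archives() {
    local repo=$1 host=$2
    # The glob is quoted so the shell passes it to borg unexpanded.
    borg prune --glob-archives "${host}-*" \
        --keep-daily 7 --keep-weekly 4 --keep-monthly 6 \
        "$repo"
}
```

Adding `--dry-run --list` to the borg prune call first is a cheap way to see which archives would be removed before anything is deleted.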

[–] [email protected] 3 points 1 year ago

Thanks, pull mode is exactly what I was looking for!

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago) (1 children)

borgmatic dev here. What I do is run borgmatic locally on each server that needs to get backed up. That's a whole lot easier IMO than setting up network filesystems / rclone or tunnels or screwing around with database dumps yourself, and potentially more reliable. So in your case, I'd run borgmatic on the VPS and then have it connect locally to your MariaDB database using borgmatic's native database support. And then if you also back up the local files with that same VPS instance of borgmatic, there's nothing to "merge."

I'd generally recommend one Borg repository per source server / instance of borgmatic.

Lastly, my raspberry uses rclone to push to S3 and I don’t want the keys to be accessible on the VPS’s, that’s why I’m trying to have borgmatic only on my raspberry.

You could always have borgmatic back up to a local Borg repository on the VPS, and then run rclone on your trusted server to copy that repository to S3. Personally I'd probably just put the S3 keys on the VPS and lock it down so that I trust its security, but you do you. 😀
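A minimal sketch of the "run rclone on the trusted server" option. The function name and both remote names in the usage line are hypothetical. It assumes two rclone remotes are already configured on the trusted machine only: an SFTP remote pointing at the VPS and an S3 remote.

```shell
# Hypothetical helper, run on the trusted machine: mirror a Borg repository
# that lives on the VPS into S3. The data flows through this machine, so
# the S3 keys never have to be stored on the VPS.
# Assumption: both rclone remotes are configured locally.
copy_repo_to_s3() {
    local src=$1 dst=$2
    # sync makes the destination match the source, including deletions.
    rclone sync "$src" "$dst"
}
```

Usage could look like `copy_repo_to_s3 vps2-sftp:/var/backups/borg s3:my-bucket/vps2`, with placeholder remotes and paths. It is probably safest to schedule this after the borgmatic run on the VPS has finished, so the repository is not copied while it is being written.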

[–] [email protected] 1 points 1 year ago (1 children)

I see, thank you.

For now I went with the cron dump and rclone. The only issue with this setup is that I can’t easily monitor the database dump. Thus, if the dump fails, borg will just back up the failed dump…

As for the VPS, of course, ideally, it’s secured enough. But as it is said, if the server is exposed to the Internet you cannot be sure of anything…
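For the failed-dump concern above, here is a minimal sketch of a sanity check that could run before borg picks the file up. The function name is hypothetical. It relies on mysqldump ending its output with a "-- Dump completed" comment, which only holds if the dump was made without options that suppress comments (such as --skip-comments or --compact) and before any encryption step.

```shell
# Hypothetical helper: succeed only if the file is non-empty and ends with
# the "-- Dump completed" trailer that mysqldump writes on success.
# Assumption: comments were not disabled when the dump was created.
dump_looks_complete() {
    local file=$1
    [ -s "$file" ] && tail -n 3 "$file" | grep -q '^-- Dump completed'
}
```

A wrapper script could skip the backup, or exit non-zero, when `dump_looks_complete /path/to/dump.sql` fails, so that a truncated dump does not quietly replace a good one.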

[–] [email protected] 2 points 1 year ago

For the cron dumps, you could plug the cron job into a monitoring service (Healthchecks, etc.) so you'd at least know when it fails.
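A minimal sketch of wiring a cron job into a ping-based monitor. The function name is hypothetical and the URL in the usage line is a placeholder, not a real check. It assumes a Healthchecks-style service where requesting the check's URL signals success and requesting the same URL with /fail appended signals failure.

```shell
# Hypothetical helper: run a command, then report the result to a
# ping-based monitoring service.
# Assumption: "$url" signals success and "$url/fail" signals failure.
run_with_ping() {
    local url=$1
    shift
    if "$@"; then
        curl -fsS -m 10 --retry 5 -o /dev/null "$url"
    else
        # Remember the command's own exit status before calling curl.
        local rc=$?
        curl -fsS -m 10 --retry 5 -o /dev/null "$url/fail"
        return "$rc"
    fi
}
```

The cron entry would then call something like `run_with_ping https://hc-ping.com/your-uuid-here /usr/local/bin/dump-db.sh`, where the script path is also a placeholder. If the job stops running altogether, the missing success ping is what triggers the alert.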