this post was submitted on 25 Jul 2023

190 points (98.0% liked)


How do you all go about backing up your data, on Linux? (lemmy.world)

submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]


I'm trying to find a good method of making periodic, incremental backups. I assume the most minimal approach would be to have a cron job run rsync periodically, but I'm curious what other solutions exist.

I'm interested in both command-line, and GUI solutions.
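
For reference, here is a minimal sketch of the cron-plus-rsync approach mentioned above, assuming rsync is installed and the destination filesystem supports hard links. The function name `run_snapshot`, the `latest` symlink, and all paths are made up for illustration. With `--link-dest`, each run looks like a full snapshot, but unchanged files are hard-linked to the previous one, so only changed files take new space.

```shell
#!/bin/sh
# Sketch only: run_snapshot, the "latest" symlink, and the paths are
# hypothetical. DEST_ROOT should be an absolute path.

# run_snapshot SRC DEST_ROOT [STAMP]
# Copies SRC into DEST_ROOT/STAMP, hard-linking unchanged files against the
# previous snapshot (DEST_ROOT/latest) when one exists.
run_snapshot() {
    src=$1
    dest_root=$2
    stamp=${3:-$(date +%Y%m%d-%H%M%S)}

    mkdir -p "$dest_root" || return 1

    # Build the rsync argument list in the positional parameters
    set -- -a --delete
    if [ -d "$dest_root/latest" ]; then
        set -- "$@" "--link-dest=$dest_root/latest"
    fi

    rsync "$@" "$src/" "$dest_root/$stamp/" || return 1

    # Point "latest" at the snapshot that just finished
    ln -sfn "$stamp" "$dest_root/latest"
}

# Example crontab entry (nightly at 02:30), assuming this file is saved as
# /usr/local/bin/snapshot.sh and ends with: run_snapshot "$@"
# 30 2 * * * /usr/local/bin/snapshot.sh /home/me /mnt/backup/home
```

Old snapshots can be deleted with a plain `rm -rf` on their directory; files still referenced by newer snapshots stay on disk because of the hard links.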


[–] [email protected] 2 points 1 year ago

I do most of my work on NFS, backed by ZFS on raidz2, and send snapshots for offline backup.

Don't have a serious offsite setup yet, but it's coming.

[–] [email protected] 2 points 1 year ago

GitHub for projects, Syncthing to my NAS for some config files, and that's pretty much it. I don't care about the rest.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

I've got an SMB server set up with a 12 TB server drive. Anything important gets put on there.

Edit: fixed spelling

[–] [email protected] 2 points 1 year ago

I use rsync to an external drive, but before that I toyed a bit with Pika Backup.

I don't automate my backup because I physically connect my drive to perform the task.

[–] [email protected] 2 points 1 year ago

I use bupstash to back up to a server I built a few years ago.

[–] [email protected] 2 points 1 year ago

I use luckyBackup to mirror to an external drive. I also use Duplicacy to back up two other separate drives at the same time. Have a read of the data hoarder wiki on backups.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

I made my own bash script that uses rsync. I stopped using GitHub so here's a paste lol.

I define the backups like this: the first item on each line is the source, and the other items on that line are its exclusions.

/home/shared
/home/jamie     tmp/ dj_music/ Car_Music_USB
/home/jamie_work
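
As a sketch of how a line in this format can be split into a source and its exclusions, the function below uses plain word splitting. `parse_def_line` and its `key=value` output are hypothetical and not part of the script that follows.

```shell
#!/bin/sh
# Sketch only: parse_def_line and its "key=value" output are assumptions.

# parse_def_line LINE
# Prints "source=PATH" followed by one "exclude=PATTERN" line per exclusion.
# Returns 1 for blank or comment lines, 2 when the source is not absolute.
parse_def_line() {
    case $1 in
        ''|'#'*) return 1 ;;
        /*)      ;;
        *)       return 2 ;;
    esac

    # Split on whitespace with globbing disabled so patterns stay literal
    set -f
    set -- $1
    set +f

    printf 'source=%s\n' "$1"
    shift
    for pattern in "$@"; do
        printf 'exclude=%s\n' "$pattern"
    done
}

parse_def_line '/home/jamie     tmp/ dj_music/ Car_Music_USB'
```

For the second line of the example definitions, this prints `source=/home/jamie` followed by one `exclude=` line for each of `tmp/`, `dj_music/`, and `Car_Music_USB`.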

#!/usr/bin/ssh-agent /bin/bash

# chronicle.sh



# Get absolute directory chronicle.sh is in
REAL_PATH="$(cd "$(dirname "$0")" && pwd)"

# Defaults
BACKUP_DEF_FILE="${REAL_PATH}/backup.conf"
CONF_FILE="${REAL_PATH}/chronicle.conf"
FAIL_IF_PRE_FAILS='0'
FIXPERMS='true'
FORCE='false'
LOG_DIR='/var/log/chronicle'
LOG_PREFIX='chronicle'
NAME='backup'
# Use $HOME here: a tilde inside quotes is not expanded
PID_FILE="${HOME}/chronicle/chronicle.pid"
RSYNC_OPTS="-qRrltH --perms --delete --delete-excluded"
SSH_KEYFILE="${HOME}/.ssh/id_rsa"
TIMESTAMP='date +%Y%m%d-%T'

# Set PID file for root user
[ $EUID = 0 ] && PID_FILE='/var/run/chronicle.pid'


# Print an error message and exit
ERROUT () {
    TS="$(TS)"
    echo "$TS $LOG_PREFIX (error): $1"
    echo "$TS $LOG_PREFIX (error): Backup failed"
    rm -f "$PID_FILE"
    exit 1
}


# Usage output
USAGE () {
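cat << EOF
USAGE chronicle.sh [ OPTIONS ]

OPTIONS
    -d path   backup definitions file (default: backup.conf)
    -f path   configuration file (default: chronicle.conf)
    -F        force overwrite incomplete backup
    -m        do not set owner and mode on backup files
    -h        display this help
EOF
# The caller picks the exit status (0 for -h, 1 for a bad option)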
}


# Timestamp
TS ()
{
    if
        echo $TIMESTAMP | grep tai64n &>/dev/null
    then
        echo "" | eval $TIMESTAMP
    else
        eval $TIMESTAMP
    fi
}


# Logger function
# First positional parameter is message severity (notice|warn|error)
# The log message can be the second positional parameter, stdin, or a HERE string
LOG () {
    local TS="$(TS)"
    # local input=""

    msg_type="$1"

    # if [[ -p /dev/stdin ]]; then
    #     msg="$(cat -)"
    # else
        shift
        msg="${@}"
    # fi
    echo "$TS chronicle ($msg_type): $msg"
}

# Logger function
# First positional parameter is message severity (notice|warn|error)
# The log message can be stdin or a HERE string
LOGPIPE () {
    local TS="$(TS)"
    msg_type="$1"
    msg="$(cat -)"
    echo "$TS chronicle ($msg_type): $msg"
}

# Process Options
while getopts ":d:f:Fmh" options; do
    case $options in
        d ) BACKUP_DEF_FILE="$OPTARG" ;;
        f ) CONF_FILE="$OPTARG" ;;
        F ) FORCE='true' ;;
        m ) FIXPERMS='false' ;;
        h ) USAGE; exit 0 ;;
        * ) USAGE; exit 1 ;;
    esac
done


# Ensure a configuration file is found
if
    [ -z "${CONF_FILE}" ] || [ ! -r "${CONF_FILE}" ]
then
    ERROUT "Cannot find configuration file $CONF_FILE"
fi

# Read the config file
. "$CONF_FILE"


# Set the owner and mode for backup files
if [ "$FIXPERMS" = 'true' ]; then
    #FIXVAR="--chown=${SSH_USER}:${SSH_USER} --chmod=D770,F660"
    FIXVAR="--usermap=*:${SSH_USER} --groupmap=*:${SSH_USER} --chmod=D770,F660"
fi


# Set up logging

if [ "${LOG_DIR}x" = 'x' ]; then
    ERROUT 'LOG_DIR not specified'
fi

mkdir -p "$LOG_DIR"
LOGFILE="${LOG_DIR}/chronicle.log"

# Make all output go to the log file
exec >> "$LOGFILE" 2>&1


# Ensure a backup definitions file is found
if
    [ -z "${BACKUP_DEF_FILE}" ] || [ ! -r "${BACKUP_DEF_FILE}" ]
then
    ERROUT "Cannot find backup definitions file $BACKUP_DEF_FILE"
fi


# Check for essential variables
VARS='BACKUP_SERVER SSH_USER BACKUP_DIR BACKUP_QTY NAME TIMESTAMP'
for var in $VARS; do
    # ${!var} expands to the value of the variable whose name is in $var
    if [ -z "${!var}" ]; then
        ERROUT "${var} not specified"
    fi
done


LOG notice "Backup started, keeping $BACKUP_QTY snapshots with name \"$NAME\""


# Export variables for use with external scripts
export SSH_USER RSYNC_USER BACKUP_SERVER BACKUP_DIR LOG_DIR NAME REAL_PATH


# Check for PID
if
    [ -e "$PID_FILE" ]
then
    LOG error "$PID_FILE exists"
    LOG error 'Backup failed'
    exit 1
fi

# Write PID
mkdir -p "$(dirname "$PID_FILE")"
echo "$$" > "$PID_FILE"

# Add key to SSH agent
ssh-add "$SSH_KEYFILE" 2>&1 | LOGPIPE notice -

# enhance script readability
CONN="${SSH_USER}@${BACKUP_SERVER}"


# Make sure the SSH server is available
if
    ! ssh $CONN echo -n ''
then
    ERROUT "$BACKUP_SERVER is unreachable"
fi


# Fail if ${NAME}.0.tmp is found on the backup server.
if
    ssh ${CONN} [ -e "${BACKUP_DIR}/${NAME}.0.tmp" ] && [ "$FORCE" = 'false' ]
then
    ERROUT "${NAME}.0.tmp exists, ensure backup data is in order on the server"
fi


# Try to create the destination directory if it does not already exist
if
    ssh $CONN [ ! -d $BACKUP_DIR ]
then
    if
        # Chain with && so a failed mkdir is not masked by a successful chown
        ssh $CONN mkdir -p "$BACKUP_DIR" &&
        ssh $CONN chown ${SSH_USER}:${SSH_USER} "$BACKUP_DIR"
    then :
    else
        ERROUT "Cannot create $BACKUP_DIR"
    fi
fi

# Create metadata directory
ssh $CONN mkdir -p "$BACKUP_DIR/chronicle_metadata"


#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
# PRE_COMMAND

if
    [ -n "$PRE_COMMAND" ]
then
    LOG notice "Running ${PRE_COMMAND}"
    if
        $PRE_COMMAND
    then
        LOG notice "${PRE_COMMAND} complete"
    else
        LOG error "Execution of ${PRE_COMMAND} was not successful"
        if [ "$FAIL_IF_PRE_FAILS" -eq 1 ]; then
            ERROUT 'Command specified by PRE_COMMAND failed and FAIL_IF_PRE_FAILS enabled'
        fi
    fi
fi


#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
# Backup

# Make a hard link copy of backup.0 to rsync with
if [ $FORCE = 'false' ]; then
    ssh $CONN "[ -d ${BACKUP_DIR}/${NAME}.0 ] && cp -al ${BACKUP_DIR}/${NAME}.0 ${BACKUP_DIR}/${NAME}.0.tmp"
fi


while read -u 9 l; do

    # Skip blank and commented lines
    if [[ -z "$l" || "$l" =~ ^#.* ]]; then
        continue
    fi

    # Skip sources that are not an absolute path
    if [[ "$l" != /* ]]; then
        LOG warn "$l is not an absolute path"
        continue
    fi

    # Reduce whitespace to one tab
    line=$(echo $l | tr -s '[:space:]' '\t')

    # get the source
    SOURCE=$(echo "$line" | cut -f1)

    # get the exclusions
    EXCLUSIONS=$(echo "$line" | cut -f2-)

    # Format exclusions for the rsync command
    unset exclude_line
    if [ ! "$EXCLUSIONS" = '' ]; then
        for each in $EXCLUSIONS; do
            exclude_line="$exclude_line--exclude $each "
        done
    fi


    LOG notice "Using SSH transport for $SOURCE"


    # get directory metadata
    PERMS="$(getfacl -pR "$SOURCE")"


    # Copy metadata
    ssh $CONN mkdir -p ${BACKUP_DIR}/chronicle_metadata/${SOURCE}
    echo "$PERMS" | ssh $CONN -T "cat > ${BACKUP_DIR}/chronicle_metadata/${SOURCE}/metadata"


    LOG debug "rsync $RSYNC_OPTS $exclude_line $FIXVAR $SOURCE ${SSH_USER}@${BACKUP_SERVER}:${BACKUP_DIR}/${NAME}.0.tmp"

    rsync $RSYNC_OPTS $exclude_line $FIXVAR "$SOURCE" \
    "${SSH_USER}"@"$BACKUP_SERVER":"${BACKUP_DIR}/${NAME}.0.tmp"

done 9< "${BACKUP_DEF_FILE}"


#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
# Try to see if the backup succeeded

if
    ssh $CONN [ ! -d "${BACKUP_DIR}/${NAME}.0.tmp" ]
then
    ERROUT "${BACKUP_DIR}/${NAME}.0.tmp not found, no new backup created"
fi


# Test for empty temp directory (ls runs on the server, the test runs locally)
if
    [ -z "$(ssh $CONN ls -A "${BACKUP_DIR}/${NAME}.0.tmp" 2>/dev/null)" ]
then
    ERROUT "No new backup created"
fi

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
# Rotate

# Number of oldest backup
X=`expr $BACKUP_QTY - 1`


LOG notice 'Rotating previous backups'

# keep oldest directory temporarily in case rotation fails
ssh $CONN [ -d "${BACKUP_DIR}/${NAME}.${X}" ] && \
ssh $CONN mv "${BACKUP_DIR}/${NAME}.${X}" "${BACKUP_DIR}/${NAME}.${X}.tmp"


# Rotate previous backups
until [ $X -eq -1 ]; do
    Y=$X
    X=`expr $X - 1`

    ssh $CONN [ -d "${BACKUP_DIR}/${NAME}.${X}" ] && \
    ssh $CONN mv "${BACKUP_DIR}/${NAME}.${X}" "${BACKUP_DIR}/${NAME}.${Y}"
    [ $X -eq 0 ] && break
done

# Create "backup.0" directory
ssh $CONN mkdir -p "${BACKUP_DIR}/${NAME}.0"


# Get individual items in "backup.0.tmp" directory into "backup.0"
# so that items removed from backup definitions rotate out
while read -u 9 l; do

    # Skip blank and commented lines
    if [[ -z "$l" || "$l" =~ ^#.* ]]; then
        continue
    fi

    # Skip invalid sources that are not an absolute path
    if [[ "$l" != /* ]]; then
        continue
    fi

    # Reduce whitespace to one tab
    line=$(echo $l | tr -s '[:space:]' '\t')

    source=$(echo "$line" | cut -f1)

    source_basedir="$(dirname "$source")"

    ssh $CONN mkdir -p "${BACKUP_DIR}/${NAME}.0/${source_basedir}"

    LOG debug "ssh $CONN cp -al ${BACKUP_DIR}/${NAME}.0.tmp${source} ${BACKUP_DIR}/${NAME}.0${source_basedir}"

    ssh $CONN cp -al "${BACKUP_DIR}/${NAME}.0.tmp${source}" "${BACKUP_DIR}/${NAME}.0${source_basedir}"

done 9< "${BACKUP_DEF_FILE}"


# Remove oldest backup
X=`expr $BACKUP_QTY - 1` # Number of oldest backup
ssh $CONN rm -Rf "${BACKUP_DIR}/${NAME}.${X}.tmp"

# Set time stamp on backup directory
ssh $CONN touch -m "${BACKUP_DIR}/${NAME}.0"

# Delete temp copy of backup
ssh $CONN rm -Rf "${BACKUP_DIR}/${NAME}.0.tmp"

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
# Post Command

if
    [ ! "${POST_COMMAND}x" = 'x' ]
then
    LOG notice "Running ${POST_COMMAND}"
    if
        $POST_COMMAND
    then
        LOG notice "${POST_COMMAND} complete"
    else
        LOG warn "${POST_COMMAND} complete with errors"
    fi
fi

# Delete PID file
rm -f "$PID_FILE"

# Log success message
LOG notice 'Backup completed successfully'
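
The rotation idea used in the script (a hard-link copy of the newest snapshot, with `NAME.N` shifted to `NAME.N+1`) can be tried locally without SSH. The following is a simplified sketch, assuming GNU `cp` for `-al`. `rotate_snapshots` is a hypothetical helper and not a drop-in replacement for the rotation section above.

```shell
#!/bin/sh
# Sketch only: rotate_snapshots is a made-up helper for local experiments.

# rotate_snapshots DIR NAME QTY
# Drops the oldest snapshot (NAME.<QTY-1>), shifts NAME.N to NAME.N+1, and
# seeds a new NAME.0 as a hard-link copy of the previous newest snapshot.
rotate_snapshots() {
    dir=$1
    name=$2
    qty=$3

    # Refuse to run with empty arguments so rm never sees a bare path
    [ -n "$dir" ] && [ -n "$name" ] && [ "$qty" -ge 1 ] || return 1

    n=$((qty - 1))
    rm -rf "$dir/$name.$n"

    while [ "$n" -gt 0 ]; do
        prev=$((n - 1))
        if [ -d "$dir/$name.$prev" ]; then
            mv "$dir/$name.$prev" "$dir/$name.$n"
        fi
        n=$prev
    done

    if [ -d "$dir/$name.1" ]; then
        # Hard links: the copy takes almost no extra space
        cp -al "$dir/$name.1" "$dir/$name.0"
    else
        mkdir -p "$dir/$name.0"
    fi
}

# Example: rotate_snapshots /mnt/backup backup 7
```

After a rotation, rsync can write into `NAME.0`; files it replaces get new inodes, while untouched files keep sharing storage with the older snapshots.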

[–] [email protected] 2 points 1 year ago

When I do something really dumb I typically just use dd to create an iso. I should probably find something better.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

I use Restic, called from cron, with a password file containing a long randomly generated key.

I back up with Restic to a repository on a different local hard drive (not part of my main RAID array), with --exclude-caches as well as excluding lots of files that can easily be re-generated / re-installed / re-downloaded (so my backups are focused on important data). I make sure to include all important data including /etc (and also back up the output of dpkg --get-selections as part of my backup). I auto-prune my repository to apply a policy on how far back I keep (de-duplicated) Restic snapshots.

Once the backup completes, my script runs du -s on the backup and emails me if it is unexpectedly too big (e.g. I forgot to exclude some new massive file), otherwise it uses rclone sync to sync the archive from the local disk to Backblaze B2.

I back up my password for B2 (in an encrypted password database) separately, along with the Restic decryption key. Restore procedure is: if the local hard drive is intact, restore with Restic from the last good snapshot on the local repository. If it is also destroyed, rclone sync the archive from Backblaze B2 to local, and then restore from that with Restic.
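
A rough sketch of what such a cron-driven Restic job could look like. Every path, the retention policy, the size limit, the rclone remote name `b2`, the bucket name, and the mail address are placeholders, and it assumes `restic`, `rclone`, `dpkg`, and a `mail` command are available. It is not the commenter's actual script.

```shell
#!/bin/sh
# Sketch only: all names and values below are placeholders.

REPO=/mnt/backupdisk/restic-repo
PASSWORD_FILE=/root/.restic-password
MAX_KB=$((500 * 1024 * 1024))     # 500 GiB expressed in KiB

# size_within_limit USED_KB LIMIT_KB: succeeds when USED_KB <= LIMIT_KB
size_within_limit() {
    [ "$1" -le "$2" ]
}

run_backup() {
    # Record the package list so it is captured with the backup
    dpkg --get-selections > /var/backups/dpkg-selections.txt || return 1

    restic -r "$REPO" --password-file "$PASSWORD_FILE" backup \
        --exclude-caches /etc /home /var/backups || return 1

    # Apply a retention policy and drop unreferenced data
    restic -r "$REPO" --password-file "$PASSWORD_FILE" forget \
        --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --prune || return 1

    # Only sync offsite when the repository has a plausible size
    used_kb=$(du -sk "$REPO" | cut -f1)
    if size_within_limit "$used_kb" "$MAX_KB"; then
        rclone sync "$REPO" b2:my-bucket/restic-repo
    else
        echo "Repository is ${used_kb} KiB, over the limit" |
            mail -s 'Backup larger than expected' admin@example.com
    fi
}

# run_backup    # uncomment to run; typically called from cron
```

Restoring from the local repository would then be along the lines of `restic -r "$REPO" --password-file "$PASSWORD_FILE" restore latest --target /some/dir`.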

For Postgres databases I do something different (they aren't included in my Restic backups, except for config files): I back them up with pgbackrest to Backblaze B2, with archive_mode on and an archive_command to archive WALs to Backblaze. This allows me to do point-in-time recovery (PITR), back to a point in accordance with my pgbackrest retention policy.

For Docker containers, I create them with docker-compose, and keep the docker-compose.yml so I can easily re-create them. I avoid keeping state in volumes, and instead use volume mounts to a location on the host, and back up the contents for important state (or use PostgreSQL for state instead where the service supports it).

[–] [email protected] 2 points 1 year ago (1 children)

I use timeshift. It really is the best. For servers I go with restic.

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

I use timeshift because it was pre-installed. But I can vouch for it; it works really well, and lets you choose and tweak every single thing in a legible user interface!

[–] [email protected] 2 points 1 year ago (2 children)

I run ZFS on my servers and then replicate to other ZFS servers with Syncoid.


[–] [email protected] 1 points 1 year ago

I use Duplicacy to encrypt and back up my data to OneDrive on a schedule. If Proton ever creates a Linux client for Drive, then I'll switch to that, but I'm not holding my breath.

[–] [email protected] 1 points 1 year ago

Most of my data is backed up to (or just stored on) a VPS in the first instance, and then I back up the VPS to a local NAS daily using rsnapshot (the NAS is just a few old hard drives attached to a Raspberry Pi until I can get something more robust). Very occasionally I'll back the NAS up to a separate drive. I also occasionally back up my laptop directly to a separate hard drive.

Not a particularly robust solution but it gives me some peace of mind. I would like to build a better NAS that can support RAID, as I was never able to get it working with the Pi.

[–] [email protected] 1 points 1 year ago (1 children)

I use Pika Backup, which uses Borg under the hood. It's pretty good, with amazing documentation. The main issue I have with it is that it's really finicky and kind of a pain to set up, even if it "just works" after that.

[–] [email protected] 2 points 1 year ago (1 children)

Can you restore from it? That's the part I've always struggled with.

[–] [email protected] 1 points 1 year ago

The way pika backup handles it, it loads the backup as a folder you can browse. I've used it a few times when hopping distros to copy and paste stuff from my home folder. Not very elegant, but it works and is very intuitive, even if I wish I could just hit a button and reset everything to the snapshot.

[–] [email protected] 1 points 1 year ago

I run OpenMediaVault and I back up using BorgBackup. Super easy to set up, use, and modify.

[–] [email protected] 1 points 1 year ago (2 children)

ZFS send / receive and snapshots.

[–] [email protected] 2 points 1 year ago (1 children)

Does this method let you pick what you need to back up, or is it the entire filesystem?

[–] [email protected] 1 points 1 year ago

It allows me to copy select datasets inside the pool.

So I can choose rpool/USERDATA/so-n-so_123xu4 for user so-n-so. I can also choose to copy some or all of rpool/ROOT/ubuntu_abcdef and its nested datasets.

I settle for backing up users and rpool/ROOT/ubuntu_abcdef, ignoring the stuff in var datasets. This gets me my users' home, root's home, and /opt. 'Tis all I need. I have snapshots and mirrored M.2 SSDs for handling most other problems (which I've not yet had).

The only bugger is /boot (on bpool). Kernel updates grow in there and fill it up, even if you remove them via apt... because snapshots. So I have to be careful to clean its snapshots.

[–] [email protected] 2 points 1 year ago

me too. ZFS is amazing

[–] [email protected] 1 points 1 year ago

Restic with the Déjà Dup GUI

[–] [email protected] 1 points 1 year ago

Restic to a Synology NAS, Synology software for cloud backup.

[–] [email protected] 1 points 1 year ago

Periodic backup to an external drive via Déjà Dup. Plus, I keep all important docs in Google Drive. All photos are in Google Photos. So it's really only my music which isn't in the cloud. But I might try uploading it to Drive as well one day.

[–] [email protected] 1 points 1 year ago

zfs snap and zfs send to an external or another server.
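
A small sketch of what snapshot-and-send replication looks like on the command line. The dataset, snapshot, and host names are placeholders, and `zfs_send_cmd` is a hypothetical helper that only prints the pipeline for review instead of running it.

```shell
#!/bin/sh
# Sketch only: dataset, snapshot, and host names are placeholders.

# zfs_send_cmd DATASET PREV_SNAP NEW_SNAP TARGET_HOST TARGET_DATASET
# Prints the pipeline that would replicate NEW_SNAP. With an empty PREV_SNAP
# it prints a full send, otherwise an incremental one.
zfs_send_cmd() {
    dataset=$1
    prev=$2
    new=$3
    host=$4
    target=$5

    if [ -n "$prev" ]; then
        send="zfs send -i $dataset@$prev $dataset@$new"
    else
        send="zfs send $dataset@$new"
    fi
    printf '%s | ssh %s zfs receive %s\n' "$send" "$host" "$target"
}

# A snapshot is taken first, for example:
#   zfs snapshot tank/home@2023-07-26
# Then print the replication command for review:
zfs_send_cmd tank/home 2023-07-25 2023-07-26 backuphost backup/home
```

An incremental send only transfers blocks that changed between the two snapshots, which is why this approach stays fast on large datasets.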

[–] [email protected] 1 points 1 year ago

I use duplicity to a drive mounted off a Pi for local, tarsnap for remote. Both are command-line tools; tarsnap charges for their servers based on exact usage. (And thanks for the reminder; I'm due for another review of exactly what parts of which drives I'm backing up.)

[–] [email protected] 1 points 1 year ago (1 children)

Anything important I keep in my Dropbox folder, so then I have a copy on my desktop, laptop, and in the cloud.

When I turn off my desktop, I use restic to back up my Dropbox folder to a local external hard drive, and then restic runs again to back up to Wasabi, which is a storage service like Amazon S3.

Same exact process for when I turn off my laptop, except sometimes I don't have my laptop's external HD plugged in, so that gets skipped.

So that's three local copies, two local backups, and two remote backup storage locations. Not bad.

Changes I might make:

add another remote location
rotate local physical backup device somewhere (that seems like a lot of work)
move to Nextcloud or Seafile instead of Dropbox

I used Seafile for a long time but I couldn't keep it up so I switched to Dropbox.

Advice, thoughts welcome.

[–] [email protected] 1 points 1 year ago

I actually move my Documents, Pictures and other important folders inside my Dropbox folder and symlink them back to their original locations

This gives me the same Docs, Pics, etc. folders synced on every computer.
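
The move-and-symlink trick can be sketched as follows. `relocate_and_link` is a hypothetical helper name, and the example paths assume the sync folder lives at `~/Dropbox`.

```shell
#!/bin/sh
# Sketch only: relocate_and_link is a made-up helper.

# relocate_and_link DIR SYNC_ROOT
# Moves DIR into SYNC_ROOT and leaves a symlink at the original location.
relocate_and_link() {
    dir=$1
    sync_root=$2
    name=$(basename "$dir")

    # Refuse to run twice or to overwrite an existing folder in the sync root
    [ -d "$dir" ] && [ ! -L "$dir" ] || return 1
    [ ! -e "$sync_root/$name" ] || return 1

    mkdir -p "$sync_root" || return 1
    mv "$dir" "$sync_root/$name" || return 1
    ln -s "$sync_root/$name" "$dir"
}

# Example: relocate_and_link "$HOME/Documents" "$HOME/Dropbox"
```

On a second machine where the folder has already synced, only the symlink step is needed.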

[–] [email protected] 1 points 1 year ago

A separate NAS on an Atom CPU with Btrfs in RAID 10, exposed over NFS.

[–] [email protected] 1 points 1 year ago

Either an external hard drive or a pendrive. Just put one of those on a keychain and voila, a perfect backup solution that does not need internet access.

...it's not dumb if it (still) works. :^)

[–] [email protected] 1 points 1 year ago

Vorta + borgbase

The yearly subscription is cheap and fits my storage needs by quite some margin. Gives me peace of mind to have an off-site back up.

I also store my documents on Google Drive.

[–] [email protected] 0 points 1 year ago

Good ol' fashioned rsync once a day to a remote server with ZFS and daily ZFS snapshots (rsync.net). Very fast because it only needs to send changed/new files, and it has saved my hide several times when I needed to access deleted files or old versions of some files from the ZFS snapshots.

[+] [email protected] -49 points 1 year ago (3 children)

Get a Mac, use Time Machine. Go all in on the ecosystem: phone, watch, iPad, TV. I resisted for years but it's so good man, and Apple silicon is just leaps beyond everything else.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

Time Machine is not a backup; it is unreliable. I've had corrupted Time Machine backups, and its backups are non-portable: you can only read them using an Apple machine. Apple Silicon is also not leaps beyond everything else; a 7000-series AMD chip will trade blows on performance per watt given the same power target. (Source: I measured it. A 7950X with a 60 watt power limit will closely match an M1 Ultra given the same 60 watts of power.)

Sure their laptops are tuned better out of the box and have great battery life, but that's not because of the Apple Silicon. Apple had good battery life before, even when their laptops had the same Intel chip as any other laptop. Why? Because of software.

Like before, their new M-chips are nothing special. Apple Silicon chips are great, but so are other modern chips. Apple Silicon is not "leaps beyond everything else".

If you look past their shiny fanboy-bait chips, you realize you pay huge markups on RAM and storage. Apple's RAM and storage isn't anything special, but it's a lot more expensive than any other high-end RAM and storage modules, and it's not like their RAM or storage is better because, again, an AMD chip can just use regular RAM modules and an NVMe SSD and it will match the M-chip performance given the same power target. Except you can replace the RAM modules and the SSD on the AMD chipset for reasonable prices.

In the end, a MacBook is a great product and there's no other laptop that really gets close to its performance given its size. But that's it, that's where Apple's advantage ends. Past their ultra-light MacBooks, you get overpriced hardware, crazy expensive upgrades, with an OS that isn't better, more reliable or more stable than Windows 11 (source: I use macOS and Windows 11 daily). You can buy a slightly thicker laptop (and it will still be thin and light) with replaceable RAM and SSD and it will easily match the performance of the magic M1 chip with only a slight reduction in potential battery life. But guess what: if you actually USE your laptop for anything, the battery life of any laptop will quickly drop to 2-3 hours at best.

And that's just laptops. If you want actual work done, you get a desktop, and for the price of any Apple desktop you can easily get a PC that outperforms it. In some cases, you can buy a PC that outperforms the Apple desktop AND buy a MacBook for on the go, and still have money left over. Except for power consumption of course, but who cares about power consumption on a work machine? Only Apple fanboys care about that, because that's the only thing they've got going for them. My time is more expensive than my power bill.
