How do you do your backups? This comes up from time to time. Here's how
I do mine. I've been running this for well over 10 years now.
First level: Replica on secondary machine
=========================================
I usually use my desktop machine, but I also have a laptop. I don't
really *need* a laptop anymore -- it was invaluable when I went to
University, but that's long over. These days, I mainly use it for
watching movies or for sitting on the balcony. Both these machines are
configured very similarly. In fact, the plan is that I could switch to
the laptop as my primary machine at any given time. (More or less.)
So, whenever both of these machines are running at the same time
(which happens every 0-3 days), I run a little script called `SYNC`:
First, it does a `git pull` for all repos (even the ones with no
commits in the last 10 years), in both directions. The majority of
what I do on my computers happens in Git repos. After that, it runs
`unison` for selected directories like my photos, music, and such.
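Roughly, the idea looks like this -- a minimal sketch, where the
hostname, repo paths, and directories are just placeholders:

    #!/bin/sh
    # Rough sketch of a SYNC script -- names and paths are placeholders.
    set -eu

    other=laptop

    # Pull every repo in both directions: this machine pulls from the
    # other one, then the other one pulls from here over ssh.
    for repo in "$HOME"/repos/*; do
        git -C "$repo" pull "$other:$repo"
        ssh "$other" "git -C '$repo' pull 'desktop:$repo'"
    done

    # Reconcile the non-Git directories with unison.
    unison "$HOME/photos" "ssh://$other/$HOME/photos" -batch
    unison "$HOME/music" "ssh://$other/$HOME/music" -batch
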
Once `SYNC` is finished, I can switch over to the other machine just
fine and keep doing whatever I want. Both machines have the same "active
data set".
This means that if one of these two machines dies, I might lose one
or two days of data, but that's it. I don't produce *that much* data
and I can live with this risk. *If* I work on something special, I
run `SYNC` way more often.
I've never had a fire in my apartment, luckily. If there is one, I plan to
take my laptop with me. If the fire happens while I'm away ... see
below.
Second level: Full system backups on my NAS
===========================================
I have a little custom NAS that runs Ubuntu with ZFS. There's enough
room to hold the full disks of my desktop and laptop, plus ZFS
snapshots that go back in time quite a bit.
Maybe once a month, I run `rsync -aXAP` to copy all files from desktop
and laptop to that NAS.
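In practice, that boils down to something like the following (the
host name and dataset paths are placeholders), followed by a ZFS
snapshot on the NAS so older states stay available:

    # On the desktop: push the whole system to the NAS, skipping
    # pseudo-filesystems. Host and dataset paths are placeholders.
    rsync -aXAP --delete \
        --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*"} \
        / nas:/tank/backups/desktop/

    # On the NAS: snapshot the dataset so this state is kept around.
    zfs snapshot tank/backups/desktop@$(date +%F)
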
The purpose of these backups is mainly to create yet another copy of my
user data and to create a backup of the *system* data. If the root FS of
my desktop crashes, I can buy a new drive, install it, and `rsync` all
that data back -- this might not give me a recent snapshot of my user
data, but at least the system is running again and then `SYNC` can grab
fresh user data from the laptop. (Even though "I run Arch", my system
changes very, very little.)
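In case I ever need it, the restore would be roughly this, run from a
live system with the new drive already partitioned, formatted, and
mounted (paths are placeholders again):

    # From a live system, with the new root mounted at /mnt:
    rsync -aXAP nas:/tank/backups/desktop/ /mnt/

    # Then: chroot into /mnt, reinstall the bootloader, fix fstab if
    # UUIDs changed, reboot -- and let SYNC grab fresh user data.
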
Third level: Offsite backups
============================
I have two large USB drives. One of them is large enough to hold
the entirety of my NAS (excluding the ZFS snapshots, though -- those
would be too much). Usually, after I've run that `rsync` from
"level 2", I run another `rsync` that copies the data from the NAS
onto one of those USB drives.
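That second `rsync` is just as plain; it only copies the current
state of the files, not the ZFS snapshots (the mountpoint below is a
placeholder):

    # On the NAS, with the USB drive mounted at /mnt/usb:
    rsync -aXAP --delete /tank/backups/ /mnt/usb/backups/
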
One drive is in a closet at my place, the other drive is at my
parents' place. Every now and then, I exchange them.
This is for total disaster recovery. If I'm not at home and my apartment
burns down, that second drive probably still holds my data.
Disadvantages
=============
- All of this is a very manual process.
- Offsite backups don't get as much attention as I'd like.
Unfortunately, I don't have a fast internet uplink -- it's simply
not feasible for me to create backups in a "cloud". Moving USB
drives back and forth is pretty much the only thing that I can do.
- If my apartment really does burn down one day and I need to get
  *all* the data off that one USB drive, oof, that's risky. Can it
  sustain that level of stress? I'm aware of this problem, though,
  and I'll copy important data first.
Advantages
==========
- One of my machines crashing is more likely than my entire apartment
burning down while I'm not at home (I work almost exclusively from
home). In that case, I don't need to "restore" anything, because the
secondary machine is already ready to use.
- I don't rely on any particular backup tool. I just copy files
  around. (I do kind of rely on ZFS, but it could be replaced by
  btrfs.) I've seen backup tools die in the past and then you're
  stuck with dead backups.
- This entire concept nudges me to use Git a lot. This has the
  advantage of automatically having a history and of automatically
  having a way to check file consistency to detect bit rot
  (`git fsck`, see the sketch after this list).
- Since there's no "cloud" involved, the risk of a data breach is
greatly reduced. (Everything uses LUKS anyway.)
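That bit-rot check is nothing fancy, just something along these
lines (the repo path is a placeholder):

    # Have git verify that every object still matches its hash.
    for repo in ~/repos/*; do
        git -C "$repo" fsck --full || echo "problems in $repo"
    done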