Title: Backup software: borg vs restic
Author: Solène
Date: 21 May 2021
Tags: backup openbsd unix
Description:
# Introduction
Backups are important: a lot of our lives now depends on digital data,
and it's important to take care of it because computers are unreliable,
can be stolen, and mistakes happen. I really like two programs, restic
and borg. They have nearly the same features, which makes it hard to
choose between them, so this is an attempt to understand the
differences for my use case.
# Restic
Restic is a backup program written in Go with a "push" workflow. It
supports encryption, data deduplication within a repository, and
multiple systems sharing the same repository.
Restic can back up to a remote sftp server, to many network storage
services such as S3/Minio, and to even more when used with the program
rclone (which can turn any backend it supports into a restic-compatible
backend). Restic seems to be compatible with Windows (I didn't try).
restic website
# Borg
Borg is a backup program written in Python with a "push" workflow. It
supports encryption, data deduplication within a repository, and
compression. You can back up to a remote server over ssh, but borg
must be installed on that remote server.
It's a very good and reliable backup program. It has a companion app
named "borgmatic" that automates the backup process and snapshot
management (daily/hourly/monthly ... and integrity checking).
*BSD specific note: borg can honor the "nodump" flag in the filesystem
and skip the files that carry it.
borgbackup website
borgmatic website
# Experiment
I backed up my /home/ partition (minus some directories that were
excluded in both cases) using borg and restic. I always ran the restic
backup first and then the borg backup, measuring the bandwidth used and
the execution time of each.
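As an illustration of the timing part only, here is a minimal sketch of
how the execution time of a backup command could be recorded. The helper
name time_command is hypothetical, and the command used in the demo is a
harmless placeholder, not the invocation used for this experiment.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run a command and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    result = subprocess.run(argv, check=False)
    elapsed = time.perf_counter() - start
    return result.returncode, elapsed


if __name__ == "__main__":
    # Placeholder command so the sketch runs anywhere. Replace it with the
    # real backup invocation for the tool being measured.
    code, seconds = time_command([sys.executable, "-c", "pass"])
    print(f"exit code {code}, {seconds:.1f} s")
```

Wall-clock time is used on purpose here, because it also includes the
time spent waiting for the network and the remote server.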
There are five steps. The first is the initial backup of a lot of
data. The second and third are small changes: basically opening
firefox, browsing a few pages, closing it, refreshing my emails in
claws-mail (this changes a lot of small files) and using the computer
for an hour. The fourth is a massive change: I found a few game
installers and unzipped them, producing lots of small files instead of
one big file. Finally, there were 24 hours of normal use between the
fourth and the last step, which is a good representation of a daily
backup.
## Data
```
                                  restic    borg

Data transmitted (MB)
---------------------
Backup 1 (init)                    62860   53730
Backup 2 (little changes)             15      26
Backup 3 (little changes)            168     171
Backup 4 (massive changes)          4820    3910
Backup 5 (typical day of use)         66      44

Local cache size (MB)
---------------------
Backup 1 (init)                      161      45
Backup 2 (little changes)            163      45
Backup 3 (little changes)            207      46
Backup 4 (massive changes)           211      47
Backup 5 (typical day of use)        216      47

Backup time (seconds)
---------------------
Backup 1 (init)                     2139    2999
Backup 2 (little changes)             38     131
Backup 3 (little changes)             43     114
Backup 4 (massive changes)           201     355
Backup 5 (typical day of use)         50     110

Repository size (GB)                  65      56
```
## Analysis
Borg was a lot slower than restic. In my experiment the remote ssh
server is a dual-core Atom system, and borg runs a process on the
remote end to manage the data, so that CPU may have been slowing the
backup down. Nevertheless, in my real use case, borg is indeed slower.
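To put a number on "a lot slower", the ratio of borg time to restic time
can be computed for each step. This is a small sketch: the values are
copied from the "Backup time" table, and the helper name slowdown is only
for illustration.

```python
# Backup times in seconds as (restic, borg), copied from the Data table.
times = {
    "Backup 1 (init)": (2139, 2999),
    "Backup 2 (little changes)": (38, 131),
    "Backup 3 (little changes)": (43, 114),
    "Backup 4 (massive changes)": (201, 355),
    "Backup 5 (typical day of use)": (50, 110),
}


def slowdown(restic_seconds, borg_seconds):
    """How many times longer borg took compared to restic."""
    return borg_seconds / restic_seconds


for step, (restic_seconds, borg_seconds) in times.items():
    ratio = slowdown(restic_seconds, borg_seconds)
    print(f"{step:32} borg took {ratio:.1f}x as long as restic")
```

In this data, the ratio is smallest for the initial backup and largest
for the small incremental ones.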
Most of the time, borg was more bandwidth-efficient than restic: it
saved 15% of bandwidth on the first backup and 19% after the massive
changes, but in some cases it used a bit more bandwidth. I have no
explanation for this. My guess is that it depends on how file chunks
are calculated: if a big database file changes, a tool may be able to
save only the difference and not the whole file. Borg also compresses
the data (using lz4 by default), which may explain the bandwidth
savings, although compression doesn't help for binary data.
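The idea of sending only the changed part of a big file can be sketched
with a toy chunk store. This is a deliberate simplification: it cuts
files into fixed-size chunks, whereas restic and borg choose chunk
boundaries from the content itself, and the names chunks, backup and
store are made up for the example.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; tiny on purpose so the example is easy to follow


def chunks(data, size=CHUNK_SIZE):
    """Split data into consecutive fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def backup(data, store):
    """Add unseen chunks to the store and return the number of bytes sent."""
    sent = 0
    for chunk in chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:  # only chunks the repository lacks are sent
            store[digest] = chunk
            sent += len(chunk)
    return sent


store = {}
original = b"AAAABBBBCCCCDDDD"
modified = b"AAAABBBBXXXXDDDD"  # only the third chunk changed in place

first = backup(original, store)
second = backup(modified, store)
print(first, second)  # prints: 16 4
```

How much is sent after a change depends on where the chunk boundaries
fall, which is one plausible reason two tools transmit different amounts
for the same modifications.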
The local cache (typically in /root/.cache/) was a lot bigger for
restic than for borg, and it grew slightly with each new backup, while
the borg cache barely changed.
Finally, the repository holding all the snapshots differs in size
between restic and borg, 65 GB and 56 GB respectively. That 14%
difference may be due to the compression done by borg.
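The percentages quoted in this analysis can be rechecked from the Data
table. The sketch below expresses the borg saving relative to the restic
value; the helper name saving_percent is only for illustration, and a
negative result means borg used more than restic.

```python
def saving_percent(restic_value, borg_value):
    """Percentage saved by borg relative to restic (negative: borg used more)."""
    return (restic_value - borg_value) / restic_value * 100


# Data transmitted in MB as (restic, borg), copied from the Data table.
transmitted_mb = {
    "Backup 1 (init)": (62860, 53730),
    "Backup 2 (little changes)": (15, 26),
    "Backup 3 (little changes)": (168, 171),
    "Backup 4 (massive changes)": (4820, 3910),
    "Backup 5 (typical day of use)": (66, 44),
}
repository_gb = (65, 56)

for step, (restic_mb, borg_mb) in transmitted_mb.items():
    print(f"{step:32} {saving_percent(restic_mb, borg_mb):7.1f} %")
print(f"{'Repository size':32} {saving_percent(*repository_gb):7.1f} %")
```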
# Other backup software
I tested Restic and Borg because they are both good programs using the
"push" workflow (the local computer sends the data) and making a full
snapshot at every backup, but there are many other backup solutions
available.
- duplicity: fully scriptable and works over many remote protocols, but
requires a full snapshot followed by incremental snapshots. When you
need to make a new full snapshot it takes a lot of space, which is not
always convenient. Supports GPG-encrypted backups stored over FTP,
which is useful for dedicated server offers that include 100 GB of free
FTP storage.
- burp: not very well known; the setup uses TLS certificates for
encryption and requires a burp server and a burp client.
- rsnapshot: based on rsync, automates the rotation of backups and uses
hard links to avoid duplicating files that didn't change between two
backups. It pulls data from servers to a central backup system.
- backuppc: a Perl app that pulls data from servers into its
repository; not really easy to use.
- bacula: enterprise-grade solution that I never got to work because
it's really complicated, but it supports many things, even saving to
tapes.
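The hard link trick mentioned for rsnapshot can be sketched in a few
lines. This is not rsnapshot's actual code, only an illustration of the
principle: it handles a flat directory of regular files, compares file
contents directly, and the function name snapshot and the directory
names are made up for the example.

```python
import filecmp
import os
import shutil
import tempfile


def snapshot(source, previous, target):
    """Copy source into target, hard-linking files unchanged since previous."""
    os.makedirs(target)
    for name in os.listdir(source):
        src = os.path.join(source, name)
        old = os.path.join(previous, name) if previous else None
        dst = os.path.join(target, name)
        if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
            os.link(old, dst)       # unchanged: share the data with the old snapshot
        else:
            shutil.copy2(src, dst)  # new or modified: store a fresh copy


with tempfile.TemporaryDirectory() as root:
    source = os.path.join(root, "source")
    os.makedirs(source)
    for name, text in (("a.txt", "same"), ("b.txt", "old")):
        with open(os.path.join(source, name), "w") as f:
            f.write(text)

    older = os.path.join(root, "daily.1")
    newer = os.path.join(root, "daily.0")
    snapshot(source, None, older)
    with open(os.path.join(source, "b.txt"), "w") as f:
        f.write("new")  # only b.txt changes between the two snapshots
    snapshot(source, older, newer)

    for name in ("a.txt", "b.txt"):
        shared = os.path.samefile(os.path.join(older, name),
                                  os.path.join(newer, name))
        print(name, "shared with previous snapshot:", shared)
```

Every snapshot directory looks like a complete copy, yet an unchanged
file occupies disk space only once.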
# Conclusion
In this benchmark, borg is clearly slower but was the most storage- and
bandwidth-efficient. On the other hand, restic is easier to deploy
(static binary) and works with a plain sftp server, while borg must be
installed on both sides.
One major difference between restic and borg is that restic supports
backing up multiple systems to the same repository, allowing a massive
deduplication gain across machines, while a borg repository is meant
for a single system (it could work with multiple systems, but they
should not back up at the same time, and they would have to rebuild the
local cache every time, which is slow).
I'll stick with borg because the backup time isn't a real issue, given
it's not dramatically slower than restic, and because I really enjoy
using borgmatic to manage the backups automatically.
For backups to a remote server over the Internet, bandwidth efficiency
would be my main concern among all these differences, and borg seems to
be the clear winner there.