| Title: Backup software: borg vs restic | |
| Author: Solène | |
| Date: 21 May 2021 | |
| Tags: backup openbsd unix | |
| Description: | |
| # Introduction | |
| Backups are important: a lot of our life is now tied to digital data, | |
| and that data needs care because computers are unreliable, can be | |
| stolen, and mistakes happen. I really like two backup programs, restic | |
| and borg. They have nearly the same features, so it's hard to choose | |
| between them; this is an attempt to understand the differences for my | |
| use case. | |
| # Restic | |
| Restic is backup software written in Go with a "push" workflow. It | |
| supports encryption, data deduplication within a repository, and | |
| multiple systems sharing the same repository. | |
| Restic can back up to a remote sftp server but also to many network | |
| storage services like S3/Minio, and even more when used with the | |
| program rclone (which can turn any backend it supports into a | |
| restic-compatible backend). Restic seems compatible with Windows (I | |
| didn't try). | |
| restic website | |
| # Borg | |
| Borg is backup software written in Python with a "push" workflow. It | |
| supports encryption, data deduplication within a repository and | |
| compression. You can back up to a remote server using ssh, but the | |
| remote server requires borg to be installed. | |
| It's very good and reliable backup software. It has a companion app | |
| named "borgmatic" to automate the backup process and snapshot | |
| management (daily/hourly/monthly ... and integrity checking). | |
| *BSD specific note: borg can honor the "nodump" flag in the filesystem | |
| and skip saving the files that carry it. | |
| borgbackup website | |
| borgmatic website | |
| # Experiment | |
| I backed up my /home/ partition (minus some directories that were | |
| excluded in both cases) using borg and restic. I always ran the | |
| restic backup first and then the borg backup, measuring bandwidth and | |
| execution time for each. | |
| There are five steps. The first is the init, the first backup of a | |
| lot of data. The second and third are little changes, which is | |
| basically opening firefox, browsing a few pages, closing it, | |
| refreshing my emails in claws-mail (this changes a lot of small | |
| files) and using the computer for an hour. The fourth is a massive | |
| change: I found a few game installers that I unzipped, producing a | |
| lot of small files instead of one big file. Finally, there were 24h | |
| of normal use between the fourth and last step, which is a good | |
| representation of a daily backup. | |
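The post does not say how the execution time was measured. As a minimal sketch, assuming each backup is launched as a single command, a small wrapper like the one below could time it. The repository locations and the `/home` path are hypothetical placeholders, and the two backup commands are only listed, not executed.

```python
import subprocess
import sys
import time


def timed_run(argv):
    """Run a command and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(argv, check=False)
    elapsed = time.perf_counter() - start
    return completed.returncode, elapsed


# Hypothetical repository locations and source path; adjust to your setup.
RESTIC_BACKUP = ["restic", "-r", "sftp:backup@example.org:/srv/restic",
                 "backup", "/home"]
BORG_BACKUP = ["borg", "create", "backup@example.org:/srv/borg::{now}",
               "/home"]

if __name__ == "__main__":
    # Harmless stand-in command so the sketch runs without restic or borg.
    code, seconds = timed_run([sys.executable, "-c", "pass"])
    print(f"exit={code} time={seconds:.2f}s")
```

Replacing the stand-in command with `RESTIC_BACKUP` and then `BORG_BACKUP` would reproduce the "restic first, then borg" order used in the experiment.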
| ## Data | |
| ``` | |
|                                 restic    borg | |
| Data transmitted (MB) | |
| --------------------- | |
| Backup 1 (init)                  62860   53730 | |
| Backup 2 (little changes)           15      26 | |
| Backup 3 (little changes)          168     171 | |
| Backup 4 (massive changes)        4820    3910 | |
| Backup 5 (typical day of use)       66      44 | |
| Local cache size (MB) | |
| --------------------- | |
| Backup 1 (init)                    161      45 | |
| Backup 2 (little changes)          163      45 | |
| Backup 3 (little changes)          207      46 | |
| Backup 4 (massive changes)         211      47 | |
| Backup 5 (typical day of use)      216      47 | |
| Backup time (seconds) | |
| --------------------- | |
| Backup 1 (init)                   2139    2999 | |
| Backup 2 (little changes)           38     131 | |
| Backup 3 (little changes)           43     114 | |
| Backup 4 (massive changes)         201     355 | |
| Backup 5 (typical day of use)       50     110 | |
| Repository size (GB)                65      56 | |
| ``` | |
| ## Analysis | |
| Borg was a lot slower than restic, but in my experiment the remote ssh | |
| server is a dual core atom system and borg runs a process on the other | |
| end to manage the data, so maybe that CPU was slowing the backup | |
| process. Nevertheless, in my real use case, borg is indeed slower. | |
| Most of the time, borg was more bandwidth efficient than restic: it | |
| saved about 15% of bandwidth for the first backup and about 19% after | |
| some big changes, but in two cases (backups 2 and 3) it used a bit more | |
| bandwidth. I have no explanation for this; I guess it depends on how | |
| file chunks are calculated: if a big database file changes, one tool | |
| may be able to save only the difference and not the whole file. Borg | |
| also compresses the data (using lz4 by default), which may explain the | |
| bandwidth savings, although this would not help for binary data that | |
| is already compressed. | |
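To illustrate why compression helps some data and not other data, here is a small sketch. It assumes `zlib` from the Python standard library as a stand-in for lz4 (which is not in the standard library), and it uses random bytes to imitate already-compressed binary data. The sample inputs are made up for the example.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


# Repetitive, text-like content, e.g. many similar mail headers.
text_like = b"From: someone@example.org\nSubject: hello\n" * 2000
# Random bytes stand in for already-compressed data (archives, media).
random_like = os.urandom(len(text_like))

print(f"text-like:   {compression_ratio(text_like):.3f}")
print(f"random-like: {compression_ratio(random_like):.3f}")
```

The text-like input shrinks to a small fraction of its size while the random input does not shrink at all, which is consistent with the guess that compression accounts for part of the difference on a /home/ full of mixed files.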
| The local cache (typically in /root/.cache/) was a lot bigger for | |
| restic than for borg, and was increasing slightly at each new backup | |
| while borg cache never changed much. | |
| Finally, the whole repo holding all the snapshots has a different | |
| size for restic and borg, respectively 65 GB and 56 GB, a 14% | |
| difference which may be due to the compression done by borg. | |
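The percentages quoted in this analysis can be recomputed from the data table. The sketch below only uses the figures from the table; the function and variable names are mine.

```python
def saving_percent(restic: float, borg: float) -> float:
    """Percentage by which borg's figure is lower than restic's.

    A negative result means borg's figure was higher than restic's.
    """
    return (restic - borg) / restic * 100


# (restic, borg) pairs copied from the data table above.
transmitted_mb = {
    "Backup 1 (init)": (62860, 53730),
    "Backup 2 (little changes)": (15, 26),
    "Backup 3 (little changes)": (168, 171),
    "Backup 4 (massive changes)": (4820, 3910),
    "Backup 5 (typical day of use)": (66, 44),
}
repository_gb = (65, 56)

for label, (restic, borg) in transmitted_mb.items():
    print(f"{label:30} {saving_percent(restic, borg):6.1f}%")
print(f"{'Repository size':30} {saving_percent(*repository_gb):6.1f}%")
```

Backups 2 and 3 come out negative, which matches the cases where borg transmitted more data than restic.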
| # Other backup software | |
| I tested Restic and Borg because they are both good software using the | |
| "push" workflow (the local computer sends the data) and making a full | |
| snapshot at every backup, but there are many other backup solutions | |
| available. | |
| - duplicity: fully scriptable, works over many remote protocols but | |
| requires a full snapshot and then incremental snapshots to work. When | |
| you need to make a new full snapshot it will take a lot of space, which | |
| is not always convenient. Supports GPG encrypted backups stored over | |
| FTP, which is useful with dedicated servers that come with 100GB of | |
| free FTP storage. | |
| - burp: not very well known, the setup uses TLS certificates for | |
| encryption, requires a burp server and a burp client | |
| - rsnapshot: based on rsync, automates the rotation of backups, uses | |
| hard links to avoid data duplication for files that didn't change | |
| between two backups, and pulls data from servers to a central backup | |
| system. | |
| - backuppc: a perl app that will pull data from servers to its | |
| repository, not really easy to use | |
| - bacula: enterprise grade solution that I never got to work because | |
| it's really complicated but can support many things, even saving on | |
| tapes | |
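The hard link technique mentioned for rsnapshot can be sketched in a few lines. This is only an illustration of the idea, not how rsnapshot is implemented: it compares file contents directly, and the function name and the `daily.0` / `daily.1` directory names are chosen for the example.

```python
import filecmp
import os
import shutil
import tempfile
from typing import Optional


def snapshot(source: str, previous: Optional[str], target: str) -> None:
    """Copy source into target, hard-linking files unchanged since previous."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(target, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.normpath(os.path.join(target, rel, name))
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share the same data on disk
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy


def demo():
    """Take two snapshots and report which files share storage."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        os.makedirs(src)
        for name, text in (("a.txt", "unchanged"), ("b.txt", "version 1")):
            with open(os.path.join(src, name), "w") as fh:
                fh.write(text)
        day1 = os.path.join(tmp, "daily.1")
        day0 = os.path.join(tmp, "daily.0")
        snapshot(src, None, day1)
        with open(os.path.join(src, "b.txt"), "w") as fh:
            fh.write("version 2")
        snapshot(src, day1, day0)
        same_a = os.path.samefile(os.path.join(day1, "a.txt"),
                                  os.path.join(day0, "a.txt"))
        same_b = os.path.samefile(os.path.join(day1, "b.txt"),
                                  os.path.join(day0, "b.txt"))
        return same_a, same_b


if __name__ == "__main__":
    print(demo())
```

Each snapshot directory looks like a full copy, but an unchanged file is stored only once. The trade-off compared to restic and borg is that deduplication happens per whole file, not per chunk.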
| # Conclusion | |
| In this benchmark, borg is clearly slower but was the more storage and | |
| bandwidth efficient of the two. On the other hand, restic is easier to | |
| deploy (static binary) and supports a simple sftp server while borg | |
| requires borg installed on both sides. | |
| The biggest difference between restic and borg is that restic supports | |
| backing up multiple systems to the same repository, allowing a massive | |
| data deduplication gain across machines, while a borg repository is for | |
| a single system (it could work with multiple systems, but they should | |
| not back up at the same time and they would have to rebuild the local | |
| cache every time, which is slow). | |
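To make the cross-machine deduplication gain concrete, here is a toy content-addressed store. It is a simplified sketch, not the storage format of restic or borg: it cuts data into fixed-size chunks, and the class name, chunk size and sample data are invented for the example.

```python
import hashlib


class ChunkStore:
    """Toy shared repository: each chunk is stored once, keyed by SHA-256."""

    def __init__(self, chunk_size: int = 1024):
        self.chunk_size = chunk_size
        self.chunks = {}

    def backup(self, data: bytes):
        """Store data; return (list of chunk ids, bytes newly added)."""
        ids, added = [], 0
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            key = hashlib.sha256(chunk).hexdigest()
            if key not in self.chunks:
                self.chunks[key] = chunk  # first time this content is seen
                added += len(chunk)
            ids.append(key)
        return ids, added


# Two machines holding mostly the same files, plus one unique chunk each.
common = b"A" * 1024 + b"B" * 1024
laptop = common + b"C" * 1024
desktop = common + b"D" * 1024

shared = ChunkStore()
_, laptop_added = shared.backup(laptop)
_, desktop_added = shared.backup(desktop)
print(laptop_added, desktop_added)
```

The second machine only adds the chunk the repository has never seen, which is the gain a shared restic repository offers when several systems hold similar data.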
| I'll stick with borg because the backup time isn't a real issue, given | |
| it's not dramatically slower than restic, and because I really enjoy | |
| using borgmatic to automatically manage the backups. | |
| For backups to a remote server over the Internet, bandwidth | |
| efficiency would be my main concern among all the differences, and borg | |
| seems a clear winner here. | |