### Using paper as apocalyptic-resilient backup storage ###

Today, out of the blue, I was thinking about paper. You know, this
cellulose-based thingy that Chinese people came up with back in the good old
days of the Han dynasty. Since those remote times, humanity has relied on
paper for storing information and, as roughly two thousand years have
demonstrated, paper has proved to be an excellent long-term medium.

Nowadays we use diskettes, magnetic drives, flash memories and CD/DVD
discs... But none of these comes anywhere close to paper's longevity and
resilience. Any serious magnetic disturbance will wipe the data from most of
our computers, mild UV exposure will erase the content of our writable
CD/DVDs, and in any case, all of these modern storage contraptions are
unlikely to be readable in 50-100 years.

So I thought: wouldn't it be nice to use paper as storage for binary data?
After some quick research I found out (unsurprisingly) that I am not the
first to have come up with such an idea. There are at least three software
packages out there that are dedicated to exactly that:
* PaperDisk, a commercial offering from Cobblestone Software (Win 3.1, 9x)
* PaperBack, open-source software by Michael Mohr (Windows only, sadly)
* Optar, open-source software by Karel Kulhavy (POSIX)

All these solutions share one big limitation, though: paper has a relatively
low data density, which translates to low data capacity. Apparently, it is
realistic to expect about 300-500 KB of storage on a single A4 sheet. This is
not much, but still, if we own some data that has a high value-per-byte
ratio, such a backup could be an excellent long-term storage strategy.

Before using paper-based backups, one needs to think about a major (and
easily overlooked) potential problem. Let's assume our paper backup sheet
survives 3000 years, and in some dystopian future a sentient race discovers
it. How would they know how to decode it? Surely they will not have access
to a PC running Windows 3.11 with the proper software. Even without going to
such an extreme scenario, this problem might just as well apply to the author
of the paper backup, 20 years later: "I encoded important data on this A4
sheet 20 years ago using some exotic software. How do I decode it now?"

For the above reason, it may be safer (although much less efficient) to opt
for a paper-based backup solution that prints data in a way that a human
could easily understand and decode. Writing data encoded as a sequence of
letters comes naturally to mind: after all, it is what we have been doing for
some 6000 years now, with very good results. Binary data could therefore be
encoded in a BASE-32 "alphabet", composed of the most recognizable Latin
symbols. Instructions on how to decode it would be short enough to be written
in the margin of the sheet. A single A4 sheet can comfortably fit 10,000
characters. Since each BASE-32 character carries 5 bits, this translates to
a usable storage of about 6 KiB (10,000 x 5 / 8 = 6250 bytes). This is much
less than with the sophisticated methods used by the dedicated software
listed before, but it has the advantage of being trivial to decode either by
hand or using any kind of OCR software, now or in the future.
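
To make the idea a bit more concrete, here is a minimal Python sketch of such
a letter-based scheme. It is only an illustration under a few assumptions of
mine: it uses the standard RFC 4648 Base32 alphabet (A-Z and 2-7, which leaves
out the look-alike digits 0 and 1), it prints the characters in groups of five
for readability, and the function names and layout constants are made up for
this example.

```python
import base64

# Hypothetical layout choices, picked only for readability on paper.
GROUP = 5             # characters per group
GROUPS_PER_LINE = 10  # groups per printed line


def encode_for_paper(data: bytes) -> str:
    """Turn binary data into lines of space-separated Base32 groups."""
    # Assumption: RFC 4648 alphabet; '=' padding is dropped to save space.
    text = base64.b32encode(data).decode("ascii").rstrip("=")
    groups = [text[i:i + GROUP] for i in range(0, len(text), GROUP)]
    lines = [
        " ".join(groups[i:i + GROUPS_PER_LINE])
        for i in range(0, len(groups), GROUPS_PER_LINE)
    ]
    return "\n".join(lines)


def decode_from_paper(printed: str) -> bytes:
    """Reverse encode_for_paper(), ignoring spaces, line breaks and case."""
    text = "".join(printed.split()).upper()
    # Restore the padding that b32decode expects (length multiple of 8).
    text += "=" * (-len(text) % 8)
    return base64.b32decode(text)


def sheet_capacity_bytes(chars_per_sheet: int = 10_000) -> int:
    """Each Base32 character carries 5 bits, so bytes = chars * 5 / 8."""
    return chars_per_sheet * 5 // 8


if __name__ == "__main__":
    secret = b"my very precious data"
    sheet = encode_for_paper(secret)
    print(sheet)
    assert decode_from_paper(sheet) == secret
    print(sheet_capacity_bytes(), "bytes per sheet")
```

The decoding side is deliberately forgiving about whitespace and letter case,
since that is the kind of noise an OCR pass or a tired human typist is likely
to introduce. A real backup would probably also want a per-line checksum, which
this sketch leaves out.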

Attached to this article is a copy of the latest (as of February 2019)
PaperBack source code, the shareware PaperDisk 1.0 release (1997), an
extremely interesting white paper on the subject from Cobblestone Software,
as well as the source code and website copy of Optar (2007).

=== Attachments ==========================================
optar.pdf | |
optar.tgz | |
paperbak-1.10.src.zip | |
paperdisk v1.0 shareware.zip | |
paperdisk white paper.pdf |