| ### Using paper as apocalyptic-resilient backup storage ### | |
| Today, out of the blue, I was pondering paper. You know, this | |
| cellulose-based thingy that Chinese people came up with back in the good old | |
| days of the Han dynasty. Since those remote times, humanity has relied on | |
| paper for storing information and, as thousands of years have demonstrated, | |
| paper has proved to be an excellent long-term medium. | |
| Nowadays we use diskettes, magnetic drives, flash memories and CD/DVD | |
| discs... But none of these comes anywhere close to paper's longevity and | |
| resilience. Any serious magnetic disturbance will wipe the data from most of | |
| our computers, mild UV exposure will erase the content of our writable | |
| CD/DVDs, and in any case - all of these modern storage contraptions are | |
| unlikely to be readable in 50-100 years. | |
| So I thought - wouldn't it be nice to use paper as storage for binary data? | |
| After some quick research I found out (unsurprisingly) that I am not the | |
| first to have come up with such an idea. There are at least three software | |
| packages out there that are dedicated to exactly that: | |
| * PaperDisk, a commercial offering from Cobblestone Software (Win 3.1, 9x) | |
| * PaperBack, open-source software by Michael Mohr (Windows only, sadly) | |
| * Optar, open-source software by Karel Kulhavy (POSIX) | |
| All these solutions share one big limitation, though: paper has a relatively | |
| low data density, which translates to low data capacity. Apparently, it is | |
| realistic to expect about 300-500 KB of storage on a single A4 sheet. This | |
| is not much, but still - if we own some data that has a high value-per-byte | |
| ratio, such a backup could be an excellent long-term storage strategy. | |
| Before using paper-based backups, one needs to think about a major (and | |
| easily overlooked) potential problem: let's assume our paper backup sheet | |
| survives 3000 years, and in some dystopian future a sentient race discovers | |
| it. How would they know how to decode it? Surely they will not have access | |
| to a PC running Windows 3.11 with the proper software. Even without going to | |
| such an extreme scenario, this problem might just as well apply to the paper | |
| backup's author, 20 years later: "I encoded important data on this A4 sheet | |
| 20 years ago using some exotic software - how do I decode it now?" | |
| For the above reason, it may be safer (although much less efficient) to opt | |
| for a paper-based backup solution that prints data in a way that a human | |
| could easily understand and decode. Writing data encoded as a sequence of | |
| letters comes naturally to mind: after all, it is what we have been doing | |
| for 6000 years now, with very good results. Binary data could therefore be | |
| encoded in a BASE-32 "alphabet", composed of the most recognizable Latin | |
| symbols. Instructions on how to decode it would be short enough to be written | |
| on the margin of the sheet. A single A4 sheet can comfortably fit 10'000 | |
| characters. Since each BASE-32 character carries 5 bits, this translates to | |
| a usable storage of about 6 KiB (10'000 x 5 / 8 = 6250 bytes). It is much | |
| less than with the sophisticated methods used by the dedicated software | |
| listed before, but it has the advantage of being trivial to decode either by | |
| hand or using any kind of OCR software - now or in the future. | |
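| To make the BASE-32 idea concrete, here is a minimal Python sketch. It is | |
| only an illustration under assumptions of mine: the article names no | |
| specific alphabet, so the RFC 4648 one (A-Z, 2-7) is used, and the function | |
| names encode, decode and capacity_bytes are hypothetical. | |

```python
# RFC 4648 Base32 alphabet. Assumption: the article does not pick one.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def encode(data: bytes) -> str:
    """Turn bytes into Base32 letters, 5 bits per letter, no '=' padding."""
    bits = "".join(f"{byte:08b}" for byte in data)
    # Pad with zero bits so the bit string splits evenly into 5-bit groups.
    bits += "0" * (-len(bits) % 5)
    return "".join(ALPHABET[int(bits[i:i + 5], 2)]
                   for i in range(0, len(bits), 5))


def decode(text: str) -> bytes:
    """Reverse of encode(). Whitespace is skipped so lines may be wrapped."""
    bits = "".join(f"{ALPHABET.index(ch):05b}"
                   for ch in text if not ch.isspace())
    # Drop the trailing padding bits that do not fill a whole byte.
    usable = len(bits) - len(bits) % 8
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))


def capacity_bytes(chars_per_sheet: int) -> int:
    """Whole bytes that fit into a given number of Base32 characters."""
    return chars_per_sheet * 5 // 8


if __name__ == "__main__":
    printed = encode(b"hello")
    print(printed)                # the letters that would go on paper
    print(decode(printed))        # round trip back to the original bytes
    print(capacity_bytes(10000))  # bytes per sheet at 10'000 characters
```

| The decoding rule is short enough for a margin note: look up each letter's | |
| position in the alphabet, write it as 5 bits, then read the bits 8 at a time. | |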
| Attached to this article is a copy of the latest (as of February 2019) | |
| PaperBack source code, the shareware PaperDisk 1.0 release (1997), an | |
| extremely interesting white paper on the subject from Cobblestone Software, | |
| as well as the source code and website copy of Optar (2007). | |
| === Attachments ========================================== | |
| optar.pdf | |
| optar.tgz | |
| paperbak-1.10.src.zip | |
| paperdisk v1.0 shareware.zip | |
| paperdisk white paper.pdf |