| ---------------------------------------- | |
| Using ptx to generate one-time pads | |
| March 15th, 2018 | |
| ---------------------------------------- | |
| I have been working my way through coreutils [0] recently when | |
| I came across ptx. | |
| $ apropos ptx | |
| ptx (1) - produce a permuted index of file contents | |
| What the hell does that mean? I know... | |
| $ man ptx | |
| PTX(1) User Commands PTX(1) | |
| NAME | |
| ptx - produce a permuted index of file contents | |
| SYNOPSIS | |
| ptx [OPTION]... [INPUT]... (without -G) | |
| ptx -G [OPTION]... [INPUT [OUTPUT]] | |
| DESCRIPTION | |
| Output a permuted index, including context, of the words | |
| in the input files. | |
| With no FILE, or when FILE is -, read standard input. | |
| Mandatory arguments to long options are mandatory for | |
| short options too. | |
| ... | |
| Oh that totally clears it... nope. Still no clue. | |
| So I asked on Mastodon and a few people had some suggestions. In | |
| particular, someone was able to shoot me over to a blog post [1] | |
| which tries to clear up what a 'permuted index' even is. And | |
| that's the key. So check this out: | |
| A while back, before we had badass search engines and hyperlinked | |
| doom shenanigans, manually finding the reference to a word in | |
| a document SUUUUUUUUCKED. So they made this index in the back that | |
| listed all the key terms alphabetically in the middle column of | |
| a page. To the left of that word it would list whatever sentence | |
| led up to it. To the right they'd list the sentence fragment that | |
| followed the term. Finally, the page number. With that you could | |
| jump to the page and eye-ball search it yourself. | |
| It's been around since System V and it's pretty much useless, | |
| right? Well, foxy, I think I came up with a fun hobby use-case. | |
| Pick a book with a publicly available canonical plain-text | |
| source. Oh, I dunno, head over to Project Gutenberg [2] or | |
| something and wrestle yourself up some Joyce (or ILLEGAL GERMAN | |
| NOVELS!!!!! [3]). We're gonna shove that badboy into ptx like | |
| a champ. Here we go... | |
| $ curl https://www.gutenberg.org/files/4300/4300-0.txt > ulysses.txt | |
| $ ptx ulysses.txt | |
| SCREEN EXPLODES WITH TEXT FOR SEVERAL MINUTES!!!!! | |
| That's not how that works. Back to manpage! | |
| Hmmm... | |
| ...assumes latin-1 charset... | |
| ...ignore case, perhaps... | |
| ...[.?!][]\"')}]*\\($\\|\t\\| \\)[ \t\n]*... | |
| ...Emacs next-error, grumble... | |
| ...-w, width, ahha... | |
| ROFF! NO FUCKING WAY! | |
| One of the output formats for ptx is freaking roff! Synchronicity, | |
| baby! [4] Let's try something a little smaller. | |
| $ curl http://www.gutenberg.org/cache/epub/1065/pg1065.txt > theraven.txt | |
| $ ptx -O -f -w 66 theraven.txt > theraven-index.txt | |
| That sorta works. Ugh, but I'm getting tired. Here's the plan for | |
| what's next: | |
| - Figure out how to format this stuff so I can awk it | |
| - awk so that the text key and one more word to the right are | |
| the output. Two words with a space between, that's it. | |
| - sort unique that bad-boy by each column in turn so every word | |
| shows up only once in each column. | |
| - Use whatever words are in your primary list to write a plain | |
| text message. If your source document is large enough that's | |
| virtually any word you'd like to use. | |
| - Use awk to replace your words with the one to the right via | |
| a lookup file | |
| - Send secret message to a friend. The knowledge of which book | |
| is your cypher is all that's necessary to repeat the process | |
| in reverse. | |
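| The plan above could be sketched roughly like this in Python. Big | |
| caveats: it is a stand-in for the awk pipeline, it pulls word | |
| pairs straight from the text instead of from ptx output, and it | |
| keeps the first pair it sees rather than doing a sort-unique, so | |
| treat that rule as an assumption. build_lookup, encode and decode | |
| are made-up names, and the sample sentence is a placeholder for | |
| a real book. | |

```python
import re

def build_lookup(text):
    """Map each word to a word that follows it, one-to-one.

    Assumption: the first usable pair wins. Both sides must build the
    table from the same text with the same rule.
    """
    words = re.findall(r"[a-z']+", text.lower())
    lookup, used = {}, set()
    for key, nxt in zip(words, words[1:]):
        # Keep a pair only if neither column already holds its word,
        # so the table can be read in both directions.
        if key not in lookup and nxt not in used:
            lookup[key] = nxt
            used.add(nxt)
    return lookup

def encode(message, lookup):
    # Swap each word for its partner; unknown words raise KeyError.
    return " ".join(lookup[word] for word in message.lower().split())

def decode(message, lookup):
    # Flip the table and swap back.
    reverse = {value: key for key, value in lookup.items()}
    return " ".join(reverse[word] for word in message.lower().split())

# Placeholder source text; in practice this would be the whole book.
sample = "the quick brown fox jumps over the lazy dog"
lookup = build_lookup(sample)

secret = encode("the lazy fox", lookup)
print(secret)
print(decode(secret, lookup))
```

| Any word that never made it into the left column can't be used | |
| in a message, which is why a big source text helps. | |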
| Huzzah for secret codes. | |
| If I get some time this weekend I'll look at writing a script to | |
| automate this for you. Provide a book and a message and indicate | |
| whether to encode or decode. Oh what fun that would be for some | |
| private crypto. Thinking you could do this in perl? Wanna show me | |
| up? Put your illogical collection of special characters where your | |
| mouth is, buddy! | |
| [0] GNU Core Utilities | |
| [1] Reading a Permuted Index | |
| [2] Project Gutenberg on Gopher | |
| [3] Project Gutenberg Blocks Access to Germany | |
| [4] dbucklin - Formatting for Gopher with GNU troff |