PGP word list

 Words for conveying data bytes in speech

 The PGP Word List ("Pretty Good Privacy word list", also called a
 biometric word list for reasons explained below) is a list of words
 for conveying data bytes clearly and unambiguously over a voice
 channel. It is analogous in purpose to the NATO phonetic
 alphabet, except that a longer list of words is used, each word
 corresponding to one of the 256 distinct numeric byte values.

History and structure

 The PGP Word List was designed in 1995 by Patrick Juola, a
 computational linguist, and Philip Zimmermann, creator of
 PGP.[1][2] The words were carefully chosen for their phonetic
 distinctiveness, using genetic algorithms to select lists of words
 that had optimum separations in phoneme space. The candidate word
 lists were randomly drawn from Grady Ward's Moby Pronunciator list
 as raw material for the search, successively refined by the genetic
 algorithms. The automated search converged to an optimized solution
 in about 40 hours on a DEC Alpha, a particularly fast machine in
 that era.

 The Zimmermann–Juola list was originally designed for PGPfone, a
 secure VoIP application, to allow the two parties to verbally
 compare a short authentication string and so detect a
 man-in-the-middle (MiTM) attack. It was called a biometric word
 list because the authentication depended on the two human users
 recognizing each other's distinct voices as they read and compared
 the words over the voice channel, binding the identity of the
 speaker to the words, which helped protect against the MiTM
 attack. The list can also be used in many situations where a
 biometric binding of identity is not needed, so calling it a
 biometric word list may be imprecise. Later, it was used in PGP to
 compare and verify PGP public key fingerprints over a voice
 channel; PGP applications call this the "biometric"
 representation. When the list was applied to PGP, it was further
 refined, with contributions by Jon Callas. More recently, it has
 been used in Zfone and the ZRTP protocol, the successor to
 PGPfone.

 The list actually consists of two lists, each containing 256
 phonetically distinct words, in which each word represents a
 different byte value from 0 to 255. Two lists are used because
 reading aloud long random sequences of words risks three kinds of
 errors: 1) transposition of two consecutive words, 2) duplicated
 words, or 3) omitted words. To detect all three kinds of errors,
 the two lists are used alternately for the even-offset bytes and
 the odd-offset bytes in the byte sequence. Any of these errors
 puts a word from the wrong list at some position, which the
 listener can notice. Each byte value is thus represented by two
 different words, depending on whether that byte appears at an even
 or an odd offset from the beginning of the byte sequence. The two
 lists are readily distinguished by the number of syllables: the
 even list has words of two syllables, the odd list has words of
 three. The two lists have maximum word lengths of 9 and 11
 letters, respectively. The two-list scheme was suggested by Zhahai
 Stewart.
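
 The error detection described above can be sketched in a few lines
 of Python. This is only an illustration: the two dictionaries hold
 a small excerpt of the lists, and the function name
 first_parity_error is hypothetical, not part of PGP or PGPfone.

```python
# Small excerpt of the two lists (byte value -> word).
# The real lists have 256 entries each.
EVEN = {0x82: "miser", 0x94: "Pluto", 0xE5: "topmost", 0xE9: "treadmill"}
ODD = {0x82: "Istanbul", 0xA2: "Pacific", 0xE5: "travesty", 0xF2: "vagabond"}

EVEN_WORDS = {word.lower() for word in EVEN.values()}
ODD_WORDS = {word.lower() for word in ODD.values()}


def first_parity_error(words):
    """Return the index of the first word taken from the wrong list, or None."""
    for offset, word in enumerate(words):
        expected = EVEN_WORDS if offset % 2 == 0 else ODD_WORDS
        if word.lower() not in expected:
            return offset
    return None


correct = ["topmost", "Istanbul", "Pluto", "vagabond"]
transposed = ["Istanbul", "topmost", "Pluto", "vagabond"]
duplicated = ["topmost", "Istanbul", "Istanbul", "Pluto", "vagabond"]
omitted = ["topmost", "Pluto", "vagabond"]

print(first_parity_error(correct))     # None
print(first_parity_error(transposed))  # 0
print(first_parity_error(duplicated))  # 2
print(first_parity_error(omitted))     # 1
```

 In each faulty sequence a word lands at an offset of the wrong
 parity, so a membership check against the expected list is enough
 to flag it.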

Word lists

 Here are the two lists of words as presented in the PGPfone Owner's
 Manual.[3]

 Hex  Even Word  Odd Word
 ---  ---------  ----------
 00   aardvark   adroitness
 01   absurd     adviser
 02   accrue     aftermath
 03   acme       aggregate
 04   adrift     alkali
 05   adult      almighty
 06   afflict    amulet
 07   ahead      amusement
 08   aimless    antenna
 09   Algol      applicant
 0A   allow      Apollo
 0B   alone      armistice
 0C   ammo       article
 0D   ancient    asteroid
 0E   apple      Atlantic
 0F   artist     atmosphere
 10   assume     autopsy
 11   Athens     Babylon
 12   atlas      backwater
 13   Aztec      barbecue
 14   baboon     belowground
 15   backfield  bifocals
 16   backward   bodyguard
 17   banjo      bookseller
 18   beaming    borderline
 19   bedlamp    bottomless
 1A   beehive    Bradbury
 1B   beeswax    bravado
 1C   befriend   Brazilian
 1D   Belfast    breakaway
 1E   berserk    Burlington
 1F   billiard   businessman
 20   bison      butterfat
 21   blackjack  Camelot
 22   blockade   candidate
 23   blowtorch  cannonball
 24   bluebird   Capricorn
 25   bombast    caravan
 26   bookshelf  caretaker
 27   brackish   celebrate
 28   breadline  cellulose
 29   breakup    certify
 2A   brickyard  chambermaid
 2B   briefcase  Cherokee
 2C   Burbank    Chicago
 2D   button     clergyman
 2E   buzzard    coherence
 2F   cement     combustion
 30   chairlift  commando
 31   chatter    company
 32   checkup    component
 33   chisel     concurrent
 34   choking    confidence
 35   chopper    conformist
 36   Christmas  congregate
 37   clamshell  consensus
 38   classic    consulting
 39   classroom  corporate
 3A   cleanup    corrosion
 3B   clockwork  councilman
 3C   cobra      crossover
 3D   commence   crucifix
 3E   concert    cumbersome
 3F   cowbell    customer
 40   crackdown  Dakota
 41   cranky     decadence
 42   crowfoot   December
 43   crucial    decimal
 44   crumpled   designing
 45   crusade    detector
 46   cubic      detergent
 47   dashboard  determine
 48   deadbolt   dictator
 49   deckhand   dinosaur
 4A   dogsled    direction
 4B   dragnet    disable
 4C   drainage   disbelief
 4D   dreadful   disruptive
 4E   drifter    distortion
 4F   dropper    document
 50   drumbeat   embezzle
 51   drunken    enchanting
 52   Dupont     enrollment
 53   dwelling   enterprise
 54   eating     equation
 55   edict      equipment
 56   egghead    escapade
 57   eightball  Eskimo
 58   endorse    everyday
 59   endow      examine
 5A   enlist     existence
 5B   erase      exodus
 5C   escape     fascinate
 5D   exceed     filament
 5E   eyeglass   finicky
 5F   eyetooth   forever
 60   facial     fortitude
 61   fallout    frequency
 62   flagpole   gadgetry
 63   flatfoot   Galveston
 64   flytrap    getaway
 65   fracture   glossary
 66   framework  gossamer
 67   freedom    graduate
 68   frighten   gravity
 69   gazelle    guitarist
 6A   Geiger     hamburger
 6B   glitter    Hamilton
 6C   glucose    handiwork
 6D   goggles    hazardous
 6E   goldfish   headwaters
 6F   gremlin    hemisphere
 70   guidance   hesitate
 71   hamlet     hideaway
 72   highchair  holiness
 73   hockey     hurricane
 74   indoors    hydraulic
 75   indulge    impartial
 76   inverse    impetus
 77   involve    inception
 78   island     indigo
 79   jawbone    inertia
 7A   keyboard   infancy
 7B   kickoff    inferno
 7C   kiwi       informant
 7D   klaxon     insincere
 7E   locale     insurgent
 7F   lockup     integrate
 80   merit      intention
 81   minnow     inventive
 82   miser      Istanbul
 83   Mohawk     Jamaica
 84   mural      Jupiter
 85   music      leprosy
 86   necklace   letterhead
 87   Neptune    liberty
 88   newborn    maritime
 89   nightbird  matchmaker
 8A   Oakland    maverick
 8B   obtuse     Medusa
 8C   offload    megaton
 8D   optic      microscope
 8E   orca       microwave
 8F   payday     midsummer
 90   peachy     millionaire
 91   pheasant   miracle
 92   physique   misnomer
 93   playhouse  molasses
 94   Pluto      molecule
 95   preclude   Montana
 96   prefer     monument
 97   preshrunk  mosquito
 98   printer    narrative
 99   prowler    nebula
 9A   pupil      newsletter
 9B   puppy      Norwegian
 9C   python     October
 9D   quadrant   Ohio
 9E   quiver     onlooker
 9F   quota      opulent
 A0   ragtime    Orlando
 A1   ratchet    outfielder
 A2   rebirth    Pacific
 A3   reform     pandemic
 A4   regain     Pandora
 A5   reindeer   paperweight
 A6   rematch    paragon
 A7   repay      paragraph
 A8   retouch    paramount
 A9   revenge    passenger
 AA   reward     pedigree
 AB   rhythm     Pegasus
 AC   ribcage    penetrate
 AD   ringbolt   perceptive
 AE   robust     performance
 AF   rocker     pharmacy
 B0   ruffled    phonetic
 B1   sailboat   photograph
 B2   sawdust    pioneer
 B3   scallion   pocketful
 B4   scenic     politeness
 B5   scorecard  positive
 B6   Scotland   potato
 B7   seabird    processor
 B8   select     provincial
 B9   sentence   proximate
 BA   shadow     puberty
 BB   shamrock   publisher
 BC   showgirl   pyramid
 BD   skullcap   quantity
 BE   skydive    racketeer
 BF   slingshot  rebellion
 C0   slowdown   recipe
 C1   snapline   recover
 C2   snapshot   repellent
 C3   snowcap    replica
 C4   snowslide  reproduce
 C5   solo       resistor
 C6   southward  responsive
 C7   soybean    retraction
 C8   spaniel    retrieval
 C9   spearhead  retrospect
 CA   spellbind  revenue
 CB   spheroid   revival
 CC   spigot     revolver
 CD   spindle    sandalwood
 CE   spyglass   sardonic
 CF   stagehand  Saturday
 D0   stagnate   savagery
 D1   stairway   scavenger
 D2   standard   sensation
 D3   stapler    sociable
 D4   steamship  souvenir
 D5   sterling   specialist
 D6   stockman   speculate
 D7   stopwatch  stethoscope
 D8   stormy     stupendous
 D9   sugar      supportive
 DA   surmount   surrender
 DB   suspense   suspicious
 DC   sweatband  sympathy
 DD   swelter    tambourine
 DE   tactics    telephone
 DF   talon      therapist
 E0   tapeworm   tobacco
 E1   tempest    tolerance
 E2   tiger      tomorrow
 E3   tissue     torpedo
 E4   tonic      tradition
 E5   topmost    travesty
 E6   tracker    trombonist
 E7   transit    truncated
 E8   trauma     typewriter
 E9   treadmill  ultimate
 EA   Trojan     undaunted
 EB   trouble    underfoot
 EC   tumor      unicorn
 ED   tunnel     unify
 EE   tycoon     universe
 EF   uncut      unravel
 F0   unearth    upcoming
 F1   unwind     vacancy
 F2   uproot     vagabond
 F3   upset      vertigo
 F4   upshot     Virginia
 F5   vapor      visitor
 F6   village    vocalist
 F7   virus      voyager
 F8   Vulcan     warranty
 F9   waffle     Waterloo
 FA   wallet     whimsical
 FB   watchword  Wichita
 FC   wayside    Wilmington
 FD   willow     Wyoming
 FE   woodlark   yesteryear
 FF   Zulu       Yucatan

Examples

 Each byte in a bytestring is encoded as a single word. A sequence
 of bytes is rendered in network byte order, from left to right. The
 leftmost byte (byte 0) is considered "even" and is encoded using
 the PGP Even Word table. The next byte to the right (byte 1) is
 considered "odd" and is encoded using the PGP Odd Word table. This
 alternation repeats until all bytes are encoded. Thus, "E582"
 produces "topmost Istanbul", whereas "82E5" produces "miser
 travesty".

 A PGP public key fingerprint that is displayed in hexadecimal as

     E582 94F2 E9A2 2748 6E8B
     061B 31CC 528F D7FA 3F19

 would display in PGP Words (the "biometric" fingerprint) as

     topmost Istanbul Pluto vagabond treadmill Pacific brackish
     dictator goldfish Medusa
     afflict bravado chatter revolver Dupont midsummer stopwatch
     whimsical cowbell bottomless

 The order of bytes in a bytestring depends on endianness.
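
 The encoding in this section can be reproduced with a short Python
 sketch. It is not the PGP implementation: the tables below contain
 only the words needed for the examples above, and the names encode
 and decode are chosen here for illustration.

```python
# Excerpt of the two tables, limited to the bytes used in the examples.
EVEN = {
    0x06: "afflict", 0x27: "brackish", 0x31: "chatter", 0x3F: "cowbell",
    0x52: "Dupont", 0x6E: "goldfish", 0x82: "miser", 0x94: "Pluto",
    0xD7: "stopwatch", 0xE5: "topmost", 0xE9: "treadmill",
}
ODD = {
    0x19: "bottomless", 0x1B: "bravado", 0x48: "dictator", 0x82: "Istanbul",
    0x8B: "Medusa", 0x8F: "midsummer", 0xA2: "Pacific", 0xCC: "revolver",
    0xE5: "travesty", 0xF2: "vagabond", 0xFA: "whimsical",
}
EVEN_REVERSE = {word.lower(): value for value, word in EVEN.items()}
ODD_REVERSE = {word.lower(): value for value, word in ODD.items()}


def encode(hex_string):
    """Map each byte to a word, alternating tables, starting with EVEN."""
    data = bytes.fromhex(hex_string)  # whitespace between byte pairs is ignored
    # A byte missing from this excerpt raises KeyError.
    return [(EVEN if offset % 2 == 0 else ODD)[byte]
            for offset, byte in enumerate(data)]


def decode(words):
    """Map words back to bytes, rejecting words from the wrong table."""
    out = bytearray()
    for offset, word in enumerate(words):
        reverse = EVEN_REVERSE if offset % 2 == 0 else ODD_REVERSE
        key = word.lower()
        if key not in reverse:
            raise ValueError(f"word {offset} ({word!r}) is not in the expected list")
        out.append(reverse[key])
    return bytes(out)


print(" ".join(encode("E582")))  # topmost Istanbul
print(" ".join(encode("82E5")))  # miser travesty

fingerprint = "E582 94F2 E9A2 2748 6E8B 061B 31CC 528F D7FA 3F19"
words = encode(fingerprint)
print(" ".join(words))
print(decode(words).hex().upper())
```

 Decoding looks each word up only in the table expected for its
 offset, so a word from the other table is reported instead of being
 silently accepted.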

Other word lists for data

 There are several other word lists for conveying data in a clear
 unambiguous way via a voice channel:

   * the NATO phonetic alphabet maps individual letters and digits
     to individual words
   * the S/KEY system maps 64-bit numbers to 6 short words of 1 to 4
     characters each from a publicly accessible 2048-word
     dictionary. The same dictionary is used in RFC 1760 and RFC
     2289.
   * the Diceware system maps five base-6 random digits (almost 13
     bits of entropy) to a word from a dictionary of 7,776 distinct
     words.
       * the Electronic Frontier Foundation has published a set of
         improved word lists based on the same concept[4]
   * FIPS 181: Automated Password Generator converts random numbers
     into somewhat pronounceable "words".
   * mnemonic encoding converts 32 bits of data into 3 words from a
     vocabulary of 1626 words.[5]
   * what3words encodes geographic coordinates in 3 dictionary words.
   * the BIP39 standard permits encoding a cryptographic key of
     fixed size (128 or 256 bits, usually the unencrypted master key
     of a cryptocurrency wallet) into a short sequence of readable
     words known as the seed phrase, for the purpose of storing the
     key offline. This is used in cryptocurrencies such as Bitcoin
     or Monero.
   * Like the PGP word list, the Bytewords standard maps each
     possible byte to a word. There is only one list, rather than
     two. The words are uniformly four letters long and can be
     uniquely identified by their first and last letters.
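
 The dictionary sizes quoted in the list above can be compared by
 how many bits each word carries. The following Python sketch does
 the arithmetic; the helper name capacity_bits is hypothetical, and
 it assumes every word is chosen independently and uniformly from
 its dictionary.

```python
import math


def capacity_bits(dictionary_size, words=1):
    """Bits that `words` independent picks from the dictionary can carry."""
    return words * math.log2(dictionary_size)


print(capacity_bits(256))      # PGP word list: one word per byte
print(capacity_bits(2048, 6))  # S/KEY: six words for a 64-bit number
print(capacity_bits(6 ** 5))   # Diceware: five base-6 digits per word
print(capacity_bits(1626, 3))  # mnemonic encoding: three words for 32 bits
```

 Under these assumptions, six S/KEY words can carry 66 bits, two
 more than the 64-bit number they encode, and three words from a
 1626-word vocabulary just cover 32 bits because 1626 cubed is
 slightly larger than 2 to the power 32.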

References

     This article incorporates material that is copyrighted by PGP
     Corporation and has been licensed under the GNU Free
     Documentation License. (per Jon Callas, CTO, CSO PGP
     Corporation, 4-Jan-2007)

    1. ↑ Juola, Patrick; Zimmermann, Philip (1996). "Whole-word
       phonetic distances and the PGPfone alphabet (Archived)"
       (PDF). Proceeding of Fourth International Conference on
       Spoken Language Processing. ICSLP '96. Vol. 1. pp. 98–101.
       doi:10.1109/ICSLP.1996.607046. ISBN 0-7803-3555-4.
       S2CID 10385500. Archived from the original (PDF) on 7
       September 2006.
    2. ↑ Juola, Patrick (1996). "Isolated Word Confusion Metrics and
       the PGPfone Alphabet". Proceedings of New Methods in Language
       Processing 2. Ankara, Turkey: Oxford University, Dept. of
       Experimental Psychology. arXiv:cmp-lg/9608021.
       Bibcode:1996cmp.lg....8021J.
    3. ↑ "Archived copy". web.mit.edu. Archived from the original on
       26 March 2010. Retrieved 12 January 2022.
    4. ↑ "EFF's New Wordlists for Random Passphrases". 19 July 2016.
    5. ↑ mnemonic encoding Archived 2008-03-02 at the Wayback
       Machine and updated code


From: <https://en.wikipedia.org/wiki/PGP_word_list>