GEOMYS SPECIFICATION
========================================================================
Modified: 8 Feb 2018
Author: Auzymoto
Version: 1.0
========================================================================

SECTION 1: INTRODUCTION

   Geomys  is a gopher subscription tool; it allows you to subscribe to
   gopher URLs and automatically checks  when  they  have been updated.
   This document's intent is to specify how geomys will be used and how
   it will work internally.

   There have been other utilities that implement gopher  subscription,
   but  it  is  the  goal  of  geomys  to  implement more sophisticated
   features such as: being able to recursively subscribe to a menu  and
   all  sub-menus,  tolerance  for  the reorganization of gopher menus,
   tolerance for the deletion of content, output  modes  that  make  it
   easy  to  build a content aggregator, and a simple interface to make
   it easy to insert the functionality of geomys into gopher clients.

   The name 'geomys' is derived from Geomys, the Latin name of a  genus
   of gophers.

SECTION 2: COMMAND LINE SYNTAX

   The functionality of geomys is broken up into multiple modules.  The
   general format for geomys commands is:

     geomys <module> [{-d | --database}="PATH"] <module arguments>

   Geomys uses a database file.  By default the database is  stored  in
   the user's home directory at "~/geomys.db".  If the -d or --database
   flag is given, the database file at PATH is used instead.  This flag
   can  be  used  with any module.  This is very useful  if you want to
   use multiple subscription lists.  The database stores  the  list  of
   subscriptions  and  corresponding  metadata,  in addition to the new
   content  information  that  geomys  generates  by  running   "geomys
   update".

   The modules are as follows:
     subscribe
     unsubscribe
     update
     look
     list
     edit

SECTION 2.1: SUBSCRIBE MODULE

   The  subscribe  module  is used to create a new gopher subscription.
   If you try to create a  subscription  that  already  exists  in  the
   database, you will get an error asking you to edit said subscription
   instead.  If the URL cannot  be  resolved  then  an  error  will  be
   printed.  The format for the subscribe module is as shown below, with
   different options displayed here  on  multiple  lines  for  clarity.
   Arguments may be given in any order, but the URL is expected last.

     geomys subscribe [-s | --single]
                      [-f | --file]
                      [{-n | --name}="SUBSCRIPTION NAME"]
                      [-m | --menus]
                      [-a | --all]
                      <URL>


   <URL>

       This  is  the  base  URL  that  geomys will look at to check for
       updates.  The prefix "gopher://" is optional.   The  URL  should
       point to a gopher menu, unless the -f flag is given.

   -s and --single

       By  default geomys searches recursively through gopher menus and
       submenus.  This flag restricts geomys to checking only a  single
       gopher menu.

   -f and --file

       This flag denotes that the URL is pointing to a file rather than
       a gopher menu.  When this flag is given, geomys  will  calculate
       and  store a checksum for the file to use for comparison.   When
       -f is given, geomys will ignore the -s and -m flags.

   -n and --name

       This argument gives a name to the subscription.  The  name  will
       be  used for the new content listings that geomys generates.  If
       no name is given, the URL is used as a name instead.

   -m and --menus

       When looking at gopher menus, geomys  only  considers  the  URLs
       that  point  to  files.  This allows someone to reorganize their
       gopherhole or change a heading etc. and  for  geomys  to  ignore
       such changes.  This option forces geomys to also use gopher menu
       URLs that point to other  gopher  menus  in  its   calculations.
       This  is  useful   when  people publish content as gopher  menus
       instead of text files.

   -a and --all

       This option tells geomys to look at all gopher menus  and files.
       Geomys  will  store  the  checksums  of  all the content.  It is
       recommended to use this option sparingly, as  ANY  change  to  a
       gopherhole will be registered.
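
   As an illustration only, the subscribe syntax above could be  parsed
   along the following lines with Python's argparse.  The function names
   and the decision to strip the "gopher://" prefix are  assumptions  of
   this sketch; the specification does not prescribe an implementation.

```python
import argparse


def build_subscribe_parser():
    """Hypothetical parser for 'geomys subscribe' (illustrative only)."""
    parser = argparse.ArgumentParser(prog="geomys subscribe")
    parser.add_argument("-d", "--database", default="~/geomys.db")
    parser.add_argument("-s", "--single", action="store_true")
    parser.add_argument("-f", "--file", action="store_true")
    parser.add_argument("-n", "--name", default=None)
    parser.add_argument("-m", "--menus", action="store_true")
    parser.add_argument("-a", "--all", action="store_true")
    parser.add_argument("url")
    return parser


def parse_subscribe(argv):
    """Parse subscribe arguments and apply the rules described above."""
    args = build_subscribe_parser().parse_args(argv)
    # The "gopher://" prefix is optional, so strip it for a uniform form.
    if args.url.startswith("gopher://"):
        args.url = args.url[len("gopher://"):]
    # If no name is given, the URL is used as the name instead.
    if args.name is None:
        args.name = args.url
    # When -f is given, the -s and -m flags are ignored.
    if args.file:
        args.single = False
        args.menus = False
    return args
```

   For example, parse_subscribe(["-f", "-s", "sdf.org/0/users/x.txt"])
   yields a result in which the file flag is set and the single flag has
   been cleared.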

SECTION 2.2: UNSUBSCRIBE MODULE

   The  unsubscribe  module  is  used to remove a subscription from the
   database.  The subscription ID must be given, which can be found  by
   running  "geomys list".  An error will be printed if no ID is given,
   or if the ID does not exist in the database.  The  format  for  this
   command is as follows:

     geomys unsubscribe <SUBSCRIPTION ID>

SECTION 2.3: UPDATE MODULE

   The  update  module fetches content from gopherspace and compares it
   with the information in the database in  order  to  generate  a  new
   content  list.   This  module does not display the new content list.
   If a gopher subscription is inaccessible, then it is ignored.    The
   update module does not have any module specific flags.

SECTION 2.4: LOOK MODULE

   The look module is used to display the new content list.  The format
   for the command is as follows:

     geomys look [-g | --gopher] [-o | --original]

   -g and --gopher

       By default geomys displays the  new  content  list  in  a  human
       readable format.  This flag tells geomys to output a gopher menu
       format, suitable to be fed into a gopher client or for use in an
       aggregator.

   -o and --original

       By  default geomys will display links to each new file directly.
       For example, if someone put up 3 new posts on their  gopherhole,
       then  there  would be 3 links in the new content list that would
       point directly at the posts.  This flag tells geomys to only use
       the original subscription URL for its new content listing.
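
   The two flags above can be sketched as follows.  A gopher menu  line
   consists of an item type character and display string, followed by a
   selector, host, and port, separated by tabs.  The function names, the
   shape of the two input mappings, and the display strings are  all
   assumptions made for this sketch.

```python
def split_gopher_url(url):
    """Split 'host[:port]/<type><selector>' into its parts."""
    if url.startswith("gopher://"):
        url = url[len("gopher://"):]
    hostport, _, path = url.partition("/")
    host, _, port = hostport.partition(":")
    # An empty path means the server's root menu (item type 1).
    item_type = path[:1] or "1"
    selector = path[1:]
    return host, int(port or 70), item_type, selector


def menu_line(display, url):
    """Format one gopher menu line for the given URL."""
    host, port, item_type, selector = split_gopher_url(url)
    return f"{item_type}{display}\t{selector}\t{host}\t{port}\r\n"


def render_look(new_content, subscriptions, original=False):
    """Render the new content list as a gopher menu.

    new_content maps subscription ID -> list of new URLs;
    subscriptions maps subscription ID -> (name, base URL).
    """
    lines = []
    for sub_id, urls in new_content.items():
        if not urls:
            continue
        name, base_url = subscriptions[sub_id]
        if original:
            # -o: one link per subscription, pointing at its base URL.
            lines.append(menu_line(name, base_url))
        else:
            # Default: one direct link per new item.
            lines.extend(menu_line(f"{name}: {u}", u) for u in urls)
    return "".join(lines)
```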

SECTION 2.5: LIST MODULE

   The  list  module is used to find information about subscriptions in
   the database.  If no ID is given, a list of all  subscriptions  with
   their  corresponding subscription IDs will be printed.  If the ID is
   given, the metadata  corresponding  to  that  subscription  will  be
   printed.  The format is as follows:

     geomys list [<SUBSCRIPTION ID>]

SECTION 2.6: EDIT MODULE

   The  edit module is used to edit subscription metadata.  After edits
   have been made, the new  metadata  will  be  printed.   This  module
   requires  a  subscription  ID, which can be found by running "geomys
   list".  The format for the command is as follows, with  flags  being
   shown here on multiple lines for clarity:

     geomys edit <SUBSCRIPTION ID> [-s | --single]
                                   [-f | --file]
                                   [{-n | --name}="NAME"]
                                   [-m | --menus]
                                   [-a | --all]
                                   [{-u | --url}="URL"]

   The flags -s, -f, -m, and -a will toggle on and off the flags in the
   database.  For example if the -a flag is active for the subscription
   #11,  then running "geomys edit 11 -a" will turn off the -a flag for
   that subscription.

   The  -n  and  -u  flags  will  update  the  name  and  URL  of   the
   subscription.
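
   A minimal sketch of the toggle behaviour, assuming  a  subscription
   is held as a plain dictionary with "name", "url", and "flags"  keys.
   That layout and the function name apply_edit are hypothetical.

```python
def apply_edit(subscription, toggles=(), name=None, url=None):
    """Return an edited copy of a subscription record (a plain dict)."""
    edited = dict(subscription)
    edited["flags"] = dict(subscription["flags"])
    for flag in toggles:
        # Each flag given on the command line flips the stored value
        # rather than setting it.
        edited["flags"][flag] = not edited["flags"][flag]
    if name is not None:
        edited["name"] = name
    if url is not None:
        edited["url"] = url
    return edited
```

   Applying the same toggle twice returns the flag to its earlier value,
   which matches the "geomys edit 11 -a" example above.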

SECTION 3: EXAMPLE AGGREGATOR

   With  the above mentioned functionality, it would be quite simple to
   set up a gopher aggregator with a few scripts.

   Let's say you wanted to build a very simple aggregator so  that  the
   20  most  recent posts would be displayed.  For this example we will
   assume   that   your   gopher   server   is   serving    the    file
   "/gopher/aggregator/main.gopher"   when   someone   accesses    your
   aggregator.  The file "/gopher/aggregator/recent.gopher"  will  hold
   the     list     of     recent     links,     while     the     file
   "/gopher/aggregator/heading.gopher" will hold a heading that will be
   displayed at the beginning of the gopher menu.

   We  will  check gopherspace for updates by running this script every
   hour:

       cd /gopher/aggregator/

       geomys update -d="./geomys.db"
       geomys look -g -d="./geomys.db" > temp

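       # Prepend temp to recent.gopher and keep only the first 20
       # lines.  The result goes to a second temporary file first,
       # because redirecting straight into recent.gopher would
       # truncate it before cat could read it.
       cat temp ./recent.gopher | head -n 20 > temp2
       mv temp2 ./recent.gopher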

       # assemble the main aggregator listing
       cat ./heading.gopher ./recent.gopher > ./main.gopher

   With more sophisticated scripts you  could  even  set  up  automatic
   archiving,  and  have  multiple  subscription lists.  There are many
   sophisticated setups you could build using geomys.

SECTION 4: DATABASE FORMAT

   Geomys stores everything in a single text file.  The  exact  details
   and  syntax  will not be laid down here; instead the general idea of
   what geomys is storing will be discussed.   Also,  a  very  informal
   BNF notation will be used for clarification.

   The  database  consists  of  two  main  sections:  a   new   content
   section and a subscription section.   The  new  content  section  is
   generated  by  running  "geomys  update",  and  stores a list of new
   content URLs.  The subscription section stores flags and information
   about  gopher  content  which  is  used  to generate the new content
   section.

     <database> ::= <new content section> <subscription section>

   The new content section  is  simple:   it   is   a   list   of
   subscription IDs each with a list of URLs that point to new content.
   When "geomys look" is run, geomys simply spits  out  all  the  URLs.
   When  the  -o flag is given, it spits out the subscriptions instead.
   Here the abbreviation "NC" is used in place of "New Content".

     <new content section> ::= <List of NCitems>

     <NCitem> ::= <subscriptionID> <list of URLs>

   The subscription section is a little bit  more  complicated.   Every
   subscription  has  stored with it a name, base URL, subscription ID,
   and flag values.  These are the values shown when "geomys list <ID>"
   is  run.   Geomys also optionally stores a list of URLs and a list of
   hashes.  Whether or not the URL list or hash list is stored  depends
   on the flags.  These two lists are used for comparisons when "geomys
   update" is run.

     <subscription section> ::= <list of subscriptions>

     <subscription> ::= <SUBitems> |
                        <SUBitems> <URL list> |
                        <SUBitems> <hash list> |
                        <SUBitems> <URL & hash list>

     <SUBitems> ::= <subscription ID>
                    <base URL>
                    <name>
                    <flags>

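   The grammar above can be mirrored by an in-memory model such as  the
   one below.  This is a sketch only: the class and  field  names  are
   hypothetical, and storing the hashes as a mapping from URL to checksum
   is an assumption, since the text only speaks of a "list of hashes".

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Flags:
    single: bool = False       # -s
    file: bool = False         # -f
    menus: bool = False        # -m
    all_content: bool = False  # -a


@dataclass
class Subscription:
    # The four SUBitems: ID, base URL, name, and flags.
    subscription_id: int
    base_url: str
    name: str
    flags: Flags = field(default_factory=Flags)
    # Optional lists; which ones are populated depends on the flags.
    urls: List[str] = field(default_factory=list)
    hashes: Dict[str, str] = field(default_factory=dict)


@dataclass
class Database:
    # New content section: subscription ID -> URLs of new content.
    new_content: Dict[int, List[str]] = field(default_factory=dict)
    # Subscription section: subscription ID -> subscription.
    subscriptions: Dict[int, Subscription] = field(default_factory=dict)
```
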
SECTION 5: ALGORITHM

   This section will cover how geomys interacts with  gopherspace,  and
   will  discuss  how  the  update  module  generally   works.      The
   operations of the other modules will not be  covered,  as  they  are
   mostly just operating on the database.

   As a quick note, the subscribe module uses the same functionality as
   the update module to initialize a new subscription,  the  difference
   being  that  the  subscribe  module  does not generate a new content
   list.

   The process defined below is  how  geomys  operates  for  a   single
   subscription.  Geomys repeats these steps for every subscription.

   Geomys  will  start  at  the  base  URL  that  was  defined with the
   subscription.  From here it will fetch the gopher menu  at  the  URL
   and  pull  out  all  the  links  found  within  it.   Geomys ignores
   information that isn't a link.  It will add these links to  a  list
   and follow each link only if it meets both of the following criteria:

   1) The  destination  URL shares the same  root path as the base URL.
      For example if the base URL is "sdf.org/1/users/solderpunk", then
      a  link  of  "sdf.org/1/users/solderpunk/phlog"  is  valid  while
      "sdf.org/1/users/tomasino/" is not.

   2) The destination URL should not have been encountered already.

   These rules are to ensure that geomys does not  get  caught  in  an
   infinite
   loop or start to explore the wilds of gopherspace.
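
   The link extraction step that precedes these criteria can be sketched
   as below.  Each line of a gopher menu holds an item type and display
   string, a selector, a host, and a port, separated by  tabs.   The
   function name parse_menu and the "host/<type><selector>" URL form it
   returns are assumptions of this sketch, as is the choice to skip item
   types "i" (informational text) and "3" (error) as non-links.

```python
def parse_menu(text):
    """Return the link URLs found in the text of a gopher menu."""
    links = []
    for line in text.splitlines():
        if line == ".":
            break  # a lone period marks the end of the menu
        fields = line.split("\t")
        if len(fields) < 4 or not fields[0]:
            continue  # malformed line, nothing to extract
        item_type = fields[0][0]
        if item_type in ("i", "3"):
            continue  # informational and error lines are not links
        selector, host, port = fields[1], fields[2], fields[3].strip()
        # Port 70 is the gopher default and is left out of the URL.
        hostport = host if port == "70" else f"{host}:{port}"
        links.append(f"{hostport}/{item_type}{selector}")
    return links
```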

   This  process  is repeated recursively for all the links that geomys
   follows.  At the end of this process,  geomys is  left  with  a  big
   list of URLs.
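
   A rough sketch of this traversal follows.  It uses an explicit queue
   in place of recursion, which visits the same set of menus.   Several
   points are assumptions: fetch_links is a hypothetical caller-supplied
   function returning the link URLs of a menu; "same root path" is read
   as "same host, and a selector at or below the base selector"; links
   outside the root are recorded but not followed, which the text above
   leaves open; and only item type 1 (menu) links are followed.

```python
from collections import deque


def item_type(url):
    """Return the item type character of 'host/<type><selector>'."""
    _, _, path = url.partition("/")
    return path[:1] or "1"


def same_root(base_url, url):
    """Criterion 1: url lies at or below the base URL's path."""
    base_host, _, base_path = base_url.partition("/")
    host, _, path = url.partition("/")
    # Drop the leading item type so files and menus compare alike.
    base_sel = base_path[1:].rstrip("/")
    sel = path[1:].rstrip("/")
    return host == base_host and (
        sel == base_sel or sel.startswith(base_sel + "/"))


def crawl(base_url, fetch_links, single=False):
    """Collect the links reachable from base_url."""
    seen = {base_url}
    found = []
    queue = deque([base_url])
    while queue:
        menu_url = queue.popleft()
        for link in fetch_links(menu_url):
            if link in seen:
                continue  # criterion 2: already encountered
            seen.add(link)
            found.append(link)
            # With -s, no links are followed at all.
            if single or not same_root(base_url, link):
                continue
            if item_type(link) == "1":
                queue.append(link)
    return found
```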

   Now  geomys  will filter out any URLs that point to menus unless the
   -m flag is active.  This allows people to  reorganize  their  gopher
   menus without triggering geomys.

   The next step is to compare the list of URLs with the URLs stored in
   the database.  If there are any links in the list that are  not  in
   the database, geomys will add those links to the  new  content
   list.
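
   The filtering and comparison steps amount to a  set  difference,  as
   the sketch below shows.  The function names are hypothetical, and the
   sketch assumes that a link points to a menu exactly when its gopher
   item type is 1.

```python
def item_type(url):
    """Return the item type character of 'host/<type><selector>'."""
    _, _, path = url.partition("/")
    return path[:1] or "1"


def new_links(crawled, stored, menus=False):
    """Return the crawled URLs that are absent from the stored list."""
    if not menus:
        # Without -m, links that point to menus are ignored.
        crawled = [u for u in crawled if item_type(u) != "1"]
    known = set(stored)
    # Keep crawl order so the new content list is stable.
    return [u for u in crawled if u not in known]
```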

   If the -a flag is active, then geomys will also calculate  checksums
   for  each menu and file it visits.  It will compare the checksums to
   those in the database, and if any checksum has changed then the  new
   checksum  will be written to the database and the link will be added
   to the new content list.
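
   The checksum comparison could look like the sketch below.  The  text
   does not name a checksum algorithm, so SHA-256 is an assumption here,
   as are the function names and the choice to treat a URL with no stored
   checksum as changed.

```python
import hashlib


def checksum(data):
    """Return a hex digest for the raw bytes of a menu or file."""
    return hashlib.sha256(data).hexdigest()


def changed_links(fetched, stored_hashes):
    """Compare fetched content against the stored checksums.

    fetched maps URL -> raw bytes; stored_hashes maps URL -> checksum
    and is updated in place.  Returns the URLs whose checksum is new
    or different.
    """
    changed = []
    for url, data in fetched.items():
        digest = checksum(data)
        if stored_hashes.get(url) != digest:
            stored_hashes[url] = digest
            changed.append(url)
    return changed
```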

   If the -f flag is active, geomys will only calculate and compare the
   checksum for the base URL.

   If  the  -s  flag is active, geomys won't follow any of the links it
   finds in gopher menus.