GOPHERNICUS GOTCHAS

Before I forget them, I thought I should collect some of the
surprises that I had while developing GophHub, and earlier
projects, to run in the CGI environment of Gophernicus (version
3.1.1).

* There are two ways of running a CGI executable that are described
in the Gophernicus documentation (perhaps forgetting the "=mapfile"
command which can be used in static gophermaps): Save (or symlink
to) it as a "gophermap" file, or save (or symlink to) it in the
cgi-bin directory. The notable thing is that, regardless of the
selector item type that you use to access it via the client, the
"gophermap" script's output will always have gophermap rendering
automatically applied (which will mess up plain-text output), and
the cgi-bin script won't. So if you need a script to do plain-text
output as well as present a menu (e.g. GophHub with file viewing via
Gopher enabled), you need to use a symlink so that the script
appears in both places, and have it write links to plain-text
content that point to the cgi-bin location.
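A minimal sketch of that layout (the directory and script names
here are made up for illustration, and a temporary directory
stands in for the real Gopher root):

```shell
# Sketch: one script reachable both as a "gophermap" (menu
# output, gophermap rendering applied) and under cgi-bin (raw
# output, no rendering). Paths are examples only.
root=$(mktemp -d)
mkdir -p "$root/myapp" "$root/cgi-bin"

printf '#!/bin/sh\necho "hello"\n' > "$root/cgi-bin/myapp"
chmod +x "$root/cgi-bin/myapp"

# Menu entry point: Gophernicus renders this output as a gophermap
ln -s "$root/cgi-bin/myapp" "$root/myapp/gophermap"

# Plain-text links written by the script should then point at the
# /cgi-bin/myapp location, where output passes through untouched
```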

* Gophernicus interprets the first character on a line in
gophermaps for lots of special functions, so if you're including
content from an external source, you can't let that content start
at the first character of any lines in the gophermap. For the
README included in the GophHub root Git directory view, I indented
the content by one space using "pr" from GNU Coreutils:
pr -o 1 -T file.txt

* Gophernicus allows setting CGI variables in the URL with the same
"[protocol]://example.com/[path]?var=val&var2=val..." URL parameter
syntax as HTTP GET requests. This is used extensively in GophHub
and works well with the original UMN Gopher client. However, some
multi-protocol clients struggle with it when used in search fields
(item type 7) because they internally treat the search argument as
a "?" parameter of the URL, and don't like there already being a
"?" on the end. For example, I noticed that early versions of
Firefox, from when Gopher support was still present, didn't work
with the settings menu on the main GophHub gophermap, because they
stripped off the existing URL parameters when you used the search
link.
the end I decided to offer this feature anyway after I noticed that
most other clients seemed to cope with it OK. Web proxies might
also have trouble.

* Clients expect search selectors (item type 7) to return
gophermaps, just like item type 1. So you shouldn't try to return
arbitrary plain-text content directly from a search query or the
client will interpret what it thinks are gophermap elements within
the text, potentially messing up the display and file saving.
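One way around that is to wrap the text in gophermap info ('i')
lines before returning it. This is only a sketch: the function
name is my own, and the "-", "error.host" and "1" are the
conventional dummy selector/host/port fields for display-only
lines (GNU Sed is assumed for the "\t" escapes):

```shell
# Sketch: convert arbitrary text on stdin into gophermap info
# ('i') lines, so a type 7 response can't be misparsed as menu
# items. Tabs in the display text are flattened to spaces first,
# since tabs separate the gophermap fields.
to_info_lines() {
 sed -e 's/\t/    /g' -e 's/^/i/' -e 's/$/\t-\terror.host\t1/'
}
```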

* By default, Gophernicus converts line endings in text files to
DOS-style (CRLF), which can break running UNIX shell scripts
downloaded over Gopher. The UMN Gopher client converts back to
UNIX-style newlines when you save a file, but other clients,
including Curl, don't. So to offer UNIX-compatible script
downloads, I use Sed to add a null byte on the second line of the
file, and avoid using a ".sh" file extension, which tricks
Gophernicus into thinking it's a binary file so that it doesn't
convert the newlines. That's how the "Download" links work in the
scripts section here. This is the Sed command:
sed '2i # \x00' script.sh > script

If you run the Gopher server yourself, you can also change the CRLF
conversion behaviour with command-line options to Gophernicus. In
2020 I posted more information on this to the Tildiverse NNTP
network in the tilde.projects newsgroup (Message-ID:
<[email protected]>), which you can also read here:
gopher://aussies.space/0/~freet/cupboard/Gophernicus_newline_conversion.txt

DYNAMIC GOPHER TIPS

Some general tips for dynamic Gopher content with Gophernicus.

* CGI scripting with Gopher, as with HTTP, allows you to break the
rigid directory structure of URLs. Usually this structure is so
reliable in Gopher that some clients even generate a directory tree
view of the directories in the URL and previously-viewed
Gophermaps. Ideally I like to preserve this rigid directory
structure as much as possible. With GophHub it wasn't practical
because different Git repos can have any directory structure
imaginable. But with Gopher Banker I made the script that generates
the rates table gophermap also create directories for each currency
conversion option, so that links to convert to specific currencies
look like this:
gopher://tilde.club/7/~freet/currconv/AUD
Symlinks were used so that the executable "gophermap" script inside
each currency ID directory points back to the one
"gopherconvert.sh" executable.

* Here are some useful CGI environment variables that are available
to CGI scripts running with Gophernicus (originally found by
looking at the Gophernicus source code). Variable names are
prepended with "$", as in shell scripting:

Search request (if none, equals URL parameters): $SEARCHREQUEST
URL parameters (if none, equals search request): $QUERY_STRING
Executable's file name (with full path): $SCRIPT_FILENAME
Gopher URL pointing to this executable: $HTTP_REFERER
Selector string from request (with URL parameters): $SELECTOR
Path section of the request (no URL parameters): $REQUEST
IP address of client: $REMOTE_ADDR
Server host name: $SERVER_NAME
Root directory for public Gopher content: $DOCUMENT_ROOT
Executable's working directory: $PWD
Server software (also OS): $SERVER_SOFTWARE
Character set in use: $GOPHER_CHARSET
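For debugging, a throwaway CGI script can simply print those
variables back to the client. A sketch (the variable list matches
the table above; the function and loop are my own):

```shell
# Sketch: dump the Gophernicus CGI environment, one variable per
# line. The leading space keeps the first column clear, so none
# of the lines trigger gophermap special functions.
dump_env() {
 for v in SEARCHREQUEST QUERY_STRING SCRIPT_FILENAME \
          HTTP_REFERER SELECTOR REQUEST REMOTE_ADDR \
          SERVER_NAME DOCUMENT_ROOT PWD SERVER_SOFTWARE \
          GOPHER_CHARSET
 do
  eval "printf ' %s=%s\n' \"$v\" \"\${$v}\""
 done
}
```

Saved as an executable "gophermap", each line would be shown as an
info line in the client.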

* In a Bash script, you can convert URL parameters into positional
parameters for processing like command-line arguments:

------------------------------------------------------------------
# Turn query string into positional parameters, as if options were
# command-line arguments

oldIFS="$IFS"
set -f    # stop query values like "*" glob-expanding
IFS='&'
set -- $QUERY_STRING
set +f
IFS="$oldIFS"
------------------------------------------------------------------

The parameters are split by "&" characters as if those were spaces.
So in the script they can now be accessed with the "$@" variable,
individually as "$1", "$2"..., or any other way that normal
arguments can be read and manipulated. For example, if the script
is run from "gopher://example.com/1/test?var=val&var2=val2", it
should show "$1" as "var=val", and "$2" as "var2=val2". Then you
can do this to set the values of recognised local variables from
the URL parameters:

------------------------------------------------------------------
# Process query string

var=
var2=

while [ "$1" ]
do
 case "$1" in
   var=*) var="${1#*=}" ;;
   var2=*) var2="${1#*=}" ;;
 esac
 shift
done
------------------------------------------------------------------

"$var" now equals "val", and "$var2" is "val2", as set from the
URL. Be aware that these variables should always be quoted when
used, to protect against injection attacks. I prefer to switch into
the restricted mode in Bash ("set -r") as an extra level of
protection before using any variables that have been set to
user-provided values.
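Put together, a cautious version of the whole pattern might look
like this (the "set -f"/"set +f" calls and the function wrapper
are my additions; the variable names are from the examples above):

```shell
# Sketch: parse $QUERY_STRING into named variables, with globbing
# disabled so query values like "var=*" can't expand to file
# names during the unquoted "set --".
parse_query() {
 oldIFS="$IFS"
 set -f
 IFS='&'
 set -- $QUERY_STRING
 set +f
 IFS="$oldIFS"

 var=
 var2=
 while [ "$1" ]
 do
  case "$1" in
    var=*) var="${1#*=}" ;;
    var2=*) var2="${1#*=}" ;;
  esac
  shift
 done
}
# After calling parse_query, "set -r" can lock the shell down
# before "$var" and "$var2" are used.
```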

- The Free Thinker