Survey of CGI handling by Gopher servers
                 by Christopher Williams
                        2025-02-28

In writing my own Gopher server, I wanted to support CGIs
in a standard way, in compliance with RFC 3875, the Common
Gateway Interface[1], and consistent with other Gopher
servers (as CGI support seemed to differ between them). I
actually started with the idea of just having simple support
for running scripts called "moles", as Bucktooth calls them,
but I soon realized that CGIs are the way to go and would
actually be fairly straightforward (Bucktooth’s moles are
basically CGIs but without all of the required environment
variables set).

So I wanted to see how existing Gopher servers handled CGIs,
especially when it comes to environment variables (as those
seemed to vary the most between servers). Here’s a list of
servers that I know about and could get my hands on:

* Bucktooth[2]

* Gophernicus[3]

* port70[4]

* geomyidae[5]

* Motsognir[6]

* PyGopherd[7]

Sean Conner also wrote a piece[8] similar to this post.

I plan to update this as I find more differences and other
gotchas between servers.

------------------------------------------------------------
                   Overview of servers
------------------------------------------------------------


# Bucktooth

Let’s start with one of the oldest Gopher servers still in
common use.

To be clear, Bucktooth technically supports "moles" and
not CGIs. What this means in practice is that a Bucktooth
mole cannot depend on the presence of most CGI environment
variables. In particular, Bucktooth sets only the following
environment variables:

* `SERVER_PORT`

* `SERVER_HOST`

* `REMOTE_ADDR`

* `REMOTE_HOST`

* `REMOTE_PORT`

* `SELECTOR`

* `REQUEST`

It also passes the search string (if any) as command-line
arguments and appends the search string to the end of
`SELECTOR` separated by a `?` (basically like a CGI query
string).
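
As a rough sketch of what that means for a script, here is
how a mole written in Python might recover the selector and
search string. The function name `mole_request` is made up
for illustration, and joining the command-line arguments
with spaces is my assumption about how the search string is
split up, not something taken from Bucktooth’s source.

```python
import os
import sys

def mole_request(environ=os.environ, argv=sys.argv):
    """Return (selector, search) for a Bucktooth-style mole."""
    selector = environ.get("SELECTOR", "")
    # The search string, if any, follows a `?` in SELECTOR.
    path, sep, search = selector.partition("?")
    if not sep:
        # Fall back to the command-line arguments.
        # ASSUMPTION: arguments are rejoined with single spaces.
        search = " ".join(argv[1:])
    return path, search

if __name__ == "__main__":
    print(mole_request())
```

Relying on `SELECTOR` first keeps the search string intact
even if it contains runs of whitespace that argument
splitting would lose.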


# Gophernicus

This one provides the most environment variables of all
servers I looked at. I won’t list all variables, but it sets
all variables that CGI requires, with one exception, plus a
handful of de facto standard variables.


# port70

This sets all of the required CGI environment variables.


# geomyidae

This sets all required variables.


# Motsognir

This is missing a few required variables.


# PyGopherd

This is missing a few required and important variables.

------------------------------------------------------------
                    Major differences
------------------------------------------------------------


# Query strings

Perhaps the biggest difference I found is with how each
server handles query strings and search strings. A query
string is text that follows a `?` in a URL; in Gopher it
would be text that follows a `?` in the selector.

CGI has no concept of Gopher’s search string, though some
servers conflate it with a query string.

Most servers simply set the `QUERY_STRING` variable to the
text that follows `?`.

With Motsognir, if a request contains a search string,
`QUERY_STRING` will contain the search string instead.
Motsognir also makes the query string available in the
`QUERY_STRING_URL` variable and the search string in the
`QUERY_STRING_SEARCH` variable.

geomyidae, Gophernicus, and PyGopherd all set the
non-standard `SEARCHREQUEST` variable to the search string.
geomyidae also sets the `X_GOPHER_SEARCH` variable to the
search string.

port70 sets `QUERY_STRING` to the search string.

port70, PyGopherd, and Bucktooth don’t really support query
strings per RFC 3875. A script running under port70 can
find the query string inside the `GOPHER_DOCUMENT_SELECTOR`
variable (after removing everything up to and including
the first `?`). PyGopherd and Bucktooth provide the same
information inside the `SELECTOR` variable instead. (I
haven’t actually tested this on any of these three servers;
some of them might instead return an error if a selector
contains a query string. Bucktooth seems to recognize either
a `?` or a tab as the separator between the request and the
query/search string.)
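
To make the differences concrete, here is a minimal sketch
of a helper that a portable script could use to get both
strings. The name `gopher_query` is hypothetical, and the
sketch assumes that the mere presence of the Motsognir or
port70 variables is enough to tell those servers apart. It
follows only what is described in this section and is
untested against the real servers.

```python
import os

def gopher_query(environ=os.environ):
    """Best-effort (query_string, search_string) pair."""
    env = environ
    # Motsognir: dedicated variables for each string.
    if "QUERY_STRING_URL" in env or "QUERY_STRING_SEARCH" in env:
        return (env.get("QUERY_STRING_URL", ""),
                env.get("QUERY_STRING_SEARCH", ""))
    # port70: QUERY_STRING holds the search string; the
    # query string is whatever follows the first `?` in
    # GOPHER_DOCUMENT_SELECTOR.
    if "GOPHER_DOCUMENT_SELECTOR" in env:
        selector = env["GOPHER_DOCUMENT_SELECTOR"]
        return selector.partition("?")[2], env.get("QUERY_STRING", "")
    # geomyidae, Gophernicus, PyGopherd: non-standard
    # variables carry the search string.
    search = env.get("X_GOPHER_SEARCH") or env.get("SEARCHREQUEST", "")
    # Otherwise use QUERY_STRING, falling back to the text
    # after `?` in SELECTOR (PyGopherd, Bucktooth). Under
    # Bucktooth that text may really be the search string.
    query = env.get("QUERY_STRING") or env.get("SELECTOR", "").partition("?")[2]
    return query, search
```

A script would call `gopher_query()` once at startup and
ignore `QUERY_STRING` from then on.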


# Extra path information

CGI specifies that extra path information is the path that
remains after the path to the script. For example, in a
request for `/cgi/script/foo/bar`, where the script is at
`/cgi/script`, the extra path information is `/foo/bar`. The
translated path is the physical location of the extra path
component. This could be at, say, `/srv/foo/bar`, if files
are served from `/srv`. Only geomyidae and port70 set the
`PATH_INFO` and `PATH_TRANSLATED` variables.
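
The example above can be worked through in a few lines of
Python. This is only a sketch of the string arithmetic, not
code from any of the surveyed servers; `split_path_info` and
its parameters are names I made up for illustration.

```python
def split_path_info(request_path, script_path, doc_root):
    """Return (PATH_INFO, PATH_TRANSLATED) for a request."""
    if (request_path != script_path
            and not request_path.startswith(script_path + "/")):
        raise ValueError("request is not under the script")
    # Whatever follows the script's own path is the extra path.
    path_info = request_path[len(script_path):]
    # The translated path maps the extra path onto the
    # document root; with no extra path there is nothing
    # to translate.
    if path_info:
        path_translated = doc_root.rstrip("/") + path_info
    else:
        path_translated = ""
    return path_info, path_translated

print(split_path_info("/cgi/script/foo/bar", "/cgi/script", "/srv"))
# ('/foo/bar', '/srv/foo/bar')
```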

However (and this is a big one), geomyidae sets these
variables _incorrectly_: it sets them to the path (virtual
and physical) to the script itself rather than to the extra
path information. geomyidae also sets the non-standard
`TRAVERSAL` variable to the extra path information, and it
does not set any variable to the translated path (which is
optional per RFC 3875 anyway).

The other servers that I surveyed appear not to handle extra
path information in the first place, but even in that case
RFC 3875 still requires a server to set `PATH_INFO` to NULL
(i.e., an empty string).
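
A script that wants the extra path information regardless of
server could work around this along the following lines.
This is a sketch under one assumption: that a `TRAVERSAL`
variable in the environment means the script is running
under geomyidae. The name `extra_path_info` is hypothetical.

```python
import os

def extra_path_info(environ=os.environ):
    """Return the extra path information, or '' if none."""
    # ASSUMPTION: only geomyidae sets TRAVERSAL, and there
    # PATH_INFO points at the script itself, so prefer
    # TRAVERSAL whenever it is present.
    if "TRAVERSAL" in environ:
        return environ["TRAVERSAL"]
    # Elsewhere, a missing PATH_INFO is treated like an
    # empty one.
    return environ.get("PATH_INFO", "")
```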


# Server protocol

Half of the servers set `SERVER_PROTOCOL` to something, but
there is no agreement on the value:

* Gophernicus: `RFC1436`

* port70: `GOPHER`

* geomyidae: `gopher/1.0`

Arguably, geomyidae’s is not correct since the Gopher
protocol does not have a defined version number. And
"RFC1436" is not the name of a protocol, which rules out
Gophernicus.

That leaves port70 as the only possibly "correct" one of
the bunch. RFC 3875 says "It is not case sensitive and is
usually presented in upper case", so port70 even follows the
spirit of the law here by setting it to "GOPHER" rather than
to "Gopher" or "gopher".
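
Until servers agree, a script that needs to know whether it
is being served over Gopher has to accept all three values.
A minimal sketch, with `is_gopher` and `GOPHER_PROTOCOLS`
being names of my own invention:

```python
import os

# The three values listed above, lowercased. RFC 3875 says
# the protocol name is not case sensitive.
GOPHER_PROTOCOLS = {"rfc1436", "gopher", "gopher/1.0"}

def is_gopher(environ=os.environ):
    """True if SERVER_PROTOCOL looks like Gopher."""
    protocol = environ.get("SERVER_PROTOCOL", "")
    return protocol.strip().lower() in GOPHER_PROTOCOLS
```

Note that this returns false under the servers that do not
set `SERVER_PROTOCOL` at all, so it cannot be the only check.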

------------------------------------------------------------
                  Miscellaneous gotchas
------------------------------------------------------------

I won’t dive too much into this, but there are some
"gotchas" that I came across:

* Bucktooth silently converts any `+` character to space
  and decodes percent-encoded values in a search string
  (but only if a search string contains at least one
  percent-encoded byte). This is not correct behavior for
  Gopher (it is for HTTP, but Gopher isn’t HTTP last I
  checked). A server should leave the search string intact,
  and if a Gopher client is converting a user’s spaces to
  `+` characters or percent-encoding some bytes, shame on
  it!

* Bucktooth also decodes anything that looks like a
  percent-encoded byte in the selector before a `?`. That’s
  bad news for anyone who has a file on their server
  named something like `12%34` (unless directory listings
  and gophermaps always percent-encode file names, which
  doesn’t seem to be the case).
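
To see why both of these are a problem, here is what
HTTP-style decoding does to such strings. This uses Python’s
standard `urllib.parse` to illustrate the effect; it is not
Bucktooth’s code, and Bucktooth’s exact rules may differ.

```python
from urllib.parse import unquote, unquote_plus

# A file literally named "12%34": "%34" decodes to "4".
decoded_selector = unquote("12%34")
print(decoded_selector)  # 124

# A search string: "%2B" becomes "+" and "+" becomes space.
decoded_search = unquote_plus("C%2B%2B+tips")
print(decoded_search)  # C++ tips
```

In the first case the server ends up looking for a file
named `124`, which is not the file that was requested.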

------------------------------------------------------------
                        References
------------------------------------------------------------

[1] gopher://asciz.com/0/rfc/rfc3875.txt
[2] gopher://gopher.floodgap.com:70/1/buck
[3] http://github.gophernicus.org
[4] gopher://gopher.conman.org:70/1Gopher%3ASrc%3A
[5] gopher://gopher.r-36.net:70/1/scm/geomyidae/log.gph
[6] https://motsognir.sourceforge.net/
[7] gopher://gopher.quux.org:70/1/devel/gopher/pygopherd
[8] gopher://gopher.conman.org/0Phlog:2020/01/06.1