The technical differences between HTTP and gopher

* * * * *

> … The point is to attempt as full a sketch as possible of the actual
> differences and similarities between the HTTP (HyperText Transfer Protocol)
> and GOPHER protocols.
>
> …
>
> From what I gather, these are the similaries:
>
> 1. Both gopher and http start with a TCP (Transmission Control Protocol)
> connection on an IANA (Internet Assigned Numbers Authority) registerd
> port number.
> 2. Both servers wait for text (the request) terminating in a CRLF
> 3. Both servers expect the request (if there is one) to be formatted in a
> particular way.
> 4. Both servers return plain text in response, and close the TCP
> connection.
>
> And these are the differences that I understand:
>
> 1. Gopher will accept and respond to a blank request, with a default set
> of information, http will not.
> 2. Gophper [sic] sends a single "." on a line by itself to tell the
> client it is done, http does nothing similar prior to closing the
> connection.
> 3. Http has things like frames, multiplexing, compression, and security;
> gopher does not.
> 4. Http has rich, well-developed semantics, gopher has basic, minimalist
> semantics
> 5. Http requests are more resource intensive than gopher requests.
> 6. Http is highly commercialized, gopher is barely commercialized.
> 7. Http is heavily used and highly targeted by malicious users, gopher is
> neither.
> 8. Http is largely public, gopher is largely private (de facto privacy
> through obscurity.)
> 9. Http is used by everyone, their children, their pets, their
> appliances, their phones, and their wristwatches; gopher is used
> primarily by technical folk and other patient people.
> 10. Http all but guarantees a loss of privacy; gopher doesn't
>
> Yeah, I know, it's not much, but that's all that is coming to mind
> presently. What are your thoughts?
>

“Technology/Gopher [1]” (I'm quoting for the benefit of those who cannot
view gopher-based sites).

I don't want to say that tfurrows is wrong, but there is quite a bit that
needs some clarification, and as someone who has worked with HTTP for over
twenty years, and has recently dived back into gopher (I used it for several
years in the early 90s—in fact, I recall Time Magazine [2] having a gopher
server back then) I think I can answer this.

First, the protocol. The gopher protocol is simple: you make a TCP connection
to the given port (defaults to 70). Upon connection, the client then sends
the request, which can take one of three forms:

-----[ data ]-----
CRLF
-----[ END OF LINE ]-----

The simplest request—just a carriage return and line feed character. This
will return the main page for the gopher server.

-----[ data ]-----
selector-to-viewCRLF
-----[ END OF LINE ]-----

This will return the requested data from the gopher server. The specification
[3] calls this a “selector.” And yes, it can contain any non-control
character, including space. It's terminated by carriage return and line
feed characters.

-----[ data ]-----
selector-for-searchHTsearch terms to useCRLF
-----[ END OF LINE ]-----

The last one—this sends a search query to a gopher server. It's the
“selector” that initiates a search, followed by a horizontal tab character,
then the text making up the query, followed by a carriage return and line
feed.

In all three cases, the gopher server will immediately start serving up the
data. Text files and gopher indexes will usually end with a period on its own
line; other file transfers will end with the server closing the connection.

That's pretty much the gopher protocol.
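
The three request forms above are small enough to sketch in a few lines of
Python. This is only an illustration, not a reference client: the function
names are made up for this example, the UTF-8 encoding is an assumption (the
specification predates it), and the network function simply reads until the
server closes the connection.

```python
import socket


def build_gopher_request(selector="", query=None):
    """Build one of the three request forms: empty, selector, or search."""
    line = selector
    if query is not None:
        # A search is the selector, a horizontal tab, then the search terms.
        line = selector + "\t" + query
    # Assumption: encode as UTF-8; every request ends with CRLF.
    return line.encode("utf-8") + b"\r\n"


def strip_terminator(payload):
    """Split a text or index response into lines, dropping the final '.' line."""
    lines = payload.split(b"\r\n")
    if lines and lines[-1] == b"":
        lines.pop()          # empty piece after the last CRLF
    if lines and lines[-1] == b".":
        lines.pop()          # the lone period that marks the end
    return lines


def gopher_fetch(host, selector="", query=None, port=70, timeout=10):
    """Send one request and read everything until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_gopher_request(selector, query))
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

Calling build_gopher_request() with no arguments gives just the CRLF, which
is the request for the main page of the server.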

The HTTP version that works the closest to gopher is the so-called HTTP/0.9,
and it was pretty much the same. Here are the same three requests above as
HTTP requests.

-----[ data ]-----
GET /CRLF
-----[ END OF LINE ]-----

The minimum request for HTTP. As you can see, it's only an extra four
characters, but the initial text, GET in this case, was useful later when the
types of requests increased (but I'm getting ahead of myself here). This will
return the main page for the HTTP server.

-----[ data ]-----
GET /resource_to_viewCRLF
-----[ END OF LINE ]-----

The usual request, but instead of a “selector” you request a “resource”
(different name, same concept) but it cannot contain bare spaces—they have to
be encoded as %20 (and a bare “%” sign is encoded as %25). Like gopher, the
contents are immediately sent, but there is no special “end-of-file” marker—
the server will just close the connection.

-----[ data ]-----
GET /resource_for_search?search%20terms%20to%20useCRLF
-----[ END OF LINE ]-----

And a search query, where you can see the spaces being replaced with %20.
Also note that the search query is separated from the “resource” by a “?”.
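
The HTTP/0.9 request forms, including the percent-encoding, can be sketched
the same way. The function name here is hypothetical; the encoding itself is
done by quote() from Python's standard urllib.parse module, which leaves "/"
alone by default and turns a space into %20 and a bare "%" into %25.

```python
from urllib.parse import quote


def build_http09_request(path="/", query=None):
    """Build an HTTP/0.9 style request line: GET, a space, the resource, CRLF."""
    # quote() keeps "/" as-is and percent-encodes spaces and "%" signs.
    target = quote(path)
    if query is not None:
        # The search terms follow a "?"; safe="" encodes every reserved
        # character in the query, "/" included.
        target += "?" + quote(query, safe="")
    return ("GET " + target).encode("ascii") + b"\r\n"
```

So build_http09_request("/resource_for_search", "search terms to use")
produces the search request shown above, spaces replaced and all.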

So not much difference between gopher and HTTP/0.9. In fact, during the early
to mid-90s, you could get gopher servers that also responded to HTTP/0.9
style requests, as the two kinds of request were easy to tell apart.
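
To give an idea of how a server might tell the two apart, here is a rough
sketch. It is a heuristic I made up for illustration, not how any particular
server did it: a gopher selector could, in principle, also start with
"GET /", and this check would misfile it.

```python
def classify_request(line):
    """Guess the protocol of a request line (heuristic, for illustration)."""
    text = line.rstrip(b"\r\n")
    parts = text.split(b" ")
    if parts[0] == b"GET" and len(parts) >= 2 and parts[1].startswith(b"/"):
        # "GET /resource HTTP/1.0": a version follows the resource.
        if len(parts) == 3 and parts[2].startswith(b"HTTP/"):
            return "http/1.x"
        # "GET /resource": nothing after the resource, so HTTP/0.9.
        if len(parts) == 2:
            return "http/0.9"
    # Anything else, including the empty request, is treated as gopher.
    return "gopher"
```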

The next version of HTTP, HTTP/1.0, expanded the protocol. Now, the client
was expected to send a bit more information in the form of headers after the
request line. And in order to help distinguish between HTTP/0.9 and HTTP/1.0,
the request line was slightly expanded. So now the request would look like:

-----[ data ]-----
GET /resource_to_view HTTP/1.0CRLF
User-Agent: Foobar/1.0 (could be a web browser, could be a web crawler)CRLF
Accept: text/*, image/*CRLF
Accept-Language: en-US;q=1.0, en;q=0.7, de;q=0.2, se;q=0.1CRLF
Referer: http://www.example.net/search?for%20blahCRLF
CRLF
-----[ END OF LINE ]-----

(Yes, “Referer” is the proper name of that header, and yes, it's misspelled.)

I won't go too much into the protocol here, but note that the client can now
send a bunch more information about the request. The Accept header now allows
for so-called “content negotiation” where the client informs the server about
what type of data it can deal with; the Accept-Language header tells the
server the preferred languages (the example above says I can deal with
German, but only if English isn't available, but if English is available,
American is preferred). There are other headers; check the specification [4]
for details.
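
As a rough illustration of what a server does with those preferences, here is
a small sketch that orders the language tags by their q values. The function
name is hypothetical, and it skips real-world details such as wildcards and
matching "en" against "en-US"; a missing q is treated as 1.0.

```python
def parse_accept_language(value):
    """Return language tags from an Accept-Language value, most preferred first."""
    prefs = []
    for item in value.split(","):
        fields = [field.strip() for field in item.split(";")]
        tag = fields[0]
        if not tag:
            continue
        q = 1.0  # no q parameter means full preference
        for param in fields[1:]:
            name, _, val = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(val)
                except ValueError:
                    pass  # assumption: ignore a malformed q and keep 1.0
        prefs.append((tag, q))
    # sorted() is stable, so equal q values keep their original order.
    ranked = sorted(prefs, key=lambda pref: pref[1], reverse=True)
    return [tag for tag, _ in ranked]
```

Given "en-US;q=1.0, en;q=0.7, de;q=0.2, se;q=0.1", this returns the tags in
the order en-US, en, de, se.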

The server now returns more information as well:

-----[ data ]-----
HTTP/1.0 200 OkayCRLF
Date: Sat, 12 Jan 2019 13:39:07 GMTCRLF
Server: Barfoo/1.0 (on some operating system, on some computer, somewhere)CRLF
Last-Modified: Tue, 05 Sep 2017 02:59:41 GMTCRLF
Content-Type: text/html; charset=UTF-8CRLF
Content-Length: 3351CRLF
CRLF
content for another 3,351 bytes
-----[ END OF LINE ]-----

The first line is the status, and it informs the client if the “resource”
exists (in this case, a 200 indicates that it does), or if it can't be found
(the dreaded 404) or if it has explicitly been removed (410) or it's been
censored due to laws (451), or even moved elsewhere.
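
Pulling such a response apart is not hard either. The following is a minimal
sketch under a few assumptions: the whole response is already in memory, the
headers are decoded as ISO-8859-1, and the function name is my own invention.
A real client has to cope with a lot more than this.

```python
def parse_http_response(raw):
    """Split a complete HTTP/1.0 response into status, reason, headers, body."""
    # The blank line (CRLF CRLF) separates the headers from the content.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: version, three digit code, free-form reason text.
    version, code, reason = (lines[0].split(" ", 2) + ["", ""])[:3]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    length = headers.get("content-length")
    if length is not None:
        body = body[:int(length)]
    return int(code), reason, headers, body
```

The status code comes back as a number, so a client can check for 200 or 404
directly.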

Also added were a few more commands in addition to GET, like POST (which is
used to send data from the client to the server) and HEAD (which is like GET
but doesn't return any content—this can be used to see if a resource has
changed).

HTTP/1.1 [5] is just more of the same, only now you can make multiple
requests per connection, use a few more commands, and request portions of a
file (say, to resume a download that was cut off for some reason).
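
As an example of requesting a portion of a file, here is a sketch of an
HTTP/1.1 request that asks for everything after the bytes already received.
The function name and arguments are hypothetical; the Host header is
mandatory in HTTP/1.1, and "Connection: close" just asks the server to close
the connection when it's done.

```python
def build_resume_request(host, path, already_have):
    """Build an HTTP/1.1 request for the rest of a partly downloaded file."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",                   # required in HTTP/1.1
        f"Range: bytes={already_have}-",   # from this offset to the end
        "Connection: close",
        "",                                # blank line ends the headers
    ]
    return ("\r\n".join(lines) + "\r\n").encode("ascii")
```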

HTTP/2.0 [6] changes the protocol from text-based to binary (and attempts to
do TCP-over-TCP but that's a rant for another time) but again, it's not much
different, conceptually, than HTTP/1.1.

Security, as in https: type of security, isn't inherently part of HTTP. TLS
(Transport Layer Security) is basically inserted between the TCP and HTTP
layers. So the same could be done for gopher: just insert TLS between TCP and
gopher and there you go, gophers:. Of course, that now means dealing with CAs
(Certificate Authorities) and certificates and revocation lists and all that
crap, but it's largely orthogonal to the protocols themselves.
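
The layering can be sketched with Python's standard socket and ssl modules.
The function name is made up, and a secure gopher is not an established
standard, so treat the TLS branch as a thought experiment. The point is only
that the code speaking the protocol on top gets back something it can send
and receive on, with or without TLS underneath.

```python
import socket
import ssl


def open_stream(host, port, use_tls=False, timeout=10):
    """Open a TCP connection, optionally with TLS inserted on top of it."""
    raw = socket.create_connection((host, port), timeout=timeout)
    if not use_tls:
        return raw
    # The default context verifies the server certificate against the
    # system's trusted CAs, which is where all the certificate hassle begins.
    context = ssl.create_default_context()
    return context.wrap_socket(raw, server_hostname=host)
```

Either way, the caller then sends the same request bytes it would have sent
over plain TCP.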

HTTP/1.0 allows compression but that falls out of the content negotiation.
The bit about frames and multiplexing is more an HTTP/2.0 issue which is a
lot of crap that the server has to handle instead of the operating system
(must not rant …).

Are HTTP requests more resource intensive? They can be, but they don't have
to be. But that leads right into the commercialization of HTTP. Or rather,
the web. HTTP is the conduit. And conduits can carry both water and waste.
HTTP became commercialized because it became popular. Why did HTTP become
popular and gopher withered? Personally, I think it has to do with HTML
(HyperText Markup Language). Once you could inline images inside an HTML
document, it was all over for gopher. The ability to include cat pictures
killed gopher.

But in an alternative universe, where HTML had no image support, I think you
would have seen gopher expand much like HTTP has. Work was started in 1993 to
expand the gopher protocol [7] (alternative link [8]) where the protocol
gets a bit more complex and HTTP-like. As mentioned, a secure gophers: is
“easy” to add [DELETED-in that it doesn't change the core protocol-DELETED]
(update—it's not as easy as I thought [9]). And as such, I could see it
getting more commercialized. Advertising can be inserted

TYPEWRITERS

For SALE, HIRE, or EXCHANGE,

at HALF the USUAL PRICES.

* * * * *

MS. (Manuscripts) Typewritten from
10d. per 1,000 words. 100 Circulars for 4s

* * * * *

TAYLOR'S,
74, Chancery Lane, London.
(Est. 1884.)
Telegrams: "Glossator," London.
Telephone No. 690, Holborn.

even in a text file. Yes, it might look a bit strange, but it can be done.
The only reason it hasn't is that gopher lost out to HTTP.

So those are the differences between HTTP and gopher. HTTP is more flexible
but more complex to implement. Had history played out differently, perhaps
gopher would have become more flexible and complex.

Who knows?

[1] gopher://sdf.org:70/0/users/tfurrows/phlog/2018/aco_gopherVsHttp.txt
[2] http://time.com/
[3] https://www.ietf.org/rfc/rfc1436.txt
[4] https://www.ietf.org/rfc/rfc1945.txt
[5] https://www.ietf.org/rfc/rfc2616.txt
[6] https://www.ietf.org/rfc/rfc7540.txt
[7] gopher://gopher.floodgap.com:70/0/gopher/tech/gopherplus.txt
[8] https://gopher.floodgap.com/gopher/gw?gopher://gopher.floodgap.com:70/0
[9] gopher://gopher.conman.org/0Phlog:2019/03/31.1
