Two proposed status schemes
---------------------------

This post is not up to my usual standards due to time pressure and is
more rambling/thinking out loud than concrete, but here it is.

Here's a proposal for a two-digit status code scheme for Gemini,
inspired by the idea I had at the end of a previous post[1].  Two
digit schemes are necessarily more complicated and scary looking than
single digit schemes.  This one is *very* carefully designed so that
it is possible for either client or server authors to get away with
ignoring the second digit:

----------
Gemini uses two-digit numeric status codes.  Related status codes have
the same first digit.  Importantly, the first digit of Gemini status
codes does not group codes into vague categories like "client error" and
"server error" as per HTTP.  Instead, the first digit alone provides
enough information for a client to determine how to handle the
response.  By design, it is possible to write a simple but feature
complete client which only looks at the first digit.  The second digit
provides more fine-grained information, for unambiguous server logging
and to enable writing smarter bots, or comfier interactive clients
which provide a slightly more streamlined user interface.

From the perspective of a simple client looking only at the first
digit, there are 6 status codes in Gemini.  They are:

1       The requested resource accepts user input.  The header text is
       a prompt which should be displayed to the user.  The same
       resource should then be requested again with the user's input
       included as a query.

2       The request was handled successfully and a response body will
       follow the response header.  The header text is a MIME type
       which applies to the response body.  cf HTTP status 200.

3       The server is redirecting the client to a new location for the
       requested resource.  There is no response body.  The header
       text is a URL for the requested resource.  The URL may be
       absolute or relative.  The redirect should be considered
       temporary, i.e. clients should continue to request the
       resource at the original address and should not perform
       convenience actions like automatically updating bookmarks.
       cf HTTP status 307.

4       The request has failed.  There is no response body.  The
       nature of the failure is temporary, i.e. an identical request
       MAY succeed in the future.  The header text may provide
       additional information on the failure, and should be displayed
       to human users.

5       The request has failed.  There is no response body.  The
       nature of the failure is permanent, i.e. identical future
       requests will also fail and should not be attempted.
       The header text may provide additional information on the
       failure, and should be displayed to human users.

6       The requested resource requires client-certificate
       authentication to access.  If the request was made without a
       certificate, it should be repeated with one.  If the request
       was made with a certificate, the server did not accept it and
       the request should be repeated with a different certificate.
       The header text may provide additional information on
       certificate requirements or the reason a certificate was
       rejected.

Note that for basic interactive clients for human use, errors 4 and 5
may be handled identically.  Basic clients may also choose not to
support client-certificate authentication, in which case only four
distinct status handlers are required (for 1, 2, 3 and a combined 4-5).

The full two-digit system is:

10      Equivalent to the single digit status 1.

20      Equivalent to the single digit status 2.

30      Temporary redirect, i.e. equivalent to the single digit
       status 3.  Could be used for things like load balancing, or
       redirecting to a region-specific page based on IP geolocation.

31      Permanent redirect.  The requested resource should be
       consistently requested from the new URL provided in future.
       Tools like search engine indexers or content aggregators
       should update their configurations, and end-user clients may
       update bookmarks etc.  Note that single digit clients will
       still end up at the right place if they read this as "3", they
       just won't be able to make use of the knowledge that this
       redirect is permanent, so they'll pay a very small performance
       penalty by having to follow the redirect each time.

40      A temporary error has occurred and no more specific
       information is available.

41      Server is overloaded

42      CGI process died or timed out.

43      Rate limiting is in effect, status message indicates number of
       seconds to wait before another request.

50      A permanent error has occurred and no more specific
       information is available.

51      Not found, cf HTTP 404

53      Gone, cf HTTP 410.  This resource isn't coming back at this
       address and it should be removed from indexes.

59      Bad request, cf HTTP 400

60      A client certificate is required to proceed

61      The server is requesting the initiation of a transient client
       certificate session.  The client should ask the user if they
       want to accept this and, if so, generate a disposable key/cert
       pair and re-request the resource using it.

62      This resource is protected and a client certificate which the
       server accepts as valid must be used - a disposable key/cert
       is not appropriate here.

63      The supplied client certificate is not valid for the requested
       resource.

64      The supplied client certificate was not accepted because its
       validity start date is in the future.

65      The supplied client certificate was not accepted because its
       expiry date has passed.

Note that these codes have been constructed so that simple servers can
just send 40, 50 or 60 when a more carefully written server might send
a more specific code.  In short, all of the detail and power of the
full two-digit system is built into the protocol, but both client
authors and server authors need to opt in to that more complex
system.  It is possible for client authors to opt out by only looking
at the first digit and for server authors to opt out by just putting a
0 on the end of the first digit and putting any other information into
the header message.
----------
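
To make the "degrades to one digit" property concrete, here is a
minimal Python sketch of how a client might dispatch on these codes.
It assumes the status has already been parsed out of the response
header as a two-character string.  The function names and the action
labels are invented for illustration and are not part of the proposal.

```python
# Hypothetical sketch: names and action labels are illustrative only.

# Everything a first-digit-only client needs, keyed by the first digit.
BASIC_ACTIONS = {
    "1": "prompt",       # header text is a prompt, re-request with query
    "2": "render",       # header text is a MIME type, body follows
    "3": "redirect",     # header text is a URL, no body
    "4": "show_error",   # temporary failure
    "5": "show_error",   # permanent failure, same handler in a basic client
    "6": "client_cert",  # client certificate needed or rejected
}

# Extra behaviour a smarter client may opt in to, keyed by the full code.
REFINED_ACTIONS = {
    "31": "redirect_and_update_bookmarks",
    "43": "wait_then_retry",
    "53": "show_error_and_drop_from_index",
    "61": "offer_transient_cert",
}


def basic_action(status: str) -> str:
    """Pick an action from the first digit alone (KeyError if unknown)."""
    return BASIC_ACTIONS[status[0]]


def smart_action(status: str) -> str:
    """Use the full code if it is known, else degrade to the first digit."""
    return REFINED_ACTIONS.get(status, basic_action(status))


def coarsen(status: str) -> str:
    """The code a simple server would send in place of a specific one."""
    return status[0] + "0"
```

A basic client calling basic_action("31") still follows the redirect,
and a server that sends coarsen("53"), i.e. "50", triggers the same
basic handling as "53" would.  Only the opt-in refinements are lost.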

I like this proposed system, and the one other person who has seen it
so far (Conman Sean) likes it too.  But at the same time, I can
definitely hear a voice in the back of my head screaming "this is
hugely over-engineered, we don't need it and you only like it because
you're pleased with yourself about how nicely the two-digit codes
degrade into one-digit codes".

The voice is right that I'm pleased as punch with the whole idea of
having a functioning one-digit status code system embedded inside a
two-digit status code system.  I think this is a very cool idea and
I'd like to see it used more widely, in non-Gemini contexts.  But just
because it is a very cool idea doesn't necessarily mean it's the right
idea for Gemini.  I worry that, at least in this case, the extra power
of the two-digit system is not enough to justify its weight.  Consider:

The single digit codes 1 and 2 are not expanded upon at all in the
two-digit system.  The single digit code 3 is expanded into only two
two-digit codes, 30 and 31.  The arguments for including a temporary
redirect are pretty flimsy - in fact, the only reason explicit
temporary and permanent redirects are in there is because it was the
first example I thought of where a two-digit scheme could degrade to
a one-digit scheme in a totally compatible way.  We could probably do
without this and then fully half of the one-digit codes are not
expanded upon at all, leading us seriously into "why bother?"
territory.

The one really compelling reason I can come up with for all the 4x
and 5x two-digit codes is that if we tried to go without them and just
served up 4 and 5 with the particular error explained in the header,
then clients would only receive an explanation of what actually
happened in whatever human language was spoken by the person who wrote
the server.  Numeric codes, in contrast, allow clients to present
translations of the particular error into whatever language the user
would prefer.  That's not at all an inconsequential thing for a system
that one wants to see widely used, and argues strongly for having
distinct status codes for at least the most meaningfully distinct
conditions.
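
As a rough illustration of the translation argument, here is a small
Python sketch of a client-side message catalogue keyed by status code.
The catalogue contents, the function name and the fallback order are
all my own assumptions, not anything specified for Gemini.

```python
# Hypothetical message catalogue: entries and names are illustrative.
MESSAGES = {
    "en": {"51": "Not found", "53": "Gone", "5": "Permanent error"},
    "de": {"51": "Nicht gefunden", "5": "Dauerhafter Fehler"},
}


def describe(status: str, lang: str, header_text: str = "") -> str:
    """Return a message for `status` in the user's preferred language."""
    catalogue = MESSAGES.get(lang, {})
    # Try the full code first, then the first digit alone.
    for key in (status, status[0]):
        if key in catalogue:
            return catalogue[key]
    # Nothing translated: fall back to whatever the server author wrote.
    return header_text or status
```

With only a bare 5 and free-form header text, the last line of that
function is all a client could ever do.  Distinct codes are what make
the first two lookups possible.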

Sloum is a fan of the single character status code idea, and raised to
me the interesting possibility of using a single hexadecimal digit
(i.e. 0-F) as the entire space of status characters.  That's a cute
idea which gives us 16 codes.  If we trim a little bit of the fat from
the two-digit system above, can we fit everything into 16 codes,
allowing translation of client interfaces?

0       Bad request
1       Input prompt (see last post[2])
2       Success
3       Redirect
4       Not found
5       Temporary server error (overload, CGI failure)
6       Gone
7       Rate limiting in effect
8       Unused
9       Unused
A       Transient client cert session requested (see [2])
B       Client cert required for protected resource
C       Client cert invalid for this resource
D       Client cert outside of validity window
E       Unused
F       Unused

Yeah, seems like we can do it with room to spare.  I've structured the
above so that everything to do with client certs gets an alphabetical
code (the two unused codes E and F don't trouble me too much because
the client certificate stuff is the most complicated part of Gemini
and it's conceivable that the need for extra codes will arise), which
makes it easy for simple clients which don't support client certs to
detect any code related to that (e.g. in Python a simple
code.isalpha() will return True for any cert-related code and False
for anything else).  The numeric codes are structured for maximum
similarity with HTTP equivalents as a memory aid.  The unused 8 and 9
*do* make me nervous...
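
Here is a short Python sketch of that hexadecimal table and the
isalpha() trick.  The dictionary and function names are made up for
the example.  Note that isalpha() is True for any letter, so the
sketch treats the currently unused E and F as cert-related too, which
matches the intent above.

```python
# Hypothetical sketch of the single-hex-character scheme listed above.
HEX_CODES = {
    "0": "Bad request",
    "1": "Input prompt",
    "2": "Success",
    "3": "Redirect",
    "4": "Not found",
    "5": "Temporary server error",
    "6": "Gone",
    "7": "Rate limiting in effect",
    "A": "Transient client cert session requested",
    "B": "Client cert required for protected resource",
    "C": "Client cert invalid for this resource",
    "D": "Client cert outside of validity window",
}


def is_cert_related(code: str) -> bool:
    """True for the alphabetical codes, which are reserved for certs."""
    return code.isalpha()


def describe(code: str) -> str:
    """Look up a code, accepting either case for the letters."""
    return HEX_CODES.get(code.upper(), "Unused")
```

A client without client cert support can check is_cert_related(code)
once and show a single "this needs a client certificate" message for
every code from A to F.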

Despite all the effort I put into coming up with the two-digit
scheme, I have to admit that the above just feels, intuitively,
much more "right" for Gemini.  It's small and friendly and
approachable.  It's specific enough that logging a status code alone
is adequately informative.

I'm tempted to just say "screw it, we're using this!" (meaning the
hexadecimal scheme), but it's late and I'm sleepy and I know that kind
of snap decision is a bad idea...

[1] gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/gemini/status-codes.txt
[2] gopher://zaibatsu.circumlunar.space:70/0/~solderpunk/gemini/inputs-and-client-certs.txt