Gemini maps
-----------

I have actually mostly been thinking about details *other* than what
I'll call "Gemini maps" until a better name comes along.  But now that
multiple third parties are starting to implement "proto-Gemini", and
given that I'll be travelling for a bit in 4 days' time and will have
less time to dedicate to this, I feel a kind of time pressure to come
up with a very, very preliminary complete spec which gives enough
detail for the community to experiment with at least the core
functionality in my absence.

At this point, the request and response formats seem pretty well
settled.  There's still a question mark floating over how to include
search queries in requests (though I have pretty well convinced myself
that this isn't necessary, more on that one day), but that's, IMHO,
not "core functionality".  As long as people can put up simple static
"Geminiholes" (stupid name, we need something better) and link 'em
together, that will do for now.  Currently the most under-specced part
of this whole thing is the text/gemini default item type.

I previously proposed something kind of like gophermaps, and on the
whole I still think there is an awful lot to recommend this idea.  The
server at gemini.conman.org used a form of that proposal, so I wrote
AV-98 to parse that, so now it's *vaguely* settled upon, but there's
some wiggle room, and anyway there's overdue discussion to be had
about this, so here we go.

First of all - someone phlogged somewhere (I'm truly sorry, I honestly
tried to find and cite this, but this conversation has become very
widely spread and I couldn't find it.  As always, email me if it
was you, or you remember who it was) that it could be problematic to
erase gopher's text/menu distinction in favour of something like a
gophermap because very long text-only files (like the text of Lord of
the Rings) would need to be parsed line-by-line looking for links, for
no gain.  The solution to this is simple!  That sort of content should
be served as text/plain, not text/gemini.  That way the client knows
it has no links and can just be displayed.  The power of content type
declaration!  To make this simple, we could come up with a file
extension for text/gemini content (.gem?  I know Ruby Gems exist but
I've never used Ruby so I don't know if they are literally .gem files),
and servers could dish .txt files up as text/plain and .gem files up as
text/gemini and that'd be that.
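
As a rough sketch of what that might look like on the server side
(the function name, the table and the fallback type are all made up
for illustration, and ".gem" is only the extension floated above,
nothing is decided):

```python
import os.path

# Hypothetical extension table; ".gem" is just the suggestion above.
CONTENT_TYPES = {
    ".gem": "text/gemini",
    ".txt": "text/plain",
}

def content_type_for(path, default="application/octet-stream"):
    """Pick a content type from the file extension alone.

    The fallback type is an assumption, not something proposed here.
    """
    _, ext = os.path.splitext(path)
    return CONTENT_TYPES.get(ext.lower(), default)
```

With this, a client receiving lotr.txt is told text/plain and never
has to go looking for links in it.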

Right, the actual format!  What is being used in the existing code
is as follows.  Any line that looks like this is a link:

<TAB><USER FRIENDLY NAME><TAB><LINK><CR><LF>

and anything else is plain text (like a gopher menu 'i' line).

That's it!
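
A minimal sketch of a parser for one line of this format, in Python.
This is not the AV-98 code, just an illustration; the function name
and return shape are invented, and treating an empty name or empty
link as plain text is my own assumption:

```python
def parse_map_line(line):
    """Return ("link", name, target) or ("text", line)."""
    line = line.rstrip("\r\n")
    parts = line.split("\t")
    # A link line is <TAB>name<TAB>link, so splitting on tabs gives an
    # empty first field followed by exactly two more fields.
    # Assumption: empty names or links are treated as plain text.
    if len(parts) == 3 and parts[0] == "" and parts[1] and parts[2]:
        return ("link", parts[1], parts[2])
    return ("text", line)
```

Anything that does not match exactly, including a line that merely
starts with a tab, falls through as text.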

What can <LINK> be?  Definitely, URLs are absolutely allowed.  This is
a deliberate improvement upon gopher, which was designed only to link
to other gopher stuff, and can only link to non-gopher content via an
ugly hack which I'm very happy that Gemini is free of.  Anything other
than URLs?  Well, the gemini.conman.org server also serves relative
links, HTML style, and AV-98 handles them.  This wasn't something that
was ever discussed or decided upon, it just kind of happened.  I don't
*think* I have a problem with it.  Commentary is very welcome!  This
*does* complicate client design slightly, in that clients need to
remember the URL where they got the map from in order to translate the
relative links to absolute ones.  Gopher clients don't need to do this,
because each item in a gopher menu specifies a host and a port.  This
is basically a trade-off between network traffic (relative URLs are
shorter than absolute ones) and client complexity.  Since we have TLS
overhead, small network efficiencies are not necessarily worth chasing
(although I do hope that TLS session resumption will become more
widely supported in future, so we can really cut that overhead down).
This *probably* argues for absolute URLs only.  Relative links are
more user friendly for authors, but of course Gemini servers could
convert them for you, which is how most gopher servers work anyway.
This pushes work out of the client and into the server, which I think
is how things should be (and is an explicit part of the philosophy of
gopher in RFC1436).  For now, let's maybe follow Postel's law: if
anybody wants to write a new server, please just send absolute URLs.
If anybody wants to write a client, please be prepared to accept
relative URLs if they appear.  We can defer the final decision on
this, based on what we learn in early testing.
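
For client authors, here is a hedged sketch of the "remember where
the map came from" step.  Python's urllib.parse.urljoin only resolves
relative references for schemes it already knows about, and gemini is
not one of them, so this borrows "http" for the join and swaps the
real scheme back in afterwards.  The function name and the example
paths in the note below are invented:

```python
from urllib.parse import urljoin, urlsplit

def absolutise(base_url, link):
    """Resolve a link from a map against the URL the map came from.

    Assumes base_url is absolute, e.g. "gemini://host/path".
    """
    if urlsplit(link).scheme:
        return link  # already absolute, leave it alone
    scheme, _, rest = base_url.partition("://")
    # urljoin would hand back the bare relative link for an unknown
    # scheme, so do the join as if the base were an http URL.
    joined = urljoin("http://" + rest, link)
    return scheme + "://" + joined[len("http://"):]
```

Given a map fetched from gemini://gemini.conman.org/docs/index.gem, a
link of "other.gem" comes out as
gemini://gemini.conman.org/docs/other.gem, while absolute links of
any scheme pass through untouched.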

What can <USER FRIENDLY NAME> be?  Anything that doesn't have a <TAB>
in it, because that would confuse the parsing.  Anything else goes, I
guess.

So, the link indicator and delimiter is fixed as <TAB>?  Well, that is
what's being used in practice for now.  Sloum pointed out[1] that tabs
are perhaps problematic, because some people have configured their
editors to produce a sequence of spaces instead of a tab, because, for
reasons I never quite understood, most programming language
communities seem to have decided that tabs are somehow bad (I love
that Lua does not seem to have this culture!).  This sounds like a
valid argument on the face of it, but then, most gopher servers use
tabs for this and it doesn't seem to stop people.  All that really
matters is that we pick something which is very quick and easy for
somebody to produce in any editor, and which people are unlikely to
reasonably want to use in <USER FRIENDLY NAME>.  This last
consideration makes me not very in favour of sloum's suggestion of "@"
to separate <USER FRIENDLY NAME> from <LINK>, because I can absolutely
see people e.g. linking to their Mastodon profile and using their
username, with an @ at the front, as the <USER FRIENDLY NAME>.  I kind
of like his "~!" link indicator idea, though.  Lines beginning with
tabs could occur easily in non-link contexts (paragraph indents,
snippets of source code), so identifying link lines correctly
requires counting the total number of tabs in the whole line, which
is slightly more computational effort than just checking the first
two characters.  As
ever, feedback welcome, but it's tabs for now because, well, there's
running code using tabs.
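
To make the difference concrete, here are the two detection checks
side by side.  Both function names are made up, and the second one
only tests for the "~!" prefix: how the rest of such a line would be
laid out is not settled anywhere in this post, so I am not guessing
at it.

```python
def is_link_tabs(line):
    # Current format: the whole line has to be scanned to count tabs.
    return line.startswith("\t") and line.count("\t") == 2

def is_link_prefix(line, indicator="~!"):
    # Prefix-style indicator: only the first two characters matter.
    return line.startswith(indicator)
```

Note that a tab-indented line of source code which happens to contain
one more tab, like "\tif x:\treturn y", passes the tab check even
though nobody meant it as a link.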

In earlier writing I originally proposed a more extensive link format:

<TAB><USER FRIENDLY NAME><TAB><LINK><TAB><MIMETYPE><CR><LF>

Mostly this was just blindly copying gopher.  The MIME type is
slightly redundant, in that the client will learn what it is when
actually fetching it (this is how it works on the web).  It doesn't
seem popular to have this here: gemini.conman.org doesn't provide it
and sloum said he didn't think it was necessary.  The only reason I am
still very slightly attracted to the idea is the following thought:
graphical Gemini clients could, as an option the user could turn
on/off as they pleased, when seeing image/* links, fetch and display
them in-line.  I'm imagining this in a very harmless, "images as
figures" way, as espoused by @gcupc in their nice post about the
"Lynx web"[2]: the server has no way of controlling the image's
size, or position.  It just goes exactly where the link is, centred
in the line, at a sensible size that is under the user's direct
control.  Like a figure in a textbook or something.  For certain
kinds of documents this is a totally sensible and reasonable thing to
want to do, and I really like that doing it this way requires *no*
image-specific syntax and it degrades totally gracefully into just a
link in clients which don't want to or can't support images.  That's
nice!  The least-offensive way possible to bring images into this!

But I'm also worried that people would start serving weird
not-really-MIME values in that position, and using it to trigger
weird and wonderful behaviour in experimental non-standard clients.  I
will write an entire post on this some time, but I am terrified of
putting extensibility into Gemini, either designed extensibility or
accidental scope for sneaky extensibility.  Extensibility is not a
fundamentally bad thing; from an engineering perspective, when you
just want to solve problems, it can be very powerful.  But Gemini is an
ideological protocol - simple is best, privacy matters!  If you let
people who don't believe these things add extra features, it presents
a possible slippery slope away from those values.  This is kind of
what happened to the web, with cookies.  I would like Gemini to be
"closed by design", so I'm trying to avoid places where people could
easily slip things in, and that <MIMETYPE> field, that's just a free
place to stick arbitrary text, confident that clients which aren't in
on the extension will just ignore it as "some MIME type I don't have a
special way to handle", and continue without breakage.  Way too
tempting.  Yeah, let's leave that out.  It's a shame that we can't
have nice things!

Regarding the treatment of plain text content, for now let's just say
it should be presented as-is, but there is some discussion to be had
around this.  Many people have pointed out that the convention of not
reflowing text in gopher makes gopher content difficult to consume on
devices like phones, where 70 or 80 char lines are too wide.  Maybe we
should explicitly declare Gemini map text to be reflowable?  I would
also not be opposed to a *very light dash* of *strictly optional*
formatting possibilities.  For example, in Markdown, you can have:

# Sections
## Subsections
### Sub-subsections

That's an extremely easy thing for even a crappy hand-written parser
to recognise.  We could say that Gemini clients *may* render lines
beginning with #s in larger fonts, but that it's 100% okay not to.
This lets very simple clients ignore this issue entirely, and the
plain text version remains totally readable, but graphical clients
*could* give a very nice and clean representation of structured text,
which is not a bad thing at all.  We could just not specify any of
this at all and leave it entirely up to the discretion of individual
clients to recognise and render some things, but with that approach
different clients will recognise different things, and so authors
will just ignore all of them and there will be no point.  Having
*one* standardised way to do this kind of thing lets authors who want
to partake of it do so in a way they know will maximise client
compatibility, but doesn't force anybody who is uninterested to use
it.  So, it makes sense to me to specify *something*.  We should
*only* consider things which degrade cleanly when viewed as raw text.
If text which "should be" bold turns up with *s around it instead,
that's fine, the point is still clear.  But Markdown supports, e.g.
strike-through text with ~~this syntax~~ and that's *no good* because
if you view it unrendered it is not at all obvious that it's supposed
to be crossed out and the meaning is confused.  So, none of that!
Also, absolutely nothing which is remotely difficult to unambiguously
parse.  If even a shred of cleverness is required to not make a mess
of it, I'm not interested.  I suspect something usable meets those
criteria, but if not, oh well.
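
To show just how little cleverness the heading idea needs, a sketch
of the recogniser.  The function name is invented, and both the cap
at three levels and the requirement of a space after the #s are my
own assumptions for the example, not part of any proposal:

```python
def heading_level(line):
    """Return 1-3 for '#', '##', '###' headings, otherwise 0."""
    stripped = line.lstrip("#")
    level = len(line) - len(stripped)
    # Assumptions: at most three levels, and a space after the #s.
    if 1 <= level <= 3 and stripped.startswith(" "):
        return level
    return 0
```

A graphical client could pick a font size from the result; a simple
one can skip the check entirely and print the line as it arrived.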

A final consideration: the reflow thing and the light markup thing
would both make it quite difficult to include ASCII art in Gemini
maps.  It of course would still be possible to serve it in text/plain
documents, but not in maps, which would kill the gopher tradition of
including ASCII art headers in the root menu.  This is kind of a
shame, but then, maybe it's also nice if there is a clear aesthetic
difference between gopher and Gemini.  I dunno.  This would complicate
low-effort bihosting, too.  Hmm...

That's it!  I will condense this into the spec-spec.txt shortly.

If you want to influence my thinking on any of the open questions
raised here, write me something convincing.  Remember, always: simple
is best, privacy matters, beware of sneaky extenders!

[1] gopher://colorfield.space:70/0/~sloum/phlog/190619.txt
[2] https://jfm.carcosa.net/blog/computing/gopher-and-the-lynx-web/