GOPHER 2.0 - MARKUP

This is post 3 of 4 (?) in which I talk about one of my favorite
subjects: linked documents and lightweight markup languages.

Ratfactor's Apologia
=================================================================

I knew that the title "Gopher 2.0" would be a little contentious.

It's certainly more attention-grabbing than "Ideas for a New Con-
tent Delivery Protocol Heavily Inspired by Gopher". (Though now
that I see it, the last "PHIG" part does make me smile.)

Part of me wishes I could have thought of a better name for these
posts... *but* part of me doesn't.

I don't mind stirring the pot to see where everybody stands on
the upgrade vs. clean break issue. I was already leaning hard
towards the "clean break" camp because I love retrocomputing and
I want old machines and old software to keep working as-is for as
long as possible.

Now I'm completely convinced. :-)

Also, this reminds me, somewhat tangentially, of Cunningham's
Law: [0]

"The best way to get the right answer on the Internet
is not to ask a question, it's to post the wrong an-
swer."

Where in this case, Gopher 2.0 is the "wrong" title for the
"right" (for me) content.

Why markup - encoding
=================================================================

Far more than the protocol, this is where I start to really get
excited.

I loooooove "plain text" documents.

But.

There's no such thing.

If you said "plain text" far enough back in time, I wouldn't know
if you were talking about something encoded in EBCDIC or ASCII.

If you say it now, I don't know if you're talking about 7-bit
ASCII or 8-bit ISO 8859-1 (Latin-1) or a multi-byte Unicode en-
coding or something else entirely!

And let's not even speak of line endings ("\r\n" vs "\n").

Please. Let's not speak of it. The wounds are still too fresh.

So do I *even need* to mention that UTF-8 would be required for
any next-gen document format?

Okay: UTF-8 is required. That's a position I'm happy to defend.
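
As a rough illustration of what "UTF-8 is required" could mean for
a client, here is a minimal Python sketch. The function name
load_text is hypothetical, and folding line endings to "\n" is an
assumption added for the example, not something this post
specifies.

```python
def load_text(raw: bytes) -> str:
    """Decode bytes as strict UTF-8 and fold line endings to LF."""
    # errors="strict" raises UnicodeDecodeError on anything that is not
    # valid UTF-8 (for example, a stray Latin-1 byte such as 0xE9).
    text = raw.decode("utf-8", errors="strict")
    # Assumption: CRLF and bare CR endings are normalized to LF.
    return text.replace("\r\n", "\n").replace("\r", "\n")


if __name__ == "__main__":
    print(repr(load_text("caf\u00e9\r\nmenu\n".encode("utf-8"))))
    try:
        load_text(b"caf\xe9\n")  # Latin-1 bytes, not valid UTF-8
    except UnicodeDecodeError:
        print("rejected: not UTF-8")
```

Rejecting bad input outright, rather than guessing at an encoding,
keeps the rule unambiguous for both writers and clients.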

Why markup - hypertext
=================================================================

I am deeply invested in the concept of hypertext. [1]

I've experimented with Wikis and HTML content generators to a de-
gree that may not even be healthy. :-)

One of my favorite tools is the lightweight VimWiki plugin for
Vim, which allows me to quickly create, edit, arrange, and navi-
gate text documents within my editor. (And yes, I'm aware of and
jealous of Emacs and Org Mode.)

For VimWiki (or any hypertext document system) to work, it needs
to have a way to link directly to other documents.

HTML does this with anchors:

<a href="wigglers">Wigglers</a>

VimWiki does this with links:

[[wigglers|Wigglers]]

Gopher does this with "Directory Entities" (but only in directory
listings):

0Wigglers[TAB]/docs/wigglers[TAB]example.com[TAB]70
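
To make the shape of such a line concrete, here is a small Python
sketch of a parser. It assumes the RFC 1436 field order (a one-
character item type, then display string, selector, host, and
port, separated by real tab characters). The names DirEntry and
parse_dir_line are made up for this example.

```python
from typing import NamedTuple


class DirEntry(NamedTuple):
    item_type: str   # one character, e.g. "0" = text file, "1" = directory
    display: str     # human-readable name shown in the menu
    selector: str    # path-like string sent back to the server
    host: str
    port: int


def parse_dir_line(line: str) -> DirEntry:
    """Parse one Gopher menu line (assumed RFC 1436 field order)."""
    line = line.rstrip("\r\n")
    if not line:
        raise ValueError("empty line")
    item_type, rest = line[0], line[1:]
    fields = rest.split("\t")
    if len(fields) < 4:
        raise ValueError(f"expected 4 tab-separated fields, got {len(fields)}")
    display, selector, host, port = fields[:4]
    return DirEntry(item_type, display, selector, host, int(port))


if __name__ == "__main__":
    print(parse_dir_line("0Wigglers\t/docs/wigglers\texample.com\t70\r\n"))
```

Easy for a program to split, but not much fun to type by hand.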

And informally, many folks have adopted this presumably Markdown-
inspired "reference-style" link pattern for Gopher content:

Wigglers [1]
...
[1] gopher://example.com/0/docs/wigglers

What I like about the Gopher directory entity style is that it
enforces (or at least strongly suggests) a one-line-each linear
list of links. What I don't like is typing and reading the
entries themselves.

I like the "reference-style" links for the same reasons. I also
like that the path is completely visible and not replaced with
alternate text. What I don't like about them is that they are
not actually part of Gopher.

Why markup - text wrapping
=================================================================

One of the biggest problems I have with viewing Gopher content is
that it doesn't display well on different sized screens.

Isn't it painfully ironic that something as *simple* as *text* is
so hard to format for a cell phone screen vs an old 80-column
terminal vs a widescreen desktop monitor?

This is one thing that HTML gets 100% correct: by default, all
text reflows to fit the container.

The problem is that we can't just remove all of the line ending
characters from our documents and hope for the best: we'd lose
source code formatting, ASCII art, and all the other little de-
tails that make "plain" text so wonderful to view!

So somehow you have to specify, "here is paragraph text - please
make this look right for my readers," but also, "here is a cool
Figlet logo or a diagram made out of | + - characters, don't
touch this!"
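
The difference between the two kinds of text can be sketched in a
few lines of Python using the standard textwrap module. The
function name reflow and the sample strings are assumptions made
up for illustration.

```python
import textwrap


def reflow(paragraph: str, width: int) -> str:
    """Join a paragraph's source lines and re-wrap them to `width` columns."""
    # split() with no arguments collapses all whitespace, including the
    # author's original line breaks, into single spaces.
    return textwrap.fill(" ".join(paragraph.split()), width=width)


if __name__ == "__main__":
    paragraph = "Hello.\nHere is a paragraph of text.\nIt reflows as needed."
    art = "+-------+\n| don't |\n| touch |\n+-------+"
    for width in (20, 60):
        print(reflow(paragraph, width))
        print(art)  # preformatted text is passed through unchanged
        print()
```

The paragraph changes shape with the width; the box never does.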

Markup perspectives
=================================================================

Just as HTTP comes with HTML (or vice versa), I believe a
next-gen rodent-based content protocol should specify the format
of its content, at least to the degree that we can link to other
documents and identify, minimally, how to display that content.

But, again, this format needs to balance the concerns of the
three perspectives I used to look at the protocol: developers,
content creators, and end-users.

Let's look at each of those now:

1. Developers

In my mind, a good format specification is unambiguous, simple,
and flexible. In the spirit of Gopher, I suggest a format that
is as *easy to parse as possible*.

Therefore, I currently favor an extremely limited line-based syn-
tax. I'll show examples later.

2. Content creators

As a content creator, I want the syntax to get out of my way and
let me type as rapidly as I can compose my thoughts.

I want the flexibility to be able to accomplish any (reasonable)
thing I can think of doing with plain text, but not have to memo-
rize a huge set of rules.

I feel like developer and content creator perspectives don't have
to be at odds so long as both agree on *utter simplicity* as a
core tenet.

3. End-users

As an end-user, I want to be able to view content so that it is
formatted as nicely for my screen as possible; I want to be able
to view a document on my phone, printed out on paper, or jacked
into my cyberdeck on the neon rooftop of a megacorp in the pour-
ing rain.

About the markup example
=================================================================

I'm already dogfooding [2] a prototype of this syntax and have
been using it since mentioning a tool I created called Text Ju-
nior (tjr) in a post back in April. [3]

(By the way, piping through groff hasn't been quite the panacea
I'd hoped it would be, but I'm otherwise pretty happy with the
little tool and the syntax. I've been noodling with a replacement
written in AWK/gawk.)

I've borrowed things I like from existing syntaxes such as Mark-
down, AsciiDoc, and various wikis. I honestly can't keep them
all straight anymore.

The common feature here is that all formatting is "line-based":
paragraphs are separated by blank lines. Headings are on lines
that start with one or more "#" characters. Other blocks start
and end with lines containing nothing but symmetrical triplets of
characters that are easy to type on the keyboard and are hopeful-
ly easy to remember (because of certain existing conventions).

Unambiguity, ease of typing, and ease of parsing are the primary
goals (in that order).
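
To show how little machinery a line-based syntax needs, here is a
hedged Python sketch of a one-pass block splitter. It is not how
tjr works; the name parse_blocks, the block kinds, and the
handling of an unclosed block are all assumptions. Links and
lists are left out.

```python
from typing import List, Tuple

# Block delimiters as described in the text: a line holding only a
# triplet opens a block, and the same triplet closes it.
FENCES = {"`" * 3: "pre", '"' * 3: "quote"}


def parse_blocks(text: str) -> List[Tuple[str, List[str]]]:
    """Split a document into (kind, lines) pairs in one pass over its lines."""
    blocks: List[Tuple[str, List[str]]] = []
    current: List[str] = []   # lines of the block being collected
    fence = None              # triplet that opened the current block, if any

    def flush(kind: str) -> None:
        nonlocal current
        if current:
            blocks.append((kind, current))
        current = []

    for line in text.split("\n"):
        marker = line.strip()
        if fence is not None:
            if marker == fence:            # closing triplet
                flush(FENCES[fence])
                fence = None
            else:
                current.append(line)       # fenced lines are kept verbatim
        elif marker in FENCES:             # opening triplet
            flush("para")
            fence = marker
        elif not marker:                   # blank line ends a paragraph
            flush("para")
        elif line.startswith("#"):         # heading: one or more "#"
            flush("para")
            level = len(line) - len(line.lstrip("#"))
            blocks.append((f"h{level}", [line[level:].strip()]))
        else:
            current.append(line)
    # Assumption: an unclosed block simply runs to the end of the document.
    flush(FENCES[fence] if fence is not None else "para")
    return blocks


if __name__ == "__main__":
    tick = "`" * 3
    doc = f"# Example Document\n\nHello.\nHere is text.\n\n{tick}\n+--+\n{tick}\n"
    for kind, lines in parse_blocks(doc):
        print(kind, lines)
```

Every decision is made by looking at one line at a time, which is
what keeps the rules easy to state and easy to parse.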

Markup example
=================================================================

Enough talk, let's see an example:

# Example Document

Hello.
Here is a paragraph of text.
It reflows as needed to fit the desired output width.

I like the idea of enforcing that links be on a line of their own.
I'm not super sure about the exact syntax.
For reasons I'll get into in Part 4 (the client), I want to support relative document links.
So here's something to look at:

link:/docs/wigglers
link://example.com/danglers
http://example.com/
telnet://example.com:23

The first two are for *this* new imaginary protocol. The last two are for *other* protocols in URL form.
Also note that there is no "display" text for the links.
I like the idea that the end-user is completely aware of where they're going when they follow a link.

Now a "preformatted" or "code" block:

```
example(){                +------------------------+
  print("Hello world!");  | Code or art goes here  |
}                         +------------------------+
```

I consider these to be nice-to-have formatting items:

"""
A block quote will stand out from the paragraph text.
It will also flow and wrap like a paragraph.
"""

It's also hard to make a good document without this ability:

1. Ordered and unordered lists are always nice to have
2. I'm not certain how necessary it is to support nested lists. I guess it would be nice.

The end.
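
The link lines in the example document above could be recognized
and resolved along these lines. This is only a sketch: the exact
link syntax is still undecided, "link:" stands in for the
imaginary protocol, and the names LINK_RE, is_link_line, and
resolve are hypothetical.

```python
import re

# Assumption: a link line is a single URL-like token alone on its line,
# i.e. a lowercase scheme, a colon, and no whitespace anywhere.
LINK_RE = re.compile(r"^[a-z][a-z0-9+.-]*:\S+$")


def is_link_line(line: str) -> bool:
    """Return True if the whole line looks like scheme:something."""
    return LINK_RE.match(line.strip()) is not None


def resolve(link: str, current_host: str) -> str:
    """Turn a host-relative "link:/path" into "link://host/path"."""
    if link.startswith("link:/") and not link.startswith("link://"):
        return f"link://{current_host}{link[len('link:'):]}"
    return link  # already absolute, or another protocol's URL


if __name__ == "__main__":
    for line in ("link:/docs/wigglers", "telnet://example.com:23", "Hello."):
        print(line, is_link_line(line))
    print(resolve("link:/docs/wigglers", "example.com"))
```

Because a link must sit alone on its line, the client never has to
hunt for URLs inside paragraph text.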

Example rendering
=================================================================

Here's an example rendering as it might appear if your screen
just happened to match the width of this document. :-) Of course,
you have to use your imagination to visualize how links might be
highlighted and such:

EXAMPLE DOCUMENT

Hello. Here is a paragraph of text. It reflows as needed to fit
the desired output width.

I like the idea of enforcing that links be on a line of their
own. I'm not super sure about the exact syntax. For reasons
I'll get into in Part 4 (the client), I want to support relative
document links. So here's something to look at:

link:/docs/wigglers
link://example.com/danglers
http://example.com/
telnet://example.com:23

The first two are for *this* new imaginary protocol. The last two
are for *other* protocols in URL form. Also note that there is
no "display" text for the links. I like the idea that the end-
user is completely aware of where they're going when they follow
a link.

Now a "preformatted" or "code" block:

example(){                +------------------------+
  print("Hello world!");  | Code or art goes here  |
}                         +------------------------+

I consider these to be nice-to-have formatting items:

A block quote will stand out from the paragraph text.
It will also flow and wrap like a paragraph.

It's also hard to make a good document without this ability:

1. Ordered and unordered lists are always nice to have
2. I'm not certain how necessary it is to support nested
lists. I guess it would be nice.

The end.

Okay, that's it
=================================================================

(I had to fake the ordered list because I don't actually have
that working in tjr yet.)

Again, the client will be given leeway to display the document in
whatever way makes it most enjoyable for the end-user.

By the way, I could also see an argument being made for
standardizing *strong* and _emphasized_ text. But making
unambiguous rules for these that cover all corner cases is
extremely hard. Also, it breaks the "line-based" nature of the
formatting so far.

There are lots of other little details and persuasive arguments I
could try to pack in here, but I think this post has gone on long
enough.

Actually, the next part, The Client, is where I'm *most excited*.
Thanks for reading thus far!

Well, almost done
=================================================================

Oh, one more thing: I've read all of the feedback (I could find)
so far and taken it all to heart. I'm happy to see the passion,
and I hope there are *no hard feelings* if I've rubbed some folks
the wrong way with all of this.

I wanted to specifically acknowledge what gallowsgryph wrote
about a next-gen protocol: [4]

"And I have a name suggestion for it: /Meerkat/. The
burrowing Savannah dweller that have large fami-
lies... Much like pubnix groups, if you think about
it."

I *love* that suggestion. "Meerkat Protocol." "MML - Meerkat
Markup Language." That could work.

What other rodents and small mammals could be pressed into ser-
vice? Shrews, moles, voles, mice, rats, hedgehogs, hamsters,
lemmings...

Ha! This is fun.

   ***     ****
 ******* ********
*******love*******
 ****ratfactor***
   ************
     ********
       ****
        **

See you in cyberspace, Gophers!

[0] https://en.wikipedia.org/wiki/Ward_Cunningham#Cunningham's_Law
[1] https://en.wikipedia.org/wiki/Hypertext
[2] https://en.wikipedia.org/wiki/Eating_your_own_dog_food
[3] gopher://sdf.org/0/users/ratfactor/phlog/2019-04-21-text-junior
[4] gopher://sdf.org/0/users/gallowsgryph/phlog/2019-06-07_gopher2_part2.txt