* * * * *

                    The legality of double slashes in URIs

Martin Chang replied [1] to my musings on processing malformed Gemini
requests [2], saying that double slashes in URIs (Uniform Resource
Identifiers) are illegal, and pointed to the ABNF (Augmented Backus-Naur
Form) grammar from the URI specification [3] to back up his claim:

-----[ ABNF ]-----
path          = path-absolute   ; begins with "/" but not "//"
path-absolute = "/" [ segment-nz *( "/" segment ) ]
segment-nz    = 1*pchar
pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
-----[ END OF LINE ]-----

But he didn't quote the segment rule:

-----[ ABNF ]-----
segment       = *pchar
-----[ END OF LINE ]-----

which translated says, “0 or more pchar rules.”

So the ABNF he quoted does indeed rule out //boston/2018/07/04.2. It doesn't
rule out /boston//2018/07/04.2, since by the time we hit the double slash,
we're in the *( "/" segment ) part of the path-absolute rule, and segment can
have 0 characters. But what he quoted only applies to relative links, and
what I receive is an absolute link. If you follow the ABNF from that
perspective:

-----[ ABNF ]-----
URI-reference = URI / relative-ref
URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
hier-part     = "//" authority path-abempty
                / path-absolute
                / path-rootless
                / path-empty

path-abempty  = *( "/" segment )

; other rules omitted
-----[ END OF LINE ]-----

not only does this allow gemini://gemini.conman.org//boston/2018/07/04.2 but
also gemini://gemini.conman.org///////////boston/2018/07/04.2.
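
To check that reading of the grammar, here is a minimal sketch in Python that
transcribes the quoted rules into regular expressions. The names PCHAR,
is_path_absolute and is_path_abempty are made up for this example, and it
only covers the path rules shown above, not the whole URI grammar.

```python
import re

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"   (RFC 3986)
PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"

SEGMENT = PCHAR + "*"     # segment    = *pchar   (may be empty)
SEGMENT_NZ = PCHAR + "+"  # segment-nz = 1*pchar  (at least one character)

# path-absolute = "/" [ segment-nz *( "/" segment ) ]
PATH_ABSOLUTE = re.compile(rf"/(?:{SEGMENT_NZ}(?:/{SEGMENT})*)?")

# path-abempty  = *( "/" segment )
PATH_ABEMPTY = re.compile(rf"(?:/{SEGMENT})*")


def is_path_absolute(path: str) -> bool:
    """True if the whole string matches the path-absolute rule."""
    return PATH_ABSOLUTE.fullmatch(path) is not None


def is_path_abempty(path: str) -> bool:
    """True if the whole string matches the path-abempty rule."""
    return PATH_ABEMPTY.fullmatch(path) is not None


if __name__ == "__main__":
    # A leading double slash is rejected by path-absolute ...
    print(is_path_absolute("//boston/2018/07/04.2"))
    # ... but a double slash later in the path is accepted.
    print(is_path_absolute("/boston//2018/07/04.2"))
    # path-abempty (used after an authority) accepts any run of slashes.
    print(is_path_abempty("//boston/2018/07/04.2"))
    print(is_path_abempty("///////////boston/2018/07/04.2"))
```

The only difference between the two rules that matters here is that
path-absolute demands a non-empty first segment, while path-abempty lets
every segment, including the first, be empty.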

I can understand why this was done: it simplifies the grammar. The various
path- rules generally end with *( "/" segment ), which allows a URI to end
with or without a trailing slash. I don't think the intent was to allow long
strings of slashes, but that's the end result of a lax grammar. Martin is
also correct that multiple slashes are treated as a single slash on POSIX
(Portable Operating System Interface) systems (basically, any Unix system),
but that's not the case across all operating systems. One exception I can
think of is AmigaOS, where each slash represents a parent directory. The
command cd /// on AmigaOS is the same as cd ../../.. on a POSIX system.
Crazy, I know. And maybe not even relevant these days, but I thought I should
mention it.
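
As a rough illustration of the POSIX side of this, here is a small Python
sketch. The helper names request_path and posix_view are invented for the
example, and posixpath.normpath only approximates what a filesystem lookup
does (it also resolves "." and ".." textually). One wrinkle: POSIX leaves
exactly two leading slashes implementation-defined, so normpath keeps those
as they are, while three or more collapse to one.

```python
import posixpath
from urllib.parse import urlsplit


def request_path(url: str) -> str:
    """Return the path component of a URL exactly as written."""
    return urlsplit(url).path


def posix_view(path: str) -> str:
    """Approximate how a POSIX system would interpret the path."""
    return posixpath.normpath(path)


if __name__ == "__main__":
    url = "gemini://gemini.conman.org///////////boston/2018/07/04.2"
    raw = request_path(url)
    print(raw)                                   # slashes kept as sent
    print(posix_view(raw))                       # run of slashes collapsed
    print(posix_view("/boston//2018/07/04.2"))   # interior "//" collapsed
```

So the URI parser hands over every slash it was given, and any collapsing is
a decision made later by whatever maps the path onto a filesystem.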

[1] gemini://gemini.clehaxze.tw/gemlog/2022/05-03-two-cents-on-the-mistery-of-double-slashes-in-urls.gmi
[2] gopher://gopher.conman.org/0Phlog:2022/04/30.1
[3] https://www.ietf.org/rfc/rfc3986.txt

Email author at [email protected]