From: [email protected] (The Free Thinker)
Newsgroups: tilde.projects
Subject: Re: Browsecal.sh
Date: Mon, 14 Sep 2020 23:37:47 -0000 (UTC)
Organization: tilde.club
Message-ID: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]> <[email protected]>

The Free Thinker <[email protected]> wrote:
> The Free Thinker <[email protected]> wrote:
>> Dacav Doe <[email protected]> wrote:
>>> On 2020-09-06, The Free Thinker <[email protected]> wrote:
>>>>
>>>> gopher://aussies.space:70/1/%7efreet/scripts
>>>
>>> After fixing new-lines (sed -i s/.$//)
>>
>> I'm not sure of the solution to this. I think Gophernicus converts
>> all text to DOS-style newlines, probably because the Gopher standard
>> says that gophermaps should use them.
>>
>> But I haven't got around to looking up the gopher commands and trying
>> in telnet to make 100% sure that it's not curl doing the conversion
>> (and that the UMN Gopher client is doing its own conversion back
>> again for me).
>
> Actually, I just tried:
> gopher gopher://aussies.space:70/9/%7efreet/scripts/browsecal.sh
>
> And it downloads with the DOS newlines, so it must be Gophernicus
> adding them. The next question is whether that's a bug, or a
> characteristic of Gopher, and clients should do the conversion back
> again like the UMN Gopher client does when saving?
>
> Some clients might not be able to cope with UNIX newlines.
>
> I guess Gophernicus shouldn't be doing the conversion for type 9
> (binary) downloads though. Otherwise I could have type 0 view links,
> and type 9 download links, for each script.

I finally got around to digging into this properly and figured out
a work-around, which involves adding a null character to each script.
Some clients might choke on a null character in text that they're
viewing, so I've now got separate "View" and "Download" links on that
page, and the download links provide script files with UNIX-style
newlines.

Details:

REMOTE (server-side):
^^^^^^^^^^^^^^^^^^^^^
$ echo -e "text\nis\written" > text.txt
$ echo -e "text\nis\written" > text
$ echo -e "text\ni\0s\written" > bintext.txt
$ echo -e "text\ni\0s\written" > bintext

I meant to put "\n" before "written" instead of just "\", but too
late now...

$ ls *text*
bintext  bintext.txt  text  text.txt

$ cat -v *text*
text
i^@s\written
text
i^@s\written
text
is\written
text
is\written

LOCAL (client-side):
^^^^^^^^^^^^^^^^^^^^
Downloaded files as item type 9 from aussies.space Gophernicus server

$ ls *text*
bintext  bintext.txt  text  text.txt

$ cat -v *text*
text
i^@s\written
text^M
i^M
text^M
is\written^M
text^M
is\written^M
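
For files that have already been downloaded with CRLF newlines, the
conversion can be undone on the client side. This is only a sketch:
"strip_cr" is a made-up helper name, and it assumes the file is a
plain text script with no carriage returns that need to be kept.

```shell
# Hypothetical helper: copy IN to OUT with all carriage returns removed.
# Assumes IN is a text script where every CR comes from CRLF conversion.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}
```

Usage would be something like "strip_cr browsecal.sh browsecal.fixed".
Unlike "sed -i s/.$//", it doesn't touch lines that have no CR at
the end.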

Conclusion:
^^^^^^^^^^^
The only files that avoid conversion to CRLF newlines are those that
contain a null byte AND don't end with a file extension that
indicates a text file.

Also, based on the Gophernicus source code, the first null byte must
be within BUFSIZE bytes from the start of the file (currently set to
1024 bytes in gophernicus.h). See function "gopher_filetype" in
menu.c.

The file extensions are assigned to Gopher item types according to
the FILETYPES definition in gophernicus.h.
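
The behaviour seen in the test above can be modelled roughly in
shell. This is a sketch of my understanding, not the Gophernicus
code: the function names are made up, TEXT_EXTS is a placeholder
holding only "txt" (the real list is FILETYPES in gophernicus.h),
and 1024 is the BUFSIZE value mentioned above.

```shell
# Rough model of the observed behaviour, NOT the actual Gophernicus logic.
# TEXT_EXTS is a placeholder list; the real one is FILETYPES in gophernicus.h.
TEXT_EXTS="txt"

# Succeeds if FILE has a null byte within its first 1024 bytes.
has_early_null() {
    total=$(( $(head -c 1024 "$1" | wc -c) ))
    kept=$(( $(head -c 1024 "$1" | tr -d '\000' | wc -c) ))
    [ "$kept" -lt "$total" ]
}

# Prints "binary" if FILE looks like it would escape CRLF conversion,
# otherwise prints "text".
guess_served_as() {
    name=${1##*/}
    case "$name" in
        *.*) ext=${name##*.} ;;
        *)   ext= ;;
    esac
    # A recognised text extension wins, whatever the file contains.
    for e in $TEXT_EXTS; do
        if [ "$ext" = "$e" ]; then
            echo text
            return
        fi
    done
    # No text extension: an early null byte marks the file as binary.
    if has_early_null "$1"; then
        echo binary
    else
        echo text
    fi
}
```

Run against the four test files, this model reports "binary" only
for bintext, which matches what came back from the server.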

Running Gophernicus with the "-nc" option should disable checking for
the null bytes, in which case files without recognised extensions
will always be treated as text (and converted to CRLF).

Running Gophernicus with the "-e [ext]=[type]" option should add or
re-assign extensions to a Gopher item type. In theory "-e txt=9"
should disable CRLF conversion for files with the ".txt" extension,
for example.

For a shared server, I don't see any method for changing the file
type auto-detection behaviour on a per-user basis. A
directory-specific configuration system, something like Apache's
".htaccess" (though hopefully not nearly as complicated and
confusing), would be nice.

On a multi-user system, it seems that the only solution for avoiding
CRLF being added to UNIX script files is to not use a file extension
that's assigned to item type 0 in gophernicus.h, and to add a null
character to the file within the first 1024 bytes.

This command adds a line containing a null character after the first
line of a script, allowing it to be downloaded from a Gophernicus
server without conversion to CRLF:
sed '2i # \x00'

e.g. (you still need to avoid using an extension that's associated
with text files in gophernicus.h):
sed '2i # \x00' browsecal.sh > browsecal
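
The "\x00" escape and the one-line form of the "i" command are GNU
sed extensions, so the command above may not work with other sed
implementations. As a sketch of a more portable equivalent
("add_null_line" is a made-up name, and I'm assuming the script has
at least one line, normally the "#!" line):

```shell
# Hypothetical helper: copy IN to OUT, inserting a comment line that
# contains a null byte after the first line. Assumes IN has at least
# one line (normally the "#!" line).
add_null_line() {
    {
        head -n 1 "$1"          # keep the first line where it is
        printf '# \000\n'       # comment line holding the null byte
        tail -n +2 "$1"         # rest of the script, unchanged
    } > "$2"
}
```

For example "add_null_line browsecal.sh browsecal", again leaving
off any extension that's associated with text files. Running
"cat -v" on the result should show the null byte as "^@" on the
second line.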

--

- The Free Thinker      |      gopher://aussies.space/1/%7efreet/