__________________________________________________________________
* To: debian-devel <[email protected]>
* Subject: Hyphens in man pages
* From: Antonio Russo <[email protected]>
* Date: Sat, 14 Oct 2023 20:51:27 -0600
* Message-id: <e21c0729-0789-4079-b9a2-b9a4b1843e05@aerusso.net>
__________________________________________________________________
Hello,
I discovered a new pet peeve today: if you search for an option in a manual
page, say -e in man 1 zgrep, it's a crapshoot whether just searching for '-e'
will find the option or not. The reason is that "-" may have been accidentally
encoded as ‐ instead of -.
Now, depending on your email client and settings, the above will appear to be
the ravings of an unhinged lunatic who wrote the same thing twice, or an
unhinged lunatic who slammed their fists onto the keyboard.
The reason is that man(1) converts bare dashes (0x2D) to hyphens (U+2010). These
are not the same symbol: searching for one does not find the other without some
kind of normalization, and pasting a command containing one instead of the other
does something different. New users who do not understand this will be
discouraged from reading manual pages. Chances are, they will fill forums with
mundane questions that could and should have been answered by a simple search
of a manual page.
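
To make the mismatch concrete, here is a minimal Python sketch. It is not part
of man or groff; the helper name normalize_dashes and the sample line are made
up for illustration. It shows that the two characters are distinct code points
and that a plain substring search for '-e' misses the hyphen variant unless the
text is normalized first.

```python
import unicodedata

HYPHEN_MINUS = "\u002d"  # ASCII '-', what you type on a command line
HYPHEN = "\u2010"        # typographic hyphen that can end up in rendered pages


def normalize_dashes(text: str) -> str:
    """Map U+2010 back to ASCII hyphen-minus (illustrative helper)."""
    return text.replace(HYPHEN, HYPHEN_MINUS)


# Hypothetical line of rendered man page output that contains U+2010.
rendered = f"{HYPHEN}e PATTERN  use PATTERN for matching"

print(unicodedata.name(HYPHEN_MINUS))      # HYPHEN-MINUS
print(unicodedata.name(HYPHEN))            # HYPHEN
print("-e" in rendered)                    # False
print("-e" in normalize_dashes(rendered))  # True
```

A pager or search tool would need to do something like normalize_dashes on
both the query and the text for the search to behave the way a reader expects.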
I recently fixed a ton of these in another upstream package with this vim
"one-liner":
:%s/--\([a-z]\+\)\(-[a-z]\+\)*/\=substitute(submatch(0), '-', '\\-', 'g')/g
However, this requires manual review and does not fix the '-e' example from
zgrep.
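
For anyone who would rather script this outside of vim, the same substitution
can be sketched in Python. This is only a rough equivalent under stated
assumptions: the function name escape_long_options is hypothetical, and the
negative lookbehind for a backslash is my addition (it is not in the vim
pattern) so that a token that is already partly escaped is left alone rather
than mangled. It shares the limitations above: it still needs manual review
and does nothing for short options such as '-e'.

```python
import re

# "--word" followed by zero or more "-word" parts, as in the vim pattern.
# The (?<!\\) lookbehind is an extra guard: skip a "--" that directly
# follows a backslash.
LONG_OPTION = re.compile(r"(?<!\\)--[a-z]+(?:-[a-z]+)*")


def escape_long_options(line: str) -> str:
    """Prefix every dash inside a matched long option with a backslash."""
    return LONG_OPTION.sub(lambda m: m.group(0).replace("-", r"\-"), line)


print(escape_long_options("Use --ignore-case or --quiet."))
# Use \-\-ignore\-case or \-\-quiet.
```

Using a function as the replacement avoids having to think about how re.sub
interprets backslashes in a replacement string.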
There is also a whole host of related problems, e.g., dashes in URLs that get
naively pasted into man pages (another live example I just addressed).
I come here with several questions:
- Am I off-base thinking this is a problem?
- Should we really be using troff to typeset anything in this year 2023?
(In particular, if we can make the source text more human-readable, we might
be able to leverage LLMs on this wealth of information in the future and
automate support. Are LLMs "fluent" in troff? I have not investigated at all.)
- Are there any alternatives that actually produce nice looking man pages?
(My experience with pandoc is that the source is still awkward and the output
looks pretty ugly in man; I literally just found another example of this bug
in my own man page. But maybe I just didn't find good examples/documentation.)
- Should we try to come up with some lintian rules to flag this behavior?
(This one: /--\([a-z]\+\)\(-[a-z]\+\)*/ finds long GNU-style options; I'd
have to think for at least a little bit about finding short ones. This would
ultimately be fragile. For example, the above doesn't find partially broken
tokens, i.e., tokens with only one unescaped dash.)
<li> Automated tooling around this, more generally, seems fragile. HTML might
have been a nice compromise, but writing that appears to be out of vogue these
days, <sarcasm intensity="medium">despite being a pretty OK thing to read and
write by hand</sarcasm>.</li> But seriously, I would love to be writing HTML
instead of troff for manual pages.
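
As a starting point for the lintian question, here is a rough Python sketch of
what such a check could look like. It is not a lintian check (lintian is
written in Perl) and everything named here, including find_unescaped_options
and the sample page text, is hypothetical. It extends the long-option pattern
above with a guess at short options: a single dash followed by one
alphanumeric, either at a word start or right after a font escape such as \fB.
It deliberately skips roff comment lines, and it still misses partially
escaped tokens, which illustrates the fragility mentioned above.

```python
import re

UNESCAPED_OPTION = re.compile(
    # Either not preceded by a backslash, word character, or dash,
    # or directly preceded by a font escape such as \fB or \fI.
    r"(?:(?<![\\\w-])|(?<=\\f[BIRP]))"
    # Long option, same shape as the pattern quoted above.
    r"(?:--[a-z]+(?:-[a-z]+)*"
    # Short option: one dash and one alphanumeric, then a word boundary.
    r"|-[A-Za-z0-9]\b)"
)


def find_unescaped_options(roff_source: str) -> list[tuple[int, str]]:
    """Return (line number, token) pairs for option-like unescaped dashes."""
    hits = []
    for lineno, line in enumerate(roff_source.splitlines(), start=1):
        if line.startswith('.\\"'):  # roff comment line, ignore it
            continue
        for match in UNESCAPED_OPTION.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


# Hypothetical man page source used only to exercise the sketch.
sample = "\n".join([
    r'.\" comment mentioning -x is skipped',
    r".B \-e",
    r"Use \fB-e\fR or --ignore-case for a well-known effect.",
])
print(find_unescaped_options(sample))
# [(3, '-e'), (3, '--ignore-case')]
```

Ordinary hyphenated words such as "well-known" are not flagged because the
dash follows a word character, but the same rule means real mistakes in unusual
positions would slip through, so any such check would need human review.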
Antonio
Attachment: OpenPGP_0xB01C53D5DED4A4EE.asc
Description: OpenPGP public key
Attachment: OpenPGP_signature.asc
Description: OpenPGP digital signature