    __________________________________________________________________

    * To: debian-devel <[1][email protected]>
    * Subject: Hyphens in man pages
    * From: Antonio Russo <[2][email protected]>
    * Date: Sat, 14 Oct 2023 20:51:27 -0600
    * Message-id: <[3][🔎]
      [4][email protected]>
    __________________________________________________________________

Hello,

I discovered a new pet peeve today: if you search for an option in a manual
page, say -e in man 1 zgrep, it's a crapshoot whether just searching for '-e'
will find the option or not.  The reason is that "-" may have been accidentally
encoded as ‐ instead of -.

Now, depending on your email client and settings, the above will appear to be
the ravings of an unhinged lunatic who wrote the same thing twice, or an
unhinged lunatic who slammed their fists onto the keyboard.

The reason is that man(1) converts bare dashes (0x2D) to hyphens (U+2010).
These are not the same symbol: searching for one does not find the other
without some kind of normalization, and pasting commands with one vs. the other
does different things.  New users who do not understand this will be
discouraged from trying to read manual pages.  Chances are, they will fill
forums with mundane questions that could and should have been addressed by a
simple search of a manual page.
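
To make the mismatch concrete, here is a minimal Python sketch.  The sample
line is hypothetical (it is not copied from the zgrep page), and the helper
names are made up for illustration.  It only shows that a plain substring
search for '-e' misses text that was rendered with U+2010, and that an explicit
mapping back to 0x2D fixes the search.

```python
import unicodedata

ASCII_HYPHEN_MINUS = "\u002d"  # what the keyboard produces
UNICODE_HYPHEN = "\u2010"      # what can end up in the rendered page


def find_option(rendered_text: str, option: str) -> int:
    """Plain substring search, roughly what typing /-e in a pager does."""
    return rendered_text.find(option)


def normalize_hyphens(text: str) -> str:
    """Explicitly map U+2010 back to U+002D so searches behave as expected."""
    return text.replace(UNICODE_HYPHEN, ASCII_HYPHEN_MINUS)


# Hypothetical rendered line in which the option dash became U+2010.
rendered = "\u2010e PATTERN   use PATTERN for matching"

print(unicodedata.name(ASCII_HYPHEN_MINUS))  # HYPHEN-MINUS
print(unicodedata.name(UNICODE_HYPHEN))      # HYPHEN
print(find_option(rendered, "-e"))                     # -1: not found
print(find_option(normalize_hyphens(rendered), "-e"))  # 0: found
```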

I recently fixed a ton of these in another upstream package with this vim
"one-liner":

:%s/--\([a-z]\+\)\(-[a-z]\+\)*/\=substitute(submatch(0), '-', '\\-', 'g')/g

However, this requires manual review and does not fix the '-e' example from
zgrep.  There is also a whole host of problems of this kind, e.g., dashes in
URLs that get naively pasted into man pages (another live example I just
addressed).
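
For anyone who would rather script this outside of vim, the same substitution
can be sketched in Python.  This is only a rough equivalent of the one-liner
above, with the same limitations: it handles long GNU-style options only, does
nothing for short options such as '-e', and still needs manual review.  The
function name is made up for illustration.

```python
import re

# Same pattern as the vim one-liner: "--word" followed by any number of
# "-word" groups, lowercase letters only.
LONG_OPTION = re.compile(r"--[a-z]+(?:-[a-z]+)*")


def escape_long_options(roff_source: str) -> str:
    """Replace every bare dash inside a matched long option with a roff \\- escape."""
    return LONG_OPTION.sub(
        lambda match: match.group(0).replace("-", "\\-"),
        roff_source,
    )


print(escape_long_options("Use --dry-run to preview."))
```

Options that are already fully escaped are left alone, because the pattern
requires two adjacent bare dashes to match.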

I come here with several questions:

- Am I off-base thinking this is a problem?
- Should we really be using troff to typeset anything in this year 2023?
  (In particular, if we can make the source text more human-readable, we might
  be able to leverage LLMs on this wealth of information in the future and
  automate support.  Are LLMs "fluent" in troff? I have not investigated at
  all.)
- Are there any alternatives that actually produce nice looking man pages?
  (My experience with pandoc is that the source is still awkward; I literally
  just found another example of this bug in my own man page, and it looks
  pretty ugly in man. But maybe I just didn't find good examples/documentation.)
- Should we try to come up with some lintian rules to flag this behavior?
  (This one: /--\([a-z]\+\)\(-[a-z]\+\)*/ finds long GNU-style options; I'd
  have to think for at least a little bit about finding short ones.  This would
  ultimately be fragile. For example, the above doesn't find partially broken
  tokens, i.e., those with only one unescaped dash.)
<li> Automated tooling around this, more generally, seems fragile.  HTML might
  have been a nice compromise, but writing that appears to be out of vogue these
  days, <sarcasm intensity="medium">despite being a pretty OK thing to read and
  write by hand</sarcasm>.</li> But seriously, I would love to be writing HTML
  instead of troff for manual pages.
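
As a rough illustration of what such a check could look like, here is a
standalone Python sketch.  It is not a lintian check and not a proposal for one.
The long-option pattern is the one quoted above; the short-option pattern is a
guess at a heuristic and will produce both false positives and false
negatives.  The function name and the sample input are hypothetical.

```python
import re

# Long GNU-style options written with bare dashes (pattern from the list above).
LONG_OPTION = re.compile(r"--[a-z]+(?:-[a-z]+)*")

# Heuristic guess for short options: a bare dash that is not preceded by a
# backslash, a word character, or another dash, followed by exactly one
# letter or digit.
SHORT_OPTION = re.compile(r"(?<![\\\w-])-[A-Za-z0-9](?![\w-])")


def find_bare_dashes(roff_source: str) -> list[tuple[int, str]]:
    """Return (line number, token) pairs for options that use unescaped dashes."""
    findings = []
    for lineno, line in enumerate(roff_source.splitlines(), start=1):
        if line.startswith('.\\"'):  # skip roff comment lines
            continue
        for pattern in (LONG_OPTION, SHORT_OPTION):
            for match in pattern.finditer(line):
                findings.append((lineno, match.group(0)))
    return findings


# Hypothetical man page fragment, not taken from any real package.
sample = (
    '.\\" comment mentioning --not-flagged\n'
    ".B --dry-run\n"
    "Use \\-e or -e PATTERN.\n"
    "A well-known word is left alone.\n"
)

for lineno, token in find_bare_dashes(sample):
    print(f"line {lineno}: bare dash in {token!r}")
```

Like the regex it is built on, this misses partially escaped tokens, which is
part of why automated checks here seem fragile.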

Antonio

  Attachment: [5]OpenPGP_0xB01C53D5DED4A4EE.asc
  Description: OpenPGP public key

  Attachment: [6]OpenPGP_signature.asc
  Description: OpenPGP digital signature
    __________________________________________________________________

  Reply to:
    * [7][email protected]
    * [8]Antonio Russo (on-list)
    * [9]Antonio Russo (off-list)
    __________________________________________________________________

    * Follow-Ups:
         + [10]Re: Hyphens in man pages
              o From: Jochen Sprickerhof <[email protected]>
         + [11]Re: Hyphens in man pages
              o From: "G. Branden Robinson"
                <[email protected]>

    * Prev by Date: [12]Re: The Technical Committee needs you!
    * Next by Date: [13]Re: Hyphens in man pages
    * Previous by thread: [14]Re: The Technical Committee needs you!
    * Next by thread: [15]Re: Hyphens in man pages
    * Index(es):
         + [16]Date
         + [17]Thread

References

  1. mailto:[email protected]
  2. mailto:[email protected]
  3. https://lists.debian.org/msgid-search/e21c0729-0789-4079-b9a2-b9a4b1843e05@aerusso.net
  4. https://lists.debian.org/debian-devel/2023/10/msg00083.html
  5. https://lists.debian.org/debian-devel/2023/10/bin8AN4dT_ZPQ.bin
  6. https://lists.debian.org/debian-devel/2023/10/pgpSD95L7uMcj.pgp
  7. mailto:[email protected]?in-reply-to=<[email protected]>&subject=Re: Hyphens in man pages
  8. mailto:[email protected]?in-reply-to=<[email protected]>&subject=Re: Hyphens in man pages&[email protected]
  9. mailto:[email protected]?in-reply-to=<[email protected]>&subject=Re: Hyphens in man pages
 10. https://lists.debian.org/debian-devel/2023/10/msg00084.html
 11. https://lists.debian.org/debian-devel/2023/10/msg00085.html
 12. https://lists.debian.org/debian-devel/2023/10/msg00082.html
 13. https://lists.debian.org/debian-devel/2023/10/msg00084.html
 14. https://lists.debian.org/debian-devel/2023/10/msg00082.html
 15. https://lists.debian.org/debian-devel/2023/10/msg00084.html
 16. https://lists.debian.org/debian-devel/2023/10/maillist.html#00083
 17. https://lists.debian.org/debian-devel/2023/10/threads.html#00083