* <<G12.0914>> Unicode sucks.
And so does ASCII. And everything in between.
The problem with plain text is that it isn't really an encoding for
~written language~, is it? What we call "plain text" is a sequence
of graphemes, numerals, punctuation (written human-language
elements), and miscellaneous graphic symbols, interleaved with
control codes for the operation of a teletypewriter. CARRIAGE RETURN
and LINE-FEED are, of course, not things you do with a pen or pencil
– you may be willing to concede that your arm is a carriage, and
that you mentally "feed" paper away from you as you move down the
page, but I am not – nor are they things a compositor does with his
composing stick, galleys, and formes.
***
-a family of devices that have their own characteristics quite
outside those of the stylus, brush, or printing press.
***
Some of the ASCII codes (numbered in decimal) that should be used and
thought of as written-language codes instead of teletype codes:
01 SOH Start-of-Heading
    Actually a useful semantic code. Headings are generally
    indicated by placement of text within a page. If we want to decouple
    the formatting from the language, we need markers like this to
    indicate when text is a heading, and when it is body text.
10 LF Line Feed; should be New Line
    The UNIX \n "newline" character. That's what this should
    represent: a new line, not a 1-line paper feed operation.
12 FF Form Feed; should be New Page
32 space; should be Word Separator
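
A rough sketch of what reading those four codes as written-language
codes might look like in practice. Everything named here (the PROPOSED
table, the annotate function, the angle-bracket notation) is made up
for illustration; it is one possible reading under those assumptions,
not a finished scheme.

```python
# Hypothetical table: decimal ASCII code -> (abbreviation, teletype name, proposed reading).
# The four entries are the ones listed above; the table layout is an assumption.
PROPOSED = {
    1: ("SOH", "Start-of-Heading", "Start-of-Heading"),
    10: ("LF", "Line Feed", "New Line"),
    12: ("FF", "Form Feed", "New Page"),
    32: ("space", "space", "Word Separator"),
}


def annotate(text):
    """Split text into tokens.

    Runs of characters not in PROPOSED stay as plain strings; each code
    in PROPOSED becomes its proposed reading wrapped in angle brackets.
    """
    tokens = []
    run = []
    for ch in text:
        entry = PROPOSED.get(ord(ch))
        if entry is None:
            run.append(ch)
            continue
        if run:
            # Flush the pending run of ordinary characters first.
            tokens.append("".join(run))
            run = []
        tokens.append("<" + entry[2] + ">")
    if run:
        tokens.append("".join(run))
    return tokens


sample = "\x01Unicode sucks\nAnd so does ASCII.\x0c"
print(annotate(sample))
```

Nothing in the text changes; only the name given to each code does.
The byte that a teletype takes as "feed the paper one line" is handed
back here as "a new line starts", which is the whole point.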
--
Excerpted from:
PUBLIC NOTES (G)
http://alph.laemeur.com/txt/PUBNOTES-G
©2016 Adam C. Moore (LÆMEUR)