Subj : not all is lost but far too much for far too long
To : Maurice Kinal
From : Rob Swindell
Date : Wed Jul 03 2019 05:19 pm
Re: not all is lost but far too much for far too long
By: Maurice Kinal to Rob Swindell on Wed Jul 03 2019 10:13 pm
> Hallo Rob!
>
> RS> It's an idea. But that's not how *other* charsets/encodings work
>
> Other than the existence of 8-bit characters, utf-8 is totally different from
> standard 8-bit character sets. If one is to scan msgs for 8-bit characters,
> it won't help to decipher the message without knowing beforehand what the
> character set is, whereas with utf-8 it doesn't matter.
Yes, but what I'm saying is that there's already a control paragraph (FTN kludge) defined for charsets; just use that.
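
To make the quoted point concrete, here is a minimal sketch of the difference between spotting 8-bit bytes and recognising UTF-8. The function names are hypothetical, and passing the check is only a strong hint rather than proof, since some legacy 8-bit text can happen to be valid UTF-8:

```python
def has_8bit(data: bytes) -> bool:
    """True if any byte has the high bit set (says nothing about which charset)."""
    return any(b >= 0x80 for b in data)


def looks_like_utf8(data: bytes) -> bool:
    """True if the body contains 8-bit bytes and decodes cleanly as UTF-8."""
    if not has_8bit(data):
        return False  # pure ASCII: nothing to distinguish
    try:
        data.decode("utf-8")  # strict decoding rejects malformed sequences
    except UnicodeDecodeError:
        return False
    return True
```

A reader could use a check like this as a fallback when no charset kludge is present, but it does not replace one: a body that fails the check still needs a declared charset to be decoded correctly.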
> The "CHRS: UTF-8 4"
> is totally useless, especially when it is wrong, such as in "CHRS: UTF-8 2",
> which still happens.
FTS-5003 seems to address that just fine:
Some implementations do not add the <level> field and some
implementations erroneously present "UTF-8 2" instead of "UTF-8 4".
Well mannered implementations should gracefully handle this situation
when reading messages. The recommended way of doing this is to
ignore the level parameter and only use the name of the identifier.
In future the level parameter may become obsolete.
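
The handling recommended in that passage could be sketched roughly as below. This assumes the kludge arrives as a single text line, optionally prefixed with the SOH (0x01) byte that marks FTN control paragraphs; the function name `parse_chrs` is made up for illustration and is not taken from any particular tosser or reader:

```python
SOH = "\x01"  # FTN control paragraphs (kludges) start with this character


def parse_chrs(line: str):
    """Return the charset identifier from a CHRS kludge line, or None.

    The level field is deliberately ignored, so "UTF-8 4", "UTF-8 2"
    and a bare "UTF-8" all yield the same identifier.
    """
    text = line.lstrip(SOH).strip()
    keyword, sep, value = text.partition(":")
    if not sep or keyword.strip().upper() != "CHRS":
        return None  # not a CHRS kludge
    fields = value.split()
    if not fields:
        return None  # keyword present but no identifier
    return fields[0].upper()
```

With this approach a message carrying the erroneous "CHRS: UTF-8 2" is treated exactly like one carrying "CHRS: UTF-8 4".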
> ON> So, if we wanted to help enforce at a reader (or even tosser
> ON> level) how to handle, I would offer this up as a required BOM to
> ON> the message body that is UTF8.
>
> RS> And why is that better than a header field ("control paragraph"
> RS> as defined in FTS-5003) which indicates UTF-8?
>
> It isn't.
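
For comparison, a reader that trusts the control paragraph first and treats a BOM only as a fallback hint might look like the sketch below. The function name, the precedence order and the `cp437` default are all assumptions made for illustration; none of them comes from FTS-5003:

```python
import codecs


def pick_encoding(body: bytes, chrs_identifier=None, default="cp437"):
    """Choose a codec name for a message body.

    chrs_identifier is the name taken from a CHRS kludge, if one was present.
    """
    if chrs_identifier:
        return chrs_identifier  # the header field wins when it exists
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # Python codec that also strips the BOM on decode
    return default  # assumed legacy fallback
```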