
Re: [idn] Re: 7 bits forever!



-----BEGIN PGP SIGNED MESSAGE-----

Kent Karlsson wrote:
> Valdis Klētnieks wrote:
> > Let's take as an example the "native language" encoding of my name:
> >
> > From: Valdis Kl=?iso8859-4?Q?=BA?=tnieks <Valdis.Kletnieks@vt.edu>
> >
> > (That's a "small e with macron", Unicode 0113).
> 
> I don't know what you mean by "native language encoding".  The encoding
> here used 8859-4 with QP, but that is no more "native" nor more
> "language" than e.g. (most e-mail programs put the encoding outermost)
> 
> =?utf-16be?Q?Valdis_Kl=01=13tnieks?=

This is a good demonstration of how poorly specified and generally
broken RFC 2047 is. It's as clear as mud whether literal ASCII
characters in the encoded text are interpreted as themselves, or are
converted to bytes and then reinterpreted in the specified charset,
here UTF-16 (as would normally be the case for a transfer encoding
syntax in any other context):

#  (3) 8-bit values which correspond to printable ASCII characters other
#      than "=", "?", and "_" (underscore), MAY be represented as those
#      characters.

The problem is "which correspond to": does this mean in the specified
charset, or in ASCII? I would say in ASCII, which implies that the
values are bytes and are reinterpreted as UTF-16. This makes the string
invalid because there is an odd number of bytes (17), but if we remove
the final 's' it represents:

  U+5661 U+6C64 U+6973 U+204B U+6C01 U+1374 U+6E69 U+656B
or
  噡汤楳⁋氁፴湩敫
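
For anyone who wants to check the arithmetic, here is a rough Python
sketch of the two readings. The helper names (q_to_bytes,
ascii_as_itself) are made up for illustration and are not taken from
any mail library. It assumes well-formed input and handles only the
encoded text between "?Q?" and "?=".

```python
import re

ENCODED_TEXT = "Valdis_Kl=01=13tnieks"  # the part between "?Q?" and "?="


def q_to_bytes(text: str) -> bytes:
    """Turn 'Q' encoded text into raw bytes.

    '_' becomes 0x20, '=XX' becomes the byte 0xXX, and any other
    character becomes its ASCII byte. Assumes well-formed input.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "_":
            out.append(0x20)
            i += 1
        elif ch == "=":
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(ch))
            i += 1
    return bytes(out)


def ascii_as_itself(text: str, charset: str) -> str:
    """The other reading: literal ASCII characters stand for themselves,
    and only runs of '=XX' escapes are decoded in the given charset."""
    parts = re.split(r"((?:=[0-9A-Fa-f]{2})+)", text)
    out = []
    for part in parts:
        if part.startswith("="):
            out.append(bytes.fromhex(part.replace("=", "")).decode(charset))
        else:
            out.append(part.replace("_", " "))
    return "".join(out)


raw = q_to_bytes(ENCODED_TEXT)

# Bytes reinterpreted as UTF-16BE: the odd length makes the full string
# invalid, so the final byte ('s') is dropped before decoding.
as_utf16 = raw[:-1].decode("utf-16-be")

# ASCII characters taken literally, escapes decoded as UTF-16BE.
as_literal = ascii_as_itself(ENCODED_TEXT, "utf-16-be")

print(len(raw))
print(" ".join(f"U+{ord(c):04X}" for c in as_utf16))
print(as_literal)
```

The two results differ completely for the same input, which is the
whole problem.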

I would not be surprised if *some* implementations interpreted the spec
as you've described, and *some* the way I've described, though. Blech.

- -- 
David Hopwood <david.hopwood@zetnet.co.uk>

Home page & PGP public key: http://www.users.zetnet.co.uk/hopwood/
RSA 2048-bit; fingerprint 71 8E A6 23 0E D3 4C E5  0F 69 8C D4 FA 66 15 01
Nothing in this message is intended to be legally binding. If I revoke a
public key but refuse to specify why, it is because the private key has been
seized under the Regulation of Investigatory Powers Act; see www.fipr.org/rip


-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv

iQEVAwUBPJvNXTkCAxeYt5gVAQHxhQgAzq5I+B+uuDKyi1MLx0b3zpBMlAKcsO9D
vqmacVnpHoJMAAe00XYaXO1Zd1ZTX/r4GA68N8HOLnR2POO1ilzlvO4+x9Gx0GM4
jkeSn34Egldr+B0nVHvon/elDRICBKvQJlIfukld4nNzwLhKQNbZn5W90EQryOrP
9Nd95xpzNsrz61cWmKaG6DaEvS8nRYgPd6r13jiMIry8xfrkR5ZSP2RWGBvf2iK9
wXWX6v7cy4C5F2cq+ihaAOuzl2WJjF/biALFjk2ITpWir0Go5tWlGwLztE33AL+v
Xein3WezCyag2YV6Eq6CtY8gs65cDiCzzdnAay9dhohnklDA7Rw7vA==
=NZ/G
-----END PGP SIGNATURE-----