
[idn] Comments on protocol drafts



(I've just caught up with the discussion so this will be a general
comment to the past 100 postings or so.)

First of all, I agree with Oscarsson and others that the Hoffman
solution is not acceptable.

Suppose the organisation "ishockeyförening" ("ishockeyf" + "o" with
umlaut + "rening") owns a domain. The name contains 16 characters
which, if my quick and dirty implementation is correct, will be 18
bytes when compressed with the algorithm in 2.4 (Hoffman). Encoded
with base32 it's 30 bytes and with "ph6" prepended, we're up from 16
to 33 bytes just because one of the characters is not ASCII.

Well, 17 bytes is not a lot, but it adds up. (Remember that the
encoding is likely to be in frequent use for decades or centuries.)
More importantly, it's completely unnecessary. If ASCII is not
encoded in all-ASCII domain names, I see no reason to encode it just
because the string also contains one or two non-ASCII characters.

Therefore, I propose the following requirement:

  If an encoding is used, the ASCII characters in a string must not
  be encoded in different ways depending on what other characters
  the string contains.
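
As a rough illustration of the requirement, here is a small Python
sketch. It is not the algorithm from the draft: wrap_whole_label is a
hypothetical stand-in that skips the compression step and simply
base32-encodes the UTF-8 bytes behind the "ph6" tag, and
leading_ascii_is_stable is an assumed, partial check that only looks at
the leading ASCII run of a label. UTF-8 is used for comparison only
because it is a familiar encoding that treats ASCII the same way in
every context.

```python
import base64

LABEL = "ishockeyf\u00f6rening"  # 16 characters, one of them non-ASCII


def utf8_encode(label: str) -> bytes:
    # UTF-8 maps each ASCII character to the same single byte,
    # whatever else the string contains.
    return label.encode("utf-8")


def wrap_whole_label(label: str) -> bytes:
    # Hypothetical stand-in, NOT the draft's algorithm: all-ASCII labels
    # pass through untouched, any other label is base32-encoded as a
    # whole and tagged with "ph6".
    if label.isascii():
        return label.encode("ascii")
    body = base64.b32encode(label.encode("utf-8")).rstrip(b"=").lower()
    return b"ph6" + body


def leading_ascii_is_stable(encode, label: str) -> bool:
    # Partial check of the requirement: the leading ASCII run must be
    # encoded identically whether or not non-ASCII characters follow it.
    n = 0
    while n < len(label) and ord(label[n]) < 128:
        n += 1
    return encode(label).startswith(encode(label[:n]))


if __name__ == "__main__":
    for encode in (utf8_encode, wrap_whole_label):
        print(encode.__name__, len(encode(LABEL)),
              leading_ascii_is_stable(encode, LABEL))
```

The first encoder passes the check and the second does not, since a
single non-ASCII character changes how every ASCII character in the
label is represented.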


Then, of course, it seems that the discussion about whether or not to
kludge around old standards will be unavoidable later on. I think
I already made my point on that a month ago...

/Magnus