
Re: [idn] Prohibiting characters in draft-ietf-idn-nameprep



At 4:11 PM -0400 8/15/00, Edmon wrote:
>I am sure Einstein will never ever agree that E=mc² is the same as E=mc2 !!!

Of course, but why is that relevant? We are talking about host names, 
not general canonicalization of all text.

>Form KC doesn't seem to make sense in the context of a "name" of which the
>DNS is about...
>
>I believe form C should be the choice...

To summarize the previous discussion (which did not come to 
consensus, I believe):

- Form C preserves the uniqueness of characters, some of which are 
visually indistinguishable from each other. This, in turn, causes 
surprise when a user asks for a name with a character such as U+F900 
and is told that there is no such host because the host was registered 
with U+8C48, which looks identical to U+F900.

- Form KC loses the uniqueness of some characters whose compatibility 
decomposition is not as clear (such as in the example you give of 
U+00B2, superscript 2), but causes less surprise when a user enters a 
compatibility character and it is normalized to a single character.
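
For the superscript 2 case, the difference between the two forms can be 
sketched roughly as follows. This uses Python's standard unicodedata 
module purely as an illustration; the helper name normalize_label and 
the sample names are assumptions made for this example, not anything 
taken from the draft.

```python
import unicodedata


def normalize_label(label: str, form: str) -> str:
    """Return the label normalized with the given Unicode form ("NFC" or "NFKC")."""
    return unicodedata.normalize(form, label)


# Hypothetical sample names: "mc" followed by U+00B2 SUPERSCRIPT TWO,
# and the same name spelled with the ordinary digit 2.
superscript_name = "mc\u00b2"
plain_name = "mc2"

# Form C leaves the compatibility character alone, so the names stay distinct.
form_c_matches = normalize_label(superscript_name, "NFC") == plain_name

# Form KC folds U+00B2 to the digit 2, so the names become the same.
form_kc_matches = normalize_label(superscript_name, "NFKC") == plain_name

print(form_c_matches, form_kc_matches)
```

Under form C the two spellings would be two different host names; under 
form KC they would be one.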

The other option for processing, which wasn't popular, is to prohibit 
on input the compatibility characters that are "more" ambiguous and 
then use form C so that the others (such as U+00B2) pass through. 
Many folks would argue that superscript 2 looks too much like digit 2 
to want this, but it is certainly doable.
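
A minimal sketch of that third option, again in Python and again only 
as an illustration. The PROHIBITED set and the prepare_name function are 
hypothetical: nothing in this thread says which compatibility 
characters would count as "more" ambiguous, so the two listed here are 
placeholders.

```python
import unicodedata

# Hypothetical prohibited set, chosen only for illustration:
# U+FB01 LATIN SMALL LIGATURE FI and U+FF21 FULLWIDTH LATIN CAPITAL LETTER A.
PROHIBITED = {"\ufb01", "\uff21"}


def prepare_name(name: str) -> str:
    """Reject prohibited characters, then apply form C to what remains."""
    bad = [c for c in name if c in PROHIBITED]
    if bad:
        listed = ", ".join(f"U+{ord(c):04X}" for c in bad)
        raise ValueError(f"prohibited character(s) in name: {listed}")
    # Characters not on the prohibited list, such as U+00B2, pass through
    # form C unchanged.
    return unicodedata.normalize("NFC", name)
```

With this arrangement a name containing U+00B2 is accepted as is, while 
a name containing one of the listed characters is refused on input 
rather than silently mapped to something else.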

--Paul Hoffman, Director
--Internet Mail Consortium