
Re: [idn] Adding "optional" characters in draft-ietf-idn-nameprep



Patrik Fältström wrote:
> 
> At 09.53 +0800 00-08-14, James Seng wrote:
> >No, I don't believe UTR15 handles what we are discussing here. Please
> >read UTR15 and come back to read this thread again.
> 
> If TR#15 doesn't then we have 2 options:
> 
> - Ask the Unicode people to please resolve this issue as well, and we
> will only use whatever TR#15 talks about. This means that optional
> characters in Hebrew will be significant in DNS names -- as they are
> according to Unicode specifications.
> 
> - Do our own extension to TR#15 for Hebrew and other scripts which
> "are missing" from TR#15.

Patrik, et al,

I don't think folding vowels/diacritics should be part of normalization,
so it wouldn't make sense for UTR #15 to cover it.  I see the definitions
of normalization and canonicalization as:

normalization - transforming multiple code representations of a single
character into one conforming code representation.

canonicalization - transforming multiple character representations of a
single entity into one conforming character representation.

I don't know whether others on this list are using the same definitions
I am, however.  To me, case folding and Hebrew vowel removal are part of
canonicalization: a human can actually see the difference between the
character string before canonicalization and the one after.  In
normalization, only the code values underlying the characters change,
so a human cannot see the difference between normalized and
non-normalized strings.
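
To make the distinction concrete, here is a minimal sketch in Python.
It is only an illustration under stated assumptions: the function names
normalize_form, is_hebrew_mark and canonicalize are hypothetical, the
choice of NFC is arbitrary, and treating every nonspacing mark in the
Hebrew block as "optional" is a simplification, not something UTR #15
or the nameprep draft specifies.

```python
import unicodedata


def normalize_form(s: str) -> str:
    """Normalization: map different code sequences for the same
    character onto one sequence (NFC is assumed here)."""
    return unicodedata.normalize("NFC", s)


def is_hebrew_mark(ch: str) -> bool:
    """Assumption for illustration: any nonspacing mark (category Mn)
    in the Hebrew block U+0590..U+05FF counts as an optional point."""
    return "\u0590" <= ch <= "\u05FF" and unicodedata.category(ch) == "Mn"


def canonicalize(s: str) -> str:
    """Canonicalization: map different character strings for the same
    entity onto one string (case folding plus Hebrew point removal)."""
    folded = normalize_form(s).casefold()
    stripped = "".join(ch for ch in folded if not is_hebrew_mark(ch))
    return normalize_form(stripped)


# Normalization: both strings look identical to a reader,
# only the underlying code values differ.
composed, decomposed = "\u00e9", "e\u0301"
print(normalize_form(composed) == normalize_form(decomposed))  # True

# Canonicalization: the pointed and unpointed spellings look different
# to a reader, but are mapped to the same string.
pointed = "\u05E9\u05C1\u05B8\u05DC\u05D5\u05B9\u05DD"
unpointed = "\u05E9\u05DC\u05D5\u05DD"
print(canonicalize(pointed) == unpointed)  # True
```

Normalization alone leaves "Example" and "example" distinct, and leaves
the Hebrew points in place; only the canonicalization step merges them.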

Andrea
-- 
Andrea Vine, avine@eng.sun.com, iPlanet i18n architect
"In these Regulations any reference to a regulation is a 
reference to a regulation of these Regulations"
-- Education (UK Student Loans) Regulations 1997