Re: [idn] Comments on IDNA/stringprep/nameprep
The Unicode Consortium debated for a long time whether to add a
canonical decomposition from <gg> to <g><g>. The deciding feedback came
from the Korean national body at the Seoul SC2/WG2 meeting, which said
it should not be done; that it was akin to canonically decomposing "w"
to "vv". They also objected to combinations like <gs> being
canonically decomposed, principally so that modern syllables could
always be decomposed into 3 pieces. The (weaker) compatibility
decompositions were in Unicode until the time that NFC was formed; those
were removed because they would have prevented the formation of Hangul
Syllables in NFKC.
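
To make the current state of the data concrete, here is a minimal sketch using Python's standard unicodedata module (my choice of tool for illustration, not something used in this thread). The helper names are hypothetical. It checks that <gg> (U+1101) has neither a canonical nor a compatibility decomposition, that <g><g> is not composed by NFC or NFKC, and that a precomposed syllable ending in <gs> decomposes into 3 pieces with <gs> left whole.

```python
import unicodedata

G = "\u1100"         # HANGUL CHOSEONG KIYEOK, written <g> above
GG = "\u1101"        # HANGUL CHOSEONG SSANGKIYEOK, written <gg> above
SYLLABLE = "\uAC03"  # precomposed syllable <g><a><gs>


def codepoints(s):
    """Return the code points of s in U+XXXX notation."""
    return [f"U+{ord(c):04X}" for c in s]


def decompose(s, form="NFD"):
    """Code points of s after the given decomposition form (NFD or NFKD)."""
    return codepoints(unicodedata.normalize(form, s))


def compose(s, form="NFC"):
    """Code points of s after the given composition form (NFC or NFKC)."""
    return codepoints(unicodedata.normalize(form, s))


# <gg> stays a single letter under both decomposition forms.
print(decompose(GG, "NFD"), decompose(GG, "NFKD"))

# <g><g> stays two letters under both composition forms.
print(compose(G + G, "NFC"), compose(G + G, "NFKC"))

# The syllable splits into leading consonant, vowel, trailing consonant;
# the trailing <gs> (U+11AA) is not split further.
print(decompose(SYLLABLE))
```

The last line prints ['U+1100', 'U+1161', 'U+11AA'], i.e. exactly 3 pieces for this syllable.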
Γνῶθι σαυτόν — Θαλῆς
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]
----- Original Message -----
From: "Soobok Lee" <firstname.lastname@example.org>
To: "Kent Karlsson" <email@example.com>; "'Erik Nordmark'"
Sent: Tuesday, February 12, 2002 18:06
Subject: Re: [idn] Comments on IDNA/stringprep/nameprep
> Thanks, Kent.
> ----- Original Message -----
> From: "Kent Karlsson" <firstname.lastname@example.org>
> > > > Even though e.g. [gg] and [g][g] (there are a few hundred
> > > > are not canonically or compatibility equivalent, they still
> > > > the same sequence of Hangul letters, and thus "mean" the same.
> > >
> > > Yes, same argument is used for SC/TC needing to be addressed in
> > No, no, no!! This issue is comparable to the *canonical*
> > that already exist for Hangul syllable characters, and for other
> > characters that have a canonical decomposition (some "double latin
> > letters" have compatibility decompositions, but the relationship
> > is much stronger; and it is much much stronger than case
> > Unfortunately, due to historic events, that equivalence is no
> > recorded in Unicode 3.0 and later property data.
> > This is in no way comparable to the SC/TC issue which is a
> > preference issue, where the "spellings" are actually different.
> > Here it is just about the underlying representation for the
> > spelling (in terms of sequence of letters; there is not even any
> > case difference or font variant difference [for correctly
> > fonts that cover Hangul]).
> True. the canonical equivalence between [gg] and [g][g] is defined
> Unicode 3.0. They should have been unified by NFC, but haven't
> Too late to be changed, and it should be solved in new normalizatio
> But If applications use the new normalization before nameprep,
> As I warned in the last call comments, the following condition will
> triggered silently,
> stringprep(newnormalization(Hangul)) != stringprep(Hangul)
> If stringprep would be neutral to new normalization adopted by
> stringprep should be perfect and inclusive of all kinds of mature
> that is, the universal set of all kinds of normalizations built
> Applications implementors should be cautious when applying
> to data/texts portions that contain IDN. If some applications have already adopted some
> normalization forms that are not compatible with stringprep as above,
> backward compatibility requirements are not met in that case.
> IDNA's backward compatibility claim doesn't come without costs.
> Don't build our grand castle on the moving sand dune, on which a tiny tent is a more adequate
> and wise choice. :-)
> Soobok Lee
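
The inequality quoted above, stringprep(newnormalization(Hangul)) != stringprep(Hangul), can be illustrated with a small sketch. Everything here is an assumption made for illustration: new_normalization is a hypothetical rule that folds <g><g> into <gg>, and prep stands in for only the NFKC normalization step of nameprep, not the full stringprep profile.

```python
import unicodedata

G = "\u1100"   # HANGUL CHOSEONG KIYEOK, <g>
GG = "\u1101"  # HANGUL CHOSEONG SSANGKIYEOK, <gg>
A = "\u1161"   # HANGUL JUNGSEONG A, <a>


def new_normalization(s):
    """Hypothetical future normalization: unify <g><g> with <gg>."""
    return s.replace(G + G, GG)


def prep(s):
    """Stand-in for nameprep's normalization step (NFKC only)."""
    return unicodedata.normalize("NFKC", s)


label = G + G + A  # the letter sequence <g><g><a>

before = prep(label)                    # <g> followed by the syllable <g><a>
after = prep(new_normalization(label))  # the single syllable <gg><a>

print([f"U+{ord(c):04X}" for c in before])
print([f"U+{ord(c):04X}" for c in after])
print(before == after)
```

Under these assumptions the two results differ (U+1100 U+AC00 versus U+AE4C), so a name normalized by the hypothetical form before nameprep would no longer match the same name passed through nameprep directly.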