
Re: [idn] Re: IDNA: is the specification proper, adequate, and complete?



I don't understand why there is all this fuss concerning the various
foldings that Nameprep performs.

The Unicode standard says that two canonically equivalent strings
should be treated the same (either one should work in any place that
the other works).  IDNA is simply honoring that recommendation.  If it
didn't normalize strings before encoding them or comparing them, then
canonically equivalent strings wouldn't get treated the same.  Canonical
equivalence deals with things like "a with grave" versus "a" followed by
"combining grave".
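To make the "a with grave" case concrete, here is a minimal sketch using Python's standard unicodedata module. It only illustrates canonical equivalence; it is not Nameprep itself, and the helper name canonically_equal is made up for this example.

```python
import unicodedata

def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after canonical normalization (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

precomposed = "\u00e0"   # LATIN SMALL LETTER A WITH GRAVE
decomposed = "a\u0300"   # "a" followed by COMBINING GRAVE ACCENT

# The raw code point sequences differ, but the normalized forms do not.
print(precomposed == decomposed)                   # raw comparison
print(canonically_equal(precomposed, decomposed))  # normalized comparison
```

Without the normalization step, the two spellings would be encoded and compared as different names.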

The Unicode standard also defines compatibility equivalence, and says
that compatibility characters are characters that didn't really deserve
their own code points, but were reluctantly given their own code points
for the sole purpose of preserving round-trip conversions to legacy
charsets.  Nameprep respects this reluctance by using NFKC rather
than NFC, so that the compatibility characters are folded into their
equivalents.
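The difference between NFC and NFKC can be sketched the same way. The two sample characters (the "fi" ligature and fullwidth "A") are chosen here only as illustrations, the helper name fold_compat is hypothetical, and Python's unicodedata tables are not necessarily the Unicode version that Nameprep specifies.

```python
import unicodedata

def fold_compat(s: str) -> str:
    """Fold compatibility characters into their equivalents (NFKC)."""
    return unicodedata.normalize("NFKC", s)

ligature_fi = "\ufb01"   # LATIN SMALL LIGATURE FI
fullwidth_a = "\uff21"   # FULLWIDTH LATIN CAPITAL LETTER A

# NFC leaves compatibility characters alone; NFKC replaces them.
nfc_result = unicodedata.normalize("NFC", ligature_fi)
nfkc_result = fold_compat(ligature_fi)
print(nfc_result == ligature_fi, nfkc_result, fold_compat(fullwidth_a))
```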

The Unicode standard also defines a locale-independent case folding
algorithm explicitly intended for doing case-insensitive comparisons,
which is exactly what IDNA needs it for.  German sharp s is folded to ss
because a case-insensitive comparison must match German sharp s with SS
(the latter is the normal uppercase form of the former), and must match
SS with ss (the latter is the normal lowercase form of the former), and
so by transitivity it must match German sharp s with ss.
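The transitivity argument for sharp s can be checked with Python's str.casefold(), which implements Unicode full case folding. This is a sketch of the comparison only, not of the Nameprep mapping tables, and caseless_match is a name invented for the example.

```python
def caseless_match(a: str, b: str) -> bool:
    """Locale-independent case-insensitive comparison via case folding."""
    return a.casefold() == b.casefold()

sharp_s = "\u00df"   # LATIN SMALL LETTER SHARP S

# Uppercasing sharp s gives "SS", and lowercasing "SS" gives "ss",
# so a consistent caseless comparison has to equate all three.
print(sharp_s.upper(), "SS".lower())
print(caseless_match(sharp_s, "SS"),
      caseless_match("SS", "ss"),
      caseless_match(sharp_s, "ss"))
```

Note that simply lowercasing both sides would not be enough: lower() leaves sharp s unchanged, so it would fail to match "ss".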

John C Klensin <klensin@jck.com> wrote:

> For example, one could say, and I think we essentially have, that the
> WG is solving the problem of getting things into and out of the DNS
> given that the Unicode coding form is accurately known.

Yes (but not just DNS, all existing protocols that use domain names).

> if the WG's position and recommendations are based on that model,
> we should be obligated to write it down and make it explicit in our
> documents before they go onto the standards track: we owe that much
> to those who think we are solving any of a number of more general
> internationalization problems.

Suggested text addressing that concern was posted to the list a few
weeks ago; see message <iluelfsdnh0.fsf@latte.josefsson.org>.

AMC