
[idn] Fw: BALLOT on Five Canonical Mapping Errors



(Forwarded from another list)

> Patrik, the consortium does take seriously the relationship with the IETF.
> Were it not for that, there probably would not have been a letter ballot
> in the first place -- it would have simply been dealt with in the last
> meeting.
>
> As the discussion on this list and on
> http://www.imc.org/idn/mail-archive/threads.html attests, this is not a
> simple decision; there are good arguments on both sides.
>
> I myself favor option B -- I think it will cause the least disruption
> overall, especially since those 5 characters are rare, and the mapping
> tables from CNS 11643 to those characters are not widely deployed (and one
> already has to handle cases of variant mapping tables: the JIS characters
> that vary in mappings between different operating systems are a heck of a
> lot more common!)
>
> On the other hand, should the UTC decide to go with option A, the
> NormalizationCorrections.txt file does allow IDN to migrate to future
> versions of Unicode while being strictly backwards compatible. And because
> the mappings are one-way, to characters that themselves do not have
> canonical/compatibility mappings in either case, an implementation can
> *still* make use of a stock version of NFKC or NFC to do mappings. The
> implementation just needs to have an additional preprocessing step,
> mapping those 5 characters according to NormalizationCorrections.txt
> before applying normalization. (We should make this point clear somewhere
> in the documentation of that file.)
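
The preprocessing step described above can be sketched in Python roughly as
follows. This is only an illustration under stated assumptions, not code from
the UTC or from any IDN draft: the function name normalize_with_corrections
and the corrections table are hypothetical, and the sample entry uses a
private-use code point as a stand-in rather than any of the 5 actual
characters. A real table would have to be built from
NormalizationCorrections.txt itself.

```python
import unicodedata


def normalize_with_corrections(text, corrections, form="NFKC"):
    """Remap selected characters one-way, then apply stock normalization.

    `corrections` is a hypothetical dict from a single character to its
    replacement string. The approach relies on the replacement characters
    having no canonical/compatibility mappings of their own, so the stock
    normalizer leaves them untouched.
    """
    table = str.maketrans(corrections)
    remapped = text.translate(table)          # preprocessing step
    return unicodedata.normalize(form, remapped)  # unmodified NFKC or NFC


# Placeholder table: U+E000 (private use) stands in for a corrected
# character. These are NOT the real entries from the corrections file.
PLACEHOLDER_CORRECTIONS = {"\ue000": "A"}

if __name__ == "__main__":
    # The placeholder is remapped first; the "fi" ligature (U+FB01) is
    # then handled by ordinary NFKC.
    print(normalize_with_corrections("\ue000\ufb01", PLACEHOLDER_CORRECTIONS))
```

The point of the design is that the normalizer itself stays stock; only the
small remapping table has to be maintained alongside it.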
>
> So having NormalizationCorrections.txt does ameliorate the problem. I agree
> with you and others that it does not *eliminate* the problem (that's why I
> personally believe that B is the better course), but in practice either A
> or B is a tenable solution.
>
> Mark