
Re: [idn] case folding




>> >The character repertoire that is allowed in IDN is exactly that of ISO 
>> >10646 at the time we finish IDN. There is a single case-mapping table, a 
>> >single canonicalization table, and so on, at that point.
>> 
>> this requires that the time we finish IDN is significantly far out into the 
>> future, since ISO has so far not come to agreement on a single case-mapping 
>> table.
>
>What about saying that the only case folding is [A-Z][a-z], for backward
>compatibility with present DNS? 
>

Case insensitivity is a very important matter for many. I cannot accept
that only those who need nothing beyond ASCII get case insensitivity.

Requiring everything to be in lower case (or upper case) is not a good
solution either. Different case is used to make things more distinct and
may be an important part of a trademark.

I think we must have case insensitivity for all characters where case
exists. There are not very many of them in UCS. The major difficulties
I have seen are the Turkish I and the German double S. Mapping the Turkish
I to the dotless lower-case i is not possible today, because the current
ASCII I is defined to lower-case to i. If UCS had a separate code point
for the Turkish I, it would be possible. I think we can fairly easily
define how case insensitivity is done for UCS, and it will be OK for most
people. I think we should be able to define this quickly.
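
To make the two problem cases concrete, here is a rough sketch. It
assumes Python's built-in str.casefold() as a stand-in for whatever
folding table is finally agreed on; the helper names fold and
fold_turkish are made up for illustration and are not a proposal.

```python
# Sketch only: str.casefold() stands in for the folding table IDN would adopt.

def fold(label: str) -> str:
    """Locale-independent case folding of one label (illustrative helper)."""
    return label.casefold()


def fold_turkish(label: str) -> str:
    """Hypothetical Turkish-aware variant.

    Maps ASCII I to dotless i (U+0131) and dotted capital I (U+0130) to
    plain i before folding. It needs out-of-band language knowledge,
    because the code point for I is shared with every other language.
    """
    return label.replace("I", "\u0131").replace("\u0130", "i").casefold()


# German double S: sharp s (U+00DF) and "SS" fold to the same string.
assert fold("STRASSE") == fold("stra\u00dfe") == "strasse"

# ASCII I always folds to dotted i, so dotless i never matches it.
assert fold("I") == "i"
assert fold("\u0131") != fold("I")

# Only the language-specific variant gives the Turkish result.
assert fold_turkish("ISPARTA") == "\u0131sparta"
```

The last two checks show the conflict: a single table cannot give both
the ASCII-compatible and the Turkish result for the same code point.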

But there is another area we are forgetting. Many non-Latin alphabets
have no case on a letter, but they do have different forms (like
half width, double width, final, ...) that for their users should compare
as equivalent, just as case should compare insensitively for us
Latin-alphabet users. As I do not use these languages, I do not know how
hard it is to define equivalence matching rules for them, but this may
be a difficult area (and it must be done before, for example, Arabic or
Chinese names can be used).
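
As a rough illustration of form equivalence, the sketch below assumes
that Unicode compatibility normalization (NFKC, via Python's
unicodedata module) followed by case folding is an acceptable matching
key. That is only an assumption for the example, and the name
equivalence_key is made up. It covers width variants and Arabic
presentation forms, but not every positional form in every script.

```python
import unicodedata


def equivalence_key(label: str) -> str:
    """Illustrative matching key: NFKC compatibility normalization, then case folding."""
    return unicodedata.normalize("NFKC", label).casefold()


# Double-width (fullwidth) Latin letters compare equal to ASCII.
assert equivalence_key("\uff21\uff22\uff23") == "abc"

# Half-width katakana KA (U+FF76) matches the ordinary katakana KA (U+30AB).
assert equivalence_key("\uff76") == "\u30ab"

# Arabic ALEF final presentation form (U+FE8E) matches plain ALEF (U+0627).
assert equivalence_key("\ufe8e") == "\u0627"

# Greek final sigma (U+03C2) folds to the regular small sigma (U+03C3).
assert equivalence_key("\u03c2") == "\u03c3"
```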

  Dan