
Re: [idn] case folding

To: Dan Oscarsson <Dan.Oscarsson@trab.se>
Subject: Re: [idn] case folding
From: James Seng <jseng@pobox.org.sg>
Date: Wed, 31 May 2000 08:06:31 +0800
CC: mau@beatles.cselt.it, idn@ops.ietf.org
Delivery-date: Tue, 30 May 2000 17:03:18 -0700
Envelope-to: idn-data@psg.com

Dan Oscarsson wrote:
> I think we must have case insensitivity for all characters where case exists.
> There are not very many of them in UCS. The major difficulties I have seen
> are the Turkish I and the German double S. Mapping Turkish I to dotless
> lowercase i is not possible today, because the current ASCII I is defined to
> lowercase to i. If UCS had a separate code point for Turkish I it would
> be possible. I think we can fairly easily define how case insensitivity
> is done for UCS, and it will be OK for most people.
> I think we should be able to define this quickly.

I suppose squeezing a separate Turkish I into Unicode is not possible :-)
(Anyone arguing that it looks just like Latin I should take a look at U+0410.)
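
To make the two problem cases concrete, here is a minimal sketch. Python and
its built-in str.casefold() are my choice for illustration only, and the
helper name fold() is hypothetical; nothing here is a proposal for how the
working group should define the mapping.

```python
import unicodedata


def fold(label: str) -> str:
    """Locale-independent case folding (hypothetical helper for illustration)."""
    return label.casefold()


# ASCII I always folds to dotted i, whatever the user's language.
print(fold("I") == "i")

# Turkish dotless i (U+0131) uppercases to ASCII I, but folding ASCII I
# never gets back to it, so the two names compare as different.
dotless_i = "\u0131"
print(dotless_i.upper() == "I")
print(fold(dotless_i) == fold("I"))

# German sharp s (U+00DF): folding expands it to "ss".
print(fold("Stra\u00dfe") == fold("STRASSE"))

# U+0410 looks like Latin A but has its own code point.
print(unicodedata.name("\u0410"))
```

The third comparison is the Turkish problem in a nutshell: a single
locale-independent table cannot make both "I to i" and "I to dotless i" hold.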
 
> But there is another area we are forgetting. In many non-Latin alphabets
> letters have no case, but they do have different forms (such as
> half width, double width, final, ...) that should compare as
> equivalent for their users, just as case should compare insensitively
> for us Latin alphabet users. As I do not use these languages I do not
> know how hard it is to define equivalence matching rules for them, but
> it may be a difficult area, and it must be settled before, for
> example, Arabic or Chinese names can be used.

Correct.
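
As a rough illustration of what such equivalence matching could look like,
the sketch below applies Unicode compatibility normalization (NFKC) and then
case folding. The function name canonical() and the choice of NFKC are my
assumptions for the example, not something agreed on this list.

```python
import unicodedata


def canonical(label: str) -> str:
    """Hypothetical canonical form: NFKC normalization, then case folding."""
    return unicodedata.normalize("NFKC", label).casefold()


# Fullwidth Latin letters (U+FF21..U+FF23) compare equal to plain ASCII.
print(canonical("\uff21\uff22\uff23") == canonical("abc"))

# Halfwidth katakana KA (U+FF76) maps to the ordinary katakana KA (U+30AB).
print(canonical("\uff76") == "\u30ab")

# Arabic SEEN in its final presentation form (U+FEB2) maps to SEEN (U+0633).
print(canonical("\ufeb2") == "\u0633")

# Greek final sigma (U+03C2) folds to the ordinary small sigma (U+03C3).
print(canonical("\u03c2") == canonical("\u03c3"))
```

This covers width and positional forms only. Chinese traditional and
simplified variants are not touched by any Unicode normalization form and
would need separate rules.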

I suppose we already have consensus that *some* canonicalization has to be
done. Case-sensitivity does not work in domain names or any naming system.

-James Seng
