RE: [idn] Prohibit CDN code points





> -----Original Message-----
> From: owner-idn@ops.ietf.org [mailto:owner-idn@ops.ietf.org]On 
> Behalf Of DougEwell2@cs.com
> Sent: Tuesday, January 22, 2002 11:08 PM
> To: idn@ops.ietf.org
> Cc: tsenglm@cc.ncu.edu.tw; paf@cisco.com; seki@jp.fujitsu.com
> Subject: Re: [idn] Prohibit CDN code points
> 
> 
> In a message dated 2002-01-22 1:56:56 Pacific Standard Time, 
> tsenglm@cc.ncu.edu.tw writes:
> 
> > TC/SC character equivalence mapping is similar to the mapping of UNICODE
> > Alphabet map it to its counterpart of ASCII alphabet.
> 
> No, it isn't.  Stop saying that.
> 
> ASCII uppercase/lowercase mapping is straightforward and unambiguous, and can
> be done one character at a time with NO lexical analysis (at least for 99% of
> all languages that use it; Turkish and Azeri do have exceptions).
> 
> TC/SC is NOT one-to-one for all characters.  It is for many, but nowhere near
> 99% or 95%.  If you implement any sort of TC/SC mapping you MUST figure out
> how to handle the many-to-one and one-to-many cases, and this is where we
> have all been balking.  Users will not understand or accept that "only some"
> of the TC and SC characters are mapped to each other.

In theory, 1-1 TC/SC mapping, ASCII uppercase/lowercase mapping, and any other
table mapping all use the same computing algorithm: a table lookup. From a
computer science point of view, they are the same algorithm.

Your question should be addressed to the source of the table (1-1 only), not to
the computing algorithm.
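
As a rough illustration of the point above, here is a minimal Python sketch in
which one lookup routine serves both an ASCII case-folding table and a TC/SC
table. The names map_chars and is_one_to_one and the tiny sample tables are
hypothetical, chosen only for illustration; they are not taken from any IDN
draft or real mapping table.

```python
import string


def map_chars(text, table):
    """Replace each character found in `table`; leave all others unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


def is_one_to_one(table):
    """Return True if no two keys share a target, so the table is invertible."""
    return len(set(table.values())) == len(table)


# ASCII case folding expressed as a table: 'A'..'Z' -> 'a'..'z'.
ASCII_FOLD = dict(zip(string.ascii_uppercase, string.ascii_lowercase))

# Hypothetical three-entry TC -> SC sample, restricted to 1-1 pairs.
TC_TO_SC_SAMPLE = {"國": "国", "學": "学", "馬": "马"}

# Adding 發 and 髮, which both simplify to 发, makes the table many-to-one.
TC_TO_SC_WITH_MERGE = {**TC_TO_SC_SAMPLE, "發": "发", "髮": "发"}

print(map_chars("Example.COM", ASCII_FOLD))    # example.com
print(map_chars("國學", TC_TO_SC_SAMPLE))       # 国学
print(is_one_to_one(TC_TO_SC_SAMPLE))          # True
print(is_one_to_one(TC_TO_SC_WITH_MERGE))      # False
```

The lookup code is identical in both cases; only the table contents change,
and whether a given table stays 1-1 depends entirely on which entries its
source admits.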

Kenny Huang