
Re: My prod at IDN requirements



At 11:04 00/01/04 +0800, James Seng wrote:

> > Others are MUCH better than me in compiling example cases and requirements
> > for Korean, Japanese, Thai, Arabic, Hebrew.....
> 
> in addition, there are also languages which have other problems with folding.
> chinese, for example, has simplified & traditional glyphs which mean the same
> thing and are used in the same way, but are given different code points.
> 
> japanese kanji also have traditional & simplified glyphs but these are usually
> treated differently. or at least that is what i have been told.
> 
> this will be a problem if ISO 10646 is used. because of the CJK unification
> (arggh who is the idiot?), japanese & chinese fall under the same U+4E00 code
> space. if one folds and the other does not, i think it is fairly obvious how
> messy it is going to be.

CJK unification is really not to blame for this. Even for Chinese alone,
there are many cases where the mapping between simplified and traditional
characters is not one-to-one, so the solution has to be thought through
carefully.
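
To make the "not one-to-one" point concrete, here is a minimal sketch in
Python. The table is a hand-picked illustration with three entries, not a
complete or authoritative mapping, and the names (SIMPLIFIED_TO_TRADITIONAL,
fold_to_simplified, traditional_candidates) are made up for this example.

```python
# Illustrative entries only; a real table would be far larger.
# Each simplified character maps to the traditional characters it can stand for.
SIMPLIFIED_TO_TRADITIONAL = {
    "\u53d1": ["\u767c", "\u9aee"],            # U+53D1 -> U+767C (to emit), U+9AEE (hair)
    "\u5e72": ["\u4e7e", "\u5e79", "\u5e72"],  # U+5E72 -> U+4E7E (dry), U+5E79 (trunk), U+5E72 itself
    "\u56fd": ["\u570b"],                      # U+56FD -> U+570B (country), a one-to-one case
}

# Reverse direction: every traditional form folds to exactly one simplified form.
TRADITIONAL_TO_SIMPLIFIED = {
    trad: simp
    for simp, trads in SIMPLIFIED_TO_TRADITIONAL.items()
    for trad in trads
}


def fold_to_simplified(text):
    """Fold each character to its simplified form; unknown characters pass through."""
    return "".join(TRADITIONAL_TO_SIMPLIFIED.get(ch, ch) for ch in text)


def traditional_candidates(ch):
    """Return every traditional character a simplified character may represent."""
    return SIMPLIFIED_TO_TRADITIONAL.get(ch, [ch])


# Two distinct traditional characters collapse to the same simplified one,
# so the fold cannot be undone without knowing the intended word.
same_after_fold = fold_to_simplified("\u767c") == fold_to_simplified("\u9aee")
ambiguous = len(traditional_candidates("\u53d1")) > 1
```

Folding towards simplified is well defined in this sketch but loses
information; going the other way needs context that a per-character table
does not have.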


> korean hangul if i am not wrong does not suffer from this problem :-) it is a
> very clean and well-designed language. 

Hangul is a script, not a language. As a script, it is indeed very
well designed. Unfortunately, its regularity on many levels leads
to many 'obvious' ways of encoding it. The traces of that can be
found in current (and past) versions of Unicode. However, most
of the problems (except for characters only appearing in historical
documents) can be dealt with by using Unicode Normalization Form C.
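
As a small illustration of the Normalization Form C point, the sketch below
uses Python's standard unicodedata module. The sample syllable and the helper
name to_nfc are choices made for this example only. It shows one syllable
written two ways, as a single precomposed code point and as a sequence of
three conjoining jamo, and that both are equal after NFC.

```python
import unicodedata

# The same Hangul syllable encoded in two different ways.
precomposed = "\ud55c"             # U+D55C HANGUL SYLLABLE HAN
conjoining = "\u1112\u1161\u11ab"  # U+1112 choseong hieuh, U+1161 jungseong a, U+11AB jongseong nieun


def to_nfc(text):
    """Return text in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


# As raw code point sequences the two spellings differ...
raw_equal = precomposed == conjoining

# ...but NFC composes the jamo sequence into the single syllable code point.
nfc_equal = to_nfc(precomposed) == to_nfc(conjoining)
```

Comparing names only after passing both sides through the same normalization
step is the usual way to make such spellings match.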


Regards,   Martin.


#-#-#  Martin J. Dürst, World Wide Web Consortium
#-#-#  mailto:duerst@w3.org   http://www.w3.org