Re: [idn] Comments on protocol drafts



At 01:14 15-02-00, James Seng wrote:
>RJ Atkinson wrote:
> > All dialects of Chinese use the same written form,
> > mentioned primarily for those without familiarity
> > with Chinese.
>
>Almost the same.

The written form is the same for all dialects of Chinese.

>Hong Kong Cantonese have some special ideogram which exists in BIG5-HK but not
>in Unicode. Similarly, Taiwanese have some phonetic glyphs which exist in
>BIG5-TW but not in Unicode again. All of them are localised but very
>important, at least to those who use it in Taiwan and Hong Kong. 

This is a separate matter from written form: it speaks to the completeness of a
character set encoding.  That is useful to note, though devising a character set
encoding is out of scope here.

We agree that both Cantonese and Taiwanese have their own additional characters,
above and beyond the set used for Mandarin.  The Cantonese characters missing
from ISO 10646 (and also from Unicode), at least, are important (e.g. some are used
in daily newspapers in HK).  I am not as well informed about the Taiwanese characters.

>Standard Mandarin have about few hundred thousand ideograms which does not
>exist on any commonly used character sets, including Unicode. Hopefully these
>can be taken care of in UCS-4.
>
> > Lets please move past this, but lets also remember that native
> > users of Asian languages dwarf native users of European
> > languages and try to find a language neutral (rather than merely
> > European) approach to this issue.
> >
> > I'll also note that ISO-8859 and UTF-8 do not support all European languages
> > equally well, nor does either support other Romanised non-European languages
> > (e.g. Vietnamese) equally well.
>
>Similar, there are other localized encodings used for European languages other
>than the standard locale ISO8859 and CP125X. It appears I18N remains an
>unfulfillable ideal whereby nothing right now can satisfy everyone. :-)
>
>But having said these, while these are obvious problems to I18N,
>unfortunately, 'fixing' these problems are beyond the scope of the WG.

I think you have missed my point.  The critical point is that we should
be seeking a language-neutral approach to this problem.  An approach
that works well for either US-ASCII or ISO-8859-X (for any value of X),
but not nearly as well for other languages (whether Romanised, such as
Vietnamese, or ideographic, such as Chinese) is undesirable and SHOULD
be avoided.  (IMHO, this should be added to the requirements document.)

Ran
rja@inet.org