
Re: [idn] Re: peanut gallery



I am conveying some background information from John Jenkins (who is on
the IRG), slightly condensed:

==========

1) Presence in the G source is *not* an indication of being either
"Chinese" or "simplified Chinese."  The G source is the source which
was used to include all of the KangXi, for example, which is hardly
"simplified Chinese."  It also contains some traditional characters
used for Cantonese, as well as Korean characters.

The IRG sources are indications of who *asked* for a character to be
included, and *not* an indication of "what kind of character" is
involved. Excluding G-source-only characters on the presumption that
they're SC would be a mistake.
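
To make the rule under discussion concrete, here is a minimal Python
sketch of a "G-source-only" test. It assumes input in the tab-separated
Unihan style (code point, field name, value) with IRG source fields named
like kIRG_GSource. The sample records, the private-use code points in
them, and the helper names irg_sources and g_source_only are invented
for illustration and are not real Unihan data. The sketch shows only
which code points the rule would select; as noted above, that selection
says nothing about whether a character is simplified.

```python
from collections import defaultdict

# Invented sample records (assumption: tab-separated Unihan-style lines).
# Private-use code points are used so no claim is made about real characters.
SAMPLE = """\
# code point <TAB> field <TAB> value
U+E000\tkIRG_GSource\tG-sample-1
U+E000\tkIRG_TSource\tT-sample-1
U+E000\tkIRG_JSource\tJ-sample-1
U+E001\tkIRG_GSource\tG-sample-2
U+E002\tkIRG_KSource\tK-sample-1
"""


def irg_sources(text):
    """Map each code point to the set of IRG source letters it lists."""
    sources = defaultdict(set)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        code_point, field, _value = line.split("\t", 2)
        if field.startswith("kIRG_") and field.endswith("Source"):
            # "kIRG_GSource" -> "G"
            sources[code_point].add(field[len("kIRG_"):-len("Source")])
    return dict(sources)


def g_source_only(sources):
    """Return the code points whose only IRG source is G, sorted."""
    return sorted(cp for cp, tags in sources.items() if tags == {"G"})


if __name__ == "__main__":
    print(g_source_only(irg_sources(SAMPLE)))
```

The result is a list of who asked for each character, nothing more;
deciding whether any selected character is SC or TC still needs
knowledge of the individual character.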

2) There is still the assumption being made that one can look at a
character and say, "Ah, yes, this is SC" or "Ah, yes, this is TC."  It
is impossible to separate Chinese from Japanese and Korean cleanly
since they use the same characters. Also, given the significant
percentage of characters which are *both* traditional forms in their
own right *and* simplifications of other characters, the whole process
is extremely problematic.

You *cannot* extrapolate from the 10646 data, the IRG data, or presence
in charset mappings whether a particular ideograph is SC or TC.  You
must have knowledge of the individual characters.

Mark

—————

Πόλλ’ ἠπίστατο ἔργα, κακῶς δ’ ἠπίστατο πάντα — Ὁμήρου Μαργίτῃ
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]

http://www.macchiato.com

----- Original Message -----
From: "Adam M. Costello" <idn.amc+0@nicemice.net.RemoveThisWord>
To: <idn@ops.ietf.org>
Sent: Monday, February 04, 2002 21:33
Subject: Re: [idn] Re: peanut gallery


> I wrote:
>
> > Here's a more precise version of the proposal:  Prohibit a Han code
> > point iff it has a kTraditionalVariant and has only "G" sources
> > (China/Singapore).  Would that do what I intend?
>
> "Mark Davis (jtcsv)" <mark.davis@jtcsv.com> replied:
>
> > No, that wouldn't do as you intend. The kTraditionalVariant is not a
> > normative field... while it has improved over time, I would not put
> > any real weight on it without a thorough review.
>
> Okay then, how about this:  Prohibit a Han code point iff it has only
> "G" sources (China/Singapore).
>
> That might prohibit a few characters unnecessarily, but it will make
> sure that Taiwan, Japan, and Korea are able to use all their characters,
> and will leave the maximum flexibility for China & Singapore to define
> how to fold the remaining characters if they decide that's what they
> want to do.
>
> I wouldn't recommend this course, but if most of the Chinese community
> wanted to do this, I don't see why the rest of us should object.
>
> AMC
>
>