
Re: Inputting mixed SC/TC (Re: [idn] A question...)



Dear Chun-Hsin Wu,
	Thank you for your message and your New Year's greetings.
http://www.iis.sinica.edu.tw/~wuch/idn/examples/mixinput.htm gives a
very visually powerful demonstration of the problem the working group
has debated over these many months, and I have no doubt that it is useful
for many members of the working group to see some of the character
mixtures rather than hear them described.  Thank you for your efforts.
	Like others in this forum, I am an "inexperienced"
Chinese-system user (I use an English-language system with two Chinese
language IMEs installed, much like the system James described below).
Even in that context, it has been clear to me, and to many in the
group, that the TC/SC problem and its concomitant issues are very
real.  We do not dispute the problem.
	We are faced, then, with the following situation: after
receiving a great deal of expert advice, the working group has come to
believe that the problem of character equivalence cannot be solved
within the context of this working group.  Based on that belief, we
can do one of two things:

1) stop, and declare that the problem we set out to solve (increasing
the repertoire of available characters) should not be attempted until
the character equivalence problem can be solved.

2) explicitly acknowledge that we have not solved the character
equivalence problem but increase the repertoire of available
characters, allowing others to limit by policy which characters may be
used in specific domains.
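
	To make the second option concrete, here is a minimal sketch of
what "limiting by policy" could look like for one zone.  It is only an
illustration: the zone name, the permitted set, and the function name
are hypothetical and do not come from any draft or registry.

```python
# Hypothetical per-zone policy: the code points a registry chooses to permit.
# The zone name and the permitted set are illustrative assumptions only.
ALLOWED = {
    "example-tc-zone": {0x81FA, 0x7063, 0x5B78},  # 臺 灣 學 (traditional forms)
}


def label_permitted(zone: str, label: str) -> bool:
    """Return True if label is non-empty and uses only permitted code points."""
    allowed = ALLOWED.get(zone, set())
    return bool(label) and all(ord(ch) in allowed for ch in label)


# A label using only the permitted traditional forms passes the policy.
print(label_permitted("example-tc-zone", "\u81fa\u7063"))
# A label mixing in the simplified form U+53F0 is rejected by this policy.
print(label_permitted("example-tc-zone", "\u53f0\u7063"))
```

	A policy of this kind restricts what may be registered in one
zone; it does not make the traditional and simplified forms equivalent.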

	As the chairs and others have repeatedly noted, a solution to
the problem of character equivalence in the context of Unicode and
Unicode's CJK unification is very difficult, and we have no single
entity who is authoritative for Chinese, Japanese, and Korean on whom
to rely.  If we select choice one, we cannot have any firm or
reasonable expectation for the timing of the emergence of a standard.
It would also have to wait for all of the other, similar character
equivalence problems in Unicode to reach resolution.  This means that
the repertoire would remain that of US-ASCII for the foreseeable
future, with all of the drawbacks that implies.

	We have heard significant objections in the last call period
that the work involved in limiting characters by policy will be
onerous and that it will not solve all of the problems of character
equivalence for the Chinese language community.  That is a substantive
objection.  The chairs' response could be expressed as: "given your
objection, can you present a solution for *CJK* character equivalence
*in Unicode* at this time?"  

	Absent a realistic answer to that question, the group seems
likely to adopt some variation of 2) above.  The other choice,
to continue to limit the repertoire to US-ASCII, seems untenable.
That may reflect a bias on the group's part toward reaching some resolution
to the character repertoire problem, but it also seems like a fairly
reasonable response; given the choice between having eight potential
character groupings for "qingzhenjiao" and zero, it has chosen eight.
If you would prefer zero, with no time horizon for anything better,
then this is the period to make that preference known.
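
	For those who want to see where a number like eight comes from:
three characters with two interchangeable forms each give 2 x 2 x 2
spellings.  The sketch below enumerates them.  The variant pairs are
my own assumption for illustration and are not taken from the linked
examples page.

```python
from itertools import product

# Each tuple holds two interchangeable forms of one character.
# These pairs are assumed for illustration; they are not copied from the
# examples page cited in this thread.
VARIANT_PAIRS = [
    ("\u6e05", "\u6df8"),  # qing
    ("\u771f", "\u771e"),  # zhen
    ("\u6559", "\u654e"),  # jiao
]


def variant_spellings(pairs):
    """Return every spelling obtained by picking one form per position."""
    return ["".join(choice) for choice in product(*pairs)]


spellings = variant_spellings(VARIANT_PAIRS)
print(len(spellings))  # 2 ** 3 = 8
for s in spellings:
    # Show each spelling together with its code points.
    print(s, " ".join(f"U+{ord(ch):04X}" for ch in s))
```

	Without an equivalence rule, each of those strings is a distinct
name as far as the protocol is concerned.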

	My apologies to the chairs or the group if I misrepresented
their sense of consensus.
			regards,
				Ted Hardie







On Sat, Feb 09, 2002 at 05:12:49AM +0800, Chun-Hsin Wu wrote:
> Dear James,
> 
> Adopting your word, please don't try to PAINT yourself as a daily Chinese
> computer user. As far as I know, you do not use a Chinese Windows system
> daily, nor do you often input Chinese characters in your daily work. I'm
> afraid that an unfamiliar or inexperienced Chinese-system user cannot fully
> appreciate what trouble the proposed IDN solution will cause for Chinese
> Internet communities worldwide. Although we discussed inputting and using
> SC/TC characters at the end of Oct 2001, I still doubt whether you really
> understand what I am trying to explain.
> 
> ### Inputting mixed SC/TC Characters
> 
> In http://www.imc.org/idn/mail-archive/msg04520.html, you said it is
> difficult to type mixed TC and SC and that we need to switch between the
> TC IME and the SC IME repeatedly. Xiang Deng from CNNIC then told you in
> http://www.imc.org/idn/mail-archive/msg04521.html:
>     "No, I do can type TC AND SC in a very usual IME without swich."
> Besides, in http://www.imc.org/idn/mail-archive/msg04523.html, I also
> told you:
>     "Most IMEs in Traditional Windows 2000 can support TC and SC
>     directly without switching, even they can let the user type in
>     Japanese characters without switching.  It is not unusual for one IME
>     to support all, or as many as possible, Unicode characters if the system
>     supports Unicode.  For examples, the phonetic imput method and the
>     Boshimy input method. The former is the default method in all Chinese
>     systems and almost every Chinese or Taiwanese who uses computers
>     knows how to use it easily. The later is widely used by many people
>     who want to have high Chinese input rate, especially for professional
>     users."
> 
> Two native, daily users of Chinese Windows have tried to tell you that we
> are able to input mixed TC/SC with a single input method software (IME).
> However, you still claimed in your response:
> http://www.imc.org/idn/mail-archive/msg04524.html
>     "On Windows, you install 2 IME, one for TC and the other for SC.
>     You toggle between these two while you key in."
> Well, how could you be so confident? To give you a better feel for this, in
> http://www.imc.org/idn/mail-archive/msg04545.html, I demonstrated
> two examples that use a single IME to input mixed SC/TC, and told you:
>     "So it is clear that we don't need to install TWO IMEs to
>     input mixed TC and SC in TC Windows 2000. The user also
>     does not need to toggle between TC and SC while input. It
>     is natural that one TC input method software aims to make
>     the user input as many Unicode characters (and as easy)
>     as possible. Indeed, almost every add-on TC input method
>     software can support far more HAN characters than those
>     defined in Unicode, since some other charsets define more
>     characters than Unicode. The market will drive the software
>     company to do so, becoming a Uni-input method software
>     and supporting as many HAN characters as possible."
> 
> Now, in http://www.imc.org/idn/mail-archive/msg05781.html you still
> claimed:
>     "By default, TC or SC will be output depending which IME you use. Of
>     course, you *could* manually scroll down the list to get the characters
>     or you could toggle the input method to do so."
> Well, well, .... Fine. See:
>     http://www.iis.sinica.edu.tw/~wuch/idn/examples/mixinput.htm
> It gives you three examples, with window snapshots for each, to prove
> that one single IME can output TC and SC easily. I believe many IME
> packages have more functions for multilingual input than you
> realize. Indeed, IME developers also plan to support inputting
> multilingual Unicode characters more conveniently in their new
> versions. Next time you come to Taiwan, I can help you discuss IME
> issues, in Chinese, with the developers of the top three IME packages
> in Taiwan.
> 
> By the way, can you easily distinguish the displayed Chinese names in
> Example 2 and Example 3: eight combinations for "Islam" in Example 2
> and two combinations for "Ministry of the Interior" in Example 3?
> 
> If you check the variants in Unicode AND display them under Chinese
> Unicode systems such as Traditional Chinese Windows 2000, you can
> find more similar-looking and easily confused SC/TC pairs.
> 
> ### 3 months' delay or 30 years' pain?
> 
> I've been using Chinese computer systems for about twenty years. In
> recent months, people from CDNC have discussed Unicode and IDN issues
> not only with input-method developers, Chinese software developers,
> Internet application developers, and IDN-related professionals and
> officers, but also with many Chinese linguists, several of whom have more
> than twenty years of experience in Chinese/Han encoding. Except for
> Punycode, I can say that none of them agrees with hastily passing the
> IDN WG's currently proposed drafts for CDN. Likewise, ALL the Chinese
> participants in IETF IDN WG meetings who use Chinese systems daily
> disagreed with the proposed IDN solution.
> 
> Things are made worse because it is not easy for experienced native Chinese
> users to explain the CDN issues fluently in English to unfamiliar or
> inexperienced Chinese-system users. That is part of the reason the
> Unicode Consortium/WG2 needed to form the IRG to discuss Han ideographs,
> and why it is easier for JET members (CJKT) to have concrete discussions
> and reach consensus. Within the IDN WG, it is clear that most members
> are unfamiliar or inexperienced Chinese-system users. It is said that they
> are tired and pressed for time. Although they have tried very hard to solve
> the IDN issues, the IETF IDN WG seems to take the position that its solution
> is also best for CDN, and seems to be trying to pass the proposed IDN
> solution for CDN without the consensus of CDNC and the native
> Chinese participants in IDN WG meetings. If IDN is really important to
> the development of the Internet, I'm afraid that passing the current
> IDN standards under these conditions would become a bad precedent and
> symbol in the history of the Internet.
> 
> To be responsible to the worldwide and Chinese Internet communities, I hope
> we all move carefully and responsibly. That's my two cents. I hope we have
> a peaceful and just Internet world.
> 
> Happy Chinese New Year
> 
> Chun-Hsin Wu
> 
> ----- Original Message -----
> From: "James Seng/Personal" <jseng@pobox.org.sg>
> To: <jw-lin@yahoo.com.tw>; <idn@ops.ietf.org>
> Sent: Friday, February 08, 2002 12:23 PM
> Subject: Re: [idn] A question...
> 
> 
> > > It is not true. I think, maybe you are not familiar with how Chinese
> > > characters are used in our daily work.
> >
> > Perhaps you are not aware that I am possibly Chinese myself too, and I
> > use Chinese in my daily life, both traditional (in Malaysia) and
> > simplified (Singapore).  And I have worked with many Chinese linguists in
> > the last 2 years, and many of them are not Chinese.
> >
> > > But the other 5 cases are often written
> > > (not inputted); in particular, U+53F0, U+6E7E, and U+5B66 are sometimes
> > > preferred because they have fewer strokes.
> >
> > Thanks. Then we have no problem.
> >
> > > What you said applies in a restricted environment, like Windows 98, on
> > > which you cannot input those characters with some limited IMEs. However,
> > > Windows 2000 will change user behavior.
> >
> > By default, TC or SC will be output depending which IME you use. Of
> > course, you *could* manually scroll down the list to get the characters
> > or you could toggle the input method to do so.
> >
> > User education about domain names should be that domain
> > names are identifiers, not names. Users should enter them into the
> > computer exactly as they see or reference them.
> >
> > Anyhow, you are raising issues which have been debated on the list
> > before, and which the wg is quite well aware of. So unless you have a
> > TC/SC solution which you are willing to contribute to the group, I
> > consider this discussion closed.
> >
> > And if you do have a TC/SC solution, one which has not been considered by
> > the group and UTC and IRG, I would definitely love to hear your idea.
> >
> > -James Seng
> 
> 
>