Re: Inputting mixed SC/TC (Re: [idn] A question...)



--On Monday, 11 February, 2002 09:34 -0800 liana Ye
<liana.ydisg@juno.com> wrote:

> Dear John:
> 
> I appreciate your understanding of all the problems the IDN 
> does not solve. The differences between the WG and the 
> CDN group is not a political issue at its base.

I think I understand that, as do, I think, most of the
participants in the WG.  If I correctly understand what they are
saying when they make comments about political issues, those
comments are based on three things:

(i) Whoever and whatever initiated the process, much of the
IETF reacts very badly to attempts to influence a working group
(or other effort) by letter-writing campaigns that involve a
large number of people sending the same note or notes,
especially when those people have not been participating in the
WG.  Such note-sending campaigns are a political action, not an
engineering one.  They are also considered in such bad taste
that they may damage the position they are trying to advocate:
the reaction is, more or less, "they have completely run out of
technical arguments and are falling back to creating a lot of
noise; this action indicates that even they do not believe in
their technical arguments".

(ii) We have a continuing communications disconnect in which
some members of the Chinese-speaking community are, apparently,
so astonished that their positions are not immediately accepted
by the WG that they assume that no one [else] in the WG cares
about the impact of this work on names written in Chinese
characters.  Again, that leads to statements being made that
sound to others as if reasoning has been abandoned in favor of
political actions.

(iii) There is also a perception that the CDNC group, and some
others, are ignoring or dismissing the problems with Japanese
and Korean that would be induced by special mappings for Chinese
and, similarly, the many careful explanations of why
specialized TC<->SC handling is not analogous to case-mapping in
alphabetic character sets.  Again, this is a failure to
communicate (perhaps on both sides) at a very fundamental level
that is leading to ill-feeling and beliefs that decisions are
being made on a political or emotional basis, rather than with
full understanding and acceptance of the engineering constraints
and tradeoffs.

> WG is trying 
> to push TC/SC out of IDN.  The CDN and the others argue
> for them to be dealt with in IDN. 

Again, let me try to explain this a little bit differently.
There is a very strong conviction in the IETF that one of the
issues, perhaps the key issue, in Internet protocol design is
scaling.  The scaling concerns are usually thought of as applying
to "size of the Internet" issues -- e.g., we try to avoid
standardizing anything that can work only in a restricted
environment or under other "small network" constraints.  But
they also apply to partial solutions: if a particular aspect of
a problem or design activity seems to require a comprehensive
solution, but none is immediately available, the IETF will tend
to avoid a partial solution until the comprehensive one and a
migration strategy are well understood.

The other difficulty with the "consumer confusion" aspects of
the CDN discussion is that many of the comments seem to ignore a
point that has been made repeatedly in the WG: these issues
exist, in some form, for almost every language and script we
have investigated. All of the "equivalent character"
discussions, the similarity of some characters among, e.g.,
"Latin", Cyrillic, and Greek scripts, and related issues point
to considerable possibility of user confusion, misrouted
queries, and so on.  One might even suggest that the CDNC
complaints have been inappropriately focused on Chinese
characters and that the statement should have been "this will
cause problems all over the world, and the IDN effort should
just stop".  The people in the WG who have put in huge amounts
of effort trying to get _something_ to work probably would not
accept that either, but the position would more nearly recognize
the realities of likely IDN usage.

And that brings me to what I think is the real problem here,
and it is a problem that many of us, including those taking the
lead on IDNA (and nameprep, etc) have understood for most of the
time that the WG has been working on these problems.  There are
many problems with the use of words in languages that cannot be
dealt with in the DNS and some other solution will be needed.
Full TC<->SC matching (of all combinations) appears to fall into
that category.  So does the Greek-Cyrillic-Latin problem, and
the ASCII 0<->O and 1<->l problems.  So does a really
satisfactory solution to the "invisible character" problem, some
issues with Arabic/Hebrew/Yiddish vowels, and so on.  Mark's
notion of using presentation mechanisms to identify unusual
character combinations is, I think, quite ingenious but it isn't
part of the DNS either.  Your notions about phonetic mappings
fall into this category as well.  But the two things that all of
these problems and approaches have in common are the potential
for user confusion (or fraud) --some of it serious-- and the
fact that none of them can be dealt with by the simplistic
matching rules of the DNS.  Instead, we need user choices, or
likelihood functions, or fairly serious language recognition, or
knowledge the DNS doesn't have, or heuristics of some [other]
variety.  And the problems aren't limited to Chinese, even
though they manifest themselves differently.
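
To make the limitation concrete, here is a minimal sketch in Python
of a context-free matching rule.  The function name simple_match and
the choice of NFKC normalization plus case folding are assumptions
made for illustration only; they are a rough stand-in for "simplistic
matching", not the DNS comparison rule or the nameprep tables.

```python
import unicodedata


def simple_match(a: str, b: str) -> bool:
    """Context-free comparison: normalize, case-fold, compare exactly.

    Hypothetical helper for illustration; it looks at code points only
    and has no knowledge of language, script, or visual similarity.
    """
    def fold(s: str) -> str:
        return unicodedata.normalize("NFKC", s).casefold()

    return fold(a) == fold(b)


pairs = {
    "ASCII upper vs lower case": ("EXAMPLE", "example"),
    "Latin a vs Cyrillic a": ("a", "\u0430"),
    "digit 0 vs letter O": ("0", "O"),
    "TC vs SC (U+570B vs U+56FD)": ("\u570b", "\u56fd"),
}

for label, (x, y) in pairs.items():
    print(f"{label}: {simple_match(x, y)}")
```

Only the first pair matches.  The other three look alike or mean the
same thing to a human reader, but nothing in a per-character rule of
this kind can equate them without outside knowledge.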

Now I suggest that almost everyone who has been participating
actively in this effort knows that by now.  We know that other,
non-DNS, mechanisms are going to be needed to accomplish
satisfactory internationalization.  Given that, the hard problem
is whether it makes sense to tamper with the DNS at all.  And,
if one is going to do that, what the right set of constraints is
to prevent some of the worst possible damage.  My reading of
the consensus in the WG is that they believe that eliminating
(or postponing) every script that raises problems is not a
solution: we would rapidly have to eliminate all scripts,
possibly including the LDH subset of ASCII.

I think the WG believes that there are three options, and that
most of its members have eliminated the third from consideration:

	(a) Use Nameprep, or something very much like it, which
	deals with mappings that are complete (i.e., there is no
	controversy about them and no need for examination of
	surrounding characters, other context, or language
	information to do them properly) and which excludes
	characters that are problematic in any context in which
	they appear.
	
	(b) Adopt a very strict "identifier" rule in which
	nameprep becomes unnecessary: characters from the
	permitted repertoire can be intermixed in any way,
	without restrictions, comparisons are made on those
	characters only, and the users will somehow learn to
	cope with the results.
	
	(c) Give it up, keep text labels in the DNS restricted
	to the LDH rules, and do _all_ internationalization
	"somewhere else".

> If you consider IDN is a part of DNS, then the CDN group 
> say "NO, NO, NO" with force for you and the WG to raise 
> eyeballs.   And I hope they have accomplished it with 
> all their protests.

I fear not.  Many of us were fully aware of the problems, and
very concerned about them, before now, and even before Salt Lake
City.  For some of us, the protests have been successful only
in taking up time that should have gone into the "above DNS"
work.  For others, they have distracted from the effort to
carefully review and adjust the details of what nameprep should
be doing in areas where it can constitute a nearly-100%
solution. And, as I have suggested above and elsewhere, they
have led many people to the conclusion (whether correct or
incorrect) that the real goal of the CDNC protest was to disrupt
the work of the WG without offering real alternatives or a way
forward.

> If you consider IDN is above DNS then you are agreeing 
> with CDN group. 

As I trust you know, I have been suggesting that much critical
IDN work will have to be done "above DNS" since before either
the WG or MINC got started.  But to agree on that principle does
not necessarily imply agreement on details of what the WG should
do.

> If you admit that you are not an expert in dealing with 
> Han charaters processing then you should give a good 
> and hard study regarding what they have been saying 
> all along.
> 
> If you are understand what Xiang Deng is saying then 
> don't introduce political arguements.  It doesn't help the 
> communication at all. 

I didn't raise any political arguments in the note of mine that
you quoted, nor did I use that word.  "Policy" issues and
decisions are another matter entirely.  And, while I assume
neither you nor Xiang Deng did so either, I didn't set off the
letter-writing campaign.

regards,
    john