Re: [idn] Requirements I-D



--On Thursday, May 18, 2000 11:07 -0700 "Paul Hoffman / IMC"
<phoffman@imc.org> wrote:

> At 5:55 AM -0400 5/18/00, John C Klensin wrote:
>>    I
>> believe that, if we are going to play those games, there need
>> to be explicit protocol provisions for clients fetching the
>> mapping tables (or mapping table extensions) at regular or
>> orderly intervals.  That isn't impossible, or even very hard
>>...
> There is a different option, one that requires much less overhead
> than this. The character repertoire that is allowed in IDN is
> exactly that of ISO 10646 at the time we finish IDN. There is
> a single case-mapping table, a single canonicalization table,
> and so on, at that point. Those tables for those characters
> never change except under the most widespread agreement by
> everyone (meaning, they'll probably never change). As new
> characters get entered into the repertoire, they get new table
> entries; to date, this happens once a year or less.

Paul,

This is just another variation on "what we have / can foresee
today is good enough forever".  I believe that every single
Internet design decision that has been made on that basis has
gotten us into trouble.  I don't anticipate discovery of major
new human languages/character sets either, but the possibility
of discovering a new coding mechanism that really needs to be
included during the next decade or so seems nontrivial.  This
example will probably get me into trouble, but, as we start
using the internet over very long distances, I can imagine
needing to take advantage of the coding efficiency of
ideographic languages by doing the equivalent of inventing a new
one (or adding a _lot_ of characters to an existing one).   Now,
we might not need to case-fold those things, and I'd be a little
more comfortable with "characters can be added, but they are
defined as not folding", but, still...
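
A minimal sketch of the "characters can be added, but they are
defined as not folding" idea, assuming a folding table that is
frozen once and never extended.  The table contents, the name
FROZEN_FOLD, and the function fold() are hypothetical, chosen only
for illustration; U+E000 stands in for some later addition to the
repertoire.

```python
# Hypothetical frozen folding table: ASCII upper case to lower case.
# Under the "added characters do not fold" rule, no entries are ever
# added to this table after it is frozen.
FROZEN_FOLD = {
    ord(upper): ord(lower)
    for upper, lower in zip("ABCDEFGHIJKLMNOPQRSTUVWXYZ",
                            "abcdefghijklmnopqrstuvwxyz")
}


def fold(label, table=FROZEN_FOLD):
    """Fold a label; code points with no table entry map to themselves."""
    return "".join(chr(table.get(ord(ch), ord(ch))) for ch in label)


# A later-added character (stand-in: U+E000) passes through unchanged,
# so old and new software fold the same label to the same result.
folded = fold("Ab\ue000")
```

Under that assumption, software built before and after the addition
computes the same folded form, because the table both use is
identical.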

> The rule for IDN (which is arguably what is being used for
> today's US-ASCII host names) would be "MUST NOT return illegal
> code points in responses, SHOULD reject queries with illegal
> code points". A client that did not have the most recent
> repertoire list and the associated tables would get an error
> saying "invalid character". Internet lore would tell folks to
> get their new tables, particularly when there are popular
> additions to the repertoire. But there would be no protocol
> issue with a client that had an out-of-date set of tables, and
> thus no protocol need for automatic updating.

I don't have time right now to work through all of the cases,
but suspect this would not work with all of the variations on
resolver/server/forwarder splits in use today.
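
One case that might be worth working through, sketched under
assumptions: each hop in a resolver/forwarder/server chain applies
the "SHOULD reject queries with illegal code points" rule against
its own copy of the repertoire.  The repertoire contents, the hop
names, and the functions check_label() and forward() are all
hypothetical; this is not a model of any real DNS implementation.

```python
# Hypothetical repertoire snapshots, as sets of allowed code points.
REPERTOIRE_OLD = (frozenset(range(0x61, 0x7B))      # a-z
                  | frozenset(range(0x30, 0x3A))    # 0-9
                  | frozenset({0x2D}))              # hyphen
REPERTOIRE_NEW = REPERTOIRE_OLD | frozenset({0xE9})  # later adds U+00E9


class InvalidCharacter(Exception):
    """Raised when a label contains a code point outside the repertoire."""


def check_label(label, repertoire):
    """Reject a label containing any code point not in the repertoire."""
    for ch in label:
        if ord(ch) not in repertoire:
            raise InvalidCharacter(f"invalid character U+{ord(ch):04X}")
    return label


def forward(label, hops):
    """Pass a query through each hop in order; report the first rejection."""
    for name, repertoire in hops:
        try:
            check_label(label, repertoire)
        except InvalidCharacter as exc:
            return f"{name}: {exc}"
    return "ok"


# Client-side resolver and authoritative server are current, but the
# forwarder in the middle still has the old tables.
hops = [("resolver", REPERTOIRE_NEW),
        ("forwarder", REPERTOIRE_OLD),
        ("server", REPERTOIRE_NEW)]
result = forward("caf\u00e9", hops)
```

In this sketch the query is rejected at the forwarder even though
the user's own tables are up to date, so "get your new tables" does
not help the user who sees the error.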

>>   As names, I believe they need to be linguistically
>> sensible in each relevant target language or, in some basic
>> way, we fail.
> 
> Fully disagree with the binary-ness of "we fail". This
> statement ignores the long string of past successes we have
> had with the Internet using far-less-than-optimal protocols.
> Even as some of those protocols were being developed, the
> record shows that many features that were obviously desired
> were not put in, some methods were chosen because of the
> personalities of the players at the time, and some guesses
> were just plain wrong. And yet using the Internet works
> acceptably well for its current market and can be made to work
> acceptably well for billions of new users.

Paul, we agree on the history.  But the important characteristic
of those protocols is that the issues were about protocols and
features.  This issue is a cultural one about whether one can
use one's own language in a way that is normal, reasonable, and
grammatically and linguistically correct within the context of
that language/culture.  And, while I sympathize with your
dislike for binary choices and statements, the reality in this
sort of cultural situation is that, as soon as one moves to
either "well, I get to use my language (and characters)
correctly, but you don't" or "we _almost_ get to do it
correctly", we've pretty much dropped the "names" story and
dropped back to "protocol elements" with a restricted set of
rules.

Now, once we get back to "protocol elements" (which is where we
are today, even for English -- there are perfectly good
"English" names that can be written in ASCII, but not in
email-valid DNS names), then we can make rules that expand the
character repertoire in DNS names but still impose restrictions.
But each such restriction provides an incentive for people to
create non-interoperable variations (or "extensions") to the DNS
to accommodate their languages and cultures.  And I repeat that I
consider that result, or things that stimulate it, to be a
failure case.

     john