
Re: [idn] Re: Is space allowed in a hostname?



--On Sunday, 07 July, 2002 20:07 -0700 Doug Ewell
<dewell@adelphia.net> wrote:

> John C Klensin <klensin at jck dot com> wrote:
> 
>> But, is it worth revisiting the decision?   Much as I am
>> concerned about the potential for damage to the Internet
>> resulting from putting this stuff into the DNS, I haven't
>> seen a lot of justification for going around this particular
>> loop again.  I would feel more positive about doing so had
>> matching and mapping issues with pairs of characters not come
>> up before, but they have and people who didn't read
>> themselves in have to bear some of the blame.
> 
>  to which Simon Josefsson <simon plus idn at josefsson dot
> org> replied:
> 
>> I agree, the current solution is not well understood, and
>> there seems to be no time to revisit all issues and wait for
>> people to understand the consequences.
> 
> This must be a new sense of the phrase "I agree" which I was
> previously unaware of.

Actually, probably not.

> John's point was that the NFKC topic had already been
> discussed and decisions made, and that bringing those people
> up to speed who had not been part of the original discussions
> would be unlikely to shed new light on the situation, but
> would simply cause old issues and questions to be needlessly
> replayed.

Yes, but, this is partially because I cannot muster as much
sympathy for those who were "not part of the original
discussions" because they sat them out, and who are now raising
old issues, as I might for someone who arrived now and started
raising new issues that challenged the conclusions.  And, more
important,...

> This is very different from saying that the decisions were
> made without sufficient input or knowledge, and that people
> are being kept in the dark because of an inflexible and overly
> ambitious WG schedule.

Ah.  But I think the decisions _were_ made without sufficient
input, knowledge, and understanding from sufficiently many
people to constitute a legitimate _working group_ decision.  I
don't see that as desirable, and I think an accurate report of
the number of people in the WG who constitute the
personally-informed consensus on this issue --either as an
absolute number or as a percentage of active participants in the
WG-- would be an embarrassment.

That said, I claim that I do understand the issues and that
several other people in the WG also understand them and have
understood them more or less all along.  And, based on that
understanding, I don't believe that revisiting NFKC is going to
get us anywhere.  That isn't because NFKC is right or wrong, but
because its selection involves tradeoffs, because _no_
alternative is without similar tradeoffs, and because there are
no rational criteria for selecting one set of tradeoffs over
another.

NFKC is, IMO, guilty of one severe sin, and it is the one I
believe Simon was pointing out (but it isn't a new observation
either): it is not consistent when examined on a language by
language and code-point by code-point basis.  If a single
meta-rule could be made up that would identify the "right"
choice for IETF and DNS purposes, it would handle some sets of
characters consistently with the rule and some inconsistently.
But, again, no better solution is on the table, and, in all
likelihood, no objectively better solution is possible: no
single, simple, statement can be made that accurately predicts
whether characters are duplicated or unified in the
Unicode/10646 code set itself.  The allocation decisions may
well have been made rationally, but enough different rules were
applied, and generic enough rules were applied, that it would be
unreasonable to expect complete consistency with any simple
rule, much less an IETF-optimized one.
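To make the inconsistency concrete, here is a small Python sketch
using the standard `unicodedata` module.  The particular code
points are my own illustrative picks, not examples drawn from the
WG discussion: NFKC folds some "duplicate" characters together
while leaving other pairs, equally confusable to a user, distinct.

```python
import unicodedata

def nfkc(s):
    """Apply NFKC normalization, the mapping at issue for nameprep."""
    return unicodedata.normalize('NFKC', s)

# Some compatibility duplicates ARE folded together by NFKC:
assert nfkc('\ufb01') == 'fi'      # LATIN SMALL LIGATURE FI -> 'fi'
assert nfkc('\u00b2') == '2'       # SUPERSCRIPT TWO -> '2'
assert nfkc('\u212b') == '\u00c5'  # ANGSTROM SIGN -> LATIN CAPITAL A WITH RING

# ...while other visually identical pairs are left alone:
assert nfkc('\u0430') != 'a'       # CYRILLIC SMALL A stays distinct from Latin 'a'

print("all examples behave as described")
```

No single simple rule predicts which of these cases fold and which
do not; that is precisely the "sin" described above.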

It is all going to come down to arbitrary choices at some stage.
Some of those arbitrary choices are going to irritate some
groups of people and seem harmless to others.  If they were made
a different way, different people would be irritated.  If we
were to apply the test of whether NFKC (and many of the other
stringprep/nameprep decisions) made everyone who had studied it
carefully about equally unhappy, it would almost certainly pass
that test.

So I don't think this one is worth revisiting, not because the
consensus is solid, or because I am feeling serious constraints
from a rigid schedule, but because I see absolutely no reason to
believe that going around the loop again would produce a
different decision that was objectively better when viewed
globally.

I do believe that there is, plausibly, a different conclusion
that can be reached from this, but reexamining NFKC isn't going
to help with it one way or the other.  That conclusion is,
essentially, the null hypothesis, i.e., that true, global,
internationalization that is language-sensitive and sensitive to
the semantics commonly attached to various characters is
incompatible with the DNS and the way it is organized and does
lookups.   If one reaches that conclusion, there are then, I
believe, only two choices: 

	* one is to decide, explicitly, that some languages (or
	scripts, or sets of choices about normalization and
	canonicalization) are "more equal than others" and that
	the others just lose.
	
	* and the other is to decide that internationalization
	using the DNS is technically infeasible given all of the
	constraints.

I believe the first of these choices is an inappropriate and
impossible one for the IETF to make (whether explicitly or
implicitly).  And I believe that the second has been
insufficiently discussed in the WG, only partially because some
WG members have taken the position that the WG's returning an
answer of "don't do it", rather than a protocol, is so untenable
that it cannot be discussed.

      john