Re: [idn] WG last call summary




"Adam M. Costello" wrote:
> 
> "Eric A. Hall" <ehall@ehsco.com> wrote:
> 
> > > For a given domain name slot (protocol element, structured data
> > > field, function argument, etc), the governing specification...does
> > > not dictate what you may do with the name after you've read it.
> >
> > True for message fields, false for the loosely-coupled data-types
> > which are independent of any particular protocol
> > message (email addresses, Message-ID, URLs, etc).
> 
> I'm not getting your point at all.  Apparently you are concerned that
> there exist scenarios where ToUnicode ought to be forbidden/discouraged,
> but the IDNA draft allows/encourages it.  Maybe if you described a
> specific example of such a scenario, I'd start to understand.

I don't know how to say this any differently than I have been saying it. The
ToUnicode step changes well-known and widely-used data-types in such a way
that the data no longer conforms to the rules which govern that data.
These changes are harmful when the extended data-types are reused, whether
by copy-n-paste, program output, direct transcription, or whatever.

An example I have already given is Message-ID. Basic functionality will be
broken if the structure of the well-known and widely-used Message-ID
data-type as defined in STD11 is extended beyond the scope of the
governing spec. The extended, STD11-incompatible form will break on search
inputs that don't allow those values, it will break if the search input
accepts it and passes it to a remote system via an IMAP SEARCH or NNTP
XPAT operation, it will break if a user puts it into a news or http URL
which gets transliterated and then percent-hacked, and it will break if a
user types it into a web URL as a parameter to a server-based search
function. That's just search and fetch, never mind other problems like
damage to threading that results from an extended Message-ID which is
manually added to See-Also or References, corrupted spam complaints, and
the dozens of other common uses for this well-known, widely-used,
STANDARDIZED data-type.

This does not mean that users with mail systems located in IDN zones
cannot generate Message-IDs from those domains, only that they have to
continue generating them with LDH domains (via ToASCII). In that scenario,
Message-ID would continue to function across all of the services which
utilize the format specified in STD11.
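
As a rough illustration of the paragraph above, the following sketch builds
a Message-ID whose domain part has been passed through ToASCII before use.
It assumes Python's built-in "idna" codec (which implements the IDNA 2003
ToASCII operation); the helper name make_message_id, the UUID-based
left-hand side, and the domain "bücher.example" are hypothetical and chosen
only for the example.

```python
import uuid


def make_message_id(domain: str) -> str:
    """Hypothetical helper: build a Message-ID with an LDH-only domain."""
    # Python's built-in "idna" codec applies ToASCII to each label,
    # so an internationalized domain comes out in its ACE (xn--) form.
    ascii_domain = domain.encode("idna").decode("ascii")
    # Illustrative unique left-hand side; any unique token would do here.
    local_part = uuid.uuid4().hex
    return f"<{local_part}@{ascii_domain}>"


# "b\u00fccher.example" is an assumed example domain with a non-ASCII label.
msg_id = make_message_id("b\u00fccher.example")
print(msg_id)
```

The resulting identifier contains only ASCII characters, so it can be
pasted into a search field, a URL, or a References header without any
further conversion.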

That is the way it MUST be until that data-type is either extended (after
months of rancorous debate, surely), or it is replaced with something
else, or the WG in charge of that standard chooses to allow for
transliteration (THEM, NOT US). Note well: the subsequent rancorous
debates are not the problems of this WG any more than extending the
data-types is within our scope of responsibility. They are smart
enough to figure out what they can get away with on a case-by-case basis.

Chorus: Implicitly extending all such data-types currently in use on the
Internet (as the current draft does) is willfully causing breakage, and
goes well beyond the authority of this WG.

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/