
Re: [idn] One profile for domain names, or many?




on 6/13/2002 4:53 PM Adam M. Costello wrote:

> The tricky part is that some of these subtypes are already in wide use
> in a wide variety of protocols without having ever been formalized.

> my example list of subtypes, my intent is that the host field of a URI,
> the exchangers listed in an MX record, and the domain field of an HTTP
> cookie are all of type "host name", but no such connection has ever
> been formally drawn between these various protocol elements.

I think I can speak for John when I say that this is what he and I both
want to see fixed. The vagaries and loose standardization have been more
trouble than they are worth.

If you look at the BNF for most of these protocols they are just LDH
sequences already anyway. Also, the pre-2181 history has it that most
systems only recognize LDH (I've even run across applications that
couldn't parse 3com.com because it wasn't explicitly allowed until RFC
1123). If the i18n namespace is being invented for the sole purpose of
standardized representation and interpretation of i18n domain names,
there's really not any reason to go back and re-re-fix the standard
RRtypes so that they are LDH hostnames.

Finally, declaring a <hostname> profile allows the other protocols to
reuse that BNF in their syntax, rather than requiring them to define it
for their own specific use.
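
To make the reuse argument concrete, here is a rough Python sketch of what a
single shared <hostname> check might look like. The name is_ldh_hostname and
the length limits (63 octets per label, 253 per name) are my assumptions for
illustration, following the usual RFC 1123 reading; this is not text from any
draft.

```python
import re

# One LDH label: letters, digits, hyphens; no leading or trailing hyphen;
# 1 to 63 characters. A leading digit is allowed (RFC 1123), so "3com" passes.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_ldh_hostname(name: str) -> bool:
    """Return True if every label in `name` is an LDH label."""
    if name.endswith("."):
        name = name[:-1]  # tolerate a single trailing root dot
    if not name or len(name) > 253:
        return False
    return all(LDH_LABEL.fullmatch(label) for label in name.split("."))
```

Any protocol that wants a host name in a field could then point at this one
rule instead of restating it.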

> I suppose the way to attempt to accommodate your model in IDNA (but I'm
> not sure it could be done with sufficient rigor) would be something
> like this:  ToASCII and ToUnicode would take a Stringprep profile as
> a parameter.  This would be yet another thing that the application
> would have to select, along with the AllowUnassigned flag and the
> UseSTD3ASCIIRules flag.  The IDNA spec would require that Nameprep and
> only Nameprep be used with host name labels and mail domain labels.
> Furthermore, applications would be forbidden from using IDNA with any
> other label types until profiles for them had been standardized.

Again, I would make the applications do the Stringprep management, since
they have to do so anyway for basic security measures. All IDNA needs to
be is a simple codec that converts inputs and outputs (unless I completely
misunderstand its inner workings).
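
A minimal Python sketch of that split, where the application supplies the
preparation step and the codec only converts. Everything here is
illustrative: to_ace, from_ace and simple_prepare are hypothetical names,
simple_prepare is a crude stand-in and NOT Nameprep, and the "xn--" prefix is
an assumption on my part rather than something settled in this thread.

```python
import unicodedata
from typing import Callable

ACE_PREFIX = "xn--"  # assumed prefix, for illustration only


def simple_prepare(label: str) -> str:
    """Hypothetical application-chosen profile: NFKC plus lowercasing."""
    return unicodedata.normalize("NFKC", label).lower()


def to_ace(label: str, prepare: Callable[[str], str]) -> str:
    """Apply the caller's profile, then encode non-ASCII labels."""
    prepared = prepare(label)
    if prepared.isascii():
        return prepared  # nothing to encode
    return ACE_PREFIX + prepared.encode("punycode").decode("ascii")


def from_ace(label: str) -> str:
    """Decode an ACE label; pass other labels through unchanged."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label
```

The codec never decides which profile applies; the caller passes it in, e.g.
to_ace(label, simple_prepare).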

>>There is another reason for going this route, which is that the
>>presence of eight-bit codes in the STD13 namespace makes managing
>>UCS names in dual-mode servers extremely difficult.  It essentially
>>requires that servers flag eight-bit domain names as NOT UCS so that
>>the names don't get looked at when an EDNS/deACE'd query comes in.
> 
> I came to that realization this morning.  A hypothetical newDNS protocol
> that allowed text labels to be represented using UTF-8 while still
> supporting the RFC-1035 sequence-of-bytes labels would need an extra
> bit per label indicating whether byte values 80..FF are UTF-8 text, or
> opaque bytes like in DNS.  A hypothetical new resolver interface would
> likewise need this extra bit per label if it wanted to support both
> text-labels and byte-labels.
> 
> I don't know if I'd call that "extremely difficult".

The resolver doesn't have to keep track of this (it's just passing around
octet sequences) but servers (including caches) have to. Otherwise an
STD13 octet sequence which has been stored for future reference gets read
as an ISO-8859-1 string and converted.

Detecting the format of an incoming sequence is easy: the label is either
STD13 or EDNS, and if it is STD13 with eight-bit codes then it is an STD13
eight-bit label. That's easy to deal with for comparison purposes at that
particular moment but storing it and preventing comparison later isn't
very simple. Also, somebody might expect the implementor to convert the
eight-bit A RR into ACE or EDNS and scream if the implementor doesn't,
which gets really ugly.
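
As a sketch of the bookkeeping involved, the detection step and the stored
flag could look roughly like this in Python. The names LabelKind, StoredLabel,
classify and may_convert are hypothetical, and treating EDNS labels as UTF-8
text is the assumption from the hypothetical protocol discussed above.

```python
from dataclasses import dataclass
from enum import Enum


class LabelKind(Enum):
    STD13_ASCII = "std13-ascii"  # plain seven-bit label
    STD13_8BIT = "std13-8bit"    # opaque octets, never to be converted
    UCS_TEXT = "ucs-text"        # text label (assumed UTF-8 on the wire)


@dataclass(frozen=True)
class StoredLabel:
    octets: bytes
    kind: LabelKind  # the per-label flag that must be stored with the data


def classify(octets: bytes, via_edns: bool) -> StoredLabel:
    """Decide the kind once, on arrival, and keep it with the label."""
    if via_edns:
        octets.decode("utf-8")  # raises UnicodeDecodeError if not valid UTF-8
        return StoredLabel(octets, LabelKind.UCS_TEXT)
    if any(b >= 0x80 for b in octets):
        return StoredLabel(octets, LabelKind.STD13_8BIT)
    return StoredLabel(octets, LabelKind.STD13_ASCII)


def may_convert(label: StoredLabel) -> bool:
    """Opaque eight-bit labels must never be reinterpreted as text."""
    return label.kind is not LabelKind.STD13_8BIT
```

Without the stored kind, the octets b"caf\xe9" are indistinguishable later
from an ISO-8859-1 string, which is exactly the misreading described above.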

However, if the default policy (and the existing standards-track RRs) is
changed so that eight-bit codes aren't used by default -- and new RRs can
only define STD13 octets if they are willing to forgo conversion -- then
storing these RRs with the flag becomes a little simpler: they are a
distinct exception with clear-cut logic, and there can be no confusion
about conversion because they are never subject to it. It also makes
enforcing the policy change fairly simple, since violators are just
rejected by the compliant server. Fix or die. The only remaining parties
affected are those who have gone off and done their own thing, and
managing the flag mapping becomes their problem rather than everybody's
problem.
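
The server-side rule then reduces to something like the following Python
sketch. The RR type name "X-OPAQUE" and the function accept_label are made up
for illustration; no such type or interface exists.

```python
# Hypothetical set of RR types that explicitly opted in to raw STD13 octets
# and thereby gave up any conversion.
OPAQUE_OCTET_RRTYPES = frozenset({"X-OPAQUE"})


def accept_label(rrtype: str, octets: bytes) -> bool:
    """Reject eight-bit labels unless the RR type opted in to opaque octets."""
    has_eight_bit = any(b >= 0x80 for b in octets)
    if not has_eight_bit:
        return True  # seven-bit labels are always acceptable
    return rrtype in OPAQUE_OCTET_RRTYPES
```

A compliant server would simply refuse data for which accept_label returns
False.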

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/