Re: [idn] One profile for domain names, or many?




on 6/12/2002 7:28 PM Adam M. Costello wrote:

> In the many-profile model (as I understand it), domain names are really
> sequences of bytes, and their interpretation as characters is merely
> a view created by applications for the benefit of human users. All
> comparisons are really exact comparisons except for ASCII letters; the
> illusion of case insensitivity for non-ASCII letters is created by
> applications applying case folding when they convert user-entered text
> into domain-names-which-are-byte-sequences.

You are correct. The ASCII part is confusing but you are correct in that
if an application wants to provide case-mapping it will have to do so in
the profile. There is no lowercase storage in the i18n namespace so the
codes must match precisely. This is a direct result of AMC-Z encoding.
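
To make the comparison model concrete, here is a minimal sketch of it in
Python. Everything in it is an assumption for illustration: the helper
names (ascii_fold, wire_equal, to_name) are made up, and UTF-8 is only a
stand-in for whatever encoding a given profile would actually produce.

```python
def ascii_fold(name: bytes) -> bytes:
    """Fold only ASCII A-Z to a-z; every other octet is left untouched."""
    return bytes(b | 0x20 if 0x41 <= b <= 0x5A else b for b in name)


def wire_equal(a: bytes, b: bytes) -> bool:
    """Compare two names the way the namespace would: exact octet match,
    except that ASCII letters compare case-insensitively."""
    return ascii_fold(a) == ascii_fold(b)


def to_name(text: str) -> bytes:
    """Hypothetical profile step: the application folds case itself
    before turning user text into octets (UTF-8 is a stand-in here)."""
    return text.casefold().encode("utf-8")


# ASCII letters match regardless of case.
ascii_match = wire_equal(b"EXAMPLE", b"example")

# Non-ASCII letters do not match unless the application folds them first.
raw_match = wire_equal("\u00c9".encode("utf-8"), "\u00e9".encode("utf-8"))
folded_match = wire_equal(to_name("\u00c9"), to_name("\u00e9"))
```

The point of the sketch is only that any case insensitivity beyond ASCII
has to come from the to_name step, not from the comparison.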

One clarification is that it isn't expected that other profiles will be
used for human input, although it's not a negative expectation either. The
principal intention is to facilitate the storage of application-specific
domain names, that is all. Although I've typed network names into NetBIOS
applications, I've never actually typed a NetBIOS name (including all of
the markers and codes). I've typed SRV owner names into dig for debugging
purposes, but never for user purposes. I've certainly never typed the
domain names needed to tunnel IP through DNS messages [1]. And so forth.
The principal intention is to allow applications to make use of the domain
names they need to function in the namespace; these names probably won't
be entered by users directly. Human input must certainly be considered,
but it's not the primary expectation.

Moreover, these examples work with the existing STD13 octets. In that
regard, the intention is to facilitate the migration of STD13 octets. This
is especially important if that usage is going to be prohibited.

> Next, check for any collisions:
>
>     xxxprep(x) == x == fullwidth Latin small letter a
>     yyyprep(y) == y == Latin capital letter A
>     xxxprep(y) ==      Latin small letter a
>     yyyprep(x) ==      error

and then nameprep for feelgood purposes, as stated three times now. But
let's assume they didn't.

>     z = fullwidth Latin capital letter A
> 
> A user could use this same string z to refer to both of these domains.
> If the user asks for an XXX record, z will match one domain; if the
> user asks for a YYY record, the very same string z will match the other
> domain.

Queries contain three elements: qname, qtype and qclass. If all three
match then the query succeeds. If any of them don't match, then the query
will fail.

If the applications (not the user) issue these queries as you have
described them then they will ask for:

     qname=FF21  qtype=XXX  qclass=IN
and
     qname=FF21  qtype=YYY  qclass=IN

Does an owner name exist at FF21?
Does it have the queried RRtype?
Does it have the queried class?

Although I'm more than happy to have duplicates omitted, you can see from
the above why it isn't necessary. The domain name octets have to match AND
there has to be a RRtype as well. If the name is wrong (either in the
query or in the zone) then the names which do match probably don't have
the right RR. If the RR matches then they got what they asked for. If they
got something that they didn't want, from some other similar name which
also happens to have that RR bound to it, then the profile is flaccid and
the developer needs to be shot for writing a profile that allows these
errors.
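
As a rough sketch of the three checks, assume a zone is just a table
keyed on owner-name octets, class and type. The table layout, the lookup
function, the record data and the two-octet rendering of the code points
(FF21, FF41, 0041) are all made up for illustration; this is not how any
particular server stores its data.

```python
from typing import Dict, List, Tuple

# Hypothetical zone table: (owner-name octets, qclass, qtype) -> RR data.
Zone = Dict[Tuple[bytes, str, str], List[str]]

zone: Zone = {
    # x = fullwidth Latin small letter a, stored by the XXX application
    (bytes.fromhex("FF41"), "IN", "XXX"): ["xxx-data"],
    # y = Latin capital letter A, stored by the YYY application
    (bytes.fromhex("0041"), "IN", "YYY"): ["yyy-data"],
}


def lookup(zone: Zone, qname: bytes, qtype: str, qclass: str = "IN") -> List[str]:
    """Succeed only if the name octets, the class and the type all match."""
    return zone.get((qname, qclass, qtype), [])


# z = fullwidth Latin capital letter A, sent without any profile applied
z = bytes.fromhex("FF21")
xxx_for_z = lookup(zone, z, "XXX")   # no owner name at FF21
yyy_for_z = lookup(zone, z, "YYY")   # no owner name at FF21

# A name that exists, but queried with the other application's RRtype
yyy_for_x = lookup(zone, bytes.fromhex("FF41"), "YYY")

# The right name with the right RRtype
xxx_for_x = lookup(zone, bytes.fromhex("FF41"), "XXX")
```

In this sketch only the last query returns data; a wrong name or a wrong
RRtype simply comes back empty.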

> (By the way, it's not at all clear to me what should happen if the user
> asks for ANY records matching z.)

They MUST get all of the matches that fit in the message. Note that it is
specifically qtype ALL (not ANY). This may appear semantic but it's not.
Also note the opposite for qclass ANY (not ALL).

What the application does with those two RRs is up to the application.
Sendmail uses qtype=* to find all of the RRs associated with a mail
domain. What does it do with the RP and TXT RRs that it gets? (Hint: what
do those RRs have to do with routing mail?)
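
A small sketch of that division of labor, under stated assumptions: the
record data, the answer function and the set of types the mail
application cares about are all hypothetical, and none of this is
Sendmail's actual code.

```python
from typing import Dict, List, Tuple

# Hypothetical RRs bound to one owner name, grouped by type.
rrs_at_name: Dict[str, List[str]] = {
    "MX": ["10 mail.example.com."],
    "A": ["192.0.2.1"],
    "TXT": ["some descriptive text"],
    "RP": ["admin.example.com. ."],
}


def answer(rrs: Dict[str, List[str]], qtype: str) -> List[Tuple[str, str]]:
    """Return every RR for qtype=*, otherwise only the requested type."""
    if qtype == "*":
        return [(t, data) for t, items in rrs.items() for data in items]
    return [(qtype, data) for data in rrs.get(qtype, [])]


def mail_relevant(answers: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """The application keeps what it understands and ignores the rest."""
    wanted = {"MX", "A", "CNAME"}  # assumed set, for illustration only
    return [rr for rr in answers if rr[0] in wanted]


everything = answer(rrs_at_name, "*")
used_for_mail = mail_relevant(everything)
```

The server hands back everything at the name; deciding which of those
RRs matter is the application's job.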

Really, this is making mountains from molehills. DNS already works this
way today, and the planet has yet to fall off its axis.

> In conclusion, there may very well be more than one right way to
> define IDNs, with tradeoffs among the options.  Based on my current
> understanding of the issues (which I think is better now than ever),
> I would still choose the single-profile model.  I can understand that
> other people might choose differently.

Where will the existing users of eight-bit domain names go if:

 1) they can't stay in STD13 because it has been prohibited

 2) the profile-centric i18n namespace forbids the applications
    from defining their own profiles?

I've got news: one of those two changes will be ignored. Standards are
measured by their use, not their intent.


[1] http://slashdot.org/article.pl?sid=00/09/10/2230242&mode=thread&tid=95

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/