
Re: [idn] I-D ACTION:draft-ietf-idn-idna-08.txt



[Another copy of this message accidentally got stuck in the pipe
somewhere; I apologize in advance if it eventually comes through.]

"Eric A. Hall" <ehall@ehsco.com> wrote:

> Consider the lowly cache.  What if a client and server exchange
> eight-bit domain names linked with an experimental RR which the cache
> knows absolutely nothing about?  What if eight-bit owner names are
> explicitly interpreted for that RR?

Excellent thought experiment.

One option that is always open to a cache is to not cache something.
Even if all caching DNS servers stopped caching everything, the DNS
would still work; it would just take a performance hit.

If the caching server receives a record with a name containing 80..FF,
and the caching server is unaware of the semantics of 80..FF for that
record, then the safest and simplest thing to do is not cache it.

However, it could be a little more intelligent.  If two byte strings
match exactly, then they match under any other comparison rule too.  So
the caching server could cache the record and compare it octet for octet
against queries.  If a query matches exactly, the server returns the
record; if not, it lets the query fall through to the authoritative
server.
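
To make the exact-match idea concrete, here is a rough sketch in
Python.  Everything in it is hypothetical: the class name
ExactMatchCache, the ask_authoritative callback, and the (name, type,
class) key are my own illustration, not taken from any real server, and
TTL handling is left out entirely.

```python
from typing import Callable, Dict, Tuple

# Hypothetical cache key: raw owner-name octets, RR type, RR class.
Key = Tuple[bytes, int, int]


class ExactMatchCache:
    """Caches answers and matches owner names octet for octet."""

    def __init__(self, ask_authoritative: Callable[[bytes, int, int], bytes]):
        # ask_authoritative stands in for a real upstream query (assumption).
        self._ask = ask_authoritative
        self._store: Dict[Key, bytes] = {}

    def resolve(self, name: bytes, rrtype: int, rrclass: int) -> bytes:
        key = (name, rrtype, rrclass)
        if key in self._store:
            # Exact byte-for-byte hit: safe under any comparison rule.
            return self._store[key]
        # No exact match: fall through to the authoritative server.
        answer = self._ask(name, rrtype, rrclass)
        self._store[key] = answer
        return answer
```

The worst case for a cache like this is a needless trip to the
authoritative server when two names would have been equal under rules
the cache does not know; it never returns a wrong record.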

Furthermore, if the caching server knows that 0..7F are ASCII, and is
unsure of only 80..FF, then it can do case-insensitive matching on
0..7F, and exact matching on 80..FF.
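
As a sketch of that mixed comparison (the function names fold_ascii and
labels_match are mine, purely for illustration), the server only needs
to fold the octets for A-Z and leave everything else alone:

```python
def fold_ascii(label: bytes) -> bytes:
    """Lowercase the ASCII letters A-Z (0x41..0x5A); keep all other octets."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in label)


def labels_match(a: bytes, b: bytes) -> bool:
    """Case-insensitive on 0..7F, exact on 80..FF."""
    return fold_ascii(a) == fold_ascii(b)
```

With this, b"Example" matches b"eXAMPLE", and b"CAF\xe9" matches
b"caf\xe9", but b"caf\xc9" does not match b"caf\xe9", because the
octets C9 and E9 are compared exactly.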

Now I finally see an argument for why DNS servers in practice do
comparisons that way.  It's not because RFC 1035 requires it, but
because it's the best effort they can make given 1035's silence on the
issue.

But how does the caching server know that 0..7F are ASCII in domain
names attached to unknown RR types or unknown classes?  We know that
0..7F are ASCII in all character strings, but are all domain labels
necessarily character strings for all future RR types and classes?  Or
can future RR types and classes use non-text domain names?  RFC 1035 is
unclear.  These sentences suggest that domain names are always text:

    For all parts of the DNS that are part of the official protocol, all
    comparisons between character strings (e.g., labels, domain names,
    etc.) are done in a case-insensitive manner.

    The labels in the domain name are expressed as character strings and
    separated by dots.

    \DDD where each D is a digit is the octet corresponding to the
    decimal number described by DDD.  The resulting octet is assumed to
    be text and is not checked for special meaning.

But what does this mean:

    However, future additions beyond current usage may need to use the
    full binary octet capabilities in names,

Does "full binary octet capabilities in names" mean non-text names?  Or
does it merely mean text names that use 80..FF to represent non-ASCII
characters?

Although RFC 1035 is not clear, I think the only sane design is that
all domain names are text regardless of RR type and class.  The data
attached to a name might be non-text, but the name itself is text; 0..7F
are ASCII and 80..FF are text in an unspecified format.  The only safe
way to add non-text names would be via an extension to the protocol that
prevents old servers from participating, like EDNS.

AMC