
Re: [idn] I-D ACTION:draft-ietf-idn-idna-08.txt



--On Monday, 03 June, 2002 10:41 -0500 "Eric A. Hall"
<ehall@ehsco.com> wrote:

>> As has been pointed out, "undefined" means "undefined" and,
>> given interoperability and the robustness principle, something
>> that should not be (attempted to be) used.  It doesn't mean
>> "non-alphabetic" -- that constitutes defining it.
> 
> They are "undefined" for interpretation, but they are
> "defined" as eight-bit code values for the purposes of storage
> and transfer.

ok

> Consider the lowly cache. What if a client and server exchange
> eight-bit domain names linked with an experimental RR which
> the cache knows absolutely nothing about? What if eight-bit
> owner names are explicitly interpreted for that RR? Clearly
> there is only one interpretation that works in the distributed
> model.

No, there are two.  One is your interpretation, i.e., that all
labels are to be construed as character strings, with ASCII and
case mapping behavior in 0x00-0x7F and exact matching otherwise.
The other is that true binary labels are permitted by the DNS
spec: if they are, then 0x00-0x7F had better not be case-mapped
and a cache must know how to interpret an RR in order to answer
a query on that RR.
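
To make the difference concrete, here is a rough Python sketch of the
two matching rules. It is an illustration only: the function names are
hypothetical, nothing here is taken from RFC 1034/1035 or from any
resolver, and labels are modelled simply as byte strings.

```python
def ascii_fold(label: bytes) -> bytes:
    """Lower-case only the ASCII letters A-Z (0x41-0x5A); leave every other octet alone."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in label)


def match_as_character_string(a: bytes, b: bytes) -> bool:
    # First reading (assumed): case-insensitive for ASCII letters,
    # exact octet match for everything else, including 0x80-0xFF.
    return ascii_fold(a) == ascii_fold(b)


def match_as_binary(a: bytes, b: bytes) -> bool:
    # Second reading (assumed): true binary labels, so no case mapping
    # at all, even within 0x00-0x7F.
    return a == b


if __name__ == "__main__":
    print(match_as_character_string(b"Example", b"eXAMPLE"))
    print(match_as_binary(b"Example", b"eXAMPLE"))
```

Under the first rule "Example" and "eXAMPLE" are the same label; under
the second they are not. A cache that does not know which rule applies
to a given RR cannot tell whether it holds an answer to the query.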

I.e., my reading of RFC 1035, section 2.3.3, "Character Case",...

		For all parts of the DNS that are part of the official
		protocol, all comparisons between character strings
		(e.g., labels, domain names, etc.) are done in a
		case-insensitive manner.  At present, this rule is in
		force throughout the domain system without exception.
		However, future additions beyond current usage may need
		to use the full binary octet capabilities in names,
		[...]

is that "full binary octet capability" implies not only that
those bits might be turned on, but that comparisons might be
required to be binary (i.e., exact) ones, even within the
0x00-0x7F range.

> I suppose you could argue that the ~standard RRs from STD13
> are special and do not have any such meaning, although I would
> argue against it.

And we would certainly continue to disagree, since I read "all
parts... that are part of the official protocol" and "at
present... However, future additions beyond current usage..." as
saying exactly that.  This is reinforced, for me, by section 3.1
of RFC 1034, which says, in part

		but domain name comparisons for all present domain
		functions are done in a case-insensitive manner,
		assuming an ASCII character set, and a high order zero
		bit. 

and

		The rationale for this choice is that we may someday
		need to add full binary domain names for new services;
		existing services would not be changed.

which appears, to me, to constrain those "existing services"
(which I take to be those "standard" RRs in Class IN) to ASCII.
That interpretation makes non-ASCII octets in _their_ labels a
protocol violation, not merely an "undefined" state -- see below.

Now a slightly different reading of 1035 -- and, in particular,
its syntax references to "character strings" -- globally
prohibits binary labels (e.g., numbers with values > 255) as
distinct from octet strings which might have the high bit of
such octets set.  That would, I think, be an unfortunate
restriction on future extensibility, but it makes things a bit
less complicated.  But I can't get from "they are all
characters" to "case-map ASCII graphics only, and exact-match
everything else" -- it appears to me that matching rules for the
above 0x7F range would then be strictly undefined until rules
are standardized for their interpretation (in the existing RRs
or otherwise).  The advantage of that "character string" reading
is that it would be possible to write RFCs that specify how
those "high bit" octets should be processed.  If "full binary"
is permitted, then I think such octets are prohibited forever in
the vintage-1035 "standard" RRs.

> You could also argue that sending eight-bit
> codes with the ~standard RRs from STD13 is a bad idea but not
> prohibited and I would agree with you.

I would accept that argument and then note that the robustness
principle slides [generally] "sending [is a] bad idea" into
"prohibited" rather quickly.

   john