Re: [idn] I-D ACTION:draft-ietf-idn-idna-08.txt



[This message replies to messages from both Eric Hall and John Klensin.]

"Eric A. Hall" <ehall@ehsco.com> wrote:

> > I was suggesting that conversion between ASCII and non-ASCII never
> > be done inside the infrastructure except possibly when it uses the
> > well-known standard profile; for application-specific profiles, I
> > was suggesting that conversion be done only at the edges.  This
> > model avoids profile-agnostic conversion; only entities that know
> > the proper profile perform the conversion, which simplifies the
> > security analysis.
>
> If the query failed at some remote point in the infrastructure (like
> the authoritative zone for the owner name in question), then the
> infrastructure would get whacked twice every time the application went
> to ask for that data (once for the EDNS form, again for the ACE form).

I guess I still wasn't clear enough.  For labels that use
application-specific profiles, there would not be two forms inside the
infrastructure, only one form.  The application would always convert
to/from that form at the edges as necessary.  Inside the infrastructure,
dual forms would be used only for labels that use the standard profile.
This adds a small burden to applications that don't use the standard
profile, but it significantly simplifies the model.

> This is a very serious concern and must be prevented at all costs.

I think that's an exaggeration.  A factor-of-two increase in overhead
(which I'm not proposing, but even if I were) would not bring down the
network.  It would be small compared to the exponential growth of the
internet.

> How significant would the change be to have the application call the
> appropriate profile prior to calling ToASCII, rather than having
> ToASCII make the call itself?

Whether Stringprep is called inside ToASCII or before ToASCII is beside
the point.  The real question is "How significant would the change be
to allow IDNA to use more than one profile?"  It would be a fundamental
change that greatly complicates the model.

Here are some more examples of the complications:

What happens when two labels that use different profiles are compared?
They might be identical in their UTF-8 forms but different in their
ASCII forms, so you might get a different answer depending on which form
they're in.  Do we need to prohibit the comparison of labels that use
different profiles?  Old software cannot possibly be aware of such a
prohibition.  What are the security implications?  How can we prohibit
such comparisons and still allow DNS ANY queries, which by their nature
need to compare the same query string against many names associated with
many different RR types?
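
To make the comparison problem concrete, here is a minimal Python sketch.
Both profiles are hypothetical stand-ins that I made up (neither is a real
Stringprep profile), and the "xn--" prefix and the use of Python's built-in
punycode codec are assumptions for illustration only.

```python
import unicodedata

ACE_PREFIX = "xn--"  # assumed prefix, for illustration only


def profile_folding(label: str) -> str:
    """Hypothetical profile: normalize (NFKC) and fold case."""
    return unicodedata.normalize("NFKC", label).lower()


def profile_preserving(label: str) -> str:
    """Hypothetical profile: normalize (NFKC) but keep case."""
    return unicodedata.normalize("NFKC", label)


def to_ascii(label: str, profile) -> str:
    """Apply a profile, then encode to an ASCII-compatible form."""
    prepared = profile(label)
    if prepared.isascii():
        return prepared
    return ACE_PREFIX + prepared.encode("punycode").decode("ascii")


label = "B\u00fccher"  # "Bücher"

# The UTF-8 form is the same no matter which profile a slot intends.
utf8_a = label.encode("utf-8")
utf8_b = label.encode("utf-8")

# The ASCII forms diverge once each slot applies its own profile.
ascii_a = to_ascii(label, profile_folding)
ascii_b = to_ascii(label, profile_preserving)

print(utf8_a == utf8_b)    # the UTF-8 forms compare equal
print(ascii_a == ascii_b)  # the ASCII forms do not
```

Software that compares the UTF-8 forms and software that compares the
ASCII forms would disagree about whether these two labels match.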

The same issue arises when a domain label is copied from one slot to
another, and the two slots call for different profiles.  Does that need
to be prohibited?  Old software cannot be aware of the prohibition.
What are the security implications?

What if someone wants to associate multiple RR types with the same name,
and those RR types call for different profiles?

What profile should the owner name of a CNAME record use?  Should there
be a single profile for all CNAME records?  If so, which profile?  Or
should the owner of a CNAME record use the same profile as the name it
refers to?  If so, what happens when the latter name is in another zone
and its RR type gets changed without notice?  What are the security
implications?

Is it possible to create subdomains of domains that use non-standard
profiles?  If so, a single domain name might need a different profile
for each label.  But each name is associated with only one RR type.  The
RR type might imply which profile to use for the first label (or the
first k labels), but what about the rest?

Eric, you are obviously a smart guy, and it wouldn't surprise me if
you could solve all these issues, but I think the multi-profile model
is too complex and too subtle to be worth its modest benefits.  The
single-profile model covers the common case, and applications that can't
use the standard profile can still do what they need to do, as long as
they do it outside the infrastructure (before converting their data
types into domain names, and after converting domain names back into
their data types).

An analogy might help clarify that model.  Each U.S. citizen typically
has a social security number, which is nine digits.  It is possible (and
in fact trivial) to map social security numbers onto domain labels; for
example, 123456789.nicemice.net.  Domain labels can contain letters, but
that doesn't mean social security numbers can contain letters.  Domain
labels and social security numbers are two entirely distinct data types.
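
As a rough sketch of what "mapping at the edges" could look like for this
analogy, the Python below converts in both directions and validates on the
way back.  The function names are hypothetical; only the nine-digit format
and the example zone come from the text above.

```python
import re

# Exactly nine ASCII digits; [0-9] is used so non-ASCII digits are rejected.
SSN_PATTERN = re.compile(r"[0-9]{9}")


def ssn_to_domain(ssn: str, zone: str = "nicemice.net") -> str:
    """Map a social security number onto a domain name (trivial mapping)."""
    if not SSN_PATTERN.fullmatch(ssn):
        raise ValueError("not a nine-digit social security number")
    return f"{ssn}.{zone}"


def domain_to_ssn(name: str) -> str:
    """Recover the number from the first label, rejecting anything else."""
    first_label = name.split(".", 1)[0]
    if not SSN_PATTERN.fullmatch(first_label):
        raise ValueError("a valid domain label, but not a social security number")
    return first_label


print(ssn_to_domain("123456789"))  # 123456789.nicemice.net
```

The check in domain_to_ssn is the point: a label such as "abc456789" is a
perfectly good domain label, but the application rejects it because it is
not a value of the application's own data type.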

Similarly, data types like email address local parts are not domain
labels, even if they are sometimes mapped onto domain labels.  Just
because domain labels can (soon) contain non-ASCII characters, that
doesn't mean email address local parts can.  RFC 2822 says they are
ASCII, so they can't contain non-ASCII characters until someone defines
internationalized email address local parts.

IDN labels do not preserve case or non-normalized forms.  If an
application needs to map a data type onto a domain label, and the
data type contains non-ASCII text for which mixed-case forms or
non-normalized forms must be preserved, then the trivial mapping simply
won't work.  The application will need to design a more complex mapping
to suit its needs.  For example, when internationalized email address
local parts are defined, that specification will have the opportunity to
say how they get mapped onto domain labels.
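
The loss of case and of non-normalized forms can be seen with a short
Python sketch.  It relies on Python's built-in "idna" codec, which
implements the later RFC 3490 form of IDNA and so may differ in detail
from the draft under discussion; treat it as an illustration, not as the
behavior of any particular draft.

```python
original = "B\u00fccher"     # "Bücher" with a precomposed u-umlaut
decomposed = "Bu\u0308cher"  # same text: "u" plus a combining diaeresis

ace = original.encode("idna")    # ASCII-compatible form of the label
round_trip = ace.decode("idna")  # back to Unicode

print(round_trip == original)             # False: the capital B is gone
print(round_trip == original.lower())     # True: case was folded
print(decomposed.encode("idna") == ace)   # True: normalization erased the difference
```

An application that needs "Bücher" and "bücher" to remain distinct, or
that needs to keep the decomposed spelling, cannot use the trivial
mapping.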

Whenever you try to use one data type to represent another data type,
you might get lucky and find that it's trivial, or there might be a
mismatch and you might have to do more complex mapping/encoding.  That's
just the way it goes.

John C Klensin <klensin@jck.com> wrote:

> Either this is a DNS protocol, in which case it needs to specify
> applicable RRs and fields and may reasonably specify a <foo>prep
> profile --even if it can also be used for names (domain and otherwise)
> in other contexts-- or it is a generic internationalization protocol,
> in which case applicability to the DNS and domain names needs to be
> specified somewhere else.

It is neither.  The domain name is a data type that is used in many
protocols, and DNS is just one of them.  IDNA is a technique
for enabling applications to use non-ASCII characters in domain
names, regardless of which protocol/interface those domain names are
traversing.  IDNA is not a generic internationalization protocol; it
applies only to domain names.  IDNA is not an extension of DNS; it is
not specific to any particular protocol.

[In another note you point out several places where the IDNA spec seems
to get too far into the guts of DNS.  I agree with a lot of that.  I
personally have kept a closer watch on sections 2 through 4 (which I
wrote the first draft of, and which say nothing about DNS) than on the
other sections (which existed before I got here).]

> (4) To what extent should out-of-band communications between
> applications, which utilize strings which the applications might
> construe as internationalized domain names, influence the design of
> IDNA and, if so, what should the impact be?

I'm not sure what you mean by "out-of-band" and "might construe".
The domain names that appear in SMTP MAIL and RCPT commands are,
I would say, very much in-band, and there's no "construing"; they
are unmistakably domain names.  If we are going to enable non-ASCII
characters in domain names in DNS, but not in other protocols, what's
the point?

AMC