Re: [idn] Comments on IDNA/stringprep/nameprep



Kent Karlsson <kentk@md.chalmers.se> wrote:

> A *delta* of the hostnameprep whould/should be used for SRV records;
> i.e. it would use hostnameprep with a specified modification:  "LOW
> LINE is not prohibited"

But what about when the exact classification of a domain name is
unknown?  Maybe the software knows that it's a textual domain name but
doesn't know whether it's a host or an SRV or something else that was
invented after the software was written.  Should the software err on the
side of being too lenient or too strict?  Should it default to enforcing
the hostname syntax, which is probably the most restrictive syntax there
will ever be?  How could it default to anything else if nothing else
were defined in IDNA?

Nameprep defines a least common denominator--the most lenient
restrictions that *all* textual domain names must adhere to.  Software
can always apply those restrictions without fear of disabling legitimate
domain names, and software can depend on the fact that all legitimate
domain names adhere to those restrictions.  Hostnameprep would not have
those useful properties.

Your idea of having one basic preparation for domain names and deltas
against it for particular kinds of domain names is a good idea, but I
think the deltas should always add restrictions, never remove them.  In
this model, the host restrictions in step 3 of ToASCII are a delta.
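
To make the "deltas only add restrictions" model concrete, here is a
minimal sketch in Python.  Every name in it (base_prohibited,
host_delta, check_label) is hypothetical, and the character sets are
simplified stand-ins, not the actual nameprep tables or the exact host
rules from ToASCII.  The point is only the shape: a label is checked
against the base restrictions plus zero or more deltas, and a delta can
reject more characters but can never re-allow one.

```python
import unicodedata


def base_prohibited(ch):
    # Stand-in for the base (nameprep-like) prohibitions that every
    # textual domain name must satisfy. Assumption: control characters
    # and space separators only; the real tables are much larger.
    return unicodedata.category(ch) in ("Cc", "Zs")


def host_delta(ch):
    # Stand-in for an extra host-only restriction: among ASCII
    # characters, allow only letters, digits, and hyphen-minus.
    return ord(ch) < 128 and not (ch.isalnum() or ch == "-")


def check_label(label, deltas=()):
    # The base check always runs; deltas are appended to it, so they
    # can only add restrictions, never remove them.
    checks = (base_prohibited,) + tuple(deltas)
    return not any(check(ch) for ch in label for check in checks)
```

With these definitions, check_label("_sip") passes the base check, so
software that doesn't know the kind of name can still accept it, while
check_label("_sip", deltas=(host_delta,)) rejects it because LOW LINE
is not allowed in a host label.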

> I'm not sure why the document was split into "stringprep" and a
> "profile".

So that the stringprep framework could be reused for other things
(things that are not domain names at all).

> If fullwidth letters are perfectly ok to enter (mapped to normal width
> by nameprep), why is not FULLWIDTH FULL STOP ok?  I think most users
> that encounter that or similar (IDEOGRAPHIC FULL STOP in particular)
> will find the currently specified behaviour idiosyncratic.

What currently specified behavior?  IDNA says nothing about the
separators between labels except that they are "usually" dots.  IDNA
says nothing about how to split domain names into labels, or join labels
into domain names.  It neither requires nor prohibits the acceptance of
fullwidth full stop as a domain label separator.
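
Since the choice is left to the application, a splitter could take the
set of accepted separators as a parameter.  The following is a sketch
under that assumption, not anything IDNA specifies; the function name
split_labels and the default separator set (FULL STOP, FULLWIDTH FULL
STOP, IDEOGRAPHIC FULL STOP) are illustrative choices.

```python
import re

# Assumed default: U+002E FULL STOP, U+FF0E FULLWIDTH FULL STOP,
# U+3002 IDEOGRAPHIC FULL STOP. An application is free to pick a
# different set.
DEFAULT_SEPARATORS = "\u002e\uff0e\u3002"


def split_labels(name, separators=DEFAULT_SEPARATORS):
    # Build a character class from the accepted separators and split
    # the name into labels wherever any of them occurs.
    pattern = "[" + re.escape(separators) + "]"
    return re.split(pattern, name)
```

An application that accepts only the ASCII dot would call
split_labels(name, separators="."), in which case a fullwidth full
stop simply stays inside its label.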

I'm not sure there is any wide-reaching standard on this.  STD13 (the
DNS standard) specifies that labels are dot-separated in DNS master
files, but doesn't say they should use that syntax in other places.  RFC
(2)822 says that domain labels are dot-separated in email headers.  But
those two standards disagree about the trailing dot: it is required
in master files and forbidden in email headers.  They also disagree
about whether whitespace is allowed around the dots.  Syntax rules for
delimiting domain labels appear in many separate standards, but I'm not
aware of any overall requirement regarding domain label delimiters that
applies to all applications/protocols.

> "Future keyboards may generate HYPHEN rather HYPHEN-MINUS (except
> perhaps in "programming language mode", which few will use).  At
> least, hostnameprep should not prevent such a development."

I have no strong opinion either way.  Anyone else?  Should hyphen-like
characters get special treatment just as whitespace characters get
special treatment?

AMC