Re: [idn] I-D ACTION:draft-ietf-idn-nameprep-07.txt



Yves Arrouye <yves@realnames.com> wrote:

> a bunch of ASCII characters are not prohibited anymore.

Yep.

> It may be that one expects IDN applications to do more checks than
> just Nameprep

Indeed.  Please look at the IDNA draft.  Nameprep is just one of a whole
list of steps to be performed.

> why bother with the 0000-0020 and 007F

I've wondered about that myself.  The nameprep draft gives the reasons
for all the prohibited code points.  I'm not sure how convincing all the
reasons are, but you can look at them and judge for yourself.

The reason some printable ASCII characters were prohibited in earlier
versions of nameprep is that they are prohibited in host names.  That
was not a good reason, because we want nameprep to be usable not only
for host names but for all domain names, including names like
_ldap._tcp.foo.net (see RFC 2782 about SRV records).

The special restrictions on host names (not only the LDH restriction,
but also the leading/trailing hyphen restriction) are now checked in
ToASCII in the IDNA spec, not in nameprep.

For convenience, here are steps 2 and 3 of ToASCII:

    2. Perform the steps specified in [NAMEPREP].

    3. Host-specific restrictions: If the label is part of a host name
       (or is subject to host name syntax rules) then perform these
       checks:

         * Verify the absence of non-LDH ASCII code points; that is, the
           absence of 0..2C, 2E..2F, 3A..40, 5B..60, and 7B..7F.

         * Verify the absence of leading and trailing hyphen-minus; that
           is, the absence of U+002D at the beginning and end of the
           sequence.
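
To make step 3 concrete, here is a rough Python sketch of the two
checks as quoted above.  It is only an illustration, not text from the
draft: the function names and the is_host_name flag are made up for this
example, and the label is assumed to have already been through step 2
(nameprep).  Non-ASCII code points are deliberately left alone, since
step 3 only concerns non-LDH ASCII.

```python
# Non-LDH ASCII ranges quoted in step 3 (inclusive, hexadecimal).
NON_LDH_RANGES = [(0x00, 0x2C), (0x2E, 0x2F), (0x3A, 0x40),
                  (0x5B, 0x60), (0x7B, 0x7F)]


def passes_host_restrictions(label: str) -> bool:
    """Hypothetical helper: apply the two host-specific checks of step 3."""
    # Check 1: no non-LDH ASCII code points anywhere in the label.
    for ch in label:
        cp = ord(ch)
        if any(lo <= cp <= hi for lo, hi in NON_LDH_RANGES):
            return False
    # Check 2: no hyphen-minus (U+002D) at the beginning or the end.
    if label.startswith("-") or label.endswith("-"):
        return False
    return True


def check_label(label: str, is_host_name: bool) -> bool:
    """Hypothetical wrapper: step 3 applies only to host name labels."""
    if not is_host_name:
        return True  # other domain names skip the host-specific checks
    return passes_host_restrictions(label)
```

With this sketch, check_label("_ldap", is_host_name=True) is rejected
because of the underscore, while check_label("_ldap", is_host_name=False)
is accepted, which is the behavior wanted for SRV-style names.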

Paul Hoffman / IMC <phoffman@imc.org> wrote:

> If you see a way to get "$$$" through IDNA, by all means let us know,
> because it should not.

If an application is using the domain name $$$.com and knows that the
name is not a host name and not subject to the host name syntax rules,
then $$$ will indeed pass through ToASCII successfully (and unchanged),
as it should.  I can think of no example scenarios today, but maybe
someday this will happen.

AMC