
Re: [idn] Re: process



Erik van der Poel <erik@vanderpoel.org> wrote:

> Stephane Bortzmeyer wrote:
>
> >The issue has been discussed at length. See the "Security
> >Considerations" of RFC 3490.
>
> It is true that some of the issues are pointed out by that section, so
> the registries and application developers have to pay attention. But
> one might argue that we have recently been discussing a new *class*
> of homographs.  The RFC mentions "multiple scripts" and one and l.
> These two refer to letters such as Cyrillic small 'a' and digits
> (the "one").  But the slash homograph recently raised on this list
> might be considered to be a new class of homograph (punctuation), not
> specifically indicated in the RFC.  Not only is this type of character
> different from letters and digits, it is arguably even more dangerous
> than the script-based (Cyrillic) attack, since it can be done in a
> domain label that is not under the control of the registries.

We knew that punctuation could be hazardous, and we expected that it
would be severely restricted by the registries.  I don't think we
understood that punctuation could be used to spoof top-level domains
even if every top-level registry prohibited punctuation.
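To make the hazard concrete, here is a minimal sketch of a check for slash lookalikes inside a host name. The character set, the function name find_slash_lookalikes, and the sample host name are assumptions chosen for illustration; they are not taken from RFC 3490 or from UTR#36, and a real application would need a much broader list of confusable characters.

```python
# Illustrative set only: two characters that can resemble "/" when displayed.
# This list is an assumption for the example, not a list from any spec.
SLASH_LOOKALIKES = {
    "\u2044": "FRACTION SLASH",
    "\u2215": "DIVISION SLASH",
}


def find_slash_lookalikes(hostname):
    """Return (label_index, character_name) for each lookalike found."""
    hits = []
    for index, label in enumerate(hostname.split(".")):
        for ch in label:
            if ch in SLASH_LOOKALIKES:
                hits.append((index, SLASH_LOOKALIKES[ch]))
    return hits


# Hypothetical spoof: displayed, this may look like a path under
# "example.com", but the whole string is one host name whose
# registered domain is "attacker.example".
spoof = "example.com\u2215login.attacker.example"
print(find_slash_lookalikes(spoof))
```

Note that the lookalike sits in the second label ("com" plus the slash plus "login"), which is chosen by the holder of attacker.example and not by any top-level registry.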

As for application implementors, we made no attempt to mention
every kind of hazard we had thought of; we just wanted to give a
motivating example to start them thinking about what safeguards would be
appropriate for their applications.

Maybe the emerging UTR#36 will become the canonical reference for
spoofing hazards, in which case any revision of the IDNA spec should
certainly cite it.

>    No security issues such as string length increases or new allowed
>    values are introduced by the encoding process or the use of these
>    encoded values, apart from those introduced by the ACE encoding
>    itself.
>
> What does this mean, exactly?  Are any new allowed values introduced
> by the ACE encoding?  This part could be clearer.

It might mean that IDNA does not introduce any new ASCII domain names;
it only introduces new non-ASCII domain names.  In any case, that's
true.
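One way to see this is a small sketch using Python's built-in "idna" codec (an implementation of the RFC 3490 ToASCII operation). The helper name ace_form and the LDH_LABEL pattern are assumptions introduced for the example. An ASCII label passes through unchanged, and the ACE form of a non-ASCII label is an ordinary letter-digit-hyphen label of a kind that was already syntactically permitted.

```python
import re

# Letter-digit-hyphen label syntax traditionally used for host names
# (pattern written for this example, limited to 63 octets per label).
LDH_LABEL = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")


def ace_form(label):
    """Encode one label with Python's built-in 'idna' codec."""
    return label.encode("idna").decode("ascii")


ascii_label = "example"
idn_label = "b\u00fccher"  # "bücher"

print(ace_form(ascii_label))   # ASCII input is returned as-is
print(ace_form(idn_label))     # non-ASCII input becomes an "xn--" label
print(bool(LDH_LABEL.match(ace_form(idn_label))))
```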

> Also, O and 0 are mentioned, but is this technically correct?  I mean,
> aren't uppercase ASCIIs supposed to be lowercased?

Nameprep (which includes case-folding) is used for encoding and
comparing domain names, not for displaying them.  At least, IDNA makes
no suggestion that Nameprep be used for displaying domain names.  If an
IRI contains a mixed-case non-ASCII domain name, IDNA suggests applying
ToUnicode to each domain label.  ToUnicode internally applies Nameprep
before looking for the ACE prefix, but when it does not find the prefix
it returns the original un-Nameprep'd input for display.
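The behavior described above can be sketched roughly as follows. This is a simplified illustration, not a conforming implementation: the function name to_unicode_sketch is made up for the example, the round-trip verification through ToASCII that RFC 3490 requires is omitted, and Nameprep is borrowed from Python's encodings.idna module.

```python
from encodings.idna import nameprep

ACE_PREFIX = "xn--"


def to_unicode_sketch(label):
    """Simplified ToUnicode: on any failure, return the input unchanged."""
    original = label
    try:
        work = label
        if not work.isascii():
            # Nameprep (including case folding) is applied internally...
            work = nameprep(work)
        if not work.lower().startswith(ACE_PREFIX):
            # ...but with no ACE prefix, the ORIGINAL input is returned,
            # so the displayed form keeps its mixed case.
            return original
        # Strip the prefix and decode the Punycode remainder.
        # (The ToASCII round-trip check is omitted in this sketch.)
        return work[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    except UnicodeError:
        return original


print(to_unicode_sketch("B\u00dcCHER"))     # mixed/upper case is preserved
print(nameprep("B\u00dcCHER"))              # what comparison would use
print(to_unicode_sketch("xn--bcher-kva"))   # ACE label is decoded
```

The first two calls show the gap between the displayed form and the form used for comparison, which is why a confusable check that looks only at the displayed string may miss case differences.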

The draft UTR#36, unlike the IDNA spec, recommends using Nameprep for
displayed domain names, to simplify detection of confusable names.

AMC