Consider a domain name containing a slash-homograph.
As it stands, IDNA section 3.1 requirement 3 tells
applications that they "SHOULD" display the non-ACE form. The
security considerations section, much later, "suggests" that
applications provide visual indications of various anomalies
(from which one could extrapolate that the slash-homograph
would benefit from a visual indication).
I think we've seen that these security concerns need to be
less buried, that "visual indications" are too burdensome on
implementations, and that in some cases (like this one) the
recommendation to display the non-ACE form ought to be
withdrawn, or even reversed (that is, recommend the ASCII
form).
Therefore I propose a technical change to IDNA section 3.1
requirement 3. For reference, here it is as it stands now in
RFC 3490 (with one typo corrected):
3) ACE labels obtained from domain name slots SHOULD be
hidden from users when it is known that the environment
can handle the non-ACE form, except when the ACE form is
explicitly requested. When it is not known whether or
not the environment can handle the non-ACE form, the
application MAY use the non-ACE form (which might fail,
such as by not being displayed properly), or it MAY use
the ACE form (which will look unintelligible to the user).
Given an internationalized domain name, an equivalent domain
name containing no ACE labels can be obtained by
applying the ToUnicode operation (see section 4) to each
label. When requirements 2 and 3 both apply,
requirement 2 takes precedence.
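For anyone who wants to see the two forms side by side, the per-label conversion mentioned above can be sketched in Python. This is only an illustration: it relies on Python's built-in "idna" codec, which implements the RFC 3490 operations, and the helper names to_non_ace and to_ascii_form are made up for this example.

```python
def to_non_ace(domain: str) -> str:
    """Apply ToUnicode to each label of an all-ASCII domain name."""
    # The "idna" codec splits on dots and converts label by label.
    return domain.encode("ascii").decode("idna")


def to_ascii_form(domain: str) -> str:
    """Apply ToASCII to each label, yielding the all-ASCII form."""
    return domain.encode("idna").decode("ascii")


print(to_non_ace("xn--bcher-kva.example"))   # bücher.example
print(to_ascii_form("bücher.example"))       # xn--bcher-kva.example
```

A label that is already plain ASCII and carries no ACE prefix comes back unchanged from both helpers.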
Here is my proposed replacement:
--begin--
3) When a domain label occupying or obtained from a domain
name slot is to be shown to a user, it SHOULD NOT simply
be shown in whatever form it was found in; before being
shown it SHOULD be forced into either ASCII form (which
can be obtained by applying ToASCII) or non-ACE form
(which can be obtained by applying ToUnicode, see
section 4), according to the first applicable of the
following rules:
a) If requirements 2 and 3 both apply, requirement 2
takes precedence, and the ASCII form MUST be used.
b) When the user has explicitly requested to see one
form or the other, that form SHOULD be shown.
c) When it is known that the environment cannot handle
the non-ACE form, the ASCII form SHOULD be shown.
d) If the non-ACE form contains any character outside
Unicode categories L (letter), N (number), and M
(mark), other than U+002D hyphen-minus, the ASCII form
SHOULD be shown.
e) If the application determines that showing the
non-ACE form would pose too great a risk of
misleading the user, the ASCII form MAY be shown.
Applications MAY use complex heuristics to estimate
this risk, but SHOULD try to minimize the negative
impact on legitimate usage of internationalized domain names.
f) When it is not known whether the environment can
handle the non-ACE form, the application MAY show the
non-ACE form (which might fail, such as by not being
displayed properly), or it MAY show the ASCII form
(which will look unintelligible to the user if it is
an ACE).
g) In general, when rules a-f do not apply, the non-ACE
form SHOULD be shown.
Rules c, d, and e above apply tests to "the" non-ACE
form, but in fact there can be many non-ACE forms that
differ only in capitalization and/or normalization. If
a given non-ACE label fails some test, it MAY be
converted to an equivalent non-ACE label by applying the
map and/or normalize steps of [NAMEPREP] (or all the
steps), and then given another chance to pass the test.
--end--
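To make the ordering of the rules concrete, here is a rough Python sketch of the decision procedure. It is not a reference implementation: the function and parameter names (choose_form, idn_unaware_slot, requested, env_handles_non_ace, judged_risky) are hypothetical, the risk heuristic of rule e is left to the caller, and where rule f permits either form the sketch arbitrarily picks ASCII. The second chance after map/normalize uses nameprep from Python's encodings.idna module.

```python
import unicodedata
from encodings.idna import nameprep
from typing import Optional


def passes_rule_d(label: str) -> bool:
    """True if every character is a letter, number, or mark, or is U+002D."""
    return all(
        ch == "-" or unicodedata.category(ch)[0] in "LNM"
        for ch in label
    )


def passes_with_retry(label: str) -> bool:
    """Rule d test, plus the optional second chance after nameprep."""
    if passes_rule_d(label):
        return True
    try:
        return passes_rule_d(nameprep(label))
    except UnicodeError:
        # nameprep rejects prohibited characters; count that as a failure.
        return False


def choose_form(
    non_ace_label: str,
    *,
    idn_unaware_slot: bool = False,              # requirement 2 applies
    requested: Optional[str] = None,             # "ascii" or "non-ace"
    env_handles_non_ace: Optional[bool] = None,  # None means "not known"
    judged_risky: bool = False,                  # caller's own heuristic
) -> str:
    """Return "ascii" or "non-ace" per the first applicable rule a-g."""
    if idn_unaware_slot:                      # rule a
        return "ascii"
    if requested in ("ascii", "non-ace"):     # rule b
        return requested
    if env_handles_non_ace is False:          # rule c
        return "ascii"
    if not passes_with_retry(non_ace_label):  # rule d
        return "ascii"
    if judged_risky:                          # rule e (a MAY; taken here)
        return "ascii"
    if env_handles_non_ace is None:           # rule f (either form allowed)
        return "ascii"
    return "non-ace"                          # rule g


# U+2215 DIVISION SLASH is category Sm, so rule d forces the ASCII form.
print(choose_form("a\u2215b", env_handles_non_ace=True))
print(choose_form("b\u00fccher", env_handles_non_ace=True))
```

A label such as "shop" + U+00AD + "ping" fails the rule d test as given (the soft hyphen is category Cf), but passes on the second attempt because the nameprep map step removes it.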
Thoughts?
AMC