
Re: [idn] IDNA section 3.1 requirement 3



I agree that the discussion of how to represent IDNs correctly to users in applications is very important given the recent spoofing attack.

I am also concerned about "standardizing" how applications should flag IDNs or determine which IDNs are dangerous. I don't see why we need standardization in this area when we never really standardized the "lock icon" for https. Instead, I think we should let market forces work this out and let app developers innovate on how to display these IDNs properly.

OTOH, I think it is useful to document the considerations.

Does this confuse anyone?

-James Seng

On 17-Mar-05, at PM 11:37, Mark Davis wrote:

I agree with John. Use of the raw punycode in place of the represented
characters will cause more user confusion, not less. User sees:

xn--tlralit-byabbe.fr versus xn--tlralit-byabb390f.fr

Presented with a collection of apparently random letters, eyes quickly glaze
over, and people really can't distinguish between two names in any sensible
fashion. Users are not going to memorize which gobbledygook is the one they
want. And this is, of course, vastly magnified once you are outside of
Latin script IDNs.


While I agree on the need to present some mechanism for distinguishing cases
(see tr36), raw punycode is not a good choice. It would even be better to
present raw IP addresses (not that I am really suggesting that).


Mark

----- Original Message -----
From: "John C Klensin" <klensin@jck.com>
To: "IETF idn working group" <idn@ops.ietf.org>
Sent: Wednesday, March 16, 2005 15:43
Subject: Re: [idn] IDNA section 3.1 requirement 3


Adam,

I see where you are going here, but, to some considerable
extent, I think it is a waste of time or worse.

Please take a step back from looking at coding, disclaimers, and
this type of recommendation and put yourself into the place of
the application developer who, other than the registrars and
registries, is the market for the IDNA technology.  Those
application developers, at least the ones who intend to survive,
are quite sensitive to user reactions: users who are unhappy
with user interfaces tend to have thoughts about going
elsewhere.  The IETF has, for years, avoided getting involved
with user interface design for many reasons; being the source of
that unhappiness is one of the reasons.

You are now proposing to transform an IDNA rule that, given
contemporary operating system and display designs, was
essentially meaningless (but harmless) into a fairly complex set
of rules.  If the current rule contradicted the experience
application designers wanted to deliver to their users, they
would ignore it... and have been doing so.  If the new rule is
at variance with the target experience, it, too, will be ignored.

Let's look at a few of the things we know about user behavior
and reactions:

    * They do not like looking at punycode or similar "no
    obvious meaning" and/or "ugly" constructions.  If too much
    of it is displayed, they will be unhappy.

    * They don't like the unpredictable and unexpected.  If they
    are used to seeing native characters and punycode suddenly
    pops up, there had better be a good, and plausible, and
    immediately accessible explanation.

    * They get really irritated with repeated and intrusive
    warnings that they don't know how to interpret and what to
    do about.  With too many pop-up alerts, the usual response
    is to click "ok" every time and to go looking for a way to
    shut the alerts off.  Given that there are many legitimate
    cases that fall into your "display punycode" categories,
    especially in the vicinity of certain scripts and languages,
    the application would generate a lot of false positives and
    the user would learn to ignore whatever is written there.

Make whatever suggestions and recommendations you think
appropriate, but wrapping them in conformance language about
what SHOULD/MUST be done just brings discredit on the protocol/
algorithm core of IDNA.

Just my opinion.

     john

--On Wednesday, March 16, 2005 10:13 PM +0000 "Adam M. Costello"
<idn.amc+0@nicemice.net.RemoveThisWord> wrote:

Consider a domain name containing a slash-homograph.

As it stands, IDNA section 3.1 requirement 3 tells
applications that they "SHOULD" display the non-ACE form.  The
security considerations section, much later, "suggests" that
applications provide visual indications of various anomalies
(from which one could extrapolate that the slash-homograph
would benefit from a visual indication).

I think we've seen that these security concerns need to be
less buried, that "visual indications" are too burdensome on
implementations, and that in some cases (like this one) the
recommendation to display the non-ACE form ought to be
withdrawn, or even reversed (that is, recommend the ASCII
form).

Therefore I propose a technical change to IDNA section 3.1
requirement 3. For reference, here it is as it stands now in
RFC-3490 (with one typo corrected):

   3) ACE labels obtained from domain name slots SHOULD be hidden from
      users when it is known that the environment can handle the non-ACE
      form, except when the ACE form is explicitly requested.  When
      it is not known whether or not the environment can handle the
      non-ACE form, the application MAY use the non-ACE form (which
      might fail, such as by not being displayed properly), or it MAY
      use the ACE form (which will look unintelligible to the user).
      Given an internationalized domain name, an equivalent domain name
      containing no ACE labels can be obtained by applying the ToUnicode
      operation (see section 4) to each label.  When requirements 2 and
      3 both apply, requirement 2 takes precedence.
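
For readers who want to see the two forms side by side, here is a minimal
sketch in Python.  It assumes Python's built-in "idna" codec, which
implements the RFC 3490 ToASCII and ToUnicode operations label by label;
the helper names to_ascii and to_unicode and the sample name are
illustrative only and appear nowhere in the RFC.

```python
def to_ascii(domain: str) -> str:
    """Convert each label of a domain name to its ASCII (possibly ACE) form."""
    # The "idna" codec splits on dots and applies ToASCII to every label.
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert each ACE label of a domain name back to its non-ACE form."""
    # Decoding with the "idna" codec applies ToUnicode to every label.
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    name = "bücher.de"          # sample name chosen for illustration
    ace = to_ascii(name)
    print(ace)                  # the form a user would find unintelligible
    print(to_unicode(ace))      # the form requirement 3 prefers to show
```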

Here is my proposed replacement:

--begin--

   3) When a domain label occupying or obtained from a domain name
      slot is to be shown to a user, it SHOULD NOT simply be shown in
      whatever form it was found in; before being shown it SHOULD be
      forced into either ASCII form (which can be obtained by applying
      ToASCII) or non-ACE form (which can be obtained by applying
      ToUnicode, see section 4), according to the first applicable of
      the following rules:

      a) If requirements 2 and 3 both apply, requirement 2 takes
         precedence, and the ASCII form MUST be used.

      b) When the user has explicitly requested to see one form or the
         other, that form SHOULD be shown.

      c) When it is known that the environment cannot handle the non-ACE
         form, the ASCII form SHOULD be shown.

      d) If the non-ACE form contains any character outside Unicode
         categories L (letter), N (number), and M (mark), other than
         U+002D hyphen-minus, the ACE form SHOULD be shown.

      e) If the application determines that showing the non-ACE form
         would pose too great a risk of misleading the user, the ASCII
         form MAY be shown.  Applications MAY use complex heuristics to
         estimate this risk, but SHOULD try to minimize the negative
         impact on legitimate usage of internationalized domain names.

      f) When it is not known whether the environment can handle the
         non-ACE form, the application MAY show the non-ACE form (which
         might fail, such as by not being displayed properly), or it MAY
         show the ASCII form (which will look unintelligible to the user
         if it is an ACE).

      g) In general, when rules a-f do not apply, the non-ACE form
         SHOULD be shown.

      Rules c, d, and e above apply tests to "the" non-ACE form, but
      in fact there can be many non-ACE forms that differ only in
      capitalization and/or normalization.  If a given non-ACE label
      fails some test, it MAY be converted to an equivalent non-ACE
      label by applying the map and/or normalize steps of [NAMEPREP] (or
      all the steps), and then given another chance to pass the test.

--end--
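
The rules above read as a first-match cascade, which can be sketched
roughly as follows.  This is an illustration in Python, not proposed
text: every function and parameter name is hypothetical; showing the
non-ACE form in the unknown-environment case (rule f) is an arbitrary
pick, since either form is permitted; and showing the mapped/normalized
label when the retry succeeds is an assumption that the text above does
not settle.  The map and normalize steps use stringprep tables B.1 and
B.2 followed by NFKC, via Python's standard stringprep module.

```python
import stringprep
import unicodedata
from typing import Callable, Optional


def map_and_normalize(label: str) -> str:
    """Apply the map (tables B.1, B.2) and normalize (NFKC) steps only."""
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in label
        if not stringprep.in_table_b1(ch)  # B.1: characters mapped to nothing
    )
    return unicodedata.normalize("NFKC", mapped)


def only_letters_numbers_marks(label: str) -> bool:
    """Rule d test: categories L*, N*, M*, or U+002D hyphen-minus only."""
    return all(
        ch == "-" or unicodedata.category(ch)[0] in "LNM" for ch in label
    )


def pass_with_retry(test: Callable[[str], bool], label: str) -> Optional[str]:
    """Return a label that passes `test`, retrying once after mapping."""
    if test(label):
        return label
    retried = map_and_normalize(label)
    return retried if test(retried) else None


def choose_display_form(
    ascii_form: str,
    non_ace_form: str,
    *,
    ascii_only_slot: bool = False,               # requirement 2 applies
    user_choice: Optional[str] = None,           # "ascii" or "non-ace"
    env_handles_non_ace: Optional[bool] = None,  # None means "not known"
    too_risky: Callable[[str], bool] = lambda label: False,
) -> str:
    """Return the label text to show, using the first applicable rule."""
    if ascii_only_slot:                          # rule a
        return ascii_form
    if user_choice == "ascii":                   # rule b
        return ascii_form
    if user_choice == "non-ace":                 # rule b
        return non_ace_form
    if env_handles_non_ace is False:             # rule c
        return ascii_form
    candidate = pass_with_retry(only_letters_numbers_marks, non_ace_form)
    if candidate is None:                        # rule d
        return ascii_form
    candidate = pass_with_retry(lambda s: not too_risky(s), candidate)
    if candidate is None:                        # rule e (application's call)
        return ascii_form
    # Rule f (environment unknown: either form allowed, non-ACE picked here)
    # and rule g (default) both end up showing the non-ACE form.
    return candidate
```

For instance, a label containing U+2044 FRACTION SLASH (one candidate
for a slash-homograph) fails the rule d test both before and after the
retry, so the sketch falls back to the ASCII form.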

Thoughts?

AMC