Re: [idn] IDNA section 3.1 requirement 3
I agree with John. Use of the raw punycode in place of the represented
characters will cause more user confusion, not less. The user sees:
xn--tlralit-byabbe.fr versus xn--tlralit-byabb390f.fr
Presented with a collection of apparently random letters, eyes quickly glaze
over, and people really can't distinguish between two names in any sensible
fashion. Users are not going to memorize which gobbledygook is the one they
want. And this is, of course, vastly magnified once you are outside of
Latin script IDNs.
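To make the comparison concrete, here is a minimal sketch (Python, using
only the standard library's "punycode" codec) that decodes the two labels
above and reports where they differ. The helper names decode_label and
decode_name are illustrative only, and stripping the "xn--" prefix by hand
is a simplification: it skips the Nameprep and round-trip checks that a
full ToUnicode implementation performs.

```python
def decode_label(label: str) -> str:
    """Decode one ACE label; non-ACE labels are returned unchanged."""
    if label.lower().startswith("xn--"):
        # Drop the ACE prefix and decode the remainder as raw Punycode.
        return label[4:].encode("ascii").decode("punycode")
    return label


def decode_name(name: str) -> str:
    """Decode every label of a dotted domain name."""
    return ".".join(decode_label(part) for part in name.split("."))


first = decode_name("xn--tlralit-byabbe.fr")      # 'téléréalité.fr'
second = decode_name("xn--tlralit-byabb390f.fr")

print(first)
print(second)

# List the positions where the two decoded names disagree.
for index, (a, b) in enumerate(zip(first, second)):
    if a != b:
        print(f"position {index}: U+{ord(a):04X} vs U+{ord(b):04X}")
```

The decoded forms can be compared at a glance and the differing code
point can be named, which is much harder to do with the two ACE strings.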
While I agree on the need to present some mechanism for distinguishing cases
(see tr36), raw punycode is not a good choice. It would even be better to
present raw IP addresses (not that I am really suggesting that).
-Mark
----- Original Message -----
From: "John C Klensin" <klensin@jck.com>
To: "IETF idn working group" <idn@ops.ietf.org>
Sent: Wednesday, March 16, 2005 15:43
Subject: Re: [idn] IDNA section 3.1 requirement 3
> Adam,
>
> I see where you are going here, but, to some considerable
> extent, I think it is a waste of time or worse.
>
> Please take a step back from looking at coding, disclaimers, and
> this type of recommendation and put yourself into the place of
> the application developer who, other than the registrars and
> registries, is the market for the IDNA technology. Those
> application developers, at least the ones who intend to survive,
> are quite sensitive to user reactions: users who are unhappy
> with user interfaces tend to have thoughts about going
> elsewhere. The IETF has, for years, avoided getting involved
> with user interface design for many reasons; being the source of
> that unhappiness is one of the reasons.
>
> You are now proposing to transform an IDNA rule that, given
> contemporary operating system and display designs, was
> essentially meaningless (but harmless) into a fairly complex set
> of rules. If the current rule contradicted the experience
> application designers wanted to deliver to their users, they
> would ignore it... and have been doing so. If the new rule is
> at variance with the target experience, it, too, will be ignored.
>
> Let's look at a few of the things we know about user behavior
> and reactions:
>
> * They do not like looking at punycode or similar "no
> obvious meaning" and/or "ugly" constructions. If too much
> of it is displayed, they will be unhappy.
>
> * They don't like the unpredictable and unexpected. If they
> are used to seeing native characters and punycode suddenly
> pops up, there had better be a good, and plausible, and
> immediately accessible explanation.
>
> * They get really irritated with repeated and intrusive
> warnings that they don't know how to interpret and what to
> do about. With too many pop-up alerts, the usual response
> is to click "ok" every time and to go looking for a way to
> shut the alerts off. Given that there are many legitimate
> cases that fall into your "display punycode" categories,
> especially in the vicinity of certain scripts and languages,
> the application would generate a lot of false positives and
> the user would learn to ignore whatever is written there.
>
> Make whatever suggestions and recommendations you think
> appropriate, but wrapping them in conformance language about
> what SHOULD/MUST be done just brings discredit on the protocol/
> algorithm core of IDNA.
>
> Just my opinion.
>
> john
>
> --On Wednesday, March 16, 2005 10:13 PM +0000 "Adam M. Costello"
> <idn.amc+0@nicemice.net.RemoveThisWord> wrote:
>
> > Consider a domain name containing a slash-homograph.
> >
> > As it stands, IDNA section 3.1 requirement 3 tells
> > applications that they "SHOULD" display the non-ACE form. The
> > security considerations section, much later, "suggests" that
> > applications provide visual indications of various anomalies
> > (from which one could extrapolate that the slash-homograph
> > would benefit from a visual indication).
> >
> > I think we've seen that these security concerns need to be
> > less buried, that "visual indications" are too burdensome on
> > implementations, and that in some cases (like this one) the
> > recommendation to display the non-ACE form ought to be
> > withdrawn, or even reversed (that is, recommend the ASCII
> > form).
> >
> > Therefore I propose a technical change to IDNA section 3.1
> > requirement 3. For reference, here it is as it stands now in
> > RFC-3490 (with one typo corrected):
> >
> > 3) ACE labels obtained from domain name slots SHOULD be
> > hidden from users when it is known that the environment
> > can handle the non-ACE form, except when the ACE form is
> > explicitly requested. When it is not known whether or
> > not the environment can handle the non-ACE form, the
> > application MAY use the non-ACE form (which might fail,
> > such as by not being displayed properly), or it MAY use
> > the ACE form (which will look unintelligible to the user).
> > Given an internationalized domain name, an equivalent domain
> > name containing no ACE labels can be obtained by
> > applying the ToUnicode operation (see section 4) to each
> > label. When requirements 2 and 3 both apply,
> > requirement 2 takes precedence.
> >
> > Here is my proposed replacement:
> >
> > --begin--
> >
> > 3) When a domain label occupying or obtained from a domain
> > name slot is to be shown to a user, it SHOULD NOT simply
> > be shown in whatever form it was found in; before being
> > shown it SHOULD be forced into either ASCII form (which
> > can be obtained by applying ToASCII) or non-ACE form
> > (which can be obtained by applying ToUnicode, see
> > section 4), according to the first applicable of the
> > following rules:
> >
> > a) If requirements 2 and 3 both apply, requirement 2
> > takes precedence, and the ASCII form MUST be used.
> >
> > b) When the user has explicitly requested to see one
> > form or the other, that form SHOULD be shown.
> >
> > c) When it is known that the environment cannot handle
> > the non-ACE form, the ASCII form SHOULD be shown.
> >
> > d) If the non-ACE form contains any character outside
> > Unicode categories L (letter), N (number), and M
> > (mark), other than U+002D hyphen-minus, the ACE form
> > SHOULD be shown.
> >
> > e) If the application determines that showing the
> > non-ACE form would pose too great a risk of
> > misleading the user, the ASCII form MAY be shown.
> > Applications MAY use complex heuristics to estimate
> > this risk, but SHOULD try to minimize the negative
> > impact on legitimate usage of internationalized domain names.
> >
> > f) When it is not known whether the environment can
> > handle the non-ACE form, the application MAY show the
> > non-ACE form (which might fail, such as by not being
> > displayed properly), or it MAY show the ASCII form
> > (which will look unintelligible to the user if it is
> > an ACE).
> >
> > g) In general, when rules a-f do not apply, the non-ACE
> > form SHOULD be shown.
> >
> > Rules c, d, and e above apply tests to "the" non-ACE
> > form, but in fact there can be many non-ACE forms that
> > differ only in capitalization and/or normalization. If
> > a given non-ACE label fails some test, it MAY be
> > converted to an equivalent non-ACE label by applying the
> > map and/or normalize steps of [NAMEPREP] (or all the
> > steps), and then given another chance to pass the test.
> >
> > --end--
> >
> > Thoughts?
> >
> > AMC
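
For readers who want to see how the proposed rules a-g in the quoted text
would behave, the following is a rough sketch of them as a "first
applicable rule wins" decision procedure in Python. It is not taken from
any IDNA implementation. The function name, the parameters, and the
"ascii"/"unicode" return values are all assumptions made for illustration.
Rule e's risk heuristic is left as a caller-supplied callback, and the
optional Nameprep retry described in the final paragraph of the proposal
is not modeled.

```python
import unicodedata
from typing import Callable, Optional

ASCII = "ascii"
UNICODE = "unicode"


def only_letters_numbers_marks(label: str) -> bool:
    """Rule d test: every character is in category L, N, or M, or is U+002D."""
    return all(
        ch == "-" or unicodedata.category(ch)[0] in "LNM" for ch in label
    )


def choose_display_form(
    unicode_label: str,
    *,
    ascii_only_slot: bool = False,               # requirement 2 applies
    user_request: Optional[str] = None,          # ASCII, UNICODE, or None
    env_handles_unicode: Optional[bool] = None,  # None means "not known"
    looks_risky: Optional[Callable[[str], bool]] = None,
    when_unknown: str = UNICODE,                 # rule f allows either form
) -> str:
    """Return which form to show, applying rules a-g in order."""
    if ascii_only_slot:                                   # rule a
        return ASCII
    if user_request in (ASCII, UNICODE):                  # rule b
        return user_request
    if env_handles_unicode is False:                      # rule c
        return ASCII
    if not only_letters_numbers_marks(unicode_label):     # rule d
        return ASCII
    if looks_risky is not None and looks_risky(unicode_label):  # rule e
        return ASCII
    if env_handles_unicode is None:                       # rule f
        return when_unknown
    return UNICODE                                        # rule g


# A label containing U+2215 DIVISION SLASH (a slash homograph) trips rule d.
print(choose_display_form("a\u2215b", env_handles_unicode=True))
# An ordinary accented label falls through to rule g.
print(choose_display_form("t\u00e9l\u00e9", env_handles_unicode=True))
```

Because the rules are ordered, an explicit user request (rule b) overrides
the character-category test in rule d, so a user who asks for the non-ACE
form of a slash-homograph label would still be shown it.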