Re: [idn] Re: character tables
Erik,
A few observations...
(1) First, a registry does have the right to require
that registrants observe particular rules and conditions
in subdomains they delegate and to pass those rules down
the tree. Whether that is wise or sensible is another
issue, and enforceability is yet another question.
But, unless national law prevents it, RFC 1591, to which
all TLD registries more or less agreed, rather
explicitly provides for passing the responsibilities to
the community down the tree. Even ignoring troublesome
concepts like "require" and "enforce", certainly nothing
prevents registries from educating and persuading
registrants about how they should behave.
(2) In my regular role as a luser, I really like fast,
easily-used, small-footprint browsers. I'm more
security-conscious and suspicious than the average user,
and therefore also like handy tools to help me dissect
and verify things that might look suspicious. Tying up
a browser with heuristics, such as mixed-script
detectors, that may not work well and have a large
footprint, doesn't impress me as a good tradeoff. For
better or worse, the assumption of a decade ago that
most criminals, especially most electronic criminals,
were stupid is no longer applicable, if ever it was.
That implies, I think, that if we design a simple test
that blocks some look-alike cases but permits other,
more subtle, ones, we will simply drive the phishers to
better understand and use the subtle stuff: not a good
tradeoff.
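To make the point in (2) concrete, here is a minimal sketch of the
kind of simple mixed-script test under discussion. It is an
illustration only, not any browser's actual heuristic: the function
names are hypothetical, and it guesses a character's script from the
first word of its Unicode character name, which is a crude stand-in
for real script data.

```python
import unicodedata


def scripts_in_label(label: str) -> set[str]:
    """Return a rough set of scripts used by the letters in a label.

    Assumption: the first word of the Unicode character name
    (e.g. LATIN, CYRILLIC, GREEK) approximates the script.
    """
    scripts = set()
    for ch in label:
        if not ch.isalpha():
            continue  # skip digits and hyphens; they carry no script here
        name = unicodedata.name(ch, "")
        if name:
            scripts.add(name.split()[0])
    return scripts


def is_mixed_script(label: str) -> bool:
    """Flag a label whose letters come from more than one script."""
    return len(scripts_in_label(label)) > 1


if __name__ == "__main__":
    # Latin letters with one Cyrillic small 'a' (U+0430): flagged.
    print(is_mixed_script("p\u0430ypal"))
    # Entirely Cyrillic letters that resemble Latin "cace": not flagged.
    print(is_mixed_script("\u0441\u0430\u0441\u0435"))
```

The first label is caught, but the second, built wholly from
Cyrillic letters that resemble Latin ones, passes the test. That is
the sort of more subtle case a simple check leaves open.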
(3) As far as surfing around the world is concerned,
we've got a situation today in which the domain name
associated with a particular URL does not really predict
the content to be found on that page. That will
undoubtedly get worse, as more folks discover that the
intersection of domain and host administration with web
site organization often makes it much easier to maintain
versions of pages in multiple languages in the same,
rather than different, DNS trees. So, since I don't
read Chinese, I'm unlikely to frequently seek out pages
whose content is in Chinese. But I frequently find
pages I can read via URLs that contain elements written
in pinyin. I fully expect those elements, and some of
the subdomain names, will shift to Chinese characters as
IDNs and IRIs are more widely available. I also expect
that transition will make things more comfortable for
someone who reads Chinese and would prefer to not deal
with Latin characters and harder for me, but that is a
reasonable tradeoff over which none of us will have much
influence.
(4) We need to get unstuck from thinking about this
purely as a browser problem. The usual phishing attack
involves an email message containing a link. Email
clients that don't invoke a full browser as soon as a
link appears --and many of those links occur in
plain-text, not HTML, email-- invoke the browser only
when the link is clicked. The
situation in the browser is then different, since none
of the "hover over link", "look at status bar", etc.,
tools are going to apply, or, at least, are not going to
work in the ways that some of these discussions suggest
for links that appear on web pages that are already open
in the browser. Now, we have given MUA writers no
advice about what they should pass to the browser if
they see an IRI or otherwise-encoded string that
contains an IDN. If they pass the
IRI/native-script-form IDN, they risk passing it to a
browser version that doesn't have a clue. So maybe they
force the thing into URI/punycode form and pass that.
Now, do you really want the browser to look at the
thing, perform ToUnicode on the name (which, of course,
may yield something other than what the user saw),
perform some tests, and then pop up a "you just passed
me an IDN that looks suspicious, do you really want to
open that page?" box? I think probably not. Moreover,
I think that, if you do, there would quickly be a
sufficient number of false positives (positive for bad
stuff) to get users really used to clicking "yes"
without thinking... and cursing the browser implementer
for bothering them with a pointless warning.
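The parenthetical in (4), that ToUnicode may yield something other
than what the user saw, can be sketched as follows. This is a rough
illustration under stated assumptions: it uses Python's built-in
"idna" codec, which follows the 2003 IDNA rules (RFC 3490 with
nameprep), and the helper names are hypothetical, not taken from any
MUA or browser.

```python
def to_ascii(host: str) -> str:
    """Convert a host name to its ASCII (punycode) form, label by label."""
    return host.encode("idna").decode("ascii")


def to_unicode(host: str) -> str:
    """Convert an ASCII (punycode) host name back to native-script form."""
    return host.encode("ascii").decode("idna")


def round_trip_differs(seen: str) -> bool:
    """True if the name the browser recovers is not the string the user saw."""
    return to_unicode(to_ascii(seen)) != seen


if __name__ == "__main__":
    seen = "B\u00fccher.example"       # what appeared in the message
    passed = to_ascii(seen)            # what a cautious MUA might hand over
    recovered = to_unicode(passed)     # what the browser would test or display
    print(passed, recovered, round_trip_differs(seen))
```

Here nameprep folds the capital letter before encoding, so the
string the browser recovers is lowercase and no longer matches the
one in the message. Any test run at that point examines the mapped
name, not what the user actually looked at.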
So my conclusion is that we need a mixed
protocol-registry-browser strategy. That strategy, IMO, should
shift the processing burdens as much as possible to the first
two. And I think that notions that the problem can or should be
solved in any of those three places alone are probably misguided.
john
--On Tuesday, 01 March, 2005 20:47 -0800 Erik van der Poel
<erik@vanderpoel.org> wrote:
>> However, I note that this particular conversation is between
>> a browser developer (Gervase) and one of the IDNA authors
>> (Paul), neither of which is a registry representative, so
>> why exactly are you 2 having this conversation? :-)
>>
>> Sorry, I'm half joking. Half, because you two have every
>> right to discuss whatever you wish. The other half because I
>> believe browser developers can afford to focus more on their
>> end of things.
>
> Sorry, I've been told that this half-joking thing was
> confusing, and I now believe I shouldn't have tried to be so
> cute.
>
> All I'm trying to say to *Gervase* is that it doesn't really
> matter *what* characters are allowed to be registered in a
> registry, as long as the browser takes steps to warn the user
> when something phishy might be going on, e.g. a slash
> homograph, or a Cyrillic small 'a' when the user was probably
> expecting a Latin small 'a'. As I have pointed out, the
> registry does *not* have control over higher-numbered level
> domains. E.g. .de controls the 2nd level domain (2LD), but not
> the 3LD, 4LD and so on. That is where the slash homograph
> problem *really* matters.
>
>> Instead, I wish the browser developers would
>> focus more on the *user*, who may be "surfing" from one site
>> to the next, spanning the globe, and crossing language
>> boundaries.
>
> Sorry, this may not have been the best logic to use in my
> argument. It would have been better to talk about phishers,
> who often spam users with email containing URIs that *could*
> contain IDN labels with dangerous homographs at any level of
> the name, 2LD, 3LD, or whatever.
>
> (Most users *don't* surf around the world, since many are
> monolingual or maybe bilingual.)
>
> Anyway, help me out, guys and gals. Pull my logic through the
> wringer, and comb it with the finest comb you have at your
> disposal. This way, we can collectively improve our
> understanding of the IDN phishing problem and ways to address
> it.
>
> Erik