
Re: [idn] who should be doing IDN filtering

To: Eric Johanson <ericj@shmoo.com>, idn@ops.ietf.org
Subject: Re: [idn] who should be doing IDN filtering
From: Martin Duerst <duerst@w3.org>
Date: Thu, 17 Feb 2005 18:17:32 +0900
In-reply-to: <Pine.LNX.4.61.0502161902070.10267@localhost>
References: <Pine.LNX.4.61.0502161902070.10267@localhost>

Hello Eric,

At 12:37 05/02/17, Eric Johanson wrote:

>My question for the group:
>
>Are you sure that it's the registrars/TLDs/etc whom should be doing this filtering?

Yes, this would be ideal. The main reason is that a domain name
gets registered once, but it gets used thousands if not millions
of times. Also, there is much less registration software than there
are browsers, and it can be updated much more easily than all the
browsers around the world. In addition, some approaches require
tables, which might be difficult to fit into a mobile phone.

This does not rule out browsers implementing some warnings
in certain cases, but that may be by completely different means
(e.g. a browser could check a "phishing alert" database and put
up a warning for pajpal.com; it seems that you already had
a similar idea at the end of this mail).
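
A minimal sketch of the browser-side check mentioned above, assuming a
local set of flagged domains. The names PHISHING_ALERTS, normalize_host
and should_warn are hypothetical, and a real "phishing alert" database
would be fetched and updated from a service the browser trusts:

```python
# Hypothetical local copy of a "phishing alert" list (an assumption for
# illustration; no real database or service is implied).
PHISHING_ALERTS = {"pajpal.com"}


def normalize_host(host: str) -> str:
    """Lowercase the host and drop a trailing dot so lookups are consistent."""
    return host.strip().lower().rstrip(".")


def should_warn(host: str, alerts=PHISHING_ALERTS) -> bool:
    """Return True if the host or any of its parent domains is flagged."""
    labels = normalize_host(host).split(".")
    # Check the full host first, then each parent domain in turn,
    # so that www.pajpal.com is caught by the entry for pajpal.com.
    return any(".".join(labels[i:]) in alerts for i in range(len(labels)))


if __name__ == "__main__":
    for name in ("www.pajpal.com", "paypal.com"):
        print(name, "-> warn" if should_warn(name) else "-> ok")
```

Note that this check works on the name itself and needs no language
information at all.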

>I understand that by using the language tag, it generally might make some type of filtering concepts easier & more practical to deploy.
>
>Because this 'language tag' is only available to registrars (when I say registrars, I mean anyone involved with the registration of a new domain, on any TLD), I suspect it makes it impractical to do the filtering at the browser/application level.

Not exactly. The main reason would be that even with language tags,
different registries may have different policies.

On the other hand, except for those cases where there is a
'reservation' of variants, even the registrar/registry doesn't
have to know the language except for tracking purposes. As long
as it fits any one of the tables they offer, it's okay.
And 'reservation' up to now only has been used to address the
simplified/traditional case in Chinese. While it may look nice
in theory to e.g. create some mechanism for reserving/
equating o-umlaut and oe in German, and so on, this isn't
actually appropriate for names (Goethe (the famous one) would never
be written Go"the, although there are plenty of Germans that
write their name that way). Then there are the cases where
an 'o' and an 'e' are adjacent without being pronounced like
o-umlaut. And then the number of such letters per name is
small enough to let people register both variants if they
think they need to.
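
The "fits any one of the tables" check described above could look
roughly like the following sketch. The two tables here are tiny,
hypothetical stand-ins (real registry tables are larger and are defined
by each registry), and the names TABLES, matching_tables and
is_registrable are assumptions for illustration only:

```python
# Hypothetical per-language character tables, keyed by language tag.
ASCII_LDH = set("abcdefghijklmnopqrstuvwxyz0123456789-")
TABLES = {
    "de": ASCII_LDH | set("äöü"),
    "en": ASCII_LDH,
}


def matching_tables(label: str, tables=TABLES) -> list:
    """Return the tags of all tables that contain every character of the label."""
    chars = set(label.lower())
    return sorted(tag for tag, allowed in tables.items() if chars <= allowed)


def is_registrable(label: str, tables=TABLES) -> bool:
    """A label is acceptable if it fits at least one offered table."""
    return bool(matching_tables(label, tables))


if __name__ == "__main__":
    # The last label contains CYRILLIC SMALL LETTER O (U+043E) in place of "o".
    for label in ("goethe", "göthe", "g\u043eethe"):
        print(repr(label), matching_tables(label), is_registrable(label))
```

In this sketch the registry never needs to be told the language: it only
asks whether some table covers the label, which also rejects labels that
mix characters from outside every offered table.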

>My question is:
>
>...assuming we can make the language tag available via some dns tricks or some API...
>
>Shouldn't we leave the filtering up to the user, and the user's browser? It's likely the application will know more about what the user expects as far as site identity (some tricks I've heard proposed is to not warn about sites in the users favorites, or create whitelists of 'trusted sites' [see trustbar]), and knows about the local language settings.

These kinds of warnings are indeed appropriate on the user side.
But the language tag isn't helpful for that; the user sees (and
is potentially fooled by) the characters, not the language tag.

>For example, if I have windows XP, with the default language set to german, the browser isn't (as obligated) to warn the user of potential IDN phishing attacks when it sees german scripts in the IDN. My hope is that this idea could scale to any script/language, but I fear that I'm missing something.

No, this is an idea worth exploring. But the problem is that it
may be a lot of work to try to figure out what kinds of letters
German users confuse, and there may be a lot of variation between
different users, and browser makers may have a hard time covering
more than just a few languages.

>I'm not proposing we don't have some type of filtering of IDNs at the registrars, etc. I'm simply proposing we give application (browser, email) developers the same information the registrars have, in order to give them a fighting chance of warning the user of a homograph attack.

They have a chance already. Language information doesn't buy
them anything more, I think.

>Taking this one step further:
>
>.. if we think about IDN homograph attacks as the same class of problem as 'spam/UCE'....
>
>..Then tools will just start to exist to filter it. Imagine, if you will, a RBL which collects IDNs found in spam/phishing attacks, and communicates with a users browser.

Feasible maybe, and if feasible, can be used for plain old ASCII stuff, too.
Or is it that because it hasn't been used for ASCII, we may conclude that
it doesn't work for IDNs?

Mind you, there are very serious organizations (e.g.
http://www.antiphishing.org/) which are very hard at work to
make sure that anybody involved in real phishing (as opposed to
setting up an example to show how you might do it) gets removed
from the DNS and the Web as quickly as they can manage.

Regards, Martin.
