Re: [idn] homograph attacks
- To: Martin Duerst <duerst@w3.org>
- Subject: Re: [idn] homograph attacks
- From: William Tan <wil@dready.org>
- Date: Sat, 19 Feb 2005 02:15:56 +1100
- Cc: Erik van der Poel <erik@vanderpoel.org>, Michel Suignard <michelsu@windows.microsoft.com>, Martin Loewis <martin@v.loewis.de>, "Kane, Pat" <pkane@verisign.com>, idn@ops.ietf.org, ericj@shmoo.com, tedd <tedd@sperling.com>, dam@icann.org, roozbeh@sharif.edu
- In-reply-to: <6.0.0.20.2.20050218113553.08077390@localhost>
- References: <FD16260E2EDF204E80A22FB904A425C30BD0A0CF@WIN-MSG-10.wingroup.windeploy.ntdev.microsoft.com> <4214ABF5.3080700@dready.org> <4214C445.9070106@vanderpoel.org> <6.0.0.20.2.20050218113553.08077390@localhost>
- User-agent: Mozilla Thunderbird 1.0 (Windows/20041206)
Hi Martin et al,
Martin Duerst wrote:
> Very much agreed. Except for registries with very special
> policies (such as the blocking used by some East Asian
> registries), the language association doesn't make too
> much sense.
Indeed, for registries that wish to support CJK languages by following
RFC3743, it makes sense to find out the intended language of a label in
order to decide whether to apply traditional/simplified Chinese
bundling. Take for example a label that contains only Han characters.
When used in a Japanese context, there probably wouldn't be any variant
that the registrant would care about. But when the same label is used in
a Chinese context, assuming that it does mean something, the registrant
might want both the traditional and simplified variants of the label.
Quoting John in draft-klensin-reg-guidelines:
> ...and with different geographical and political locations
> and languages having requirements for different collections of
> characters, the optimal registration restrictions became, not a
> global matter, but ones that were different in different areas and,
> hence, in different DNS zones.
It is my belief that CJK is not the only "problematic" or special case.
As more languages are studied for their implications for IDN, more
language-dependent rules may need to be accommodated.
> Imagine that a gTLD registry had a few hundred language tables,
> and imagine that a registrant wanted to register a particular
> sequence of characters. It would be very easy for a registrar
> to set up a service that figured out a language (don't care
> which) that worked, and register the name with that language.
There is nothing technically wrong with this approach. The language tag
is, as I see it, simply a hint: an administrative piece of information
rather than something that concerns the operation of the DNS. It is
used only at registration time, to let the registry decide which rules
to apply to the label. If we could drop the word "language"
and simply stick with "tag", then that tag could name a script, a subset
of a script, or simply a list of characters, labeled by a suitable term
such as "Characters used in Germany", "Nordic languages", or "List of
allowable characters for Japanese".
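To make the "tag as a list of characters" idea concrete, here is a minimal
sketch of the lookup service Martin describes. The table contents are tiny
illustrative subsets that I made up, not any registry's real tables, and the
function names are hypothetical.

```python
# Hypothetical tables: each tag names a plain set of allowed characters.
# The contents are illustrative subsets only, not real registry tables.
TABLES = {
    "Latin set": set("abcdefghijklmnopqrstuvwxyz0123456789-"),
    "Greek set": set("αβγδεζηθικλμνξοπρστυφχψω0123456789-"),
    "Cyrillic set": set("абвгдежзийклмнопрстуфхцчшщъыьэюя0123456789-"),
}


def matching_tags(label, tables=TABLES):
    """Return every tag whose table contains all characters of the label."""
    return [tag for tag, allowed in tables.items() if set(label) <= allowed]


def pick_any_tag(label, tables=TABLES):
    """Return the first tag that works (don't care which), or None."""
    tags = matching_tags(label, tables)
    return tags[0] if tags else None
```

A label that mixes characters from two tables matches no tag at all, while
a label made only of digits matches every tag, so the choice of tag there is
arbitrary. That is the sense in which the tag is only a hint.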
Take the .PL IDN program, for example: as long as the desired label
contains only characters from one of the allowable tables, the
registration will go through. The tables are named "Latin set",
"Cyrillic set", "Greek set", etc. They could jolly well launch a Chinese
table and introduce a rule that says: if your label fits into the Chinese
table, we will reserve any variants of it as well. One could see that as
associating an IDN with a table, where the table may cover a single
language or several languages.
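As a rough sketch of such a table-based rule: the table contents, the
variant pairs, and the names below are all my own assumptions for
illustration, and the variant expansion is far simpler than the actual
RFC3743 procedure.

```python
from itertools import product

# Hypothetical variant pairs (simplified <-> traditional), illustrative only.
VARIANTS = {"国": "國", "國": "国", "学": "學", "學": "学"}

# Hypothetical tables: allowed characters plus a per-table bundling rule.
TABLES = {
    "Latin set": {"chars": set("abcdefghijklmnopqrstuvwxyz0123456789-"),
                  "bundle_variants": False},
    "Chinese set": {"chars": set("中国國学學"),
                    "bundle_variants": True},
}


def variants_of(label):
    """Return all variant spellings of the label, excluding the label itself."""
    # For each character, the options are the character and its variant, if any.
    options = [{ch, VARIANTS.get(ch, ch)} for ch in label]
    return {"".join(combo) for combo in product(*options)} - {label}


def register(label, tables=TABLES):
    """Associate the label with the first table it fits; None means rejected."""
    for tag, table in tables.items():
        if set(label) <= table["chars"]:
            reserved = sorted(variants_of(label)) if table["bundle_variants"] else []
            return {"label": label, "tag": tag, "reserved": reserved}
    return None
```

The registry never needs to know what language the registrant had in mind.
It only needs to know which table the label was matched against, and
whether that table carries a variant rule.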
IMO, "language association" and "table association" are one and the
same concept.
Regards,
wil.