RE: [idn] homograph attacks



We list what we do, as opposed to what we do not do, at
http://www.verisign.com/products-services/naming-and-directory-services/naming-services/internationalized-domain-names/idn-standards/idn-character-variants/page_002087.html
Korean is missing from this list and should be there.

We are currently updating this portion of the site. 

As for Tajik, and for that matter most of the languages that have
available tags, the issue is the availability of tables to support the
tags.  I am not an expert on Tajik, but I do know that there may be some
mapping between scripts that they may be inclined to do for their language.  I
don't know, so I have to wait for the table.  It would be the same if the
Germans determined that they want to map u to ü.  I can't and won't make that
determination.  VeriSign has always wanted to use the tables that were
developed by appropriate bodies (either NICs or local language bodies), but
the development process for those tables appears to be stalled at best.  For
many languages there are simply no tables.

CJK tables are clearly available and have been deployed within VeriSign.  As
others are developed we are deploying them.  We are moving forward this year
with German, Danish, Swedish, Norwegian and Thai as those tables have been
listed with appropriate documentation on IANA.  There are ccTLDs that have
IDN programs that have not listed their table or have identified characters
that serve a broader community within their constituency such as Denic
opening up 92 additional characters or NASK developing their own Hebrew and
Arabic tables.  These speak not to a language, but to the practices of a
specific TLD.

Pat

-----Original Message-----
From: "Martin v. Löwis" [mailto:martin@v.loewis.de] 
Sent: Tuesday, February 15, 2005 3:29 PM
To: Kane, Pat
Cc: tedd; idn@ops.ietf.org; ericj@shmoo.com
Subject: Re: [idn] homograph attacks

Kane, Pat wrote:
> VeriSign does prevent domains with the Russian language tag from commingling
> A-Z with the Cyrillic characters.  It does permit 0-9 and the dash to be
> used.  This filter also applies to other Cyrillic based languages such as
> Belarusian, Ukrainian, Serbian, Macedonian and Bulgarian.
>
> There are other languages that are listed within ISO 639-2 that today use a
> combination of Latin and Cyrillic as they were originally Latin based (Tajik
> was Arabic prior to being Latin based), migrated to Cyrillic during the
> Soviet era and today are migrating back to Latin.
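
To make the rule quoted above concrete, here is a minimal sketch in Python of
how such a filter might look.  It is not VeriSign's implementation: the tag
set NO_MIXING_TAGS, the function names, and the use of Unicode character
names to guess a script are all assumptions made for illustration.  Tags
outside the set (TGK, for example) are simply left unconstrained here, which
may not match what the registry actually does.

```python
import unicodedata

# Assumption: ISO 639-2 codes for the languages named in the quoted text.
# The real registry configuration is not shown in this thread.
NO_MIXING_TAGS = {"RUS", "BEL", "UKR", "SRP", "MKD", "BUL"}

# Digits and the dash are permitted alongside either script.
NEUTRAL = set("0123456789-")


def scripts_in_label(label):
    """Return the set of scripts seen in a label, ignoring 0-9 and '-'.

    Script detection here is a rough heuristic based on the Unicode
    character name prefix, not a full script-property lookup.
    """
    scripts = set()
    for ch in label:
        if ch in NEUTRAL:
            continue
        name = unicodedata.name(ch, "")
        if name.startswith("CYRILLIC"):
            scripts.add("Cyrillic")
        elif name.startswith("LATIN"):
            scripts.add("Latin")
        else:
            scripts.add("Other")
    return scripts


def allowed_for_tag(label, tag):
    """Reject labels that mix Latin and Cyrillic under a no-mixing tag."""
    if tag.upper() in NO_MIXING_TAGS:
        return not {"Latin", "Cyrillic"} <= scripts_in_label(label)
    # Assumption: no constraint is modelled for any other tag.
    return True


if __name__ == "__main__":
    spoof = "p\u0430ypal"  # second letter is CYRILLIC SMALL LETTER A
    print(allowed_for_tag(spoof, "RUS"))
    print(allowed_for_tag(spoof, "TGK"))
```

Under this sketch a label that mixes a Cyrillic letter into otherwise Latin
text is rejected for RUS, and passes for TGK only because the sketch models
no constraint for that tag.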

Thanks for the clarification. Is this information publicly available
somehow? On

http://www.verisign.com/static/002533.pdf

I can find the language code list (which shows that indeed TGK and RUS
might be treated differently); I wonder whether you somehow list the
constraints implemented for each tag. How did the applicant know that
he would have to use Tajik in order to get a Cyrillic letter into an
otherwise Latin label?

As for the Tajik writing system: why is it then necessary to allow
mixed scripts? Wouldn't the Tajik users be satisfied if you could
either register all-Latin or all-Cyrillic labels (perhaps allowing
all-Arabic as well)?

Regards,
Martin