
Re: [idn] homogram attacks: cyrillic and registration guideline



Dear Soobok,
It would have been so easy to say: 'all IDNs are to be punycoded at the 3LD level and below, with "xn--ISO 639 language code" as the SLD, the language code + TLD combination being documented by the TLD table'. This would have permitted additional limited or full mixed tables, by permitting ISO 639 + 1 or 2 char tags. It would have been so coherent with the DNS that this is probably the way plug-ins will resolve the problem, until some fed-up governments create their own compressed "xn--ISO 639.ccTLD" version in the form of MLTLDs, with their common addition to the root.


Anyway, all this is of low interest, since what we are discussing are SLDs, not 3LDs and lower, where the registrant is free.
jfc


On 11:29 16/02/2005, Soobok Lee said:

The IDN WG discussed these IDN-based homogram attacks 3 years ago.
The conclusion was that the problem should be solved at the registration stage,
not at the encoding/protocol level.
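
As a rough illustration of what a registration-stage check could look like, here is a minimal Python sketch that flags labels mixing letters from more than one script. The function names are hypothetical, and using the first word of the Unicode character name as the "script" is an assumption made only because the standard library does not expose the script property; it is not taken from any registry guideline.

```python
import unicodedata


def scripts_in_label(label):
    """Return a rough set of scripts used by the letters in a label.

    Assumption: the first word of the Unicode character name
    ("LATIN", "CYRILLIC", "GREEK", ...) stands in for the script.
    Digits and hyphens are ignored.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch).split()[0])
    return scripts


def is_mixed_script(label):
    """True if the letters of the label come from more than one script."""
    return len(scripts_in_label(label)) > 1


# "paypal" with a Cyrillic "a" (U+0430) in second position.
print(is_mixed_script("p\u0430ypal"))
# A label written entirely in Cyrillic passes this particular check.
print(is_mixed_script("\u0456\u0456\u0456"))
```

Note that a check like this only catches mixed labels; a label written entirely in lookalike characters of one script would need a separate rule.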

So we now have an "IDN registration guideline for CJK (Han ideograph) languages", but it
does not cover Cyrillic or Greek yet. The IETF seems to have no plan to expand and
publish it. That is, "Do it yourselves, registries!"


http://www.unicode.org/charts/PDF/U0400.pdf

I ask you all to open this PDF Unicode chart and see how many lowercase Cyrillic
letters look exactly the same as their Latin lowercase counterparts.


To list some of them: "a  e  i  y  c  o  s  j".
(Some of them are not Russian, but are used elsewhere in Eastern Europe.)

Among the uppercase characters there are "B H M P", plus the above 8 characters.
The Cyrillic HP.com / ASCII HP.com case came from the latter category.
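
The lookalikes listed above can be written down as a small table. The following Python sketch folds those Cyrillic code points to their Latin twins and checks a candidate label against already registered names. The pairing of code points is my reading of the U+0400 chart, and the function names and the "registered" set are hypothetical; this is only one way a registry might approach it.

```python
# Cyrillic code points that resemble Latin letters (an illustrative,
# incomplete table; the pairings are an assumption based on the chart).
CYRILLIC_TO_LATIN = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
    "\u0443": "y",  # CYRILLIC SMALL LETTER U
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0455": "s",  # CYRILLIC SMALL LETTER DZE
    "\u0458": "j",  # CYRILLIC SMALL LETTER JE
    "\u0412": "B",  # CYRILLIC CAPITAL LETTER VE
    "\u041d": "H",  # CYRILLIC CAPITAL LETTER EN
    "\u041c": "M",  # CYRILLIC CAPITAL LETTER EM
    "\u0420": "P",  # CYRILLIC CAPITAL LETTER ER
}


def latin_skeleton(label):
    """Replace each listed Cyrillic lookalike with its Latin twin."""
    return "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in label)


def collides(candidate, registered):
    """True if the candidate looks like a different, registered name."""
    skeleton = latin_skeleton(candidate)
    return any(
        name != candidate and latin_skeleton(name) == skeleton
        for name in registered
    )


# Cyrillic "НР" folds to the same skeleton as ASCII "HP".
print(latin_skeleton("\u041d\u0420"))
print(collides("\u0456\u0456\u0456", {"iii"}))
```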

Please compare the lowercase Cyrillic "iii.com" with the ASCII "iii.com".
In the address bar, they may look exactly the same, because the Cyrillic and ASCII
fonts are almost the same in many OS/GUI environments.
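
To make the comparison concrete, here is a small Python sketch using the built-in "idna" codec (which implements the IDNA 2003 ToASCII conversion). It assumes the Cyrillic "i" in question is U+0456. The two names may look alike on screen, but they are different strings and go on the wire differently.

```python
latin = "iii.com"
cyrillic = "\u0456\u0456\u0456.com"  # assumed: U+0456 for each "i"


def to_wire(name):
    """Encode a domain name to its ASCII (punycode) form, label by label."""
    return name.encode("idna")


print(latin == cyrillic)   # the strings differ even if they look the same
print(to_wire(latin))      # the ASCII name is passed through unchanged
print(to_wire(cyrillic))   # the Cyrillic label becomes an "xn--" label
```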


Soobok