
Re: [idn] Why IDNA breaks copy-and-paste



> I would be extremely careful with using cut&paste in order to
> transfer unique identifiers from a terminal emulator to an
> application.
> This can end in an endless can of worms, if Hebrew, Arabic or Syriac
> are involved, and we haven't even begun to understand how to handle
> that, even after scratching our heads for a year in heated discussions
> on the linux-utf8 and i18n@xfree86.org mailing lists.

Cut&paste should have three levels of handling:

1. Traditional ASCII text, consistent with the URI requirement:
  - Keep termcap as is.  To expand on that, there are already
    tons of published books about processing various scripts
    with various encodings on it, but the point may not be obvious
    to many.  Tugging Latin into DNS is another example :-)
    I don't want to dive into that mess.

> The other critical security issue related to DNS are of course
> homoglyphs. The same glyph appears in Unicode many times as separate
> characters that are part of different alphabets or different usage
> backgrounds, Latin, Cyrillic, and Greek "A" being just one example.
> 
> Markus

2. New IDN identifiers, a subset of UCS:
  - Try to come up with a new universal identifier character table
    to eliminate script and character confusions.  Thus a language
    tag has to be used, limited mixing of scripts can be
    implemented, and IDN cut&paste can be verified.

3. The rest of the UCS characters, bidi, and the Chinese TC/SC n-1
   cases remain to be handled.  This level is traditionally handled
   at the application level.  With UAX15 and UTR21 done at the UTC,
   it is possible to have a universal interface for all applications
   to implement, but this is not the job of DNS or IDN.
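
To make the homoglyph problem quoted above and the script check in
item 2 concrete, here is a minimal Python sketch.  It is only an
illustration under stated assumptions: the helper names (script_of,
scripts_in_label, is_single_script) are made up for this example, and
the script is guessed from the first word of the Unicode character
name, which is a rough stand-in for a real identifier character table
and not anything IDNA specifies.

```python
import unicodedata

def script_of(ch):
    # Hypothetical helper: approximate the script by the first word of the
    # Unicode character name ("LATIN", "CYRILLIC", "GREEK", "CJK", ...).
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"

def scripts_in_label(label):
    # Collect the approximate scripts of all alphabetic characters.
    return {script_of(ch) for ch in label if ch.isalpha()}

def is_single_script(label):
    # A label passes this (very crude) check if it does not mix scripts.
    return len(scripts_in_label(label)) <= 1

# Three characters that usually render as the same glyph "A".
latin_a, cyrillic_a, greek_a = "\u0041", "\u0410", "\u0391"
for ch in (latin_a, cyrillic_a, greek_a):
    print("U+%04X %s" % (ord(ch), unicodedata.name(ch)))

# UAX15 normalization (NFKC) folds compatibility variants such as
# FULLWIDTH LATIN CAPITAL LETTER A, but it leaves the Cyrillic and
# Greek look-alikes untouched.
folded_fullwidth = unicodedata.normalize("NFKC", "\uFF21")
folded_cyrillic = unicodedata.normalize("NFKC", cyrillic_a)

# A label that mixes Latin letters with CYRILLIC SMALL LETTER A (U+0430).
mixed_label = "p\u0430ypal"
print(is_single_script("paypal"), is_single_script(mixed_label))
```

Normalization alone does not remove the confusion, which is why some
table or script restriction of the kind described in item 2 would
still be needed before a pasted identifier could be verified.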

If the IETF decides to come up with a universal interface for
IDN, then it is possible to cut&paste at the third level across
major applications such as Web pages.  However, there is little
information for me to say how much IDNA is going to do.
Since there is no specification about script and character
confusions in IDNA, that handling must belong to the DNS level
or above.  No wonder the Chinese group has organized an
official protest.
 
Regards, 

Liana