
Re: [idn] I-D ACTION:draft-ietf-idn-cjk-00.txt



Keith Moore wrote:

> but the fact that people sometimes register domain names for pet
> dogs should not compel us to make IDNs support dog names, especially
> if doing so were to increase the complexity of the IDN system.

** Two answers. One technical:

With Unicode (unlike in the world of multiple character sets),
supporting more languages does *not* increase complexity. On the
contrary, schemes which favour major languages (English, Chinese
etc.) *do*.

For example, suppose z-variants for Chinese were implemented,
and consider sendmail checking for spoofed domain names on
its SMTP connection. It could look up the domain name from the
IP address, but then it must itself know how to compare
z-variants (there is no one preferred form of the character).
Or it could look up the IP address from the domain name, and
then the server has to know those matching rules.

I agree language tagging is a very bad idea.

The simplest workable scheme I can think of:

  - folds only A-Z and forms of them with diacritics to
  lower case. [It would be possible not even to fold the diacritic
  forms, except that the folding code couldn't know when a new
  nonspacing diacritic was introduced.]
  
  - disallows alternate forms of bidirectional markers
  
  - disallows alternate forms of Korean (all the Jamo) and Han
  (the ideographic forms)
  
  - disallows or folds Unicode compatibility characters, all of
  which are duplicates
  
  - disallows existing characters in the ASCII range which are
  special to the application layers, but not other characters
  which just look like them.
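
As a rough illustration, the rules above could be sketched in
Python along the following lines. This is only a sketch under
stated assumptions, not a proposal: the function names, the set
of reserved ASCII characters, and the exact code point ranges
chosen for bidirectional controls, Jamo and the Han-related
forms are all illustrative guesses at what the list items mean.
It takes the "disallow" option for compatibility characters
rather than folding them.

```python
import unicodedata

# Assumption: ASCII characters special to the application layers.
RESERVED_ASCII = set(".@:/")

# Assumption: explicit bidirectional formatting controls
# (LRM, RLM, and the embedding/override controls U+202A..U+202E).
BIDI_CONTROLS = {0x200E, 0x200F} | set(range(0x202A, 0x202F))

# Assumption: ranges standing in for "alternate forms" of Korean and Han.
DISALLOWED_RANGES = [
    (0x1100, 0x11FF),  # Hangul Jamo
    (0x3130, 0x318F),  # Hangul Compatibility Jamo
    (0x2FF0, 0x2FFF),  # Ideographic Description Characters
    (0xF900, 0xFAFF),  # CJK Compatibility Ideographs
]


def fold_latin(ch):
    """Lower-case ch only if it is A-Z, or A-Z plus diacritics."""
    base = unicodedata.normalize("NFD", ch)[0]
    return ch.lower() if "A" <= base <= "Z" else ch


def normalize_label(label):
    """Return the folded label, or raise ValueError if a rule is broken."""
    out = []
    for ch in label:
        cp = ord(ch)
        if ch in RESERVED_ASCII:
            raise ValueError("reserved ASCII character: %r" % ch)
        if cp in BIDI_CONTROLS:
            raise ValueError("bidirectional control: U+%04X" % cp)
        if any(lo <= cp <= hi for lo, hi in DISALLOWED_RANGES):
            raise ValueError("disallowed Jamo/Han form: U+%04X" % cp)
        # A decomposition starting with "<tag>" marks a compatibility character.
        if unicodedata.decomposition(ch).startswith("<"):
            raise ValueError("compatibility character: U+%04X" % cp)
        out.append(fold_latin(ch))
    return "".join(out)
```

Under these assumptions, normalize_label("ÉCOLE") folds to
lower case, a Greek or Han label passes through unchanged, and
a fullwidth "A" (U+FF21) is rejected as a compatibility
duplicate.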
  
It's simple to case-fold all scripts (Turkish fails, oh well) if
you use the table the Unicode Consortium supply. IOW, such a
scheme is close to completely unrestricted Unicode.  How would
reducing the number of supported languages or characters
make IDNS simpler?
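
To make the case-folding point concrete, here is a small Python
sketch. It assumes Python's built-in str.casefold(), which
applies the Unicode full case folding, as a stand-in for "the
table the Unicode Consortium supply"; the helper name same_name
is made up for illustration.

```python
def same_name(a, b):
    """Compare two names after Unicode full case folding."""
    return a.casefold() == b.casefold()


# Folding works across scripts without any per-language logic.
greek_ok = same_name("ΣΟΦΙΑ", "σοφια")
cyrillic_ok = same_name("ДОМ", "дом")

# Turkish fails: the language-neutral table folds "I" to "i",
# while Turkish pairs "I" with dotless "ı" (U+0131).
turkish_ok = same_name("IRMAK", "ırmak")
```

Here greek_ok and cyrillic_ok come out True and turkish_ok
comes out False, which is the "Turkish fails" case: getting it
right would need language knowledge that a single
language-neutral table cannot carry.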

** And the other answer is legal. What is driving the rush to
IDNS is laws in some jurisdictions (including the US) which
require a trademark owner to attempt to register the mark as a
domain name or lose it. Too restrictive a design risks being
ignored in favour of something proprietary.