
Re: An idn protocol for consideration in making the requirements




>The new draft has the ungainly name of CIDNUC (Compatible Internationalized 
>Domain Names Using Compression); I chose this so that no one could accuse 
>me of trying to market it. :-) You can find the draft at 
><http://www.ietf.org/internet-drafts/draft-hoffman-idn-cidnuc-00.txt>. I 
>hope this helps focus the requirements discussion.

I also hope it helps the discussion, but as a solution to
international domain names I find it totally unacceptable.

I see it as very important that we do NOT allow the solution to
be encoded in ASCII! It is time everybody learned to deal with the
problems of having more than the ASCII subset.

If you look at RFC 1035 there are a few things that need to be
changed. Some of them are:
- Host names/labels/domain names
  This must be changed from a-z, 0-9 and - to
  all letters (and letters does not mean only the letters in ASCII),
  digits, and a lot more.
  The labels must not follow ancient rules for ARPANET host names;
  we need new rules acceptable to an international community.
  The restriction to max 63 characters could be removed (though
  remember that in an international world 63 characters may take
  more than 63 bytes to store). A minimum of 63 international
  characters must be allowed (but the current protocol restricts a
  whole name to 255 bytes).
  
- Domain names are defined to contain 8-bit bytes,
  but RFC 1035 recommends that they follow host name syntax.
  It also only requires matching to be case-insensitive assuming ASCII.
  We just need to expand this to the entire UCS.
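
As an illustration only, here is a small Python sketch of the two
points above: a label's character count versus its UTF-8 byte count,
and case-insensitive matching beyond ASCII. The function names, the
sample labels, and the choice of NFC normalization plus Unicode case
folding as the matching rule are assumptions made for this example;
only the 63-byte label limit comes from RFC 1035.

```python
import unicodedata

MAX_LABEL_OCTETS = 63  # per-label limit in RFC 1035


def label_sizes(label):
    """Return (character count, UTF-8 octet count) for one label."""
    return len(label), len(label.encode("utf-8"))


def fits_in_label(label):
    """True if the UTF-8 form fits in the existing 63-octet label limit."""
    return len(label.encode("utf-8")) <= MAX_LABEL_OCTETS


def labels_match(a, b):
    """Case-insensitive match via NFC + Unicode case folding (assumed rule)."""
    def fold(s):
        return unicodedata.normalize("NFC", s).casefold()
    return fold(a) == fold(b)


print(label_sizes("blåbærsyltetøy"))      # (14, 17): 14 characters, 17 octets
print(fits_in_label("å" * 63))            # False: 63 characters need 126 octets
print(labels_match("BLÅBÆR", "blåbær"))   # True under Unicode case folding
```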
  
In general I think the basic protocol format of the current DNS has
no problem carrying UTF-8 encoded text, which lets ASCII-only names
work without change. Most servers should already allow 8-bit bytes,
but they need to be upgraded to handle case-insensitive matching
of non-ASCII characters.
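
A rough Python sketch of why this holds, and of where the server
upgrade is needed: an ASCII-only name has the same bytes in ASCII
and in UTF-8, while a matcher that folds only A-Z fails on non-ASCII
letters. The helper name ascii_only_fold and the sample names are
hypothetical, and Unicode case folding stands in for whatever
matching rule would actually be chosen.

```python
def ascii_only_fold(name):
    """Fold the bytes A-Z to a-z and leave every other octet untouched."""
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in name)


# An ASCII-only name is byte-for-byte identical in ASCII and UTF-8.
ascii_name = "www.example.com"
same_bytes = ascii_name.encode("utf-8") == ascii_name.encode("ascii")

# The same non-ASCII name in upper and lower case, as UTF-8 octets.
upper = "BLÅBÆR.example".encode("utf-8")
lower = "blåbær.example".encode("utf-8")

# ASCII-only folding leaves the octets of Å and Æ unchanged, so no match.
ascii_match = ascii_only_fold(upper) == ascii_only_fold(lower)

# Folding over the decoded characters treats both spellings as equal.
unicode_match = (upper.decode("utf-8").casefold()
                 == lower.decode("utf-8").casefold())

print(same_bytes, ascii_match, unicode_match)  # True False True
```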

Another area is all the programs that today handle domain names; many
of them need to be fixed, as they currently do not allow non-ASCII or
do not handle it well.
Yes, there will be work for many people fixing old software, and
some of it may break when given a non-ASCII domain name. But that is
much better than trying to hide internationalisation inside
ASCII just so that legacy software still works.

If we are to solve this by hiding everything in ASCII, then we
should hide everything, including ASCII-only names. That way everybody
still needs to fix their software. If we do not do this, I expect
that in the year 2005 I will still not be able to enter or view URLs,
host names and domain names using non-ASCII characters, because
software developers like Netscape or Microsoft will still think it is
OK to enter and display letters as %XX in URLs.

ASCII compatibility and backward compatibility are good, as long as
they do not make things bad for everybody for whom ASCII is not enough.
It is better to break software and get it fixed than to
keep trying to make everything work in a world as it was
at the dawn of the computer age.

   Dan