
[idn] IDNA and other things



As I currently have very little free time to discuss IDN and will soon
be away for several weeks, I cannot be very active on the list.
But I do not feel we are really moving forward. Here I will give some
of my views, hoping they will inspire some new discussion.

IDNA is only about internationalised host names. I do not think this
is enough. All of DNS must be internationalised. All character data in
DNS must allow non-ASCII, and to be really useful it must all use the same
encoding and normalisation. When I start using non-ASCII in DNS I must be
able to use all the letters I need for my language everywhere.

IDNA says that updating DNS servers for IDN will be difficult. Maybe, but
experience shows that applications are the last thing to be updated!
Just look at MIME in SMTP. The servers handle it, but not many applications
do. mailx on my Unix system still does not understand it. I expect resolver
libraries (dynamically loaded into applications) and DNS servers to be updated
long before my applications. And as applications will take so long to update,
the DNS servers need to accept names sent in encodings like ISO 8859-1,
because when I enter a URL in my web browser I do it in my local character
set (ISO 8859-1) and the web browser will just pass those bytes on when doing
a DNS lookup.

When an application displays a domain name for me, I want it to be displayed
using my local character set. If my local character set cannot display
all characters, I want it to display all characters that can be displayed
directly and the others in some "escape" fashion. I do not want the name to
be turned into something totally unreadable like RACE.
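
A minimal sketch of such a display routine in Python. The function name
display_name and the backslash escape style are assumptions made only for
illustration; they are not taken from IDNA or from any draft.

```python
def display_name(name, charset="iso-8859-1"):
    """Render a domain name in the local charset, escaping the rest."""
    # Characters the charset cannot represent become backslash escapes
    # (for example \u0142); everything else is shown unchanged.
    raw = name.encode(charset, errors="backslashreplace")
    return raw.decode(charset)


# All characters exist in ISO 8859-1, so the name is shown as it is.
print(display_name("r\u00e4ksm\u00f6rg\u00e5s.example"))
# U+0142 does not exist in ISO 8859-1, so only that character is escaped.
print(display_name("r\u00e4k\u0142.example"))
```

The name stays readable for the user, and only the characters that cannot
be shown are escaped, unlike an ACE where the whole label is changed.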

IDNA states that the RACE encoded host names returned by gethostbyaddr
are not a big problem because so few applications show them to users.
But the host names returned by gethostbyaddr are used in log files,
when checking access rights and in other places. Many of them I as a user
will see and have to handle.

I guess IDNA requires a single encoding for each host name so that names
can be matched in the DNS server by plain comparison, as they have removed
all matching logic from DNS. I think this is the wrong way to go; encodings
that represent the same name should be allowed to match. That removes a lot
of possibilities for difficult-to-find errors. Otherwise it is like saying
www.xab.com does not match www.XAB.com. One small bug in the RACE/nameprep
code in one application will make matching fail for some names, resulting
in difficult-to-understand failures.
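
A minimal sketch in Python of matching done at comparison time rather than
by requiring one exact encoding. Unicode NFKC normalisation plus case
folding is used here only as a stand-in for real nameprep, and the function
names match_key and same_name are hypothetical.

```python
import unicodedata


def match_key(name):
    """Reduce a name to a form in which equivalent spellings compare equal."""
    # Stand-in for nameprep: normalise, fold case, then normalise again
    # because case folding can produce a non-normalised string.
    folded = unicodedata.normalize("NFKC", name).casefold()
    return unicodedata.normalize("NFKC", folded)


def same_name(a, b):
    """True if the two spellings represent the same name."""
    return match_key(a) == match_key(b)


# Different case, same name.
print(same_name("www.XAB.com", "www.xab.com"))
# Precomposed a-umlaut versus "a" followed by a combining diaeresis.
print(same_name("r\u00e4k.example", "ra\u0308k.example"))
```

With matching done where the comparison happens, a client that sends a
slightly different but equivalent form still finds the name.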

I think preserving form (case) is needed, at least in PTR records.
The PTR records are used to translate from IP address to host name.
This host name is then used for many things, for example to define who
may mount a file system or access an e-mail server. And unfortunately
Unix still does case-sensitive matching in those cases. If the case
is returned wrong, the name will not match the permission files.
And not all host names are just lower case!
If we require names to always be returned in lower case, we will break
a lot of old programs.
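
A small Python illustration of that failure. The host name and the
function may_mount are invented for the example; the set stands in for a
permission file that is compared case-sensitively.

```python
def may_mount(ptr_name, exports):
    """Case-sensitive check, as done by the old programs discussed here."""
    return ptr_name in exports


# Hypothetical entry in a permission file, written with mixed case.
exports = {"Alpha.Example.COM"}

# PTR lookup that preserves case: the name matches the file.
print(may_mount("Alpha.Example.COM", exports))
# PTR lookup that returns the name forced to lower case: no match.
print(may_mount("Alpha.Example.COM".lower(), exports))
```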

Zone files: Using internationalised DNS I expect to be able to store and
edit my zone files using my local character set. It is the DNS server,
not me, that should convert the names from my local character set into the
standard set used internally by DNS. DNS must be user friendly; we cannot
expect a user to enter UTF-8 or ACE encoded names.
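
A minimal sketch in Python of the conversion a server could do when it
loads a zone file. It assumes, only for illustration, that the internal
form is NFC-normalised UTF-8; the function name to_internal and the sample
record are hypothetical.

```python
import unicodedata


def to_internal(line, local_charset="iso-8859-1"):
    """Convert one zone file line from the local charset to the internal form."""
    text = line.decode(local_charset)
    # Assumed internal form: NFC-normalised text encoded as UTF-8.
    return unicodedata.normalize("NFC", text).encode("utf-8")


# A record as typed in an ISO 8859-1 editor (0xE4 is a-umlaut).
local_line = b"r\xe4k IN A 192.0.2.1"
print(to_internal(local_line))
```

The administrator only ever sees and edits the ISO 8859-1 file; the
converted bytes stay inside the server.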

In applications: when I use library calls like gethostbyaddr and
gethostbyname, these calls must, to be user friendly and like most other
internationalised calls, use the locale and take and return names in the
local character set. We cannot expect a user to accept UTF-8 or an ACE.

When we start using non-ASCII names we can expect some people to want to
have both a "real" non-ASCII name and an alias in ASCII. Both names
refer to the same entity, but the ASCII name is not an ACE of the
non-ASCII name. DNS should therefore support a way to get both the real
name and the ASCII alias. You can have a look at my latest UDNS draft,
which includes a way to do this.

That is all for now; we will see when I next have time to write something.

    Dan