
Re: [idn] IDN WG Last Call: draft-ietf-idn-idna-06.txt



Dan Oscarsson <Dan.Oscarsson@trab.se> wrote:

> There are "internationalised domain name slots", probably meaning a
> domain name slot that uses the native character set of the context the
> slot is in,

The meaning is stated in the draft.  An internationalized slot is one
explicitly designated for carrying an IDN, regardless of how it is
represented.

> though IDNA is having all non-ASCII domain names being in ACE.

All non-ASCII names in generic slots are in ACE.  Non-ASCII names in
internationalized slots need not be in ACE.

> And there is a "generic domain name slot", which should really be
> called "ASCII only domain name slot" because it is not generic.

It is generic in the sense that it is for "domain names", not
specifically for "internationalized domain names", and therefore you
don't know what might happen if you put non-ASCII code points into it.
That's why IDNA allows only ASCII code points in generic domain name
slots.

> Reading more in IDNA you find that if a domain name is to be put in an
> "ASCII domain name slot" it has to be converted into ASCII.  But you
> will also find a section about entry and display in applications.  In
> it there is talk about different places where domain names can appear,
> including that non-ASCII characters should be used if allowed.

Yes.  There is no contradiction.  "Slots" are things communicated
between machines/software, like fields of protocol messages, function
arguments, etc.  Text widgets in user interfaces don't really count
as "slots".  But even if you wanted to view them as slots, an
internationalized application would of course consider them to be
internationalized slots, so the requirement to use only ASCII would
still not apply there.

> Where does the user enter URLs?  In the location field to open new
> pages, and in the HTML code in links.

The location field is not a problem.  An internationalized application
will use whatever representation it likes in the location field, and
convert to/from ACE as needed.
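
As a rough illustration of that conversion (not taken from the draft),
here is a minimal Python sketch.  It assumes Python's built-in "idna"
codec, which implements the later published RFC 3490, so the exact ACE
strings it produces follow that RFC rather than the -06 draft under
discussion.  The helper names to_ace and from_ace are made up for this
example.

```python
def to_ace(host: str) -> str:
    """Convert a host name to ACE, e.g. before putting it in a generic slot."""
    return host.encode("idna").decode("ascii")


def from_ace(host: str) -> str:
    """Convert an ACE host name back to Unicode, e.g. for display."""
    return host.encode("ascii").decode("idna")


# What the user sees in the location field vs. what goes on the wire.
typed = "bücher.example"
wire = to_ace(typed)
print(wire)            # xn--bcher-kva.example
print(from_ace(wire))  # bücher.example
```

A plain ASCII name passes through to_ace unchanged, so the application
can apply it to every host name without special-casing.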

> Links in HTML code will contain non-ASCII! As the context of an HTML
> document has a defined character set, URLs embedded in it will be
> written using the same character set.

The HTML spec requires the href attribute to contain a URI as specified
by RFC 2396, which requires the host part to contain only LDH
characters.  Under the current rules, ACE is the only way of using IDNs
in an href attribute.
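
To make the LDH restriction concrete, here is a hedged Python sketch of
what an authoring tool might do before writing an href.  The names
is_ldh_host and href_with_ace_host are hypothetical, the ACE step again
assumes Python's "idna" codec (RFC 3490, not necessarily identical to
this draft), and for brevity any userinfo in the URL is dropped.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# One LDH label: letters, digits, hyphens; no leading or trailing hyphen.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_ldh_host(host: str) -> bool:
    """True if every dot-separated label uses only LDH characters."""
    return all(LDH_LABEL.fullmatch(label) for label in host.split("."))


def href_with_ace_host(url: str) -> str:
    """Return url with a non-LDH host replaced by its ACE form.

    Simplification: userinfo ("user@") in the authority is dropped.
    """
    parts = urlsplit(url)
    host = parts.hostname
    if not host or is_ldh_host(host):
        return url
    ace = host.encode("idna").decode("ascii")
    netloc = ace if parts.port is None else f"{ace}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


print(href_with_ace_host("http://bücher.example/index.html"))
```

The result contains only LDH characters in the host part, so it can go
into an href attribute under the current rules.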

It's possible that a new version of HTML will allow non-ASCII characters
in the host part of the URI in an href attribute.  If that happens, then
you will have an easier way of typing your HTML source, and the href
attribute will no longer be a generic domain name slot, but will in fact
be an internationalized domain name slot, and thus the new HTML will not
be in any conflict with IDNA.

> the person writing the HTML page may not know the ACE form and will
> write:
> <p><a href="http://中文.com"><b><font
> color="#FFFFFF">中文.com</font></b></a></p>
> 
> and that will not work in browsers that are not IDNA aware

It violates the HTML spec, so there's no reason to expect it to work in
any browsers, regardless of whether they are IDNA aware.

>    sendmail will have to be fixed so it can convert host names to ACE
>    before sending.

It was never our vision that sendmail should be altered in any way.  The
user program (Mutt, Pine, etc.) is what we expect to be altered.
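
A sketch of what that could look like inside a mail client, assuming
Python and its "idna" codec (RFC 3490, which postdates this draft); the
function name address_for_transport is invented for the example.  Only
the domain part is converted, and the local part is left untouched.

```python
def address_for_transport(address: str) -> str:
    """Convert the domain part of a mail address to ACE.

    The local part (left of the last "@") is passed through as-is.
    """
    local, sep, domain = address.rpartition("@")
    if not sep:
        # No "@" present: nothing that looks like a domain to convert.
        return address
    return f"{local}@{domain.encode('idna').decode('ascii')}"


# The client shows the first form and hands the second to the transport.
print(address_for_transport("user@bücher.example"))
```

Because the address already contains only ASCII by the time it leaves
the client, sendmail sees nothing unusual.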

Upgrading just a few applications (a web browser, a mail client, and
maybe an instant messaging client) will go a long way for most users.

You don't even have to touch the resolver (though by messing with it you
might be able to trick some old applications into partially working with
IDNs).

AMC