
Re: [idn] IDNs in IE and Google



Michel Suignard wrote:
Concerning IRI, it is not a matter of 'preference'. If you present
something like a URI containing a host name presented in non ASCII
repertoire, you are in fact using an illegal URI per RFC2396 definition.

While this is true, it only imposes constraints on HTML authors, not on the Web browser. In particular, section B.2.1 of HTML 4.01 explains that Web browsers should expect authors to violate the constraint, and suggests how to deal with the violation. That suggestion clearly predates IDNA, so handling the host name part with IDNA instead is more sensible.
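
For reference, the B.2.1 suggestion amounts to converting each non-ASCII character to UTF-8 and escaping the resulting bytes as %HH. A minimal sketch in Python, assuming only the standard library (the function name escape_non_ascii and the example URL are illustrative, not taken from any specification):

```python
from urllib.parse import quote


def escape_non_ascii(uri_ref: str) -> str:
    """Escape non-ASCII characters as UTF-8 bytes in %HH form.

    ASCII characters are passed through untouched, so an already
    valid URI is returned unchanged.
    """
    return "".join(
        # quote() encodes to UTF-8 by default before percent-escaping.
        ch if ord(ch) < 128 else quote(ch, safe="")
        for ch in uri_ref
    )


print(escape_non_ascii("http://example.org/caf\u00e9"))
# http://example.org/caf%C3%A9
```

Applied to a host name, this yields %HH escapes rather than an ACE label, which is why it is a poor fit for the host part.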

It is (unfortunately) quite common that HTML authors do not follow
the relevant HTML recommendations. Some authors even believe that
unlimited extensibility is one of the guiding principles of HTML.

In fact, MSIE 6 does process URLs with non-ASCII characters, though
not in the suggested way (instead, it apparently sends UTF-8
directly on the wire). Changing it to perform IDNA on the
host part (and leaving everything else as-is) would not make it less
standards-conforming, and would give users added value.
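
To illustrate what "IDNA on the host part, everything else as-is" could look like, here is a rough sketch in Python. It relies on the built-in "idna" codec, which implements the RFC 3490 ToASCII operation; the function name host_to_ace and the example URL are made up for illustration, and IPv6 literals are not handled. It says nothing about how MSIE actually works.

```python
from urllib.parse import urlsplit, urlunsplit


def host_to_ace(url: str) -> str:
    """Convert only the host name of a URL to its ASCII (ACE) form.

    Path, query, fragment, userinfo and port are left exactly as
    they were. IPv6 literals in brackets are not handled here.
    """
    parts = urlsplit(url)
    # Split "user:pass@host:port" into its pieces without altering them.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    host, colon, port = hostport.partition(":")
    # The built-in codec applies nameprep and Punycode per label.
    ascii_host = host.encode("idna").decode("ascii")
    netloc = userinfo + at + ascii_host + colon + port
    return urlunsplit(parts._replace(netloc=netloc))


print(host_to_ace("http://b\u00fccher.example/p\u00e4th?q=\u00fc"))
# http://xn--bcher-kva.example/päth?q=ü
```

A URL whose host is already ASCII passes through unchanged, so such a step would not affect existing pages.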

Regards,
Martin