
Re: [idn] URL encoding in html page




----- Original Message ----- 
From: "Adam M. Costello" <idn.amc+0@nicemice.net.RemoveThisWord>
To: "IETF idn working group" <idn@ops.ietf.org>
Sent: Saturday, March 23, 2002 8:52 AM
Subject: Re: [idn] URL encoding in html page


> Soobok Lee <lsb@postel.co.kr> wrote:
> 
> > If a simple HTML page contains the following tag,
> >   <a href=http://www.<ML>.com>Hello World!</a>
> > in which <ML> may be in a native legacy encoding or UTF-8 encoding, it
> > is easy to imagine that some visitors who click that link may be led to
> > wrong sites or nowhere.
> 
> Very easy to imagine indeed, because the HTML spec says that the href
> attribute must contain a URI, and the URI spec says that the host must
> contain only ASCII letters, digits, hyphens, and dots (or it may be a
> bracket-enclosed IPv6 address literal).
> 

Right!
Under the current URI and HTTP specs, native-encoded IDN hostnames
are simply illegal input, and they should be passed to some exception
handling mechanism, for example a search engine such as search.netscape.com
or auto.search.msn.com.
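
To make the "illegal input" case concrete, here is a minimal sketch of the host check described in the quoted text: every dot-separated label may contain only ASCII letters, digits, and hyphens. The function name is_legal_uri_host is hypothetical, and bracket-enclosed IPv6 literals are deliberately left out.

```python
import re

# One hostname label: ASCII letters, digits, and hyphens only.
# Assumption: bracketed IPv6 literals are not handled in this sketch.
_LABEL = re.compile(r"[A-Za-z0-9-]+")

def is_legal_uri_host(host: str) -> bool:
    """Return True if every dot-separated label is ASCII letters/digits/hyphens."""
    labels = host.split(".")
    return all(_LABEL.fullmatch(label) is not None for label in labels)
```

A client could use a check like this to decide whether a host goes to the normal resolver or to the exception path mentioned above.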

> > Should IDNA recommend that all HTML authors use such ACEed URLs for
> > backward compatibility and error-free fast deployment?
> 
> Not necessary, since the HTML and URI specs already limit the host to
> ASCII letters, digits, hyphens, and dots.

We experts already know this. But many ML.com registrants do not know about this
unfortunate fate of ML.com names. They want to use native ML.com names in their HTML home pages.
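
For authors who still want to publish such links, one possible approach is to rewrite the host of each href into its ACE form before the page goes out. The sketch below assumes Python's built-in "idna" codec, which implements the later RFC 3490 ToASCII operation with the "xn--" prefix; the ACE format was not yet final when this thread was written, so treat the output format as an assumption. The names to_ace_host and ace_href are hypothetical, and any userinfo in the URL is dropped.

```python
from urllib.parse import urlsplit, urlunsplit

def to_ace_host(host: str) -> str:
    """Convert a possibly non-ASCII hostname to its ASCII-compatible form."""
    # The "idna" codec applies ToASCII label by label; ASCII labels pass through.
    return host.encode("idna").decode("ascii")

def ace_href(url: str) -> str:
    """Return url with its host replaced by the ACE form (userinfo is dropped)."""
    parts = urlsplit(url)
    if parts.hostname is None:
        return url  # relative URL or no host: nothing to convert
    netloc = to_ace_host(parts.hostname)
    if parts.port is not None:
        netloc += ":" + str(parts.port)
    return urlunsplit(parts._replace(netloc=netloc))
```

An HTML author or publishing tool could run each href through ace_href so that the delivered page contains only hosts that are legal under the current URI spec.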

If we want interoperable URIs that support native IDNs, we must revise
BOTH the URI spec and the HTTP spec. But native IDN support brings potential
legacy-code versioning and code interoperability problems.
Could anyone provide an in-depth analysis of this caveat?

Soobok Lee