Re: [idn] universal typability



What you describe is not a problem with the ASCII compatibility encoding scheme.

For the record, the %HH encoding scheme was designed for the <path> part of a
URL, not the <hostname>. Therefore, under a strict implementation, %HH in a
hostname should not work in any browser.
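
To illustrate the distinction, here is a minimal Python sketch that applies
%HH encoding to the <path> only and leaves the <hostname> untouched. The
function name encode_path_only is hypothetical, and the use of Python's
urllib.parse is an assumption made for illustration; it is not how any
particular browser behaves.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_path_only(url: str) -> str:
    """Percent-encode the path of a URL as UTF-8, leaving the hostname as-is.

    Hypothetical helper for illustration: %HH escaping is applied to the
    <path> component only, never to the <hostname>.
    """
    parts = urlsplit(url)
    # quote() encodes non-ASCII characters as UTF-8 bytes, then as %HH.
    # "/" is kept literal so the path structure is preserved.
    encoded_path = quote(parts.path, safe="/")
    # The hostname (netloc) is passed through unchanged; how to encode it
    # is exactly what an IDN standard would have to define.
    return urlunsplit(
        (parts.scheme, parts.netloc, encoded_path, parts.query, parts.fragment)
    )


print(encode_path_only("http://www.gås.net/gås.html"))
# prints: http://www.gås.net/g%C3%A5s.html
```

The hostname is deliberately left alone here: whatever encoding ends up being
used for it has to be applied separately and consistently.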

The real problem is the lack of standardisation for IDN: people went their own
ways to do IDN, which caused inconsistency. This is why this WG is so
important.

-James Seng

ps: the reason MSIE works with UTF-8 domain names is that MSIE tries to be
smart and does a conversion to UTF-8 for domain names. It seems cool to be able
to do multilingual domain names, but this is not proof that UTF-8 in URLs is
okay, because it is not. There are a lot of bugs and problems if you look into
this more carefully. For example, try typing the sample domain name provided
into a Chinese MSIE browser. Or try a Chinese URL (take one from www.idns.org)
in an English MSIE. There are also a few cases of buggy downcasing by MSIE
which cause 'wrong' UTF-8 to be sent out.

Spend a little more time understanding the consequences of MSIE's UTF-8 domain
names and the picture is not so rosy anymore.

Dan Oscarsson wrote:
> This shows one of the problems Kent is talking about.
> A URL like: http://www.gås.net/gås.html
> could look like: http://www.8wahdfhud.net/g%c3%a5s.html
> if we use an opaque scheme for domain names.
> The %-encoding should not be used in the domain name part.
> Think about the mess otherwise:
>   Should I do ftp www.8wahdfhud.net or www.g%c3%a5s.net?
>   Why does ftp www.8wahdfhud.net work but not ftp www.g%c3%a5s.net?
>   But in my browser ftp://www.g%c3%a5s.net/ does work.
> 
> If we use ASCII compatibility encodings, the same object must
> be encoded using the same scheme everywhere, if possible.
> What a mess otherwise.
> 
>    Dan