Re: [idn] URL encoding in html page



> ACE and UTF-8 are just compression algorithms that squeeze larger
> Unicode characters. ACE is more efficient than UTF, although more
> complex.

UTF-8 does not do any compression. It is a variable-length encoding that
spends 1 to 4 bytes on each code point.
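
A quick Python sketch of that point, assuming the standard-library "utf-8"
and "punycode" codecs. The sample strings and the helper name utf8_len are
arbitrary choices for illustration, not anything from this thread.

```python
# Illustration only: sample strings and utf8_len are hypothetical choices.
def utf8_len(text: str) -> int:
    """Number of bytes UTF-8 needs to encode text."""
    return len(text.encode("utf-8"))

# Each code point always maps to the same 1-4 byte sequence,
# whatever surrounds it. Nothing is squeezed.
for ch in ("a", "\u00fc", "\u20ac", "\U0001f600"):
    print(f"U+{ord(ch):04X} -> {utf8_len(ch)} byte(s)")

word = "b\u00fccher"
print(len(word), "code points,", utf8_len(word), "UTF-8 bytes")

# Punycode (the body of an ACE label) is a different kind of thing: it
# rewrites the whole label into ASCII letters, digits and hyphen.
print(word.encode("punycode"))
```

The non-ASCII label comes out longer in UTF-8 than its code point count, so
"compression" is the wrong word for it.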

> But the claim to fame that UTF-8 has is that it is a standard that
> idn can reference, and it is coming into widespread use elsewhere.

Yes, I agree it is "coming into widespread use". Until it is widespread, it
is only "Coming Soon", or "Coming Soon to your favourite OS RSN(tm)".

> So moving to UTF-8 long term seems like such an obvious choice
> it surprises me that it even gets debated.

Any technical decision has to be based on well-founded technical reasons.

"Transcoding can be bypassed if everything is in UTF-8" is a good argument,
and one I can accept. Using UTF-8 just because everyone else is *going to*
do so does not sound too convincing. And "UTF-8 is obvious" is not an
argument I can buy.
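
To make the transcoding argument concrete, here is a small Python sketch.
It assumes a label that arrives in a legacy charset (ISO-8859-1 and cp437
are picked arbitrarily) and must be turned into Unicode code points before
anything else can be done with it. The function name label_to_code_points
is hypothetical.

```python
# Illustration only: label_to_code_points and the charsets are assumptions.
def label_to_code_points(raw: bytes, charset: str) -> str:
    """Transcode a label from its local charset to Unicode code points."""
    return raw.decode(charset)

legacy = b"b\xfccher"      # "bücher" as ISO-8859-1 bytes
utf8 = b"b\xc3\xbccher"    # the same label as UTF-8 bytes

# Legacy input: the receiver has to know (or guess) the charset.
# The right guess recovers the label; a wrong one silently yields another.
right = label_to_code_points(legacy, "iso-8859-1")
wrong = label_to_code_points(legacy, "cp437")
print(right == "b\u00fccher", wrong == "b\u00fccher")

# UTF-8 everywhere: one fixed decoding, no charset bookkeeping.
print(label_to_code_points(utf8, "utf-8") == "b\u00fccher")
```

The sketch only shows where the charset bookkeeping sits; it says nothing
about how soon every system will actually send UTF-8.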

> Of course, Punycode is a nice compression method... Maybe we should
> take it to the Unicode forum and suggest that it be the next version of
> UTF-8?
>
> But if that doesn't happen, it should logically be phased out.

Let's bring base64 and qp over to Unicode, along with %-escaping and the
other adaptive mechanisms out there, then phase them all out! Let's phase
out all other local encodings too. The world should have one and only one
encoding! Long live UTF-8, for all fonts, for input, for all protocols, all
systems, all software! This is the utopian dream of all I18N engineers
(myself included).

But enough dreaming. Let's get back to ugly reality. *sigh*

-James Seng