
Re: [idn] universal typability



> Karlsson Kent - keka wrote:
> 1) Have you set the "always send URLs in UTF-8" "advanced Internet" option?
> 2) Are the names stored in the DNS system stored in the UTF-8 encoding?
> 3) Is the browser using the correct 'charset' when reading the pages?

While I sound dumb sometimes, especially with terminology, I believe you can
give me the benefit of the doubt on this. :)

Let me explain what I have done. I subscribe to MSDN, so I have every product
Microsoft has ever produced, including Windows 98/95/NT/2000 in all the
different languages. I installed almost every 98 (and some 95) and did my
testing. I tested clicking on CJK, French, Finnish, Arabic, etc. URLs with the
correct charset set and with the wrong charset set, then repeated the tests
keying in the URL in UTF-8 and in the localized encoding with MSIME, then with
an external IME, then with cut-and-paste. Then I repeated the tests on MSIE
4.x/5.x, then with the UTF-8 option on and off (5.x only, btw), and then with
MSIE with and without International support.

In the end, the conclusion is very disappointing. UTF-8 is not consistent
across the different Windows platforms. Some send out 'almost' correct UTF-8
(one byte wrong); some are plainly wrong, so wrong that I have no clue how
they encoded it in the first place; some downcase the accented characters of
the localized character set; some convert to UTF-8 correctly but then remove
bit 7 (downcase) on every UTF-8 byte (doh); and so on. It works sometimes, of
course, but only for a very limited subset of my whole test series.
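
As an illustration of the bit 7 failure mentioned above, here is a minimal
Python sketch. It is not a reconstruction of what any MSIE version actually
does; the helper names strip_bit7 and is_valid_utf8 and the sample character
are assumptions chosen for the example. It shows that clearing the high bit of
every UTF-8 byte turns a non-ASCII character into unrelated ASCII, while a
single wrong byte usually produces a sequence that is not valid UTF-8 at all.

```python
def strip_bit7(data: bytes) -> bytes:
    """Clear the high bit (bit 7) of every byte (hypothetical helper)."""
    return bytes(b & 0x7F for b in data)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Sample input (an assumption for illustration): U+00E9, e with acute accent.
original = "\u00e9".encode("utf-8")   # two bytes: 0xC3 0xA9
damaged = strip_bit7(original)        # 0x43 0x29, i.e. the ASCII text "C)"

# The damaged bytes are plain ASCII, so they still pass a UTF-8 validity
# check; the receiver cannot tell that anything went wrong.
print(original, damaged, is_valid_utf8(damaged))

# A single corrupted byte, by contrast, is usually detectable: 0x29 is not
# a valid continuation byte after the lead byte 0xC3.
print(is_valid_utf8(b"\xc3\x29"))
```

The point of the sketch is that the bit 7 case is the nastier one: the result
is still well-formed, so a server can only notice the name does not resolve,
not that the encoding was mangled.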

Now, I haven't yet got to trying it with other Windows applications beyond
MSIE. Some tests with Outlook show that some Outlook versions bomb out when
they see I18N email. Other applications are kinder and reject it. I have not
done any extensive testing on this yet, so I shall not go into it.

Frankly speaking, I wish I were wrong. I wish MS would properly do what they
claim to do, rather than leave all these inconsistencies.

-James Seng