Re: Inputting mixed SC/TC (Re: [idn] A question...)
> http://www.iis.sinica.edu.tw/~wuch/idn/examples/mixinput.htm gives a
> very visually powerful demonstration of the problem the working group
> has debated over these many months, and I have no doubt that is useful
> for many members of the working group to see some of the character
> mixtures rather than hear them described. Thank you for your efforts.
Lest those on this list unfamiliar with Chinese be overly impressed
with the "powerful" demonstration posted on that site, the
islam.gif (2nd example) is *not* a valid representation of the
distinctions in characters in question.
The first character shown in all 8 lines is the glyph for U+6DF8
and *not* the glyph for U+6E05.
The second character shown in all 8 lines is the glyph for U+771E
and *not* the glyph for U+771F.
The third character shown in all 8 lines is the glyph for U+654E
and *not* the glyph for U+6559.
So if told to type *exactly* what is shown in each line, the right
answer in all 8 cases would be the same Unicode string:
U+6DF8 U+771E U+654E.
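
For anyone who wants to check this at the code point level rather than
by eye, here is a minimal Python sketch. The variable and function names
are my own, chosen for illustration. It builds the string actually shown
in the image and the string made from the other member of each pair, then
compares them. The NFKC comparison is there only to show that standard
Unicode normalization does not fold these pairs together; it is not a
statement about what IDNA does or should do with them.

```python
import unicodedata

# The three pairs discussed above:
# (code point of the glyph actually shown, code point of the other member of the pair)
pairs = [(0x6DF8, 0x6E05), (0x771E, 0x771F), (0x654E, 0x6559)]

shown = "".join(chr(a) for a, _ in pairs)
other = "".join(chr(b) for _, b in pairs)

def code_points(s):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

print(code_points(shown))   # U+6DF8 U+771E U+654E
print(code_points(other))   # U+6E05 U+771F U+6559

# The two strings are different sequences of code points.
same_raw = shown == other

# Compare again after NFKC normalization of both strings.
same_nfkc = (unicodedata.normalize("NFKC", shown)
             == unicodedata.normalize("NFKC", other))

print(same_raw, same_nfkc)
```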
There *are* minute glyph distinctions here in the example, so it is
clear that the font used to display these has a different glyph
for each code position, but the font designer has *chosen* to
make each pair look almost identical, rather than to reflect the
glyph distinctions which underlie the separate character encoding.
Presumably this is to meet some Taiwan-specific market
requirements for font design. But the net effect is to artificially
exaggerate the problem being complained about by those objecting
to the IDNA handling of Chinese characters.
I consider it to verge on misleading to then advertise these examples
as making the problem clear to those on this list who may be less
than conversant with the Chinese variant problems.
Those who wish to get an accurate depiction of the differences
between the 3 pairs of characters in question should consult the
standards themselves: ISO/IEC 10646-1:2000 or the online charts
for the Unicode Standard:
(* O.k., I'll sit back down now. *)