Re: [idn] Unicode tagging



RJ Atkinson wrote:
> I'd bet a Dim Sum lunch that there are other languages with similar
> issues in Unicode/ISO-10646.

No need to bet. Indic scripts in Unicode are basically in the same
situation now, and so is Thai.

Correct me if I am wrong, but the principle Unicode adopts is that if
a character can be formed from a decomposed sequence (as in NFD), it
uses the decomposed form rather than assigning a code point to a
composed form.

However, this does not prevent anyone from trying to change it :-) For
example, there are two Tamil standards endorsed by Tamil Nadu, known as
TAB and TAM (Bilingual/Monolingual). TAM contains composed forms of
Tamil characters, each of which takes 2 to 3 Unicode code points to form.
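
To make the code point counts concrete, here is a rough sketch using
Python's standard unicodedata module. The helper name code_points and
the sample syllable "ko" are my own choices for illustration and do not
come from TAB or TAM; the exact results depend on the Unicode data that
ships with your Python build.

```python
import unicodedata

def code_points(text):
    """Return the code points of text as U+XXXX strings (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]

# A Latin example: precomposed e-acute versus its NFD form.
e_acute = "\u00e9"
nfd_e_acute = unicodedata.normalize("NFD", e_acute)

# The Tamil syllable "ko": letter KA (U+0B95) plus vowel sign O (U+0BCA).
# Unicode has no single code point for the whole syllable.
ko = "\u0b95\u0bca"
nfd_ko = unicodedata.normalize("NFD", ko)

print(code_points(e_acute), "->", code_points(nfd_e_acute))
print(code_points(ko), "->", code_points(nfd_ko))
```

On my reading of the Unicode data, the one visible Tamil syllable is 2
code points as typed and 3 after NFD, because the vowel sign itself
decomposes into two parts.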

-James Seng

> The bottom line is that a hard limit does not appear reasonable
> to define and implement -- at least not in a manner that is fair
> to all language groups and fairness was the objective of having
> such a hard limit.
> 
> Yours,
> 
> Ran
> rja@inet.org