
[idn] [Fwd: Identifier comparisons for IDN, XML]



Mark Davis/Cupertino/IBM wrote:

> One evening at the recent W3C i18n meeting in Seattle, I wrote a program to
> generate data files that contain the differences between the XML
> identifiers, the Unicode recommended identifiers, and nameprep. I put the
> results on: http://www.macchiato.com/unicode/IdentifierDiff.txt
>
> This is only informational, to get an idea of how the three of them differ.
> I tried to segment the differences in a meaningful way within the file. In
> so doing, I also generated a data file that shows when characters came into
> Unicode. It is at http://www.macchiato.com/unicode/CharacterAge.txt.
>
> For Nameprep I actually used the canonical closure, where the canonical
> closure of X is the set of all characters that are canonically equivalent
> to a sequence of one or more characters from X.
>
> For both files, if you view them as UTF-8 you can see the characters as
> well as the names and code points.
>
> Mark
> ___
> Mark Davis, IBM Center for Java Technology, Cupertino
> (408) 777-5850 [fax: 5891], mark.davis@us.ibm.com, president@unicode.org
> http://maps.yahoo.com/py/maps.py?Pyt=Tmap&addr=10275+N.+De+Anza&csz=95014
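
The canonical closure described in the message can be illustrated with a short sketch. This is not the program Mark wrote; it is a simplified illustration using Python's standard unicodedata module. The function name canonical_closure and the candidates parameter are hypothetical. It also assumes a simplification: a character counts as equivalent to a sequence from X only when its NFD form can be split into pieces that each equal the NFD form of some character in X, so cases that depend on canonical reordering of combining marks are not covered.

```python
import sys
import unicodedata


def nfd(text):
    """Return the canonical decomposition (NFD) of text."""
    return unicodedata.normalize("NFD", text)


def canonical_closure(base, candidates=None):
    """Approximate the canonical closure of the character set `base`.

    A candidate is added when its NFD form can be segmented into pieces
    that are each the NFD form of a character in `base`. Reordering of
    combining marks is ignored (a stated simplification).
    """
    if candidates is None:
        # Default: scan every code point (slow, but simple).
        candidates = map(chr, range(sys.maxunicode + 1))

    pieces = {nfd(ch) for ch in base}
    max_len = max((len(p) for p in pieces), default=0)
    closure = set(base)

    for ch in candidates:
        if ch in closure:
            continue
        target = nfd(ch)
        # reachable[i] is True when target[:i] can be built from pieces.
        reachable = [True] + [False] * len(target)
        for end in range(1, len(target) + 1):
            for start in range(max(0, end - max_len), end):
                if reachable[start] and target[start:end] in pieces:
                    reachable[end] = True
                    break
        if reachable[-1]:
            closure.add(ch)
    return closure


if __name__ == "__main__":
    # X = {e, COMBINING ACUTE ACCENT}; scan only the first few blocks.
    result = canonical_closure({"e", "\u0301"}, map(chr, range(0x250)))
    for ch in sorted(result):
        print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
```

With X = {e, U+0301}, the precomposed U+00E9 is pulled into the closure because its decomposition is exactly that two-character sequence, while U+00E8 is not, since U+0300 is absent from X.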