Re: [idn] ToUnicode output can be longer than input
Edmon Chung <email@example.com> wrote:
> > x n - - fi fi - a ffl u e n t - s o u ffl - viii - u i c
> > The spaces are not really there, they just indicate the clusters, which
> > represent single code points (ligatures and roman numerals: U+FB01,
> > U+FB04, U+2177). That's 24 code points.
> If I counted it correctly, there are 33 "codepoints" in the above ACE
fi represents one code point (U+FB01), ffl represents one code point
(U+FB04), and viii represents one code point (U+2177). Now if you count
again, you should count 24. I'm trying to describe a non-ASCII ACE
string containing 24 code points, some of which are ASCII and some of
which are compatibility characters.
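
The two counts can be checked mechanically. The following is only a sketch: it uses Python's unicodedata.normalize with NFKC as a stand-in for the compatibility folding step, which is an assumption for illustration and not a full Nameprep or ToUnicode implementation. The names clusters, original, and folded are made up for this example. Counting the letters of each cluster separately gives 33; counting each cluster as one code point gives 24.

```python
import unicodedata

# Each list element is one code point; the three compatibility
# characters are written as escapes so the clusters stay visible.
clusters = [
    "x", "n", "-", "-",
    "\ufb01", "\ufb01", "-",           # U+FB01 (fi ligature), twice
    "a", "\ufb04", "u", "e", "n", "t", "-",  # U+FB04 (ffl ligature)
    "s", "o", "u", "\ufb04", "-",
    "\u2177", "-",                     # U+2177 (small roman numeral eight)
    "u", "i", "c",
]
original = "".join(clusters)

# NFKC replaces each compatibility character with its ASCII expansion.
# This approximates only the normalization part of Nameprep.
folded = unicodedata.normalize("NFKC", original)

print(len(original))  # number of code points before folding
print(len(folded))    # number of code points after folding
print(folded)
```

Under that assumption, the folded string is longer than the original because each ligature expands to two or three ASCII letters and the roman numeral expands to four.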