
Re: [idn] Re: IDNA: is the specification proper, adequate, and complete?



Mark Davis <mark at macchiato dot com> wrote:

> 2. There are no "non-Unicode coding systems" that unify beta and
> eszed; the language issue is irrelevant.

MS-DOS code page 437 had a character at 0xE1 that was sometimes rendered
more like a sharp-s and sometimes more like a small beta, depending on
which screen font you were using.  In the standard 8x8 and 8x14 fonts
it was very definitely a beta, but in the 8x16 font it was a sharp-s.

To make matters worse, the character was surrounded by small alpha at
0xE0 and capital gamma at 0xE2, implying a continuous sequence of Greek
letters.  But in code page 850, 0xE1 was unmistakably a sharp-s.
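
As a rough illustration of how this ambiguity was later pinned down, the
sketch below decodes bytes 0xE0 through 0xE2 under both code pages and
prints the Unicode name each one maps to.  It assumes Python's bundled
"cp437" and "cp850" codecs, which are one particular set of mapping
tables and not a statement about what any ROM font actually drew; the
helper name describe() is made up for this example.

```python
import unicodedata


def describe(byte_value, codec):
    """Decode one byte under the given codec and name the resulting character."""
    ch = bytes([byte_value]).decode(codec)
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    # Compare 0xE1 and its two neighbours in each code page.
    for codec in ("cp437", "cp850"):
        for byte_value in (0xE0, 0xE1, 0xE2):
            print(codec, hex(byte_value), describe(byte_value, codec))
```

In these tables, CP437 0xE1 resolves to the sharp-s (U+00DF) even though
its neighbours resolve to Greek alpha and capital gamma, so the mapping
takes one side of the question without removing the historical ambiguity
in the glyphs.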

One might say that CP437 was inadequate to encode Greek anyway, but then
it was inadequate to encode plenty of Latin-based languages as well, so
that was not a great argument against the "beta" interpretation.

Of course, character tables hardcoded in the ROMs of PCs weren't
intended to be definitive.  Things were looser then.  Before Unicode
came around as sort of the "gold standard" of character encodings, most
people didn't worry about whether 0xE1 was a beta or a sharp-s.  It
could be either, depending on context.  Now our expectations are higher,
and we need to differentiate between the two.

-Doug Ewell
 Fullerton, California