
[idn] RE: alpha v0.3




Some comments on the draft requirements document:

> A character is the smallest component of written language that has
> semantic value. A character has a single abstract meaning and/or shape,
> but not a specific shape.

Both of these sentences are objectionable.  A character is usually
smaller than anything that has *semantic* value in any sense.
Nor does a character have a single abstract *meaning*.

I don't have any really good substitute sentences to suggest right
now.

-------

> A glyph is the specific shape that a character can have when it is
> rendered or displayed. A single glyph may correspond to a single
> character, or it may correspond to many characters; for example, the
> same glyph is used to represent the Latin capital letter "P" and the
> Greek capital letter "Rho". Similarly, a single character may
> correspond to multiple glyphs due to font, formatting style, national
> differences, and other reasons.

Contextual shaping (Arabic in particular) should be mentioned.

A single glyph may also correspond to a sequence of characters,
as when ligatures are formed.  And a character or character
sequence may be rendered using multiple pieced-together subglyphs.

On the other hand, does the IDN work need to bother with this at all?
These are display issues, and I hope display issues are out of scope.
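
To make the glyph/character distinctions above concrete, here is a
minimal sketch in Python.  It assumes that Unicode code points stand in
for "characters" and uses only the standard unicodedata module; the
helper names describe and fold are made up for this illustration.  Note
that the ligature and the Arabic positional forms used here are Unicode
compatibility characters, chosen only because they make the point
visible in plain text.  A real renderer normally selects such glyphs
from the font, not from these code points.

```python
import unicodedata

def describe(ch):
    """Return the code point and Unicode name of one character (illustrative helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

def fold(text):
    """Apply NFKC compatibility normalization (illustrative helper)."""
    return unicodedata.normalize("NFKC", text)

# Same glyph, different characters: Latin capital P and Greek capital Rho.
latin_p = "\u0050"
greek_rho = "\u03A1"
print(describe(latin_p))
print(describe(greek_rho))
print("equal as characters:", latin_p == greek_rho)

# One glyph, a sequence of characters: the "fi" ligature presentation form.
ligature_fi = "\uFB01"
print(describe(ligature_fi), "->", fold(ligature_fi))

# One character, several glyphs: the isolated, final, initial and medial
# presentation forms of Arabic BEH all fold back to the single letter U+0628.
beh_forms = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]
for form in beh_forms:
    print(describe(form), "->", describe(fold(form)))
```

The comparison of P and Rho shows that two characters stay distinct
even when their glyphs coincide, while the ligature and the BEH forms
show display variants collapsing to the underlying character(s).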

--------

> A language is a way that humans interact. In written form, a language
> is expressed in characters.

Well, it's "codified" for IT use as "characters".  When writing by
hand, or when using pre-computer-era printing techniques, one does/did
not use "characters" in the sense we mean here.

> The same set of characters can often be
> used in many languages, and many languages can be expressed using
> different scripts.

Should there be some words about transliteration?  Transcription?
Such things may have an effect on IDNs in that one may wish
to register, or prevent others from registering, the "same" name
transcribed/transliterated into other scripts.  But that is
a registration issue, not a technical one.  There are some
ISO standards on transliteration, maybe also on transcription,
but I have no detailed information on them.  John Clews
would know much more about that.

> A particular charset may have different glyphs
> (shapes) depending on the language being used.

I think "language" is not the real selector here; it is
more a matter of which "printing tradition" is used.
A tradition may be more or less confined to particular
languages, but it is not language per se that
determines the glyphs.  To take a non-CJK example: Fraktur
is very different from 'ordinary' Latin fonts.
But Fraktur could be used for several languages
(and was).  It lasted the longest in Germany,
but that does not mean that German and Fraktur
are, or rather were, linked in any way besides
popularity.

This is another display issue, and as such ought to be
out of scope.

-----

Karlsson Kent <keka@im.se>  ----> Kent Karlsson <keka@im.se>