
RE: [idn] Comments on IDNA/stringprep/nameprep





> -----Original Message-----
Paul Hoffman / IMC:

> It seems a tad inappropriate for the IETF to consider not only going 
> against the Unicode standard but *also* against the desires of the 
> Korean national body to ISO because of the desires of a small number 
> of people who are neither representatives of standards bodies or the 
> national governments that are most interested in the script in 
> question.


It seems to be much more than a tad inappropriate (for anyone) to go
against the *design* of a script. Note that Hangul really was designed,
by a working group, resulting in a document describing the (then) new
script.  So there is no after-the-fact reconstruction involved when
referring to "the design" here.  There have been some minor changes of
use; for example, IEUNG in trailing position now stands for what
YESIEUNG ("ng") used to stand for.  But that does not affect the
structure of the design.  See "The Korean Language" [1]; section 6.3
gives an English translation of the original description of the script
design.  (Note that I'm not suggesting that YESIEUNG be folded to
IEUNG; they are different letters, albeit similar.)


I think I know why the letter cluster characters were invented, and 
why one did not want them to be decomposed at the time: it has to do
with collation.  But getting Hangul correctly collated *can and
should* be done without resorting to these letter cluster characters.
(There are several methods, irrelevant for this group.)


Note that it is not entirely unusual for SC2/WG2 to go "against the
desires of the NNN national body to ISO" when that body asks for some
rearrangement or precomposition of characters, referring said NB to
ISO/IEC 14651 for how to achieve correct collation order.


For all other (alphabetic or syllabic) scripts the rule is: same
sequence of letters (including diacritics) = same "nameprepped" form =
same "name".  Indeed, for IDN we even disregard all case differences
(for historical reasons, which we therefore must generalise, even
though, as has been pointed out, there are still some technical
problems with that [Turkish and Greek: dotted i and adscript iota...]).
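
As a rough illustration of the dotted-i problem mentioned above, here is
a small Python sketch.  It uses Python's built-in str.casefold() as a
stand-in for the case mapping step; that is an assumption made for
illustration only and is not the nameprep mapping table itself.  The
helper name same_ignoring_case is likewise made up for this sketch.

```python
# Full Unicode case folding via str.casefold(); it is locale-independent.
def same_ignoring_case(a: str, b: str) -> bool:
    """Compare two strings after default (non-Turkic) case folding."""
    return a.casefold() == b.casefold()

# Plain Latin labels compare equal regardless of case.
latin_ok = same_ignoring_case("EXAMPLE", "example")

# Turkish pairs capital I with dotless i (U+0131), but the default
# folding maps I to dotted i, so this Turkish case pair is NOT matched.
turkish_pair = same_ignoring_case("I", "\u0131")

# Capital I with dot above (U+0130) folds to "i" followed by
# U+0307 COMBINING DOT ABOVE, not to a plain "i".
dotted_capital = "\u0130".casefold()
```

So a locale-independent folding treats "I" and dotless i as different
names, which is the kind of technical problem alluded to above.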


Now why, in this instance, should we regard multiple low-level
representations of exactly the ***SAME*** sequence of letters
(and [tone] marks), according to the design, as being different
at all?  They will not only look the same (if properly
supported), they *ARE* the same, even if one representation
uses letter cluster characters and the other does not.
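
To make the point concrete, here is a minimal Python sketch.  The table
CLUSTER_TO_LETTERS and the function fold_clusters are hypothetical names
introduced only for this illustration; the table covers just the five
doubled leading consonants and is nowhere near complete.  It is not
taken from any stringprep or nameprep draft.

```python
import unicodedata

# Hypothetical, incomplete table: letter cluster jamo -> component letters.
CLUSTER_TO_LETTERS = {
    "\u1101": "\u1100\u1100",  # SSANGKIYEOK -> KIYEOK KIYEOK
    "\u1104": "\u1103\u1103",  # SSANGTIKEUT -> TIKEUT TIKEUT
    "\u1108": "\u1107\u1107",  # SSANGPIEUP  -> PIEUP PIEUP
    "\u110A": "\u1109\u1109",  # SSANGSIOS   -> SIOS SIOS
    "\u110D": "\u110C\u110C",  # SSANGCIEUC  -> CIEUC CIEUC
}

def fold_clusters(text: str) -> str:
    """Decompose syllables to jamo (NFD), then split cluster jamo."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(CLUSTER_TO_LETTERS.get(ch, ch) for ch in decomposed)

with_cluster = "\u1101\u1161"        # SSANGKIYEOK + A
with_letters = "\u1100\u1100\u1161"  # KIYEOK + KIYEOK + A

# Standard normalization keeps the two spellings distinct.
nfkc_equal = (unicodedata.normalize("NFKC", with_cluster)
              == unicodedata.normalize("NFKC", with_letters))

# After the extra (assumed) cluster folding they compare equal.
folded_equal = fold_clusters(with_cluster) == fold_clusters(with_letters)
```

Here nfkc_equal is False while folded_equal is True: the same sequence
of letters ends up as two different "names" unless a step of this kind
is added on top of normalization.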


		Kind regards
		/kent k


PS
Note still that this is very different from the Simplified/Traditional
Chinese (SC/TC) case, where the spelling is different for the same words.


[1] SOHN, Ho-Min, "The Korean Language", Cambridge University
    Press, 1999.


> 
> --Paul Hoffman, Director
> --Internet Mail Consortium
>