
RE: [idn] Adding "optional" characters in draft-ietf-idn-nameprep



I did not mention Arabic because I do not feel qualified to discuss Arabic
issues.

The attitude to vowel marks in Hebrew is the same. There are also other
marks of similar nature.

Regarding identifiers, vowels and other points may be used for the reasons
given - to clarify homonyms, for children etc. - but they should be
optional. This is the way they are commonly used. Therefore they should be
ignored, and it should not be possible to register two identifiers with the
same letters and different vowels.

The number of variations (for option c) can be huge, because each individual
vowel may be omitted or given, and also spelling mistakes are possible - for
example most people are never sure when to use Patah or Qamats.

Option a is the best, in my opinion, because it most closely fits the
intention. It also solves the problem of malformed strings, where invalid
sequences of vowels have been entered.
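
A minimal Python sketch of what option a could look like for Hebrew,
assuming the "ignore" table is simply the nonspacing marks in the range
U+0591..U+05C7. That range, and the function names, are assumptions made for
this example; the actual table would have to be agreed in nameprep.

```python
import unicodedata

# Assumed table: Hebrew accents and points, U+0591..U+05C7.
HEBREW_MARKS_FIRST = 0x0591
HEBREW_MARKS_LAST = 0x05C7


def is_optional_hebrew_mark(ch):
    """True for nonspacing marks in the assumed range.

    The category check keeps punctuation in the same range (e.g. maqaf).
    """
    in_range = HEBREW_MARKS_FIRST <= ord(ch) <= HEBREW_MARKS_LAST
    return in_range and unicodedata.category(ch) == "Mn"


def strip_optional_marks(label):
    """Option a: toss the optional marks out of the input stream."""
    return "".join(ch for ch in label if not is_optional_hebrew_mark(ch))


bare = "\u05D3\u05D1\u05E8"                    # dalet, bet, resh
with_qamats = "\u05D3\u05B8\u05D1\u05B8\u05E8"  # both vowels given
with_patah = "\u05D3\u05B7\u05D1\u05B8\u05E8"   # Patah typed for Qamats
malformed = "\u05D3\u05B8\u05B7\u05D1\u05E8"    # two vowels on one letter

for s in (with_qamats, with_patah, malformed):
    print(strip_optional_marks(s) == bare)
```

All three inputs reduce to the same bare letters, so the Patah/Qamats
confusion and the malformed sequence both disappear before comparison.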

The Israeli standard for HTML takes a similar attitude - if you cannot
display them, just ignore them and don't display the unknown character mark,
but keep them in the data. This means that in links, the user may not be
aware of their existence, and so option b would cause considerable
bewilderment.

Jony

> -----Original Message-----
> From: owner-idn@ops.ietf.org [mailto:owner-idn@ops.ietf.org]On
> Behalf Of James Seng
> Sent: Sunday, August 13, 2000 10:26 AM
> To: Paul Hoffman / IMC
> Cc: idn@ops.ietf.org
> Subject: Re: [idn] Adding "optional" characters in draft-ietf-idn-nameprep
>
>
> Paul Hoffman / IMC wrote:
> > Jonathan Rosenne pointed out that we might need another class of
> > characters for processing. Hebrew vowels are optional characters that
> > some people enter, although most don't. There are probably a few such
> > characters in other written scripts as well. We have a few choices:
>
> Just wanted to point out that Arabic script also contains some diacritics
> which should be ignored. However, as the diacritics symbolise how you
> should pronounce the word, the meaning of the word may vary depending on
> the diacritics. But yes, diacritics should be ignored in normal language
> usage, since the meaning is usually inferred from context. But no,
> children are taught these diacritics to help them, but yes, they slowly
> drop them when they grow older. Confused? *argghh*
>
> > a) We can ignore these characters on input (that is, toss them out of
> > the input stream).
> >
> > b) We can prohibit the characters on input.
> >
> > c) We can allow them in names. This would mean that people
> > registering names would have to register them with and without the
> > characters (possibly in many combinations).
>
> I am more inclined towards (a), i.e. we silently ignore them on input.
> However, as I mentioned in the WG meeting, I'd like to point out that the
> choice of codepoints we put on this list is sensitive, as the list can be
> misused for other purposes (e.g. if we place '&' on the list, AT&T.COM
> will be equivalent to ATT.COM).
>
> > UI issues:
> > (a) would be easiest for users because they don't have to remember
> > whether or not to use the characters. (b) would cause users who enter
> > them in names to get an error that says an illegal character was
> > entered. (c) would also be easy for users, but only if name holders
> > register all logical possibilities of the names.
>
> (a) still sounds like the best solution here.
>
> > Complexity in nameprep:
> > (a) adds another step and another table to the nameprep, although the
> > table will be small. (b) will be easiest because we already have a
> > table of prohibited characters. (c) adds no complexity.
>
> Construction of this table is 'bad'. I hate to have to maintain yet
> another list...And who is going to do that?
>
> > Registration of names:
> > (a) and (b) allow the registration of the fewest names, that is,
> > without the optional characters. (c) would require that name owners
> > register all names that include the optional characters.
>
> I recommend that this be placed into the comparison I-D.
>
> -James Seng
>