
[idn] combining marks and space-like unicode char




>>I tried U+1160 followed by a Latin character in MSIE with i-Nav and in
>>Firefox with IDN turned on, and it was displayed as a wide space. It
>>is unfortunate that both implementations chose to display it as a
>>space instead of deleting it.
>>    
>>
>
>Yes. Plugins M U S T filter out U+1160 from validated ToUnicode()ed
>labels, whether or not IDNA requires that.
>
>Soobok
>
I will add this: in the standard Hangul writing system, U+1160 is
meaningful only in certain contexts (accompanied by at least one jamo
character).
But is a standalone U+1160 illegal? No, it is NOT illegal.

So blind filtering of U+1160 is a mistake. Plugins' filtering should be
context-sensitive.
That is why including such filtering in stringprep would complicate
stringprep. :-)
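
As a rough illustration of what context-sensitive filtering could look
like in a plugin's display path, here is a minimal Python sketch. It is
not taken from any existing plugin. The function names are hypothetical,
and the rule it applies (keep U+1160 only when a character in the Hangul
Jamo block U+1100 to U+11FF sits directly next to it) is an assumed
reading of "accompanied by at least one jamo character".

```python
HANGUL_JUNGSEONG_FILLER = "\u1160"


def is_conjoining_jamo(ch):
    """True for code points in the Hangul Jamo block, except U+1160 itself.

    Assumption: "jamo" here means the block U+1100..U+11FF only.
    """
    return "\u1100" <= ch <= "\u11ff" and ch != HANGUL_JUNGSEONG_FILLER


def filter_isolated_filler(label):
    """Drop U+1160 from a label unless a jamo character is adjacent to it."""
    out = []
    for i, ch in enumerate(label):
        if ch == HANGUL_JUNGSEONG_FILLER:
            prev_ok = i > 0 and is_conjoining_jamo(label[i - 1])
            next_ok = i + 1 < len(label) and is_conjoining_jamo(label[i + 1])
            if not (prev_ok or next_ok):
                continue  # isolated filler: hide it from the displayed label
        out.append(ch)
    return "".join(out)
```

With this rule, U+1160 followed by a Latin letter is dropped from the
displayed label, while U+1160 between a leading consonant and a trailing
consonant jamo is left alone.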

We can find similar problems with the "combining diacritical marks"
(U+0300 to U+036F). What if a label consists of a single combining
accent or dot-above character, without any preceding letter? It will
combine with the preceding dot delimiter, and that will produce a
confusing appearance (it looks like a colon, which is a protocol
delimiter).

AFAIK, stringprep does not prohibit a label consisting of a single
standalone combining accent character.
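
A check for this case is easy to sketch outside stringprep. The Python
snippet below is only an illustration; the function name is
hypothetical, and treating every character in a Unicode "Mark" category
(Mn, Mc, Me) as a combining mark is an assumption. It lists the labels
of a dotted name that begin with a combining mark and would therefore
attach to the preceding dot when displayed.

```python
import unicodedata


def labels_starting_with_mark(domain):
    """Return the labels whose first character is a combining mark.

    Assumption: "combining mark" means general category Mn, Mc or Me.
    """
    return [
        label
        for label in domain.split(".")
        if label and unicodedata.category(label[0]).startswith("M")
    ]
```

For example, a name whose middle label is the single character U+0307
(COMBINING DOT ABOVE) would be reported, while an all-ASCII name yields
an empty list.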

Soobok