
[idn] combining marks and space-like unicode char

To: Soobok Lee <lsb@lsb.org>
Subject: [idn] combining marks and space-like unicode char
From: Soobok Lee <lsb@lsb.org>
Date: Fri, 08 Apr 2005 16:21:28 +0900
Cc: Erik van der Poel <erik@vanderpoel.org>, idn@ops.ietf.org
In-reply-to: <42562D22.3090609@lsb.org>
References: <42181FD5.3070608@lsb.org> <4255E488.8010302@vanderpoel.org> <42562D22.3090609@lsb.org>
User-agent: Mozilla Thunderbird 1.0 (Windows/20041206)


>>I tried U+1160 followed by a Latin character in MSIE with i-Nav and in
>>Firefox with IDN turned on, and it was displayed as a wide space. It
>>is unfortunate that both implementations chose to display it as a
>>space instead of deleting it.
>>    
>>
>
>Yes. Plugins M U S T filter out U+1160 from validated ToUnicode()ed
>labels, whether or not IDNA requires that.
>
>Soobok
>
I will add this: in the standard Hangul writing system, U+1160 is
meaningful only in certain contexts (when it is adjacent to at least one
jamo character).
But is a standalone U+1160 illegal? No, it is NOT illegal.

So blind filtering of U+1160 is a mistake. Plugins' filtering should be
context-sensitive.
That is why including it in stringprep would complicate stringprep. :-)
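
To make "context-sensitive" concrete, here is a minimal sketch of what a
plugin's display-time filter might look like. The function names, the
choice of the Hangul Jamo block (U+1100..U+11FF) as the test for "jamo
char", and the rule "keep the filler only when a neighbour is a jamo" are
all my assumptions for illustration; nothing here is required by IDNA or
stringprep.

```python
# Sketch only: the adjacency rule and the helper names are assumptions,
# not anything specified by IDNA, nameprep, or stringprep.

HANGUL_CHOSEONG_FILLER = "\u115f"
HANGUL_JUNGSEONG_FILLER = "\u1160"


def is_jamo(ch):
    """True for a Hangul Jamo block character that is not a filler."""
    if ch in (HANGUL_CHOSEONG_FILLER, HANGUL_JUNGSEONG_FILLER):
        return False
    return "\u1100" <= ch <= "\u11ff"


def filter_filler_for_display(label):
    """Drop U+1160 unless the previous or next character is a jamo."""
    kept = []
    for i, ch in enumerate(label):
        if ch == HANGUL_JUNGSEONG_FILLER:
            prev_is_jamo = i > 0 and is_jamo(label[i - 1])
            next_is_jamo = i + 1 < len(label) and is_jamo(label[i + 1])
            if not (prev_is_jamo or next_is_jamo):
                continue  # filler with no jamo neighbour: not displayed
        kept.append(ch)
    return "".join(kept)
```

With this rule, U+1160 followed by a Latin letter is dropped, while
U+1160 sitting between a leading and a trailing jamo is left alone.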

We can find similar problems with the "combining diacritical marks"
(U+03xx, i.e. the U+0300-U+036F block). What if a label consists of a
single combining accent or dot-above character, with no preceding
letter? It will combine with the preceding dot delimiter, and that will
produce a confusing appearance (it looks like a colon, which is a
protocol delimiter).

AFAIK, any single standalone combining accent char is not prohibited by
stringprep.
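
A plugin could at least detect this case before display. The sketch below
is only an illustration: the function name is hypothetical, and treating
"first character has general category Mn, Mc, or Me" as the test for a
label that starts with a combining mark is my assumption, as is splitting
on U+002E only. Whether such a label should then be rejected or merely
flagged is a policy question that this code does not answer.

```python
import unicodedata

# Unicode general categories for combining marks:
# Mn (nonspacing), Mc (spacing combining), Me (enclosing).
MARK_CATEGORIES = ("Mn", "Mc", "Me")


def labels_starting_with_mark(domain):
    """Return the labels whose first character is a combining mark.

    Sketch only: splits on U+002E and ignores empty labels.
    """
    suspicious = []
    for label in domain.split("."):
        if label and unicodedata.category(label[0]) in MARK_CATEGORIES:
            suspicious.append(label)
    return suspicious
```

For example, a name whose middle label is just U+0307 (COMBINING DOT
ABOVE) is reported, while a name in which the mark follows a base letter
is not.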

Soobok

References:
- [idn] space-like unicode char
  - From: Soobok Lee <lsb@lsb.org>
- Re: [idn] space-like unicode char
  - From: Erik van der Poel <erik@vanderpoel.org>
- Re: [idn] space-like unicode char
  - From: Soobok Lee <lsb@lsb.org>
