
[idn] Re: IDN WG Last Call on two major changes to Stringprep



Quoting the draft:

,----
| In any profile that specifies bidirectional character handling, all
| three of the following requirements MUST be met:
...
| 2) If a string contains any Right-to-Left character (defined as
| belonging to Unicode bidirectional categories "R" and "AL"), the string
| MUST NOT contain any Left-to-Right character (defined as belonging to
| Unicode bidirectional category "L").
| 
| 3)  If a string contains any Right-to-Left character (as defined above),
| a Right-to-Left character MUST be the first character of the string, and
| a Right-to-Left character MUST be the last character of the string.
`----

The draft gives little rationale for the last two requirements.
Without knowing the rationale, it is difficult to understand how to
implement them, let alone to understand and evaluate the specification.

It is not difficult to construct strings that violate these
requirements but seem like valid identifiers to me (e.g., U+05D0
U+0966; imagine it being written by a mathematically inclined
writer in India).  Why is U+05D0 an R/AL character but U+2135 not?
NFKC maps U+2135 to U+05D0.  It thus seems that an identifier written
with U+2135 instead of U+05D0 is a valid IDN if NFKC is not used, but
is not a valid identifier if NFKC is used.  A bidi user thus seems to
require that NFKC not be used in order to have the bidi string accepted.
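
To make the example concrete, here is a rough sketch of how I read
requirements 2 and 3, applied to the strings above.  It is not taken
from the draft: the function name bidi_violations is made up for
illustration, and it assumes that Python's unicodedata.ucd_3_2_0
(the Unicode 3.2 database) is a fair stand-in for the tables the
draft has in mind.

```python
import unicodedata

# Assumption: the Unicode 3.2 database approximates the draft's tables.
ucd = unicodedata.ucd_3_2_0

RTL = {"R", "AL"}  # "Right-to-Left" categories named in requirement 2
LTR = {"L"}        # "Left-to-Right" category named in requirement 2


def bidi_violations(s):
    """Return which of the quoted requirements (2, 3) the string violates.

    Hypothetical helper, written from the quoted text only.
    """
    cats = [ucd.bidirectional(ch) for ch in s]
    if not any(c in RTL for c in cats):
        return []  # both requirements only apply if an R/AL char is present
    violated = []
    if any(c in LTR for c in cats):
        violated.append(2)  # R/AL and L characters mixed in one string
    if cats[0] not in RTL or cats[-1] not in RTL:
        violated.append(3)  # first or last character is not R/AL
    return violated


example = "\u05D0\u0966"       # HEBREW LETTER ALEF, DEVANAGARI DIGIT ZERO
with_symbol = "\u2135\u0966"   # same string with ALEF SYMBOL instead
normalized = ucd.normalize("NFKC", with_symbol)

print(bidi_violations(example))
print(ucd.bidirectional("\u2135"), bidi_violations(with_symbol))
print(normalized == example, bidi_violations(normalized))
```

The second print shows how the un-normalized string is classified by
whatever database is in use; the third shows that the NFKC form is the
same string as the first example and so fails the same requirements.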