
Re: [idn] Last call comments to nameprep/stringprep: BIDI




As an expert in modern text processing of the Arabic script, having put my
head deep into the mouth of the bidirectional algorithm, I wish to express
my complete approval of Martin Duerst's comments. The suggested solution
solves many of the security and confusion problems that may arise from the
complexities of bidirectional scripts.

The problem may need to be discussed further, but Mark's suggestion is the
best I have seen so far.

Roozbeh Pournader

On Mon, 11 Feb 2002, Martin Duerst wrote:

> Currently, neither draft-ietf-idn-nameprep-07.txt nor
> draft-hoffman-stringprep-00.txt deals with bidirectionality
> issues (mixing right-to-left (Arabic/Hebrew) and left-to-right
> writing directions). This should be changed as soon
> as possible.
> 
> If a label can contain both right-to-left and left-to-right
> characters, how it will be displayed, and how displayed
> labels will be entered and looked up in the DNS, is highly
> context-dependent. This is obviously very undesirable.
> 
> The following is a proposal written up by Mark Davis,
> based on input from others:
> 
> 
>  >>>>
> A. Characters are classified into RTL, LTR, DIG (DIGIT), and OTH (OTHER).
> 
> These categories are drawn from the BIDI algorithm. The precise lists of 
> characters in each category would be added to NamePrep as an appendix. The 
> composition is as follows (see
> <http://www.unicode.org/reports/tr9/#Bidirectional_Character_Types>).
> 
> LTR   := L ; # including LRM
> 
> RTL   := R | AL ;
> 
> DIG   := EN | AN ;
> 
> OTH := all other characters: NSM, ON, etc.
> 
> Note: The characters in categories LRM, RLM, LRO, RLO, LRE, RLE, PDF, B, S, 
> and some other BIDI categories are prohibited anyway.
> 
> 
> B. In any field that contains any RTL characters:
> B0. no LTR characters can occur.
> C1. a sequence of characters of type DIG can only occur at the end.
> C2. a sequence of characters of type OTHER can occur only between 
> characters of type RTL.
>  >>>>
> 
> I propose that this be added as an additional step after the current
> 'prohibition' step.
> 
> Regards,    Martin.
> 
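
To make the rules in the quoted proposal concrete, here is a minimal
sketch in Python of how such a check might look. It is only one reading
of the proposal: each character is mapped to one of the four categories,
and a field containing any RTL character is accepted only if it has the
shape "RTL, then any number of (OTH* RTL), then trailing DIG*". The names
bidi_category and label_is_acceptable are hypothetical, and the categories
come from Python's unicodedata module (whatever Unicode version it ships),
not from the appendix the proposal would add to NamePrep.

```python
import re
import unicodedata


def bidi_category(ch):
    """Map one character to L (LTR), R (RTL), D (DIG) or O (OTH).

    Assumption: categories are taken from the Unicode bidirectional
    class reported by unicodedata, not from a NamePrep appendix.
    """
    bidi = unicodedata.bidirectional(ch)
    if bidi == "L":
        return "L"
    if bidi in ("R", "AL"):
        return "R"
    if bidi in ("EN", "AN"):
        return "D"
    return "O"  # NSM, ON, ES, unassigned, and everything else


# Shape allowed for a field that contains RTL characters (one reading of
# the proposal): no L at all, O runs only between two R characters, and
# a D run only at the very end.
_RTL_SHAPE = re.compile(r"R(?:O*R)*D*")


def label_is_acceptable(label):
    """Return True if the label passes the sketched bidi check."""
    cats = "".join(bidi_category(ch) for ch in label)
    if "R" not in cats:
        return True  # the rules apply only to fields with RTL characters
    return _RTL_SHAPE.fullmatch(cats) is not None
```

Under this reading, a Hebrew or Arabic label with trailing digits passes,
while the same label with leading digits, a trailing hyphen, or any mixed-in
Latin letter is rejected. Labels without RTL characters are not affected.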


-- 
Note: If you want me to read a message, please make sure you include my
address in the "To" or "CC" fields. I may not be able to follow all the
discussions on the mailing lists I subscribe to. Sorry. (No, receiving
duplicates is not a problem.)