
Re: Thread on - Re: [idn] Prohibit CDN code points



Dear Adam and all:
          Thanks for your reply. It is very clear: Punycode has some optional
features that implementers may choose to support. How the annotation flag
information is input is an implementation issue.

> > Q1:  U+hhhh can be represented as u+hhhh or not ?
>
> The Unicode standard always uses U+, never u+, and the same is true of
> the IDNA draft.  The Punycode draft always uses U+ in the main spec, but
> the sample implementation uses both U+ and u+ in order to represent the
> annotation flags, and the examples section likewise uses both U+ and u+
> to make it easy to feed the examples into the sample implementation.
>
        As I understand it: if an implementation wants to use the annotation
feature, then the annotation flag should be passed from ToASCII through
nameprep to Punycode. If the tradition of IETF RFC protocols is followed,
the input parameter will be an ASCII string such as U+hhhh, so the case of
the prefix (U+ or u+) can serve as the annotation flag.
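
A minimal sketch of how such an ASCII token could be split into a code point
and an annotation flag. The function name parse_annotated is hypothetical and
is not taken from the Punycode draft or its sample implementation; the sketch
assumes only the convention discussed in this thread, that "U+" means flag set
and "u+" means flag clear.

```python
def parse_annotated(token):
    """Split 'U+hhhh' or 'u+hhhh' into (code_point, flag).

    Hypothetical helper: the flag is True for the 'U+' prefix and
    False for the 'u+' prefix. The code point itself is just an integer
    and does not depend on the case of the prefix.
    """
    prefix, hex_digits = token[:2], token[2:]
    if prefix not in ("U+", "u+"):
        raise ValueError("expected a U+hhhh or u+hhhh token: %r" % token)
    # int() raises ValueError if the hex digits are missing or malformed.
    return int(hex_digits, 16), prefix == "U+"


if __name__ == "__main__":
    for tok in ("U+03B1", "u+03B1"):
        print(tok, parse_annotated(tok))
```

Both tokens yield the same integer; only the flag differs.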
> > Q2:  Here U+HHHH is not a hostname , does it MUST be forced to lower
> > u+hhhh or not in nameprep ?
>
> The case of the U is not part of the code point.  A code point is just
> an integer.  For example, U+0391 and u+0391 both represent the integer
> 913 (decimal) which is the code point for uppercase alpha.  U+03B1
> and u+03B1 both represent the integer 945 (decimal) which is the code
> point for lowercase alpha.  Nameprep always converts uppercase alpha
> to lowercase alpha (so it would always output 945, never 913), but a
> nameprep implementation that included support for mixed case annotations
> would output not only an array of code points but also a parallel
> array of case flags, and the lowercase alpha (945) would be flagged
> as "wanting to be uppercase".  The flags could be passed along to the
> Punycode encoder and recovered by the Punycode decoder.
>
> The Punycode sample implementation and examples sections use U+03B1
> to mean "lowercase alpha with flag set (wants to be uppercase)" and
> use u+03B1 to mean "lowercase alpha with flag clear (wants to stay
> lowercase)".
>
> The flags have no effect on which ASCII letters and digits are output
> by the Punycode encoder.  The flags merely affect the upper/lowercase
> property of the ASCII letters.
>
              I think it is very clear: the annotation flag simply lets
Punycode, given input written as U+hhhh or u+hhhh, produce an LDH string in
which the corresponding part of the output is in upper or lower case.
Punycode does not change any result from nameprep, even when it works in
single-case annotation mode.
            Actually, any ACE encoder could do the same, but Punycode is
special in that it treats basic code points and non-basic code points as two
separate parts, so the original flag information can be shown in the output
string even though the encoded code point is a lowercase code point. The
displayed case has no effect on DNS queries, but it lets the user see the IDN
in its original form, without feeling that nameprep has done something
strange to the ML-domain name.
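
The idea of case flags travelling alongside the lowercased code points can be
sketched as follows. This is only an illustration under stated assumptions:
Python's str.lower() stands in for the nameprep case mapping (real nameprep
does more), the Punycode encoding step is omitted entirely, and the function
names fold_with_flags and restore_display are hypothetical.

```python
def fold_with_flags(text):
    """Return (code_points, flags) for the given text.

    Assumption: str.lower() is used as a stand-in for nameprep's case
    mapping. Each flag records whether the original character was changed
    by lowercasing, i.e. it "wants to be uppercase" when displayed.
    """
    code_points, flags = [], []
    for ch in text:
        lower = ch.lower()
        if len(lower) != 1:
            raise ValueError("multi-character case mapping not handled: %r" % ch)
        code_points.append(ord(lower))
        flags.append(ch != lower)
    return code_points, flags


def restore_display(code_points, flags):
    """Rebuild a display form by re-applying the case flags."""
    return "".join(
        chr(cp).upper() if flag else chr(cp)
        for cp, flag in zip(code_points, flags)
    )


if __name__ == "__main__":
    cps, flags = fold_with_flags("\u0391\u03b1")  # uppercase alpha, lowercase alpha
    print(cps, flags)
    print(restore_display(cps, flags))
```

The code points used for lookup are the same whatever the flags say; the flags
only affect what is shown to the user.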

Thanks for your careful answers.
Best regards

L.M.Tseng