Re: [idn] Newbie's questions implementing the [IDNA]



On Mon, Dec 09, 2002 at 11:18:28PM +0000, Adam M. Costello wrote:
> Soobok Lee <lsb@postel.co.kr> wrote:
> 
> > Only LDH and PREFIX--** ones are allowed as valid inputs to ToUnicode
> > according to IDNA.
> 
> According to the first paragraph of the ToUnicode section, any sequence
> of Unicode code points can be the input of ToUnicode.

You are right. But many novices would carelessly misinterpret "toUnicode()"
as doing some legacy-to-Unicode conversion (if they did not read the drafts
or API manuals carefully), and they may actually call
toUnicode("legacy encoded IDN") expecting that.

> 
> > the output of ToUnicode must not contain
> > unnameprepped/prohibited/(unassigned) codepoints.
> 
> The output of ToUnicode can contain unnameprepped, prohibited, and
> unassigned code points.  Simply feed such a string as input to
> ToUnicode, and the string will be output unaltered by ToUnicode.

Right. IDNA states that such outputs should not be displayed as native
labels, but just as the ASCII labels they are. My "must not" was meant in
that sense. I think this is clear enough in the drafts.

> 
> > ToUnicode is for display and verification purpose for the punycode
> > encoded labels later.
> 
> IDNA uses ToUnicode for exactly two purposes: to force labels to
> human-friendly form before displaying them to users, and to define the
> concept of ACE label (which is any label that ToUnicode would alter).
> 
> IDNA uses ToASCII for exactly two purposes: to define the concept of
> internationalized label (which is anything that ToASCII can be applied
> to without failing), and to force labels to ASCII form before putting
> them into slots (protocol fields, function arguments, etc).

Thanks.
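
As an illustration of the two definitions quoted above, here is a minimal
sketch in Python. It assumes the standard encodings.idna module, which
implements the operations as finally published in RFC 3490 (with the "xn--"
ACE prefix), not necessarily the drafts under discussion. The helper names
to_unicode, is_ace_label and is_internationalized_label are hypothetical.
Python's ToUnicode raises an error where IDNA says the original input is
returned, so the wrapper restores that "never fails" behaviour.

```python
# Sketch only: relies on Python's stdlib encodings.idna (RFC 3490 behaviour,
# ACE prefix "xn--"). The helper names below are hypothetical.
from encodings import idna


def to_unicode(label: str) -> str:
    """ToUnicode as IDNA describes it: any input is accepted, and it never fails.

    Python's idna.ToUnicode raises UnicodeError when a step fails, so this
    wrapper returns the original input unaltered in that case.
    """
    try:
        return idna.ToUnicode(label)
    except UnicodeError:
        return label


def is_ace_label(label: str) -> bool:
    """An ACE label is any label that ToUnicode would alter."""
    return to_unicode(label) != label


def is_internationalized_label(label: str) -> bool:
    """An internationalized label is anything ToASCII can be applied to
    without failing."""
    try:
        idna.ToASCII(label)
    except UnicodeError:
        return False
    return True
```

Under these assumptions, "xn--bcher-kva" is an ACE label, while "example"
is an internationalized label but not an ACE label, because ToUnicode
returns it unchanged.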

> 
> AMC