
Re: [idn] Newbie's questions implementing the [IDNA]



Soobok Lee <lsb@postel.co.kr> wrote:

> Only LDH and PREFIX--** ones are allowed as valid inputs to ToUnicode
> according to IDNA.

According to the first paragraph of the ToUnicode section, any sequence
of Unicode code points can be the input of ToUnicode.

> the output of ToUnicode must not contains
> unnameprepped/prohibited/(unassigned) codepoints.

The output of ToUnicode can contain unnameprepped, prohibited, and
unassigned code points.  ToUnicode never fails: if any of its steps
fails, it returns its original input.  So simply feed such a string as
input to ToUnicode, and the string will be output unaltered.

> ToUnicode is for display and verification purpose for the punycode
> encoded labels later.

IDNA uses ToUnicode for exactly two purposes: to force labels to
human-friendly form before displaying them to users, and to define the
concept of ACE label (which is any label that ToUnicode would alter).
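
To make the ToUnicode points above concrete, here is a rough Python
sketch (not normative; the IDNA spec is the authority).  It assumes
that CPython's encodings.idna helpers (nameprep, ToASCII) and the
built-in "punycode" codec are acceptable stand-ins for the Nameprep,
ToASCII, and Punycode steps; as far as I know those helpers behave as
if AllowUnassigned were set and UseSTD3ASCIIRules were not.  The names
to_unicode and is_ace_label are made up for this illustration.

```python
from encodings import idna  # CPython's helpers for the IDNA operations

ACE_PREFIX = "xn--"


def to_unicode(label: str) -> str:
    """Sketch of ToUnicode: accepts any string, never fails.

    Whenever a step fails, the original input is returned unchanged.
    """
    original = label
    try:
        # Steps 1-2: Nameprep is applied only if the input is not all ASCII.
        if not label.isascii():
            label = idna.nameprep(label)
        # Step 3: the label must begin with the ACE prefix (any letter case).
        if not label.lower().startswith(ACE_PREFIX):
            return original
        saved = label
        # Steps 4-5: strip the prefix and Punycode-decode the remainder.
        decoded = label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
        # Steps 6-7: re-encode and compare case-insensitively with the
        # saved copy, so only genuine ACE labels are converted.
        if idna.ToASCII(decoded).decode("ascii").lower() != saved.lower():
            return original
        # Step 8: return the decoded form.
        return decoded
    except UnicodeError:
        # Any failure (Nameprep, ASCII check, Punycode, ToASCII) lands here.
        return original


def is_ace_label(label: str) -> bool:
    """An ACE label is any label that ToUnicode would alter."""
    return to_unicode(label) != label
```

With this sketch, to_unicode("xn--bcher-kva") gives "bücher", so that
label counts as an ACE label, while "example", "bücher", and a string
containing a prohibited code point such as U+2028 all come back
unaltered and therefore do not.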

IDNA uses ToASCII for exactly two purposes: to define the concept of
internationalized label (which is anything that ToASCII can be applied
to without failing), and to force labels to ASCII form before putting
them into slots (protocol fields, function arguments, etc).
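
The two uses of ToASCII can be sketched the same way.  Again this is
only an illustration, assuming CPython's encodings.idna.ToASCII is an
acceptable stand-in for the spec's ToASCII (it signals failure by
raising UnicodeError).  The names is_internationalized_label and
to_slot_form are hypothetical.

```python
from encodings import idna  # CPython's helpers for the IDNA operations


def is_internationalized_label(label: str) -> bool:
    """An internationalized label is anything ToASCII accepts without failing."""
    try:
        idna.ToASCII(label)
        return True
    except UnicodeError:
        return False


def to_slot_form(label: str) -> str:
    """Force a label to ASCII form before putting it into a slot.

    Unlike ToUnicode, ToASCII can fail; here the UnicodeError is left
    to propagate so the caller never stores an unconverted label.
    """
    return idna.ToASCII(label).decode("ascii")
```

For example, to_slot_form("bücher") gives "xn--bcher-kva", and plain
"example" passes through unchanged.  An empty label, a label longer
than 63 octets, a label containing a prohibited code point, and a
non-ASCII label that already starts with the ACE prefix all fail, so
is_internationalized_label returns False for them.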

AMC