[idn] Unassigned code points discussion in draft-hoffman-stringprep-07.txt



It has been pointed out that the handling of unassigned code points in
stringprep is unclear, and I agree.  Quoting the document:

,----
| 2. Preparation Overview
|
| The steps for preparing strings are:
|
| 1) Map -- For each character in the input, check if it has a mapping
| and, if so, replace it with its mapping. This is described in section 3.
|
| 2) Normalize -- Possibly normalize the result of step 1 using Unicode
| normalization. This is described in section 4.
|
| 3) Prohibit -- Check for any characters that are not allowed in the
| output. If any are found, return an error. This is described in section
| 5.
|
| 4) Check bidi -- Possibly check for right-to-left characters, and if any
| are found, make sure that the whole string satisfies the requirements
| for bidirectional strings. If the string does not satisfy the requirements
| for bidirectional strings, return an error. This is described in section 6.
|
| The above steps MUST be performed in the order given to comply with this
| specification.
`----

A step for handling unassigned code points would make it clearer:

5) Check unassigned code points -- Possibly check the output for
   unassigned code points, according to the profile.  This is
   described in section 7.

A comment on whether this is what was intended or not would be
appreciated.

It could be argued that step 3 already covers unassigned code points,
but prohibited and unassigned code points are treated separately
elsewhere in the document, and the forward reference in step 3 points
only to section 5, not to section 7.  So unless it is stated explicitly
that step 3 covers unassigned code points too, a reader is unlikely to
reach that conclusion from the document.
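
To make the proposal concrete, here is a rough sketch in Python of the
four quoted steps with the suggested step 5 added as a separate,
profile-controlled check.  It is only an illustration under several
assumptions: it uses the tables in Python's standard stringprep module,
which follow RFC 3454 (the later published form of this draft, so the
table contents may differ from -07); the set of prohibition tables, the
bidi rule as coded, and the names prepare(), PROHIBITED,
StringprepError and allow_unassigned are all made up for the example
and are not taken from the draft.

```python
import stringprep
from unicodedata import ucd_3_2_0  # Unicode 3.2 data, matching the stringprep tables


class StringprepError(ValueError):
    """Hypothetical error type for a string that fails a check."""


# Assumption: an illustrative selection of prohibition tables.
# A real profile chooses its own set.
PROHIBITED = (
    stringprep.in_table_c12,
    stringprep.in_table_c21_c22,
    stringprep.in_table_c3,
    stringprep.in_table_c4,
    stringprep.in_table_c5,
    stringprep.in_table_c6,
    stringprep.in_table_c7,
    stringprep.in_table_c8,
    stringprep.in_table_c9,
)


def prepare(text, allow_unassigned=True):
    # 1) Map: drop "mapped to nothing" characters, case-fold the rest.
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in text
        if not stringprep.in_table_b1(ch)
    )

    # 2) Normalize (a profile may skip this; the sketch always does it).
    out = ucd_3_2_0.normalize("NFKC", mapped)

    # 3) Prohibit: reject characters not allowed in the output.
    for ch in out:
        if any(in_table(ch) for in_table in PROHIBITED):
            raise StringprepError(f"prohibited code point U+{ord(ch):04X}")

    # 4) Check bidi: if any right-to-left character is present, no
    #    left-to-right character may appear, and the string must start
    #    and end with a right-to-left character.
    if any(stringprep.in_table_d1(ch) for ch in out):
        if any(stringprep.in_table_d2(ch) for ch in out):
            raise StringprepError("mixed right-to-left and left-to-right")
        if not (stringprep.in_table_d1(out[0])
                and stringprep.in_table_d1(out[-1])):
            raise StringprepError("bidi string must start and end with RTL")

    # 5) Proposed: check unassigned code points, only if the profile
    #    (here: the caller) asks for it.
    if not allow_unassigned:
        for ch in out:
            if stringprep.in_table_a1(ch):
                raise StringprepError(f"unassigned code point U+{ord(ch):04X}")

    return out
```

With the check kept separate from step 3, the same prohibition tables
serve both cases, and only the allow_unassigned switch differs.  If
step 3 were meant to cover unassigned code points as well, the table
for them would instead have to be added to PROHIBITED conditionally,
which is the reading the current text does not make obvious.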