
[idn] stringprep comment 6: casefold and then normalization is not enough




As David Hopwood and I have suggested on this list:

  NFC(casefold(x)) and NFKC(casefold(x)) do not give correct results for <I dot above> and
  <I><dot above>, which are canonically equivalent and should therefore give the same result.

  Rather, NFKC(casefold(NFKC(x))) is the correct form, even though it is not efficient.

  In addition, new UTC or non-UTC normalizations may need to be applied before stringprep
   in various future i18n applications.
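
To make the ordering issue concrete, here is a minimal Python sketch comparing the two compositions. The function names are made up for this illustration, and Python's str.casefold() uses the full case folding data of whatever Unicode version the interpreter ships, which may differ from the folding tables under discussion here. For that reason the <I dot above> case is only checked for consistency between its two spellings, and the trade mark sign (U+2122) is used to show a visible difference, since it has no case folding of its own but normalizes to "TM" under NFKC.

```python
import unicodedata


def fold_then_nfkc(s):
    """NFKC(casefold(x)): fold first, then normalize."""
    return unicodedata.normalize("NFKC", s.casefold())


def nfkc_fold_nfkc(s):
    """NFKC(casefold(NFKC(x))): normalize, fold, then normalize again."""
    normalized = unicodedata.normalize("NFKC", s)
    return unicodedata.normalize("NFKC", normalized.casefold())


precomposed = "\u0130"   # <I dot above>
decomposed = "I\u0307"   # <I><dot above>
trademark = "\u2122"     # TRADE MARK SIGN, compatibility-equivalent to "TM"

# Folding first leaves the uppercase letters that NFKC introduces afterwards.
print(fold_then_nfkc(trademark))   # TM
print(nfkc_fold_nfkc(trademark))   # tm

# With normalization applied first, both spellings are treated identically.
print(nfkc_fold_nfkc(precomposed) == nfkc_fold_nfkc(decomposed))  # True
```

The extra normalization pass is the inefficiency mentioned above: every input is normalized twice.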


However, Section 2 of the current Stringprep draft says:

"2. Preparation Overview

The steps for preparing strings are:

1) Map -- For each character in the input, check if it has a mapping
and, if so, replace it with its mapping. This is described in Section 4.

2) Normalize -- Possibly normalize the result of step 1 using Unicode
normalization. This is described in Section 5.

3) Look for prohibited output -- Check for any characters that are not
allowed in the output. If any are found, return an error. This is
described in Section 6.

The above steps MUST be performed in the order given to comply with this
specification."

Please provide a proof that this last requirement is both necessary and sufficient.
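
For reference, the three quoted steps in their fixed order can be sketched as follows. This is only an illustration, assuming Python's standard stringprep module (which exposes the draft's tables as functions) and the Unicode 3.2 database in unicodedata.ucd_3_2_0. The function name prepare() and the particular choice of tables (B.1 and B.2 for mapping, C.1.2 and C.2 for prohibition) are my assumptions; a real profile selects its own tables.

```python
import stringprep
from unicodedata import ucd_3_2_0  # Unicode 3.2 data, the version the stdlib tables are built on


def prepare(s):
    """Run the three steps strictly in the order the draft requires."""
    # 1) Map: drop characters "commonly mapped to nothing" (B.1),
    #    then apply the case-folding map intended for use with NFKC (B.2).
    mapped = "".join(
        stringprep.map_table_b2(ch) for ch in s if not stringprep.in_table_b1(ch)
    )

    # 2) Normalize: apply NFKC to the result of step 1.
    normalized = ucd_3_2_0.normalize("NFKC", mapped)

    # 3) Look for prohibited output: non-ASCII spaces (C.1.2) and control
    #    characters (C.2.1, C.2.2) are rejected in this illustration.
    for ch in normalized:
        if stringprep.in_table_c12(ch) or stringprep.in_table_c21_c22(ch):
            raise ValueError("prohibited character U+%04X" % ord(ch))
    return normalized


print(prepare("Ex\u00ADample"))  # soft hyphen removed, "E" folded: example
```

Any normalization that an application must apply before step 1 has no place in this fixed sequence, which is the concern raised here.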

I suggest that the last "MUST" be changed to "MAY".
This three-stage stringprep may not be sufficient for future
diverse i18n applications and protocols.
Freezing the stringprep architecture too early on the basis of over-simplification does
not help stringprep to last long, IMHO.


Soobok Lee