[idn] IDNA comment 1 : applications' own normalization vs stringprep




In IDNA draft version 6, section 6.1, paragraphs 4 and 5 read:

"In protocols and document formats that define how to handle
specification or negotiation of charsets, labels can be encoded in any
charset allowed by the protocol or document format. If a protocol or
document format only allows one charset, the labels MUST be given in
that charset."

"In any place where a protocol or document format allows transmission of
the characters in internationalized labels, internationalized labels
SHOULD be transmitted using whatever character encoding and escape
mechanism that the protocol or document format uses at that place."

We need more security warnings in this section regarding applications'
own normalizations, which may collide with STRINGPREP's NFC/NFKC.
 
As with some XML and HTML standards, there may be applications that perform
NFC, NFKC, or other normalizations on data or text content that
may contain IDN labels, which will later be stringprepped and then ACE-encoded.

But even for NFC, stringprep(NFC(x)) == stringprep(x) is not always
guaranteed, especially for <I dot above> versus <I><dot above> (+ <acute>).
That will cause silent failures in applications.
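
A rough way to probe the identity above for particular inputs is sketched
below. It assumes Python's standard stringprep module, whose tables follow
the published RFC 3454 rather than the tables of draft version 6, so its
results may differ from what the draft would give. The names prep,
survives_prior_nfc and codepoints are hypothetical helpers, and prep covers
only the mapping and NFKC steps (no prohibited-character or bidi checks).

```python
import stringprep
import unicodedata


def prep(label: str) -> str:
    """Simplified stringprep: table B.1 removal, table B.2 folding, then NFKC."""
    mapped = []
    for ch in label:
        if stringprep.in_table_b1(ch):  # characters "mapped to nothing"
            continue
        mapped.append(stringprep.map_table_b2(ch))  # case folding for NFKC
    # stringprep is defined against Unicode 3.2, so use that database here.
    return unicodedata.ucd_3_2_0.normalize("NFKC", "".join(mapped))


def survives_prior_nfc(label: str) -> bool:
    """True if an application's earlier NFC pass leaves prep's result unchanged."""
    return prep(unicodedata.normalize("NFC", label)) == prep(label)


def codepoints(text: str) -> str:
    """Render a string as U+XXXX code points for unambiguous comparison."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    # <I><dot above>, <I dot above>, and <I><dot above><acute>
    for sample in ("I\u0307", "\u0130", "I\u0307\u0301"):
        pre_normalized = unicodedata.normalize("NFC", sample)
        print(
            codepoints(sample),
            "| direct:", codepoints(prep(sample)),
            "| after NFC:", codepoints(prep(pre_normalized)),
            "| same:", survives_prior_nfc(sample),
        )
```

Running the same comparison with NFKC or another application-level
normalization in place of NFC would test the other combinations mentioned
above.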

The current, premature UAX15 used in stringprep and other compensating or
superior normalizations, whether from the same UTC or from other
organizations, may collide with each other in a sequence of normalization
processes that eventually ends in stringprep, in many mission-critical
applications.

We need more research into and inspection of possible normalization-versus-normalization
conflicts, and the IDNA specifications should include warnings or recommendations about them.

Soobok Lee