
Re: [idn] IDNA comment 1 : applications' own normalization vs stringprep



>stringprep(NFC(x)) == stringprep(x)

This was brought up early in Unicode 3.2 development. We checked
programmatically, and I with dot above (U+0130) is the only case that
causes a problem. It will be discussed at the UTC meeting this week. I
have no doubt that it will be resolved for U3.2, and even if StringPrep
doesn't pick up U3.2, it could add a mapping for that one case.
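
As a rough illustration of the kind of check involved, the Python sketch
below tests the identity for a single input. It is a simplification, not
the full stringprep profile: `prep` only drops the B.1 characters, applies
the B.2 case-fold mapping from Python's standard `stringprep` module (which
follows the final RFC 3454 tables), and then normalizes with NFKC. The
`old_style_fold` helper is hypothetical: it assumes, purely for
illustration, a fold table that sends U+0130 to a plain `i`, which is one
way the mismatch could arise.

```python
import stringprep
import unicodedata


def prep(label, fold=stringprep.map_table_b2):
    """Simplified stringprep-like step: drop B.1 chars, case-fold, NFKC.

    Prohibited-character and bidi checks are omitted. Note that
    unicodedata.normalize uses the interpreter's current Unicode data,
    not Unicode 3.2.
    """
    mapped = "".join(fold(c) for c in label if not stringprep.in_table_b1(c))
    return unicodedata.normalize("NFKC", mapped)


def commutes_with_nfc(x, fold=stringprep.map_table_b2):
    """Return True if prep(NFC(x)) == prep(x) for this input."""
    return prep(unicodedata.normalize("NFC", x), fold) == prep(x, fold)


def old_style_fold(c):
    # Hypothetical fold table (an assumption for illustration only):
    # U+0130 folds to plain "i" instead of "i" + U+0307.
    return "i" if c == "\u0130" else stringprep.map_table_b2(c)


x = "I\u0307"  # <I><dot above>; NFC composes this to U+0130
print(commutes_with_nfc(x))                  # with the RFC 3454 mapping
print(commutes_with_nfc(x, old_style_fold))  # with the assumed older fold
```

With the RFC 3454 mapping both forms end up as `i` followed by U+0307, so
the identity holds for this input; with the assumed older fold they do not.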

Mark
—————

Πόλλ’ ἠπίστατο ἔργα, κακῶς δ’ ἠπίστατο πάντα — Ὁμήρου Μαργίτῃ
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]

http://www.macchiato.com

----- Original Message -----
From: "Soobok Lee" <lsb@postel.co.kr>
To: <idn@ops.ietf.org>
Sent: Monday, February 11, 2002 08:39
Subject: [idn] IDNA comment 1 : applications' own normalization vs stringprep


>
> In the IDNA draft, version 6, section 6.1, paragraphs 4 and 5:
>
> "In protocols and document formats that define how to handle
> specification or negotiation of charsets, labels can be encoded in any
> charset allowed by the protocol or document format. If a protocol or
> document format only allows one charset, the labels MUST be given in
> that charset."
>
> "In any place where a protocol or document format allows transmission of
> the characters in internationalized labels, internationalized labels
> SHOULD be transmitted using whatever character encoding and escape
> mechanism that the protocol or document format uses at that place."
>
> Here, we need more security warnings in this section regarding
> applications' own normalizations, which may collide with STRINGPREP's
> NFC/NFKC.
>
> As with some XML and HTML standards, there may be applications that
> perform NFC, NFKC, or other normalizations on data or text content that
> may contain IDN labels, which will be stringprepped and ACE-encoded at a
> later time.
>
> But even in the case of NFC, stringprep(NFC(x)) == stringprep(x) is not
> always guaranteed, especially in the case of <I dot above> and
> <I><dot above> ( + <acute>).
> That will cause silent failures in applications.
>
> The current, premature UAX15 used in stringprep and other compensating
> or superior normalizations, which may come from the same UTC or from
> other organizations, may collide with each other in a sequence of
> normalization processes that eventually ends in stringprep in many
> mission-critical applications.
>
> We need more research into and inspection of possible normalization vs.
> normalization conflicts, and the IDNA specifications should include
> warnings or recommendations.
>
> Soobok Lee