
Re: [idn] IDNA comment 1 : applications' own normalization vs stringprep




----- Original Message -----
From: "Mark Davis" <mark@macchiato.com>
To: "Soobok Lee" <lsb@postel.co.kr>; <idn@ops.ietf.org>
Sent: Tuesday, February 12, 2002 1:59 AM
Subject: Re: [idn] IDNA comment 1 : applications' own normalization vs stringprep


> >stringprep(NFC(x)) == stringprep(x)
>
> This was brought up early in the Unicode 3.2 development. We have
> programmatically checked, and I with dot is the only case that causes
> a problem. It will be discussed at the UTC meeting this week. I have
> no doubt that it will be resolved for U3.2, and even if StringPrep
> doesn't pick up U3.2, it could add a mapping to that one case.

Okay! I am glad to have made a useful contribution to Unicode.

The honor is mine!

But we may still see many compensating normalizations, from the UTC or from
other bodies, especially for Hangul and other scripts, and those could cause
similar stringprep failures.

That is, the possibility that stringprep(newnormalization(x)) != stringprep(x).
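
To make the check concrete, here is a minimal Python sketch of the
stringprep(NFC(x)) == stringprep(x) test for the I-with-dot case. It is only
an approximation: prep() does the mapping and NFKC steps and skips the
prohibited-character and bidi checks. The names prep, commutes, fold_b2 and
fold_legacy are hypothetical, and fold_legacy (U+0130 folded to a plain "i")
is an assumption used to show how a mismatch can arise, not a quotation of
any published folding table.

```python
import stringprep
import unicodedata


def fold_b2(ch):
    # Case folding from RFC 3454 table B.2, as shipped in Python's stringprep module.
    return stringprep.map_table_b2(ch)


def fold_legacy(ch):
    # ASSUMPTION (illustration only): a folding that sends U+0130 to a plain "i".
    return "i" if ch == "\u0130" else stringprep.map_table_b2(ch)


def prep(label, fold):
    # Simplified stringprep: drop table B.1 characters, fold case, then apply NFKC.
    # The prohibited-character and bidi checks are deliberately left out.
    mapped = "".join(fold(ch) for ch in label if not stringprep.in_table_b1(ch))
    return unicodedata.normalize("NFKC", mapped)


def commutes(x, fold):
    # True when normalizing to NFC first does not change the prepared result.
    return prep(unicodedata.normalize("NFC", x), fold) == prep(x, fold)


x = "I\u0307"  # <I><dot above>; NFC composes this to U+0130 <I dot above>
print(commutes(x, fold_legacy))  # False: "i" vs "i" + U+0307
print(commutes(x, fold_b2))      # True: both sides give "i" + U+0307
```

The same commutes() test could be run over every code point, or with some
other normalization substituted for NFC, to look for further cases.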

Soobok Lee


>
> Mark
> —————
>
> Πόλλ’ ἠπίστατο ἔργα, κακῶς δ’ ἠπίστατο πάντα — Ὁμήρου Μαργίτῃ
> [For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]
>
> http://www.macchiato.com
>
> ----- Original Message -----
> From: "Soobok Lee" <lsb@postel.co.kr>
> To: <idn@ops.ietf.org>
> Sent: Monday, February 11, 2002 08:39
> Subject: [idn] IDNA comment 1 : applications' own normalization vs stringprep
>
>
> >
> > In the IDNA draft version 6, section 6.1, paragraphs 4 and 5:
> >
> > "In protocols and document formats that define how to handle
> > specification or negotiation of charsets, labels can be encoded in any
> > charset allowed by the protocol or document format. If a protocol or
> > document format only allows one charset, the labels MUST be given in
> > that charset."
> >
> > "In any place where a protocol or document format allows transmission of
> > the characters in internationalized labels, internationalized labels
> > SHOULD be transmitted using whatever character encoding and escape
> > mechanism that the protocol or document format uses at that place."
> >
> > Here, we need more security warnings in this section regarding
> > applications' own normalizations that may collide with STRINGPREP's
> > NFC/NFKC.
> >
> > As with some XML and HTML standards, there may be applications that
> > perform NFC, NFKC or other normalizations on data/text content that
> > may contain IDN labels, which will be stringprepped and then converted
> > to ACE at a later time.
> >
> > But even in the case of NFC, stringprep(NFC(x)) == stringprep(x) is not
> > always guaranteed, especially in the case of <I dot above> and
> > <I><dot above> ( + <acute>).
> > That will cause silent failures in applications.
> >
> > The current, premature UAX15 used in stringprep and other compensating
> > or superior normalizations, whether from the UTC or from other
> > organizations, may collide with each other in a sequence of
> > normalization processes that eventually ends in stringprep in many
> > mission-critical applications.
> >
> > We need more research into the possibilities of normalization vs
> > normalization conflicts, and should include warnings or
> > recommendations in the IDNA specifications.
> >
> > Soobok Lee
> >