
Re: [idn] stringprep comment 1



Stringprep keeps its own tables. An implementation of Unicode that
wants to maintain strict backwards compatibility for case folding
needs only to *add* to its tables in the future, and never change
the values of characters that are already in the tables.

While it is extremely unlikely that Unicode will make a change that
causes problems for nameprep, using the above strategy guarantees
backwards compatibility.
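
As a rough illustration of the add-only strategy, here is a minimal
Python sketch. The function name extend_table and the table contents
are hypothetical, chosen for illustration; they are not taken from
stringprep itself.

```python
def extend_table(old: dict[str, str], additions: dict[str, str]) -> dict[str, str]:
    """Return a new folding table: the old entries plus the additions.

    Hypothetical helper. It refuses any addition that would change the
    value of a character that is already in the table.
    """
    for ch, mapped in additions.items():
        if ch in old and old[ch] != mapped:
            raise ValueError(
                f"U+{ord(ch):04X} is already in the table; refusing to change it"
            )
    return {**old, **additions}


# Toy data: an older table, extended with one new entry.
table_old = {"A": "a"}
table_new = extend_table(table_old, {"\u0410": "\u0430"})
```

Trying to remap "A" to anything other than "a" raises ValueError, so
every string folded with the old table folds the same way with the
new one.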

To repeat, the loose vs strict approach works as follows:

Suppose the client is Unicode 3.1 and the server is Unicode 4.0. As
long as the client produces names that use only 3.1 characters, no problem.
Stringprep on the client will produce results that the server accepts.

Suppose the user has characters that are unassigned in 3.1 (but are
assigned in 4.0). As long as the user (manually) picks lowercase characters (in
the right canonical form), those names will be accepted by the 4.0
server. While it is not as easy as when the software does it for the
user, it will work for any new characters.

This is important for another scenario. Client A is on Unicode 4.0,
client B is on Unicode 3.1, and the server is on Unicode 4.0. Client A
namepreps a string, sends it to client B. Client B sends the string on to
the server. Everything works. It even works if client B re-namepreps
the string.
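
The scenarios above can be sketched with toy tables in Python. This
is only a model under stated assumptions: single Latin letters stand
in for real code points, "N" plays the role of an uppercase letter
that is unassigned in 3.1 and assigned in 4.0, and prep() does case
folding only (no normalization or prohibited-character checks). All
names here are hypothetical.

```python
# Toy folding tables (hypothetical data, not real Unicode tables).
TABLE_3_1 = {"A": "a", "B": "b"}
# The 4.0 table only adds entries; nothing from 3.1 changes.
TABLE_4_0 = {**TABLE_3_1, "N": "n"}


def prep(label: str, table: dict[str, str]) -> str:
    """Fold characters found in the table; pass unknown ones through unchanged."""
    return "".join(table.get(ch, ch) for ch in label)


def is_prepared_for_server(label: str) -> bool:
    """True if the label is already in the form the 4.0 table produces."""
    return prep(label, TABLE_4_0) == label


# Scenario 1: 3.1 client, name uses only 3.1 characters.
assert is_prepared_for_server(prep("AB", TABLE_3_1))

# Scenario 2: the user manually types the folded form of a new character.
assert is_prepared_for_server(prep("An", TABLE_3_1))

# Scenario 3: client A (4.0) preps, client B (3.1) re-preps and forwards.
from_a = prep("ABN", TABLE_4_0)
from_b = prep(from_a, TABLE_3_1)
assert from_b == from_a
assert is_prepared_for_server(from_b)
```

The third scenario works in this model because client B's table is a
subset of client A's, so re-prepping an already folded string leaves
it unchanged.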

Mark
—————

Πόλλ’ ἠπίστατο ἔργα, κακῶς δ’ ἠπίστατο πάντα — Ὁμήρου Μαργίτῃ
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]

http://www.macchiato.com

----- Original Message -----
From: "Yves Arrouye" <yves@realnames.com>
To: "'Michel Suignard'" <michelsu@microsoft.com>; "Paul Hoffman / IMC"
<phoffman@imc.org>
Cc: <idn@ops.ietf.org>
Sent: Thursday, January 31, 2002 22:51
Subject: RE: [idn] stringprep comment 1


> > Not a chance. Unicode and ISO 10646 collect in fact some unneeded crumbs
> > that way. Once a character is officially approved for inclusion it is
> > there forever, whether we like it or not. Characters get
> > deprecated (i.e. usage discouraged) but never removed. This is the price
> > you have to pay for stability and usage by other specifications (like
> > IDN).
>
> Sorry, I did not mean removed from Unicode (I am aware of the rule there,
> which makes sense), I meant one character that is added to Unicode but that
> a given Stringprep profile would think desirable to remove.
>
> As for case folding, as soon as one adds a new case mapping in the Nameprep
> profile, one needs to upgrade clients at the same time as servers. Oops.
> Encoding the Nameprep version in queries would solve that, but obviously at
> the cost of some other meaningful info in the label, thus reducing the
> length of IDN names.
>
> YA