[idn] stringprep: PRI #29



Simon Josefsson wrote:
There appear to me to be a lot of decisions made out of subjective
opinions on how normalization "should" behave, or is "assumed" to
behave.

I don't think it's subjective. The concept of normalization requires that it be idempotent.
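
To spell it out: normalizing an already-normalized string must change nothing. Here is a quick sanity check, sketched in Python purely for illustration (note that Python's unicodedata tracks a newer Unicode version than the 3.2 tables stringprep points at, so this is an approximation, not a conformance test):

    import unicodedata

    def nfkc_is_idempotent(s):
        # A correct normalizer must return True for every input:
        # applying NFKC a second time may not change the result.
        # (Illustration only; Python's tables are newer than Unicode 3.2.)
        once = unicodedata.normalize("NFKC", s)
        return unicodedata.normalize("NFKC", once) == once

The sequences behind PR-29, such as U+1100 U+0300 U+1161, are exactly the inputs on which the two readings of the old text pull implementations in different directions.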


One way is to incorporate the PR-29 fix, declare the earlier attempt
as buggy, and re-cycle at PROPOSED.  I suspect you prefer that way?  I
am hesitant about that approach, because we have already deployed the
old RFC and it is not clear what problems there will be in mixing the
old and the new code.

We already have the situation where some implementations do it one way, and some do it the other way. It is quite clear what will happen when somebody uses a character sequence that these implementations interpret differently: they will produce different results for the same label and fail to interoperate. Keep in mind that Unicode may add new characters in the future that may also be affected.


Both Kerberos and SASL appear to be going to
use the old StringPrep as well, so we will be seeing security-critical
infrastructure based on the old interpretation.

You write "the old interpretation" as if there is only one interpretation of the old spec. That's not true. As we have seen, there are implementations that do it one way, and those that do it the other way.


Another way is to carry on with the Unicode 3.2 NFKC even though it
breaks some humans' assumptions about what "normalization" means in a
theoretical setting.

It's not just an "assumption", and it's not merely "theoretical". This is a very basic requirement for the normalization process.


Machines will cope; they compute an
algorithm; they don't care whether the output meets some unstated
invariant or not.

IDNA specifies that, to decide what to display (Unicode or Punycode), a Punycode label must be decoded, Nameprepped, and Punycode-encoded again to check that the same string comes back. This, in itself, should make you realize that the process is supposed to be idempotent. So we *do* care how the machines compute this algorithm.
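
For concreteness, the round trip is roughly this. A loose sketch in Python of that ToUnicode-style check, using the standard library's nameprep and punycode codec; it glosses over the length and STD3 checks:

    import encodings.idna as idna

    ACE_PREFIX = "xn--"

    def display_form(ace_label):
        # Decide whether to display the Unicode form of an ACE label.
        if not ace_label.lower().startswith(ACE_PREFIX):
            return ace_label
        try:
            # decode the Punycode part back to Unicode
            uni = ace_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
            # Nameprep it and encode it with Punycode again
            again = ACE_PREFIX + idna.nameprep(uni).encode("punycode").decode("ascii")
        except UnicodeError:
            return ace_label
        # only show Unicode if the round trip reproduces the input
        return uni if again.lower() == ace_label.lower() else ace_label

If two Nameprep implementations disagree on a label, as they can for the PR-29 sequences, that comparison fails and the user ends up looking at raw Punycode.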


A third way, which is what I am deploying, is to use the Unicode 3.2
NFKC together with a filter to reject the PR-29 problem sequences.
This is in line with the RFC's, it solves problems related to PR-29
problem sequences, and is simple to implement.

I don't think this is in line with the RFCs. You are rejecting sequences that are not rejected by the RFCs.
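
To be clear about what such a filter does, here is a rough, somewhat over-broad sketch of a check for the shape of the PR-29 problem sequences. This is only an illustration in Python, not your actual code, and it uses whatever Unicode version Python ships rather than the 3.2 tables:

    import unicodedata

    def composes(a, b):
        # True if the pair a+b normalizes to a single code point
        # (approximates "a and b form a primary composite").
        return len(unicodedata.normalize("NFC", a + b)) == 1

    def looks_like_pr29_sequence(s):
        # Scan for the shape of the published problem sequences:
        # a starter, one or more non-starters, then a combining-class-0
        # character that composes with that starter
        # (e.g. U+1100 U+0300 U+1161).
        for i, ch in enumerate(s):
            if unicodedata.combining(ch) != 0:
                continue
            j = i + 1
            while j < len(s) and unicodedata.combining(s[j]) != 0:
                j += 1
            if (j > i + 1 and j < len(s)
                    and unicodedata.combining(s[j]) == 0
                    and composes(ch, s[j])):
                return True
        return False

Note that the rejection is keyed on a sequence, not on any individual code point, while stringprep's prohibition tables only list individual code points and ranges. That is exactly why I say it is not what the RFCs specify.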


More importantly, as you continue to ship your implementation as is, more and more installations of your popular library will accumulate, making it harder for the world to adjust if and when the affected kinds of character sequences actually come into use, whether formed from current characters or from ones Unicode adds later.

You are in a position to make a difference. You already have. Please reconsider.

Erik