[idn] stringprep: existing profiles and string processing complexity

To: Simon Josefsson <jas@extundo.com>
Subject: [idn] stringprep: existing profiles and string processing complexity
From: Erik van der Poel <erik@vanderpoel.org>
Date: Fri, 18 Mar 2005 11:33:27 -0800
Cc: idn@ops.ietf.org
In-reply-to: <ilu7jk86idc.fsf@latte.josefsson.org>
References: <421B8484.3070802@vanderpoel.org> <p06210208be4390618c81@[192.168.0.101]> <421E0D0C.2000309@vanderpoel.org> <p06210202be43c3888991@[192.168.0.101]> <E07CE813AD23B2D95DA0C740@scan.jck.com> <421E30F2.1040408@vanderpoel.org> <0E7F74C71945B923C52211F3@scan.jck.com> <421EA0C9.1010500@vanderpoel.org> <00a401c51af3$7863aae0$030aa8c0@DEWELL> <A574CA1BE87BFDA3C2A1AC0E@scan.jck.com> <42322CE2.4040509@vanderpoel.org> <4232B2FD.1080104@vanderpoel.org> <4232BA56.5090001@vanderpoel.org> <iluk6odazwb.fsf@latte.josefsson.org> <00e801c528a8$99ad37d0$72703009@sanjose.ibm.com> <ilull8qb5n5.fsf@latte.josefsson.org> <42367B63.6080300@vanderpoel.org> <4237450A.9010901@v.loewis.de> <423754F3.50405@vanderpoel.org> <ilumzt47ezc.fsf@latte.josefsson.org> <423782F0.60604@vanderpoel.org> <ilu7jk86idc.fsf@latte.josefsson.org>
User-agent: Mozilla Thunderbird 1.0 (X11/20041206)

Simon Josefsson wrote:

The issue should be handled in an update of the StringPrep.  If the
PR-29 fix is incorporated, there must be a good transition discussion in
the document.  The problem cannot only be discussed in the realm of
IDNA, since StringPrep is used for other purposes as well.  For me,
the most important use is SASLprep, because it is used to prepare
usernames and passwords.  Hence, it is used in security-critical
applications, where the requirements are different from those of IDNA.  The
discussion must be wider than for only IDNA.

I downloaded all the RFCs and Internet Drafts and grepped for Stringprep. Some of the profiles specified in RFCs are not registered at IANA; for example, the NFS version 4 spec (RFC 3530) defines 3 profiles that have not been registered. It is not clear to me whether profiles must be registered.

I have started a table of profiles that shows which parts of Stringprep each one uses. So far, it lists only profiles found in RFCs (not Internet Drafts). The table was far easier to create in an HTML editor, and I didn't want to send HTML to this list, so it's on my Web site:

http://nameprep.org/stringprep.html

It's still rough and doesn't show all the details, but it's a start. Reading the profiles and the text around them, I get the impression that some of the authors may not be fully aware of the complexities of string processing in Unicode. (This is not a flaw in Unicode. The world has many written languages, and some of them have complex rules.)

For example, RFC 3530 says:

   If the case_preserving attribute is present and set to false, then
   the NFS version 4 server MUST use table B.2 to map case when
   processing utf8str_cs strings.  Whether the server maps from lower to
   upper case or the upper to lower case is an implementation
   dependency.

Stringprep's case mapping table (B.2) is intended to be used in one direction only: to map upper case to lower case. Some of its entries are special cases, for example:

   00DF; 0073 0073; Case map

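You can't really use this entry to go "backwards": it maps U+00DF (LATIN SMALL LETTER SHARP S) to "ss", and nothing in the table says which occurrences of "ss" should be turned back into U+00DF, or what an upper case form would be.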
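
To make the one-way nature of that entry concrete, here is a small sketch using the `stringprep` module from Python's standard library, which exposes the RFC 3454 tables. The helper name `fold_b2` and the sample strings are mine, purely for illustration, and the helper applies only the B.2 mapping step, not a complete Stringprep profile (no B.1 mapping, no normalization of the whole string, no prohibited-character checks).

```python
import stringprep


def fold_b2(text: str) -> str:
    """Apply the RFC 3454 table B.2 mapping to each character.

    Hypothetical helper for illustration only: a real profile performs
    further steps (B.1 mapping, NFKC, prohibited and bidi checks).
    """
    return "".join(stringprep.map_table_b2(ch) for ch in text)


# Three different inputs that a case-insensitive server might receive.
samples = ["Stra\u00dfe", "STRASSE", "strasse"]
folded = {s: fold_b2(s) for s in samples}

for original, result in folded.items():
    print(f"{original!r} -> {result!r}")

# All three inputs collapse onto one output, so the mapping has no
# well-defined inverse: given "strasse", B.2 cannot say which input it was.
distinct_outputs = set(folded.values())
print(len(distinct_outputs))
```

Since several distinct inputs land on the same folded string, a server that tried to run the table from lower to upper case would have to invent a rule for "ss" that the table does not supply.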

There are other RFCs and Internet Drafts with issues like this, and some that are so vague that you have to wonder whether interoperability is truly assured.

The IETF community is of course full of networking experts, and the Unicode community is full of internationalization experts. Stringprep finds itself in the rather special position of being wedged between networking (specifically, string input and matching) and Unicode. It might be a good idea to collect these issues and try to increase understanding of international string processing among members of the networking community.

Erik
