
RE: Canonical configuration database order



Hi,

W3C XML Schema Part 2 (datatypes) explicitly chooses 
NOT to define an 'order-relation' for 'xsd:string'.

But for real-world use of XML documents, string sorting 
should be locale-sensitive, or it produces nonsense 
results.  Even a straight code point sort of UTF-8 may 
produce garbage in English if you don't first normalize 
the strings.
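
To make the normalization point concrete, here is a minimal Python
sketch. The sample words are made up for illustration and the choice of
NFC is an assumption (nothing in this thread picks a normalization
form). It only shows that two spellings of the same word land in
different places under a raw code point sort.

```python
import unicodedata

# The same word spelled two ways (hypothetical sample data):
# precomposed U+00C4 versus "A" followed by combining diaeresis U+0308.
precomposed = "\u00c4rger"
decomposed = "A\u0308rger"

words = [precomposed, "Bravo", decomposed]

# Raw code point sort: the decomposed form starts with "A" (U+0041) and sorts
# before "Bravo", while the precomposed form (U+00C4) sorts after it.
raw = sorted(words)

# Normalize to NFC first (an assumed choice), then sort by code point:
# both spellings become identical and end up next to each other.
nfc = sorted(unicodedata.normalize("NFC", w) for w in words)

print(raw == [decomposed, "Bravo", precomposed])  # True
print(nfc[1] == nfc[2])                           # True
```

Note that even after normalization the word sorts after "Bravo", so
normalization alone does not give a locale-sensitive order; it only
removes one source of inconsistency.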

The best IETF document on this topic is Stringprep
(RFC 3454).  The W3C I18N Core WG is working on 
"LTLI" (Language Tags and Locale Identifiers) to 
address the impact of language tags (e.g., xml:lang)
on W3C technologies (including XML itself).

Cheers,
- Ira


Ira McDonald (Musician / Software Architect)
Blue Roof Music / High North Inc
PO Box 221  Grand Marais, MI  49839
phone: +1-906-494-2434
email: imcdonald@sharplabs.com

> -----Original Message-----
> From: owner-netconf@ops.ietf.org [mailto:owner-netconf@ops.ietf.org] On
> Behalf Of Andy Bierman
> Sent: Friday, June 02, 2006 1:08 PM
> To: Randy Presuhn
> Cc: Netconf (E-mail)
> Subject: Re: Canonical configuration database order
> 
> 
> Randy Presuhn wrote:
> > Hi -
> > 
> >> From: "Andy Bierman" <ietf@andybierman.com>
> >> To: "Randy Presuhn" <randy_presuhn@mindspring.com>
> >> Cc: "Netconf (E-mail)" <netconf@ops.ietf.org>
> >> Sent: Friday, June 02, 2006 9:33 AM
> >> Subject: Re: Canonical configuration database order
> >>
> >> Randy Presuhn wrote:
> >>> Hi -
> >>>
> >>>> From: "Andy Bierman" <ietf@andybierman.com>
> >>>> To: "Netconf (E-mail)" <netconf@ops.ietf.org>
> >>>> Sent: Friday, June 02, 2006 8:30 AM
> >>>> Subject: Canonical configuration database order
> >>> ...
> >>>> The canonical order for named sibling instances is the natural sort order,
> >>>> based on the key associated with the nodes.  For multiple unnamed
> >>> ...
> >>>
> >>> What is meant by "the natural sort order"?
> >> Ascending order for each component of the key.
> >> Keys are evaluated 'top to bottom', where the
> >> topmost component represents the major index,
> >> and each successive key component represents
> >> successive minor index components.
> > ...
> > 
> > Yes, but what is meant by "ascending order"?
> > I'm wondering whether, for example, strings are
> > canonicalized before comparison (e.g. is "A" plus
> > "Umlaut" the same thing as "A-Umlaut"), whether
> > locales matter (e.g. does "tra" come before or
> > after "tuy"), and what happens with numbers
> > (e.g. does 9.9 come before or after 999)?
> 
> I should have seen this coming, based on who asked the question ;-)
> Hopefully, there are rules in place for canonical representation
> and sort order of UTF-8 strings, including internationalization
> and localization details.
> 
> I guess it is data type (and data modeling language) specific
> as to the exact meaning of ascending order.  I compare numbers
> as numbers, not strings, so 9.9 is less than 999 and -4 is less
> than 2.1.  Enumerations ascend in the order they are defined
> (aligns with SMIv2 and C), not alphabetic order of the enum values.
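
The ordering rules described above (major index first, numbers compared
as numbers, enumerations in definition order) could be sketched in
Python as follows. The AdminState enumeration and the sample rows are
hypothetical and are not taken from any NETCONF data model; the 9 versus
10 contrast is likewise an added illustration.

```python
from enum import IntEnum

class AdminState(IntEnum):
    # Hypothetical enumeration: members ascend in the order defined here,
    # not alphabetically (alphabetically, DOWN would precede UP).
    UP = 1
    DOWN = 2
    TESTING = 3

# Hypothetical instances keyed by (state, metric): state is the major index,
# metric is the minor index.
rows = [
    (AdminState.DOWN, 9.9),
    (AdminState.UP, 999),
    (AdminState.UP, -4),
    (AdminState.UP, 2.1),
    (AdminState.DOWN, 999),
]

# Tuples compare component by component, left to right, so the first
# component decides and later components only break ties.
canonical = sorted(rows)

# Contrast: comparing numbers as strings gives a different order.
as_numbers = sorted([9, 10])      # [9, 10]
as_strings = sorted(["9", "10"])  # ['10', '9']

for state, metric in canonical:
    print(state.name, metric)
```

With this key, all UP rows precede all DOWN rows, and within each state
the metrics ascend numerically, so -4 comes before 2.1 and 9.9 comes
before 999.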
> 
> > 
> > Randy
> 
> Andy
> 
> --
> to unsubscribe send a message to netconf-request@ops.ietf.org with
> the word 'unsubscribe' in a single line as the message text body.
> archive: <http://ops.ietf.org/lists/netconf/>
> 

