Re: Compatibility requirements



At 21:54 00/01/23 +0100, Harald Tveit Alvestrand wrote:
> At 11:02 23.01.00 -0800, Paul Hoffman / IMC wrote:
> >At 02:31 PM 1/23/00 +0800, James Seng wrote:
> >>Canonicalization algo is an ongoing process.
> >
> >Not for all character sets. It is fixed and done for Unicode, for example. 
> >See Unicode Technical Report 15.
> 
> UTR 15 has a depressing number of references to the Unicode data files.

You need some data to do the work. And the data is available in these
files.

> And 
> a depressingly low number of promises not to change those in incompatible ways.

The major problems of compatibility with future Unicode versions have been
discussed in the report (namely how to handle the addition of new precomposed
characters). For the other cases, which are very rare in practice, I agree
that there could be some more discussion. I'm currently writing an I-D
that describes Normalization Form C, and have already included
quite a bit of discussion there. Also, I can assure you that the relevant
authorities at the UTC are very, very aware of the fact that they
shouldn't change the tables.
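
To make the dependence on the data files concrete, here is a minimal
sketch in Python. It assumes Python's standard unicodedata module, which
ships its own copy of the Unicode data tables, as a stand-in for an
implementation of the normalization forms; the helper name nfc_equal is
hypothetical and only there for illustration.

```python
import unicodedata

# Two canonically equivalent spellings of the same text.
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE (precomposed)
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

def nfc_equal(a: str, b: str) -> bool:
    """Compare two strings after bringing both to Normalization Form C.

    Hypothetical helper, not taken from any specification.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# Without normalization, the two spellings compare unequal.
raw_equal = composed == decomposed

# Form C composes where it can; Form D decomposes fully.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)

# The result depends on the table version the library was built with.
tables_version = unicodedata.unidata_version
```

The comparison is only as stable as the tables behind it: two
implementations built from different versions of the data files agree
on a string only if the entries for its characters did not change
between those versions.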

> One reason why UTR 15 hasn't met with universal acclaim from ISO, I think.

No, I don't think that's the issue. The reasons, as I see them, are:
- ISO has always standardised only codepoints.
- Some people in the ISO WG have difficulties with defining
  canonical equivalences as such.
- Some people in the ISO WG had difficulties with making the
  decomposed form the normal form (UTR #15 goes a long way to
  address this).
- Some people in the ISO WG want to uphold the level distinctions
  in ISO 10646 (Level 1 doesn't allow decomposition). With more
  and more implementations going beyond Level 1, this may change.


> Still, much more work has been done on canonicalization of Unicode than any 
> other reasonably large set I know of.

Yes indeed.



Regards,   Martin.


#-#-#  Martin J. Du"rst, World Wide Web Consortium
#-#-#  mailto:duerst@w3.org   http://www.w3.org