
Re: [idn] Re: Legacy charset conversion in draft-ietf-idn-idna-08.txt (in ksc5601-1987)



There is, to my knowledge, no such central registry. The IANA names
are, unfortunately, neither unambiguous nor comprehensive (see other
mail).

The closest thing I know of is that the ICU team has been gathering
data on different character sets based on a programmatic analysis of
how conversion actually operates on different platforms (which is
often different from the documentation of that mapping!). We have made
provision for historical data but have not gathered any, unless a
mapping is also in modern use on one of the platforms we collected
from: aix, glibc, hpux, os390, os400, java, solaris, windows.
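
As a rough illustration of what comparing mappings programmatically can
look like, here is a minimal Python sketch. It is not the ICU tooling;
the function names are hypothetical, and the codec pair (Windows code
page 1252 and ISO 8859-1, as shipped with Python) is chosen only as an
example. It handles single-byte encodings only.

```python
def decode_byte(codec, value):
    """Return the character a codec maps this byte to, or None if the
    codec treats the byte as unassigned or invalid on its own."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None


def diff_single_byte_mappings(codec_a, codec_b):
    """Return {byte value: (char from a, char from b)} for every byte
    on which the two codecs disagree."""
    differences = {}
    for value in range(256):
        a = decode_byte(codec_a, value)
        b = decode_byte(codec_b, value)
        if a != b:
            differences[value] = (a, b)
    return differences


def describe(char):
    """Format a decoded character as a code point, or mark it unassigned."""
    return "unassigned" if char is None else "U+%04X" % ord(char)


if __name__ == "__main__":
    # Example pair only; any two single-byte codecs known to Python work.
    diffs = diff_single_byte_mappings("cp1252", "latin-1")
    for value, (a, b) in sorted(diffs.items()):
        print("0x%02X: %s vs %s" % (value, describe(a), describe(b)))
```

Running the same comparison against the tables actually shipped on each
platform, rather than against the published documentation, is what
exposes the discrepancies mentioned above.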

See http://oss.software.ibm.com/icu/charset/ for an overview;
http://oss.software.ibm.com/cvs/icu/charset/data/xml/ for the data in
xml.

However, I want to emphasize that this is an open-source project,
*not* a registry.

Mark
__________

http://www.macchiato.com

 "Eppur si muove"
----- Original Message -----
From: "Soobok Lee" <lsb@postel.co.kr>
To: <idn@ops.ietf.org>
Sent: Thursday, May 30, 2002 08:56
Subject: Re: [idn] Re: Legacy charset conversion in
draft-ietf-idn-idna-08.txt (in ksc5601-1987)


> By "additions", I mean the new tag required for a new version of a
> legacy encoding, like "ks_c_5601-1992", which should have been used
> but never has been, as far as I know. Is there any central registry
> that maintains the correct tag values for the various versions of the
> numerous legacy encodings? If not, how can we ensure stable and
> interoperable legacy-to-Unicode conversion among myriads of
> applications?
>
> Soobok Lee
>
> ----- Original Message -----
> From: "Doug Ewell" <dewell@adelphia.net>
> To: <lsb@postel3.postel.co.kr>; "James Seng" <jseng@pobox.org.sg>
> Cc: <idn@ops.ietf.org>
> Sent: Wednesday, May 29, 2002 2:32 PM
> Subject: Re: [idn] Re: Legacy charset conversion in
> draft-ietf-idn-idna-08.txt (in ksc5601-1987)
>
>
> > Soobok Lee <lsb at postel3 dot postel dot co dot kr> wrote:
> >
> > > What if a proper and stable implementation of legacy encodings in
> > > IDNA is not a tangible object due to their frequent updates/
> > > additions?
> >
> > How "frequent" are updates to legacy encodings, or updates to the
> > mapping tables between legacy encodings and Unicode?
> >
> > As for additions, they shouldn't cause a problem anyway, because
> > they don't break existing legacy-to-Unicode mappings.  An
> > often-mentioned case of adding to a legacy encoding was when
> > Microsoft retrofitted U+20AC EURO SIGN onto their Windows code pages
> > (mostly at previously unassigned code position 0x80).  The only
> > people who suffered at all were the ones who thought "unassigned"
> > somehow meant that they should map 0x80 to U+0080.  Everyone else
> > made it just fine.
> >
> > -Doug Ewell
> >  Fullerton, California
> >  (would prefer to receive only one copy of these messages)
> >