Date: Sat, 2 Feb 2002 09:10:26 +0100 (MET)
From: Elisabeth Porteneuve <Elisabeth.Porteneuve@cetp.ipsl.fr>
Message-Id: <200202020810.JAA06777@balsa.cetp.ipsl.fr>
To: Elisabeth.Porteneuve@cetp.ipsl.fr, Marc.Blanchet@viagenie.qc.ca,
        ajm@icann.org, erin@twnic.net.tw, fred@cisco.com,
        harald@alvestrand.no,
        htk@eecs.harvard.edu, iab@isi.edu, idn@ops.ietf.org,
        iesg@ietf.org,
        jet-member@nic.ad.jp, jseng@pobox.org.sg, klensin@jck.com,
        lynn@icann.org, mkatoh@mkatoh.net, mkatoh@wdc.fujitsu.com,
        mouhamet@next.sn, narten@us.ibm.com, nordmark@eng.sun.com,
        paf@cisco.com, phoffman@imc.org, qhhu@public.bta.net.cn,
        sharil@cmc.gov.my, shkyong@kgsm.kaist.ac.kr, vcerf@mci.net
Cc: alanysho@hkdnr.net.hk, christine.tsang@hkdnr.net.hk,
        deng@cnnic.net.cn,
        hlqian@cnnic.net.cn, hoho@iis.sinica.edu.tw,
        huangk@alum.sinica.edu,
        jasonho@umac.mo, lee@whale.cnnic.net.cn, mao@cnnic.net.cn,
        snw@twnic.net.tw, sstseng@twnic.net.tw, tsenglm@cc.ncu.edu.tw,
        whzhang@cnnnic.net.cn, wschen@twnic.net.tw,
        wuch@gate.sinica.edu.tw,
        yktham@umac.mo
Subject: Re: [idn] Chinese Domain Name Consortium (CDNC) Declaration
Sender: owner-idn@ops.ietf.org
Precedence: bulk


If I may add a note on Latin-Cyrillic confusion, quoted below
from an explanation I have been providing to another group.

As an aside, I learnt from Russian colleagues that some
Russians prefer to register domain names under .PY (ccTLD for Paraguay)
rather than .RU (ccTLD for Russia). The reason is that "PY"
is the beginning of the word "Russia" in Cyrillic - PYCCU[R].
The last character is the Cyrillic "ya" (see below); every other one
prints identically in Latin and Cyrillic: different code points in
Unicode, identical code points in "LDH".

Best regards,
Elisabeth Porteneuve
--

   Let us have a glimpse at both the end-user and the intellectual
   property perspectives with an example.

   The word "COBET" reads as it is if one assumes it is in the Latin
   alphabet, but spells "soviet" if one assumes it is Cyrillic.
   The Unicode code point for Cyrillic "C", 0x0421, is different
   from the code point for Latin "C", 0x0043, but the two are
   identical on printed paper, business cards or a screen.
   Consequently, the use of Unicode code points makes it impossible
   to communicate with anybody without knowing which language is
   _printed_, or, even worse, which letter or sign is printed in
   which language.
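
   As a rough illustration of the point above, the following Python
   sketch prints the code points of "COBET" written once with Latin
   letters and once with Cyrillic letters. The variable names are
   only illustrative, and the Cyrillic string is written with
   escapes so that it cannot be confused with the Latin one.

```python
# The same printed word, once in Latin letters and once in Cyrillic letters.
latin = "COBET"
cyrillic = "\u0421\u041e\u0412\u0415\u0422"  # Cyrillic Es, O, Ve, Ie, Te

# Show each pair of look-alike letters next to its Unicode code point.
for l, c in zip(latin, cyrillic):
    print(f"Latin {l} = U+{ord(l):04X}    Cyrillic {c} = U+{ord(c):04X}")

# The two strings print alike but do not compare equal.
print(latin == cyrillic)
```

   The first pair printed is U+0043 against U+0421, the two values
   quoted above, and the final comparison is False.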

   In the famous TOYS[R]US, the R in brackets is the Cyrillic
   code point 0x042f, spelled "ya", which also happens to be the
   letter R as seen in a mirror, spelled "are". With the exception
   of that letter [R], every other letter in TOYS[R]US may be read
   either as a Latin or as a Cyrillic code point: different
   spellings, different code points, identical printing on paper
   or screen.
   For a word of 6 code points, each of which prints the same but
   can have 2 different contents, there are 2**6 = 64 possible
   combinations. That is the number of times a 6-letter word
   would have to be registered to preserve all of its intellectual
   property rights in 2 alphabets, Latin and Cyrillic.
   It is also the maximum number of tries an end-user would have
   to make to reach a website, if she or he has only printed
   information.
   I do not have the competence to extend this example to other
   alphabets or code points. However, as far as I understand,
   the problem of Chinese code points has some similarity.
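
   The 2**6 count can be checked with a short Python sketch. The
   look-alike table below is a partial one assumed for illustration,
   not an authoritative list, and "POCKET" is a hypothetical 6-letter
   word chosen only because each of its letters has a Cyrillic
   look-alike in that table.

```python
from itertools import product

# Assumed, partial table: uppercase Latin letter -> Cyrillic look-alike.
LOOKALIKES = {
    "A": "\u0410", "B": "\u0412", "C": "\u0421", "E": "\u0415",
    "H": "\u041d", "K": "\u041a", "M": "\u041c", "O": "\u041e",
    "P": "\u0420", "T": "\u0422", "X": "\u0425", "Y": "\u0423",
}

def variants(word):
    """Return every Latin/Cyrillic mix that prints the same as `word`."""
    # Each position offers 2 choices if a look-alike is known, else 1.
    choices = [(ch, LOOKALIKES[ch]) if ch in LOOKALIKES else (ch,)
               for ch in word]
    return ["".join(combo) for combo in product(*choices)]

forms = variants("POCKET")  # hypothetical 6-letter example word
print(len(forms))           # number of distinct code point sequences
```

   With this table every letter of the example word has two possible
   code points, so the list holds 2**6 = 64 distinct strings that
   all print alike.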

--