
Re: [idn] Newbie's questions implementing the [IDNA]



At 14:37 11/12/02, Stephane Bortzmeyer wrote:
On Wed, Dec 11, 2002 at 05:05:22AM +0100,
 JFC (Jefsey) Morfin <jefsey@jefsey.com> wrote
 a message of 48 lines which said:

> as "iesg--coca-cola.com" or "iesg--jonathan-cohen.net" or
> "iesg--vint-cerf.org".

The Unicode string which IDN-encodes into "iesg--coca-cola.com" is:
LATIN SMALL LETTER C (Basic Latin)
LATIN SMALL LETTER O (Basic Latin)
ORIYA DIGIT THREE (Oriya)
ORIYA DIGIT THREE (Oriya)
LATIN SMALL LETTER C (Basic Latin)
LATIN SMALL LETTER A (Basic Latin)
FULL STOP (Basic Latin)
LATIN SMALL LETTER C (Basic Latin)
LATIN SMALL LETTER O (Basic Latin)
LATIN SMALL LETTER M (Basic Latin)
My program reports:
u+0063 u+006F u+0B69 u+0B69 u+0063 u+0061 . fr
Are we in agreement?
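
For anyone who wants to check the decoding independently, here is a minimal sketch in Python. It assumes the "iesg--" prefix used in this thread and relies on Python's built-in "punycode" codec, which follows the published Punycode algorithm (RFC 3492); the helper name decode_ace_label is only illustrative, and no nameprep step is applied.

```python
import unicodedata


def decode_ace_label(label: str, prefix: str = "iesg--") -> str:
    """Strip the assumed ACE prefix and Punycode-decode one label.

    The prefix default is the placeholder used in this thread, not a
    registered one. Labels without the prefix are returned unchanged.
    """
    if not label.lower().startswith(prefix):
        return label
    return label[len(prefix):].encode("ascii").decode("punycode")


# Decode only the first label; ".com" is plain ASCII and needs no decoding.
decoded = decode_ace_label("iesg--coca-cola")

# Print one line per character: code point and Unicode name.
for ch in decoded:
    print(f"u+{ord(ch):04X} {unicodedata.name(ch)}")
```

With the codec as shipped in current Python versions, this should list the same six code points as above for the first label: c, o, ORIYA DIGIT THREE twice, c, a.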

In what language does it make sense? It is certainly possible to find
funny ACE encodings which have a meaningful Unicode form but it is not
obvious.
The Unicode form is of no real interest in itself, except that the more character sets it mixes, the greater the chance that the IDN is displayed in its ACE form, and the smaller the chance that it is already registered as a multilingual design, trademark, or even patent (?).

The only goal is to have "From: iesg--coca-cola.com" displayed when the registrant sends mail.
I feel multi-sub-profiling could drastically reduce that possibility. But what about u+3A17.com? Even then it will be displayed in most cases as iesg--ibm.com.

Again, that is unless I have misunderstood something, which is what I am asking.
jfc