
Re: Character equivalence mapping (was: Re: [idn] SLCminutes)



--On Thursday, 03 January, 2002 11:31 -0500 Edmon
<edmon@neteka.com> wrote:

>...
> The domain names as they
> conceptually exist however should remain user friendly.  In
> other words, domain names for humans will contain context and
> meaning (an "ALPHA" will be an "ALPHA" not "A"), but domain
> names as viewed in the DNS (machine) should be devoid of
> context.  To bridge these two, a set of equivalent characters
> are prepared as a table for the machine so that it can blindly
> treat them as identical without contemplating its context.

But, Edmon, we already have disproofs of this in trademark
registrations and the registrars claim (and I believe them) that
the companies involved are anxious to take advantage of
internationalization to register their trademarks more directly
in the DNS.  We have organizations who want to be 
  <alpha><roman-i><digits123>
  toys-<cyrillic-ya>-us
and so on, a list that will certainly get longer as more
characters are perceived as available.  
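
To make the collision concrete, here is a minimal sketch of the kind
of table-driven folding being proposed.  The table entries and the
fold() helper are hypothetical illustrations, not taken from any IDN
draft, and <roman-i> is assumed here to be a plain Latin "I".

```python
# Hypothetical equivalence table: every entry is an illustrative
# assumption, not part of any published IDN proposal.
EQUIVALENTS = {
    "\u0391": "a",  # GREEK CAPITAL LETTER ALPHA
    "\u0410": "a",  # CYRILLIC CAPITAL LETTER A
}


def fold(label: str) -> str:
    """Blindly map each character through the table, lowercasing the rest."""
    return "".join(EQUIVALENTS.get(ch, ch.lower()) for ch in label)


greek_label = "\u0391I123"  # <alpha><roman-i><digits123>
latin_label = "AI123"       # the all-Latin lookalike

# After folding, the machine can no longer tell the two apart.
same = fold(greek_label) == fold(latin_label)
```

Under such a table the label that starts with a real ALPHA and the
one that starts with a Latin "A" become the same name, so only one
of the two organizations could hold it.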

What you are trying to do also involves some rather complex
judgement calls which I don't know how to make.  As someone who
is not very familiar with either script, I've seen a number of
font forms of Arabic and Thai that I'm not sure I could tell
apart, at least without considerable context.  I would assume
that daily users of the two scripts wouldn't have that problem,
but who is to make the decision about equivalence?

It seems to me that, in practice, there are only two ways to get
what you want without ambiguity or confusion.  One goes down the
path that Ken points out, which results, ultimately, in
"equivalencing" some very different things.  And the other is to
say "ok, we will add characters to the LDH list, but only those
characters that cannot be confused with anything else,
regardless of the case or font used".  Maybe, in the long term,
they turn out to be the same option.

The second option would have the advantage of adding many useful
characters to the DNS-application-permitted set (and Harald and
Patrik could stop worrying about the names of their respective
sons :-)), but it would do almost nothing for
internationalization other than throwing us back into strange
transliterations or transpositions for many scripts.  Put
differently, we would end up with a DNS character set that would
probably support Latin-1 and Han characters properly, but maybe
nothing else.  I don't find that very satisfying, much as I am
concerned about accurate transcription of labels from printed
form into the DNS.
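
The "add only unconfusable characters to the LDH list" option can be
sketched roughly as follows.  The EXTRA_UNAMBIGUOUS set and the
is_permitted() helper are hypothetical; deciding which characters
really belong in such a set is exactly the judgement call discussed
above.

```python
import string

# The traditional letter-digit-hyphen repertoire (case-insensitive).
LDH = set(string.ascii_lowercase + string.digits + "-")

# Hypothetical additions, assumed for illustration to be unconfusable
# with anything else regardless of case or font.
EXTRA_UNAMBIGUOUS = {"\u00e9", "\u00f8", "\u4e2d"}

PERMITTED = LDH | EXTRA_UNAMBIGUOUS


def is_permitted(label: str) -> bool:
    """True if the label is non-empty and uses only permitted characters."""
    return bool(label) and all(ch in PERMITTED for ch in label.lower())


latin1_ok = is_permitted("caf\u00e9")          # Latin-1 label passes
cyrillic_ok = is_permitted("toys-\u044f-us")   # Cyrillic ya is not listed
```

Nothing is folded or mapped here; a label containing any character
outside the list is simply refused, which is why scripts left off the
list would be pushed back to transliteration.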

     john