
Re: Character equivalence mapping (was: Re: [idn] SLC minutes)



Hi John & Ken,

> > The problem of "AB.example" is generally dealt with by
> > context.

But the context of domain names is the Internet, and the Internet is
becoming multilingual both conceptually and in practice, which is why we
are discussing this issue today.  If the context is multilingual,
AB.example needs to be unique regardless of the "intended" language,
because the "perceived" character becomes the context in itself.  In other
words, the context of a character is the character itself.  Whether seen by
a Latin-trained eye, an Arabic-trained eye, or a Chinese-trained eye, some
characters are visually identical.  That is the root of the problem at hand.

> > First of all "example" would be in Greek if I was
> > really dealing with Greek. Second, if I wanted people to enter
> > "ab.whatever" I'd be advertising in *English* to set the
> > expectations. If I wanted people to enter
> > "<alpha><beta>.....", I'd be advertising in *Greek* to set the
> > expectations, and people would be using Greek keyboards and
> > expect to enter Greek.

This is a presumption.  Also, "localization" by TLD falls outside the
technical work of building a solid IDN solution.  We should not have to
resort to TLD management to finish our job.

Again, context should be irrelevant because the context is not known.
For example (just one example; there could be other cases):
I don't know Greek, but I see an interesting ad that I think might be for a
Greek translation service.
I read the domain as A2<beta>.example (the actual domain is
<ALPHA>2<beta>.example).
I type it as I see it, and it goes elsewhere.
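
To make the mismatch in this example concrete, here is a minimal Python
sketch.  It assumes the names are compared code point by code point, with no
equivalence mapping applied; the variable names are only illustrative.

```python
import unicodedata

# What the reader types: LATIN CAPITAL LETTER A, digit 2, Greek small beta.
typed = "A" + "2" + "\u03b2"
# What was actually registered: GREEK CAPITAL LETTER ALPHA, digit 2, beta.
registered = "\u0391" + "2" + "\u03b2"

# The two strings differ only in their first code point, which many
# fonts render with the same glyph.
first_typed = unicodedata.name(typed[0])
first_registered = unicodedata.name(registered[0])

print(first_typed)          # LATIN CAPITAL LETTER A
print(first_registered)     # GREEK CAPITAL LETTER ALPHA
print(typed == registered)  # False
```

The names look the same on the page, but a plain comparison treats them as
two different domains.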

> While I think Edmon's notions would make
> that problem worse, rather than better (hence my attempt at a
> reductio ad absurdum argument in my earlier note),

My notion is to keep the DNS stupid and simply map all characters with
"perceived" equivalence to one another during internal matching.  The domain
names as they conceptually exist, however, should remain user friendly.  In
other words, domain names for humans will carry context and meaning (an
"ALPHA" will be an "ALPHA", not an "A"), but domain names as seen by the DNS
(the machine) should be devoid of context.  To bridge the two, a table of
equivalent characters is prepared for the machine so that it can blindly
treat them as identical without considering their context.
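
As a rough sketch of the table idea (not a proposed specification), the
Python below folds look-alike characters to a single representative before
matching.  The table contents, the name EQUIVALENTS, and the function
matching_key are all hypothetical; a real table would be much larger and
would need careful review.

```python
# Hypothetical, deliberately tiny equivalence table.  Each look-alike
# character maps to one representative used only for internal matching.
EQUIVALENTS = {
    "\u0391": "A",  # GREEK CAPITAL LETTER ALPHA
    "\u0392": "B",  # GREEK CAPITAL LETTER BETA
    "\u0410": "A",  # CYRILLIC CAPITAL LETTER A
    "\u0412": "B",  # CYRILLIC CAPITAL LETTER VE
}


def matching_key(label: str) -> str:
    """Return the context-free form of a label used for comparison.

    Characters not in the table pass through unchanged.  Case folding
    and other normalization are intentionally left out of this sketch.
    """
    return "".join(EQUIVALENTS.get(ch, ch) for ch in label)


# The name shown to humans keeps its Greek ALPHA ...
displayed = "\u0391" + "2" + "\u03b2"
# ... while a reader may type it with a Latin A.
typed = "A" + "2" + "\u03b2"

# Both collapse to the same key, so the machine treats them as identical.
same = matching_key(displayed) == matching_key(typed)
print(same)  # True
```

The displayed name is never rewritten; only the key used for matching is
folded, which keeps the human-facing name meaningful while the machine-side
comparison stays context free.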

> I don't
> believe that "resolve by context" is going to be realistic in
> practice --again, in the DNS context-- either.

Agreed.

Edmon


...Just as a final note: I think character equivalence mapping could be an
optional "best practice" configuration that zone operators provide to their
users.  Even so, it would be a good idea to have a technically sound
approach, which is what I am trying to discuss.