Re: An argument against multiple character sets



--On 2000-01-26 13.14 -0500, "J. William Semich" <bill@mail.nic.nu> wrote:

> If all domain names are registered using a single universal standard
> encoding then all registered domain names will be unique within a
> particular TLD, assuming the current requirement for a single root server
> system continues to exist.

...because the encoding itself, like Unicode in UTF-8, includes ambiguities
such as the fact that:

'Ä' is one position in the Unicode tables.
'ä' is a different position in the tables.
'A' followed by "Combination 'M'" is a third.
'a' followed by "Combination 'M'" is a fourth.

Should they be treated as equal or not?
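
To make the four cases concrete, here is a minimal Python sketch that
prints the code points and UTF-8 bytes of each form. It assumes that
"Combination 'M'" above stands for U+0308 COMBINING DIAERESIS; that
reading, and the names COMBINING_MARK, forms and code_points, are
illustrative assumptions only.

```python
# Assumption: the combining mark meant above is U+0308 COMBINING DIAERESIS.
COMBINING_MARK = "\u0308"

# The four forms listed above (names are hypothetical, for illustration).
forms = {
    "precomposed capital": "\u00c4",            # 'Ä' as one code point
    "precomposed small": "\u00e4",              # 'ä' as one code point
    "capital + combining": "A" + COMBINING_MARK,
    "small + combining": "a" + COMBINING_MARK,
}

def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

for name, text in forms.items():
    # The UTF-8 bytes differ only because the code point sequences differ.
    print(f"{name:22} {code_points(text)} {text.encode('utf-8').hex()}")

# All four are different sequences before any matching rule is applied.
all_distinct = len(set(forms.values())) == len(forms)
```

The four strings stay distinct in any encoding of the same code points,
which is why the question is about matching rules and not about encoding.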

It has NOTHING to do with encoding. We are not talking about encoding here.

The problem is the meta-question of matching rules: where they are
defined, whether they have to be defined at all, and so on.

Today, 'A' and 'a' are treated as equal in DNS, so one cannot register the
domain name "Example.com" if "example.com" is already registered, even
though different bytes are encoded in the labels. Should "äxample.com" be
different from "Äxample.com", and should those in turn be different from
"aMxample.com" and "AMxample.com"?
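
As a sketch of how much the answer depends on the chosen rule, the Python
below compares two hypothetical matching rules on those four labels. It
again assumes 'M' stands for U+0308 COMBINING DIAERESIS. The function
names are made up for illustration, and NFC plus case folding is only one
possible rule, not one that DNS or any standard is claimed to use here.

```python
import unicodedata

def ascii_only_key(label):
    """Fold only A-Z to a-z, as DNS does today; leave everything else as-is."""
    return "".join(chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch for ch in label)

def normalized_key(label):
    """One possible rule (an assumption): compose with NFC, then case-fold."""
    return unicodedata.normalize("NFC", label).casefold()

# Assumption: 'M' in the text is U+0308 COMBINING DIAERESIS.
labels = [
    "\u00e4xample.com",    # precomposed small letter
    "\u00c4xample.com",    # precomposed capital letter
    "a\u0308xample.com",   # 'a' followed by the combining mark
    "A\u0308xample.com",   # 'A' followed by the combining mark
]

# Number of distinct registrations each rule would allow.
distinct_ascii_only = len({ascii_only_key(label) for label in labels})
distinct_normalized = len({normalized_key(label) for label in labels})
print(distinct_ascii_only, distinct_normalized)
```

Under the first rule only the two combining-mark forms collide; under the
second all four do. Which of these is right is exactly the open question.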

   paf