
Re: [idn] Prohibit CDN code points



--On 2002-01-23 23.24 +0900 YangWoo Ko <newcat@spsoft.co.kr> wrote:

> I am talking about IDNA. With IDNA, we do not add matching algorithm
> to any server. Some client may try "az--" and it fails, it may try 
> "bz--" with TC/SC feature-enabled preprocessing, while others only 
> try "az--". I know this may look quite ugly. But, current IDN proposal 
> seems premature in terms of internationalization at least from the
> Chinese  people's point of view.

The problem occurs when you use a language such as Swedish. I can
type something in Swedish, and it should match something in Norwegian. It
might also match something in German, and in English. What about "color"
and "colour" matching when you use US English and UK English?

So my resolver library encodes the query, in some IDNA-like fashion, in
all of the languages above, and then tries them one at a time until a
match is found.

I call that guessing. It is not something the DNS is made for or is very
efficient at doing, and it is something we in the IETF have repeatedly
said "No" to.

That is because the DNS is a lookup system in which a client calculates a
key, that key is sent to the mesh of servers, and a result is returned.
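
To make the contrast concrete, here is a minimal sketch of the two client
behaviours. Everything in it is an assumption for illustration: the zone
dict stands in for the DNS, and the SPELLINGS and TAGS tables, the "ab"/"bc"
prefixes and both function names are hypothetical, not part of any IDN
proposal.

```python
# Minimal sketch; the zone dict stands in for the DNS and all tags are made up.

# Hypothetical table of per-language spellings a guessing client might try.
SPELLINGS = {
    "color": ["color", "colour"],
    "colour": ["colour", "color"],
}
# Hypothetical language prefix attached to each spelling.
TAGS = {"color": "ab", "colour": "bc"}


def single_lookup(key, zone):
    """What the DNS is built for: one computed key, one query, one answer."""
    return zone.get(key), 1


def guessing_lookup(word, zone):
    """Try every candidate encoding in turn until one of them matches."""
    queries = 0
    for spelling in SPELLINGS.get(word, [word]):
        key = f"{TAGS.get(spelling, 'ab')}-{spelling}"
        queries += 1
        if key in zone:
            return zone[key], queries
    return None, queries


zone = {"bc-colour": "192.0.2.1"}
print(single_lookup("bc-colour", zone))   # ('192.0.2.1', 1)
print(guessing_lookup("color", zone))     # ('192.0.2.1', 2)
```

The guessing client needs one query per candidate, and the number of
candidates grows with every language it is willing to try.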

The alternative would be to have the servers understand that UK English
and US English are "almost" the same, so that they are able to match
ab-<color> with bc-<colour>. Should ab-<colour> match bc-<colour>? If not,
then ab-<color> is the same as bc-<colour>, which in turn is different
from ab-<colour>. This means we can get one registrant who registers
ab-<colour> and another who registers bc-<colour>, but no one can then
register ab-<color>, because bc-<colour> is already taken and we require
global uniqueness.
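
The registration scenario above can be sketched as follows. This is only
a toy model under stated assumptions: names are (tag, spelling) pairs, the
EQUIVALENT table holds the single cross-language match discussed here, and
the Registry class is hypothetical.

```python
# Toy model: a name is a (language_tag, spelling) pair.
# EQUIVALENT lists the cross-language pairs the servers would treat as equal.
EQUIVALENT = {frozenset({("ab", "color"), ("bc", "colour")})}


def matches(a, b):
    """Two names match if they are identical or listed as equivalent."""
    return a == b or frozenset({a, b}) in EQUIVALENT


class Registry:
    """Hypothetical registry that enforces uniqueness under matches()."""

    def __init__(self):
        self.names = []

    def register(self, name):
        # Refuse any name that matches one that is already taken.
        if any(matches(name, taken) for taken in self.names):
            return False
        self.names.append(name)
        return True


registry = Registry()
print(registry.register(("ab", "colour")))  # True
print(registry.register(("bc", "colour")))  # True: does not match ab-<colour>
print(registry.register(("ab", "color")))   # False: matches bc-<colour>
```

Two different registrants end up holding ab-<colour> and bc-<colour>, while
ab-<color> is blocked by the second of them.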

I.e. if you _want_ to go down this path, you should look at the RFCs that
define language codes, and think about how you would build something
that can handle those languages. Then, hopefully, you will be convinced
that this is not a path that can be followed.

Multiple lookups in the DNS are not something that will be accepted, and
multiple matching algorithms will not work either.

> Allowing Unicode be used where only ASCII was used before is not enough 
> for internationalization. We should prepare an enough room in which
> localization can be done while complying with internationalized standards.

You have to distinguish between localization and internationalization. It
is well known that IDN is _not_ about localization.

>> I.e. if you open the box of "problems" with Unicode, you will find that
>> the SC/TC problem is only one of them. Only one. 
> 
> Yes, I totally agree.
> 
>> I guess we have some 20-30
>> other problems which are similar to the SC/TC, i.e. problems because of
>> unification or non-unification in Unicode.
> 
> Why are we going to hide and ignore problems even though we know they are
> there ?

Because you cannot solve them using Unicode and context-free matching,
which is what we do in the DNS.

>> My conclusion is the same, every server need to have knowledge about how
>> to handle all encodings.
> 
> My conclusion is the same with yours.

Good.

  paf