Re: [idn] case folding



At 13:39 31-05-00 , Brian W. Spolarich wrote:
>   What problem does case folding solve?

It prevents functionally identical strings from pointing at different
web content, to give an obvious example.

>Is it reasonable for protocol
>users to expect that MYDOMAIN.COM and MyDoMaIn.CoM are semantically the
>same, and therefore the protocol should understand that?

Users have already demonstrated that they expect this, so the principle
of least astonishment means this behaviour should not change.

>While there is a
>backward compatibility requirement for US-ASCII, is it truly the case that
>users of the IDN will so strongly expect this behaviour that it becomes a
>requirement?

Absolutely yes.

>Is it possible to come up with a case-folding implementation
>that is going to satisfy the behavioural expectations of the large
>majority of the users?  I am mostly ignorant of these issues as they apply
>to the vast majority of languages, but given the issues that have been
>raised here, I have to wonder if this is practically achievable.

The key is to distinguish between alphabetic languages (e.g. English,
Norwegian, Vietnamese) and non-alphabetic languages (e.g. Chinese).

For alphabetic languages, case folding needs to be handled appropriately,
while for non-alphabetic languages "case" is not generally meaningful.

For the previous example of German double-S, it isn't really a matter
of case-folding but would definitely be within scope for canonicalisation,
IMHO.
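
To make the double-S example concrete, here is a minimal sketch of how one
existing implementation treats it. Python and its standard library are an
assumption made purely for illustration; nothing in this thread specifies
them. In this implementation plain lowercasing and NFKC normalisation both
leave U+00DF alone, while full case folding maps it to "ss", so where the
mapping belongs depends on how the specification draws the line.

```python
import unicodedata

# "Strasse" written with U+00DF LATIN SMALL LETTER SHARP S (illustrative input)
word = "Stra\u00dfe"

# Simple lowercasing: U+00DF is already lowercase, so it is left untouched.
lowered = word.lower()

# Full case folding (str.casefold): U+00DF is mapped to "ss".
folded = word.casefold()

# Normalisation on its own (NFKC here) does not rewrite U+00DF.
normalized = unicodedata.normalize("NFKC", word)

# After folding, the sharp-S spelling and the "SS" spelling compare equal.
same_after_folding = folded == "STRASSE".casefold()
```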

Unicode already has a specification for canonicalisation, which reportedly
includes case folding.  We can simply use that specification; ISO has no
equivalent specification today and is not expected to have one soon.
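
As a rough sketch of what "canonicalise, including case folding" could look
like, the snippet below normalises a name, folds case, and normalises again.
The function name canonical_name, the choice of NFKC, and the use of Python
are all assumptions for illustration only; this is not a rule taken from any
specification or agreed by this list.

```python
import unicodedata

def canonical_name(name: str) -> str:
    """Hypothetical helper: normalise, case-fold, then normalise again."""
    # First pass: bring composed and decomposed sequences to one form.
    normalized = unicodedata.normalize("NFKC", name)
    # Fold case; scripts without case (e.g. Chinese) pass through unchanged.
    folded = normalized.casefold()
    # Second pass: folding can yield sequences that need renormalising.
    return unicodedata.normalize("NFKC", folded)

# The ASCII example from this thread: both spellings give the same key.
ascii_match = canonical_name("MYDOMAIN.COM") == canonical_name("MyDoMaIn.CoM")

# "e" + combining acute accent matches the precomposed capital E-acute.
accent_match = canonical_name("e\u0301") == canonical_name("\u00c9")

# A name in a script without case is returned as-is.
cjk_unchanged = canonical_name("\u5317\u4eac") == "\u5317\u4eac"
```

Comparing canonical_name(a) == canonical_name(b) then plays the role that
case-insensitive comparison plays for US-ASCII names today.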

>   One of the DNS' strengths is its relative simplicity for the complex
>distributed task that it accomplishes.  Would the complexity and potential
>ambiguity involved in coming up with case mapping rules that meet
>everyone's expectations diminish the simplicity principle that makes the DNS
>work well?

No.  Failure to define case-mapping as part of canonicalisation would
definitely cause users to become frustrated and angry and work against
the continued health of the Internet.

Ran
rja@inet.org