
Re: [idn] case folding



At 16:13 11/06/00 , Marco d'Itri wrote:
>On Jun 11, RJ Atkinson <rja@inet.org> wrote:
>
>  >One could imagine a URL:
>  >         http://www.d-o^ng.vn
>  >and its capitalised equal:
>  >         http://WWW.D-O^NG.VN

(Background: under my proposal for normalising letter-based
(not ideographic) scripts, the two example URLs above would
compare as identical.)

>What you explained would apply not only to Vietnamese, but to most
>european languages too.

Yes.

>The point in this proposal (which I support) is that users should
>not assume that domain names containing non-ASCII characters are case
>insensitive.

Hmm.  I'm not sure that I follow your comment, so I'll try
to re-explain my perspective. :-)

ASCII does not contain (for example) the characters,
         "d-", "D-", "o^", or "O^"
so my proposal extends to ALL Romanised letters, not
just those in the very limited US-ASCII repertoire.  The
formal definition of "ASCII" is ANSI X3.4, of course,
not any version of ISO-646.

(NB: There are charset limitations of the email system that I am using
that prevent me from representing the Vietnamese-unique
Roman letters as precomposed inside my email message body).

So my proposal is that ALL domain names using alphabetic letters
(whether Greek, Roman, Latin, or other alphabetic scripts)
WOULD have case-folding for comparison purposes, hence such
letters WOULD be case-insensitive for domain-name/host-name
matching.  
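
To make the comparison rule concrete, here is a minimal sketch in
Python.  It assumes that "case-folding" means Unicode full case folding
(str.casefold) applied to NFC-composed text; neither choice is fixed by
the proposal, and the names fold_name and names_match are hypothetical.

```python
import unicodedata


def fold_name(name: str) -> str:
    """Return a comparison key for a domain name.

    Assumption: NFC composition plus Unicode full case folding stands in
    for the case-folding step; the proposal does not fix an algorithm.
    """
    composed = unicodedata.normalize("NFC", name)
    # Re-compose after folding, since folding can change composition.
    return unicodedata.normalize("NFC", composed.casefold())


def names_match(a: str, b: str) -> bool:
    """True when two names are equal after folding."""
    return fold_name(a) == fold_name(b)


# The two example URLs' host parts, written with precomposed letters.
lower = "www.\u0111\u00f4ng.vn"   # d-stroke, o-circumflex
upper = "WWW.\u0110\u00d4NG.VN"   # D-stroke, O-circumflex
print(names_match(lower, upper))
```

Under these assumptions the two spellings produce the same key, while
a name that drops the accents (plain "dong") does not.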

I omit ideographic scripts (e.g. Chinese) from the current
discussion.

>Actually, at least in my country, users would not think of this as a
>problem because most of them do not even know there are upper case
>letters with accent marks, because they are not on the keyboard.
>Maybe the RFC could suggest canonicalization being done by the user
>interface (web browser, MUA) to help users, but this should not be
>a requirement.

The UI is not a good choice because some requests might originate
inside software that doesn't have a UI.  That argues for having
the DNS client normalise names PRIOR to putting any request onto
the wire.
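
A rough sketch of what client-side normalisation could look like, in
Python.  Everything here is an assumption made for illustration: the
function name normalise_for_wire is hypothetical, NFC plus full case
folding stands in for the normalisation step, and UTF-8 is used as the
label encoding only because some byte encoding is needed to show the
63-octet label limit.

```python
import unicodedata
from typing import List


def normalise_for_wire(name: str) -> List[bytes]:
    """Fold a name and split it into labels before any query is built.

    Assumptions (not part of the proposal): NFC + str.casefold as the
    normalisation, and UTF-8 as the on-the-wire label encoding.
    """
    composed = unicodedata.normalize("NFC", name)
    folded = unicodedata.normalize("NFC", composed.casefold())

    labels = []
    # Drop a trailing root dot, then handle each label separately.
    for label in folded.rstrip(".").split("."):
        encoded = label.encode("utf-8")
        # DNS labels are limited to 63 octets and may not be empty.
        if not 0 < len(encoded) <= 63:
            raise ValueError("bad label length: %r" % label)
        labels.append(encoded)
    return labels


# A program with no UI calls the same routine, so both spellings
# produce the same labels before anything reaches the network.
print(normalise_for_wire("WWW.\u0110\u00d4NG.VN"))
print(normalise_for_wire("www.\u0111\u00f4ng.vn"))
```

Because the folding happens inside the resolver library rather than in
a browser or MUA, requests from software without a UI are covered too.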

>  >It is not reasonable to say that the content provider needs to 
>  >register all domain-names with the myriad case combinations 
>  >and manually map them to the same content, though I can see

>I agree. I think registries should not allow registration of labels
>containing upper case letters.

I'm not sure I understand your comment.  I said nothing about
upper-case letters.

I think that the URLs above should compare as equal under the
new DNS extensions, so it would not be a matter of permission.
The technical specification would require that both URLs be equal,
so it would not be technically possible to issue separate domain
names (or host names) that differ only in case.  That noted, the
user might mix case in any way without altering which domain-name
or host-name the new DNS would resolve to.  For example, the
user could use the all-CAPS version of the URL without any
technical problems.

Ran
rja@inet.org