
RE: China



I'll add my $.02 to yours...

At 10:04 AM 1/27/00 -0800, Andrew Draper wrote:
>Not quite sure where the subject line came from... but here's my ?0.02
>
>>  - I think, first, that the selection of character set is a no brainer. 
>>    There are defined character sets, and we know how to put them into DNS.
>>    The thing to do is follow the same procedures and rules we have used in
>>    extending other protocols that use alphabetic information - some 
>>    combination of ISO 10646 enhanced by UNICODE rules. I would encourage 
>>    you to settle that debate quickly and move on. This is the easy part.
>
>If choosing ISO10646/UNICODE as the single charset helps resolve
>canonicalisation matters (and I think it does) then we should do so now.
>Choosing the encoding can be left for later.

Agreed

>>  - Although DNS is defined as a binary service (and therefore amenable to 
>>    changes such as the use of UTF-8), many implementations are dependent 
>>    on the specific character set used by UTF-5. Therefore, deployment of a
>>    UTF-8-based solution implies a need for extensive testing of 
>>    implementations to make sure that they accomplish the necessary goals.

Any new implementation of the DNS - whether it includes UTF-5, UTF-8 or
UTF-16 - needs extensive testing.

>Does this imply a requirement that the protocol shouldn't send non-ASCII DNS
>labels to servers which aren't expecting them?  To resolvers which aren't
>expecting them?
>
>I suspect that servers will tend to be more robust (and easier to upgrade
>since they are normally attended by sufficient administrators).  There are
>also not that many DNS server implementations, and most of them have been
>written by knowledgeable people so I suggest that "Should not send non-ASCII
>names to servers which don't support IDNs" is a non requirement (and should
>be listed as such).

Agreed

>However, who knows how many resolvers are out there built into printers and
>the like.  I expect that some of these will be quite fragile.  So I would
>suggest adding "Should not send non-ASCII names to resolvers which don't
>support IDNs" to the requirements.

This should *not* be added to the requirements. IMHO, it will constrain
implementation alternatives excessively. 
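
For illustration only, here is a minimal Python sketch of the check that
such a requirement would force on every sender: deciding whether a name
carries any non-ASCII label at all. The function name is hypothetical and
is not taken from any DNS implementation; how a sender would learn what
the receiving resolver supports is not addressed here.

```python
def has_non_ascii_label(name: str) -> bool:
    """Return True if any dot-separated label contains a non-ASCII character.

    Hypothetical helper for illustration; empty labels (e.g. from a
    trailing dot) are ignored.
    """
    labels = [label for label in name.split(".") if label]
    return any(not label.isascii() for label in labels)


# A plain ASCII name passes through; a name with u-umlaut is flagged.
plain = "cisco.com"
accented = "m\u00fcnchen.example"

print(has_non_ascii_label(plain))
print(has_non_ascii_label(accented))
```

The check itself is trivial; the hard part is knowing what the other end
supports, which is where the constraint on implementations would come from.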

>>  - There are significant questions in the comparison of characters. For 
>>    example, in European alphabets, upper and lower case are considered 
>>    equivalent - "Cisco.com" and "cisco.com" are the same DNS name. In 
>>    German, a "u" with an umlaut over it is equivalent to a "u" followed by
>>    an umlaut extender, and also to the character string "ue". No doubt it 
>>    only gets more interesting as you move to ideographic alphabets.
>
>I think that the "nasty ASCII alternatives" issue is also a non-requirement
>since it is so language dependent - do all languages which use u-umlaut
>pronounce it as "ue"?  What about n-tilde? (in Spanish this might be written
>"ny" in an ASCII-only charset).  Can anyone who speaks several European
>languages come up with an example of an accent which has different "nasty
>ASCII only" spellings in different languages?  Is this also a problem in
>Asian languages (sorry - don't know the right term)?

As ASCII "compromises" for representation of international charsets, "ae",
"ny" and "oe" are, AFAIK, informal and nonstandard, and should not be an
issue for this group.
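
To make the distinction concrete, here is a rough Python sketch using the
standard unicodedata module. It assumes one possible comparison rule (NFC
normalisation followed by case folding); that rule is my assumption for
illustration, not something this group has agreed on, and canonical_key is
a hypothetical name. Under it, upper and lower case compare equal, and so
do the precomposed and combining forms of u-umlaut, while the informal
"ue" spelling stays a different name.

```python
import unicodedata


def canonical_key(label: str) -> str:
    """Return a comparison key: NFC-normalise, then case-fold.

    One possible rule, assumed here for illustration only.
    """
    return unicodedata.normalize("NFC", label).casefold()


precomposed = "m\u00fcller"     # u-umlaut as a single code point
decomposed = "mu\u0308ller"     # "u" followed by a combining diaeresis
ascii_spelling = "mueller"      # informal ASCII-only spelling

# Case differences disappear under the key.
assert canonical_key("Cisco.com") == canonical_key("cisco.com")
# Both Unicode spellings of u-umlaut map to the same key.
assert canonical_key(precomposed) == canonical_key(decomposed)
# The "ue" spelling is not covered by normalisation and remains distinct.
assert canonical_key(precomposed) != canonical_key(ascii_spelling)
```

Treating "ue" as equivalent would need a per-language table on top of
this, which is the language-dependent part.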

Cheers,

Bill Semich
.NU Domain

Bill Semich
President and Founder
.NU Domain Ltd
http://whats.nu
bill@mail.nic.nu