
Re: My prod at IDN requirements



As Patrik mentioned, the WG is not confirmed yet, pending Erik's and Thomas's
approval. Anyway, since Harald has started the discussion, let's go on until
someone asks us to stop :-)

Harald Tveit Alvestrand wrote:
> Base requirements - I think we can regard these as given:
> 
>    DO NOT DAMAGE PRESENT DNS INTEROPERABILITY

A very big YES! IMHO, it should also extend to other Internet protocols. I
think someone needs to count how many RFCs may be affected by IDN.

> Internationalization requirements:
> 
>    Allow internationalized characters to be represented and used in DNS names
>    Allow internationalized characters to be represented and used in DNS records
> 
>    This is too broad - we don't know what that means.

I think we need to properly define three cases:

I18N of Domain Names as represented on the client.
I18N of Domain Names as represented in DNS packet.
I18N of Domain Names as represented in DNS records/zones.

They may be the same, or they may not be. We do not know.
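
To make the three cases concrete, here is a minimal sketch of how one name could look in each place. The client form is just the string the user types, and the packet form uses the usual length-prefixed label layout of DNS messages. Putting UTF-8 octets inside the labels is purely an assumption for illustration, not something anyone in this thread has proposed, and `to_wire` is a hypothetical helper name.

```python
def to_wire(name, encoding="utf-8"):
    """Encode a dotted name as length-prefixed labels ending in a zero octet.

    The choice of UTF-8 for non-ASCII labels is an assumption made only
    for this illustration.
    """
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode(encoding)
        if not 0 < len(raw) <= 63:      # a DNS label is 1..63 octets
            raise ValueError("bad label length: %r" % label)
        out.append(len(raw))            # length octet
        out += raw                      # label octets
    out.append(0)                       # root label terminates the name
    return bytes(out)


client_form = "b\u00fccher.example"     # what the user types ("bücher.example")
packet_form = to_wire(client_form)      # what would travel in the packet
zone_form = client_form + "."           # one possible zone-file spelling

print(to_wire("example.com"))
print(packet_form)
```

Note that the six-character label takes seven octets on the wire under this assumption, so the client form and the packet form already differ in length.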

>    i18c in a Query must be possible   YES/NO 

yes.

>    i18c in the name field of a RR Response must be possible YES/NO

yes

>    i18c in the content of a TXT record must be possible YES/NO

maybe, but preferably yes.

>    i18c in a name field of a Response or in content of a RR must be
> uniquely identifiable as such YES/NO

this is sort of related to the matching problem. but i think yes.

>    i18c must be returned as content of a CNAME YES/NO

yes. we should not change the existing DNS system. 

>    i18c must be returned as content of a PTR YES/NO

maybe? i would like to see it be a yes.

>    i18c must be possible in dynamic update names & records YES/NO

yes. we should not change the existing dns system.

>    it must be possible to DNSSEC sign i18c records DNS server to client YES/NO

yes. we should not change the existing dns system.

> More in the solution space:
> 
>    iso 10646 characters will be enough forever for DNS purposes YES/NO

UCS-4 should cover all languages, including any variations that come up in
the future. However, it also has a lot of problems, including the fact that
it changes from time to time :P

>    a single representation for i18c must be chosen YES/NO

maybe? i think different proposals will have different answers to this. i
think we should leave it open, and not limit ourselves to only iso10646 or
some other encoding.


> For matching records, Choose One:
> 
>    it matters whether matching is consistent across all servers
>    it doesn't matter whether matching is consistent across all servers

I think obviously we need to make sure matching is consistent across all
servers.

>    i18c Cyrillic A must compare equal to Latin A
>    i18c Cyrillic A must compare not equal to Latin A
>    i18c A with Ring Above must compare equal to a with ring above
>    i18c A with Ring Above must compare not equal to a with ring above
>    i18c ASCII A must compare equal to a
>    i18c ASCII A must compare not equal to a
>    i18c A + COMBINING RING ABOVE must compare equal to A with Ring Above
>    i18c A + COMBINING RING ABOVE must not compare equal to A with Ring Above

case-folding is not a simple problem, even for european languages, as it may
vary with context. http://www.unicode.org/unicode/reports/tr21/ is a good
report on the case mapping problem, at least for european languages.
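
As a rough illustration of Harald's four pairs, the sketch below compares them as raw code points, then with canonical composition (NFC), then with simple case folding on top. It uses Python's standard `unicodedata` module. `match` is a hypothetical helper, and treating NFC plus case folding as the matching rule is an assumption for illustration, not a proposal.

```python
import unicodedata

LATIN_A = "\u0041"        # LATIN CAPITAL LETTER A
LATIN_SMALL_A = "\u0061"  # LATIN SMALL LETTER A
CYRILLIC_A = "\u0410"     # CYRILLIC CAPITAL LETTER A
A_RING = "\u00c5"         # LATIN CAPITAL LETTER A WITH RING ABOVE
SMALL_A_RING = "\u00e5"   # LATIN SMALL LETTER A WITH RING ABOVE
A_PLUS_RING = "A\u030a"   # A followed by COMBINING RING ABOVE


def match(x, y, normalize=False, fold=False):
    """Compare two strings, optionally after NFC and/or case folding."""
    if normalize:
        x = unicodedata.normalize("NFC", x)
        y = unicodedata.normalize("NFC", y)
    if fold:
        x, y = x.casefold(), y.casefold()
    return x == y


pairs = [
    ("Cyrillic A vs Latin A", CYRILLIC_A, LATIN_A),
    ("A-ring vs a-ring", A_RING, SMALL_A_RING),
    ("ASCII A vs a", LATIN_A, LATIN_SMALL_A),
    ("A + combining ring vs A-ring", A_PLUS_RING, A_RING),
]

for label, x, y in pairs:
    print(label,
          "raw:", match(x, y),
          "nfc:", match(x, y, normalize=True),
          "nfc+fold:", match(x, y, normalize=True, fold=True))
```

Under these assumptions the combining-ring pair matches once normalized, the two case pairs match only after folding, and Cyrillic A never matches Latin A. Every server would have to apply exactly the same steps for matching to stay consistent.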

> Others are MUCH better than me in compiling example cases and requirements
> for Korean, Japanese, Thai, Arabic, Hebrew.....

in addition, there are also languages which have other folding problems.
chinese, for example, has simplified & traditional glyphs which mean the same
thing and are used in the same way, but are given different code points.

japanese kanji also have traditional & simplified glyphs, but they are usually
treated differently. or at least that is what i have been told.

this will be a problem if ISO10646 is used. because of the CJK unification
(arggh who is the idiot?), japanese & chinese fall under the same code space
starting at U+4E00. if one folds and the other does not, i think it is fairly
obvious how messy it is going to be.
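
a small sketch of the simplified/traditional point, using the character for "country" as the example (my choice of example, not something from the requirements list). both forms sit in the unified block that starts at U+4E00, and none of the standard Unicode normalization forms or case folding brings them together, so any such folding would need an extra mapping table. `in_unified_block` and `folds_together` are hypothetical helper names.

```python
import unicodedata

SIMPLIFIED = "\u56fd"    # simplified form of "country"
TRADITIONAL = "\u570b"   # traditional form of "country"


def in_unified_block(ch):
    """True if ch lies in the CJK Unified Ideographs block U+4E00..U+9FFF."""
    return 0x4E00 <= ord(ch) <= 0x9FFF


def folds_together(x, y):
    """True if any standard normalization form plus case folding equates x and y."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return any(
        unicodedata.normalize(f, x).casefold()
        == unicodedata.normalize(f, y).casefold()
        for f in forms
    )


print(hex(ord(SIMPLIFIED)), hex(ord(TRADITIONAL)))
print(in_unified_block(SIMPLIFIED), in_unified_block(TRADITIONAL))
print(folds_together(SIMPLIFIED, TRADITIONAL))
```

as far as i know, the simplified form U+56FD is also the everyday japanese form, so a chinese-only folding rule applied to that code point would touch japanese names too.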

korean hangul, if i am not wrong, does not suffer from this problem :-) it is a
very clean and well-designed script.

-James Seng