
Re: [idn] Comparisons of the proposals



Dan wrote:
> But it would be nice for me not to need a special editor when editing
> zone files. That is why I in my DNS internationalisation draft
> recommend DNS servers to handle loading of zone files from the
> local character set and convert it into the standard used internally.

I think we keep seeing arguments about how an IDN should be represented on
the wire, in the zone file, on the client, and so on. Ideally they would all
be the same, but we live in a practical world...

For example, URLs escape non-ASCII characters in %hh format (ah, sorry, I
made a mistake here earlier, which Larry corrected). Even Martin, in his
draft-duerst-dns-i18n-02.txt, suggested that URLs should keep the host part
in %hh.
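
To make the %hh form concrete, here is a small Python sketch. It assumes the
host name is encoded as UTF-8 before escaping (nothing above fixes which
charset gets escaped); the helper name `escape_host` and the host name are
made up for illustration.

```python
from urllib.parse import quote, unquote

def escape_host(host: str) -> str:
    """Percent-escape every non-ASCII byte of a host name (UTF-8 assumed)."""
    # quote() leaves ASCII letters, digits and "_.-~" untouched and writes
    # every other byte of the UTF-8 encoding as %HH.
    return quote(host, safe="", encoding="utf-8")

escaped = escape_host("bücher.example")
print(escaped)           # b%C3%BCcher.example
print(unquote(escaped))  # bücher.example
```

The escaped form is pure ASCII, which is why it survives in a URL, but it is
only meaningful if both ends agree on the charset underneath the %hh.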

On the other hand, %hh escaping may not work for other client
representations, for example email. For RFC 822 mail headers, MIME (RFC 2047)
may be more appropriate. Other protocols (e.g. RFC 821?) and applications
(tcpd?) may not be so lucky and may not be able to handle multilingual
characters at all. This is where ACE is useful.

Now, moving down to the wire: for DNS packets, UTF-8 seems most appropriate.
But if 8-bit dirty data is allowed, then why not localized encodings?
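
One catch with localized encodings on the wire is that a bare 8-bit label
does not say which charset it is in. A small Python sketch of the problem,
with the label and the two charsets picked only for illustration:

```python
label = "bär"

# The same label produces different bytes in different encodings.
wire_utf8 = label.encode("utf-8")      # b'b\xc3\xa4r'
wire_latin1 = label.encode("latin-1")  # b'b\xe4r'

# A receiver that guesses the wrong charset silently gets a different
# name back: KOI8-R assigns a character to every byte, so no error is
# raised to warn it.
misread = wire_latin1.decode("koi8_r")

print(wire_utf8, wire_latin1)
print(misread == label)  # False
```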

Moving over to the DNS server, it is probably better to keep the domain name
in wchar_t (native ISO 10646) rather than char (UTF-8). It is easier to
manipulate data as fixed-width Unicode code points than as variable-length
UTF-8.
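
A quick illustration of the difference. Python's `str` stands in here for a
wchar_t array of ISO 10646 code points and `bytes` for a char buffer holding
UTF-8; the sample name is invented.

```python
name = "日本語.example"

as_codepoints = name            # one element per character
as_utf8 = name.encode("utf-8")  # one to four bytes per character

print(len(as_codepoints))  # 11
print(len(as_utf8))        # 17

# "The first two characters" is a plain slice on code points...
print(as_codepoints[:2])   # 日本

# ...but the same slice on UTF-8 bytes cuts a character in half.
try:
    as_utf8[:2].decode("utf-8")
    split_ok = True
except UnicodeDecodeError:
    split_ok = False
    print("byte slice split a multi-byte character")
```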

Then the zone file: why couldn't it be stored in a localized encoding? It is
definitely much easier for the DNS administrator to manipulate the zone in a
localized encoding than in UTF-8 or ACE. On the other hand, ACE has the
advantage that a DNS administrator can enter a domain name even in a language
he doesn't know or understand. Then again, UTF-8 editors are around... but
ACE doesn't even need a special editor... but... blah blah...
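
Loading a zone file written in a local charset and converting it on the way
in, as Dan suggests, could look roughly like this in Python. The function
name `load_zone_lines`, the record, and the choice of Big5 as the local
charset are all assumptions for the sketch; no real DNS server API is
implied.

```python
def load_zone_lines(raw: bytes, local_charset: str) -> list:
    """Decode zone-file bytes from the administrator's charset into
    Unicode strings, the form used internally from then on."""
    text = raw.decode(local_charset)
    records = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and ';' comments, keep the records.
        if line and not line.startswith(";"):
            records.append(line)
    return records

# Pretend these bytes came from a file saved by a Big5 editor.
raw_zone = "; test zone\n中文.example. IN A 192.0.2.1\n".encode("big5")

records = load_zone_lines(raw_zone, "big5")
print(records)

# The wire form can then be derived from the internal one, e.g. as UTF-8.
print(records[0].split()[0].encode("utf-8"))
```

The server has to be told the charset of the file out of band (a config
option, say), since the file itself does not carry it.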

Therefore, I believe the issues can really be discussed at separate levels,
each with different issues and possibly a different result.

> For those of you who want to study different possibilities of an ACE,
> I have one more for you. In my draft I have a very simple ACE that
> takes a lot of space so that long ACE names cannot be sent over
> the wire through DNS. CIDNUC can be very compact but is difficult
> to encode/decode. 

Cool. Maybe I should delay my CIDNUC testing until I get your ACE in too. Let
me see what I can do this week. :) I am also toying with run-length
compression with a base36 encoding... (yep, base36 is going to make things
very hard to implement... perhaps harder than CIDNUC)

> characters. It is based on ideas from CIDNUC and UTF-5.
> It is only intended to encode BMP. Leaving BMP you
> must go international and use UTF-8.

One question: why limit it to the BMP only? From the algorithm, it looks
possible for it to go beyond the BMP...
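
To show what I mean, here is a rough Python sketch of a UTF-5-style nibble
encoding. This is not Dan's ACE (its details are not given in this message);
it only assumes the general UTF-5 idea that each character is written as its
hex digits, with the first digit taken from G-V to mark where a character
starts. The names `FIRST`, `REST`, `encode` and `decode` are made up here.

```python
FIRST = "GHIJKLMNOPQRSTUV"  # first nibble of a character (values 0..15)
REST = "0123456789ABCDEF"   # every following nibble

def encode(text: str) -> str:
    out = []
    for ch in text:
        digits = format(ord(ch), "X")  # hex, no leading zeros
        out.append(FIRST[int(digits[0], 16)])
        out.append(digits[1:])
    return "".join(out)

def decode(ace: str) -> str:
    chars, value = [], None
    for c in ace:
        if c in FIRST:  # a G-V letter starts a new character
            if value is not None:
                chars.append(chr(value))
            value = FIRST.index(c)
        elif value is None:
            raise ValueError("string must start with a G-V letter")
        else:
            value = value * 16 + REST.index(c)
    if value is not None:
        chars.append(chr(value))
    return "".join(chars)

print(encode("中"))           # KE2D   (U+4E2D, inside the BMP)
print(encode("\U00010400"))  # H0400  (U+10400, outside the BMP)
print(decode("KE2DH0400") == "中\U00010400")  # True
```

Nothing in the scheme depends on a character being four hex digits long, so
a code point above U+FFFF just costs one more nibble.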

-James Seng