
Re: [idn] length restrictions on IDN label



Paul Hoffman / IMC wrote:

> At 11:43 AM +0900 10/14/02, Soobok Lee wrote:
>
>> UTF8-encoded IDN labels are not governed by RFC1035 length restrictions?
>
> There is no such thing. IDN labels are always encoded in ASCII following the rules of STD 13, just as it says in the draft.

That is true only in protocols predating the IDNA draft.
IDN labels can be typed in, displayed, copied and pasted, or exchanged in UTF-8 (or another) encoding
in current and future application or protocol slots, as described in the IDNA draft itself.
See the enclosed excerpts from the IDNA draft (marked "SEE HERE").

I think a length restriction expressed in code points, rather than in octets, is needed.
IDNA is the right place to specify it.
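
To make the octets-versus-code-points distinction concrete, here is a minimal Python sketch that measures one label three ways. It rests on assumptions that are not in the draft text quoted here: the helper names are hypothetical, the "xn--" ACE prefix is the one later assigned for IDNA, the 63-octet limit is the RFC 1035 label limit, and the nameprep step is skipped, so the input is assumed to be already prepared.

```python
RFC1035_MAX_LABEL_OCTETS = 63  # per-label limit from RFC 1035 (STD 13)


def label_lengths(label: str) -> dict:
    """Measure a single label in code points, UTF-8 octets, and ACE octets."""
    if label.isascii():
        ace = label.encode("ascii")
    else:
        # Assumption: label is already nameprepped; only Punycode plus the
        # "xn--" prefix is applied, which is not the full ToASCII operation.
        ace = b"xn--" + label.encode("punycode")
    return {
        "code_points": len(label),
        "utf8_octets": len(label.encode("utf-8")),
        "ace_octets": len(ace),
    }


def fits_rfc1035(label: str) -> bool:
    """Check the ACE form of the label against the 63-octet limit."""
    return label_lengths(label)["ace_octets"] <= RFC1035_MAX_LABEL_OCTETS
```

The three counts generally differ for a non-ASCII label, so a limit stated in octets of one encoding does not translate into a fixed number of code points.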

Soobok Lee

6.3 DNS servers

Domain names stored in zones follow the rules for "stored strings" from
[STRINGPREP].

For internationalized labels that cannot be represented directly in
ASCII, DNS servers MUST use the ACE form produced by the ToASCII
operation. All IDNs served by DNS servers MUST contain only ASCII
characters.

If a signaling system which makes negotiation possible between old and
new DNS clients and servers is standardized in the future, the encoding
of the query in the DNS protocol itself can be changed from ACE to
something else, such as UTF-8. The question whether or not this should (SEE HERE)
be used is, however, a separate problem and is not discussed in this
memo.
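
As an illustration of the ASCII-only requirement in the excerpt above, the following sketch uses Python's built-in "idna" codec. That codec implements IDNA as later published in RFC 3490, so its details (including the "xn--" prefix) may differ from the draft quoted here. The function names and the example domain are hypothetical.

```python
def to_ascii(name: str) -> bytes:
    """Convert a dotted domain name to its ACE form, label by label."""
    return name.encode("idna")


def to_unicode(ace_name: bytes) -> str:
    """Convert an ACE-form domain name back to Unicode for display."""
    return ace_name.decode("idna")


ace = to_ascii("bücher.example")
print(ace)              # ASCII-only bytes, suitable for a DNS zone or query
print(to_unicode(ace))  # original Unicode form
```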



6.1 Entry and display in applications

(snip)

In protocols and document formats that define how to handle
specification or negotiation of charsets, labels can be encoded in any
charset allowed by the protocol or document format. If a protocol or
document format only allows one charset, the labels MUST be given in
that charset.

In any place where a protocol or document format allows transmission of
the characters in internationalized labels, internationalized labels
SHOULD be transmitted using whatever character encoding and escape ( SEE HERE )
mechanism that the protocol or document format uses at that place.

All protocols that use domain name slots already have the capacity for
handling domain names in the ASCII charset. Thus, ACE labels
(internationalized labels that have been processed with the ToASCII
operation) can inherently be handled by those protocols.


6. Implications for typical applications using DNS

In IDNA, applications perform the processing needed to input
internationalized domain names from users, display internationalized
domain names to users, and process the inputs and outputs from DNS and
other protocols that carry domain names.

The components and interfaces between them can be represented
pictorially as:

                 +------+
                 | User |
                 +------+
                    ^
                    | Input and display: local interface methods
                    | (pen, keyboard, glowing phosphorus, ...)
+-------------------|-------------------------------+
|                   v                               |
|          +-----------------------------+          |
|          |        Application          |          |
|          |   (ToASCII and ToUnicode    |          |
|          |      operations may be      |          |
|          |        called here)         |          |
|          +-----------------------------+          |
|                   ^        ^                      | End system
|                   |        |                      |
| Call to resolver: |        | Application-specific |
|              ACE  |        | protocol:            |
|                   v        | ACE unless the       |
|           +----------+     | protocol is updated  |
|           | Resolver |     | to handle other      |
|           +----------+     | encodings            | (SEE HERE)
|                 ^          |                      |
+-----------------|----------|----------------------+
    DNS protocol: |          |
              ACE |          |
                  v          v
       +-------------+    +---------------------+
       | DNS servers |    | Application servers |
       +-------------+    +---------------------+