
alpha v0.2



Take two! I added new comments, took out some, edited others and added a few
other clauses. Comments please.

A few points to discuss that I realised were missing.

1. Localization needs. For example, some queries were raised about
   writing names as <COUNTRY>.<2LD>.<DOMAIN>.<HOST>, the reverse of the
   current order. This makes more sense for some languages and country
   conventions.
   (Oh boy, this is going to get flamed!)

2. Localization needs again. Does a double-width period count as a domain
   delimiter like a single-width period? On a similar note, this is
   related to double-width vs single-width alphanumerics.

3. Localization needs once more. How do we handle right-to-left writing
   order, as in Arabic? One option is to treat this as a non-issue because,
   for example, MS Windows CP1256, which defines Arabic, actually encodes
   the domain name in the normal left-to-right byte order and the
   renderer reverses it. On the other hand, this may not apply on
   some other systems such as Mac or Unix.
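
One way to poke at point 2 is sketched below. It assumes Unicode
compatibility normalization (NFKC) is applied before a name is split into
labels; nothing in the draft mandates Unicode or NFKC, and split_labels is
a hypothetical helper used only for illustration.

```python
import unicodedata


def split_labels(name: str) -> list:
    """Split a domain name into labels after NFKC normalization.

    Hypothetical helper: NFKC maps the fullwidth full stop (U+FF0E) and
    fullwidth alphanumerics to their single-width ASCII counterparts.
    """
    normalized = unicodedata.normalize("NFKC", name)
    return normalized.split(".")


# Fullwidth "WWW" (U+FF37 x3), a fullwidth period (U+FF0E), then ASCII text.
wide_name = "\uFF37\uFF37\uFF37\uFF0Eexample.com"
print(split_labels(wide_name))
```

Under this assumption the double-width period acts as a delimiter and the
double-width letters compare equal to their single-width forms. NFKC leaves
the ideographic full stop (U+3002) untouched, so that character would need
a rule of its own.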

-James Seng
               Requirements of Internationalized Domain Names

Status of this Memo

    This document is an Internet-Draft and is in full conformance with
    all provisions of Section 10 of RFC2026.

    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF), its areas, and its working groups. Note that
    other groups may also distribute working documents as
    Internet-Drafts.

    Internet-Drafts are draft documents valid for a maximum of six
    months and may be updated, replaced, or obsoleted by other documents
    at any time. It is inappropriate to use Internet-Drafts as reference
    material or to cite them other than as "work in progress."

    To view the entire list of Internet-Draft Shadow Directories, see
    http://www.ietf.org/shadow.html.

    This Internet-Draft will expire on DD MMMM YYYY.

Copyright Notice

    Copyright (C) The Internet Society (2000). All Rights Reserved.

Abstract

    This informational document describes the requirements for encoding
    international characters into DNS names and records. This document
    should be considered guidance for developing solutions for
    internationalised domain names.
    
    This document is being discussed on the "idn" mailing list. To join
    the list, send a message to <majordomo@ops.ietf.org> with the words
    "subscribe idn" in the body of the message. Archives of the mailing
    list can also be found at ftp://ops.ietf.org/pub/lists/idn*.

Conventions
    
    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
    document are to be interpreted as described in [RFC2119].

    "I18C" is often used in this document to refer to internationalized
    characters, that is, characters outside US-ASCII.

    Characters mentioned in this document are identified by their 
    position in the Unicode character set. The notation U+12AB, for 
    example, indicates the character at position 12AB (hexadecimal) 
    in the Unicode character set. However, this is not an indication 
    of any requirement to use Unicode.

    "IDN" is used in this document as an abbreviation for 
    "internationalized domain name". This is defined as a domain name 
    that contains one or more characters that are outside the set of 
    characters specified as legal characters for domain names in 
    [RFC1034] Section 3.5.

    "RR" is used in this document as an abbreviation for "Resource
    Record" as defined in [RFC1035].

    A master server for a zone holds the main copy of that zone. This
    is sometimes stored in a zone file. A slave server for a zone
    holds a complete copy of the records within that zone. A caching
    server holds temporary copies of DNS records; it can use these
    records to answer identical queries, but only within the
    time-to-live (TTL) of the records. Further explanation of master
    and slave servers can be found in [RFC1034] and [RFC1996].


Table of Contents
    1. Introduction
    ...
    ...


1. Introduction

    This informational document describes the requirements for encoding
    international characters into DNS names and records. This document
    should be considered guidance for developing solutions for
    internationalised domain names (IDN).

    ...

    Examples quoted in this document are intended only to help explain
    the meanings and principles adopted by the document. Solutions are
    not required to satisfy the examples.

2. General Requirements

2.1 Compatibility and Interoperability

The DNS is essential to the entire Internet. Therefore, IDN must not
damage present DNS interoperability. It must make the minimum amount of
changes to existing protocols on all layers of the stack. It must
continue to allow any system anywhere to resolve any domain name.

Implementations of IDN must preserve the basic concepts and facilities
of domain names as described in [RFC1034]. They must maintain a single,
global, universal and consistent hierarchical namespace.

The same name resolution request should generate the same response, 
regardless of the location (or localisation settings) of the resolver, 
the master server and any slave or caching servers involved. 

IDN should also allow a caching server to be built which does not
understand the charset in which a request (or response) is encoded, and
which works as well for IDNs as in the ASCII-only case. The caching
server performs correctly if it gives essentially the same answer as
the master server would have given if presented with the same request
(of course without the authoritative bit).
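
A minimal sketch of such a charset-agnostic cache follows. It treats the
query name as opaque bytes and never decodes it. The class name
OpaqueCache, its methods and the explicit "now" parameter are assumptions
made for illustration, and the sketch deliberately ignores the ASCII
case-insensitivity that real DNS caches apply.

```python
import time
from typing import Dict, Optional, Tuple


class OpaqueCache:
    """Cache keyed on raw query bytes; the name is never decoded."""

    def __init__(self) -> None:
        # (qname bytes, qtype) -> (expiry timestamp, answer bytes)
        self._store: Dict[Tuple[bytes, int], Tuple[float, bytes]] = {}

    def put(self, qname: bytes, qtype: int, answer: bytes, ttl: float,
            now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._store[(qname, qtype)] = (now + ttl, answer)

    def get(self, qname: bytes, qtype: int,
            now: Optional[float] = None) -> Optional[bytes]:
        now = time.time() if now is None else now
        entry = self._store.get((qname, qtype))
        if entry is None:
            return None
        expires_at, answer = entry
        if now >= expires_at:
            # The TTL has run out, so the record may no longer be served.
            del self._store[(qname, qtype)]
            return None
        return answer


cache = OpaqueCache()
# UTF-8 is used here only to produce some non-ASCII bytes.
key = "\u4f8b.example".encode("utf-8")
cache.put(key, 1, b"rdata", ttl=60, now=0)
print(cache.get(key, 1, now=30))
print(cache.get(key, 1, now=60))
```

Because the key is compared byte for byte, two encodings of what a user
would call the same name are cached as different entries. The cache still
returns whatever the master server returned for the identical request,
which is the property described above.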

If the IDN implementation specifies a canonicalisation algorithm then 
a caching server should perform correctly regardless of how much (or 
how little) of that algorithm it has implemented. 

Implementors may propose modifications to the DNS protocol [RFC1035] and
other related work undertaken by the [DNSEXT] WG. However, these changes
should be as minimal as possible and must be approved by the DNSEXT
WG.

The best solution is one that maintains complete compatibility with 
current DNS standards as long as it meets the other requirements in 
this document. 

[JS: ?? i am not sure if there can be such solution!! :P]

"There can be no "flag days" nor a split DNS."

[JS: I do not understand what this means. I think we need to elaborate
this further]

2.2 Internationalization (I18N)

Internationalized characters (I18C) must be allowed to be represented
and used in DNS names and records. Implementations must specify which
character set is used and how these characters are encoded in
domain names and DNS records.

This document does not recommend any character set for I18N. However,
non-standard character sets must not be used, to avoid duplicating work
on general I18N. If multiple character sets are used then the
implementation must specify all the character sets being used and for
what purpose.

IDN should not make any assumptions about where in the domain name I18C
might appear. In other words, it should not differentiate between any
parts of a domain name, as doing so may impose a restriction on future
I18N efforts.

IDN should also not impose any cultural restrictions in the protocol.
For example, an IDN implementation which only allows a domain name to
use a single script would immediately restrict multinational
organisations.

IDN must be able to handle the localized requirements of different
languages. For example, IDN must be able to handle the right-to-left
writing order of Arabic.
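
As an illustration of what right-to-left handling can look like, the
sketch below assumes Unicode strings and the Windows CP1256 codec (the
draft requires neither). The stored order is the order in which the
characters are read; the right-to-left direction is a character property
that the renderer acts on.

```python
import unicodedata

# An Arabic label in reading order: MEEM (U+0645), SAD (U+0635), REH (U+0631).
label = "\u0645\u0635\u0631"

# Encoding keeps the first-read character first; no reversal happens here.
encoded = label.encode("cp1256")
first_char = encoded[:1].decode("cp1256")

# Bidi class "AL" (Arabic Letter) is what tells a renderer to draw the
# text from right to left.
bidi_classes = [unicodedata.bidirectional(ch) for ch in label]

print(first_char == label[0])
print(bidi_classes)
```

Under these assumptions nothing in the stored name needs to be reversed.
Whether the same holds on every platform and charset is an open question.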

In addition, IDN must

1. provide a record which can contain internationalised text 
   (similar to TXT RR). [1 request to remove this as it is not IDN]

--- need comments ---
Must allow I18C in DNS queries.
Must allow I18C in DNS RR response.
Must allow I18C in DNS TXT records. 
Must allow I18C in DNS CNAME records.
Must allow I18C in DNS PTR records.
---

2.3 Canonicalization

Matching rules are the most complicated part of I18N of domain
names. Canonicalization of characters must follow precise and
predictable rules to ensure consistency.

In order to retain backward compatibility, the implementation must
(should?) retain case-insensitive comparison for US-ASCII as
specified in [RFC1035] Section 2.3.3.

For example, Latin capital letter A (U+0041) must match Latin small
letter A (U+0061).
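
A minimal sketch of the US-ASCII-only rule follows. The helper names
ascii_lower and labels_match are hypothetical; only the letters A to Z
are folded and every other character is compared as-is.

```python
def ascii_lower(label: str) -> str:
    """Lowercase only A-Z (U+0041..U+005A); leave other characters alone."""
    return "".join(
        chr(ord(ch) + 0x20) if "A" <= ch <= "Z" else ch
        for ch in label
    )


def labels_match(a: str, b: str) -> bool:
    """Compare two labels case-insensitively for US-ASCII letters only."""
    return ascii_lower(a) == ascii_lower(b)


print(labels_match("\u0041", "\u0061"))      # Latin A vs a
print(labels_match("Example", "eXAMPLE"))
print(labels_match("\u00C5", "\u00E5"))      # outside US-ASCII, not folded
```

The last comparison is false under this rule because U+00C5 and U+00E5
lie outside US-ASCII, so this rule alone says nothing about them.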

If other canonicalization is done, then it
1. must be done before the domain name is resolved.
2. must be easily upgradable as new languages and writing systems
   are added.

[CHARREQ] is recommended as a guide on canonicalisation.

Any conversion (case, ligature folding, punctuation folding, ...) from 
what the user enters into a client to what the client asks for 
resolution MUST be done identically on all requests.

"Thus, it must be specified in the protocol, not in the requirements
document. The requirements document might list the kinds of conversions
we might expect, but should not mandate where the conversions happen."

[JS: In this case, can I remove the rest of the section? Personally,
I think this is one of the most troublesome parts of IDN and if this
is not cleared up now in the requirements, we might get into a lot of
arguments in future.]

Case folding should also be used.

For example, Latin capital A with a ring above (U+00C5) should match
Latin small A with a ring above (U+00E5).

[JS: This opens up a can of worms for context-sensitive folding. What
about CJK? How is the folding to be done?]

On the other hand, similar glyphs assigned different code points in a
character set should be treated as different.

For example, Cyrillic A (U+0410) should not match Latin A (U+0041).
For example, Greek capital letter omicron (U+039F) should not match
Latin capital letter O (U+004F).
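
The two expectations in this section (case variants match, look-alike
glyphs do not) can be checked with a small sketch. It assumes Unicode case
folding as exposed by Python's str.casefold, which is one possible folding
and not one chosen by this document; folded_equal is a hypothetical helper.

```python
def folded_equal(a: str, b: str) -> bool:
    """Compare two strings after Unicode case folding."""
    return a.casefold() == b.casefold()


print(folded_equal("\u00C5", "\u00E5"))  # A with ring above: capital vs small
print(folded_equal("\u0410", "\u0041"))  # Cyrillic A vs Latin A
print(folded_equal("\u039F", "\u004F"))  # Greek omicron vs Latin O
```

The first comparison is true and the other two are false, because case
folding works within a script and never maps a Cyrillic or Greek letter
onto a Latin one. It does not answer the question about context-sensitive
folding or CJK raised above.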

2.4 Operational Issues

Zone files should remain easily editable.

The character set of a signed zone file should be capable of being the
same as the character set of the unsigned zone file.

An IDN-capable resolver or server should not generate any more traffic
than a non-IDN-capable resolver or server.

IDN should add no new centralized administration for the DNS. A domain 
administrator should be able to create internationalized names as 
easily as adding current domain names. 

IDN must allow offline DNSSEC signing. It should also be possible to look
at the signed file and see that it is the same as the unsigned one.

2.5 Others

The DNS protocol should remain deterministic. No DNS element (resolver,
server or zone file) should be required to do guesswork.
[JS: One request to remove this.]

3. Specific Requirements

3.1 Client Requirements

3.2 Server Requirements

3.3 Zone file Requirements

4. Technical Analysis

There are many standard protocols and RFCs which are dependent on the
domain name and have made various assumptions about its character set.
Therefore, any implementation must contain a summary of the
compatibility issues and security considerations with, but not limited
to, the following protocols:

<...list the sets of RFCs for which we would like to have a summary...>

In addition, the implementation document must contain a summary of the
technical opinion of the working group.

5. Security Considerations

Any solution that meets the requirements in this document must not 
be less secure than the current DNS. Specifically, the mapping of 
internationalized host names to and from IP addresses must have the 
same characteristics as the mapping of today's host names.

Specifying requirements for internationalized domain names does not 
itself raise any new security issues. However, any change to the DNS 
may affect the security of any protocol that relies on the DNS or on 
DNS names. A thorough evaluation of those protocols for security 
concerns will be needed when they are developed.

References

    [RFC2119]   "Key words for use in RFCs to Indicate Requirement 
                Levels", rfc2119.txt, March 1997, S. Bradner.

    [RFC1034]   "Domain Names - Concepts and Facilities", rfc1034.txt,
                November 1987, P. Mockapetris

    [RFC1035]   "Domain Names - Implementation and Specification", 
                rfc1035.txt, November 1987, P. Mockapetris

    [RFC1996]   "A Mechanism for Prompt Notification of Zone Changes
                (DNS NOTIFY)", rfc1996.txt, August 1996, P. Vixie

    [CHARREQ]   "Requirements for string identity matching and String
                Indexing", http://www.w3.org/TR/WD-charreq, July 1998,
                World Wide Web Consortium

    [DNSEXT]    "IETF DNS Extensions Working Group", 
                namedroppers@internic.net, Olafur Gudmundson, Randy Bush

Author's Address


Appendix A. Acknowledgements

    The editor gratefully acknowledges the contributions of:

    Harald Tveit Alvestrand <Harald@Alvestrand.no>
    Martin Duerst <duerst@w3.org>
    Patrik Faltstrom <paf@swip.net>
    Andrew Draper <ADRAPER@altera.com>
    Bill Manning <bmanning@ISI.EDU>
    Paul Hoffman <phoffman@imc.org>
    James Seng <jseng@pobox.org.sg>
    Randy Bush <randy@psg.com>
    Alan Barret <apb@cequrux.com>

    as authors of corresponding sections and the contributions of:

    for their useful comments.


Full Copyright Statement

    Copyright (C) The Internet Society (2000). All Rights Reserved.

    This document and translations of it may be copied and furnished to
    others, and derivative works that comment on or otherwise explain it
    or assist in its implementation may be prepared, copied, published
    and distributed, in whole or in part, without restriction of any
    kind, provided that the above copyright notice and this paragraph
    are included on all such copies and derivative works. However, this
    document itself may not be modified in any way, such as by removing
    the copyright notice or references to the Internet Society or other
    Internet organizations, except as needed for the purpose of
    developing Internet standards in which case the procedures for
    copyrights defined in the Internet Standards process must be
    followed, or as required to translate it into languages other than
    English.

    The limited permissions granted above are perpetual and will not be
    revoked by the Internet Society or its successors or assigns.

    This document and the information contained herein is provided on an
    "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
    TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
    BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
    HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
    MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

    Funding for the RFC editor function is currently provided by the
    Internet Society.