
Re: [idn] Suggested clarifications of the IDN requirements document



> I suggest the following changes to the IDN requirements document, version -03:
> 
> [2.5] The DNS service layer (the packet formats that go on the wire)
> MUST NOT limit the codepoints that can be used. This interface SHOULD
> NOT assign meaning to name strings; the application service layer,
> where "gethostbyname" et al reside, MAY constrain the name strings to
> be used in certain services. (conflict)
> 
> Change to
> 
> [2.5] The DNS protocol (the packet formats that go on the wire) MUST NOT
> limit the codepoints that can be used.
> A service defined on top of the DNS, for instance the IDN-to-Address function,
> MAY limit the codepoints that can be used.
> The service description MUST describe what limitations are imposed.

this is mostly okay with me, though I still don't see where the
MUST NOT requirement derives from.

> ------------------
> [4] The protocol SHOULD allow creation of caching servers that do
> not understand the charset in which a request or response is encoded.
> The caching server SHOULD perform correctly for IDN as well as for
> current domain names (without the authoritative bit) as the master
> server would have if presented with the same request.
> 
> Change to
> 
> [4] The protocol MUST NOT require that current cache servers be modified
> to support IDN. If a cache server can have additional functionality to
> support IDN better, this additional functionality MUST NOT cause problems
> for resolving current domain names.

I might insert "correctly functioning" between "current" and "cache servers".
I don't think we should constrain ourselves to support broken DNS caches,
especially if those broken caches are not widely deployed.

> -------------
> [18] The protocol SHOULD NOT place any restrictions on the
> application service layer. It SHOULD only specify changes in the DNS
> service layer and within the DNS itself.
> 
> Suggest to delete this requirement.

agreed.

> -------------
> [37] The protocol MUST work for all features of DNS, IPv4, and IPv6.
> 
> Change to:
> 
> [37] The protocol MUST support the following operations:
> - Mapping an IDN to IPv4 addresses
> - Mapping an IDN to IPv6 addresses
> - Mapping an IDN to an MX record
> The protocol SHOULD support the following operations:
> - Mapping an IPv4 address to an IDN
> - Mapping an IPv6 address to an IDN
> The protocol MAY support other operations.
> The protocol MUST NOT allow an IDN to be returned to a requester
> that requests the IP-to-(old)-domain-name mapping service.
> 
> I suggest also that this requirement be moved to be [2.6] - I think it is
> critical for knowing what solutions can be allowed, and should be early in
> the document.

First, I would make the MUST NOT requirement separate from the others.

As for the rest - at this point I wonder whether either the old or the 
new language is appropriate.  I'm inclined to say that this kind of 
constraint is premature.

I think we need two things from IDNs:

  1. ability to define easily remembered, and easily transcribed, 
  DNS-like  names in any language that people normally use...

  (but we're not trying to look up arbitrary strings or even arbitrary
  "human friendly names", and we'd like to avoid the possibility of an 
  IDN having multiple meanings if we can.  we also assume that 
  "languages that people normally use" are, or will eventually be, covered 
  by 10646)
  
  ...and which are subdivided in a way that is compatible with DNS hierarchy

  (we want IDNs to be a logical superset of present-day DNS names, and
  we don't want conflicts between IDNs and present-day DNS names)
  
  2. ability to use those names to identify and find service locations
  of any internet service which can be found using existing DNS.
  
but this does not imply direct mapping of IDNs to specific DNS RR types,
nor does it imply "work with all features".

present-day DNS requires exact matches, modulo case sensitivity.
the user must type in the exact sequence of characters that form the DNS
name that identifies the service he is interested in, and the client 
must be able to unambiguously map those characters into the same binary 
representation as that used by the DNS. 
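
to make the exact-match point concrete, here is a minimal sketch in Python. the helper name dns_names_match, the example names, and the choice of UTF-8 and NFC are all assumptions made for illustration, not anything the requirements document specifies. it shows that folding ASCII case is enough for present-day names, while two visually identical non-ASCII names can still differ at the byte level unless the client normalizes them the same way the server side did.

```python
import unicodedata


def dns_names_match(a: bytes, b: bytes) -> bool:
    """Compare two names byte for byte, folding only ASCII A-Z to a-z."""
    def fold(name: bytes) -> bytes:
        return bytes(c + 32 if 0x41 <= c <= 0x5A else c for c in name)
    return fold(a) == fold(b)


def nfc_bytes(name: str) -> bytes:
    """Assumed client-side step: normalize to NFC, then encode as UTF-8."""
    return unicodedata.normalize("NFC", name).encode("utf-8")


# The same visible name typed two ways (hypothetical example names).
composed = "caf\u00e9.example"     # e-acute as a single code point
decomposed = "cafe\u0301.example"  # 'e' followed by a combining acute accent

# Present-day names: exact match apart from ASCII case.
ascii_match = dns_names_match(b"WWW.Example.COM", b"www.example.com")

# Without an agreed normalization, the byte sequences differ.
raw_match = dns_names_match(composed.encode("utf-8"),
                            decomposed.encode("utf-8"))

# With the same normalization on both sides, they agree again.
normalized_match = dns_names_match(nfc_bytes(composed), nfc_bytes(decomposed))

print(ascii_match, raw_match, normalized_match)
```

normalization only covers differences in encoding; it does nothing for names that a person transcribes with a genuinely different spelling.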

if this turns out to be infeasible for some languages, then either

 - we "live with" less utility of IDNs for those languages
   (to an extent, we do this in English already, by discouraging
   the use of certain characters like "_" or "&" or " " in DNS.  
   similar conventions might apply to other languages but they 
   would need to be defined)

   or

 - we need to incorporate in IDN protocols some way of resolving the 
   ambiguity that can result when names are transcribed - say a feedback
   mechanism.

for the sake of backward compatibility with applications that don't 
understand IDNs, there's probably also a need to be able to come up 
with a pure-ASCII DNS name for any service named by an IDN.
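
as an illustration only, here is what a reversible pure-ASCII form could look like. this sketch uses the "idna" codec from Python's standard library as a stand-in; the choice of that particular encoding and the example name are assumptions, and nothing above commits us to any specific ASCII mapping.

```python
# A hypothetical IDN, written with an escape so the code point is unambiguous.
idn = "b\u00fccher.example"

# Assumed mapping: Python's built-in "idna" codec yields an ASCII-only name.
ascii_name = idn.encode("idna")

# The mapping is reversible, so an IDN-aware application can recover the IDN.
round_trip = ascii_name.decode("idna")

print(ascii_name)
print(round_trip == idn)
```

an application that knows nothing about IDNs would simply see and pass along the ASCII form.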

these might imply that it is better to have a separate layer 
for mapping IDNs onto DNS names for the purpose of resolving 
transcription-induced errors or ambiguity.  such a layer would 
introduce the possibility for user feedback (for IDNs that are typed in),
and would make a clear distinction between potentially ambiguous
names typed in by users, and unambiguous names wired into various 
documents (such as URLs in HTML).

so we need to find out whether it is feasible to transcribe names
in all languages that people use, in such a way that the client
can be expected to generate the same binary representation of
that name as is used by the IDN system, or whether we will need
an IDN system that can tolerate variant spellings/representations
of IDNs, identify potential ambiguities, and resolve them.

naturally we'd like to be able to avoid the extra layer if possible,
but I don't think we've yet determined whether we can do that.

note that an extra layer doesn't necessarily imply a significant
performance impairment - if we had a layer that mapped an IDN to a DNS
name (perhaps allowing the user to select from multiple names)
we could also have that layer return the DNS RRs corresponding 
to each candidate IDN.  
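
a rough sketch of such a layer, in Python, might look like the following. everything in it is hypothetical: the Candidate type, the in-memory REGISTRY, the loose_key matching rule (drop combining marks, fold case), and the names and addresses, which come from documentation ranges. the point is only that one query can return every plausible candidate together with its DNS name and RRs, so a user choosing among them costs no extra round trip.

```python
import unicodedata
from dataclasses import dataclass


@dataclass
class Candidate:
    idn: str        # the name as registered
    dns_name: str   # the pure-ASCII DNS name it maps to (made up here)
    records: list   # the RRs for that DNS name, as (type, data) pairs


def loose_key(name: str) -> str:
    """Assumed tolerant matching rule: strip combining marks and fold case."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    return stripped.casefold()


# Hypothetical registry held by the mapping layer.
REGISTRY = [
    Candidate("caf\u00e9.example", "cafe-fr.example", [("A", "192.0.2.10")]),
    Candidate("cafe.example", "cafe.example", [("A", "192.0.2.20")]),
    Candidate("other.example", "other.example", [("A", "192.0.2.30")]),
]


def resolve_typed(name: str) -> list:
    """Return every registered candidate that the typed name could mean."""
    key = loose_key(name)
    return [c for c in REGISTRY if loose_key(c.idn) == key]


# A user types the name with odd case and a decomposed accent.
hits = resolve_typed("Cafe\u0301.EXAMPLE")
for c in hits:
    print(c.idn, c.dns_name, c.records)
```

when more than one candidate comes back, the client has found an ambiguity and can ask the user to pick; when exactly one comes back, it already has the RRs it needs. names wired into documents would skip this layer and use the unambiguous DNS name directly.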

Keith

p.s. for the purpose of discussion I would suggest three kinds of
transcription that need to be supported at a minimum:

1. from visual media to memory to typed characters
2. from visual media to spoken words to typed characters
3. from visual media to typed characters

the sight-impaired might also want IDNs that can be pronounced 
by a computer in such a way that they can be typed back in
without error.