
[idn] Last minute editorial clarifications on IDNA



The discussion on the list about the IDNA problem statement etc., plus
some eleventh-hour review comments about technical clarity, has brought forth
this list of editorial clarifications, with some additions to the security
considerations section. If anybody thinks these changes are more than
editorial changes, then please let me know.

Authors, can we apply these to the IDNA specification?

  Erik


1. In section 2 the word "integral" is used.
The use of "integral" might raise the question "integral of what".
Please change it to "integer".

2. In section 2 make the definition of Internationalized Label
clearer about what "can be applied" means, since it doesn't
say anything explicit about the UseSTD3ASCIIRules flag.
	An "internationalized label" is a label to which the ToASCII operation
	(see section 4) can be applied without failing. This implies that every
Add "(with the UseSTD3ASCIIRules flag unset)" after "without failing".

3. In section 2 make it clearer that all ASCII names are IDNs
by adding the parenthetical:
    This implies that every ASCII domain name is an IDN (which implies
    that it is possible for a name to be an IDN without it containing any
    non-ASCII characters).
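
A rough sketch of what items 2 and 3 imply, using the ToASCII function in
Python's standard encodings.idna module (which runs with UseSTD3ASCIIRules
unset). The helper names is_internationalized_label and is_idn are made up
for this illustration, and splitting on "." only is a simplification.

```python
from encodings import idna  # stdlib IDNA helpers: ToASCII / ToUnicode


def is_internationalized_label(label: str) -> bool:
    """Hypothetical helper: True if ToASCII succeeds on this label."""
    try:
        idna.ToASCII(label)  # raises UnicodeError (or a subclass) on failure
        return True
    except UnicodeError:
        return False


def is_idn(name: str) -> bool:
    """Hypothetical helper: every label must be an internationalized label."""
    # Assumption: labels are separated by U+002E only.
    return all(is_internationalized_label(label) for label in name.split("."))
```

Under this sketch a plain ASCII name such as "www.example.com" counts as an
IDN even though it contains no non-ASCII characters.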

4. In section 4 make it clearer when UseSTD3ASCIIRules should be set
by adding the lines starting with '+':
    For each label, decide whether or not to enforce the restrictions on
    ASCII characters in host names [STD3].
  + (Applications already faced this choice before the introduction of
  + IDNA, and can continue to make the decision the same way they always
  + have; IDNA makes no new recommendations regarding this choice.)
    If the restrictions are to be enforced, set the flag called
    "UseSTD3ASCIIRules" for that label.
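
For readers unfamiliar with the STD3 restrictions, here is a minimal sketch
of the check that the flag turns on, assuming the usual reading of the host
name rules: ASCII characters must be letters, digits or hyphen, and the label
must not begin or end with a hyphen. The function name violates_std3 is
hypothetical.

```python
import re

# LDH = letters, digits, hyphen: the ASCII characters allowed in host names.
_LDH = re.compile(r"[A-Za-z0-9-]+")


def violates_std3(label: str) -> bool:
    """Hypothetical check applied only when UseSTD3ASCIIRules is set."""
    # Only the ASCII code points are examined; non-ASCII ones are left alone.
    ascii_chars = "".join(ch for ch in label if ord(ch) < 0x80)
    has_non_ldh = bool(ascii_chars) and _LDH.fullmatch(ascii_chars) is None
    bad_hyphen = label.startswith("-") or label.endswith("-")
    return has_non_ldh or bad_hyphen
```

An application that has always accepted labels such as "my_host" would leave
the flag unset and skip this check entirely.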

5. In section 4 before the start of section 4.1 add text making clear
that the protocol only specifies the externally visible behavior
and not e.g. that ToASCII and ToUnicode be actual APIs:

	This description of the protocol uses specific procedure names and
 	names of flags, and so on, in order to facilitate the specification of
 	the protocol. These names as well as the actual steps of the
 	procedures are not required of an implementation.
 	In fact, any implementation which has the same external
 	behavior as specified in this document conforms to this specification.

6. Section 4, step (4) can be interpreted as banning the display of an ACE,
which would be inconsistent with section 3.1. That the two are in fact
consistent could be made clearer by restating step (4) as:
    Process each label with either the ToASCII or the ToUnicode
    operation as appropriate.  Typically, you use the ToASCII operation
    if you are about to put the name into an IDN-unaware slot, and you
    use the ToUnicode operation if you are displaying the name to a
    user; section 3.1 gives greater detail on the applicable requirements.
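
The choice described in the restated step (4) might look like this in
practice. This is only a sketch: the function names are invented, the
ToASCII and ToUnicode calls are the ones in Python's encodings.idna module,
and names with empty labels (e.g. a trailing dot) are not handled.

```python
from encodings import idna


def for_idn_unaware_slot(name: str) -> str:
    """Hypothetical helper: apply ToASCII to each label before storing."""
    return ".".join(
        idna.ToASCII(label).decode("ascii") for label in name.split(".")
    )


def for_display(name: str) -> str:
    """Hypothetical helper: apply ToUnicode to each label before display."""
    return ".".join(idna.ToUnicode(label) for label in name.split("."))


print(for_idn_unaware_slot("b\u00fccher.example"))  # xn--bcher-kva.example
print(for_display("xn--bcher-kva.example"))
```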

7. In section 4.1 there is a use of "needs to have" which might be
confusing, since it isn't clear whether it is a requirement or something
else. The actual requirement is in the preceding sentence.
Thus it makes sense to rephrase the last sentence in this paragraph:
> It is important to note that the ToASCII operation can fail. ToASCII
> fails if any step of it fails. If any step of the ToASCII operation
> fails on any label in a domain name, that domain name MUST NOT be used
> as an internationalized domain name. The application needs to have some
> method of dealing with this failure.
to be:
	The method for dealing with this failure is application-specific.
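
One possible application-specific policy, sketched under the assumption that
the caller wants to reject the whole name as soon as any label fails. The
name to_ascii_or_none is hypothetical; other applications might log, prompt
the user, or report which label failed.

```python
from encodings import idna
from typing import Optional


def to_ascii_or_none(name: str) -> Optional[str]:
    """Hypothetical policy: one failing label invalidates the whole name."""
    converted = []
    for label in name.split("."):
        try:
            converted.append(idna.ToASCII(label).decode("ascii"))
        except UnicodeError:
            # ToASCII failed on this label, so the name MUST NOT be used
            # as an internationalized domain name.
            return None
    return ".".join(converted)
```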

8. The fifth paragraph of 6.2 uses the phrase "other formats
than ASCII".  ASCII is not a format.  Guesses as to
what this might mean open up loopholes in the specification.
I suggest
	will be able to accept domain names in other formats than ASCII, and
s/formats/charsets/

9. Add this in an appropriate place in the security considerations section:
    The introduction of IDNA means that any existing labels that
    start with the ACE prefix and would be altered by ToUnicode will
    automatically be ACE labels, and will be considered equivalent to
    non-ASCII labels, whether or not that was the intent of the zone
    administrator or registrant.
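
A zone administrator could audit existing labels for this situation with
something like the following sketch. The helper name is made up, and note
that Python's ToUnicode raises an exception in cases where the specified
operation would simply return its input unchanged, so the sketch treats an
exception as "not altered".

```python
from encodings import idna

ACE_PREFIX = "xn--"


def becomes_ace_label(label: str) -> bool:
    """Hypothetical audit: would ToUnicode alter this existing ASCII label?"""
    if not label.lower().startswith(ACE_PREFIX):
        return False
    try:
        return idna.ToUnicode(label) != label
    except UnicodeError:
        # Assumption: a failure here corresponds to ToUnicode leaving the
        # label unchanged, so it is not treated as an ACE label.
        return False
```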

10. In section 4.2 add a statement about the length of the ToUnicode output
after the second paragraph. This statement is probably helpful for
implementors trying to avoid buffer overflow problems.
    The ToUnicode output never contains more code points than its input.
    Note that the number of octets needed
    to represent a sequence of code points depends on the particular
    character encoding used.
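
A small illustration of both sentences, using Python's encodings.idna
ToUnicode on a single example label (the label and the choice of encodings
are only for illustration):

```python
from encodings import idna

ace = "xn--bcher-kva"
uni = idna.ToUnicode(ace)

# Code points: the output is never longer than the input.
print(len(ace), len(uni))  # 13 6

# Octets: the size depends on the character encoding chosen for the output.
sizes = {enc: len(uni.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}
print(sizes)  # {'utf-8': 7, 'utf-16-le': 12, 'utf-32-le': 24}
```

So a buffer sized by the input's code point count is enough, but a buffer
sized by the input's octet count may not be: 24 octets of UTF-32 exceed the
13 octets of the ASCII input.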

11. In the security considerations section point out that the
permission to impose registration restrictions in section 2 might be
useful in some cases to minimize homographs.
After:
	To help prevent confusion between characters that are visually similar,
	it is suggested that implementations provide visual indications where a
	domain name contains multiple scripts. Such mechanisms can also be used
	to show when a name contains a mixture of simplified and traditional
	Chinese characters, or to distinguish zero and one from O and l.
ADD:
	DNS zone administrators may impose restrictions (subject to
	the limitations in section 2) that try to minimize homographs.
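
As a very rough idea of what such a restriction or visual indication could
be based on, here is a heuristic sketch that flags labels mixing scripts.
It is an assumption of this example, not part of IDNA: Python's standard
library has no script property, so the first word of each letter's Unicode
character name stands in for its script.

```python
import unicodedata


def scripts_in_label(label: str) -> set:
    """Heuristic: approximate each letter's script from its character name."""
    scripts = set()
    for ch in label:
        if ch.isalpha():  # digits and hyphens carry no script information
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def looks_mixed_script(label: str) -> bool:
    """Hypothetical registration check: more than one script in a label."""
    return len(scripts_in_label(label)) > 1


# "p\u0430ypal" uses CYRILLIC SMALL LETTER A in place of the Latin "a".
print(looks_mixed_script("p\u0430ypal"))  # True
print(looks_mixed_script("paypal"))       # False
```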

---