Re: [idn] Few comments on idna



Marc Blanchet <Marc.Blanchet@viagenie.qc.ca> wrote:

> here is few comments on idna.
> 
> - p2. "To allow such a label to be handled by existing applications, an 
> "ACE label" is defined ..."
> + ACE should be spelled: ASCII Compatible Encoding.

Good catch, "ASCII Compatible Encoding" never appears anywhere in the
IDNA draft.  It's awkward to insert in the middle of "ACE label", but we
could add "(ACE stands for ASCII Compatible Encoding)" at the end of the
sentence.

> - 3. 2). "When requirements 1 and 2 both apply, requirement 1 takes 
> precedence".
> + My understanding is the inverse: i.e. requirement 2 takes precedence. I 
> might be wrong.

You are wrong.  :)

The ASCII form might be ugly, but software won't choke on it.  The
non-ASCII form is pretty, but some software will choke on it, and that's
more important.  So requirement 1 takes precedence.  It basically says
don't pass non-ASCII domain names to software that might choke on them.
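
To make the precedence concrete, here is a rough sketch in Python.  It
is not from the draft: the function name choose_form and the flag
consumer_handles_non_ascii are made up for illustration, and the sketch
assumes both forms of the label are already available.

```python
def choose_form(unicode_form, ace_form, consumer_handles_non_ascii):
    """Pick which form of a label to hand to a consumer (hypothetical helper)."""
    # Requirement 1 is checked first: if the consumer might choke on
    # non-ASCII, it gets the ASCII form, however ugly that form is.
    if not consumer_handles_non_ascii:
        return ace_form
    # Only when requirement 1 does not apply may the pretty form be used.
    return unicode_form
```

The point is only the order of the checks: the "pretty" branch is
reachable only after the "won't choke" condition has been satisfied.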

> - 4.1 ToASCII and 4.2 ToUnicode
> + the algorithm should more clearly state what happens when the "Verify 
> actions" result to fail.

Section 4.1 says "ToASCII fails if any step of it fails.  Failure
means that the original sequence cannot be used as a label in an IDN."
Section 4.2 says "ToUnicode never fails.  If any step fails, then the
original input sequence is returned immediately in that step."  How can
we be any clearer?
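
For what it's worth, the two failure rules could be sketched like this
in Python.  This is only an illustration of the quoted sentences, not
the draft's algorithm: the step list, the StepFailure exception, and the
reject_spaces check are all invented for the example.

```python
class StepFailure(Exception):
    """Raised by any step (including a verification step) that fails."""


def to_ascii(label, steps):
    # ToASCII fails if any step fails: the exception simply propagates,
    # meaning the input cannot be used as a label in an IDN.
    for step in steps:
        label = step(label)
    return label


def to_unicode(label, steps):
    # ToUnicode never fails: if any step fails, the original input
    # sequence is returned unchanged.
    original = label
    try:
        for step in steps:
            label = step(label)
    except StepFailure:
        return original
    return label


def reject_spaces(label):
    # Hypothetical verification step, for illustration only.
    if " " in label:
        raise StepFailure("space in label")
    return label
```

With these definitions, to_ascii("a b", [reject_spaces]) raises
StepFailure, while to_unicode("a b", [reject_spaces]) hands back "a b"
untouched.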

> - 6.1 Entry and display in applications.
> "if it does, rendering the ACE SHOULD NOT be the default."
> + we should make it clear that the prefix MUST be shown if the ACE
> version is shown.  This is probably obvious for us, but the document
> is not clear about the fact that when "ACE" is used, it means with
> or without the prefix.

Section 2 says "The 'ACE prefix' is defined in this document to be
a string of ASCII characters that appears at the beginning of every
ACE label." and "The conversion of labels to and from the ACE form is
specified in section 4."  Section 4 defines ToASCII in such a way that
ACE labels always begin with the ACE prefix.  Section 5, "ACE prefix",
even gives an example:

    For example, the eventual ACE prefix might be the string "jk--".
    In this case, an ACE label might be "jk--r3c2a-qc902xs", where
    "r3c2a-qc902xs" is the part of the ACE label that is generated by
    the encoding steps in [PUNYCODE].

I have a hard time imagining what would lead someone to think that "ACE"
refers to the part after the prefix.  But if Patrik or Paul wants to
make an editorial change emphasizing that, I don't see what harm it
would cause.
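
As an illustration of "the ACE label includes the prefix", here is a
small Python sketch.  It makes two assumptions: the prefix "jk--" is
just the placeholder from the draft's example, and Python's built-in
"punycode" codec stands in for the encoding steps in [PUNYCODE] (the
nameprep and verification steps of ToASCII are left out entirely).  The
function names are hypothetical.

```python
ACE_PREFIX = "jk--"  # placeholder from the draft's example, not a final value


def to_ace_label(label):
    # Encoding step only; the result always starts with the ACE prefix.
    encoded = label.encode("punycode").decode("ascii")
    return ACE_PREFIX + encoded


def is_ace_label(label):
    # A string without the prefix is not an ACE label at all.
    return label.startswith(ACE_PREFIX)


def from_ace_label(label):
    # Leave anything that is not an ACE label unchanged.
    if not is_ace_label(label):
        return label
    return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
```

Under these assumptions, to_ace_label("bücher") gives "jk--bcher-kva",
and the bare encoder output "bcher-kva" is not treated as an ACE label.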

> - title: Punycode version 0.3.3.
> + the title should be changed to something like punycode version 1.0. 

Agreed.  In the next revision it will be 1.0.0.

>  I would also argue to have a more descriptive name in the title,
> something like: "Punycode: an encoding designed for use with
> Internationalized Domain Names (idn)."

Agreed.  How about "Punycode: An encoding of Unicode for use with IDNA"?
(I think it's significant that it's for use with IDNA as opposed to IDN.
I could expand IDNA if you like, but that would get very long.)

> - 3.2 typo:  s/but should insead/but should instead/

Thanks.

> - Annex B.
> + to me, this is not clear if it is mandatory to implement or not. We 
> should put a clear statement about this.

The introduction says "Punycode can support an optional feature
described in appendix B", and the third (== last) paragraph of appendix
B is "Punycode encoders and decoders are not required to support these
annotations, and higher layers need not use them."  Is that not clear
enough?  Should I add a third such statement at the beginning of the
appendix?  Or move the last paragraph to the beginning of the appendix
(which means it would contain forward references)?

> - Annex D.
> + I would erase the first 8 lines in comments. The author is already 
> identified in the draft (author section).

You would, but I wouldn't. :) The first thing most people are going to
do with the sample code is save it to a .c file, and I want the version
information to be kept with it, along with the URL pointing at possibly
newer versions of the sample code.

> Also the numbering of the versions is inconsistent!!!.

The sample code has its own version number.  The version numbers of both
the spec and the sample code will become 1.0.0 in the next revision,
but they could diverge after that.  Editorial changes to the spec don't
cause changes in the sample code.  Also, an interface change in the
sample code could cause a large jump in the sample code version (to
1.1.0 or 2.0.0) while causing only a small jump in the spec version (to
1.0.1).

AMC