
[idn] Re: idn-uri document



Hello Erik,

Sorry to be late with my answer. I started writing it when
I got your mail, but it somehow got lost.

At 19:09 02/09/13 +0200, Erik Nordmark wrote:

Martin,

I just read draft-ietf-idn-iri-01 and I have some comments and
questions.
Thanks for your interest!

I don't know if the best place to discuss this is the IDN list, or
if there is some other list looking at URI/IRI stuff.
For URIs, there is the old (IETF WG) uri list, now hosted at W3C:
uri@w3.org. That would be appropriate because the draft is currently
written as an update to the URI spec. Roy Fielding is working on
a new edition of RFC 2396, mostly bug fixing.

The W3C Internationalization WG is looking after the IRI draft.
See e.g. http://www.w3.org/International/iri-edit/. So they
could take it on. Public discussion would be on www-international@w3.org.

The IDN WG is of course the third alternative; that's where the draft
currently is, although it hasn't really been discussed a lot.


On page 4 it says "will always be rejected by resolvers".
I don't know if this is intended to be a statement about the current
implementations of resolvers, or a statement about something we should
recommend or require resolvers to do.
I added an explanatory sentence, 'because no such domains will be
registered'. So it is about current (and hopefully future)
operations, not about DNS server implementations or resolver
implementations.


I do think there currently are resolvers which happily pass whatever
string of octets into DNS packets and send them off.
Ah, I see, of course the resolvers pass things through and then
pass the negative result back, so they don't actually reject it.
So now the sentence reads:

However, such syntax should never be used, and will never be
resolved because no such domains will be registered.


And I'm far from certain it would be a good idea to recommend or mandate
that resolvers do additional checks. The IDNA model is that the clients
do nameprep and that the DNS servers just do an (ASCII case-insensitive)
exact match.
Agreed, sorry about the confusion.
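
To make the server side of that model concrete, here is a minimal sketch in
Python. The function name dns_label_equal is made up for illustration and is
not taken from any DNS implementation; it only shows the comparison rule, under
the assumption that labels arrive as raw octets.

```python
def dns_label_equal(query: bytes, stored: bytes) -> bool:
    """Compare two DNS labels octet for octet, ignoring case only for
    ASCII letters (hypothetical helper, for illustration)."""
    # bytes.lower() folds only ASCII A-Z; all other octets are left untouched.
    return query.lower() == stored.lower()


# ASCII case differences are ignored.
print(dns_label_equal(b"Example", b"example"))
# The UTF-8 octets for "É" and "é" differ, so they do not match unless the
# client has already applied nameprep.
print(dns_label_equal("É".encode("utf-8"), "é".encode("utf-8")))
```

Anything beyond ASCII case folding therefore has to happen on the client
before the name is sent.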


The defined syntax rules declare certain ASCII domain names illegal
(such as *.example.org). Where is the check for illegal names assumed to
be performed? For IDNA it probably makes sense to apply these types
of checks (setting the UseSTD3ASCIIRules flag) only when verifying domain
name registrations and not do such checks in the clients.
This is an IDNA question, not an idn-uri question. As far as I remember,
the idea was to have the checks done on the clients, too (with some
leeway for unassigned characters to stay forward-compatible with
new character assignments). The reason for this was to create
pressure on registries to follow the rules.
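
As a rough sketch of what such a check involves, the Python below tests a
single label against the restrictions that the UseSTD3ASCIIRules flag turns on,
as I understand them: among ASCII characters only letters, digits and hyphen
are allowed, and a label must not start or end with a hyphen. The function name
violates_std3_ascii_rules is hypothetical, and length limits are deliberately
left out.

```python
def violates_std3_ascii_rules(label: str) -> bool:
    """Return True if the label breaks the letter-digit-hyphen restrictions
    (hypothetical helper; label length is not checked here)."""
    for ch in label:
        # Only ASCII code points are restricted by this step; non-ASCII
        # characters are handled by other parts of the preparation.
        if ord(ch) < 0x80 and not (ch.isalnum() or ch == "-"):
            return True
    # A leading or trailing hyphen is also forbidden.
    return label.startswith("-") or label.endswith("-")


# Check each label of the example name from the question above.
for label in "*.example.org".split("."):
    print(label, violates_std3_ascii_rules(label))
```

Whether this runs in the client, at registration time, or both is exactly the
policy question above; the check itself is the same in each place.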


    The work of the IDN WG includes some procedures for name preparation
    [Nameprep].  Before encoding an internationalized domain name in an
    URI, this preparation step SHOULD be applied.  However, the URI
    resolver MUST also apply any steps required as part of domain name
    resolution by [IDNA].

The above statement says that for all domain names (note that the term
"IDN" is defined to include the existing ASCII domain names)
one should apply nameprep. This might be fine but it would make sense
to state this explicitly. The ToASCII in IDNA does not apply nameprep
to all-ASCII labels.
The idea was simply to say: We RECOMMEND that you apply the IDNA rules
already when you create an URI. What these rules are is up to IDNA.
If IDNA says that their preparation of ascii-only labels is the
identity operation, then we recommend that you apply that (i.e. do
nothing), and not something else. If you see a way to make this clearer,
please tell me.



Always applying nameprep will have the effect
of downcasing the ASCII characters in all ASCII labels, which IDNA does not
do.
And of course we don't want to do either.
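
The difference is easy to see with an existing implementation. The sketch
below uses Python's standard encodings.idna module, which provides ToASCII and
nameprep functions; the choice of Python and of the sample labels is mine and
only for illustration.

```python
from encodings import idna

# ToASCII leaves an all-ASCII label untouched, including its case.
print(idna.ToASCII("Example"))    # b'Example'

# Applying nameprep unconditionally would downcase the same label.
print(idna.nameprep("Example"))   # example

# For a label with non-ASCII characters, ToASCII does run nameprep
# before the Punycode step, so the capital B is folded here.
print(idna.ToASCII("Bücher"))     # b'xn--bcher-kva'
```

So "apply the IDNA rules" and "always apply nameprep" give different results
for all-ASCII labels, which is why the wording matters.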

Which are the "any steps required as part of domain name resolution"
above? I can't figure out to what it might refer.
That's the nameprep and related checking that the client has to
do when it resolves a domain name. In IDNA terms, 'client' would
be easy to understand. But using the word 'client' in an URI context
doesn't work, so I tried to word around it. Any improved wording
appreciated.


Finally, is the intent that nameprep always be applied before characters
are encoded in UTF-8? If so, it would make sense to state that in the first
real paragraph on page 4.
No. In the context e.g. of IRIs, the conversion from an IRI to an URI
would not do nameprep.
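
A simplified sketch of such a conversion, in Python, might look as follows.
It only covers the core step (encode non-ASCII characters as UTF-8 and
%-escape the resulting octets); the function name iri_to_uri is hypothetical,
and any further steps the draft may require are left out.

```python
def iri_to_uri(iri: str) -> str:
    """Percent-encode the UTF-8 octets of every non-ASCII character.
    ASCII characters, including their case, pass through unchanged."""
    out = []
    for ch in iri:
        if ord(ch) < 0x80:
            out.append(ch)
        else:
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


# The capital "É" is kept as %C3%89; no nameprep-style case folding happens.
print(iri_to_uri("http://www.example.org/\u00c9t\u00e9"))
```

Note that nothing in this step looks at the domain name part specifically;
nameprep would only come into play later, at resolution time.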

Regards,    Martin.