
[idn] Where to do form-folding?



Hi

Even though it is summer, some of you are hopefully still active. At
least some of you are going to the IETF meeting.

Here is one thing we could discuss, and that could also be discussed
at the meeting.

The question is: Where should form-folding be done?

Background: With ASCII names, names are compared case-insensitively.
One way to do the comparison is to first case-fold and then binary compare.
When we have IDNs, case-insensitivity is not enough. Instead I will call
it form-insensitivity, which includes case-insensitivity and insensitivity
to many other forms used in non-Latin alphabets. To compare two IDNs
you can first form-fold them and then binary compare the names.
Form-folding is another name for all the normalisation, lower-casing,
canonicalisation and simplification described in, for example, the
NAMEPREP draft. Does anybody have a better name?
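
To make "form-fold, then binary compare" concrete, here is a minimal
sketch in Python. It is only an illustration: it assumes that NFKC
normalisation plus Unicode case folding is an acceptable stand-in for
the folding rules, which is not what the NAMEPREP draft specifies in
detail. The names form_fold and names_equal are made up for this example.

```python
import unicodedata


def form_fold(label: str) -> bytes:
    """Hypothetical form-folding of one label.

    Assumption: NFKC normalisation + case folding approximates the
    real folding rules. The result is UTF-8 bytes, ready for a plain
    binary comparison.
    """
    normalised = unicodedata.normalize("NFKC", label)
    folded = normalised.casefold()
    # Case folding can produce un-normalised text, so normalise again.
    return unicodedata.normalize("NFKC", folded).encode("utf-8")


def names_equal(a: str, b: str) -> bool:
    """Compare two names label by label after form-folding."""
    labels_a = [form_fold(l) for l in a.rstrip(".").split(".")]
    labels_b = [form_fold(l) for l in b.rstrip(".").split(".")]
    return labels_a == labels_b  # binary compare of the folded forms
```

With this sketch, names that differ only in case or in compatibility
forms (for example fullwidth Latin letters) compare as equal, while
names with different letters do not.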

Places to do form-folding:
Below I give what I think are the two most important places, with the
main advantages and disadvantages of each. There are more points.

1) Form-folding is done only in the DNS servers.
   + Only authoritative servers need to implement it.
   + Only the folding for the authoritative subset of UCS needs to
     be implemented.
   + Fewer places where form-folding needs to be implemented.
   - Higher CPU demand on servers.

2) Form-folding is done in the resolver (and in servers when zone loading).
   + Less CPU demand on servers.
   - More places where form-folding needs to be implemented.
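
The difference between the two placements can be sketched with a toy
model in Python. Everything here is hypothetical: the Server class, the
resolve function and the folds_done counter are invented for
illustration, and form_fold is again just NFKC plus case folding as a
stand-in for the real rules. In this sketch the zone data is folded
once at load time in both cases; what changes is who folds the query.

```python
import unicodedata


def form_fold(name: str) -> str:
    # Stand-in folding (assumption): NFKC + case folding.
    folded = unicodedata.normalize("NFKC", name).casefold()
    return unicodedata.normalize("NFKC", folded)


class Server:
    """Toy authoritative server holding one zone as a dict."""

    def __init__(self, zone: dict, fold_queries: bool):
        # Zone names are folded once, when the zone is loaded.
        self.zone = {form_fold(name): data for name, data in zone.items()}
        self.fold_queries = fold_queries
        self.folds_done = 0  # per-query folding work done by the server

    def lookup(self, qname: str):
        if self.fold_queries:  # placement 1: server folds every query
            qname = form_fold(qname)
            self.folds_done += 1
        return self.zone.get(qname)  # exact (binary) match only


def resolve(server: Server, name: str, fold_in_resolver: bool):
    """Toy resolver: optionally folds before asking the server."""
    if fold_in_resolver:  # placement 2: resolver folds the query
        name = form_fold(name)
    return server.lookup(name)
```

In placement 1 the server pays one fold per query; in placement 2 the
server only does exact matching, but a resolver that does not fold
gets no answer for a name that is not already in folded form.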
   

In my draft I use 1); others have 2) instead.
Until now I have thought that the good points of 1) make it the best
choice. Also, some of the drafts that use 2) would not fulfill the
requirements well enough.
But a few days ago I thought of one more aspect that might make
2) the best choice:

There are many applications that compare host or domain names,
for example sendmail and many browsers.
These applications should also compare IDNs as equal in the same way
that DNS does. This means that they also need to implement form-folding.
If the resolver libraries included the standard implementation
of form-folding (and name comparison), and made the API public, all
applications could use it instead of implementing their own.

So by choosing 2) we get the additional benefit of giving
applications a standard place to find the routines for
form-folding and IDN comparison. That reduces the risk of many
separate implementations, some of which will get it wrong.


That's it. What do you think? Am I missing any important aspects
needed to make the best choice? Maybe it is something that could be
discussed at the meeting.

Regards,

  Dan