
[idn] Expected handling of labels



I see that, after such a long time, the people of the IDN working
group still do not agree on how labels in DNS should be handled.

DNS, as defined by RFC 1035, defines for me what I would expect
of a database with objects having an owner name:
- A name is a sequence of printable characters.
- The character data in a name is preserved.
- Names are matched case-insensitively.

Yes, I say printable characters even though RFC 1035 does not
forbid other data, because that is how names are generally used.
If we want to use binary data in labels, either define
a new binary label or encode binary data using ACE.

There are now many different discussions about how DNS should work
when we broaden the scope of characters to cover all of the world's needs.
Some want to force special rules on some labels (like requiring
host names to be lower case) and some want anything to be possible
including binary data and un-normalised character data.

I strongly oppose DNS having more than ONE form of character labels.
All character labels (it is ok to define a binary one) in DNS
must follow the same rules. To work well and behave as would
be expected of names used for matching in directories or other types of
databases, they should:
- Be only printable characters (no control codes).
  This because they are names, not application data.
- Be normalised. Unicode form C or probably KC is suitable.
  This because only one representation of a character must
  exist so that data can be easily processed.
  I cannot say if KC is OK for everybody. From what I can
  see for Latin based letters KC is fine. Things like
  full/double width or circled forms are display options like
  bold or italic and should not have a special code in a
  character set.
- Be stored and returned in queries using the original
  entered form (including mixed case and other characteristics
  available).
  This because form may be significant in some systems and
  it is important to enhance readability.
- Be matched using form-insensitive matching (this includes
  case-insensitive matching).
  This because for the common man, form is not significant.
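
The four rules above can be sketched in a few lines of Python. This is
only an illustration of the idea, not a proposal for how a DNS server
should implement it. The names is_printable_label, match_key and
NameTable are made up for the example. It also assumes one possible
reading of the rules: the label is stored in form C (which keeps case
and width forms), while matching uses form KC plus case folding, and
"printable" is approximated as "contains no Unicode control codes".

```python
import unicodedata


def is_printable_label(label: str) -> bool:
    """Assumption: 'printable' means non-empty and free of control codes (Cc)."""
    return bool(label) and all(unicodedata.category(ch) != "Cc" for ch in label)


def match_key(label: str) -> str:
    """Form-insensitive key: form KC, case folding, then form KC again."""
    folded = unicodedata.normalize("NFKC", label).casefold()
    return unicodedata.normalize("NFKC", folded)


class NameTable:
    """Binds names to objects; keeps the entered form, matches on match_key."""

    def __init__(self):
        self._entries = {}  # match key -> (label in form C, bound object)

    def add(self, label, obj):
        if not is_printable_label(label):
            raise ValueError("label contains control codes or is empty")
        stored = unicodedata.normalize("NFC", label)  # case and width survive
        self._entries[match_key(label)] = (stored, obj)

    def lookup(self, query):
        """Return (label as entered, object), or None if nothing matches."""
        return self._entries.get(match_key(query))


table = NameTable()
table.add("Example", "object-1")
print(table.lookup("EXAMPLE"))  # same entry, returned in its entered form
```

A query written in upper case, in full-width letters, or in circled
letters finds the same entry under this sketch, and the answer comes
back with the case the owner entered.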


Using the above rules (which are defined in RFC 1035 or follow from
the current usage/semantics of DNS), DNS would continue to have
broad and easy-to-use functionality.

If there is a specific need to have binary or form-exact
matching, new label types can be defined for those labels.

Please stop thinking so much about programmers. Think
of how people want to use a database where you bind a
name to an object. At least I want things to work
the way LDAP and DNS work today: I define an object,
give it a name, the name is retained and returned from
the database in the same form I entered it and when
doing matching the database ignores case of letters.

   Dan