Re: [idn] Unicode tagging



> > > That is very reasonable... why then is it not a good idea to tag the
> > > encoding as we have suggested in a standard and easily recognizable way?
> >
> > if you always use unicode on the wire, there's no need for a tag.
> > systems that have to support multiple encodings (or even the
> > possibility of multiple encodings) are far more complex than
> > single-encoding systems.
> >
> 
> I don't think this is totally true... there are different transformations of
> Unicode as well as different specifications (16-bit or 32-bit).

so pick one of them.  domain names (and presumably IDNs) are short
enough that there's not likely to be enough bandwidth savings 
to make it worth complicating the protocol.
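
a rough illustration of the size argument, as a minimal sketch only: the
function name and the sample labels are made up for this example, and the
three encodings are simply the common Unicode transformation formats, not
a proposal for what IDN should carry.

```python
# Compare the encoded size of one short label under three Unicode encodings.
# encoded_sizes() and the sample labels are hypothetical, for illustration.

def encoded_sizes(text):
    """Return the number of octets needed for text in each encoding."""
    return {
        name: len(text.encode(name))
        for name in ("utf-8", "utf-16-be", "utf-32-be")
    }

if __name__ == "__main__":
    for label in ("example", "\u4f8b\u3048"):
        print(repr(label), encoded_sizes(label))
```

for labels this short the difference between encodings is a handful of
octets per query.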

> While I truly
> believe that for the sake of the DNS, the use of a uniform byte length
> encoding scheme is best especially considering the fact that there exists a
> "count" in front of a label and the count could then correspond to the
> number of characters so that it could be "fair" between languages, 

there are different interpretations of "fair".  (my idea of "fair" 
is that the information density per octet - *not* character density
per octet - is about the same in all languages.)  but the particular 
representation chosen matters a lot more for storage formats, or
for the format of an email message which might be megabytes in length, 
than for a relatively short IDN query or response.

lots of different transformations have been suggested, and there are
certainly plenty of good ideas in this area.  I'm confident that we'll
identify one that is reasonably fair.
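
on the count in front of a label: in the existing DNS wire format that
count is a number of octets, and a label is limited to 63 of them.  the
sketch below shows what that means for a variable-length encoding.  the
helper name and the choice of UTF-8 are assumptions for illustration,
not anything this group has decided.

```python
# Build one length-prefixed label the way the existing DNS wire format does:
# a single count octet followed by that many octets of label data.
# wire_label() and the default encoding are hypothetical choices.

MAX_LABEL_OCTETS = 63  # per-label limit in the existing DNS wire format

def wire_label(text, encoding="utf-8"):
    """Return the count octet plus the encoded label."""
    data = text.encode(encoding)
    if len(data) > MAX_LABEL_OCTETS:
        raise ValueError("label too long: %d octets" % len(data))
    return bytes([len(data)]) + data

if __name__ == "__main__":
    print(wire_label("abc"))       # count is 3: three characters, three octets
    print(wire_label("\u00e9"))    # count is 2: one character, two octets
```

so with a variable-length encoding the count tracks octets, not
characters, and the per-label limit falls unevenly across scripts.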

> the
> problem exists when we want to expand the character set just like what
> Unicode will be doing soon to evolve to 32 bit.  Therefore, we must tag at
> least the form of Unicode that is carried by the packet.

no, we just need to make sure that the format chosen for IDNs can represent
32-bit characters, and that an IDN query is distinguishable from an
ordinary DNS query.
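
as a minimal sketch of what "can represent 32-bit characters" looks like
in practice: the code point below is an arbitrary one above U+FFFF picked
for this example, and the function name is made up.  the existing
transformation formats can each carry it without any tag.

```python
# Encode one code point above U+FFFF in three Unicode encodings.
# encodings_of() and the sample code point are hypothetical, for illustration.

def encodings_of(ch):
    """Return the octets for ch in each encoding."""
    return {
        name: ch.encode(name)
        for name in ("utf-8", "utf-16-be", "utf-32-be")
    }

if __name__ == "__main__":
    sample = "\U00010400"  # arbitrary code point outside the 16-bit range
    for name, octets in encodings_of(sample).items():
        print(name, octets.hex())
```

UTF-8 uses four octets here, UTF-16 uses a surrogate pair, and UTF-32
uses a single 32-bit unit.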

> My "alternate implementation" in the I-D that was submitted illustrates how
> we could confine the tagging to unicode forms.

we could do that.  I personally think it would be sub-optimal, because
it would further complicate things like DNSSEC.

> > > > >  but I thought there were suggestions to reject some characters 
> > > > > in the DNS such as symbols?
> > > >
> > > > perhaps, but this would be as a matter of policy, rather than as a
> > > > constraint that is wired into the IDN protocol.
> > >
> > > A matter of policy for respective registries?
> >
> > perhaps, but ideally, no.  or at least, not as a solution to the
> > transcription problem.  the last thing we need is for registries
> > to compete against one another on the basis of which one accepts
> > which characters.
> 
> Shouldn't it be left to the customers to decide what they actually want in
> their domain names?  

no.  customers are generally ignorant of what it takes to keep the service
working well.  customers would demand that all manner of characters appear
in domain names or IDNs; then they would complain about the lousy service
that they brought on themselves.  (or that other customers brought on them) 

> As engineers I feel that our responsibility is to
> provide a platform that can handle any character in the world.  

again, there is a difference between what the protocol can support and
what policies exist for defining names.  should the policies prove 
inadequate, they are easier to change than the protocols - especially
if the policies start out a bit on the restrictive side.

> Choice is true control while constraints create instability... 
> isn't this the truth of regulations?

not in general, no.  that's a popular belief these days in certain
circles, but it doesn't hold up under examination.
 
> If the end users demand certain symbols, the registries will want to provide
> them... as engineers we should make that possible.  Therefore if
> registry A restricts some symbols while registry B allows them, people will
> flock over to B... A will be forced to open up... eventually all characters
> and symbols will be permitted...

and the resulting service will be lousy.  therefore it is in the interests
of everyone (even though they might not realize it) for there to be
some restrictions on the names which can be used.

Keith