
Re: [idn] Re: IURI questions




From: Aaron Irvine <airvine@corp.phone.com>

> > So at the very top of the stack, use %hh escaped UTF-8.  But deeper, utilise
> > somehow the hyphen to encode characters above ASCII.  One possibility I here
> > suggest could be:
> > * triple-hyphened UTF-5 for when a scheme/username/domainlabel contains one
> > or more characters above Latin extended B
> > * double-hyphened UTF-8 otherwise
> > where:
> > * triple-hyphened UTF-5 means convert to UTF-5 then insert "---" after first
> > letter
> > * double-hyphened UTF-8 means convert %XY to "X--Y"
> > * and note a bare (trailing) hyphen never occurs in these
> > * if in the unlikely event the original contains -- (or ---) then this is
> > encoded as "----2" (or "----3")
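
For concreteness, the double-hyphened UTF-8 rule quoted above could be
sketched roughly as follows. This is only an illustration, not a
definitive implementation: the function name double_hyphen_encode is
hypothetical, upper-case hex digits are an assumption, ASCII characters
are assumed to pass through unchanged, and decoding is not attempted.

```python
def double_hyphen_encode(label: str) -> str:
    """Sketch of the quoted rule: each non-ASCII UTF-8 byte %XY becomes "X--Y".

    Assumptions (not stated in the original proposal): hex digits are
    upper case, and plain ASCII characters are copied through unchanged.
    """
    out = []
    i = 0
    while i < len(label):
        if label.startswith("---", i):
            # A literal "---" in the original is escaped as "----3".
            out.append("----3")
            i += 3
        elif label.startswith("--", i):
            # A literal "--" in the original is escaped as "----2".
            out.append("----2")
            i += 2
        else:
            ch = label[i]
            if ord(ch) < 128:
                out.append(ch)
            else:
                # Write each UTF-8 byte as its two hex digits split by "--".
                for byte in ch.encode("utf-8"):
                    hex_pair = f"{byte:02X}"
                    out.append(hex_pair[0] + "--" + hex_pair[1])
            i += 1
    return "".join(out)


print(double_hyphen_encode("café"))  # é is C3 A9 in UTF-8
```

With these assumptions "café" comes out as "cafC--3A--9", and a label
that already contains "--" gets it rewritten as "----2".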

I still wonder whether we should encode characters using just the original
character set allowed in DNS.
If we do not want to add "odd" characters such as "%", one solution could
be to use the underscore "_" as the escape character. We already know that,
even though "_" is not officially blessed, it has been used anyway, so the
existing installed base should not be affected; and if someone has used "__"
in a domain of his, that is his problem :-)

ciao ,.mau.