
RE: [idn] WG last call documents



 
> > But there is one class of characters that might indeed be dreadful at
> > the beginning: combining characters.  I recently refered to labels
> > that begin with combining characters as invalid Unicode strings, but
> > they're not, are they?  They just behave in surprising ways when
> > abutted with something else.  Maybe nameprep should prohibit initial
> > combining characters.
> 
> Note that these are rules for hostnames in particular.
> 
> A prohibition against combining characters is not feasible. See
> <200111212258.OAA23757@birdie.sybase.com> from K Whistler.
> 
> This was resolved without disconsent in <3C0DD610.F50F947B@ehsco.com>:
> 
>  | First and last characters in the label MUST NOT be a diacritical
>  | mark or hyphen-minus.



I agree that having a combining character as the first character
in a hostname part (label) is a bad idea. Nominally it combines with
whatever precedes it, and in rare cases it can be "absorbed" into the
syntax before the actual string. An example is ">/" in XHTML, where "/"
stands for the combining long solidus overlay (U+0338): after NFC
(applied blindly to the entire text) there will be a precomposed
negated greater-than character there.

But I see absolutely no reason to prohibit combining characters
at the end of a label (unless applied to a hyphen-like character...).
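
The absorption case can be sketched in Python with the standard
unicodedata module. This is only an illustration: the label, the
surrounding "<b>" markup, and the helper starts_with_combining_mark are
made up for the example and are not part of nameprep or any draft.

```python
import unicodedata

# U+0338 COMBINING LONG SOLIDUS OVERLAY (general category Mn).
COMBINING_LONG_SOLIDUS_OVERLAY = "\u0338"


def starts_with_combining_mark(label: str) -> bool:
    """Hypothetical check: is the first character a combining mark (Mn/Mc/Me)?"""
    return bool(label) and unicodedata.category(label[0]).startswith("M")


# A made-up label that begins with a combining character.
label = COMBINING_LONG_SOLIDUS_OVERLAY + "example"

# Embed it in markup, then normalize the whole text blindly.
markup = "<b>" + label + "</b>"
normalized = unicodedata.normalize("NFC", markup)

# The ">" that closed the start tag has composed with the label's first
# character into U+226F NOT GREATER-THAN, so the tag is no longer closed.
tag_was_absorbed = normalized.startswith("<b\u226f")

# A combining mark at the end of a label composes with the label's own
# last letter, so it stays inside the label.
trailing = unicodedata.normalize("NFC", "example\u0301")
```

A check like starts_with_combining_mark could reject such labels up
front; a trailing mark, as in the last line, needs no such treatment.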

		Kind regards
		/kent k