Re: [idn] Re: Is space allowed in a hostname?



On Tuesday 09 July 2002 22:46, Dan Oscarsson wrote:
 
> It may be too late to do something about NFKC. NFKC is difficult and
> at least I have never gotten any answer from those I have asked what
> NFKC changed compared to NFC. I have therefore tried to read the
> Unicode tables to see what it means, and I do think it goes too
> far for domain names.

By definition, NFKC = "compatibility decomposition" + NFC.
This "compatibility decomposition" can be partitioned into
groups of script-specific compatibility equivalences, and IMO each group
has its own degree of appropriateness in the context of domain names.
For example, only with NFKC can a compatibility ideographic character (Kc)
be mapped to its equivalent unified ideographic character (Ku).
Kc and Ku share the same glyph, but often have different readings and origins.
That mapping is OK. However, NFKC does not unify certain TC/SC pairs
that are purely font variants of each other (like italic/subscript A
versus normal A) and share the same meanings and readings.
When applied to Hangul compatibility jamos (U+31??), NFKC makes errors.
When NFKC maps circled "A" to bare "A", the mapping may be useless or
unnecessary in the domain name context.
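
To make the NFC/NFKC difference concrete, here is a minimal sketch using
Python's standard unicodedata module. It is only an illustration, not part
of stringprep/nameprep: the helper names (codepoints, compare,
nfkc_by_definition) are made up for this example, and the two sample
characters (circled A and one Hangul compatibility jamo) are simply the
cases mentioned above.

```python
import unicodedata

def codepoints(s):
    """Render a string as a list of U+XXXX labels (hypothetical helper)."""
    return [f"U+{ord(c):04X}" for c in s]

def compare(s):
    """Return the (NFC, NFKC) forms of s (hypothetical helper)."""
    return unicodedata.normalize("NFC", s), unicodedata.normalize("NFKC", s)

def nfkc_by_definition(s):
    """NFKC spelled out: compatibility decomposition (NFKD), then NFC."""
    return unicodedata.normalize("NFC", unicodedata.normalize("NFKD", s))

CIRCLED_A = "\u24b6"    # CIRCLED LATIN CAPITAL LETTER A
COMPAT_JAMO = "\u3131"  # HANGUL LETTER KIYEOK (compatibility jamo)

for sample in (CIRCLED_A, COMPAT_JAMO):
    nfc, nfkc = compare(sample)
    # NFC leaves both samples alone; NFKC replaces them with their
    # compatibility mappings.
    print(codepoints(sample), "NFC:", codepoints(nfc), "NFKC:", codepoints(nfkc))
    # The two-step definition agrees with the library's NFKC.
    assert nfkc_by_definition(sample) == nfkc
```

With these inputs NFC returns each character unchanged, while NFKC maps
circled A to plain "A" and the compatibility jamo to the conjoining jamo
U+1100. Whether either mapping is desirable for domain names is exactly the
per-group question raised above.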

The ideal IDN name normalization was carved down to fit NFKC.
"If you sleep in a bed that is shorter than you, should you bend your
legs every night, or buy a new bed?"  Stringprep/nameprep teach us
to bend our legs.  :-)

Soobok Lee

> I have seen drafts referring to IDNA/NAMEPREP to define
> what characters are allowed in a domain name. That is bad as
> it makes it difficult for us to change things later on.
>
>     Dan