[idn] Fwd: Re: Rationale wanted for Unicode identifier rules



Is this something we can use (possibly modified) in IDN to describe what a 
reasonable character set for IDN labels is?

                Harald



>X-UML-Sequence: 12492 (2000-03-01 21:35:45 GMT)
>From: Kenneth Whistler <kenw@sybase.com>
>To: "Unicode List" <unicode@unicode.org>
>Cc: unicode@unicode.org, kenw@sybase.com
>Date: Wed, 1 Mar 2000 13:35:44 -0800 (PST)
>Subject: Re: Rationale wanted for Unicode identifier rules
>
>John Cowan asked:
>
> >
> > Kenneth Whistler wrote:
> >
> > >   A. Identifier syntax along the lines described in Unicode 3.0.
> >
> > Can you (or someone) supply a precis of this to the poor fellow
> > who still hasn't heard from his bookstore's order department?
> > Especially if it is indeed simpler than the Unicode 2.0 version?
>
>Sure. For those of you who already have the hymnal, turn to page 134 to
>sing along.
>
><identifier> ::= <identifier_start> (<identifier_start> | <identifier_extend>)*
>
><identifier_start> is defined by an equivalent category set consisting of
>        all those characters with the General Category values:
>        Lu, Ll, Lt, Lm, Lo, Nl
>
><identifier_extend> is defined by an equivalent category set consisting of
>        all those characters with the General Category values:
>        Mn, Mc, Nd, Pc, Cf
>
>Thus, identifiers can start with any "letter" or "letter number".
>
>Identifiers can continue with any "letter" or "letter number", any combining
>mark (except the symbolic surrounds), any decimal digit, any connecting
>punctuation, or any format control character (e.g. the invisible bidi
>layout controls, ZWJ, ZWNJ, etc.).
>
>Note that this definition explicitly excludes the following General Category
>values from identifiers:
>
>    Me, No, Zs, Zl, Zp, Cc, Pd, Ps, Pe, Pi, Pf, Po, Sm, Sc, Sk, So
>
>i.e. enclosing combining marks, "other numerals", all spaces, control
>characters, all other punctuation, and all "symbols".
>
>--Ken
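
As a rough illustration of the rule quoted above, here is a minimal Python
sketch that tests a string against the two category sets. The names
ID_START, ID_EXTEND and is_identifier are made up for this example and come
from no specification. It also assumes that the General Category data in
Python's unicodedata module (which follows whatever Unicode version the
interpreter ships with, not Unicode 3.0) is close enough for illustration.

```python
import unicodedata

# General Category sets as listed in the quoted message.
# The constant names are hypothetical, chosen for this sketch only.
ID_START = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
ID_EXTEND = {"Mn", "Mc", "Nd", "Pc", "Cf"}
ID_CONTINUE = ID_START | ID_EXTEND


def is_identifier(text: str) -> bool:
    """Check text against
    <identifier_start> (<identifier_start> | <identifier_extend>)*
    using the category data of the running Python's Unicode database.
    """
    if not text:
        # The grammar requires at least one <identifier_start> character.
        return False
    if unicodedata.category(text[0]) not in ID_START:
        return False
    return all(unicodedata.category(ch) in ID_CONTINUE for ch in text[1:])


if __name__ == "__main__":
    samples = ["abc", "a1", "1a", "a_b", "_a", "a-b", "a b", "a\u0301", "a\u20dd"]
    for s in samples:
        print(ascii(s), is_identifier(s))
```

Under this sketch "a_b" is accepted because the underscore is connecting
punctuation (Pc), while "_a" is rejected because Pc may only continue an
identifier. "a-b" fails on the hyphen (Pd), and a letter followed by
U+20DD COMBINING ENCLOSING CIRCLE fails because enclosing marks (Me) are
excluded. Whether these sets are the right ones for IDN labels is the open
question; the code only mirrors the rule as quoted.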

--
Harald Tveit Alvestrand, EDB Maxware, Norway
Harald.Alvestrand@edb.maxware.no