
Re: [idn] Newbie's questions implementing the [IDNA]



"JFC (Jefsey) Morfin" <jefsey@jefsey.com> wrote:

> if (ToASCII(nameprep(ToUnicode(ascii_text))) == ascii_text) babelname=true;
> if (babelname) "iesg--ascii_text" will display in ASCII mode on most
> of the systems while having been registered, and possibly TMed, as
> ToUnicode(ascii_text).

Your reference to "iesg--ascii_text" suggests that you expect ascii_text
not to include the ACE prefix.  But in that case, ToUnicode(ascii_text)
is simply ascii_text itself, because ToUnicode won't alter a string that
doesn't begin with the ACE prefix.
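
A quick way to check that behaviour is the following minimal sketch,
which uses Python's standard encodings.idna module.  It is only an
illustration: Python hard-codes the prefix "xn--", which stands in here
for the "iesg--" placeholder used in this thread, and the helper name
to_unicode_or_same is made up for the example.

```python
from encodings import idna


def to_unicode_or_same(label: str) -> str:
    """Apply ToUnicode; on any failure hand the input back unchanged."""
    try:
        return idna.ToUnicode(label)
    except UnicodeError:
        # Python's ToUnicode raises on failure instead of returning
        # its input, so that case is mapped back to "no change" here.
        return label


# None of these begins with the ACE prefix, so ToUnicode leaves each
# of them exactly as it was.
for name in ("coca-cola", "ibm", "vint-cerf", "adam-costello"):
    assert to_unicode_or_same(name) == name
```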

> I documented Adam with cases where babelname==true.  Among them
> "coca-cola", "ibm", "vint-cerf", "adam-costello".

babelname, as you've defined it, would be true for *every* lowercase
ASCII string (up to 63 characters).  I think the feature of those
example strings that you're interested in is that if you prepend the ACE
prefix to them, the result is an ACE (which is, by definition, something
that ToUnicode would alter).  So the test you're looking for is:

    ToUnicode(IESG--ascii_text) != IESG--ascii_text

which is roughly approximated by this test:

    Punyencode(Nameprep(Punydecode(ascii_text))) == ascii_text

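This is similar in form to the test you proposed, but ToASCII is very
different from Punyencode, and ToUnicode is very different from
Punydecode: the IDNA operations also handle the ACE prefix and apply
Nameprep where needed, while the Punycode functions only perform the
raw encoding and decoding.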
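
To make the difference between the three tests concrete, here is a
hedged sketch, again using Python's encodings.idna module and its
"punycode" codec.  Assumptions: "xn--" stands in for the IESG-- prefix,
and the function names (babelname, is_ace_suffix, approx_is_ace_suffix)
are invented for this example.

```python
from encodings import idna

ACE_PREFIX = "xn--"  # assumption: stand-in for "IESG--" in this thread


def to_unicode(label: str) -> str:
    """ToUnicode that returns its input unchanged on failure."""
    try:
        return idna.ToUnicode(label)
    except UnicodeError:
        return label


def babelname(ascii_text: str) -> bool:
    """The test as originally proposed (no ACE prefix involved)."""
    try:
        prepped = idna.nameprep(to_unicode(ascii_text))
        return idna.ToASCII(prepped) == ascii_text.encode("ascii")
    except UnicodeError:
        return False


def is_ace_suffix(ascii_text: str) -> bool:
    """ToUnicode(prefix + ascii_text) != prefix + ascii_text."""
    label = ACE_PREFIX + ascii_text
    return to_unicode(label) != label


def approx_is_ace_suffix(ascii_text: str) -> bool:
    """Punyencode(Nameprep(Punydecode(ascii_text))) == ascii_text."""
    try:
        raw = ascii_text.encode("ascii")
        decoded = raw.decode("punycode")
        return idna.nameprep(decoded).encode("punycode") == raw
    except UnicodeError:
        return False


for text in ("bcher-kva", "zz"):
    print(text, babelname(text), is_ace_suffix(text),
          approx_is_ace_suffix(text))
```

Here "bcher-kva" (the Punycode form of "bücher") passes all three
tests, while "zz" is not a complete Punycode string and passes only the
original babelname test.  The approximation is not exact; for example,
ToASCII never Punycode-encodes an all-ASCII label, whereas a bare
Punycode encoder has no such rule.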

> if (babelname) "iesg--ascii_text" will display in ASCII mode on most
> of the systems while having been registered, and possibly TMed, as
> ToUnicode(ascii_text).

IESG--ascii_text will be displayed in non-ASCII form by IDN-aware
applications capable of displaying the characters, and will be displayed
in ASCII form otherwise.

AMC