Re: [idn] NSI Multilingual Testbed Information (fwd)



James:

At 08:36 AM 8/27/00 +0800, James Seng wrote:
>Bill,
>
>Your concern about ACE and Jason's patent is no more than a excuse
>against ACE. 

I'm not making an "excuse against ACE" WRT the Pouflis patent. I was
strongly informed by John Klensin in Pittsburgh that the issue of patent
disclosure WRT potential IETF standards was *very* important to the IETF.
This WG is proposing ACE as a potential standard.

>So far, your case against ACE includes:

Sorry, I have not been making a "case against ACE" in my previous comments.
I have been providing information. Take it or ignore it. If that
information works against ACE, so be it. 

However, I *will* make some technical observations about ACE now - since
you suggest that is what I should do. I'm sure others will have differing
opinions, though <smile>.

>a) Software Patent by Jason which our (idns) lawyers have said it wont
>   stand up in a claim due to prior public works. (Jason filed in Australia
>   few months after Martin's -00 I-D.)

Good to hear. Will I-DNS accept liability for any future claims by Pouflis
against any company, user or software developer who adopts an ACE as a
solution?

>b) Business reasons to adopt Microsoft DNS. 

This is totally out of left field, James. Meaning no disrespect to anyone
in the WG from Microsoft, please reproduce the quote on this list where I
have *ever* encouraged anyone to adopt Microsoft DNS. Where did you get
this idea??

>Neither of which is of any technical merits to the WG.

OK, here are some of the technical reasons ACE is not preferred to UTF-8:

1. Usability problems: any domain part that has any non-ASCII character
anywhere in it is transformed end to end to a form that is completely
unreadable, causing problems at leakage points.

2. The proposed ACE solutions are not ASCII-transparent at all.  In
contrast, UTF-8 is completely ASCII-transparent.

3. Significant new functionality must be added to every client that is to
use an ACE IDN system.  Not so for UTF-8.

4. One ACE proposal says "With internationalized names, the user
application MUST convert the pre-converted name into a post-converted
name so that is acceptable to resolvers."  This is not the case with UTF-8.

5. The ACE encoding schemes described in the ACE proposals are nearly
always less efficient than the UTF-8 encoding scheme - often much less
efficient - even in those that utilize compression schemes.
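
The ASCII-transparency claim in point 2 can be checked with a few lines of
Python. This is only a sketch; the helper name utf8_keeps_ascii_bytes and
the sample labels are made up for illustration and come from no proposal.

```python
def utf8_keeps_ascii_bytes(label: str) -> bool:
    """Return True if every ASCII character in label appears as the same
    single byte, in the same order, in the UTF-8 encoding."""
    encoded = label.encode("utf-8")
    ascii_in = [ord(ch) for ch in label if ord(ch) < 0x80]
    # In UTF-8, bytes below 0x80 are only ever produced by ASCII characters.
    ascii_out = [b for b in encoded if b < 0x80]
    return ascii_in == ascii_out

plain = "example"        # hypothetical all-ASCII label
mixed = "b\u00fccher"    # hypothetical label with one non-ASCII letter (u-umlaut)

print(plain.encode("utf-8"))   # b'example'
print(mixed.encode("utf-8"))   # b'b\xc3\xbccher'
```

The all-ASCII label comes out byte for byte unchanged, and in the mixed
label only the one non-ASCII letter is expanded; the ASCII letters around
it stay readable.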

Example: UTF-8 itself exhibits a space advantage with the first 2048 code
points, relative to higher code points. Since most common alphabets
are in this region, in practice the overwhelming majority of alphabetic
characters are encoded in at most two bytes in the fully encapsulated
UTF-8 version.
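
To make the 2048 boundary concrete, here is a small Python sketch that
prints the UTF-8 length of a few code points. The sample code points are
my own picks for illustration, and utf8_len is just a helper name.

```python
def utf8_len(code_point: int) -> int:
    """Number of bytes UTF-8 uses for a single code point."""
    return len(chr(code_point).encode("utf-8"))

# Illustrative samples: ASCII 'a', Latin-1 e-acute, Greek omega,
# the last code point below 2048, the first one at 2048, and a CJK ideograph.
for cp in (0x0061, 0x00E9, 0x03C9, 0x07FF, 0x0800, 0x4E2D):
    print(f"U+{cp:04X}: {utf8_len(cp)} byte(s)")
```

Everything below U+0080 takes one byte, everything from U+0080 through
U+07FF takes two, and the step to three bytes happens exactly at U+0800
(decimal 2048).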

In fact the manner in which this is achieved with UTF-8 is
inherently better, since it is insensitive to changes in
the most significant byte per se - the code point just has
to stay below 2048, and it will unless the DN contains ideograms
or a script encoded at or above that point, as many Asian and some
African scripts are. (Cyrillic sits below 2048, at U+0400-U+04FF.)
With RACE, it appears all the characters must have the same most
significant byte for compression to work.

A quick look at the Unicode character charts shows one
reason why this reliance by RACE is problematic: Latin variant
characters are spread across four 256-codepoint blocks, in such a way
that most Latin-variant names with non-ASCII content will not
compress.  This is unavoidable; there are just too many Latin variant
characters to fit into a single 256-codepoint block.
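
The condition I describe above can be sketched in Python as follows. Note
this is a deliberately simplified test (do all code points share one most
significant byte?) and not the RACE algorithm itself, whose actual rules
may differ. The function names and sample strings are hypothetical.

```python
def high_bytes(label: str) -> set:
    """Set of most significant bytes (code point >> 8) used in the label."""
    return {ord(ch) >> 8 for ch in label}

def shares_one_high_byte(label: str) -> bool:
    """Simplified condition: every character has the same high byte."""
    return len(high_bytes(label)) == 1

# Illustrative samples only:
# e-acute is U+00E9 (high byte 0x00), o-double-acute is U+0151 (0x01),
# a-with-dot-below is U+1EA1 (0x1E).
for sample in ("caf\u00e9", "\u0151r", "b\u1ea1n"):
    print(sorted(hex(b) for b in high_bytes(sample)),
          shares_one_high_byte(sample))
```

The first sample stays within one 256-codepoint block; the other two mix
blocks as soon as a single Latin-variant letter from outside the first
block sits next to plain ASCII letters.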


And so it goes...

Bill Semich