
Re: [idn] Document Status?



--On 2002-09-01 08.53 -0400 "vinton g. cerf" <vinton.g.cerf@wcom.com> wrote:

> I know I would need special software to render Hangul or Kanji, for
> instance, but I assume that the rendering packages also serve to make
> highlighting and cut/paste work.

The copy and paste problem is difficult, but not as hard as people believe
(I think).

I know how copy and paste works on the Apple Macintosh platform, and as that
has been around and worked that way for decades(!) I take for granted that
it works the same way in, for example, Windows.

When doing "copy", the software "sending" the copied information identifies
the selection and calls a routine which notifies the operating system that
data exists in the paste buffer. The information passed includes details
such as what type(s) the data can be fetched as, the size(s) etc. Note that
several alternatives can be stored there.
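
As a rough illustration of "several tagged alternatives", here is a minimal
Python sketch. It is a toy model, not the Macintosh or Windows API: the
class name PasteBuffer, its methods, and the MIME-like type tags are all
made up for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PasteBuffer:
    """Toy model: one selection stored as several tagged flavors."""

    # Maps a type tag (hypothetical, MIME-like) to the data in that form.
    flavors: Dict[str, bytes] = field(default_factory=dict)

    def put(self, type_tag: str, data: bytes) -> None:
        """Store one alternative representation of the selection."""
        self.flavors[type_tag] = data

    def describe(self) -> List[Tuple[str, int]]:
        """What a receiver can learn up front: available types and sizes."""
        return [(tag, len(data)) for tag, data in self.flavors.items()]


# The "sending" application offers the same selection in two forms.
buffer = PasteBuffer()
buffer.put("text/plain; charset=us-ascii", b"hello")
buffer.put("text/plain; charset=utf-16be", "hello".encode("utf-16-be"))
```

Here buffer.describe() lists both tags with their sizes, which is all a
receiving application needs in order to decide what it can use.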

It looks like the content-type mechanism in email. Very precise tagging of
the data.

Now, some other application has a menu which is to be drawn. The menu
includes an item called "paste". Before doing the actual drawing, it calls
a routine to check (a) if there is something in the paste buffer, and (b)
if the data is of a type which it can interpret. If both are true, the menu
item "paste" is _not_ shadowed (that is, it is enabled).
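
The two checks can be sketched in a few lines of Python. The function name
paste_enabled and the short type tags are assumptions made for this
example; a real toolkit would expose its own calls for this.

```python
from typing import Iterable, Set


def paste_enabled(available_types: Iterable[str],
                  understood_types: Set[str]) -> bool:
    """Return True if the "paste" menu item should be enabled.

    (a) the paste buffer must hold something, and
    (b) at least one of the offered types must be one the
        receiving application can interpret.
    An empty buffer offers no types, so any() is False for it.
    """
    return any(tag in understood_types for tag in available_types)


# A text-only application looking at three different buffers.
text_only = {"text"}
print(paste_enabled(["picture", "text"], text_only))
print(paste_enabled(["picture"], text_only))
print(paste_enabled([], text_only))
```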

The paste operation happens, and it can either grab data which is already
generated by the sender application, or the sender application is called
and asked to produce the data which is to be sent. The receiving
application gets a pointer (handle) referencing the data.
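
A minimal Python sketch of those two ways of delivering the data, assuming
a toy buffer where a flavor is either ready-made bytes or a callback into
the sender. The names LazyPasteBuffer, put, promise and fetch are
hypothetical, not taken from any real clipboard API.

```python
from typing import Callable, Dict, Union

# A flavor is either data generated at copy time or a producer to call later.
Flavor = Union[bytes, Callable[[], bytes]]


class LazyPasteBuffer:
    def __init__(self) -> None:
        self._flavors: Dict[str, Flavor] = {}

    def put(self, type_tag: str, data: bytes) -> None:
        """Sender stores data it has already generated."""
        self._flavors[type_tag] = data

    def promise(self, type_tag: str, producer: Callable[[], bytes]) -> None:
        """Sender only promises the data; it is produced on demand."""
        self._flavors[type_tag] = producer

    def fetch(self, type_tag: str) -> bytes:
        """Receiver asks for one flavor; call the sender if it was promised."""
        value = self._flavors[type_tag]
        if callable(value):
            value = value()                  # ask the sender to produce it now
            self._flavors[type_tag] = value  # keep it so it is produced once
        return value
```

The point of the promise form is that an expensive representation is never
generated unless some receiver actually asks for it.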

Similar things happen with drag & drop.


Ok, the important thing is the tagged data. We are _not_ talking about
plain, untagged bits which could come from any arbitrary script etc.

I see we now have a couple of different scenarios:

(a) I get an email with an IDNA-encoded sender address. I want to add that
to some address book software. That implies copy and paste from the email
program to the address book program. The email address has ACE-encoded
labels in it.

(a1) The email program understands IDNA, but the address book program does
not. As it understands IDNA, the email program will display (if the script
and font exist) the correct Unicode characters, and not the ACE-encoded
string. Now, the copy operation happens, and if I were the email programmer
I would put two (2) things in the paste buffer: one "email address" which
is the ACE-encoded string, the same thing as what is passed in SMTP or POP,
and one which is the address in Unicode (or local script, which will be
named as part of the tag). The address book which fetches data from the
paste buffer gets the string, notices it is ACE-encoded, and can choose to
decode it if it knows how.

(a2) The email program does not understand IDNA. It will only see the
ACE-encoded string, and will, just like today, place the ACE-encoded string
into the paste buffer. See (a1) for the rest of the story.

(b) I can type some weird codepoints in my email application, but the
address book cannot handle them. Also in this case the safest way of moving
forward is to place the ACE-encoded string in the paste buffer.
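
The scenarios above can be sketched in Python, with several assumptions
stated up front: the ACE prefix is taken to be "xn--" (the one later fixed
in RFC 3490), the conversion uses Python's built-in "idna" codec, the local
part of the address is assumed to be ASCII, and the flavor tags and
function names are invented for this example.

```python
from typing import Dict

ACE_PREFIX = "xn--"            # assumption: prefix as fixed in RFC 3490
ACE_TAG = "email-address/ace"  # hypothetical type tags for the two flavors
UNICODE_TAG = "email-address/utf-8"


def address_to_ace(address: str) -> str:
    """Convert only the domain part to ACE; the local part is left alone."""
    local, _, domain = address.rpartition("@")
    return local + "@" + domain.encode("idna").decode("ascii")


def copy_address(address_unicode: str) -> Dict[str, bytes]:
    """Sender side, (a1) and (b): always offer ACE, plus the Unicode form."""
    return {
        ACE_TAG: address_to_ace(address_unicode).encode("ascii"),
        UNICODE_TAG: address_unicode.encode("utf-8"),
    }


def has_ace_label(address: str) -> bool:
    """True if any label of the domain part carries the ACE prefix."""
    domain = address.rpartition("@")[2]
    return any(label.lower().startswith(ACE_PREFIX)
               for label in domain.split("."))


def paste_address(flavors: Dict[str, bytes], idna_aware: bool) -> str:
    """Receiver side: take the ACE flavor, decode it only if it knows how."""
    ace = flavors[ACE_TAG].decode("ascii")
    if idna_aware and has_ace_label(ace):
        local, _, domain = ace.rpartition("@")
        return local + "@" + domain.encode("ascii").decode("idna")
    return ace  # an address book without IDNA support stores it unchanged


# "b\u00fccher" is "bucher" with u-umlaut, written as an escape.
flavors = copy_address("paf@b\u00fccher.example")
print(paste_address(flavors, idna_aware=False))
```

An email program that does not understand IDNA, as in (a2), would simply
offer the ACE flavor alone, and paste_address() behaves the same way.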


So, in both cases, I compare the paste buffer with any protocol which,
according to the IDNA spec, should carry domain names encoded in ACE unless
negotiation can take place. In some cases the negotiation can happen in the
protocol, and in the same way negotiation _can_ happen in the copy/paste
buffer because of the tagging.
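
One way to picture that negotiation, as a small Python sketch: the receiver
walks its own preference list and takes the first type the buffer offers,
with ACE as the common denominator. The function name negotiate and the
type tags are hypothetical.

```python
from typing import Iterable, Optional, Sequence


def negotiate(offered: Iterable[str],
              preferred: Sequence[str],
              fallback: str = "email-address/ace") -> Optional[str]:
    """Pick the type tag the receiver should fetch from the paste buffer.

    The receiver's preferences are tried in order; if none of them is
    offered, fall back to the ACE form, and return None if even that
    is missing (nothing usable in the buffer).
    """
    offered_set = set(offered)
    for tag in preferred:
        if tag in offered_set:
            return tag
    return fallback if fallback in offered_set else None


offered = ["email-address/ace", "email-address/utf-8"]
# An IDNA-aware receiver prefers Unicode; an older one lists no preferences.
print(negotiate(offered, ["email-address/utf-8"]))
print(negotiate(offered, []))
```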


My conclusion after working on these things for a while is that I still
don't think we have any problem at all as long as the domain name is stored
as bits.

The only issue is applications which don't understand IDNA but in which one
can enter non-ASCII characters in labels, something which should not be
possible if the application was working according to the RFCs.

The problem is when changing bits to sound or ink and then back to bits
again.

   paf