
Re: [idn] Re: Document Status?



Simon Josefsson <jas@extundo.com> wrote:

> At least in X11 cut'n'paste works by transferring charset tagged but
> otherwise opaque character arrays.

Cut & paste in X11 works fine when everything is ASCII.  Otherwise, in
my experience, it is quite broken already, even before IDNs enter the
picture.

For example, I often run a text-mode editor in one kterm, and a
text-mode browser in another.  I can cut and paste English and Japanese
text between them with no problems.  However, if I try to copy Japanese
text from Netscape 4 into a kterm, it does not work.  I figured out how
it was broken and wrote a workaround: a small Tcl/Tk program that
provides a button that grabs the selection, performs the transcoding,
and re-exports the selection.

Now when I try to copy text from Mozilla 1.0 into a kterm, it also
fails, but in some other way that I haven't investigated yet.  My
workaround is to copy the URL, load the page into my text-mode browser,
and copy the text from there.

So I think it's true that cutting and pasting IDNs in X11 will fail in
many cases, not because IDNs are especially difficult, but because
cutting and pasting anything other than ASCII text is already broken to
begin with.

Patrik Fältström <paf@cisco.com> wrote:

> IDNA says that if no negotiation exists between two entities which
> exchange domain names between them, ACE encoding should be used.

IDNA says that the ASCII form must be used when a domain name is put
into an IDN-unaware domain name slot.  IDNA says that ACE forms should
not be shown to users (unless the non-ACE form would cause problems).
That still leaves some cases up in the air.  When a domain name is put
into a place that is not a domain name slot (for example, a generic
text context) and the place is not user-visible, then IDNA makes no
recommendation about which form to use.
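
To make the three cases concrete, here is a rough sketch in Python.  It
is only an illustration, not anything taken from the IDNA spec: the
context labels and the function names are made up for this example, and
Python's built-in "idna" codec (which follows RFC 3490) stands in for
whatever conversion an application would really use.

```python
def to_ace(name: str) -> str:
    """Return the ASCII (ACE) form of a domain name."""
    return name.encode("idna").decode("ascii")


def to_unicode(name: str) -> str:
    """Return the non-ACE form of a domain name given in ACE form."""
    return name.encode("ascii").decode("idna")


def form_for(name: str, context: str):
    """Pick a form for `name` in the given (hypothetical) context.

    "idn-unaware-slot": the ASCII form must be used.
    "user-visible":     the ACE form should not be shown, so use non-ACE.
    anything else:      no recommendation applies, so return None.
    """
    if context == "idn-unaware-slot":
        return to_ace(name)
    if context == "user-visible":
        # Normalize through the ACE form so either input form is accepted.
        return to_unicode(to_ace(name))
    return None


# "b\u00fccher" is "bucher" with u-umlaut as the second letter.
print(form_for("b\u00fccher.example", "idn-unaware-slot"))
print(form_for("xn--bcher-kva.example", "user-visible"))
print(form_for("b\u00fccher.example", "paste-buffer"))
```

The last call returns None on purpose: for a place that is neither a
domain name slot nor user-visible, the sketch has nothing to go on.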

> There is no difference between a protocol which uses IP or the paste
> buffer.  It is the same thing.

I don't think so.  When the user copies a domain name from the displayed
header of a mail message, is the user intending to copy text, or
intending to copy a domain name as such?  Is the user going to paste it
into a text editor where a message body is being composed (in which case
the user wants text), or is the user going to paste it into the location
field of a browser (in which case the user wants a domain name as such)?

Cut & paste buffers are a tricky case because they have no specified
purpose; they can be used for anything.

As you said, in the ideal situation the buffer would contain two
versions of the selection, one tagged as plain text, and one tagged
as a textual identifier or as structured text, so that the receiving
application could choose the appropriate representation.
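
A toy model of such a two-representation buffer might look like the
following.  This is a sketch only: the Selection class and the tag names
"TEXT" and "DOMAIN_NAME" are invented for illustration and do not
correspond to any real X11 selection target, and putting the ACE form
under the identifier tag is my assumption about what a receiver of a
domain name as such would want.

```python
from typing import Dict, List, Optional


class Selection:
    """A paste buffer holding several tagged forms of one selection."""

    def __init__(self) -> None:
        self._forms: Dict[str, str] = {}

    def offer(self, tag: str, data: str) -> None:
        """Called by the application that owns the selection."""
        self._forms[tag] = data

    def request(self, preferred: List[str]) -> Optional[str]:
        """Called by the receiver: return the first available form."""
        for tag in preferred:
            if tag in self._forms:
                return self._forms[tag]
        return None


sel = Selection()
sel.offer("TEXT", "b\u00fccher.example")            # plain text form
sel.offer("DOMAIN_NAME", "xn--bcher-kva.example")   # identifier form (assumed ACE)

# A text editor composing a message body only asks for text.
editor_gets = sel.request(["TEXT"])

# A browser location field prefers the identifier, falling back to text.
browser_gets = sel.request(["DOMAIN_NAME", "TEXT"])
```

The point of the sketch is that the receiver, not the sender, decides
which form it wants, which is exactly what a single-representation
buffer cannot offer.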

But if only one representation can be put in the buffer, it is not clear
whether the ACE form or the non-ACE form should be used, and the IDNA
spec does not require either one, because the buffer is not a domain
name slot and is not user-visible.

Simon Josefsson <jas@extundo.com> wrote:

> > Even if MUTT would become IDNA-aware in the future, copy & paste
> > operations grab the IDN-like strings directly from the xterm, not
> > from the MUTT.  So, the MUTT cannot have any opportunity to toss the
> > ACE-encoded IDN into the receiving applications or the clip board
> > area.  Text-based MUA does not have any copy&paste support to/from
> > it.  Xterm does all the job.
>
> The specification seems quite clear on what should happen here -- if
> there is no negotiation, ACE should be used.  TTY MUAs therefore must
> display ACE strings as there is no negotiation between xterm and the
> MUA that an IDNA string is being displayed.

That conclusion does not follow from the IDNA spec.  ASCII forms are
required only in IDN-unaware domain name slots.  The tty is not a domain
name slot; it's just a generic text terminal.

It would be silly to forbid all tty-based applications from displaying
non-ASCII domain names, just because cut&paste might fail sometimes.
IDNA does not make such a prohibition.

AMC