
[idn] Re: Document Status?



--On 2002-09-01 19.42 +0200 Simon Josefsson <jas@extundo.com> wrote:
> At least in X11 cut'n'paste works by transferring charset-tagged but
> otherwise opaque character arrays.

Ok. Good.

> What you are proposing seems to
> require a cut'n'paste protocol to be implemented in both the MUA and
> the address book application.

Not at all.

What I am saying is that one should put the ACE-encoded string in the paste
buffer. Furthermore, that is what will happen anyway when an application
knows nothing about IDNA. On systems like MacOS, where the paste buffer can
hold alternative forms of the data, it is possible to define a new type for
the Unicode version of the domain name.

> Whether the
> strings are to be ACE encoded or raw encoded is not specified anywhere
> as far as I can tell, and different implementations will choose
> different strategies.

IDNA says that if no negotiation exists between two entities that exchange
domain names, ACE encoding should be used. There is no difference between
a protocol that runs over IP and the paste buffer. It is the same thing.
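
As a minimal sketch of that rule, here is how an application could pick the
form it hands over. This is Python, using its built-in "idna" codec (which
implements the RFC 3490 version of IDNA). The function name and the
unicode_negotiated flag are made up for illustration; they are not part of
IDNA or of any paste-buffer API.

```python
def domain_for_exchange(domain: str, unicode_negotiated: bool = False) -> str:
    """Return the form of `domain` to hand to another application.

    Hypothetical helper: `unicode_negotiated` stands for "both sides have
    agreed, by some means outside this function, to exchange Unicode".
    """
    if unicode_negotiated:
        # Both sides agreed on Unicode, so pass the name through unchanged.
        return domain
    # No negotiation: fall back to the ACE form, label by label.
    return domain.encode("idna").decode("ascii")


if __name__ == "__main__":
    print(domain_for_exchange("bücher.example"))
    print(domain_for_exchange("bücher.example", unicode_negotiated=True))
```

A plain ASCII name such as "example.com" passes through unchanged, which is
why an application that knows nothing about IDNA ends up sending ACE anyway.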

> In general, cut'n'paste of IDNA in the real world is not well defined,
> since IDNA only solves the IDNA problem for Unicode, and the real
> world isn't running Unicode everywhere.

IDNA does specify how to encode a domain name that is to be passed between
two applications. If there is no negotiation, ACE-encoded Unicode is to be
used.

> There are other scenarios as well.
> 
> (c) The email address was located in the message body, and thus not
>     ACE encoded.  If the message body was non-Unicode, see (d) for the
>     rest of the story, if the message body was Unicode, it is not
>     clear which application, or if at all, will ACE encode it, and you
>     have the situation in (a) again.

You already have this problem today, because detecting where a URI starts
and ends inside unstructured data like an email message body is difficult.
Yes, you will get problems in this situation.

> (d) Email program understands IDNA but is running in a non-Unicode
>     environment.  The address is tagged and is transferred to address
>     book application using e.g. ISO-8859-1.  IDNA doesn't handle or
>     care about this scenario, but it does exist in the real world
>     (e.g. my machine).

See above. If the address is to be transferred to another application, it
should either be negotiated that it is an IDN in a specific charset (like
8859-1) or sent as ACE.
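
For the "sent as ACE" branch, a rough sketch of what the sending application
could do with a charset-tagged address, again in Python with its built-in
"idna" codec. The function name is hypothetical, and the sketch assumes the
charset tag is correct and that only the domain part needs converting; the
local part is passed through untouched.

```python
def address_to_ace(raw: bytes, charset: str) -> str:
    """Convert a charset-tagged email address to one with an ACE domain.

    Hypothetical helper: `raw` is the address as bytes, `charset` is the
    tag that came with it (for example "iso-8859-1").
    """
    # Step 1: use the charset tag to get from opaque bytes to Unicode.
    text = raw.decode(charset)
    # Step 2: split at the last "@"; only the domain part is an IDN.
    local, sep, domain = text.rpartition("@")
    if not sep:
        raise ValueError("not an email address: no '@' found")
    # Step 3: ACE-encode the domain and leave the local part as it was.
    return local + "@" + domain.encode("idna").decode("ascii")


if __name__ == "__main__":
    # 0xFC is u-umlaut in ISO-8859-1.
    print(address_to_ace(b"user@b\xfccher.example", "iso-8859-1"))
```

Without the charset tag, step 1 is a guess, which is the reason the charset
has to be negotiated if the name is not sent as ACE.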

   paf