Re: An argument against multiple character sets

To: idn@ops.ietf.org
Subject: Re: An argument against multiple character sets
From: Paul Hoffman / IMC <phoffman@imc.org>
Date: Sun, 23 Jan 2000 14:55:06 -0800
Delivery-date: Sun, 23 Jan 2000 14:56:56 -0800
Envelope-to: idn-data@psg.com

At 10:08 PM 1/23/00 +0100, Harald Tveit Alvestrand wrote:
>Note: This is the UTF-16 (or UCS-2) representation of Unicode.

UTF-16BE, to be exact. Kinda near and dear to my heart right now.

>Your argument indicates that adding character sets to a list after initial 
>implementation is impossible.

That's one argument, yes, but not the only one.

>  It doesn't mean that the initial set needs to be just one, although a 
> server has to be able to compare strings between all the initial 
> character sets - which is clearly a bit simpler if there is just one of them.

I don't think that is even enough. Without labelling the query from the 
user to the resolver with the character set and encoding, how would the 
resolver know whether a request with 0x46F9 was LATIN CAPITAL LETTER F 
followed by LATIN SMALL LETTER U WITH OGONEK (8859-4) or LATIN CAPITAL 
LETTER F followed by HEBREW LETTER SHIN (8859-8)?
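
A minimal sketch of that ambiguity, in Python, assuming the standard 
library codec names `iso8859_4` and `iso8859_8`; the helper `describe` is 
hypothetical and only for illustration. The same two unlabelled octets 
decode to different characters depending on which character set the 
resolver guesses:

```python
import unicodedata

# The two unlabelled octets from the example above.
QUERY = bytes([0x46, 0xF9])

def describe(octets: bytes, charset: str) -> list[str]:
    """Decode octets under an assumed charset and return the Unicode names."""
    return [unicodedata.name(ch) for ch in octets.decode(charset)]

# Without a label, the resolver has no basis for picking one reading.
for charset in ("iso8859_4", "iso8859_8"):
    print(charset, describe(QUERY, charset))
```

The first octet reads the same under both character sets; only the second 
one (0xF9) changes meaning, which is enough to make the two names different.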

>However, I think the *requirement* you are trying to state is that when a 
>domain name is represented as text on paper, the user who thinks he has 
>access to suitable input devices for that text should be able to query on 
>that string and have returned information about the domain that the text 
>on paper was intended to represent.

In the absence of a single character set and encoding, yes. It also puts 
much more load on the resolver, which now needs to be able to translate 
from every encoding that might come from a user to every encoding that 
might be used in the domain name.
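
As a rough sketch of what that translation involves, assuming every query 
and every stored name carries a charset label: the function `same_name` 
below is hypothetical, uses Python's built-in codecs, and pivots through 
Unicode rather than converting pairwise. It ignores case folding and 
normalization, which a real comparison would also need.

```python
def same_name(query: bytes, query_charset: str,
              stored: bytes, stored_charset: str) -> bool:
    """Compare two labelled names by decoding both to Unicode first."""
    return query.decode(query_charset) == stored.decode(stored_charset)

# Stored name: F followed by HEBREW LETTER SHIN, held as UTF-16BE.
STORED = b"\x00\x46\x05\xe9"

# The same query octets match or miss depending on the label they carry.
print(same_name(b"\x46\xf9", "iso8859_8", STORED, "utf_16_be"))
print(same_name(b"\x46\xf9", "iso8859_4", STORED, "utf_16_be"))
```

Even with a pivot, the resolver still needs a decoder for every character 
set a user might send, and the comparison only works when the label is 
present and correct.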

--Paul Hoffman, Director
--Internet Mail Consortium
