
Re: [idn] Re: An idn protocol for consideration in making the requirements



Hi Dan,

While this is really off topic for the moment, it is pointless to flame
other people's proposals because of your own personal belief of "how things
must be done". To be more constructive and proactive, I would suggest you try
writing a proposal for IDN in UTF-8 instead.

That way, the WG can focus on discussing the merits of each proposal instead
of spending its time shooting down proposal after proposal. It would be very
disappointing if the WG ended up saying "all proposals are no go". :(

But before we go that far, I think we have some milestones we need to meet.
Let's stay on track. *cheers* :-)

-James Seng

Dan Oscarsson wrote:
> 
> >
> >At 02:01 PM 2/1/00 +0100, Dan Oscarsson wrote:
> >>I see it as very important that we do NOT allow the solution to
> >>be encoded in ASCII! It is time everybody learns to deal with the
> >>problems of having more than the ascii subset.
> >
> >Since we are discussing requirements, could you state your reason for this?
> >What is the strong advantage of having a non-ASCII *encoding* for
> >internationalized names on the wire as long as the rest of the requirements
> >are met?
> >
> >Personally, I believe that if we come up with a compatible encoding for the
> >full set of internationalized characters, the software industry will rush
> >to make input and display mechanisms for the encoding. The desire for
> >internationalization is just too strong for them to ignore it.
> 
> Just look at MIME in e-mail. There we got ASCII encoding of everything:
> quoted-printable, base64. After so many years with MIME, many tools
> still display the transport encoding and store e-mail in transport
> encoding instead of in a user-friendly way. I still quite often have to
> read quoted-printable because software cannot handle internationalisation
> or localisation.
> 
> I also do not want every protocol to define its own encoding of
> non-ascii: e-mail using quoted-printable and any number of character
> sets, IMAP using UTF-7, DNS using some UTF-5 or other complex encoding.
> I want all protocols for transport to use the same encoding of
> international characters. Then I can reuse my software to encode/decode
> the transport format in all my programs.
> As of today I can see only one good choice that can be accepted by
> many people: UCS encoded using UTF-8. It is ascii compatible and fairly
> compact. To then encode UTF-8 into ascii so that no value over 127
> exists in a byte is an unnecessary waste of bandwidth and resources.
> Accommodating people whose software can only display ascii is not a good
> reason to choose some format that encodes everything in a-z. Even
> they have to learn to handle at least 8-bit byte values.
> 
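A rough illustration of the size argument in the quoted paragraph above.
This is only a sketch: the host name is made up, and base64 and
quoted-printable stand in for "some ASCII-only encoding"; neither is a
proposal from this thread.

```python
import base64
import quopri

# Hypothetical host name (r-a-umlaut-k-s-m-o-umlaut-r-g-a-ring-s), written
# with escapes so the example does not depend on the source file encoding.
name = "r\u00e4ksm\u00f6rg\u00e5s.example"

utf8 = name.encode("utf-8")        # contains byte values above 127
b64 = base64.b64encode(utf8)       # ASCII only
qp = quopri.encodestring(utf8)     # ASCII only, non-ASCII bytes become =XX

for label, data in (("utf-8", utf8), ("base64", b64), ("quoted-printable", qp)):
    print(f"{label:17} {len(data):3} bytes  {data!r}")

# A pure ASCII name is byte-for-byte the same in ASCII and in UTF-8,
# which is the "ascii compatible" property referred to above.
assert "example.com".encode("utf-8") == "example.com".encode("ascii")
```

For this particular name the UTF-8 form is 21 bytes and the base64 form is
28; how much an ASCII-only encoding costs in general depends on the encoding
and on the script being encoded.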
> I can easily write software that converts between my local character set
> and UCS normalised using form C encoded in UTF-8, but not software for
> all the different encodings in all protocols.
> 
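The conversion described in the quoted paragraph above can be sketched in a
few lines. This is a minimal illustration, assuming Latin-1 as the local
character set; the function names to_wire and from_wire are invented for the
example.

```python
import unicodedata

def to_wire(raw: bytes, local_charset: str) -> bytes:
    """Local character set -> UCS, normalised to form C, encoded as UTF-8."""
    text = raw.decode(local_charset)
    return unicodedata.normalize("NFC", text).encode("utf-8")

def from_wire(wire: bytes, local_charset: str) -> bytes:
    """UTF-8 on the wire -> local character set.

    Raises UnicodeEncodeError if the local set lacks a character.
    """
    return wire.decode("utf-8").encode(local_charset)

# Latin-1 bytes for a name containing a-umlaut, o-umlaut and a-ring.
latin1_name = b"r\xe4ksm\xf6rg\xe5s"
wire = to_wire(latin1_name, "latin-1")
assert from_wire(wire, "latin-1") == latin1_name

# Form C composes "a" + COMBINING DIAERESIS (U+0308) into U+00E4, so both
# spellings of the same character end up as the same bytes on the wire.
decomposed = "a\u0308".encode("utf-8")
assert to_wire(decomposed, "utf-8") == "\u00e4".encode("utf-8")
```

The same two functions would serve any protocol that agreed on this wire
format, which is the reuse argument being made.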
> >
> >>ASCII compatibility and backward compatibility are good, if they do
> >>not make things bad for everybody for whom ascii is not enough.
> >>It is better to break software and get it fixed instead of
> >>still trying to make everything work in a world as it was
> >>at the dawn of the computer age.
> >
> >I fundamentally disagree with this last sentence for two reasons:
> >- You have not shown that you need to break software in order to fix the
> >problem of lack of internationalization
> >- It is never a good idea to break the existing base of software,
> >particularly when you are talking about breaking a wide variety of
> >protocols across many levels of the Internet architecture. You should only
> >do this when there is no other viable alternative.
> 
> OK. "Break" is maybe too hard a word. What I want is not for software to
> break, but for the need to handle non-ascii to be very apparent.
> 
> If we have host names encoded using UTF-8 and my software displays
> the name badly or does not allow me to enter a non-ascii name, then I
> can easily complain to the software producer and get it fixed. If
> everything is encoded as ascii, then much software will not complain when
> non-ascii host names are used, and it will take much longer before it
> gets fixed.
> 
> With UTF-8, everything will work as before as long as only ascii is used,
> but problems may occur when non-ascii is used. Fine, then we can quickly
> find what software needs fixing. If everything is encoded using
> ascii-only code values, I am sure that 10 years from now we will still
> have lots of software handling only ascii where it should handle non-ascii.
> 
> For my needs I have better formats than UTF-8, but for the needs of
> the entire world UTF-8 is the best choice I can see now. It is
> ascii compatible, fairly compact, and can incorporate all characters
> in the world.
> 
> Let us make UCS normalised using form C encoded in UTF-8 the
> only recommended choice for interoperability!
> 
> Or does anyone have a better format?
> 
>    Dan