
Re: [idn] Document Status?



One working definition of internationalization is that the encoding/expression is "understood" by speakers of all languages. There is global agreement, I believe, that block Latin characters can be used by anyone in any country to express the name of a destination country in a postal address. So for example "UNITED STATES" or "FRANCE" or "AUSTRALIA", "JAPAN", "VIETNAM" are all considered acceptable in every country. This agreement allows, for example, that the destination address, except for the name of the country, can be rendered in a language local to the target country and does not have to be understood by the postal service in the originating country. Consequently, someone sending a letter from the US to a recipient in Vietnam can write the destination address in Vietnamese and the US postal service need only understand the characters "VIETNAM" at the bottom of the destination address.

Multilingualization is more focused on what is sometimes called "localization" - that is, the characters used in rendering a local language can be used (e.g. for domain names or for filling out forms etc) and these renderings need not be universally understood.

This definitional distinction helps (me anyway) to appreciate that the creation of multilingual domain names may not necessarily contribute to a universal ability to use the resulting strings, because it may be difficult or even impossible to render or enter arbitrary character sets at the user interface to a local service. We have collectively probably created some confusion for ourselves by using the term "internationalized domain names" to cover both concepts. It strikes me that the IDNA documents are aimed more at localization/multilingualization than at internationalization, using the "definition" in the first paragraph above.

Concerns about how cut/paste will work are germane to the discussion about the utility of IDNs because such actions may be the ONLY way in which someone may be able to enter special character strings into text intended to represent an IDN. Something like this happens to me regularly as I compose email to friends whose names involve the use of characters with various accent markings. Since I don't know how to enter these from my simple ASCII keyboard, I usually end up cutting and pasting the characters. This works because the text of email is permitted to be pretty general in its encoding. I don't know how that would work out if I were dealing with non-Latin character sets. I know I would need special software to render Hangul or Kanji, for instance, but I assume that the rendering packages also serve to make highlighting and cut/paste work. 

One of the important questions that I sense is being asked in the discussion of IDNA is just how applications that encounter these encoded objects/strings should handle them, particularly if the intent is to allow cut/paste to insert these encodings into places where domain names are expected to appear. Precision in the specification and elimination of ambiguity seem essential to assure that the many software pieces involved actually interwork, so that the whole scheme is "universally" beneficial.
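To make the cut/paste concern concrete, here is a minimal sketch of how an application might reduce a pasted name to a single ASCII-compatible form before comparing or looking it up. It assumes Python's built-in "idna" codec, which implements the later published IDNA 2003 rules (nameprep plus Punycode) rather than any particular draft under discussion here; the helper names to_ace and same_domain and the name "bücher.example" are hypothetical, chosen only for illustration.

```python
import unicodedata


def to_ace(name: str) -> str:
    """Return the ASCII-compatible encoding of a domain name.

    Assumption: Python's built-in "idna" codec (IDNA 2003) stands in
    for whatever conversion the application actually uses.
    """
    return name.encode("idna").decode("ascii")


def same_domain(a: str, b: str) -> bool:
    """Compare two names by their ASCII-compatible forms (hypothetical helper)."""
    return to_ace(a) == to_ace(b)


# "bücher.example" as typed: u-with-diaeresis is one code point (U+00FC).
typed = "b\u00fccher.example"

# The same visible text as it might arrive from a paste buffer:
# a plain "u" followed by a combining diaeresis (U+0308).
pasted = unicodedata.normalize("NFD", typed)

print(to_ace(typed))               # ASCII form that would go into the DNS
print(typed == pasted)             # raw strings differ code point by code point
print(same_domain(typed, pasted))  # encoded forms agree after normalization
```

The two strings look identical on screen but differ as sequences of code points, which is exactly the kind of ambiguity the specification has to remove: here the comparison succeeds only because the conversion normalizes its input before encoding.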

I hope I haven't done any damage to all the concepts expressed in the long debate about IDNA - but I am sure that people more knowledgeable than I will set me straight if this message contains serious misunderstandings.

Vint Cerf

At 05:24 AM 9/1/2002 +0000, Adam M. Costello wrote:
>I don't know exactly what the difference is between internationalization
>and multilingualization.  I think one reason the latter term was
>not used is that domain names have no language tag.  Maybe there
>were other reasons, or maybe it was arbitrary.

Vint Cerf
SVP Architecture & Technology
WorldCom
22001 Loudoun County Parkway, F2-4115
Ashburn, VA 20147
703 886 1690 (v806 1690)
703 886 0047 fax