
Re: [idn] I-D ACTION:draft-ietf-idn-vidn-00.txt



Brian,

Thank you for your comments. Please see my responses below.

Sung

----- Original Message -----
From: Brian W. Spolarich <briansp@walid.com>
To: FDU - Sung Jae Shim <sshim@mailbox.fdu.edu>
Cc: <idn@ops.ietf.org>
Sent: Wednesday, November 22, 2000 2:11 PM
Subject: Re: [idn] I-D ACTION:draft-ietf-idn-vidn-00.txt


> On Mon, 20 Nov 2000, FDU - Sung Jae Shim wrote:
>
> | Sung: Yes, VIDN maps the letters of non-English languages into
> | [a-z,0-9, and hyphen] that already exist in the DNS, in the same way
> | that people in regions where English is not widely spoken, currently
> | create their domain names in English.
>
>   So is it safe to say that VIDN is, in the context of
> draft-ietf-idn-compare-01, an 'arch-3: Just send ACE' proposal?  And, in
> the context of 'where to do IDN', VIDN proposes that the ACE encoding
> happens in the application (as opposed to the resolver, recursive server,
> authoritative server, or root server)?
>

Sung: Not exactly. VIDN and ACE differ in that VIDN proposes that host
naming rules remain the same and that all domain names be sent in "ASCII"
format, while ACE proposes that host naming rules remain the same and that
all internationalized domain names be sent in "ACE" format.

Sung: VIDN proposes that the conversion from non-ASCII format into ASCII
format take place in end-user applications. Alternatively, VIDN can also
be implemented at the name server or resolver. Please see section 4.3 of
the draft for details of VIDN implementation.
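
To make the application-side placement concrete, here is a minimal sketch
of a conversion step that runs before the name reaches the resolver. It is
only an illustration, not the draft's implementation: the function name
to_actual_name, the one-entry DEMO_TABLE, and demo_convert are hypothetical
stand-ins for whatever conversion module an application would ship.

```python
from typing import Callable

def to_actual_name(name: str, convert: Callable[[str], str]) -> str:
    """Convert each non-ASCII label; pass ASCII labels through untouched."""
    labels = name.split(".")
    return ".".join(
        label if label.isascii() else convert(label) for label in labels
    )

# Hypothetical one-entry table standing in for a real conversion module.
DEMO_TABLE = {"\uc911\uc559": "jungang"}

def demo_convert(label: str) -> str:
    return DEMO_TABLE[label]

actual = to_actual_name("\uc911\uc559.com", demo_convert)
print(actual)  # jungang.com
# The application would hand `actual` to the ordinary resolver call,
# for example socket.getaddrinfo(actual, 80); the DNS sees only ASCII.
```

Because ASCII labels pass through unchanged, existing names behave exactly
as before, and the same function could equally be called from a resolver
library instead of from each application.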

>   As such, I'm not sure I understand how VIDN is better than, say,
> draft-ietf-idn-idnra, using any of the proposed ACE formats.  In the other
> proposals, the encoding is done on the basis of mapping Unicode codepoints
> to the RFC1035-mandated characters.  While the result is perhaps ugly, the
> mapping is deterministic, straightforward, and can readily be processed by
> applications for re-display in a localized script.  Indeed, that's one of
> the assumptions behind the whole prefix issue:  by identifying the encoded
> domain name with some (hopefully) unique affix, applications can display
> the string in the original set of codepoints.
>

Sung: With the code-matching scheme described in the draft, VIDN can also
achieve a one-to-one, reversible mapping. Perhaps, as Paul Hoffman
suggested in his earlier email regarding this draft, VIDN could use a
prefix hack similar to the ones proposed in ACE schemes.

Sung: There is another difference between ACE and VIDN. In ACE,
internationalized domain names should be created in ACE format, and the
resulting domain names may be just an unintelligible collection of ASCII
characters. Only the original scripts before the ACE encoding may be
intelligible to those who know the respective local language. But in VIDN,
there is no need to create additional domain names in any format other than
ASCII. Further, in VIDN,
"virtual" domain names "used" in non-ASCII format as well as "actual" domain
names "registered" in ASCII format are intelligible. VIDN converts non-ASCII
characters into the corresponding ASCII characters at a higher level of
human-readable characters, while ACE schemes encode non-ASCII characters
into ACE characters at a lower level of machine-readable coded characters.
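
To illustrate what "a lower level of machine-readable coded characters"
looks like in practice, here is a minimal ACE-style sketch. It is not any
of the proposed ACE formats: the "xx--" prefix and the choice of base32
over UTF-16 are assumptions made only for this example.

```python
import base64

ACE_PREFIX = "xx--"  # hypothetical marker, not a registered ACE prefix

def ace_encode(label: str) -> str:
    """Map a label's code points onto [a-z2-7] and tag it with a prefix."""
    if label.isascii():
        return label  # plain ASCII labels are left alone
    raw = label.encode("utf-16-be")
    body = base64.b32encode(raw).decode("ascii").rstrip("=").lower()
    return ACE_PREFIX + body

def ace_decode(label: str) -> str:
    """Recover the original code points from a prefixed label."""
    if not label.startswith(ACE_PREFIX):
        return label
    body = label[len(ACE_PREFIX):].upper()
    body += "=" * (-len(body) % 8)  # restore the stripped base32 padding
    return base64.b32decode(body).decode("utf-16-be")

encoded = ace_encode("\uc911\uc559")
print(encoded)              # prefixed ASCII string, not meant to be read
print(ace_decode(encoded))  # the original two Korean characters
```

The encoded label round-trips exactly, but a reader cannot pronounce it or
guess its meaning, which is the contrast with a transliterated name such
as "jungang".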

>   In this context, the only significant difference that I see between
> VIDN and other ACE proposals is that the VIDN-encoded strings can be
> 'read' by humans with some hope of being able to figure out what the
> corresponding 'actual' domain name is.
>

Sung: In VIDN, there is no need for the end user to guess or figure out what
the actual domain name in ASCII format is. If the end user knows what the
actual domain name in ASCII format is, he or she will use it instead of its
corresponding virtual domain name in non-ASCII format. The end user can use
a virtual domain name in non-ASCII format, which is clearly more intuitive
and meaningful in the local context, without needing to know what the
corresponding actual domain name in ASCII format is. It is the job of VIDN
to convert the virtual domain name in non-ASCII format into its
corresponding actual domain name in ASCII format.

> | Sung: VIDN does not need round-trip mapping, although it may be
> | possible to convert characters from English back to local languages.
> | What is the use of this reversed conversion? Is it for those who speak
> | English, so that they can use English to go to domain names registered
> | in local languages? In VIDN, there are no domain names created and
> | registered in local languages. Those who speak English do not need
> | domain names in local languages. Please do not forget that domain
> | names in local languages are for those who do not speak English, not
> | for those who speak English. Again, VIDN does not create and register
> | domain names in local languages, and VIDN needs only the domain names
> | in English that actually exist in the current DNS.
>
>   I think perhaps you're confusing 'language' with 'script' here.  VIDN
> does not propose to represent domain names in English.  The string
> 'jungang.com' is not an English-language sequence, but an imperfect
> representation of a Korean-language set of phonemes in English.
>
>   Paul Hoffman's document draft-hoffman-i18n-terms-00 provides a good
> discussion of some of these distinctions.
>

Sung: VIDN proposes to keep domain names only in ASCII format as the current
DNS does. The string "jungang" is an ASCII representation of a set of Korean
(non-ASCII) characters, and both represent the phonemes that have the same
or proximate sounds.
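
As a minimal sketch of this kind of sound-based conversion, the code below
splits each precomposed Hangul syllable into its initial consonant, vowel,
and final consonant using the standard Unicode arithmetic, then looks each
part up in a table. The tables are an assumption: a simplified
romanization chosen for this example, not the knowledge base described in
the draft.

```python
# Simplified romanization tables (an assumption for this sketch).
LEADS = ["g", "kk", "n", "d", "tt", "r", "m", "b", "pp", "s",
         "ss", "", "j", "jj", "ch", "k", "t", "p", "h"]
VOWELS = ["a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "wa", "wae",
          "oe", "yo", "u", "wo", "we", "wi", "yu", "eu", "ui", "i"]
TAILS = ["", "k", "k", "k", "n", "n", "n", "t", "l", "k", "m", "l", "l", "l",
         "p", "l", "m", "p", "p", "t", "t", "ng", "t", "t", "k", "t", "p", "t"]

HANGUL_BASE = 0xAC00    # first precomposed Hangul syllable
HANGUL_COUNT = 11172    # 19 leads * 21 vowels * 28 tails

def romanize(text: str) -> str:
    """Transliterate Hangul syllables to ASCII; pass other characters through."""
    out = []
    for ch in text:
        index = ord(ch) - HANGUL_BASE
        if 0 <= index < HANGUL_COUNT:
            lead, rest = divmod(index, 21 * 28)
            vowel, tail = divmod(rest, 28)
            out.append(LEADS[lead] + VOWELS[vowel] + TAILS[tail])
        else:
            out.append(ch)
    return "".join(out)

print(romanize("\uc911\uc559") + ".com")  # jungang.com
```

A table this simple is not reversible on its own, because several final
consonants share one ASCII letter; any one-to-one mapping would need
additional information beyond the sounds alone.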

>   The general assumption of other proposals has been on representing
> scripts in the DNS, typically using some variant or encoding of the
> Unicode-3 or ISO:IEC-10646 character set.  To be honest, it never occurred
> to me to even think about higher-level 'language'-oriented approaches,
> because this domain is much less well-defined.
>

Sung: I think I can understand why. To be honest, when I first became
interested in the issue of internationalized domain names, I was surprised
to find that most, if not all, earlier proposals for internationalized
domain names were trying to physically create or represent multilingual
scripts in the DNS one way or another. It seemed to me that they may have
been preoccupied with lower-level "technical" approaches, trying to work
out ways to physically create or represent multilingual scripts in the DNS
in order to allow the use of multilingual domain names. In my opinion,
"using" internationalized domain names does not necessarily involve
"creating" internationalized domain names. It would be much better to
allow using internationalized domain names without actually creating them,
and I think VIDN achieves this.

>   Your arguments for this proposal include a statement that VIDN is the
> way to go because other methods involve a 'lengthy and costly process of
> implementation'.  I don't believe this is an accurate statement
> (depending on what you mean by 'lengthy' or 'costly'), or that VIDN
> proposes a better alternative.  In the case of a script-based approach to
> IDN, there are extremely well-defined standards (Unicode and
> ISO/IEC-10646) for encoding an extremely comprehensive set of characters
> in a standard way.  Indeed the ISO and Unicode work in this space has been
> in process for nearly 20 years.  In contrast, your proposal suggests that
> the basis for encoding should be a 'language X' to English
> transliteration, for which there are no common standards, much less mature
> ones like Unicode.
>

Sung: In ACE, domain names in ACE format should be created and registered,
in addition to existing domain names in ASCII format. As one example of
the time and cost involved, please consider what users and the DNS would
have to spend to do this. VIDN does not need to do this.

Sung: If you know any local language other than English, please take a look
at domain names currently used in the regions where the local language is
widely spoken. I am sure that you will see some common patterns in those
domain names, regarding how the characters of the local language are
transliterated into the characters of English. VIDN uses these common
patterns of transliteration in converting non-ASCII characters into ASCII
characters.

>   In terms of cost, I'm not sure I understand how any ACE-based approach
> is going to cost more or less than any other.  Once the algorithm is
> defined and agreed-upon, writing implementation code is pretty
> straightforward.
>

Sung: Again, please consider the time and cost required on the part of
users and the DNS to create and register domain names in ACE format in
addition to existing domain names in ASCII format. VIDN does not need to do
this.

>   Also, some phonemes aren't even directly representable in the RFC1035
> character set (the click sounds in some African languages, for example are
> usually represented with an exclamation point '!').
>

Sung: I do not know how they can be represented in the current DNS. If they
are not representable in the RFC1035 character set, they are not
representable in VIDN, either. I guess they are onomatopoeia. I am curious
about how they can be represented in ACE and other methods.

> | > b. There are no transliteration standards for many, many languages.
> | >
> |
> | Sung: Without such standards, people speaking local languages have
> | been creating and registering domain names in English. The most common
> | way to create domain names in English is to transliterate the
> | characters in local languages into the characters in English that have
> | the same or proximate sounds. VIDN uses the knowledge of this
> | transliteration based upon the sound or phonemic systems of the
> | respective local language and English. Please take a look at how those
> | domain names in English have been created in regions where English is
> | not widely spoken, without such standards.
>
>   If I correctly interpret your argument, you're saying "People are
> already doing something like VIDN today, so we should just formalize
> it."  I'm not sure how compelling of an argument that is.
>

Sung: Again, if you know any local language other than English, please take
a look at domain names currently used in the regions where the local
language is widely spoken. I am sure that you will see some common patterns
in those domain names, regarding how the characters of the local language
are transliterated into the characters of English. VIDN uses these common
patterns of transliteration in converting non-ASCII characters into ASCII
characters.

>   Given that the focus of IDN should be to seamlessly enable resolution of
> multilingual names for end-users, I'm not sure I understand what
> particular value the VIDN-proposed encoding scheme has over any other.  If
> the end-user is presented with a VIDN-encoded name, but isn't able to read
> the name as represented in ASCII characters and English syllables, what
> is the value of it?
>

Sung: I have already described above some important differences between VIDN
and other methods, which I believe are advantages of VIDN over other
methods.

Sung: Again, in VIDN, there is no need for the end user to guess or figure
out what the actual domain name in ASCII format is. If the end user knows
what the actual domain name in ASCII format is, he or she will use it
instead of its corresponding virtual domain name in non-ASCII format. The
end user can use a virtual domain name in non-ASCII format, which is clearly
more intuitive and meaningful in the local context, without needing to know
what the corresponding actual domain name in ASCII format is. It is the job
of VIDN to convert the virtual domain name in non-ASCII format into its
corresponding actual domain name in ASCII format.

> | Sung: Because of its small size (e.g., the testing version of VIDN for
> | Korean-English conversion is about 800KB and the actual DLL file used
> | for the conversion is about 250KB), VIDN can be easily embedded into
> | user programs that use domain names, such as web browser and client
> | email software. Alternatively, the knowledge base of conversion and
> | the logic to process it can be embedded into operating systems as a
> | library, so that client software such as web browser and email
> | software can share them. The user will need only the module for
> | conversion of his or her preferred local language into English. Again,
> | there is no need to convert the romanizations back to native
> | characters for every language.
>
>   Why do this in the application?  That will require each and every
> application be modified for IDN.  Why not do this in the resolver and
> worry about fixing particular applications that get in the way?
>

Sung: VIDN proposes that the conversion from non-ASCII format into ASCII
format take place in end-user applications. Alternatively, VIDN can also
be implemented at the name server or resolver. Please see section 4.3 of
the draft for details of VIDN implementation.

>   -brian
>
>
>
>
>