Re: [idn] IDN WG Last Call: draft-ietf-idn-punycode-00.txt



The decoding process is a bit hard to follow. I would recommend adding
some annotated examples of Punycode decoding in a new section 7.1 and
referencing them from the text. E.g.
...
3. Bootstring description

    Bootstring represents an arbitrary sequence of code points (the
    "extended string") as a sequence of basic code points (the
    "basic string").  This section describes the representation.
    Section 6 "Bootstring algorithms" presents the algorithms as
    pseudocode. [[[Section 7.1 "Annotated Decoding Examples"
    provides some examples of the decoding process which may
    be useful in understanding the following text.]]]
...

...
7.1 Annotated Decoding Examples
The following provides two examples of the decoding process. One
example contains mostly ASCII letters with a few accented characters,
while the other has only non-ASCII characters:

[[[The deltas are just pulled out of a hat for illustration; you'd run
your program to generate them.]]]

1. "Proprostnemluvesky-uyb24dma41a"

Since there is a delimiter, the basic string consists of:
  "Proprostnemluvesky"
and the delta codes are:
  "uyb24dma41a"
The delta codes provide instructions on where to insert other
characters in the stream.
The first delta is contained in "uy", and produces the number +YYYY.
It inserts <ccaron> = U+XXXX at position 3, giving:
  "Pro<ccaron>prostnemluvesky"
[[[insert the rest of the steps]]]
The final result is:
  "Pro<ccaron>prost<ecaron>nemluv<iacute><ccaron>esky"

2. "ihqwcrb4cv8a8dqg056pqjye"

Since there is no delimiter, the basic codes are empty to start with,
and the entire encoded string consists of deltas.
The first delta is contained in "ih", and produces the number +XXXX.
It inserts <4ED6> at the start, giving:
 <4ED6>
[[[insert the rest of the steps]]]
The final result is:
<4ED6><4EEC><4E3A><4EC0><4E48><4E0D><8BF4><4E2D><6587>
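
For anyone who wants to generate the real steps for the two examples
above, here is a minimal Python sketch of a tracing decoder. It is not
taken from the draft: the parameter values (base 36, tmin 1, tmax 26,
skew 38, damp 700, initial bias 72, initial n 0x80) are the ones I
believe Punycode uses, and the names decode_with_trace, digit_value and
adapt are only illustrative. Overflow checks are omitted because Python
integers are unbounded.

```python
# Sketch of a Punycode decoder that records each insertion step.
# Parameter values are assumed, not copied from the draft under review.
BASE, TMIN, TMAX, SKEW, DAMP = 36, 1, 26, 38, 700
INITIAL_BIAS, INITIAL_N = 72, 0x80
DELIMITER = "-"


def digit_value(ch):
    """Map a basic code point to its digit value (a-z = 0..25, 0-9 = 26..35)."""
    if "a" <= ch <= "z":
        return ord(ch) - ord("a")
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A")
    if "0" <= ch <= "9":
        return ord(ch) - ord("0") + 26
    raise ValueError("not a Punycode digit: %r" % ch)


def adapt(delta, numpoints, first):
    """Bias adaptation, run after each decoded delta."""
    delta = delta // DAMP if first else delta // 2
    delta += delta // numpoints
    k = 0
    while delta > ((BASE - TMIN) * TMAX) // 2:
        delta //= BASE - TMIN
        k += BASE
    return k + ((BASE - TMIN + 1) * delta) // (delta + SKEW)


def decode_with_trace(encoded):
    """Return (decoded string, [(position, code point), ...])."""
    # Everything before the last delimiter is the basic string;
    # with no delimiter, the whole input consists of deltas.
    cut = encoded.rfind(DELIMITER)
    basic, deltas = (encoded[:cut], encoded[cut + 1:]) if cut > 0 else ("", encoded)
    output = list(basic)
    n, i, bias = INITIAL_N, 0, INITIAL_BIAS
    steps = []
    pos = 0
    while pos < len(deltas):
        old_i, w, k = i, 1, BASE
        while True:  # decode one generalized variable-length integer
            if pos >= len(deltas):
                raise ValueError("truncated delta")
            digit = digit_value(deltas[pos])
            pos += 1
            i += digit * w
            t = TMIN if k <= bias else TMAX if k >= bias + TMAX else k - bias
            if digit < t:
                break
            w *= BASE - t
            k += BASE
        bias = adapt(i - old_i, len(output) + 1, old_i == 0)
        n += i // (len(output) + 1)   # how far the code point advances
        i %= len(output) + 1          # where the new code point goes
        output.insert(i, chr(n))
        steps.append((i, n))
        i += 1
    return "".join(output), steps


if __name__ == "__main__":
    for label in ("Proprostnemluvesky-uyb24dma41a", "ihqwcrb4cv8a8dqg056pqjye"):
        text, steps = decode_with_trace(label)
        print(label)
        for position, code_point in steps:
            print("  insert U+%04X at position %d" % (code_point, position))
```

With these assumed parameters, the code points are inserted in
ascending code point order rather than left to right. For example 1
the first integer spans "uyb" (2085) and inserts U+00ED at position 14;
the <ccaron> at position 3 comes second. So the annotated steps in the
draft should be generated by a program like this rather than written by
hand.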

Mark

----- Original Message -----
From: "Marc Blanchet" <Marc.Blanchet@viagenie.qc.ca>
To: <idn@ops.ietf.org>
Sent: Monday, January 28, 2002 09:06
Subject: [idn] IDN WG Last Call: draft-ietf-idn-punycode-00.txt


>
> Title: Punycode version 0.3.3
>
> Abstract:
> Punycode is a simple and efficient encoding designed for use with
> Internationalized Domain Names [IDN] [IDNA].  It uniquely and
> reversibly transforms a Unicode string [UNICODE] into an ASCII
> string.  ASCII characters in the Unicode string are represented
> literally, and non-ASCII characters are represented by ASCII
> characters that are allowed in hostname labels (letters, digits,
> and hyphens).  Bootstring is a general algorithm that allows a
> string of basic code points to uniquely represent any string of code
> points drawn from a larger set.  Punycode is an instance of Bootstring
> that uses particular parameter values appropriate for IDNA.  This
> document specifies Bootstring and the parameter values for Punycode.
>
> A URL for this Internet-Draft is:
> http://www.ietf.org/internet-drafts/draft-ietf-idn-punycode-00.txt
>
> This two-week WG last call ends Feb 11th 2002, 23h59 GMT-5.
>
> This document is for standards track.
>
> The co-chairs see a wg rough consensus on this document.
> Please send any proposed modification of the text to the wg mailing
> list as well as to the wg co-chairs, stating clearly the text to be
> modified and the new text.
>
> Marc & James