Re: [idn] WG last call summary



Valdis.Kletnieks@vt.edu writes:
> Hmm.. so you're saying that *ALL* that code out there that
> double-checked that things that claimed (possibly implicitly) to be
> USASCII were in fact in the 0-127 range are "crusty" code?
> Damn.  Sendmail 8.12.3.Beta1 is crusty - it actually bothers checking.

Time for some facts.

Sendmail, by default, does _not_ enforce the 0-127 restriction for mail
message headers. It allows bytes 160-255. Otherwise European users would
be dumping Sendmail even more quickly than they are today; ISO 8859-1
Subject lines are extremely popular.

Sendmail _does_ discard bytes 128-159 in mail message headers, because
it reserves those bytes for its internal macro handling. Those
bytes aren't used in ISO 8859-1, but they are used in UTF-8. See
http://pi.cr.yp.to for a concrete example.
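
To make the effect concrete, here is a minimal Python sketch of a
filter that drops bytes 128-159 from a header. It is not Sendmail's
code: the name strip_128_159 is invented for illustration, and the
function models only the discarding behavior described above.

```python
def strip_128_159(header: bytes) -> bytes:
    """Drop every byte in the range 128-159 (hypothetical model of the filter)."""
    return bytes(b for b in header if not 128 <= b <= 159)


# An ISO 8859-1 subject uses only bytes 160-255 for accented letters,
# so it passes through unchanged.
latin1_subject = "Subject: caf\u00e9".encode("latin-1")
print(strip_128_159(latin1_subject) == latin1_subject)  # True

# The Greek letter pi is the two bytes CF 80 in UTF-8. The second byte
# (128) is discarded, leaving a truncated, invalid sequence.
utf8_pi = "\u03c0".encode("utf-8")
print(utf8_pi.hex(), "->", strip_128_159(utf8_pi).hex())  # cf80 -> cf
```

UTF-8 continuation bytes span 128-191, so any character whose encoding
includes a byte in 128-159 is damaged this way.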

I sent Allman some email in February 1999 suggesting that he convert

   128 -> 255 160
   129 -> 255 161
   ...
   159 -> 255 191
   255 -> 255 255

with the opposite conversion on output. There have been several
security-fix releases of sendmail since then, so we could have had the
128-159 problem fixed on a huge number of machines. But he ignored the
suggestion. Apparently he doesn't care about international users.
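
The conversion in the table above can be sketched in Python as follows.
This illustrates the mapping only; it is not code from Sendmail or from
the 1999 email. The names escape_input and unescape_output are
hypothetical, and raising ValueError on a malformed escape is an
assumption about how bad input might be handled.

```python
ESCAPE = 255  # byte used as the escape prefix in the proposed mapping


def escape_input(data: bytes) -> bytes:
    """Map 128..159 to (255, byte + 32) and 255 to (255, 255)."""
    out = bytearray()
    for b in data:
        if 128 <= b <= 159:
            out += bytes((ESCAPE, b + 32))   # 128 -> 255 160, ..., 159 -> 255 191
        elif b == ESCAPE:
            out += bytes((ESCAPE, ESCAPE))   # 255 -> 255 255
        else:
            out.append(b)
    return bytes(out)


def unescape_output(data: bytes) -> bytes:
    """Apply the opposite conversion, restoring the original bytes."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b != ESCAPE:
            out.append(b)
            i += 1
            continue
        if i + 1 >= len(data):
            raise ValueError("escape byte at end of input")
        nxt = data[i + 1]
        if nxt == ESCAPE:
            out.append(ESCAPE)
        elif 160 <= nxt <= 191:
            out.append(nxt - 32)
        else:
            raise ValueError("invalid byte after escape")
        i += 2
    return bytes(out)
```

With this mapping the escaped form never contains a byte in 128-159, so
those values stay free for internal use, and the round trip is lossless.
For example, escape_input applied to the UTF-8 bytes CF 80 yields
CF FF A0, and unescape_output turns that back into CF 80.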

People proposed more than a decade ago that the IETF require 8-bit-clean
mail software. (See, for example, Andre Pirard's ietf-smtp message dated
Tue, 19 Feb 91 12:08:00 +0100.) The only objection to this requirement
was the claim that 8-bit support would take a long time to be deployed.
Paul Vixie said that he had some seven-year-old sendmail binaries, for
example, and concluded ``with near-certainty'' that ``any changes to the
SMTP spec will take at least a decade to reach 90% of the critical
server population.''

In fact, it took less than a decade for every critical server to add
support for 8-bit message bodies, even though the IETF _still_ doesn't
require this. If the SMTP specification had been changed in 1991 to
require transparent 8-bit handling in both the header and the body, we
wouldn't have Sendmail's UTF-8 problems today.

Sendmail's continued data corruption is an embarrassment to the Sendmail
company. The fact that RFC 2821 and RFC 2822 allow this garbage is an
embarrassment to the IETF.

---D. J. Bernstein, Associate Professor, Department of Mathematics,
Statistics, and Computer Science, University of Illinois at Chicago