
[idn] Re: Unicode and Security



From: Lars Marius Garshol <larsga@garshol.priv.no>
Date: 09 Feb 2002 21:52:17 +0100
In-Reply-To: <p04330110b88860b92563@[192.168.254.4]>
Message-ID: <m3ofiywf7i.fsf@lambda.garshol.priv.no>


* Elliotte Rusty Harold
|
| Let's say I register microsoft.com, only the fifth letter isn't a
| lower-case Latin o. It's actually a lower case Greek omicron.

I'll grant you that this is possible, perhaps even likely, and that it
may cause problems, but I'm far from convinced that this in any way
supports the "there are security problems in Unicode" thesis.
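To make the quoted example concrete, here is a minimal Python sketch. The
spoofed name is constructed purely for illustration, and describe() is a
hypothetical helper, not part of any IDN tooling. It shows that the two names
are different strings even though they may render identically.

```python
import unicodedata

latin = "microsoft.com"
# Illustrative spoof: the fifth letter is GREEK SMALL LETTER OMICRON (U+03BF).
spoof = "micr\u03bfsoft.com"

def describe(text):
    """Return (character, Unicode name) pairs for every non-ASCII character."""
    return [(ch, unicodedata.name(ch)) for ch in text if ord(ch) > 127]

print(latin == spoof)    # the strings compare unequal
print(describe(spoof))   # lists the one non-Latin character
```

Software can tell the two names apart trivially; the difficulty is only in
what the user sees.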

There are many characters which look alike, and yet are different,
which can cause problems of this kind. There are, for example, already
viruses which exploit the visual similarity between 1 and l in the
Windows system font to keep themselves from being discovered in file
listings.

So if this really is considered a problem, it would seem to me that you
would need to deal with the problem of whoever@hotmai1.com,
whoever@hotmaii.com, and whoever@hotrnail.com looking very similar to
whoever@hotmail.com in lots of fonts. To exploit this, all you need to
know is what email client someone uses, and usually every email they
write will have that information in its headers.
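
The same lookalike problem can be sketched in plain ASCII. The folding table
below is an assumption chosen to cover only the three addresses above (it is
not a complete or standard confusables list), and skeleton() and looks_like()
are hypothetical names.

```python
# Assumed folding rules covering only the examples above: "rn" reads as "m",
# and "1", "i" and "l" are treated as one shape.
_FOLD = str.maketrans({"1": "l", "i": "l"})

def skeleton(address):
    """Reduce an address to a crude 'visual skeleton' for comparison."""
    return address.lower().replace("rn", "m").translate(_FOLD)

def looks_like(candidate, known):
    """True if candidate differs from known but folds to the same skeleton."""
    return candidate != known and skeleton(candidate) == skeleton(known)

real = "whoever@hotmail.com"
for fake in ("whoever@hotmai1.com", "whoever@hotmaii.com", "whoever@hotrnail.com"):
    print(fake, looks_like(fake, real))
```

A table this crude also produces false positives (any name containing "rn"
folds as if it contained "m"), and which shapes actually collide depends on
the font in use.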

It seems to me that this problem really needs some other fix than the
merging of all similar-looking characters in all character sets. I
just can't see that working.

Similarly, the "security problems" caused by using Unicode encoding
tricks to hide or mangle text in, say, contracts, are no different from
those caused by using HTML or CSS (or whatever) tricks to achieve the same
effect, and yet nobody is talking about security problems with HTML or CSS. See
[1] for one way of dealing with it that is now being worked on.

So while I accept that there is a problem, it does not seem to me that
Unicode is the problem. To me the problem seems to be the complexity
of the relationship between the bytes sent to the user and what the
user actually sees and reacts to. That complexity is not going to
disappear, and aspects of the same "problem" exist with just about any
information representation, so clearly the solution must be something
other than changing all of these syntaxes/formats/encodings.

In the specific case you cite, for example, a better solution might be
for the user's email client to keep track of all the user's contacts
and for it to indicate in some clearly visible way whether the current
email comes from one of them or not. Whether it uses string matching
of email addresses or digital signatures to do that doesn't really
matter; it solves the problem in your example either way.
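
The string-matching variant of that idea might look roughly like the sketch
below. classify_sender() and the contact list are hypothetical, and
lower-casing the whole address is a simplifying assumption (the local part of
an address can in principle be case-sensitive).

```python
def classify_sender(address, contacts):
    """Label an incoming address as a known contact or an unknown sender."""
    normalized = address.strip().lower()
    known = {c.strip().lower() for c in contacts}
    return "known contact" if normalized in known else "unknown sender"

# Hypothetical address book for illustration.
contacts = ["whoever@hotmail.com"]
print(classify_sender("Whoever@Hotmail.com", contacts))
print(classify_sender("whoever@hotmai1.com", contacts))
```

Because the comparison is done on the characters rather than on their
appearance, the lookalike address is flagged as unknown however similar it
looks on screen.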

[1] <URL: http://www.w3.org/TR/xmldsig-core/#sec-Seen >

--
Lars Marius Garshol, Ontopian         <URL: http://www.ontopia.net >
ISO SC34/WG3, OASIS GeoLang TC        <URL: http://www.garshol.priv.no >