[idn] Re: Legacy charset conversion in draft-ietf-idn-idna-08.txt



Paul Hoffman / IMC <phoffman@imc.org> writes:

> At 3:25 PM +0200 5/27/02, Simon Josefsson wrote:
>>I think the third paragraph of the security consideration should more
>>clearly express that IDNA actually is vulnerable to the attack if
>>machines, like most machines on the Internet, use legacy encodings.
>
> It isn't clear what "the attack" is. There is clearly a problem for
> the user when System A transcodes text from Encoding X into Unicode
> differently than System B does, but I don't see what the security
> issue is. Could you provide some suggested wording for the security
> consideration?

The basic attack: Alice runs on a host that uses Latin-1 for
input/output and enters www.µbank.com (where µ is ISO-8859-1 0xB5).
The domain is registered using U+00B5, but Alice's application
transcodes the string using U+03BC.  Either Alice can't connect (if
the other domain doesn't exist) or she ends up talking to someone else
(if the other domain does exist).
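
A minimal sketch of the divergence, in Python, covering only the
transcoding step (it does not model nameprep or ToASCII).  The second
table, HYPOTHETICAL_TABLE_B, and the function transcode_b are made up
for illustration; they stand in for an application that picks U+03BC
for byte 0xB5.

```python
# Bytes as entered on a Latin-1 host; 0xB5 is the micro sign in ISO-8859-1.
raw = b"www.\xb5bank.com"

# Transcoder A: the standard ISO-8859-1 mapping, 0xB5 -> U+00B5 (MICRO SIGN).
name_a = raw.decode("latin-1")

# Transcoder B (hypothetical): identical except that 0xB5 is mapped to
# U+03BC (GREEK SMALL LETTER MU).  This table is an assumption, not
# something taken from any real implementation.
HYPOTHETICAL_TABLE_B = {0xB5: "\u03bc"}


def transcode_b(data: bytes) -> str:
    # chr(b) is the identity Latin-1 mapping for every other byte.
    return "".join(HYPOTHETICAL_TABLE_B.get(b, chr(b)) for b in data)


name_b = transcode_b(raw)

# Same keystrokes, two different Unicode strings handed to IDNA.
print(repr(name_a), repr(name_b), name_a == name_b)
```

The two strings look identical on screen but differ in one code
point, which is all the attack needs as long as nothing later in the
chain folds the two together.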

Several objections can be raised against the applicability of this
simple attack:

1 You shouldn't map ISO-8859-1 0xB5 into U+03BC, the application is
  broken and should map it into U+00B5.  My reply: This might be true,
  but it doesn't follow from IDNA, as IDNA leaves the transcoding
  issue open to the implementor.  It seems that either IDNA needs to
  reference mapping tables that MUST be used for legacy encodings, or
  state that IDNA enables the attack.

2 So what if Alice talks to someone else?  DNS can be spoofed anyway,
  so this isn't a new problem.  My reply: Such problems in DNS can be
  solved with DNSSEC.  However, even if DNSSEC is used, IDNA would
  enable this new attack => same conclusion as in 1.

3 Still, so what if Alice talks to someone else?  Alice should use TLS
  or IPSEC or SSH or CMS or Kerberos or SASL or something else to
  authenticate the endpoints and protect data.  My reply: This is
  where the subtle problems appear, and I think more investigation
  is needed here.  A (probably flawed) initial attempt:

  1 TLS and PKIX certs do not support IDNA, so we must first assume
  they are extended to support IDNA.  The security implications might
  be different depending on how IDNA is implemented, so we must study
  each approach individually.  Some approaches I can see:

    1 Put IDNA strings in DN/subjectAltName of the PKIX cert.  Alice
    compares IDNA in cert with the IDNA used to contact the host.
    While this appears to solve the problem, it really only moves the
    transcoding problem elsewhere.  Perhaps the CA had to convert an
    ISO-8859-1 string it received in mail into the IDNA when
    generating the cert.  Perhaps the machine used to apply for the
    certificate used ISO-8859-1 and had to convert it into an IDNA.
    Unless you assume the whole world switches to Unicode the day IDNA
    hits the street, the transcoding problem is present in at least
    one step in the chain, and has to be solved there.  The point is,
    the mapping tables must be standardized or you open the door to
    attacks.

    2 Put IDNA strings in DN/subjectAltName of the PKIX cert.  Alice
    compares decoded IDNA in cert with the name used to contact the
    host.  Decoding from Unicode into legacy encodings is tricky.
    Alice must have mapping tables here as well.

    3 Use UTF-8 strings in DN/subjectAltName of the PKIX cert.  Alice
    compares ToASCII(name-in-cert) with IDNA used to contact host.
    This seems similar to 1, in that unless the whole world uses
    UTF-8, the mapping has to be done somewhere and must be well
    defined there to be secure.

    4 Add {charset, string} elements indicating the character set
    used, and the string in DN/subjectAltName of the PKIX cert.  If
    charset is the same as the charset that Alice uses, it will work
    fine.  However, in all other cases, mapping tables are needed, but
    this time O(n^2) tables must exist.

  2 SASL is just a framework, so it is each SASL mechanism that has to
  be studied.  Several SASL mechanisms (HMAC-MD5, DIGEST-MD5, SRP)
  do not include names of the endpoints, so the identity of the
  other end is only implicitly known after a successful
  authentication.  Thus the attack only makes man-in-the-middle
  attacks or password cracking slightly easier; there is no real
  security impact.

  ... etc, the cases for Kerberos, IPSEC and SSH seem to only repeat
  the discussions above.  I have not slept for a long time, so I'll
  spare you the discussion and me the typing. :-)
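
To put a rough number on the O(n^2) remark in the {charset, string}
approach above, here is a small Python sketch that counts the tables
needed for direct charset-to-charset mapping.  For comparison it also
counts the tables needed if every charset is mapped through a single
pivot such as Unicode; the pivot variant is my own assumption for
contrast, and the function names and the charsets beyond ISO-8859-1
and ISO-2022-JP are arbitrary picks.

```python
from itertools import permutations


def direct_tables(charsets):
    """Every ordered (source, target) pair needs its own table."""
    return list(permutations(charsets, 2))


def pivot_tables(charsets, pivot="Unicode"):
    """One table into the pivot and one out of it for each charset."""
    return [(c, pivot) for c in charsets] + [(pivot, c) for c in charsets]


# Illustrative list only; any set of n legacy charsets behaves the same.
charsets = ["ISO-8859-1", "ISO-2022-JP", "KOI8-R", "Big5"]

n_direct = len(direct_tables(charsets))  # n * (n - 1) tables
n_pivot = len(pivot_tables(charsets))    # 2 * n tables

print(n_direct, n_pivot)
```

Either way, each table still has to be agreed upon by both ends; the
count only shows how much more there is to standardize when no common
pivot is used.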

Suggested modified security consideration below.  It essentially says
that unless everyone switches to UTF-8, IDNA will enable new attacks
that have security implications.

--- draft-ietf-idn-idna-08.txt.orig	Mon May 27 18:18:58 2002
+++ draft-ietf-idn-idna-08.txt	Mon May 27 20:08:44 2002
@@ -690,10 +690,24 @@
 are introduced by the encoding process or the use of these encoded
 values, apart from those introduced by the ACE encoding itself.
 
-Domain names are used by users to connect to Internet servers. The
-security of the Internet would be compromised if a user entering a
-single internationalized name could be connected to different servers
-based on different interpretations of the internationalized domain name.
+Domain names are used by users to identify and connect to Internet
+servers. The security of the Internet is compromised if a user
+entering a single internationalized name is connected to different
+servers based on different interpretations of the internationalized
+domain name.  When all systems use ASCII or Unicode, different
+interpretations are not allowed in this specification.
+
+When involved systems use non-ASCII and non-Unicode characters (such
+as ISO-8859-1 and ISO-2022-JP, which are common on the Internet),
+however, this specification leaves the transcoding problem up to the
+application.  Thus there can not be any assurance that two
+applications will not implement different transcoding rules. When two
+applications implement different transcoding rules, they will
+(assuming both domains exist) contact different servers.  Note that
+the problem can not just easily be solved by using a security protocol
+such as TLS to identify and authenticate to end points, unless these
+protocols have already solved the problem which IDNA is trying to
+solve.
 
 Because this document normatively refers to [NAMEPREP], it includes the
 security considerations from that document as well.