
[idn] Re: Fwd: Unicode letter ballot



"Adam M. Costello" <idn.amc+0@nicemice.net.RemoveThisWord> writes:

> Simon Josefsson <jas@extundo.com> wrote:
>
>> Authentication identity "admin", authorization identity U+4711,
>> password X.  For the argument, let's say U+4711 decomposes into U+1234
>> in Unicode 3.2 but is later changed to U+4321.
>>
>> The SASL library, acting as a proxy in front of the application
>> software, implements the current libstringprep correctly.  It checks
>> that admin's password is X and that he is authorized to log in as
>> U+1234 (which is the result after stringprep of U+4711, which was sent
>> because the client hadn't been updated to use stringprep, which should
>> cause no problem) and says OK to the application.
>>
>> Now, in 1a the application is using updated tables from a more recent
>> stringprep that incorporates the fixed decomposition mapping, causing
>> it to admit the user to an account U+4321.  This is bad.
>>
>> In 2a, the application sees that the characters are deprecated due
>> to its decomposition mapping changed, and rejects the user.  This is
>> good.
>
> It looks like the security hole in 1a stems from the existence of two
> Unicode strings X and Y such that now Stringprep(X) != Stringprep(Y)
> (so that two distinct accounts for X and Y can be created), but later
> (after the update of the decomposition mappings) Stringprep(X) ==
> Stringprep(Y), so the two accounts will get confused.
>
> But I think the same phenomenon can happen with 2a.  There are CNS
> 11643 strings A and B such that now Stringprep(CNS11643toUnicode(A))
> != Stringprep(CNS11643toUnicode(B)) (so that two distinct accounts for
> A and B can be created), but later (after the deprecation and addition
> of Unicode characters, and the subsequent update of CNS11643toUnicode
> to use the new Unicode characters instead of the deprecated ones)
> Stringprep(CNS11643toUnicode(A)) == Stringprep(CNS11643toUnicode(B)),
> so again the two accounts get confused.  No deprecated characters are
> going to be seen and rejected, because no CNS 11643 characters are
> deprecated, and the deprecated Unicode characters do not appear in the
> new CNS11643toUnicode table.

Yes, if I understand it correctly, I believe this was my motivation for
proposing a security consideration along the lines that transcoding to
and from Unicode is a critical part of secure IDN, and that the current
IDN specification set doesn't address that problem.  IMHO that problem
is bigger than the issues discussed here, which only concern a few and,
more importantly, well-known characters.  To my knowledge, nobody has
studied how many or which characters are in conflict in the various
transcoding tables in use.

But in case 1a, it seems the problem can happen even when only Unicode
is used.  That is, the same kind of confusion that transcoding causes
can arise even when no transcoding is involved.
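
To make cases 1a and 2a concrete, here is a minimal sketch in Python.
It is not real stringprep: prep() is a toy one-to-one mapping, the
table names and the DEPRECATED set are hypothetical, and the code
points are just the placeholders from the example above, not actual
Unicode decompositions.

```python
# Hypothetical tables; U+4711, U+1234 and U+4321 are the placeholder
# code points from the example, not real decomposition data.
TABLE_OLD = {0x4711: 0x1234}  # tables used by the SASL proxy
TABLE_NEW = {0x4711: 0x4321}  # updated tables used by the application
DEPRECATED = {0x4711}         # case 2a: characters whose mapping changed


def prep(s, table):
    """Toy stand-in for stringprep: apply a one-to-one mapping table."""
    return "".join(chr(table.get(ord(c), ord(c))) for c in s)


def app_login_1a(authzid):
    """Case 1a: the application silently applies the new mapping."""
    return prep(authzid, TABLE_NEW)


def app_login_2a(authzid):
    """Case 2a: the application rejects deprecated characters."""
    if any(ord(c) in DEPRECATED for c in authzid):
        return None  # login refused
    return prep(authzid, TABLE_NEW)


authzid = "\u4711"                    # sent unprepared by the old client
checked = prep(authzid, TABLE_OLD)    # identity the proxy authorized
admitted_1a = app_login_1a(authzid)   # identity the application admits
admitted_2a = app_login_2a(authzid)   # None: rejected instead

print(checked == admitted_1a)  # False: proxy and application disagree
print(admitted_2a)             # None
```

In 1a the proxy authorizes one identity and the application admits
another; in 2a the mismatch cannot occur because the login is refused
before any mapping is applied.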

> An approach that would really avoid this pitfall would be to deprecate
> these characters not only in Unicode, but also in CNS 11643 and any
> other character sets that contain them, and create new characters
> in all these character sets, and leave all the mappings of the old
> deprecated characters unchanged in both the Unicode database and the
> WhateverToUnicode tables.

That would solve it.  However, it seems the IDN WG has decided to care
only about Unicode, so the consequence of that decision is that this
solution is out of scope as far as IDN is concerned.  Anyone
implementing this in the real world will have to solve the problem
herself, though, perhaps by advocating the solution you proposed.