Re: FYI: DNSOPS presentation
* Jason Livingood
> [...] a short whitepaper some colleagues and I put together in
> advance of that meeting (available at
Hello, Jason. I have a few comments. On the 1st of April, you posted
the following text to the dnsop mailing list, quoting another message
sent to the same list by me:
> - Total client loss to the dualstacked host is at 0.074%.
> - (At least) 95% of the client loss is due to clients choosing to use
> inherently unreliable transitional IPv6 (6to4/Teredo) instead of IPv4.
> - I've identified three groups of clients that behave in this way:
> 1) Users of Opera <10.50, which did not use the RFC 3484-compliant
> getaddrinfo() call, at least on Windows. It would prefer any form
> of IPv6 connectivity above IPv4. This is especially a problem on
> Windows Vista and 7, where 6to4 and/or Teredo are enabled by
> default. This accounts for about half of the 95%.
> 2) Dualstacked Mac OS X users with RFC 1918 IPv4 addresses (using
> NAT) and transitional IPv6 addresses. In this case, getaddrinfo()
> will sort the IPv6 address above the IPv4 address, causing the
> transitional connectivity to be used. This accounts for the other
> half of the 95%.
> 3) Dualstacked Linux users with RFC 1918 IPv4 addresses and
> transitional IPv6 addresses, as GNU libc's getaddrinfo()
> implementation behaves exactly like the one in Mac OS X. However,
> the overall client loss caused by this is minuscule compared to
> #1 and #2 (I have problems measuring any at all).
> - When disregarding all hits from users in problem groups #1 and #2,
> the total client loss is at 0.003% - this is, in my opinion, low
> enough to accept. (Of course, other content providers might feel
I find it quite puzzling that you, just a few weeks later, claim that no
data about this problem has been shared with the community. That was,
after all, exactly what I intended to do with the above message as well
as all the reports I've sent to the ipv6-ops mailing list in the last
few months. Why did you choose to disregard the data from my
measurements completely? Do you feel that they are flawed or
incomplete? If so, I would appreciate it very much if you could tell me
what exactly is the problem. I promise I will do my best to address
your concerns, and also publicly share any updated reports.
Others are doing measurements, too, and are publishing results:
I believe that I'm the one that's been trying the most to look behind
the numbers and identify (and publish) the actual underlying causes of
client loss, though.
Also, in the document, you write:
> [...] 0.078% of users are "broken." That is a trivial percentage of users to
> most ISPs. For example, in the Comcast network, this represents fewer
> than 12,000 users. If the problem could be solved with a new home
> gateway device, to use one possible example, Comcast could rapidly
> ship replacement devices to affected users. We believe, given the
> small number claimed, that most ISPs could rapidly solve this problem
> (if it can be precisely defined) on their own, without content
> providers having to resort to DNS whitelisting.
I doubt any content provider feels that breaking access to 12,000 users
from Comcast alone is «trivial». For a large provider, 0.078% of all
users could very well mean millions of users world-wide. In the end, it
all boils down to the content providers asking themselves the following
question:
«Do we want to service 999 or 1,000 users today?»
Or: «Do we want to make €999 or €1,000 today?»
I fail to see why any commercial content provider would choose the first
option - which currently represents dual-stack. The DNS whitelisting
approach is just a workaround in order to allow for some IPv6/dual-stack
operation to take place without at the same time reducing the number of
At some point in the future, when the number of users able to access a
dual-stack site equals or exceeds the number of users able to access an
IPv4-only site, there's no longer any need for DNS whitelists and I'm
sure they will disappear overnight (potentially replaced with blacklists
though), and since having only IPv4 will be a competitive disadvantage,
you'll see IPv6 deployment accelerating rapidly on the content side.
You at Comcast can help ensure that point in time is reached sooner
rather than later, by providing production-quality native IPv6 service
to all users. You obviously are working on it, and I applaud that.
However, your efforts alone are not enough - we need other ISPs to do the
same.
Almost all of the client loss I see is caused by end-user software
preferring unreliable transitional IPv6 connectivity, 6to4 in
particular. I've not been able to identify faulty CPE devices as a
cause of client loss, so this is out of your control. (It could very
well be that faulty CPE devices are a problem with certain ISPs in
countries that simply are not accessing my Norwegian-language content,
In any case, you can help alleviate this problem somewhat by placing
6to4 relays (192.88.99.0/24) in your network as close to the end users
as possible, and then content providers can do the reverse (relay to
2002::/16) in their data centres. This will ensure that 6to4 will be
working almost as reliably as regular IPv4.
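To make the addressing involved here concrete, below is a minimal Python
sketch that classifies client addresses as 6to4 (2002::/16), Teredo
(2001::/32), RFC 1918, or the RFC 3068 relay anycast prefix
(192.88.99.0/24), and recovers the IPv4 address embedded in a 6to4
address. The function names and category strings are my own and purely
illustrative; it is not how any particular relay or measurement setup is
implemented.

```python
import ipaddress

# RFC 3068 anycast prefix used to reach 6to4 relay routers.
RELAY_ANYCAST = ipaddress.ip_network("192.88.99.0/24")
# RFC 1918 private IPv4 ranges (listed explicitly; is_private covers more).
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def classify(addr):
    """Return an illustrative category name for an address string."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        if ip in RELAY_ANYCAST:
            return "6to4-relay-anycast"
        if any(ip in net for net in RFC1918):
            return "rfc1918"
        return "ipv4"
    if ip.sixtofour is not None:   # address lies within 2002::/16
        return "6to4"
    if ip.teredo is not None:      # address lies within 2001::/32
        return "teredo"
    return "native-or-other-ipv6"


def embedded_ipv4(addr):
    """Return the IPv4 address embedded in a 6to4 address, else None."""
    ip = ipaddress.ip_address(addr)
    if ip.version != 6 or ip.sixtofour is None:
        return None
    return str(ip.sixtofour)
```

For example, classify("2002:c000:204::1") gives "6to4", and
embedded_ipv4() on the same address gives the IPv4 address that return
traffic is ultimately tunnelled to.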
But as before, Comcast cannot solve this problem alone, and broken 6to4
connectivity is especially a problem in enterprise/managed network
environments that filter proto-41 outright. I've had discussions
with some of these networks: some do it knowingly and have no intention
of changing it, some have no idea what I'm talking about, and finally
some removed their filters. Identifying and contacting all these
«6to4-hostile» networks, and persuading them to change their practice,
is simply not a scalable way of handling the problem, though,
especially if you're a global content provider.
The best way I can think of to fix the problems I'm observing is to
have software vendors release updates, and for those updates to make
their way onto most end users' systems. I've been pursuing that goal
for a while. So far this has resulted in the release of an RFC
3484-compliant version of the Opera web browser, which has caused a very
significant drop in overall client loss. Also, several operating system
vendors have applied my patches to GNU libc, implementing section 2.7 of
draft-arifumi-6man-rfc3484-revise-02, to their development branches. So
far these are Fedora, Ubuntu, Gentoo, and Mandriva. Debian, openSUSE,
and last but not least Apple have not yet responded.
The Linux distributions will not have a noticeable impact on the overall
client loss numbers, but if Apple were to apply the proposed change to
Mac OS X and push updates to their customers, it would have an enormous
impact. Glancing briefly at my numbers so far in April, some 60-70%
of client loss is attributable to Mac OS X, or 85-95% if I first
disregard older known-broken versions of the Opera web browser. The GNU
libc maintainers declined to apply the change because RFC 3484 is still
the current standard; perhaps Apple feels the same way. If so, it would
be fantastic if an updated RFC 3484 was published really soon, as it
takes time to get software updates onto most end users' computers. And
time is something we do not have too much of at this point.
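For readers who want to see why a host with an RFC 1918 IPv4 address and
a 6to4 address ends up preferring IPv6, here is a deliberately
simplified Python sketch of RFC 3484 destination address selection. It
implements only Rule 2 (prefer matching scope), Rule 5 (prefer matching
label) and Rule 6 (prefer higher precedence), using a subset of the
default policy table. The private_is_global flag is my own illustrative
switch, and it rests on the assumption that the proposed change amounts
to treating RFC 1918 addresses as global scope. None of this is the
actual getaddrinfo() code from GNU libc or Mac OS X.

```python
import ipaddress

RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

# Scope values in the RFC 3484 style: a larger number is a wider scope.
LINK_LOCAL, SITE_LOCAL, GLOBAL = 2, 5, 14


def scope(ip, private_is_global):
    """Scope of an address; RFC 1918 is site-local unless the flag is set."""
    if ip.version == 4:
        if ip.is_link_local or ip.is_loopback:
            return LINK_LOCAL
        if any(ip in net for net in RFC1918) and not private_is_global:
            return SITE_LOCAL
        return GLOBAL
    if ip.is_link_local or ip.is_loopback:
        return LINK_LOCAL
    if ip.is_site_local:
        return SITE_LOCAL
    return GLOBAL


def policy(ip):
    """(precedence, label) from a subset of the RFC 3484 default table."""
    if ip.version == 4:
        return 10, 4              # ::ffff:0:0/96 (IPv4-mapped)
    if ip.sixtofour is not None:
        return 30, 2              # 2002::/16 (6to4)
    return 40, 1                  # ::/0 (everything else)


def sort_destinations(candidates, private_is_global=False):
    """Sort (destination, source) pairs; return destinations, best first."""
    def key(pair):
        dst, src = (ipaddress.ip_address(a) for a in pair)
        scope_match = (scope(dst, private_is_global)
                       == scope(src, private_is_global))
        label_match = policy(dst)[1] == policy(src)[1]
        # Rule 2, then Rule 5, then Rule 6; False sorts before True.
        return (not scope_match, not label_match, -policy(dst)[0])
    return [dst for dst, _ in sorted(candidates, key=key)]


# A NATed client with a 6to4 address reaching a dual-stacked server.
# All addresses are from documentation ranges and purely illustrative.
candidates = [("192.0.2.80", "192.168.1.10"),
              ("2001:db8::80", "2002:c000:204::1")]
```

With these candidates, sort_destinations(candidates) puts 2001:db8::80
first, because the RFC 1918 source does not match the scope of the
global IPv4 destination. With private_is_global=True the scopes tie, the
6to4 source label no longer matches the native IPv6 destination label,
and 192.0.2.80 is sorted first.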
Redpill Linpro AS - http://www.redpill-linpro.com/
Tel: +47 21 54 41 27