Re: failure detection
On Thu, 18 Aug 2005, marcelo bagnulo braun wrote:
SHIM wg is about providing multihoming support for IPv6. In
particular a solution for IPv6 multihoming must be able to preserve
communications through outages in the communicating path. Such
functionality is provided in IPv4 through BGP features but it
requires the injection of site routes in the interdomain routing.
In IPv6 PA addressing is used, so we need additional mechanisms
(the shim) to provide equivalent functionalities, in particular to
preserve established communications through outages.
Ok. We're on the same page here I think.
1. Why this is a compelling argument given that it's been possible to
publish multiple addresses in DNS for a long long time, yet there has been
0 demand for either applications to implement n^2 path-probing of each
local address to every remote address, or for OSes to implement some kind
of 'path-probe' shim to provide such functionality for all applications?
I am afraid you are missing our goal here. This is not a matter of
opportunity but the way we can preserve established communication
through outages. See above.
But you can preserve comms without n^2 probing.
With the shim, paths are closely related to the addresses used, in
particular exit paths of the multihomed site are related to source
So in order to provide this type of features, source address
selection has to be influenced, for instance using RFC 3484 policy
So in order to determine the source, you're saying the table which
affects source selection has to be influenced by shim6. Surely that's
like saying "shim6 will use whatever source pleases it" (ie simply
ignoring SAS for the final shim6 output packet).
The SAS aspects of this n^2 probing talk bother me greatly. It will
ignore local policy, policy which might be there for very good
reasons, which shim6 has no oversight of.
Ie: The best way to honour local policy is to use INADDR_ANY and let the OS
decide the source address by consulting local routing policy -
alternatively, an administratively specified address. Why exactly is shim6
so different from everything else on the internet and special that this
would not work for it?
this is exactly how the shim would support policing see above
No, you said shim6 likely will have to influence SAS. That's quite
different from not worrying about SAS at all and letting the OS
decide according to its local policy.
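To illustrate "letting the OS decide", here is a minimal Python sketch (an illustration only, not taken from any shim6 draft). It leaves the socket unbound, which is the effect of binding to the unspecified address (`INADDR_ANY`, or `in6addr_any` / `::` for IPv6), and then asks the kernel which source it chose. The function name `os_chosen_source` and the default port 9 are assumptions made for the example.

```python
import socket

def os_chosen_source(dest_host, dest_port=9):
    """Return the source address the OS would pick for traffic to dest_host.

    No source address is bound, so the kernel applies its own routing and
    source-address-selection policy. connect() on a UDP socket only fixes
    the peer and the local address; it does not send a packet.
    """
    family, socktype, proto, _, sockaddr = socket.getaddrinfo(
        dest_host, dest_port, type=socket.SOCK_DGRAM)[0]
    with socket.socket(family, socktype, proto) as s:
        s.connect(sockaddr)
        return s.getsockname()[0]
```

Calling `os_chosen_source("2001:db8::1")` on a host with IPv6 connectivity would return whichever local address the local policy prefers for that destination; on a host with no route to the destination it raises `OSError`.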
3. The traditional way on the internet to guard against path failures is to
get a routing feed (and no, that does *not* imply you advertise anything),
why is shim6 so special that it can't defer to existing practice?
Scalability. Traditional IPv4 routing-based multihoming lacks it.
Note again "does *not* imply you advertise anything". Everyone is
(sort of) agreed "multihome by advertising your prefix" doesn't scale
and should not be considered for IPv6 - hence shim6.
I'm saying you can still get a "read-only" routing feed (BGP or
whatever), purely for informational purposes, to help decide which of
your ISPs has the best path.
That's entirely scalable and well within reason for deployment at
'enterprise' shim6 sites.
4. How will you decide which path is best?
local policy can be expressed to some degree with the policy table
defined in RFC 3484. If more fine grained expression is needed
(e.g. per app) additional parameters need to be included in the
Sorry, my question was more about the metrics you will use to
determine whether path (local A, remote B) is better than (local C,
remote D). Eg, RTT, packet loss, etc.
we seem to be assuming that multihoming support is something useful
and that it will be needed in IPv6.
I'd agree with that assumption. :)
This multihoming support seems to require communications to be
preserved through outages.
Agreed, but I still don't understand why this requires shim6 to
specify overriding existing SAS policy and do n^2 probing.
6. If path-probing really is desired, explain why this is shim6 specific?
Path exploration is a fundamental part of the shim protocol. Maybe
it is not shim specific and ideas from other similar protocols can be
used, but it is imho a key part of the shim protocol and needs to be
part of it.
If it is not specific to shim6, why should it be solved in shim6? (In
the 'for every possible path of the combinations of local and remote
- software that does local path-probing to determine reachability of
locally attached gateways (eg IPMP in Solaris for one)
this seems to be local, while shim is defined e2e
Indeed. But it's already implemented. And local probing (with no SAS
messing about) likely will do for many cases.
- BFD and some other protocols in development
B means bidirectional, and we are not assuming bidirectional paths here
- software that monitors the systems route-cache and does
path-probing for destinations that currently see flows
not sure what you mean by those but in any case, i am sure we can
benefit from these designs, as well as from the others you mentioned, to
design the shim path exploration.
Yes, shim6 could benefit from these potential designs by *not*
getting involved in complex path probing and SAS. ;)
Eg, the local address probing proposed for shim6 is /counter/
productive to other possible external mechanisms (worst of all,
including "get a read-only BGP feed from your ISPs", which is the
most practical way to do this.)
If you are familiar with those, i am sure that your knowledge would
be very useful to help with the design of the path exploration
protocol of the shim
Drop the local source selection from shim6, things become easier,
shim6 will interoperate better with other routing software, etc.
Note that it's the probing using every single local address which bothers
me the most. Simple heartbeats, monitoring which set of locators is
reachable, and just picking one and sticking to it until you need to
switch, I could agree with.
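A rough Python sketch of that "pick one and stick to it" idea follows. It is an illustration under assumptions, not a proposed shim6 mechanism: the class name `StickyLocator`, the `max_missed` threshold, and the rule of falling back to the first locator still answering are all hypothetical choices.

```python
class StickyLocator:
    """Keep using one remote locator until it misses too many heartbeats."""

    def __init__(self, locators, max_missed=3):
        self.locators = list(locators)
        self.max_missed = max_missed          # assumed failure threshold
        self.missed = {loc: 0 for loc in self.locators}
        self.current = self.locators[0]       # pick one and stick to it

    def heartbeat(self, locator, answered):
        """Record one heartbeat result; switch only if the current locator is down."""
        self.missed[locator] = 0 if answered else self.missed[locator] + 1
        if self.missed[self.current] >= self.max_missed:
            # Fall back to the first locator that is still answering, if any.
            for loc in self.locators:
                if self.missed[loc] < self.max_missed:
                    self.current = loc
                    break
        return self.current
```

Note that a locator which recovers does not cause a switch back; the choice only changes when the locator in use is considered failed.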
AFAICT this is the approach being considered here or at least one
One of them yes, the n^2 probing is one of the options being
considered - I hope to quash its further development :) - at least
in terms of something that is mandated officially as part of shim6
specs (can still be mentioned in some kind of implementors note).
I mean, imho, we would only need to perform path exploration after
You can do it more cleverly than that.
I fail to understand what you are missing. Failure detection and
Path exploration are key components of the shim, and they are
needed to preserve established communications through outages.
Sure, I agree. But why not do it in the simplest way possible?
on that. It's the n^2 probing I want to ensure is /not/ considered for
inclusion in shim6 RFCs, other than as something mentioned as a possible
Ok, i think i see now. Your problem is with probing with different
source locators, right?
Well, this is needed because the source address determines the exit
path from the multihomed site. I mean, because we are assuming PA
addressing, changing the source address results in using a
different ISP in the multihomed site. That is why different source
address need to be explored
Explored and then dismissed as a bad idea, I strongly hope.
The source to use for the shim should be determined by the
/destination/ address in the shim6 packet you receive (eg) first from
the other side.
Ie, given two shim6 stacks, A and B, that want to talk to each other
via the global internet (eg to exchange info needed to create shim
mapping between them, A initiates), each with 3 addresses say (A1,
A2, so on). The communication required is:
A retrieves the required locator information, and gets a list "B1,
B2, B3". It then sends a packet to each remote address, n packets at
a time, in series, with whatever shim6 control message is required:
A(IN6_ADDRANY) -> B1
A(IN6_ADDRANY) -> B2
A(IN6_ADDRANY) -> B3
some time later a reply is received:
A(x) <- B3
The source address A should prefer to use (if it must prefer one over
IN6_ADDRANY) is 'x', as determined by which destination address was
in the packet of B's that got through.
See, that's simple and works - no probing required.
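The exchange above can be sketched as a small Python simulation (labels such as "A2" and "B3" stand in for real addresses; the function name `choose_source` and the shape of the `replies` argument are assumptions made for the example). It also counts the packets sent, for comparison with probing every (local, remote) pair.

```python
from itertools import product

def choose_source(remote_locators, replies):
    """Pick a locator pair from the first reply that gets through.

    One control message is sent to each remote locator with no source
    bound (the OS chooses). `replies` is an iterable of
    (reply_source, reply_destination) tuples in arrival order; the
    destination of the first valid reply is the local address to prefer.
    Returns (local, remote, packets_sent).
    """
    sent = list(remote_locators)              # m packets, one per remote locator
    for reply_src, reply_dst in replies:
        if reply_src in sent:
            return reply_dst, reply_src, len(sent)
    return None, None, len(sent)              # nothing got through

local_addrs = ["A1", "A2", "A3"]
remote_addrs = ["B1", "B2", "B3"]

# Only B3's reply arrives, addressed to A2.
local, remote, packets = choose_source(remote_addrs, [("B3", "A2")])

# Probing every combination would instead test n * m pairs.
all_pairs = list(product(local_addrs, remote_addrs))
```

With three addresses on each side, this sends 3 packets and settles on the pair (A2, B3), whereas probing every combination would cover 9 pairs in one direction alone.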
If "n^2 probing is simply not an option" is in your mind, then you'll
start realising there are other, simpler, better ways of achieving
the same end-goal - which don't require messing with local SAS policy
either, but rather use it as intended (without modification).
There's many many years of existing deployment of IP using systems and
applications that simply don't consider such complex probing worth it.
right, because they are not assuming the usage of multiple PA
addresses in a single host
That's a fair point, and likely part of the answer.
Another possibility is simply that path failures in the 'middle' of
the internet are rare and hence users and apps have not had any
compelling need to explore this possibility.
When you include multiple PA addresses in hosts within a multihomed
site, then you find out that you need to try with different source
No, you don't. Because you think it's an option (it isn't :) ),
you've stopped trying to find better options (which there are).
Note that part of my definition of "better option" is one which
includes "doesn't use n^2 probing and fiddle with SAS", so it may be
a self-fulfilling definition.
sort of... you are still lacking DoS protection and locator security
but kind of what is being considered (with a couple of additional
Yes, no security in that.
well, yes, but we are considering quite a few optimizations for
this, like ULP feedback and traffic monitoring also, but yes, this
is along the lines of the failure detection mechanism being
ULP feedback - don't rely on it :) (I checked, it exists in Solaris
TCP, for use by NDP, as you pointed out. I can't find anything in
path exploration is more complex than that, because you need to
change the source address to change the exit ISP. remember that we
are assuming PA addresses, and they are only routed through one of
the ISPs of the multihomed site.
I know this fine well, I've had operational experience of PA
multihoming using tunneling with IPv4. :)
See above, given above, you *do not* need to do anything clever with
source-address selection at all, imho.
i think that most of this stuff is included in the current drafts,
just that additional complexity is considered, for instance
security stuff, unidirectional path support and considerations
about the constraints imposed by the usage of multiple PA addresses
in the multihomed site and ingress filtering
Paul Jakma email@example.com firstname.lastname@example.org Key ID: 64A2FF6A
The mouse escaped.