Re: failure detection
On 19/08/2005, at 17:11, Paul Jakma wrote:
On Fri, 19 Aug 2005, marcelo bagnulo braun wrote:
Why must host1 detect this? Host2 could also ;).
not in a unidirectional connectivity scenario
consider the case where the failure implies that:
PrefA:Host1 -> Host2 is not working
PrefB:Host1 -> Host2 is working
Host2 -> PrefA:Host1 is working
Host2 -> PrefB:Host1 is not working
How would you cope with this case?
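The scenario above can be sketched as a small table of directed paths. This is only an illustration: the dictionary layout and the helper names (forward_ok, reverse_ok, bidirectional_ok) are assumptions made for the example, not anything defined by shim6.

```python
# Hypothetical model: each directed path (source, destination) maps to
# True if packets get through in that direction.
working = {
    ("PrefA:Host1", "Host2"): False,
    ("PrefB:Host1", "Host2"): True,
    ("Host2", "PrefA:Host1"): True,
    ("Host2", "PrefB:Host1"): False,
}

host1_locators = ["PrefA:Host1", "PrefB:Host1"]


def forward_ok(working, locators, peer):
    """Host1 locators that work as a source towards the peer."""
    return [loc for loc in locators if working.get((loc, peer), False)]


def reverse_ok(working, locators, peer):
    """Host1 locators that the peer can reach as a destination."""
    return [loc for loc in locators if working.get((peer, loc), False)]


def bidirectional_ok(working, locators, peer):
    """Host1 locators that work in both directions."""
    fwd = set(forward_ok(working, locators, peer))
    return [loc for loc in reverse_ok(working, locators, peer) if loc in fwd]


print(forward_ok(working, host1_locators, "Host2"))
print(reverse_ok(working, host1_locators, "Host2"))
print(bidirectional_ok(working, host1_locators, "Host2"))
```

In this model no single Host1 locator works in both directions, so each direction has to use a different prefix, and neither host can learn that from its own outgoing traffic alone.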
How important is this case?
Further, in your scenario, this was due to a local-failure near Host1.
A failure which can easily be detected locally without any need for
What's needed is:
- Host1 to detect the local failure and update the exit path to use
(and hence the source to use)
- this is achievable in multiple ways
- none of which need be in shim6
- none of which require shim6 to be aware of SAS or egress
- Host2 shim6 to detect host1's valid locators have changed
- Maybe because it receives a packet from Host1 with a new
- Maybe because Host2's reachability probes detect PrefB
How common is this failure mode?
You want to specify that shim6 be able to work around /any/ kind of
routing failure, anywhere on any part of the internet affecting any
path between Host1 and Host2.
My gut feelings though are:
- Failures typically are near the edges
- Failures are typically bi-directional for a given path
- Uni-directional failures tend to be due to /congestion/, not
actual failures - again, typically at the edges. Congestion related
"failures" tend to be very transient/sporadic.
- Failures in the 'middle' are uncommon, and tend to affect /huge/
numbers of paths (ie there's a decent chance it will take out /all/
- The problem of uni-directional failure on two /unrelated/ paths at
the same time is *tiny*
Hence (as a gut feeling):
- n^2 probing in shim6 is simply introducing huge expense in order to
solve a very uncommon problem
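To put a rough number on the "n^2" cost, here is a minimal sketch that enumerates every (local locator, remote locator) pair a host would have to probe to explore all paths in one direction. The function name and the locator labels are hypothetical; the thread does not specify the actual probing scheme.

```python
from itertools import product


def locator_pairs(local_locators, remote_locators):
    """All (source, destination) locator pairs for one direction."""
    return list(product(local_locators, remote_locators))


# Assumed example: each host has one locator per provider prefix.
two_by_two = locator_pairs(["A1", "B1"], ["A2", "B2"])
three_by_three = locator_pairs(["A1", "B1", "C1"], ["A2", "B2", "C2"])

print(len(two_by_two), len(three_by_three))
```

The pair count is the product of the two locator set sizes, so it grows quadratically when both hosts have the same number of locators, and that is per peer and per direction.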
You think the tradeoff in order to achieve perfection is worth it.
I don't. I think the above is a general quality-of-internet-routing
problem. I think it's something that should and will be tackled within
the routing area, where people have been and continue to be working on
optimising routing protocols (from OSPF to BGP) to cope gracefully
with failures and restarts in order to eliminate some common scenarios
where routing-loops can occur in today's routing protocols.
I don't see a compelling reason to consider problems in internet
routing to be something shim6 needs to introduce great complexity for
in order to work around, when a simple approach (let underlying OS
routing pick the local prefix) will likely allow 99% of failures to be
detectable and worked around.
ok, i guess we have come to the key point here.
We agree that the mechanism proposed for the shim is what is
needed to deal with all failure modes and to identify whether there is at
least one working path, right?
We seem to disagree about whether the cost that implies is worth it, right?
you seem to consider that there are simpler methods that would deal
with a significant amount of the common failure modes, in particular
the one you detail above.
I guess that RFC3178 probably already provides a reasonable solution
with the protection level that you ask for. I mean, RFC3178
protects against failures at the edges in a transparent fashion.
There are many other possible mechanisms.
A host could have the following default route:
default via ISP1-gateway
what if there is a single router on a link of the multihomed site? i
mean, you cannot assume that on every link of the multihomed site there
will be as many routers as ISPs the site is multihomed to, right?
At this point, i guess you end up requiring source address based
routing in the multihomed site, so that the end host can force
routing through the selected exit ISP. The shim would use the source
address to actually select the exit ISP, and hence the shim would be
selecting the source address, i guess
ISP1-gateway device X src ISP1-PA-address
ISP2-gateway device X src ISP2-PA-address
Some external mechanism could update this route as required. Be it
gateway-probing, probing "well known hosts on the internet", a RIP
default announced from border routes, or even an application which
monitors route-lookups or the OS route-cache and probes those and
updates routes accordingly.
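As a minimal sketch of such an external mechanism, the following picks the default route (and with it the source address) from the two entries quoted above, based on a caller-supplied gateway check. The data layout, the function name select_default_route and the gateway_is_up callback are all hypothetical; a real implementation would talk to the OS routing table and use whichever probing method is preferred.

```python
# Hypothetical route entries mirroring the two entries quoted above.
ROUTES = [
    {"gateway": "ISP1-gateway", "src": "ISP1-PA-address"},
    {"gateway": "ISP2-gateway", "src": "ISP2-PA-address"},
]


def select_default_route(routes, gateway_is_up):
    """Return the first route whose gateway passes the external check.

    gateway_is_up is any callable taking a gateway name and returning
    True or False (gateway probing, a RIP default, and so on).
    Returns None if no gateway is currently usable.
    """
    for route in routes:
        if gateway_is_up(route["gateway"]):
            return route
    return None


# Assumed failure: ISP1's gateway has stopped responding.
chosen = select_default_route(ROUTES, lambda gw: gw != "ISP1-gateway")
print(chosen)
```

Nothing in this selection step needs to know about shim6; the shim only sees that packets now leave with a different source address.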
The wealth of possibilities if you at least /allow/ the
source-selector mechanism to be external to shim seems a compelling
reason for shim6 to /not/ (by default) get involved.
So far, we are assuming that the shim is a host based approach and
that each host performs all the functions of the shim
the case of the proxy that you are mentioning has been considered
and it is attractive, but it presents some difficulties, especially
w.r.t. security... perhaps you could try to consider the security
implications of the split that you are considering...
There is one case where 'split' or 'proxy' mode shimming would be
possible without security ramifications, I think. The case where the
ULID's in use are IPv6 addresses, network prefix and host identifier.
Then a simple stateless static mapping on the "shimmers" (which would
be gateways into/out of the "shimmed" ULID network) will do. Security
then is simply not a concern, no more than it is for a normal router
with a static forwarding table.
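One possible reading of "stateless static mapping" is a fixed prefix rewrite that keeps the host identifier bits untouched. The sketch below illustrates that reading only; the function name rewrite_prefix and the documentation prefixes (2001:db8::/32) are assumptions for the example, not part of any shim6 specification.

```python
import ipaddress


def rewrite_prefix(addr, new_prefix):
    """Replace the network prefix of addr, keeping the host bits.

    addr is an IPv6 address string; new_prefix is an IPv6 network
    string such as "2001:db8:b::/64". No per-flow state is needed.
    """
    address = ipaddress.IPv6Address(addr)
    network = ipaddress.IPv6Network(new_prefix)
    host_bits = int(address) & int(network.hostmask)
    return ipaddress.IPv6Address(int(network.network_address) | host_bits)


# Assumed example: map a ULID in one prefix to a locator in another.
locator = rewrite_prefix("2001:db8:a::1234", "2001:db8:b::/64")
print(locator)
```

Because the mapping depends only on the configured prefixes, the gateway holds no per-peer state, which is the property the argument above relies on.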
not sure what you mean... are you thinking of something like GSE here?
not sure what you mean here... ULID is upper layer identifier, right?
so the ULID belongs to the shim host, so i guess they are in the same
As above, it seems possible to me that shim6 could allow for at least
one usage that would not require shim and "ULP" to be same machine. It
could be very useful for small/medium size multihomed sites (ie not
large enough to get a global IPv6 prefix).
i agree it would be useful, but i am still not sure how you deal with
security stuff in this case...
Paul Jakma firstname.lastname@example.org email@example.com Key ID: 64A2FF6A
Don't let your status become too quo!