
Re: failure detection

To: marcelo bagnulo braun <marcelo@it.uc3m.es>
Subject: Re: failure detection
From: Paul Jakma <paul@clubi.ie>
Date: Fri, 19 Aug 2005 16:11:34 +0100 (IST)
Cc: shim6 <shim6@psg.com>
In-reply-to: <4eb5dc3a95d2217a22ab1d81e23fd10d@it.uc3m.es>
Mail-copies-to: paul@hibernia.jakma.org
References: <8622E6A4-B0D7-4C9B-B184-8EB2A7C2738E@muada.com> <Pine.LNX.4.63.0508141523170.7023@sheen.jakma.org> <efebcb5728efd81901d5357b3993b6db@it.uc3m.es> <Pine.LNX.4.63.0508171556080.5353@sheen.jakma.org> <efa6464a563345cc24542d6ab48f3538@it.uc3m.es> <Pine.LNX.4.63.0508171932550.5353@sheen.jakma.org> <0f13bcc353755a4b9b965267a6a7ffb1@it.uc3m.es> <Pine.LNX.4.63.0508181034240.5291@sheen.jakma.org> <d1bbabb2d2a04821223d24f940796d23@it.uc3m.es> <Pine.LNX.4.63.0508181513480.5291@sheen.jakma.org> <4eb5dc3a95d2217a22ab1d81e23fd10d@it.uc3m.es>

On Fri, 19 Aug 2005, marcelo bagnulo braun wrote:

Why must host1 detect this? Host2 could also ;).


not in a unidirectional connectivity scenario

consider the case where the failure implies that:
PrefA:Host1 -> Host2 is not working
PrefB:Host1 -> Host2 is working
Host2 -> PrefA:Host1 is working
Host2 -> PrefB:Host1 is not working

How would you cope with this case?


How important is this case?

Further, in your scenario, this was due to a local-failure near Host1. A failure which can easily be detected locally without any need for n^2 probing.

What's needed is:

- Host1 to detect the local failure and update the exit path to use
  (and hence the source to use)
	- this is achievable in multiple ways
	- none of which need be in shim6
	- none of which require shim6 to be aware of SAS or egress
	  issues

- Host2 shim6 to detect host1's valid locators have changed
	- Maybe because it receives a packet from Host1 with a new
	  source
	- Maybe because Host2's reachability probes detect PrefB

How common is this failure mode?

You want to specify that shim6 be able to work around /any/ kind of routing failure, anywhere on any part of the internet affecting any path between Host1 and Host2.

My gut feelings though are:

- Failures typically are near the edges
- Failures are typically bi-directional for a given path
- Uni-directional failures tend to be due to /congestion/, not
  actual failures - again, typically at the edges. Congestion related
  "failures" tend to be very transient/sporadic.
- Failures in the 'middle' are uncommon, and tend to affect /huge/
  numbers of paths (ie there's a decent chance it will take out /all/
  your paths)
- The problem of uni-directional failure on two /unrelated/ paths at
  the same time is *tiny*

Hence (as a gut feeling):

- n^2 probing in shim6 is simply introducing huge expense in order to
  solve a very uncommon problem

You think the tradeoff in order to achieve perfection is worth it.

I don't. I think the above is a general quality-of-internet-routing problem. I think it's something that should and will be tackled within the routing area, where people have been and are continuing to work on optimising routing protocols (from OSPF to BGP) to cope gracefully with failures and restarts in order to eliminate some common scenarios where routing-loops can occur in today's routing protocols.

I don't see a compelling reason to consider problems in internet routing to be something shim6 needs to introduce great complexity for in order to work around, when a simple approach (let underlying OS routing pick the local prefix) will likely allow 99% of failures to be detectable and worked around.

I'd like to see more information on path failure modes seen on the internet, what is common, what is not, before I would change my position.

this last point is influencing SAS and trying with alternative source locators, which is basically what we are considering here and that you are opposing right?


Correct.

As I stated later in that mail, you can achieve nearly the same thing by simply relying on external means to shim6 to update which egress interface/address the OS uses.

My position is that "nearly the same thing" is more than good enough, particularly when it could eliminate a great amount of complexity from shim6.

Note that such external mechanisms would *not* be precluded from doing n^2 probing, or any other fancy scheme they want to implement. (I wouldn't care to implement it, but..). Ie I'm not arguing shim6 should preclude such probing, only that the base spec should assume simple mechanisms and provide the protocol tools to allow more complex probing (eg a 'PROBE' message or somesuch).

So weirder, more complex (and unneeded imho) probing and getting involved in SAS would be an implementation detail ;).

we have already discussed this point (in multi6) and imho it is not such a good idea to have the hosts receive a full BGP feed.. i think there was some kind of consensus on this point, but maybe now it has changed...

BGP feed (not advertising anything) is simply /one/ option - I gave a list of possibilities, using a routing protocol was just one of them. Using BGP would only be suitable for enterprise sites (presuming shim6 allows for a split/proxy mode).

Eg, DSL connected hosts: My experience, with my ISP, is that the only failures I actually notice are telco/local-loop related, or related to the DSL cable run in my house - and my PPP stack (which implements a type of keepalive) is the first to notice.

Maybe I just have a good ISP.

Oh, there's yet another option, in the long-run: If your ISP(s) suck so much that odd uni-directional path failures are regular enough in occurrence -> drop that ISP and go to another.

The idea is that shim performs e2e failure detection (as i already mentioned earlier) so that the fate of the communicating parties i.e. the apps is shared with the fault tolerance mechanism and that the shim can detect all the potential outages and recover from them. Having different mechanisms deal with different types of failures would result in a reduced protection i guess

I can see why the goal is attractive. I wonder though whether the required complexity is worth it.

but you are assuming that host1 detects local failures through other means


Correct.

and these other means are likely to be injecting a full BGP feed into end hosts, right?


No, that's not likely at all.

Your math is correct


Phew, cause it's been a while ;).

now let me ask: how many unidirectional address pairs are available between two hosts having y and n addresses each one?

i guess that we agree that there are y^2+n^2 different unidirectional address pairs, right?


Right.
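A quick way to check the count is to enumerate the (source, destination) combinations directly. The sketch below is only an illustration with hypothetical locator names: with y addresses on one host and n on the other, each direction has y*n pairs, so the enumeration yields 2*y*n in total, which matches y^2+n^2 only when y = n.

```python
from itertools import product


def unidirectional_pairs(host1_addrs, host2_addrs):
    """List every (source, destination) pair, in both directions."""
    forward = list(product(host1_addrs, host2_addrs))  # Host1 -> Host2
    reverse = list(product(host2_addrs, host1_addrs))  # Host2 -> Host1
    return forward + reverse


# Hypothetical locators: y = 2 on Host1, n = 3 on Host2.
host1 = ["PrefA:Host1", "PrefB:Host1"]
host2 = ["PrefX:Host2", "PrefY:Host2", "PrefZ:Host2"]

pairs = unidirectional_pairs(host1, host2)
print(len(pairs))  # 2 * y * n = 12
```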

So the point is: if you want to provide full fault tolerance you need to explore them all, if you don't, there may exist available paths that you are discarding, hence there are communications that could be preserved but you are not finding the available path to use.


Right.

Is it worth it though?

You're coming from a position where you do not wish to have to rely on the stability of internet routing. In your world the "internet cloud" is likely to be swiss-cheese (full of "black holes" ;) ), and can't be relied upon.

That isn't in line with my experience of the internet. IMHO, it's reliable "enough". Further, if there are reliability problems in internet routing, then surely the best way to deal with them is in the /routing area/ working groups? ;)

Ie, is it worth it?

Unreliable ISPs will (eventually) be taken care of by market pressures. They'll either fix their problems, maybe take advantage of graceful-failure mechanisms which are becoming more prevalent in routing protocols and implementations, or they will slowly die as their customers move elsewhere.

Another factor to consider in routing reliability, that (IMHO) obviates the need to worry so much about it in shim6: VoIP. As more and more telcos switch over to a unified IP core for both their voice and data services, they're putting ever more resources into researchers, implementors and the IETF to make routing 'perfect', at least for intra-AS - they can't tolerate packet loss or odd latencies because customers will *hear* it. In time I suspect even inter-AS routing will be optimised to be much more stable than it is today, as eventually (I suspect) VoIP inter-telco peering will replace SS7 (and they'll get researchers/implementors and the IETF to optimise that case too).

Anyway, I think that's a fairly exhaustive explanation of why I think perfection in using available paths should /not/ be a core shim6 objective.

I'll stop harping on on that topic. I would like to see justification as to why it should be though.

Of course, you may have optimizations, like testing two unidirectional paths (one in each directions) with a single packet exchange (2 packets instead of 4), using local information for discarding some addresses and so on, but again these are only optimizations.

You could I guess. I wouldn't, but you could. I suspect most people would be happy with just assuming the internet "cloud" is mostly reliable.

Particularly: If it means that shim6 drafts are simplified, easier to write, easier to get reviewed and approved, easier to implement the basic functionality, etc.. then that means shim6 gets "to market" (ick) quicker.

If I can get to use shim6 in N years because it's simple, rather than N+2 because it tries to cope with every possible path-failure..

how? i guess that you are assuming a BGP feed on hosts, right?

No, I was assuming a split-mode of operation, with shimmed-address/ULID using hosts not doing the shimming, but edge 'shim routers' doing it (hence BGP would be confined on those edges).

The drafts consistently refer to shim sitting on each host. So my assumption appears to be wrong. Though, I don't see why split-mode would not be possible (particularly if ULID's are IPv6 addresses and composed of a prefix and host identifier..).

I mean, a host with a single default route wouldn't really know which of the n addresses available on its single interface to use for a given destination address... i mean DAS would not result in any particular address and the source address would be selected randomly...


There are many other possible mechanisms.

A host could have the following default route:

default via ISP1-gateway
	via ISP2-gateway

ISP1-gateway device X src ISP1-PA-address
ISP2-gateway device X src ISP2-PA-address

Some external mechanism could update this route as required. Be it gateway-probing, probing "well known hosts on the internet", a RIP default announced from border routers, or even an application which monitors route-lookups or the OS route-cache and probes those and updates routes accordingly.
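The gateway-probing variant of such an external mechanism might look roughly like the sketch below. It is an assumption-laden illustration: `Nexthop`, `select_nexthop` and the stand-in probe are hypothetical, and a real tool would send actual probes and install the chosen route through the OS rather than just return it.

```python
from typing import Callable, NamedTuple, Optional


class Nexthop(NamedTuple):
    gateway: str
    device: str
    src: str  # source address the OS should use with this gateway


# Mirrors the two-nexthop default route above (names are placeholders).
NEXTHOPS = [
    Nexthop("ISP1-gateway", "X", "ISP1-PA-address"),
    Nexthop("ISP2-gateway", "X", "ISP2-PA-address"),
]


def select_nexthop(nexthops, is_alive: Callable[[str], bool]) -> Optional[Nexthop]:
    """Return the first nexthop whose gateway answers the probe, else None."""
    for nh in nexthops:
        if is_alive(nh.gateway):
            return nh
    return None


# Stand-in probe: pretend ISP1's gateway has stopped answering.
down = {"ISP1-gateway"}
chosen = select_nexthop(NEXTHOPS, lambda gw: gw not in down)
```

Because the source address travels with the chosen nexthop, the shim never has to reason about egress selection itself.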

The wealth of possibilities if you at least /allow/ the source-selector mechanism to be external to shim seems a compelling reason for shim6 to /not/ (by default) get involved.

So far, we are assuming that the shim is a host based approach and that each host performs all the functions of the shim


Yes.

the case of the proxy that you are mentioning has been considered and it is attractive but it presents some difficulties, especially w.r.t security... perhaps you could try to consider the security implications of that split that you are considering...

There is one case where 'split' or 'proxy' mode shimming would be possible without security ramifications, I think. The case where the ULID's in use are IPv6 addresses, network prefix and host identifier.

Then a simple stateless static mapping on the "shimmers" (which would be gateways into/out of the "shimmed" ULID network) will do. Security then is simply not a concern, no more than it is for a normal router with a static forwarding table.
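A minimal sketch of such a stateless mapping, assuming /64 prefixes and documentation-range addresses chosen purely for illustration (`rewrite_prefix` is a hypothetical helper, not part of any shim6 draft): the gateway swaps the network prefix and leaves the host identifier untouched.

```python
import ipaddress


def rewrite_prefix(addr: str, new_prefix: str) -> str:
    """Replace the upper 64 bits of addr with new_prefix, keep the low 64 bits."""
    net = ipaddress.IPv6Network(new_prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch assumes /64 prefixes")
    host_bits = int(ipaddress.IPv6Address(addr)) & ((1 << 64) - 1)
    return str(ipaddress.IPv6Address(int(net.network_address) | host_bits))


# Hypothetical ULID inside the site, mapped to a provider-assigned prefix.
ulid = "2001:db8:aaaa:1::42"
locator = rewrite_prefix(ulid, "2001:db8:bbbb:1::/64")

# The reverse mapping on inbound traffic restores the ULID.
restored = rewrite_prefix(locator, "2001:db8:aaaa:1::/64")
```

The mapping holds no per-flow state, which is what makes it comparable to a static forwarding table.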

not sure what you mean here... ULID is upper layer identifier, right? so the ULID belongs to the shim host, so i guess they are in the same machine


Right.

As above, it seems possible to me that shim6 could allow for at least one usage that would not require shim and "ULP" to be same machine. It could be very useful for small/medium size multihomed sites (ie not large enough to get a global IPv6 prefix).

regards,
--
Paul Jakma	paul@clubi.ie	paul@jakma.org	Key ID: 64A2FF6A
Fortune:
Don't let your status become too quo!

Follow-Ups:
- Re: failure detection
  - From: Erik Nordmark <erik.nordmark@sun.com>
- Re: failure detection
  - From: marcelo bagnulo braun <marcelo@it.uc3m.es>
- Re: failure detection
  - From: Iljitsch van Beijnum <iljitsch@muada.com>
