
Re: soft state (was Re: shim6 and bit errors in data packet headers

To: Iljitsch van Beijnum <iljitsch@muada.com>
Subject: Re: soft state (was Re: shim6 and bit errors in data packet headers
From: Erik Nordmark <erik.nordmark@sun.com>
Date: Wed, 01 Jun 2005 14:01:03 -0700
Cc: shim6 <shim6@psg.com>
In-reply-to: <645EF203-1E18-4278-A499-FCB9F48AB654@muada.com>
References: <42780A4C.7080904@sun.com> <41646a8410adc0a6c58bf98db0260577@it.uc3m.es> <427F9FC0.3020509@sun.com> <2254e1ef5ab17f200b4bd5f09a92b63e@it.uc3m.es> <4281276F.4020802@sun.com> <8394717ce2d159e84afd5244914eca6c@it.uc3m.es> <42824BD0.90608@sun.com> <0ac42790d17b083958f7ef83c670be3c@it.uc3m.es> <4283F3B1.7010906@sun.com> <a66e3bc998a4e570312aecc7d4e4259d@it.uc3m.es> <42853EAA.3020206@sun.com> <a944cd319cc32b3751b9275782e3e8df@it.uc3m.es> <4288D5F3.10507@sun.com> <c6b1ad0ae825764532280c9cf85463d1@it.uc3m.es> <94BCF980-BA34-4488-9674-508609CEAF64@muada.com> <3e95934eb9792bab0d1c56fcc6c3b034@it.uc3m.es> <05AA31AF-F4DE-4472-B770-5F32EDA4A405@muada.com> <baa2bdbbc44db9796d5008379de2b20d@it.uc3m.es> <719D2E97-1171-46FB-A9B3-774C7533C60F@muada.com> <42979307.2030008@sun.com> <3BB337BE-E932-4EEE-BFA8-20C2D666101D@muada.com> <429CA0A9.90108@sun.com> <6A81B75E-0FEF-4F39-879D-8F0A0EB6AF73@muada.com> <429CCF70.6030408@sun.com> <645EF203-1E18-4278-A499-FCB9F48AB654@muada.com>
User-agent: Mozilla Thunderbird 1.0.2 (X11/20050323)

Iljitsch van Beijnum wrote:

(BTW: advise is a verb, advice is a noun.)


Oops. Thanks.

Thus the lack of positive advice is not the same as negative advice.
So what we really need is: good / unknown / bad rather than either good / unknown or bad / unknown.

I'm not sure we need that level of complexity. With ULPs providing a single bit with the semantics 1 = good advice, 0 = I have nothing to say, I think we'll have sufficient information in the shim.

I imagine some optimizations, such as recognizing that there is no reply for TCP ACK-only packets, but let's ignore those for now.

What I propose is a mechanism that purely looks at whether traffic is flowing in both directions. This doesn't require parsing any headers except source and destination addresses, which the shim must look at anyway.

I guess I don't understand what this would add beyond what your proposed negative advice already can do.

Perhaps you can explain how this would work in this example:
A is using locator pair <A1, B1> and B is using <B1, A1>.
TCP is sending data from A to B, with B responding with ACKs.

The path from B->A fails. The shim layer on A observes this because it stops receiving packets. (But TCP on A also starts retransmitting.) The shim layer on B thinks everything is fine because it sees the packets from A, and it sees the ACKs that B is sending back to A.

Yes, but it opens the door for continuous reachability probes, which is a bad thing because it wastes bandwidth and because it is likely to detect failures when the link is idle, which we shouldn't do IMO.

Quite the contrary. Just as with RFC 2461 NUD probing, the probes would be data driven. Thus, if there are no ULP packets to send, there will be no probes.
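
A minimal sketch of what "data driven" probing could look like, assuming the shim is only consulted when the ULP hands it a packet. The class name, the method names, and the 10-second freshness window are illustrative assumptions, not part of any shim6 specification.

```python
from dataclasses import dataclass


@dataclass
class DataDrivenProber:
    """Hypothetical shim-side state for one locator pair."""

    reachable_time: float = 10.0  # assumed: seconds a confirmation stays fresh
    last_confirmed: float = 0.0   # time of the last reachability confirmation

    def on_probe_reply(self, now: float) -> None:
        # A reply to our own probe confirms the path in both directions.
        self.last_confirmed = now

    def on_ulp_send(self, now: float, positive_advice: bool) -> bool:
        """Return True if a probe should be sent along with this ULP packet."""
        if positive_advice:
            # The ULP reports progress, so the probe is suppressed.
            self.last_confirmed = now
            return False
        # No advice: probe only if the last confirmation has gone stale.
        return now - self.last_confirmed > self.reachable_time
```

Because on_ulp_send() is the only place a probe can be requested, an idle ULP never causes a probe to be sent.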

But how about this: each side tells the other side a timeout value: after not having seen any traffic from A, B starts probing. Now one of three situations can happen:

But doesn't this lead to a choice between having to send probes when there is no ULP traffic, or having a long timer resulting in a long time until a failure would be detected?

- regular traffic: the timer is restarted before it expires by regular
  traffic, so the timer never expires and there are no probes
- irregular traffic: in order to make sure the timer doesn't expire if
  there isn't any traffic for some time, the sender injects keepalives
  so there are no probes

The shim on the sender? Or the ULP? In either case, you end up adding packets when things would have otherwise been idle.

- no traffic: the sender sets the timer to a very large value or
  infinity, so there are no probes

If the timer has been set to e.g. 5 minutes, how quickly can the shim detect a failure?
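
To make the question concrete, here is a rough sketch of the announced-timeout scheme as described above, assuming the receiver simply tracks when it last heard from the peer. The names are hypothetical, and the 300-second value in the usage note is just the 5-minute example.

```python
import math
from dataclasses import dataclass


@dataclass
class SilenceTimer:
    """Hypothetical receiver-side timer for the announced-timeout scheme."""

    timeout: float           # value announced by the peer; math.inf means never probe
    last_heard: float = 0.0  # time the last packet from the peer arrived

    def on_receive(self, now: float) -> None:
        # Any packet from the peer (data or keepalive) restarts the timer.
        self.last_heard = now

    def should_probe(self, now: float) -> bool:
        # Probing starts only once the peer has been silent for the full timeout.
        return now - self.last_heard >= self.timeout
```

With timeout=300.0 and the last packet heard at t=100, should_probe() first returns True at t=400. A failure that happens right after a packet arrives therefore goes unnoticed for the full timeout, and with timeout=math.inf it is never noticed by this timer at all.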

I'm assuming that when reachability probes are sent, probes with different address pairs are sent until a working pair is found in both directions, or it is determined that there is no bidirectional connectivity anymore. So B would be informed unless there is no longer any reachability possible.

Yes, but it takes much longer for B to be informed, because B will not know there is a problem until A manages to get its first probe through to B. If B has data driven probes (suppressed when there is positive advice from the ULP), then it can find out sooner.

It's hard to say for sure what's going to work best in practice, so we need some experimentation at some point. However, that doesn't help us now, as we need to find a good candidate or a small number of candidates for the initial experiments.


Agreed we need experiments down the road.

If the ULP retransmits 10 times with binary exponential backoff starting with a timeout of 4 seconds, and it has been told to send negative advice after consuming half the retransmits, then the shim will see negative advice after 4+8+16+32+64 seconds.
(Note that this doesn't apply to what I was talking about as the ULP itself isn't involved.)

Understood. But I think probing that is suppressed when the ULP doesn't send anything is a useful approach, because it avoids adding any probes or keepalives when the ULP is silent.
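
The backoff arithmetic quoted above (4+8+16+32+64) can be checked with a few lines. This is only a sketch of that sum; the function name and parameters are made up for illustration.

```python
def time_until_negative_advice(initial_rto, max_retransmits):
    """Sum the timeouts that elapse before half the retransmits are consumed."""
    timeouts_waited = max_retransmits // 2
    # Binary exponential backoff: each timeout is double the previous one.
    return sum(initial_rto * 2 ** i for i in range(timeouts_waited))


print(time_until_negative_advice(4, 10))  # 124
```

So with these numbers the shim would wait a little over two minutes before hearing anything negative from the ULP.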

I agree 10 secs is a good starting point but I'm afraid using 10 seconds will clash with transports that use 10 seconds themselves... We probably need to review a bunch of ULPs to make a good decision here.

I think both TCP and SCTP start off with 4 seconds RTO if they have no RTT estimate for the peer. But it makes sense to validate this.

- send 3 probes at time 10 seconds
- send 3 other probes at time 14 seconds
- send 3 more at time 22 seconds

Why send 3 at the same time and then wait? Even with a small packet train of 3 packets we're unnecessarily bursty. I think sending one at a time would be better, for instance:

Agreed. My "3 at a time" was just to say that I think we can send a fixed small number of probes for each exponential backoff "epoch" and still not cause a congestion control problem. But it makes sense to space them out as you suggest.
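
One possible way to space the probes out, as a sketch only: keep the epoch boundaries from the example (10, 14 and 22 seconds, i.e. gaps of 4 and 8 seconds) and spread the fixed number of probes evenly inside each epoch. The even spacing, the function name and the defaults are assumptions made for illustration, not a schedule anyone in the thread proposed.

```python
def probe_times(first=10.0, backoff=4.0, epochs=3, per_epoch=3):
    """Return send times with per_epoch probes spread evenly over each epoch."""
    times = []
    start, length = first, backoff
    for _ in range(epochs):
        step = length / per_epoch
        times.extend(start + k * step for k in range(per_epoch))
        start += length  # next epoch begins where this one ends
        length *= 2      # binary exponential backoff between epochs
    return times
```

Calling probe_times(per_epoch=1) gives back the original epoch starts, while the default arguments yield nine strictly increasing send times with no two probes sent back to back.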

Hm, ok. But we have to be careful about mandating continuous communication between layers. In a properly layered implementation, this type of communication can be quite expensive (context switches and so on).

FWIW the implementation of this that I know best doesn't add any communication events between the layers. When TCP goes to send a packet to the IPv6 layer, it can set a single flag, "positive reachability advice", as part of the packet that goes down to the IPv6 transmit routine.
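
A toy sketch of that flag-on-the-packet idea, assuming the flag is set whenever TCP has seen a new ACK since its previous send. All class and attribute names here are hypothetical and do not describe any real stack.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    payload: bytes
    positive_advice: bool = False  # 1 = good advice, 0 = nothing to say


class Ipv6Layer:
    def __init__(self):
        self.advice_seen = 0
        self.sent = []

    def transmit(self, pkt):
        # The shim reads the flag on the normal transmit path;
        # there is no separate call from the ULP into the shim.
        if pkt.positive_advice:
            self.advice_seen += 1
        self.sent.append(pkt.payload)


class TcpSender:
    def __init__(self, ip):
        self.ip = ip
        self.progress = False  # assumed: set when a new ACK has arrived

    def on_new_ack(self):
        self.progress = True

    def send(self, payload):
        # The advice rides along with the packet that was going out anyway.
        self.ip.transmit(Packet(payload, positive_advice=self.progress))
        self.progress = False
```

The only interaction between the two layers is the transmit() call that already exists, which is why no extra inter-layer events are needed.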

   Erik
