
Re: About draft-ietf-shim6-failure-detection-07 {2}



I'm curious - was this ever resolved?

    Brian

-------- Original Message --------
Subject: About draft-ietf-shim6-failure-detection-07 {2}
Date: Fri, 26 Jan 2007 13:53:10 +0100
From: Sébastien Barré <sbarre@info.ucl.ac.be>
To: jari.arkko@ericsson.com, iljitsch@muada.com
CC: shim6@psg.com

Hi,

I have another question regarding the failure detection draft. This is
about the end of an exploration. I think it is useful to have a notion of
- when an exploration process begins.
- when an exploration process terminates.

* Motivation :
   - An implementation might want to have a specific thread/process for
each exploration. It is useful to know when the exploration thread may
be terminated.
   - We need to store information about each probe sent/received
(because reports are included in the probes). It is useful to know when
the memory used by this information can be released. This would ensure
that only information relating to the current exploration is stored in
memory. Also, a context in the Operational state would need less space
than a context that is still exploring, which scales better if lots of
contexts are present.

* The problem :

While it is fairly evident when we must trigger an exploration (send
timer expiry or an incoming Probe Exploring), we are not so certain about
the end of an exploration. More precisely, *one of the peers* is not
certain, while the other is. Here is an example situation with the
currently defined scheme for the end of an exploration :

Peer A                                        Peer B
     |                                             |
   State:                                        State:
 Inbound_OK                                    Exploring
     |                                             |
     |           Probe Inbound OK                  |
     |-------------------------------------------->|
     |                                           State:
     |                                         Operational
     |           Probe Operational                 |
     |          /----------------------------------| path working, but probe lost (because of
     |                                             | congestion for example)
     |           Probe Inbound OK                  |
     |-------------------------------------------->|
     |                                             |
     |           Probe Operational                 |
     |<--------------------------------------------|
     |                                             | Now B can forget its received/sent probe reports,
   State:                                          | because A is in state Operational, but B has no way to know it.
 Operational

As illustrated in this scenario, when A receives a Probe Operational, it
knows for sure that B is operational, and therefore that the exploration
process is terminated. But B won't receive such a probe. B enters
the Operational state when it receives a Probe Inbound OK. This means
that it knows the conversation will work from now on, but it does not
mean that A won't ever ask B to send its list of sent/received
probe reports, as is the case in the above scenario (because the
first Probe Operational was lost).

* One possible (and simple :-) ) solution :

The only problem is for B to know for sure that A is
operational, so just let it know this fact by sending another Probe
Operational. Of course, with such a rule, we could end up with A and B
sending Probes Operational forever. This can be simply solved by
having something different in the last Probe Operational, such as a
flag. It can also be solved by sending no probe report (psent=0 and
precvd=0). This may seem to contradict my previous mail, but it
doesn't : we are here in the special case where A *knows* that B is in
operational state, so B no longer needs any probe
information; its only need now is to be sure that A is also in
operational state. In fact, the only thing that B will do when receiving
the last probe is check whether it is a Probe Operational or a
Probe Inbound OK :
   - in the first case : stop the exploration process gracefully
   - in the second case : the last Probe Operational from B has been
lost, send another one.
This results in the following scenario (very similar, but much simpler
and more efficient from an implementation point of view, IMHO) :

Peer A                                        Peer B
     |                                             |
   State:                                        State:
 Inbound_OK                                    Exploring
     |                                             |
     |           Probe Inbound OK                  |
     |-------------------------------------------->|
     |                                           State:
     |                                         Operational
     |           Probe Operational                 |
     |          /----------------------------------| path working, but probe lost (because of
     |                                             | congestion for example)
     |           Probe Inbound OK                  |
     |-------------------------------------------->|
     |                                             |
     |           Probe Operational                 |
     |<--------------------------------------------|
     |                                             |
   State:                                          |
 Operational                                       |
     |                                             |
     |           Probe Operational|final           |
     |-------------------------------------------->|
     |                                             | Now B can forget its received/sent probe reports.


In fact, the final flag is not necessary, because there is no ambiguity.
Here is what would change in the state machine :
event : reception of a probe message with state Operational
   - if in state Operational : just update timers (see draft) -> *end of
exploration*
   - if in state Inbound_OK : go to Operational, update timers AND send a
Probe Operational
   - if in state Exploring : go to Operational, update timers AND send a
Probe Operational (hmm, in fact I guess this case should not occur, unless
one of the peers is buggy, but this response to such an unlikely event
seems OK).
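
To make the three cases above concrete, here is a minimal sketch in Python
of the handler for a received Probe Operational. It is only an illustration
of the rule proposed in this mail, not text from the draft: the function
name on_probe_operational and the action labels (update_timers,
send_probe_operational, end_exploration) are made up for the example, and
the timer update itself is left to the draft.

```python
from enum import Enum


class State(Enum):
    EXPLORING = "Exploring"
    INBOUND_OK = "Inbound_OK"
    OPERATIONAL = "Operational"


# Hypothetical action labels; a real implementation would call into its
# own timer and probe-sending code instead of returning strings.
UPDATE_TIMERS = "update_timers"
SEND_PROBE_OPERATIONAL = "send_probe_operational"
END_EXPLORATION = "end_exploration"


def on_probe_operational(state):
    """Return (new_state, actions) for a received Probe Operational."""
    if state is State.OPERATIONAL:
        # The peer is known to be Operational too: stored probe reports
        # can be released and any exploration thread can stop.
        return State.OPERATIONAL, [UPDATE_TIMERS, END_EXPLORATION]
    # Inbound_OK, or (unexpectedly) Exploring: become Operational and
    # answer with a Probe Operational so the peer can finish as well.
    return State.OPERATIONAL, [UPDATE_TIMERS, SEND_PROBE_OPERATIONAL]


# Replay of the tail of the second scenario: A (Inbound_OK) receives B's
# Probe Operational and replies; B (already Operational) then finishes.
a_state, a_actions = on_probe_operational(State.INBOUND_OK)
b_state, b_actions = on_probe_operational(State.OPERATIONAL)
print(a_state.value, a_actions)
print(b_state.value, b_actions)
```

Since the reply is only sent when the receiver was not yet Operational, the
exchange stops by itself after one extra probe, which is why no final flag
is needed.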

What is your opinion ?
Thanks for any comments,

Sébastien.


--
Sébastien Barré
Researcher,
CSE department, UCLouvain, Belgium