RE: Segment protection failure when recovery LSPs overlap
Hi Dimitri,
I don't think that solves this issue. Let me restate the problem with more detail on the master/slave behaviour.
We have the following topology, where the LSP A-B-C-D-E-F-G-H has 1:1 segment protection with extra traffic, and the link D-E fails:
> >       K-----------L
> >      /             \
> > A===B===C===D x E===F===G===H
> >          \             /
> >           I-----------J
RFC 4426 defines a master/slave relationship for the endpoints of the recovery LSPs.
- For the recovery LSP B-K-L-F, either B or F will be the master, controlling switchover onto that recovery LSP.
- For the recovery LSP C-I-J-G, either C or G will be the master, controlling switchover onto that recovery LSP.
However, there is no mechanism defined for electing the masters. Rather, RFC 4872 defines the switchover procedures for 1:1 protection, and states that the first endpoint of the recovery LSP to detect the failure is the one that initiates switchover (http://tools.ietf.org/html/rfc4872#section-7.2).
In this example, C and F are closest to the failure, and have their NOTIFY_REQUEST objects at the top of the stack at D and E respectively. It is therefore likely that C and F will detect the failure before B and G, and so C and F will be the masters.
That presents the problem that C and F will both attempt to initiate a switchover using their respective recovery LSPs, leading to the data loss described in my original mail.
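To make the sequence concrete, here is a minimal Python sketch of the scenario. It is only an illustration under simplifying assumptions, not anything defined in the RFCs: the names WORKING, RECOVERY_LSPS, notified_endpoints and trace are hypothetical, each node is modelled as accepting working-LSP traffic from exactly one neighbour, and "first to detect" is approximated as "nearest recovery endpoint on each side of the failed link".

```python
# Toy model of the topology in this mail (all names are illustrative).
WORKING = list("ABCDEFGH")
RECOVERY_LSPS = {
    "B-K-L-F": list("BKLF"),  # branch node first, merge node last
    "C-I-J-G": list("CIJG"),
}


def notified_endpoints(failed_link):
    """Return {endpoint: recovery LSP} for the endpoints notified first.

    Assumption: the node upstream of the failure notifies the nearest
    upstream branch node, and the node downstream of the failure
    notifies the nearest downstream merge node.
    """
    up, down = failed_link
    branches = {hops[0]: name for name, hops in RECOVERY_LSPS.items()}
    merges = {hops[-1]: name for name, hops in RECOVERY_LSPS.items()}
    upstream_side = reversed(WORKING[: WORKING.index(up) + 1])
    downstream_side = WORKING[WORKING.index(down):]
    first_up = next(n for n in upstream_side if n in branches)
    first_down = next(n for n in downstream_side if n in merges)
    return {first_up: branches[first_up], first_down: merges[first_down]}


def trace(source, dest, switched):
    """Follow traffic hop by hop; return (nodes reached, dropping node or None)."""
    forward = source == WORKING[0]
    path = WORKING if forward else WORKING[::-1]
    next_hop = dict(zip(path, path[1:]))
    expected_from = {b: a for a, b in zip(path, path[1:])}
    for name in sorted(switched):
        hops = RECOVERY_LSPS[name] if forward else RECOVERY_LSPS[name][::-1]
        # A switched endpoint both sends and receives on its recovery LSP.
        for a, b in zip(hops, hops[1:]):
            next_hop[a] = b
            expected_from[b] = a
    node, reached = source, [source]
    while node != dest:
        hop = next_hop[node]
        reached.append(hop)
        if expected_from[hop] != node:
            return reached, hop  # hop is listening on another interface
        node = hop
    return reached, None


notified = notified_endpoints(("D", "E"))
switched = set(notified.values())
print(notified)
print(trace("A", "H", switched))
print(trace("H", "A", switched))
```

Under these assumptions, C and F are the notified endpoints, both recovery LSPs end up switched, forward traffic is dropped at G and reverse traffic is dropped at B. Passing a single LSP name in `switched` (for example `{"C-I-J-G"}`) delivers traffic end to end in this model, which is why the choice of a single initiating endpoint matters.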
If there were a way to force B and C to be masters, then your suggestion that B should avoid triggering protection switching before C might work. However, that is explicitly not considered in 1:1 protection switching. See Note 1 in http://tools.ietf.org/html/rfc4872#section-7.2:
Note 1: a 2-phase protection-switching signaling is used in the
present context; a 3-phase signaling (see [RFC4426]) that would imply
a notification message, a switchover request, and a switchover
response messages is not considered here.
Nic
-----Original Message-----
From: ALU - Dimitri Papadimitriou
Sent: 24 February 2009 09:27
To: Nic Neate; ccamp@ops.ietf.org
Cc: labn - Lou Berger; IBryskin@advaoptical.com; Aria - Adrian Farrel Personal
Subject: RE: Segment protection failure when recovery LSPs overlap
Nic,
I will restate: in every protection scheme there is a master/slave mechanism. Now concerning the SRRO: C (and B) and F (and G) are generators in the upstream and downstream directions. So the SRROs are known to B, and what we are interested in is that B does not trigger recovery before C, and likewise for F and G, i.e. that G does not trigger recovery before F.
Thanks,
-dimitri.
> -----Original Message-----
> From: Nic Neate [mailto:Nic.Neate@dataconnection.com]
> Sent: Monday, February 23, 2009 4:13 PM
> To: PAPADIMITRIOU Dimitri; ccamp@ops.ietf.org
> Cc: labn - Lou Berger; IBryskin@advaoptical.com; Aria - Adrian Farrel
> Personal
> Subject: RE: Segment protection failure when recovery LSPs overlap
>
> Hi Dimitri,
>
> We had wondered about the SRRO as a possible solution to this problem
> as well. However, there are a couple of issues as the protocol
> currently stands.
>
> - SRROs can only be present in Path messages between the merge node
> and the egress, and in Resv messages between the branch node and the
> ingress. See
> http://tools.ietf.org/html/rfc4873#section-2 and
> http://tools.ietf.org/html/rfc4873#section-5.2. Therefore, C does not
> have the SRRO for recovery LSP B-K-L-F, and F does not have the SRRO
> for recovery LSP C-I-J-G.
>
> - The inclusion of the SRRO is optional, controlled via the
> segment-recording-desired flag in the SESSION_ATTRIBUTE object
> (http://tools.ietf.org/html/rfc4873#section-5.2). If the SRRO is
> required in order to avoid data loss then it needs to be mandatory.
>
> So I think we need a protocol extension in order to provide a
> signaling-based solution.
>
> Nic
>
>
> -----Original Message-----
> From: ALU - Dimitri Papadimitriou
> Sent: 21 February 2009 23:51
> To: Nic Neate; ccamp@ops.ietf.org
> Cc: labn - Lou Berger; IBryskin@advaoptical.com; Aria - Adrian Farrel
> Personal
> Subject: RE: Segment protection failure when recovery LSPs overlap
>
> Nic,
>
> RFC4873, by means of the SRRO, allows nodes to determine the existence
> of upstream/downstream recovery segments, as carried in Path/Resv
> messages. Combined with RFC4426 and RFC4428, which refer to
> master/slave, the result is that either C or F triggers a recovery
> action by means of disjoint recovery segments.
>
> Thanks,
> -d.
>
> > -----Original Message-----
> > From: Nic Neate [mailto:Nic.Neate@dataconnection.com]
> > Sent: Friday, February 20, 2009 4:05 PM
> > To: ccamp@ops.ietf.org
> > Cc: labn - Lou Berger; IBryskin@advaoptical.com; PAPADIMITRIOU
> > Dimitri; Aria - Adrian Farrel Personal
> > Subject: Segment protection failure when recovery LSPs overlap
> >
> > Hi CCAMP,
> >
> > I'd like to raise one more issue with RFC4873 segment recovery, which
> > I believe will lead to data loss when overlapping segment recovery
> > LSPs are used.
> >
> > RFC4873 allows topologies like this one:
> >
> >       K-----------L
> >      /             \
> > A===B===C===D===E===F===G===H
> >          \             /
> >           I-----------J
> >
> > A working LSP A-B-C-D-E-F-G-H is protected by two overlapping segment
> > recovery LSPs: B-K-L-F and C-I-J-G. The recovery scheme is 1:1
> > protection with extra traffic.
> >
> > Suppose the link D-E fails:
> >
> >       K-----------L
> >      /             \
> > A===B===C===D x E===F===G===H
> >          \             /
> >           I-----------J
> >
> > My understanding is that the failure will be handled as follows.
> >
> > - D detects the link failure, and sends Notify to C (first Notify
> >   object in the received Path). C and G exchange Notify messages to
> >   remove extra traffic from the C-I-J-G repair, and then send and
> >   receive traffic from the working LSP on C-I and G-J.
> >
> > - Meanwhile, E also detects the failure, and sends Notify to F (first
> >   Notify object in the received Resv). F likewise exchanges Notify
> >   messages with B to remove extra traffic from the B-K-L-F repair,
> >   and they then send and receive working LSP traffic on B-K and F-L.
> >
> > That results in the following data flow:
> >
> >       K----->-----L
> >      /             \
> > A->-B <-C   D   E   F-> G<--H
> >          \             /
> >           I-----<-----J
> >
> > Forward traffic reaches G on the link F-G. However, G has switched to
> > send and receive on G-J, and so drops traffic received from F.
> >
> > Reverse traffic reaches B on C-B. However, B has switched to send and
> > receive on B-K, and so drops traffic received from C.
> >
> > Thus traffic is lost in both directions.
> >
> > Can anyone point out an error in this analysis? Is this a topology
> > that there is interest in supporting?
> >
> > Thanks,
> >
> > Nic
> >
>