Re: Comments on 'draft-li-ccamp-wson-igp-eval-01.txt'
Hi Tony,
(and by the way, welcome to CCAMP)
1) The amount of information advertised
The amount of information carried and the stability of that information is similar to that used in TE networks today. In those networks, when a TE LSP is set up or torn down, there is a change in the bandwidth availability on all links traversed. This causes an update to the TE information advertised in the IGP for those links.
True. However, the number of TE LSPs that are established is typically fairly static. Even in extreme cases, folks typically use O(n^2) LSPs to form a complete mesh of their nodes.
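For scale, that O(n^2) full-mesh figure can be sketched as follows (a hypothetical helper for illustration, not from any implementation):

```python
def full_mesh_lsp_count(n: int) -> int:
    """Number of unidirectional TE LSPs for a full mesh of n nodes."""
    # One LSP per ordered pair of distinct nodes: n * (n - 1), i.e. O(n^2).
    return n * (n - 1)

# Even a modest 50-node core needs 2450 LSPs.
print(full_mesh_lsp_count(50))  # -> 2450
```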
In the applications that I think that you're contemplating, there are many
more links, with the ability of end users to set up their own links
throughout the network. Please correct me if I'm misunderstanding.
I think there is some misunderstanding here.
It is true that in a private lambda model, the end-user may request the
establishment of the LSP (where the end-user might be a packet network), but
I don't think we should assume that the connectivity is worse than that in a
TE network. Indeed, it is likely to be much less. This is in part due to the
expense of equipment, the topology of the network, and the constraint of the
number of lambdas supported on any link. That means that a single PE will
only connect to a relatively small number of other PEs.
Of course, this may vary in time as technology improves and as networks get
larger and have greater use.
In general, this is not seen as a critical issue. There are several possibilities to reduce the impact of TE advertisements if they become a problem. For example, as implemented in some of our current IP networks, each router can impose a threshold (time and/or bandwidth delta) before which it will not re-issue an advertisement.
Effectively rate limiting the link state database flooding, which is the
goal.
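The time/bandwidth-delta threshold described above can be sketched as follows. This is a minimal illustration of the damping idea, not any vendor's implementation; the 10% delta and 30-second refresh interval are assumptions chosen for the example.

```python
import time

class TEAdvertiser:
    """Sketch of threshold-based damping of TE bandwidth re-advertisement.

    A link's available bandwidth is re-flooded only when it has changed
    by more than `delta_fraction` since the last advertisement, or when
    `max_interval` seconds have elapsed (periodic refresh).
    """

    def __init__(self, delta_fraction=0.1, max_interval=30.0):
        self.delta_fraction = delta_fraction
        self.max_interval = max_interval
        self.last_advertised = {}  # link -> (bandwidth, timestamp)

    def should_advertise(self, link, bandwidth, now=None):
        now = time.monotonic() if now is None else now
        prev = self.last_advertised.get(link)
        if prev is None:
            return True                       # never advertised yet
        prev_bw, prev_t = prev
        if now - prev_t >= self.max_interval:
            return True                       # periodic refresh due
        if prev_bw == 0:
            return bandwidth != 0             # avoid division by zero
        # Suppress small fluctuations below the relative threshold.
        return abs(bandwidth - prev_bw) / prev_bw >= self.delta_fraction

    def advertise(self, link, bandwidth, now=None):
        now = time.monotonic() if now is None else now
        self.last_advertised[link] = (bandwidth, now)
```

With a 10% threshold, a 100 -> 95 change on a link is suppressed, while a 100 -> 80 change (or any change after the refresh interval) is flooded.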
In networks that require IGP convergence (for IP data traffic) and TE
advertisement (for MPLS-TE traffic) there is some concern that the
distribution of TE information might impact the IGP convergence time. In
these cases, implementations may prioritise normal LSAs ahead of opaque
LSAs, or may use separate protocol instances.
The former is more difficult in IS-IS, as TE information is typically
distributed as part of the same LSP that carries the physical link
information. Running separate protocol instances is of course possible, but that seems likely to become more of a manageability issue (do you need identical topologies?) and would not obviate the need for CPU cycles.
Yes. The Gen-App work in IS-IS is specifically to address this issue, I
believe. CPU cycles will remain an issue for everyone. But if you build a
box for a specific environment, you had better build a big enough box!
2) Speed of convergence of wavelength availability
In a TE network, the need for an up-to-date TED depends to a good degree on the relative size of LSPs and residual bandwidth, and on the rate of arrival of new LSPs. Increasingly, core TE LSPs may be quite large, but in general, the rate of arrival of large LSPs is quite low, and the holding times are quite high.
So, the bottom line in a TE network is that when an LSP needs more bandwidth than is actually available, it is crucial that the TED is up-to-date (or "real-time", as some folks call it).
I'm not sure that I would characterize it as 'crucial'. It would seem like there is always going to be *some* window where remote nodes have imperfect information, just due to speed-of-light delays. Ergo, the head end must always be prepared for its path to be rejected during setup and then recompute the path based on new information.
Absolutely.
And the signaling protocols are specifically designed to handle this case.
Nevertheless, for customer experience (TM) we would hope to make this window
relatively small.
And, indeed, the debate lower down the email is really about how likely we
are to fall through the window. If it happens for every LSP setup attempt,
something clearly needs to be done in the protocols. If it happens less than
1% and only in really busy networks, this may be acceptable.
But as Igor has pointed out, this is impossible to achieve in a distributed system. Indeed, it is impossible to achieve even in a centralised system since network failures may occur. That means that there is always a possibility that a computed path will be signaled and will fail to be established. We should be familiar with this situation in scenarios such as contention during restoration after network failure, and the signaling protocols are designed to fail and try again (usually using "crankback" information about the failed link, but also assuming that the IGP may have converged in the mean time).
Note that crankback (recomputing the path DURING setup) is not always
necessary. Sometimes recomputation at the head end will suffice.
Yes. It all depends on whether the TED has converged in the mean time. If
so, then reporting the failure and triggering recomputation is needed. If
not (or if you can't be sure) it is best to recompute with the knowledge of
which link caused the failure in the original setup. That's crankback (RFC
4920).
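The fail-and-retry behaviour discussed here can be sketched as follows: the head end recomputes after a setup failure, pruning the link reported back (whether via RFC 4920 crankback or a plain error). The function names and graph shapes are assumptions made for the sketch, and `try_setup` merely stands in for the signaling exchange.

```python
import heapq

def shortest_path(graph, src, dst, excluded=frozenset()):
    """Dijkstra over graph = {node: {neighbour: cost}}, skipping excluded links."""
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nbr, w in graph.get(node, {}).items():
            if (node, nbr) in excluded or nbr in seen:
                continue
            heapq.heappush(queue, (cost + w, nbr, path + [nbr]))
    return None  # destination unreachable

def setup_with_crankback(graph, src, dst, try_setup, max_attempts=3):
    """Head-end recomputation using reported failure information.

    `try_setup(path)` stands in for LSP signaling: it returns None on
    success, or the (node, neighbour) link that caused the failure.
    """
    excluded = set()
    for _ in range(max_attempts):
        path = shortest_path(graph, src, dst, frozenset(excluded))
        if path is None:
            return None                   # no candidate path remains
        failed_link = try_setup(path)
        if failed_link is None:
            return path                   # LSP established
        excluded.add(failed_link)         # avoid the reported link next time
    return None
```

If the TED has converged in the meantime, the recomputation naturally avoids the failed link anyway; pruning the reported link covers the window before convergence.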
We can look at a couple of things:
- What is the arrival rate of lambda LSPs?
- What is the required setup time of lambda LSPs?
With the exception of restoration LSPs, I think we may say that a 5-10 second delay in LSP setup time for 1 in 50 LSPs would be acceptable, but you may have a different experience.
In my experience nearly static LSPs are perfectly acceptable, and setup time is not an issue. ;-)
Exactly.
What we find at the moment, however, is that there is a lot of discussion
about moving the transport network from a "nearly static" animal to
something that is semi-dynamic, or even "on-demand".
This changes the boundaries, but it is still not clear to me that
"on-demand" means sub-second as I would expect all lambda LSPs to go through
a testing phase before being put in service (even if this is an automatic
phase). And I would expect such a phase to take a number of seconds.
That said, restoration is a different question.
My concern however is simpler: limiting the rate of link state database
flooding. This is vital to ensure the stability of the IGP.
Agreed, but let's be careful to understand the size of the network, the
amount of information, and the rate of change of that information. This is
really what Dan's I-D is starting to look at, so contributions to helping
with that understanding are welcome.
If, as you say, most of these LSPs are nearly static, we obviously don't
have to worry about the amount of flooding. It is only when LSPs come and go
(or when nodes/links come and go) that we will see a change in flooding. I
think that what Dan said in this email points out that the amount of
flooding (but not necessarily the amount of information flooded) is similar
to what we already see in TE networks. We should certainly look to see what
the stability experience is with those networks - my understanding is that
they are deployed and functional.
Cheers,
Adrian