
Re: [RRG] ALT + NERD is inelegant & inefficient, compared to APT or Ivip

To: Routing Research Group <rrg@psg.com>
Subject: Re: [RRG] ALT + NERD is inelegant & inefficient, compared to APT or Ivip
From: Robin Whittle <rw@firstpr.com.au>
Date: Thu, 24 Jan 2008 13:03:00 +1100
Cc: Tony Li <tli@cisco.com>, Brian E Carpenter <brian.e.carpenter@gmail.com>, Eliot Lear <lear@cisco.com>, Dino Farinacci <dino@cisco.com>, David Meyer <dmm@1-4-5.net>
In-reply-to: <C8732908-992A-4493-A1BA-CA9241DF2577@cisco.com>
Organization: First Principles
References: <47956B71.2090609@firstpr.com.au> <57573EC5-D976-4132-B8BB-2E3273811806@cisco.com> <47969864.2050408@firstpr.com.au> <479748DB.6070206@cisco.com> <47978517.9080301@gmail.com> <C8732908-992A-4493-A1BA-CA9241DF2577@cisco.com>
User-agent: Thunderbird 2.0.0.9 (Windows/20071031)

I am responding to the messages from Eliot Lear, David Meyer, Brian
Carpenter and Tony Li.

Eliot wrote:

RW > NERD on its own won't work because every ITR in the world would
RW > need the full feed of mapping updates.

> Table 1 in Section 5.1 of draft-lear-lisp-nerd-02.txt addresses
> this somewhat.  At 10^8 EIDs, assuming IPv6 addresses, the
> database size is likely to be less than 6 GB.  Let us be
> pessimistic and say that the update rate is 1.0% per day, leading
> to a change rate of 60MB per day, or an hourly rate of 2.5MB.  My
> netnews server could do this in 1987.  We're talking about 2050.
> What's the problem?

I agree with you in some respects.  For LISP, the mapping data won't
change at a huge rate, and even though the mapping data is more
complex than for Ivip, the communications volume required is
probably not a problem for ISPs and for larger routers.
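
As a quick check of the arithmetic quoted above, here is a minimal
sketch.  It assumes decimal units (1 GB = 10^9 bytes) and treats the
"less than 6 GB" figure as an upper bound; the variable names are
illustrative and not taken from the NERD draft.

```python
# Back-of-envelope check of the figures quoted from
# draft-lear-lisp-nerd-02.txt.  Decimal units are an assumption.

EIDS = 10**8                  # number of EIDs in the quoted scenario
DB_BYTES = 6 * 10**9          # "less than 6 GB", taken as an upper bound
DAILY_CHANGE_PERCENT = 1      # the "pessimistic" 1.0% per day

# Average size of one mapping record implied by the two figures above.
bytes_per_eid = DB_BYTES // EIDS

# Volume of mapping updates every full database ITR must receive.
daily_update_bytes = DB_BYTES * DAILY_CHANGE_PERCENT // 100
hourly_update_bytes = daily_update_bytes / 24

# The same load expressed as an average line rate.
average_bits_per_second = daily_update_bytes * 8 / 86400

print(f"per-EID record : {bytes_per_eid} bytes")
print(f"daily updates  : {daily_update_bytes / 10**6:.0f} MB")
print(f"hourly updates : {hourly_update_bytes / 10**6:.1f} MB")
print(f"average rate   : {average_bits_per_second / 1000:.1f} kbit/s")
```

This reproduces the quoted 60MB per day and 2.5MB per hour.  It also
implies about 60 bytes per EID and an average inflow of roughly
5.6 kbit/s, ignoring protocol overhead and burstiness.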

Other folks in the LISP team apparently saw things differently,
because they developed CONS and then ALT as a complete solution, not
involving NERD.

Sticking with these assumptions, I will consider a pure NERD scheme.

I think this would be much better than ALT or CONS, because it would
work - fast - without dropping or unreasonably delaying packets.

The major costs are that every ITR needs a lot of RAM and must carry
a substantial communications load for incoming mapping data.
Arguably neither of these will be much of a concern for ISPs and
other organisations with substantial networks by the time the
mapping database gets to this 100,000,000 EID size.

However, it still constrains where you implement ITRs.  Ideally,
ITRs would be cheap or have zero hardware cost.  Then they can be
closer to the sending hosts, reducing the load in each ITR and
ensuring that the total path taken by packets is no longer than it
would be without the map-encap scheme.

Maybe some of the pure NERD, full database, ITRs will have a
"hardware" (specialised processors and cards) FIB which implements
the full set of EIDs directly, but I am assuming that most will have
a FIB which at any one time is only set up to encapsulate packets to
the small subset of EIDs which the current traffic is addressed to.

In that case, the ITR would store its copy of the database in RAM
(or in FLASH or even a hard drive) and the FIB would be capable of
detecting traffic packets addressed to parts of the address space
which are known to be covered by LISP, but which the FIB has not yet
been set up to handle.

Then the ITR's CPU would function as a query server for the FIB(s)
in the same device.

What do you think of the idea of making that query server function
available to nearby devices?

I think allowing for this would be an enhancement to the NERD
architecture: operators could do it when it made sense to them.

Then there could be caching ITRs nearby which send map requests and
get responses really quickly and reliably.  These caching ITRs will
not need all the RAM or continual inflow of mapping information
which the full database ITR needs.
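
The arrangement described above can be sketched as follows.  This is
a minimal illustration under stated assumptions, not a LISP or NERD
implementation: the class and method names are hypothetical, EID
prefixes are assumed not to overlap, the "query" is a local method
call rather than a network protocol, and cache expiry and the
propagation of mapping changes to caching ITRs are left out.

```python
import ipaddress
from typing import Dict, Optional, Tuple

Prefix = ipaddress.IPv4Network
Mapping = Tuple[Prefix, str]        # (EID prefix, ETR address)


class FullDatabaseQueryServer:
    """Holds the complete mapping set, as a full database ITR would."""

    def __init__(self) -> None:
        self._db: Dict[Prefix, str] = {}

    def apply_update(self, eid_prefix: str, etr: str) -> None:
        # Stand-in for one item from the pushed feed of mapping updates.
        net = ipaddress.ip_network(eid_prefix)
        for existing in self._db:
            if existing != net and existing.overlaps(net):
                raise ValueError("sketch assumes non-overlapping prefixes")
        self._db[net] = etr

    def lookup(self, dest: str) -> Optional[Mapping]:
        # Linear scan; a real device would use a trie or hardware FIB.
        addr = ipaddress.ip_address(dest)
        for prefix, etr in self._db.items():
            if addr in prefix:
                return prefix, etr
        return None


class CachingITR:
    """Stores only the mappings that its own traffic has needed."""

    def __init__(self, server: FullDatabaseQueryServer) -> None:
        self._server = server
        self._cache: Dict[Prefix, str] = {}
        self.queries_sent = 0

    def etr_for(self, dest: str) -> Optional[str]:
        addr = ipaddress.ip_address(dest)
        for prefix, etr in self._cache.items():
            if addr in prefix:
                return etr                  # cache hit, no query needed
        self.queries_sent += 1              # cache miss: ask nearby server
        answer = self._server.lookup(dest)
        if answer is None:
            return None                     # not EID space: forward natively
        prefix, etr = answer
        self._cache[prefix] = etr
        return etr


# Usage with documentation-only address ranges.
server = FullDatabaseQueryServer()
server.apply_update("192.0.2.0/24", "203.0.113.1")
itr = CachingITR(server)
first = itr.etr_for("192.0.2.10")    # miss, answered by the query server
second = itr.etr_for("192.0.2.99")   # hit, answered from the cache
print(first, second, itr.queries_sent)
```

The caching ITR holds one entry after two lookups, while only the
query server carries the full database and the update feed.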

"Nearby" could have various meanings.  "Nearby" doesn't have to be
tightly constrained in the architecture.  (In APT, it is constrained
to being in the same network.)  If operators would rather run one or
more caching ITRs with the main full database ITR operating as a
query server, then I think the architecture should allow for this.
Likewise if the end user wants to run ITRs using the ISP's query
server, or anyone else's query server (maybe their ISP doesn't have
one), then the architecture should allow for this.

The next enhancement is to allow for the separation of the query
server function from the ITR function.  There's no obvious reason
why the architecture should insist they be in the one device.
Operators may choose to do this, and will probably put full database
ITRs and query servers in the same location, to maximise the utility
of the feed of mapping data to that location.

I am keen for you and others to critique the series of enhancements
I proposed in my initial message.  Do you agree or do you think that
one or more of these enhancements makes things worse?  Or do you
have a better enhancement?

One enhancement to the architecture is to allow for caching ITR
functions in the sending host (or DSL modem or similar device,
provided it is not behind NAT).  With local query servers, this
would enable a lot of the ITR work to be done without additional
hardware expense.

David Meyer wrote:

DF >>> I said what is left to do is 1) make a decision (see text
DF >>> right above this line) or 2) blend two mapping database
DF >>> scheme. We don't know yet. We need to experiment.

RW >> How would experimentation with your current prototype LISP
RW >> system lead to any fresh insights about the fundamental
RW >> limitations of ALT or NERD?

>       I don't think any(one) is saying that.

I raised what I consider to be fundamental limitations of pure ALT
and pure NERD.  Rather than respond directly to these, Dino said he
needed to do more experimentation.  So I figure he meant that
experimentation would or might lead him to some insights which would
prove my critique was wrong, or perhaps enable him to come up with a
workaround or some other enhancement to avoid the problems I cited.

>       My sense is that it's
>       more the case that while we work on the theoretical
>       aspects of these systems (as this thread is, and that is
>       all goodness), there are things that one can learn by
>       observing a system's behavior. To do that you need
>       implementations (or at least simulations). Suffice it to
>       say that a lot has been learned in the process of trying
>       to make things work (especially in the cases of large
>       dynamic systems that are difficult to characterize in a
>       bottom up fashion).
>
>       So there's a theory v. practice point here (and I'll
>       leave alone for the moment the whole issue of emergent
>       properties that really can't be studied in a bottom up
>       fashion).

OK, but the limitations of ALT and NERD seem to be fundamental.

I see no prospect that running a tiny LISP system (as you do at
present) or running a simulation of a larger one is going to find a
workaround for those limitations.  The only way of overcoming them
is to change these architectures into something different, or use
something different to start with.

My proposed enhancements start with NERD and do away with the need
for ALT or CONS at the very first step.

>> NERD on its own won't work because every ITR in the world
>> would need the full feed of mapping updates.
>
>       Notwithstanding Eliot's calculations, this is a concern
>       many of us have voiced.

OK.

> 01:   That's why hybrid systems seem so attractive; but then
>       you trade off rate v. state v. lookup latency. All of
>       these systems seem to occupy a different point in this
>       space (except for perhaps CONS and ALT, which are
>       architecturally similar in this respect).  I'm sure there
>       are other dimensions one can throw into this, but for
>       now my sense is that these three dominate.
>
>> ALT on its own won't work because the ITR drops or greatly delays
>> (sending them to the ETR via the ALT network) the initial packets
>> while it waits for the mapping information.  This would make EID
>> address space suck - so no-one would want to use it.
>
>       goto 01;
>
>       (I [again, and others] have been concerned not only about
>       latency [your point] but also about the aggregate traffic
>       load in the ALT core [consider near the "top" of the
>       hierarchy, i.e., data traffic "rate" in the control
>       plane]).

Yes, this is another objection to any global query server network
such as CONS or ALT, unless there is caching in the network, or some
fancy distributed system for answering the queries.

>> The logic of this seems inexorable: Any global query system would
>> often be so slow that it wouldn't be acceptable unless every ITR
>> is able to deliver the initial packets via some other mechanism
>> within a fraction of a second.  If all caching ITRs can do that,
>> then why bother with a global query server network at all?
>
>       I don't know if the logic is "inexorable", but your point
>       is well taken. We need to examine the tradeoffs in the
>       3-space I described above.

The only way I can imagine this logic failing is if the fast
delivery of early packets was expensive, and couldn't be sustained
for the main body of traffic.

Maybe you could argue that using a "Default Mapper" to deliver the
initial packets quickly is too expensive for the main body of
traffic, since such mappers would quickly become overloaded.  But if you
have a local default mapper, there seems to be no need for a global
query server network such as CONS or ALT, since the mapping query
can be answered quickly by that default mapper or by a query server
at the same location, running from the same feed of mapping data.

>> I don't think anyone believes ALT or NERD alone could possibly
>> form an acceptable solution, far less the optimal one.
>
>       Again, "anyone" is a lot of folks.

OK.  Eliot and perhaps others think a pure NERD system would be
fine.  I would favour this over a pure CONS or ALT system any day.
But a pure NERD system can easily be improved as I suggest to allow
for caching ITRs and ITR functions in sending hosts.

Anyone who believes in a pure ALT system must be expecting end-users
to put up with initial packets being dropped or delayed so much as
to be useless or worse than useless if and when they arrive.  I
think this idea is completely unworkable, but if there are such
folks, then I stand corrected.

>> So that leaves your other suggestion: blending the two systems
>> together.  This is what I explored - making changes to them
>> starting from the position of the two systems running alongside
>> each other.
>
>       So this is an architectural tradeoff in the space I was
>       describing. As Noel is fond of saying, TANSTAAFL.

There are probably no free lunches, but there are combinations of
techniques which work elegantly and efficiently by complementing
and/or supporting each other.  So some lunches are cheaper, more
nutritious and more enjoyable than others.  For instance:

>> When you have NERD ITRs around the place, there is no need to
>> build the ALT global query and packet delivery system, because it
>> makes much more sense to use the nearest NERD ITR as a default
>> mapper and to have a full database Query Server at the same
>> location - perhaps in the same device as the NERD ITR.  Then
>> there is no need for ALT, CONS or any other global (expensive,
>> hard to administer, slow and unreliable) query server system.
>
>       But then you need to construct the complete map; again,
>       this is just a point in the tradeoff space I described
>       above. It would appear that there are only a finite
>       number of ways to skin a cat...

I think this enhancement is much more than choosing a particular
trade-off between multiple costs and benefits.  For one thing, it
eliminates the need for a global query server network.  For another,
it enables all packets to be delivered quickly.  These are
tremendous benefits in terms of cost, complexity, reliability and
performance.

>> If you or anyone else disagrees with my sequence of improvements
>> to LISP, you will be able to argue why a particular stage of my
>> improvements makes the system worse and/or suggest some better
>> set of improvements.
>
>       I have a somewhat different suggestion. I don't see any
>       "solutions convergence" occurring, at least on RRG
>       list.

I think it is productive to try to combine the best elements of
current proposals and/or to ask questions like "If we had Y, and did
Z to it, then why would we still need X, which previously we thought
was essential?"

>       What might be helpful is objective criteria that
>       could be used to pick "more optimal" points in the
>       architectural space (here I mean the rate*space*latency
>       space). Given such criteria, we could do the
>       evaluation. Without such criteria, well, we're
>       speculating.
>
>       Now, I'm not saying we need a "requirements document"
>       (been there, done that). But we do need something
>       objective that we can use to compare against.

I don't think we need any formal documents - or algorithms for
evaluating and trading off costs and benefits - to agree that it is
better, cheaper and simpler to have NERD with the possibility of
local query servers and caching ITRs than to have NERD alone, ALT
alone or NERD and ALT running in parallel.

Brian Carpenter wrote:

RW >>> NERD on its own won't work because every ITR in the world
RW >>> would need the full feed of mapping updates.

EL >> Table 1 in Section 5.1 of draft-lear-lisp-nerd-02.txt
EL >> addresses this somewhat.  At 10^8 EIDs, assuming IPv6
EL >> addresses, the database size is likely to be less than 6 GB.
EL >> Let us be pessimistic and say that the update rate is 1.0% per
EL >> day, leading to a change rate of 60MB per day, or an hourly
EL >> rate of 2.5MB.  My netnews server could do this in 1987.
EL >> We're talking about 2050.  What's the problem?

> I seem to hear Tony saying that 10^8 is nothing like enough,
> unless I'm missing something in his argument that we should go to
> host level.  Personally I think that 10^8 is more than plenty, but
> we do need to get a fix on this.

I agree we need some realistic figures, otherwise any time someone
proposes a push system, another person will think of a number,
double it, and say push can't work with that number.

I am absolutely in favour of separate mapping being allowed and
efficiently supported "per host" - meaning per IP address in IPv4 or
per /64 in IPv6.

An entire network could run from one of these.

Even if there was only one host behind it, I think multihoming and
portability of the number between ISPs (apologies to those who
bristle at this phrase) should be efficiently supported by the new
architecture.

All the schemes - LISP, APT and Ivip - can technically support
single "hosts".  Ivip can support micronets (a range of contiguous
addresses mapped to the same ETR address) of arbitrary size, while
the others restrict micronet lengths to powers of 2.
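
To illustrate the difference, here is a small sketch using Python's
standard ipaddress module.  The 7-address range is a hypothetical
example drawn from documentation address space, and the sketch says
nothing about how any of the schemes actually encodes its mapping
records: it only counts how many power-of-2 prefixes are needed to
cover a range that a single arbitrary-size micronet could describe
with one start address and one length.

```python
import ipaddress


def prefixes_for_range(first: str, last: str):
    """Return the minimal list of CIDR prefixes covering first..last."""
    return list(
        ipaddress.summarize_address_range(
            ipaddress.ip_address(first), ipaddress.ip_address(last)
        )
    )


# Hypothetical micronet: 7 contiguous addresses mapped to one ETR.
prefixes = prefixes_for_range("192.0.2.5", "192.0.2.11")

for p in prefixes:
    print(p)

print("entries needed with power-of-2 prefixes:", len(prefixes))
print("entries needed with an arbitrary-size micronet: 1")
```

Here the range needs three prefixes (a /32, a /31 and a /30), so a
scheme restricted to powers of 2 would carry three mapping entries
where an arbitrary-size micronet needs one.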

I don't think the new architecture should be made as if we are
scared of this usage, as if it should be suppressed or as if it is
never likely to be needed.

Probably the way to cope with the costs of handling large numbers of
end-user-generated EID divisions (micronets) and mapping changes is
to have some kind of charging scheme for each mapping change.
Then, those folks who want to pay the charge for frequent changes
for their single host, or their entire network hanging off a single
IP address or /64, can do so if the benefits are worth the fee they
pay.  The cost doesn't have to be very high, I believe.

Ivip is intended to support mobility between ISPs for individual
hosts or networks.

There probably is some upper limit on the number of devices which
might do this.  If we imagine 10 billion people, 3/4 of them with a
cell phone, and they all want session survivability on the one IP
address or /64 as the "phone" (PC, audio visual player, P2P network
data sucker, Second Life client etc.) roams from one radio network
to the next, then we have a figure approaching 10^10.

With pure NERD, the long-term figures for the number of EIDs, and for
the rate of change of their mapping, are crucial - because all ITRs
need to get the full feed of mapping changes and to store the full
database.  This increases the cost of building and running an ITR
and so constrains how many ITRs you can have.  So this causes
difficulties with packets having to reach that smaller number of
ITRs - with each ITR handling a larger load.

Pure CONS or ALT isn't fussed about the total number of EIDs or
their rate of change.  But pure CONS or ALT drops and delays packets
- so it is never going to be successfully deployed.

If you add query servers and caching ITRs to NERD, you don't need
CONS or ALT, but the total number of EIDs and their rate of change
remains important.

My proposed enhancement of NERD enables flexible deployment of
caching and non-caching ITRs, so the system copes nicely in the
future if the number of EIDs and rate of change become a serious
burden for full database (non-caching) ITRs: simply have fewer of
them, and use more caching ITRs and local query servers.

So no matter how big the number and rate of change get, this
enhancement to NERD provides the flexibility for operators to run
their system at their chosen spot along a continuum of costs and
benefits for caching and non-caching ITRs.

Tony Li wrote:

EL >>> (As quoted above.)
EL >>> We're talking about 2050.  What's the problem?

BC >>  I seem to hear Tony saying that 10^8 is nothing like enough,
BC >>  unless I'm missing something in his argument that we should
BC >>  go to host level.

BC >>  Personally I think that 10^8 is more than plenty, but we do
BC >>  need to get a fix on this.

> Again, the real question here is about whether we need host
> granularity or not.
>
> If not, then Eliot's figures make perfect sense and the lack of
> granularity also suggests that the rate of change would be
> relatively low, further endorsing his approach.
>
> However, if we do need host granularity, then I think you need
> about 3 orders of magnitude more scale, and pure push approaches
> simply won't get you there.

Yes, but I would say two orders of magnitude - 10^10 as an outside
limit.
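
To put rough numbers on these scales, here is a minimal sketch which
extrapolates the NERD figures quoted earlier in this message.  It
assumes the database grows linearly at about 60 bytes per EID (6 GB
divided by 10^8 EIDs) and that the 1% per day update rate stays the
same.  Both are assumptions for illustration only, and the second is
probably optimistic at host granularity.

```python
# Linear extrapolation of the quoted NERD figures.  BYTES_PER_EID and
# the unchanged 1% daily update rate are assumptions, not measurements.

BYTES_PER_EID = 60            # 6 * 10**9 bytes / 10**8 EIDs
DAILY_CHANGE_PERCENT = 1      # same "pessimistic" rate as quoted


def push_load(eids: int):
    """Return (database bytes, daily update bytes, hourly update bytes)."""
    db_bytes = eids * BYTES_PER_EID
    daily_bytes = db_bytes * DAILY_CHANGE_PERCENT // 100
    return db_bytes, daily_bytes, daily_bytes / 24


# 10^8 is the quoted figure, 10^10 is two orders of magnitude more,
# and 10^11 is three orders of magnitude more.
for exponent in (8, 10, 11):
    db, daily, hourly = push_load(10**exponent)
    print(
        f"10^{exponent} EIDs: database {db / 10**9:,.0f} GB, "
        f"updates {daily / 10**6:,.0f} MB/day, {hourly / 10**6:,.1f} MB/hour"
    )
```

Under these assumptions, 10^10 EIDs means a database of about 600 GB
and about 250 MB of updates per hour for every full database ITR,
and 10^11 EIDs means ten times that.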

APT, Ivip and my enhancement to NERD are not pure push approaches.

I think that Ivip is the most flexible of the current approaches.
No matter what the number of micronets (EIDs), and no matter how
often some or many of them change, I argue that the flexibility of
Ivip is better than the other approaches at enabling operators to
deploy their equipment in ways which best handle the challenges of
the day.

  - Robin

