
RE: Comments on "draft-duffield-framework-papame-00"



Will,

Thanks for your feedback. 

>1.0 Motivation
>      -- Example 1 :  For low speed links, is it true that 
>non-sampled raw netflow (cisco v5) would, if not eliminate, greatly
>reduce the flaws mentioned for MIB type statistics averaged over
>several minute time periods

We are seeking a lightweight mechanism that can provide timely
statistics for these applications, across links of different roles
(access, backbone, peering), different speeds, and different vendors.
Sampling provides a knob for managing the measurement load on the link
while still collecting useful data, and exporting the data in a timely
way.

>?  So should this example be qualified as a high speed link in which
>non-sampled raw netflow  is not applicable ?  So this draft mainly
>targeted at high speed links ?

High speed links become low speed links, over time, in the field.
Similarly, core routers become access routers.  It doesn't make sense
to tie the mechanism to the link speed.

>      -- Example 1 : you mention MPLS FEC, is it the explicit 
>target of this proposal to encompass MPLS as part of this framework ?

Exporting the associations of MPLS labels with the packet sample is of
interest, yes.

>
>      -- Example 2 : For this example, what are the issues 
>with using an aggregation method on the router (e.g. cisco v8 netflow ?)

psamp does not address aggregation on the router.  That is what ipfix
is doing.  Aggregation on the router can be useful for certain
applications, such as accounting for all traffic with a given source
IP address range, e.g., for billing.  Aggregation is not particularly
useful, and can get in the way, for common applications that require
some learning; e.g., learning which prefixes are responsible for heavy
traffic volumes, for traffic engineering applications.  In addition,
there is the issue of dealing with large or general sets of
aggregations -- inevitably there are restrictions on forming such
aggregations on the router (which is, after all, focused on forwarding
and routing), while a dedicated measurement box dealing with packet
samples has fewer restrictions in the field.

>      -- Example 3 : are there any published studies that 
>present an empirical study of how much in "error" off-line processing
>is for spatial flow of traffic ?  There seem to be some in the
>community saying this works just fine, and others questioning the
>procedure.

We have done a lot of work with offline processing, and it remains "a
lot of work."

>
>2.0 Goals
>      -- Does it make sense to have as a goal the use of 
>IPFIX to transport the data for this framework ?
>      -- As a discussion topic, it seems there is going to be 
>some levels of overlap between the goals as stated for psamp and
>IPFIX.  Within IPFIX we seem to be reaching conclusion that some level
>of specification of data gathering options is needed in order to
>properly encode state about how data was gathered (like sampling
>rate).
>

There's scope for some intellectual cross-fertilization between the
efforts, since there are some issues in common.  But IPFIX is
specifically targeted at flow information, so I'm very concerned there
would be a mutual slowdown for both PSAMP and IPFIX if we tried to
reach a common export standard.  Having a separate PSAMP format
isn't so bad; you'd only have to write a new driver for the collector.

>
>3.1 Measurement information flow
>
>This terminology is a bit confusing. It's basically the control stream
>between the measuring device and the collector. The description of the
>"information flow" is a bit too abstract. Maybe it's meant to 
>lay out less details here?
>

I'd say it's the stream of information derived from a subset of
selected packets. The reason for formalizing the concept was to
emphasize that once a substream of packets has been selected by the
hashing/filtering/sampling operations, you want to be able to
manipulate/process/export information derived from them independently
of other streams.

>3.2 Packet selection
>
>The exact sensible combinations of hash/filtering/sampling and 
>the kind of applications for each combination would be interesting to
>enumerate.
>

Agreed, but bear in mind the list of applications is bound to grow.
Looking forward, it would be good to have freedom to roll new
applications, not just those which look good today.

>-- Hashing
>
>It may be helpful to briefly explain the rationale or 
>application scenario of hashing. If there is no need to uniquely
>identify a packet across multiple devices, then hash may not be
>necessary?
>
>My understanding is that within context of this effort you 
>expect that an exact hash function will need to be specified.  If so
>that goal needs be explicitly communicated.  This will be an
>interesting (possibly difficult) thing to do since the relative
>difficulty of the hash function will vary platform to platform
>depending on the underlying architecture the vendor is using for
>packet processing.

Yes, some applications will want to identify a packet across multiple
devices, e.g., in order to measure its path across a domain. This
would require a common hash function across the domain, and by
extension across vendors in a multivendor domain.  
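
As an illustration only (nothing here is specified in the draft), a
hash-based selector might look like the sketch below. The choice of
SHA-256, the modulus/threshold parameters, and the notion of
"invariant bytes" (fields that do not change hop to hop) are all
assumptions made for the example; a line-rate implementation would
presumably use a much cheaper function.

```python
import hashlib


def select_by_hash(invariant_bytes: bytes, modulus: int = 100,
                   threshold: int = 5) -> bool:
    """Select a packet if its hash falls in the chosen range.

    invariant_bytes: packet content that is identical at every
    observation point (hypothetical choice of fields).
    With the defaults, roughly 5 in 100 packets are selected.
    """
    digest = hashlib.sha256(invariant_bytes).digest()
    value = int.from_bytes(digest[:8], "big")
    return value % modulus < threshold


# Two devices applying the same function to the same bytes always
# reach the same decision, so a selected packet is reported at both.
packet = bytes.fromhex("c0a80001c0a8000204d2")  # made-up field values
ingress_decision = select_by_hash(packet)
egress_decision = select_by_hash(packet)
```

The point of the sketch is only that the decision depends on nothing
but the packet content and the shared parameters, which is why the
function has to be common across the domain.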

>
>-- Filtering
>
>Would there be a dynamic filtering scenario, where each filter 
>installation is triggered by packets dynamically rather than by
>configuration?
>

Yes, though (at least until we get a lot farther on more basic
topics), I would not propose reacting on timescales more stringent
than can be accommodated via management systems (e.g., minutes).  I
think there's a broader point here: packet selection should be
"easily" configurable, meaning that you'd want to allow for automated
reconfiguration, based on the occurrence of certain events. For
example, you might want to adjust the sampling rate if the measurement
load gets too high. Measurement applications may want to set up
filters in response to an observed event. The control loop for this
could be on board, or through a NOC. For example: if one measurement
information flow is a feed of 1 in N sampled packet headers to the
NOC, and a spike in traffic to a given address range is observed, then
you might want the ability to drill down by having an automated
control system in the NOC remotely configure a new filter on that
address range.
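
A rough sketch of the two control actions described above, as they
might run in a NOC-side script. The function names, the /24 prefix
length, the 50% share used to call something a "spike", and the report
budget are all hypothetical; how the new N or the new filter would
actually be pushed to the device is left out.

```python
import ipaddress
import math
from collections import Counter


def adjust_sampling_interval(n: int, reports_per_sec: float,
                             budget_per_sec: float) -> int:
    """Return a 1-in-N interval that keeps the report rate in budget.

    The report rate scales as 1/N, so N is scaled up by the ratio of
    observed load to budget (rounded up). N is never decreased here.
    """
    if reports_per_sec <= budget_per_sec:
        return n
    return math.ceil(n * reports_per_sec / budget_per_sec)


def spike_prefixes(sampled_dsts, prefix_len: int = 24,
                   share: float = 0.5):
    """Return destination prefixes holding more than `share` of samples.

    Each returned prefix is a candidate for a drill-down filter.
    """
    counts = Counter(
        ipaddress.ip_network(f"{dst}/{prefix_len}", strict=False)
        for dst in sampled_dsts
    )
    total = sum(counts.values())
    return [net for net, c in counts.items() if total and c / total > share]


new_n = adjust_sampling_interval(100, reports_per_sec=2000,
                                 budget_per_sec=1000)
targets = spike_prefixes(["10.1.2.3", "10.1.2.9", "10.1.2.77", "192.0.2.1"])
```

Run every few minutes against the 1 in N feed, something of this shape
stays within the management timescales mentioned above.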

>Within context of this framework do you expect to have 
>explicit guidelines on how many filtering functions makes sense ?
>Large numbers of filters will increase complexity, and has affect on
>the stated goal for "including in the minimal set of primitives
>functions that can be implemented at maximal line rate with minimal
>additional state".
>

I am not sure what an "explicit guideline" is, and I agree a
requirement for a large number is too much.  A handful makes sense to
me, as a guideline.  If I understand it, the main constraint on
complexity is that all the measurement information flows potentially
have to be processed in parallel, since a given packet might end up in
more than one of them.

This is the fundamental difference from the case when you know in
advance that a packet can trigger no more than one filter, e.g. when
you have a (large) number of filters set up on non-overlapping address
ranges.
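
To make the contrast concrete, here is a small sketch under assumed
data structures (filters as named address prefixes; none of this is
from the draft). With overlapping filters every filter has to be
evaluated for every packet; with non-overlapping ranges a single
sorted lookup finds the at most one match.

```python
import bisect
import ipaddress


def match_all(filters, addr):
    """Overlapping filters: check each one; several may match."""
    ip = ipaddress.ip_address(addr)
    return [name for name, net in filters if ip in net]


def match_one(sorted_ranges, addr):
    """Non-overlapping ranges, sorted by start: at most one match.

    sorted_ranges is a list of (start, end, name) integer ranges.
    """
    ip = int(ipaddress.ip_address(addr))
    starts = [start for start, _, _ in sorted_ranges]
    i = bisect.bisect_right(starts, ip) - 1
    if i >= 0 and sorted_ranges[i][0] <= ip <= sorted_ranges[i][1]:
        return sorted_ranges[i][2]
    return None


overlapping = [
    ("all-10", ipaddress.ip_network("10.0.0.0/8")),
    ("dmz", ipaddress.ip_network("10.1.0.0/16")),
]
disjoint = sorted(
    (int(net.network_address), int(net.broadcast_address), name)
    for name, net in [
        ("a", ipaddress.ip_network("10.1.0.0/16")),
        ("b", ipaddress.ip_network("10.2.0.0/16")),
    ]
)
both = match_all(overlapping, "10.1.2.3")
single = match_one(disjoint, "10.2.3.4")
```

In the first case the packet lands in two information flows, so both
have to be updated for that packet; in the second the lookup cost does
not grow with the work done per match.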

>-- Sampling
>
>I guess it's fine to broadly categorize sampling schemes into 
>deterministic and probabilistic, but perhaps we should not confine
>deterministic to 1/N and probabilistic to p(x)=1/N. There may be some
>sampling scheme requiring more than one parameters, for example, the
>sampling probability can be based on both N and packet length.
>

We're open to that.
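
For concreteness, the three kinds of scheme under discussion could be
sketched as follows. The class names are invented for the example, and
the length-weighted rule p = min(1, len / (N * ref_len)) is just one
hypothetical two-parameter form, not something proposed in the draft
or in the comment above.

```python
import random


class OneInN:
    """Deterministic: select every N-th packet by count."""

    def __init__(self, n: int):
        self.n = n
        self.count = 0

    def select(self, packet_len: int) -> bool:
        self.count += 1
        return self.count % self.n == 0


class Probabilistic:
    """Select each packet independently with probability 1/N."""

    def __init__(self, n: int, rng=None):
        self.n = n
        self.rng = rng or random.Random()

    def select(self, packet_len: int) -> bool:
        return self.rng.random() < 1.0 / self.n


class LengthWeighted:
    """Hypothetical two-parameter scheme: longer packets are favored.

    p = min(1, packet_len / (n * ref_len)), so a packet of length
    ref_len is selected with probability 1/n.
    """

    def __init__(self, n: int, ref_len: int, rng=None):
        self.n = n
        self.ref_len = ref_len
        self.rng = rng or random.Random()

    def select(self, packet_len: int) -> bool:
        p = min(1.0, packet_len / (self.n * self.ref_len))
        return self.rng.random() < p


sampler = OneInN(4)
picked = [i for i in range(1, 13) if sampler.select(100)]
```

All three share the same select(packet_len) shape, which suggests the
framework could carry a scheme identifier plus a parameter list rather
than a single N.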

>I think within each category, there could be yet another two modes:
>stationary and adaptive, depending on whether the sampling parameters
>are configured or dynamically adjusted based on traffic condition.
>

Yes; see comments on filtering, above.

>3.3 Report generation and export
>
>I'm a bit confused about the "packet count" as part of the subsidiary
>information. I thought this draft is about packet based measurement,
>not flow based...

Sorry for the confusion. Here we're not talking about flow quantities, but
rather counters kept for each information flow on how many
packets/bytes have been seen at input and output of the packet
selector.
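
A minimal sketch of such counters, assuming a selector that is simply
a function of packet length (the names SelectorCounters and
run_selector are made up for the example):

```python
from dataclasses import dataclass


@dataclass
class SelectorCounters:
    """Per-information-flow totals at the selector's input and output."""
    packets_in: int = 0
    bytes_in: int = 0
    packets_out: int = 0
    bytes_out: int = 0


def run_selector(packet_lengths, selector, counters: SelectorCounters):
    """Apply `selector` to each packet, updating both sets of counters."""
    selected = []
    for length in packet_lengths:
        counters.packets_in += 1
        counters.bytes_in += length
        if selector(length):
            counters.packets_out += 1
            counters.bytes_out += length
            selected.append(length)
    return selected


counters = SelectorCounters()
selected = run_selector([40, 1500, 576, 1500],
                        lambda length: length >= 1000, counters)
# Attained selection fraction, usable by a collector for scaling.
fraction = counters.packets_out / counters.packets_in
```

Exporting both input and output totals lets the collector recover the
attained selection fraction even when the configured rate was not
achieved exactly.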

>
>The collector device should be identified by both IP address 
>and L4 port number.

Agreed.

>
>   3.4 Measurement Record Format
>
>This is the section that most obviously has to be considered/reviewed
>in context of IPFIX.
>

See comments on 2.0 Goals above.
