RE: making psamp export congestion-aware

> I would stress that we might want a separation between the *export
> rate* (say, in bits/sec) and the *sampling rate* (say, in terms of a
> selection filter and a packet sampling ratio).  In some cases, we
> might need to make short term changes in the export rate (say, due to
> transient congestion) without reconfiguring the sampling parameters.
> Also, in some cases it may be hard to estimate how much data the
> sampler might generate, which gives extra merit to having a separate
> configuration of the (maximum) export rate and sampling parameters.
> If the sampler (temporarily) generates data in excess of the export
> rate, then packets can be dropped at the agent, and this can be
> detected via missing sequence numbers (and perhaps other information
> in the export stream about data dropped at the agent).
> If the data volume is in excess of the configured export rate for
> a *sustained* period, of course the collector might choose to
> reconfigure the sampling rate as well, rather than having the agent
> discard so many records.

I would argue that the preferred control is the sampling rate. A limit on
the export rate results in clipping that is likely to be correlated with
the traffic being measured: records are dropped exactly when the sampler is
busiest. The net result is measurements that are systematically biased
against bursty sources. Biases of this kind are virtually impossible to
quantify or correct once they occur. If the sampling system is to be used
for anything more than rough, qualitative studies, then care needs to be
taken to minimize bias.
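
To make the bias concrete, here is a rough sketch in Python. Everything in
it is an assumption for illustration only: the two-source traffic model,
the per-slot export cap, and the function names are mine, not anything
from a psamp draft. It compares the expected share of exported records
belonging to a bursty source under export clipping and under 1-in-N
sampling, assuming that excess records in a slot are dropped uniformly.

```python
# Hypothetical traffic model: two sources with the same total volume.
SLOTS = 100
steady = [10] * SLOTS                                        # 10 packets in every slot
bursty = [100 if t % 10 == 0 else 0 for t in range(SLOTS)]   # 100 packets every 10th slot


def export_clipped(a, b, cap):
    """Expected records kept per source when each slot exports at most
    `cap` records and the excess is dropped uniformly within the slot."""
    kept_a = kept_b = 0.0
    for na, nb in zip(a, b):
        total = na + nb
        keep = min(1.0, cap / total) if total else 1.0
        kept_a += na * keep
        kept_b += nb * keep
    return kept_a, kept_b


def sampled(a, b, n):
    """Expected records kept per source under 1-in-n packet sampling."""
    return sum(a) / n, sum(b) / n


def share(kept_a, kept_b):
    """Fraction of the kept records that belong to the second source."""
    return kept_b / (kept_a + kept_b)


clip = export_clipped(steady, bursty, cap=20)
samp = sampled(steady, bursty, n=2)

print("true bursty share:    %.2f" % share(sum(steady), sum(bursty)))  # 0.50
print("clipped bursty share: %.2f" % share(*clip))                     # 0.17
print("sampled bursty share: %.2f" % share(*samp))                     # 0.50
```

Under these assumptions both sources send the same number of packets, yet
the clipped export attributes only about a sixth of the records to the
bursty source, and nothing in the surviving records says by how much to
correct. Sampling at 1-in-2 exports a similar volume and keeps the shares
intact.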

I would view a limit on the transmission rate as a sensible fail-safe
mechanism. For example, it would protect the network from overload in the
event that a poorly written collector sets the sampling rate to 1:1, or
sets a filter that accepts all traffic. With a well-configured monitoring
system there should never be any reason for a transmission rate limit to
trigger. Any triggering of the rate limit should be viewed as a
configuration fault.
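
As a minimal sketch of what I mean by a fail-safe, here is a token bucket
in Python that limits export but also records every drop, so that a
non-zero drop count can be raised as a fault. The class name, its
parameters, and the idea of surfacing a `fault` flag are hypothetical;
they only illustrate the approach and are not a proposed agent interface.

```python
class ExportLimiter:
    """Token bucket used purely as a safety net on the export path.

    Hypothetical illustration: `rate_bps` is the maximum export rate in
    bits/sec and `burst_bits` is the bucket depth in bits.
    """

    def __init__(self, rate_bps, burst_bits):
        self.rate = rate_bps
        self.capacity = burst_bits
        self.tokens = burst_bits   # start with a full bucket
        self.last = 0.0            # time of the previous call, in seconds
        self.dropped = 0           # records dropped so far

    def allow(self, now, size_bits):
        """Return True if a record of `size_bits` may be exported at `now`."""
        # Refill in proportion to elapsed time, up to the bucket depth.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size_bits <= self.tokens:
            self.tokens -= size_bits
            return True
        self.dropped += 1
        return False

    @property
    def fault(self):
        """Any drop at all is treated as a configuration fault."""
        return self.dropped > 0


limiter = ExportLimiter(rate_bps=8000, burst_bits=8000)
results = [limiter.allow(0.0, 4000) for _ in range(3)]
print(results, limiter.fault)   # [True, True, False] True
```

The point of the `fault` flag is that the limiter is not a tuning knob.
If it ever fires, the right response is to fix the sampling rate or the
filter, not to accept the clipped data.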


