Maurizio Molina wrote:
I don't think this is correct, unless I completely misunderstood everything. Let me re-explain what the issue is; maybe I took some shortcut.
The draft says:
SELECTOR_PARAMETERS
For sampling processes the SELECTOR PARAMETERS define the input
parameters for the process. Interval length in systematic
sampling means, that all packets that arrive in this interval
are selected. The spacing parameter defines the spacing in time
or number of packets between the end of one sampling interval
and the start of the next succeeding interval.
Case n out of N:
- Population size N, Sample size n
Example: we select randomly n packets out of N.
No problem on this one
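For completeness, a minimal sketch of how I read "n out of N", assuming the selection is uniform without replacement inside each population of N consecutive packets. The function name and the `rng` parameter are mine, not from the draft.

```python
import random

def n_out_of_N(population_size, sample_size, rng=None):
    """Return the sorted 0-based positions of the packets selected
    from one population of `population_size` consecutive packets.

    Assumption: uniform random selection without replacement.
    """
    rng = rng or random.Random()
    return sorted(rng.sample(range(population_size), sample_size))

# Example: positions of 10 packets chosen among 100.
positions = n_out_of_N(100, 10, random.Random(1))
```

Whatever the traffic looks like, exactly n packets are selected per N, so the sample size is known in advance.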
Case Systematic Count Based:
- Interval length (in packets), Spacing (in packets)
Note: I start with "Case Systematic Count Based" to illustrate my point.
Example: if Interval length = 10 packets, Spacing = 100 packets
This means: I select 10 packets, I don't select the next 90 packets, I select 10 packets, etc...
Note2: it is not clear from the draft whether this means the example on the previous line or...
I select 10 packets, I don't select the next 100 packets, I select 10 packets, etc...
This must be clarified with an example.
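To make the two readings concrete, here is a small sketch. It assumes packets are numbered from 0 and that the first sampling interval starts at packet 0. The function name and the `spacing_is_gap` flag are hypothetical, only there to switch between the two interpretations.

```python
def count_based_selection(num_packets, interval, spacing, spacing_is_gap):
    """Return the indices of the selected packets.

    spacing_is_gap=False: spacing is counted from start to start
                          (10 selected, 90 skipped when spacing = 100).
    spacing_is_gap=True:  spacing is counted from the end of one interval
                          to the start of the next
                          (10 selected, 100 skipped when spacing = 100).
    """
    period = interval + spacing if spacing_is_gap else spacing
    return [i for i in range(num_packets) if i % period < interval]

# Same parameters, two different results over 1100 packets.
reading_10_90 = count_based_selection(1100, 10, 100, spacing_is_gap=False)
reading_10_100 = count_based_selection(1100, 10, 100, spacing_is_gap=True)
```

With the first reading the second interval starts at packet 100, with the second reading at packet 110, so the two are not interchangeable.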
Case Systematic Time Based:
- Interval length (in usec), Spacing (in usec)
Example: if Interval length = 10 usec, Spacing = 100 usec
This means: I select X packets during 10 usec, I don't select packets during the next 90 usec, etc...
BTW, my Note2 above applies equally here: is it 10, 90, 10, 90, ... or 10, 100, 10, 100, ...
And this is my entire point: you select X packets during an interval, and you don't know how many.
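A minimal sketch of the time based case, assuming integer arrival timestamps in usec and a start-to-start period (the 10/90 reading). The function and parameter names are hypothetical.

```python
def time_based_selection(arrival_times_usec, interval_usec, period_usec):
    """Return the arrival times of the packets that fall inside a sampling
    interval. Assumption: intervals start at t = 0, period_usec apart."""
    return [t for t in arrival_times_usec if t % period_usec < interval_usec]

# Same selector parameters, two different packet rates over 1000 usec.
dense = range(0, 1000)        # one packet every usec
sparse = range(0, 1000, 5)    # one packet every 5 usec

x_dense = len(time_based_selection(dense, 10, 100))
x_sparse = len(time_based_selection(sparse, 10, 100))
```

The selector parameters are identical in both runs, yet X differs because it follows the arrival rate.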
You might estimate it with the ratio 10/100 * bandwidth. BUT you have no clue about the number of flow records, and
as a consequence we don't know what the bandwidth requirement for the export link(s) is. And if we do
sampling, it's typically because we have a bottleneck on the export
link(s) bandwidth or on the collector side... So this way of sampling is dangerous.
The only application I see for such a sampling scheme is when the bottleneck is the interface or the line card resources, typically the memory.
If you keep this mechanism (anyway this is a MAY requirement), you must add a remark about it.
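A rough sketch of why the packet ratio alone does not bound the export load. It assumes one flow record per distinct flow key among the selected packets, and the flow keys used here (simple source/destination pairs) are made up for illustration.

```python
def expected_selected_pps(link_pps, interval_usec, period_usec):
    """Average rate of selected packets, from the interval/period ratio."""
    return link_pps * interval_usec / period_usec

def flow_record_count(selected_flow_keys):
    """Number of flow records, assuming one record per distinct flow key."""
    return len(set(selected_flow_keys))

# Two hypothetical samples with the same number of selected packets.
one_big_flow = [("10.0.0.1", "10.0.0.2")] * 1000
many_small_flows = [("10.0.0.1", "10.0.1.%d" % i) for i in range(1000)]

records_low = flow_record_count(one_big_flow)
records_high = flow_record_count(many_small_flows)
```

Both samples contain 1000 selected packets, but the number of records to export ranges from 1 to 1000 depending on the flow mix.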
Now, you speak above about "With Systematic Time based you cannot see any rate
variation on the link because you always export a packet each T sec."
If you want to do that, and I agree it makes sense (actually a lot more sense than the previous scheme), then you will have a sampling scheme like this:
Case Systematic Time Based:
- Interval length (# packets), Spacing (usec)
Example: if Interval length = 10 packets, Spacing = 100 usec
This means: I select 10 packets, I don't select packets during 90 usec, I select 10 packets, etc...
BTW, the Note2 still applies here.
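A sketch of this variant, assuming the quiet period is measured from the last selected packet (one of the possible readings, see Note2) and that timestamps are in usec. The names are hypothetical.

```python
def packets_then_time_gap(arrival_times_usec, burst_packets, gap_usec):
    """Select `burst_packets` consecutive packets, then ignore every packet
    arriving less than `gap_usec` after the last selected one, and repeat."""
    selected = []
    taken = 0          # packets selected in the current burst
    resume_at = None   # time at which selection may start again
    for t in arrival_times_usec:
        if resume_at is not None:
            if t < resume_at:
                continue      # still inside the quiet period
            resume_at = None
        selected.append(t)
        taken += 1
        if taken == burst_packets:
            taken = 0
            resume_at = t + gap_usec
    return selected

dense = packets_then_time_gap(range(0, 1000), 10, 90)       # 1 packet per usec
sparse = packets_then_time_gap(range(0, 1000, 5), 10, 90)   # 1 packet per 5 usec
```

Every completed burst contains exactly 10 packets whatever the arrival rate is, so the number of selected packets per cycle is known, unlike in the pure time based case.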
Regards, Benoit.
(actually, you can only understand if the rate drops below 1/T).