Maurizio Molina wrote:
I don't think this is correct, unless I completely misunderstood everything.
Let me re-explain what the issue is; maybe I took some shortcuts.
The draft says:
SELECTOR_PARAMETERS
For sampling processes the SELECTOR PARAMETERS define the input parameters for the process. Interval length in systematic sampling means, that all packets that arrive in this interval are selected. The spacing parameter defines the spacing in time or number of packets between the end of one sampling interval and the start of the next succeeding interval.

Case n out of N:
- Population size N, Sample size n
Example: we randomly select n packets out of N. No problem with this one.

Case Systematic Count Based:
- Interval length (in packets), Spacing (in packets)
Note: I start with "Case Systematic Count Based" to illustrate my point.
Example: if Interval length = 10 packets, Spacing = 100 packets
This means: I select 10 packets, I don't select the next 90 packets, I select 10 packets, etc...
Note2: it is not clear from the draft whether it means the sequence on the previous line, or: I select 10 packets, I don't select the next 100 packets, I select 10 packets, etc... This must be clarified with an example.

Case Systematic Time Based:
- Interval length (in usec), Spacing (in usec)
Example: if Interval length = 10 usec, Spacing = 100 usec
This means: I select X packets during 10 usec, I don't select packets during the next 90 usec, etc...
BTW, my Note2 above applies here as well: is it 10, 90, 10, 90, ... or 10, 100, 10, 100, ...

And this is my entire point: you select X packets during an interval, and you don't know how many. You might estimate it with the ratio 10/100 * bandwidth. BUT you have no clue about the number of flow records, and as a consequence we don't know the bandwidth requirement for the export link(s). And if we do sampling, it's typically because we have a bottleneck on the export link(s) bandwidth or on the collector side... So this way of doing sampling is dangerous. The only application I see for such a sampling scheme is when the bottleneck is the interface or the line card resources, typically the memory.
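
To make the two readings of "spacing" concrete, here is a minimal Python sketch. It is not taken from the draft: the function names and the `spacing_is_period` flag are hypothetical, and packet arrival times are simplified to integer microseconds. It shows that the two readings select different packets, and that with time-based selection the same parameters give a different number of selected packets depending on the arrival rate.

```python
def period_of(interval, spacing, spacing_is_period):
    # Reading A (10, 90, 10, 90, ...): spacing is measured start to start,
    # so one full cycle lasts `spacing`.
    # Reading B (10, 100, 10, 100, ...): spacing is the gap after the
    # interval, so one full cycle lasts `interval + spacing`.
    return spacing if spacing_is_period else interval + spacing


def count_based_selected(num_packets, interval, spacing, spacing_is_period):
    """Indices of the packets selected by systematic count-based sampling."""
    period = period_of(interval, spacing, spacing_is_period)
    return [i for i in range(num_packets) if i % period < interval]


def time_based_selected(arrival_times_usec, interval, spacing, spacing_is_period):
    """Arrival times (integer usec) of the packets selected by time-based sampling."""
    period = period_of(interval, spacing, spacing_is_period)
    return [t for t in arrival_times_usec if t % period < interval]


# Count based, Interval length = 10 packets, Spacing = 100 packets:
reading_a = count_based_selected(220, 10, 100, spacing_is_period=True)
reading_b = count_based_selected(220, 10, 100, spacing_is_period=False)

# Time based, Interval length = 10 usec, Spacing = 100 usec (reading A),
# observed over 1000 usec at two different packet rates:
fast = time_based_selected(range(0, 1000, 1), 10, 100, spacing_is_period=True)
slow = time_based_selected(range(0, 1000, 5), 10, 100, spacing_is_period=True)
```

With identical time-based parameters, `fast` and `slow` contain different numbers of packets, which is the unpredictability described above.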
If you keep this mechanism (anyway this is a MAY requirement), you must add a remark about it.

Now, you wrote above: "With Systematic Time based you cannot see any rate variation on the link because you always export a packet each T sec." If you want to do that, and I agree it makes sense (actually a lot more sense than the previous scheme), then you will have a sampling scheme like this:

Case Systematic Time Based:
- Interval length (# packets), Spacing (usec)
Example: if Interval length = 10 packets, Spacing = 100 usec
This means: I select 10 packets, I don't select packets during 90 usec, I select 10 packets, etc...
BTW, Note2 still applies here.

Regards, Benoit.
(actually, you can only understand if the rate drops below 1/T).
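
The mixed scheme proposed in this message (interval length counted in packets, spacing counted in usec) could be sketched as follows. This is only an illustration under stated assumptions: `hybrid_selected` and `gap_usec` are hypothetical names, arrival times are integer microseconds, and the gap is taken to start at the timestamp of the last packet of each burst, which is just one possible reading since Note2 is still unresolved.

```python
def hybrid_selected(arrival_times_usec, interval_packets, gap_usec):
    """Select `interval_packets` packets, then ignore packets for `gap_usec`.

    Returns the arrival times of the selected packets. Each burst holds
    exactly `interval_packets` packets, whatever the arrival rate is
    (except possibly the last burst, if the trace ends first).
    """
    selected = []
    taken = 0          # packets selected so far in the current burst
    resume_at = None   # time at which selection resumes, None while selecting
    for t in arrival_times_usec:
        if resume_at is not None:
            if t < resume_at:
                continue          # still inside the gap: skip this packet
            resume_at = None      # gap is over: start a new burst
        selected.append(t)
        taken += 1
        if taken == interval_packets:
            taken = 0
            resume_at = t + gap_usec
    return selected


# Interval length = 10 packets, 90 usec without selection, over 1000 usec:
fast = hybrid_selected(range(0, 1000, 1), interval_packets=10, gap_usec=90)
slow = hybrid_selected(range(0, 1000, 5), interval_packets=10, gap_usec=90)
```

Here the burst size is fixed by configuration, so the number of selected packets per cycle does not depend on the link rate; only the time needed to fill a burst does.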