



From: Joe Abley <jabley@automagic.org>
To: nanog@merit.edu
Subject: representativeness of flow data based on samples
Date: Wed, 30 Jan 2002 14:02:30 -0500


Traffic measurement techniques such as NetFlow work by associating
some characteristics of inbound packets on an interface with a flow,
e.g. some tuple like (source addr, source port, dest addr, dest port,
protocol). Counters per flow are incremented, and the numbers are
exported periodically or when flows become inactive.
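A minimal sketch of the per-flow accounting described above. The packet
record layout (5-tuple plus a size in bytes) and the function name
`account` are assumptions made for illustration; this is not how any
particular vendor implements it, and export/expiry of flows is left out.

```python
from collections import defaultdict

def account(packets):
    """Accumulate packet and byte counters per 5-tuple flow key.

    Each packet is a hypothetical record:
    (src_addr, src_port, dst_addr, dst_port, protocol, size_bytes)
    """
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src, sport, dst, dport, proto, size in packets:
        key = (src, sport, dst, dport, proto)  # the flow tuple
        flows[key]["packets"] += 1
        flows[key]["bytes"] += size
    return dict(flows)

# Illustrative packets using documentation address ranges.
packets = [
    ("192.0.2.1", 1025, "198.51.100.7", 80, "tcp", 1500),
    ("192.0.2.1", 1025, "198.51.100.7", 80, "tcp", 400),
    ("192.0.2.9", 5353, "198.51.100.8", 53, "udp", 80),
]
flows = account(packets)
for key, counters in flows.items():
    print(key, counters)
```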

There are a few vendors who now provide traffic export from high-speed
interfaces by sampling those interfaces at a particular rate, and
using the sampled packets to populate the per-flow counters, rather
than looking at every packet.
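To make the sampled case concrete, here is a rough sketch that assumes
deterministic 1-in-N sampling (every N-th packet) and scales each
sampled packet by N to estimate the true byte count. Real
implementations may instead sample randomly; the function name
`sampled_bytes` and the (flow key, size) record layout are hypothetical.

```python
from collections import Counter

def sampled_bytes(packets, n):
    """Estimate bytes per flow from every n-th packet, scaled up by n.

    packets is a sequence of hypothetical (flow_key, size_bytes) records.
    """
    estimate = Counter()
    for i, (key, size) in enumerate(packets):
        if i % n == 0:                 # keep 1 packet in n
            estimate[key] += size * n  # each sample stands in for n packets
    return estimate

# Twelve 100-byte packets: eight for flow "a", then four for flow "b".
packets = [("a", 100)] * 8 + [("b", 100)] * 4
estimate = sampled_bytes(packets, 4)
print(dict(estimate))
```

In this small example the scaled estimates happen to match the true
byte counts exactly; in general they are only estimates, and how far
off they are is what the question below is about.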

Does anybody here know of recent research with real Internet traffic
which compares different sampling rates with respect to the
representativeness of the resulting flow data?

For example, if I am trying to rank the top traffic sinks for my
network beyond an attached peer (i.e. an ordinal rather than cardinal
measurement), will I get different answers if I use a sampling rate
of 1:1000 compared to 1:50, given a statistically "long enough"
measurement period?
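One way to poke at this question is a toy simulation, sketched here
under loud assumptions: 20 hypothetical sinks with Zipf-like
popularity, independent packets, random sampling with probability 1/N,
and ranking by packet count rather than bytes. Real traffic is bursty
and correlated, so this illustrates the experiment, not the answer.

```python
import random
from collections import Counter

def rank_sinks(packets, rate, rng):
    """Rank sinks by sampled packet count, keeping each packet with
    probability 1/rate."""
    counts = Counter(dst for dst in packets if rng.random() < 1.0 / rate)
    return [dst for dst, _ in counts.most_common()]

rng = random.Random(2002)  # fixed seed so runs are repeatable

# Assumed traffic model: sink k receives a share proportional to 1/k.
sinks = [f"sink{i:02d}" for i in range(1, 21)]
weights = [1.0 / i for i in range(1, 21)]
packets = rng.choices(sinks, weights=weights, k=1_000_000)

true_rank = [dst for dst, _ in Counter(packets).most_common()]
rank_50 = rank_sinks(packets, 50, rng)
rank_1000 = rank_sinks(packets, 1000, rng)

for name, ranking in (("all packets", true_rank),
                      ("1:50", rank_50),
                      ("1:1000", rank_1000)):
    print(f"{name:>11}: {ranking[:5]}")
```

Under this model one would expect the well-separated sinks at the head
of the ranking to come out the same at both rates, while sinks with
similar shares further down may swap places at 1:1000 unless the
measurement period is made longer. Whether real traffic behaves like
this is exactly the open question.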

Intuitively, it seems to me that the answers should be the same.
However, it also seems to me that statistics are frequently
non-intuitive.


Joe
