
RE: making psamp export congestion-aware



> > How do you account for failure of collector (another strongly
> > debated topic)
>
> In the congestion-aware framework mentioned in my previous message,
> we'll need some way to handle the case of 100% loss -- when the
> collector fails, all export packets are lost, or all
> reconfiguration-of-export-rate packets are unable to reach the router.
> Otherwise, any of these events could cause the psamp device to keep
> sending export packets at the last-configured rate. We'll need some
> sort of soft-state for the export rate (with an expectation of
> periodic refresh).

We had to solve this same problem when developing sFlow (www.sflow.org). We
decided to use a time-based reservation scheme.

sFlowTimeout OBJECT-TYPE
     SYNTAX      Integer32
     MAX-ACCESS  read-write
     STATUS      current
     DESCRIPTION
       "The time (in seconds) remaining before the sampler is released
        and stops sampling.  When set, the owner establishes control
        for the specified period.  When read, the remaining time in the
        interval is returned.

        A management entity wanting to maintain control of the sampler
        is responsible for setting a new value before the old one
        expires.

        When the interval expires, the agent is responsible for
        restoring all other entities in this row to their default
        values."
     DEFVAL { 0 }
     ::= { sFlowEntry 3 }

The basic goal we had in designing the sampling agent was simplicity. A few
simple mechanisms in the agent allow all the intelligence to be pushed to
the collection application. In this case the collector sets the timeout on
the agent and is responsible for refreshing it, which ensures that the agent
will stop transmitting if the collector fails.
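To make the reservation scheme concrete, here is a minimal Python sketch of
the agent side. It is only an illustration of the sFlowTimeout description
above, not code from any sFlow agent: the class name, the row fields (owner,
sampling_rate, receiver) and the injectable clock are assumptions made for
the example.

```python
import time


class SamplerEntry:
    """Hypothetical agent-side row modeled on the sFlowTimeout description."""

    # Assumed default row values; a real agent defines its own columns.
    DEFAULTS = {"owner": "", "sampling_rate": 0, "receiver": None}

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so the sketch can be tested
        self._expires_at = 0.0       # DEFVAL { 0 }: nobody holds the sampler
        self._row = dict(self.DEFAULTS)

    def set_timeout(self, seconds, **settings):
        """Setting the timeout establishes (or refreshes) control."""
        self._expire_if_due()
        self._row.update(settings)
        self._expires_at = self._clock() + seconds

    def remaining(self):
        """Reading the timeout returns the time left in the interval."""
        self._expire_if_due()
        return max(0, int(self._expires_at - self._clock()))

    def is_sampling(self):
        self._expire_if_due()
        return self._row["sampling_rate"] > 0

    def _expire_if_due(self):
        # On expiry the agent restores the rest of the row to its defaults,
        # which stops sampling without any help from the collector.
        if self._clock() >= self._expires_at:
            self._row = dict(self.DEFAULTS)
```

A collector that is still alive calls set_timeout() again before the
interval runs out. A collector that has failed simply stops refreshing, and
the agent falls silent by itself.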

As Jennifer pointed out, the only mechanisms needed in the agent to allow
the collector to be congestion-aware are the ability to remotely set the
sampling rate and the inclusion of sequence numbers in the sample packets.
This allows the collector to adopt congestion-aware policies that are
appropriate to the measurement task.
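One possible collector-side policy built on those two mechanisms is sketched
below in Python. It is an illustration only: the loss thresholds, the
doubling/back-off rule and all names are assumptions, and the tracker assumes
in-order delivery and ignores sequence number wraparound.

```python
class LossTracker:
    """Estimates datagram loss from gaps in the sequence numbers."""

    def __init__(self):
        self.last_seq = None
        self.received = 0
        self.lost = 0

    def observe(self, seq):
        # A jump of more than one means the datagrams in between went missing
        # (assumption: no reordering, no wraparound).
        if self.last_seq is not None and seq > self.last_seq + 1:
            self.lost += seq - self.last_seq - 1
        if self.last_seq is None or seq > self.last_seq:
            self.last_seq = seq
        self.received += 1

    def loss_fraction(self):
        total = self.received + self.lost
        return self.lost / total if total else 0.0


def next_sampling_interval(current_n, loss, high=0.05, low=0.01,
                           min_n=1, max_n=65536):
    """Choose the next N for 1-in-N sampling (thresholds are hypothetical).

    High loss: double N, which halves the export load.
    Low loss: shrink N by about one eighth to recover accuracy slowly.
    """
    if loss > high:
        return min(current_n * 2, max_n)
    if loss < low:
        return max(current_n - max(1, current_n // 8), min_n)
    return current_n
```

The collector would feed every received sequence number to observe(), and
periodically write the result of next_sampling_interval() back to the agent
as the new sampling rate. A different measurement task could swap in a
different policy without touching the agent.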

A couple of other mechanisms are also useful:
1. The ability to remotely set the destination for samples allows graceful
fail-over between collectors. Collectors can use high availability
clustering software to hand over monitoring in the event of a failure. The
agents do not need to be "smart" about failures. The choice of UDP as the
transport mechanism allows for efficient multicasting of packets if
required.
2. The ability to remotely set the maximum datagram size allows the
collector to avoid packet fragmentation while still permitting the
efficiency of large packet sizes where possible.
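As an illustration of the second point, the Python sketch below shows one way
an agent might honour a collector-configured maximum datagram size when
batching samples. The function name, the header_size parameter and the
representation of samples as already-encoded byte strings are assumptions for
the example, not part of any specification.

```python
def pack_samples(samples, max_datagram_size, header_size=0):
    """Group encoded samples into datagrams no larger than the configured size.

    samples:            iterable of bytes objects (already-encoded samples)
    max_datagram_size:  limit set remotely by the collector
    header_size:        assumed fixed per-datagram header overhead
    """
    datagrams, current, used = [], [], header_size
    for sample in samples:
        if header_size + len(sample) > max_datagram_size:
            raise ValueError("sample does not fit in a single datagram")
        # Start a new datagram rather than exceed the limit.
        if current and used + len(sample) > max_datagram_size:
            datagrams.append(current)
            current, used = [], header_size
        current.append(sample)
        used += len(sample)
    if current:
        datagrams.append(current)
    return datagrams
```

With a limit chosen below the path MTU, each group fits in one unfragmented
packet. Raising the limit where the path allows it lets more samples share a
single datagram.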

The strategy of using a simple agent with most functionality pushed to the
collector has a number of significant advantages over strategies involving
"smart" agents. New functionality can be quickly deployed through upgraded
collectors (it takes much longer to standardize and upgrade agent
functionality). Simple agents take up minimal resources and are easy to
implement, encouraging widespread implementation and deployment.

Peter

