OOPS Honesty About AgentX
- To: eos@ops.ietf.org
- Subject: OOPS Honesty About AgentX
- From: Wes Hardaker <hardaker@tislabs.com>
- Date: Tue, 11 Mar 2003 16:32:39 -0800
- Organization: Network Associates Laboratories
- User-agent: Gnus/5.090015 (Oort Gnus v0.15) XEmacs/21.5 (brussels sprouts,i686-pc-linux)
Notes:
1) Please read all the way through before you put on your flame
thrower.
2) FYI, the most likely discussion/option points are prefixed with
DISCUSSION. There are also 3 PROPOSALS tagged below.
However, please see #1 above.
3) The material herein applies mostly just to Get-Object-PDU and
Get-Object-Response-PDU messages, though elements of it will
apply to all.
4) This material is not for the faint of heart. It requires
extensive knowledge of SNMP, AgentX and OOPS.
Many have brought up potential OOPS impacts on AgentX (and Randy said
something like "let's be honest about [OOPS] impact on AgentX", hence
the subject). Though I have not yet reviewed every aspect of the
situation in intense detail, I do agree that there are some
issues. Much of the software I am involved with makes heavy use of
AgentX, so I don't want it to be unusable either. Without change,
AgentX certainly won't be able to help as much toward some of the
goals behind the OOPS work. Even without change, however, some of the
benefits will still be gained. The important thing is that we find a
solution that will allow at least some gain to be achieved until
subagent protocols can catch up.
This is actually a more generic issue than just subagent technologies.
Specifically, when a new OOPS operation comes into an agent which does
not have mib-instrumentation level changes to support the filtering
and other features provided by the OOPS protocol, what should an agent
do with the incoming request?
Background:
Let's start with a diagram:
            network
            _______
            |     |
      SNMPv2|     |OOPS
            V     V
            +------+
            |Master|
            |Agent |
           /+------+\
          /    |     \
         /     |      \
+--------+ +--------+ +-----+
|Subagent| |Internal| |Proxy| ...
|        | |  API   | |     |
+--------+ +--------+ +-----+
+-----------------------------+
|        Shared Table         |
+-----------------------------+
This depicts a worst-case scenario: one table is shared across
multiple methods of access. Specifically, data may be accessed by
subagent protocols (such as AgentX), by internal API calls, by
proxying, and so on. I'll refer to these methods below as
"accessors". It is important to note that different rows in a common
table may require access through different methods to the data
contained within. (XXX: columns and wild-carding).
When SNMPv2 PDUs (or SNMPv1 PDUs) are being processed by the master
agent, the master agent simply divides up the request and queries the
appropriate table-data accessors. The master agent must carefully
control processing between these possible multiple access points to
ensure that GETNEXT operations are properly lexicographically sorted
when returned in the RESPONSE message. This means that if the rows
are allocated as follows:
row1 subagent1
row2 internal
row3 subagent1
row4 proxy
The master agent must properly query accessors to return data in the
RESPONSE messages in exactly this order. Any other order violates
GETNEXT PDU processing. One possible way of organizing calling
information within the master agent is heavily discussed in the AgentX
protocol document, so it won't be reiterated here.
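
The interleaving described above can be sketched roughly as follows.
This is only an illustration, not how any particular master agent is
built: the Accessor class, the master_getnext and walk helpers, and
the enterprise OID 1.3.6.1.4.1.99999.1 are all made up for the
example. OIDs are modelled as tuples of integers, which Python
compares lexicographically.

```python
from bisect import bisect_right

# Hypothetical column OID; rows are identified by one trailing sub-identifier.
COLUMN = (1, 3, 6, 1, 4, 1, 99999, 1, 1)

class Accessor:
    """Hypothetical accessor (subagent, internal API, proxy) owning some rows."""
    def __init__(self, name, rows):
        self.name = name
        self.rows = sorted(rows)                  # list of (oid, value)
        self.oids = [oid for oid, _ in self.rows]

    def getnext(self, oid):
        """Return the first (oid, value) strictly after `oid`, or None."""
        i = bisect_right(self.oids, oid)
        return self.rows[i] if i < len(self.rows) else None

def master_getnext(accessors, oid):
    """Ask every accessor for its successor and keep the smallest OID."""
    found = [r for r in (a.getnext(oid) for a in accessors) if r is not None]
    return min(found, key=lambda r: r[0]) if found else None

def walk(accessors, start):
    """Repeated GETNEXT from `start` until no accessor has anything left."""
    out, oid = [], start
    while (nxt := master_getnext(accessors, oid)) is not None:
        out.append(nxt)
        oid = nxt[0]
    return out

accessors = [
    Accessor("subagent1", [(COLUMN + (1,), "subagent1"), (COLUMN + (3,), "subagent1")]),
    Accessor("internal", [(COLUMN + (2,), "internal")]),
    Accessor("proxy", [(COLUMN + (4,), "proxy")]),
]
owners = [value for _, value in walk(accessors, COLUMN)]
```

Here owners comes out in the row1..row4 order from the allocation
above, regardless of the order in which the accessors are listed.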
For an OOPS Get-Object-PDU request, things change a bit. Rows within
the Get-Object-PDU are logically organized together within the packet,
and selected elements of the data are returned to the caller
(which may or may not contain a complete index set).
In an agent where the accessor mechanism is any of the above (Internal
API, Subagent, or Proxy (or ...)) and the accessor functionality
doesn't understand OOPS optimized access to the data
storage/functionality, some method of translation must be done. This
means an agent has two choices:
1) attempt to support the request via an internal OOPS->GETNEXT
translation
2) don't support OOPS requests to that table.
DISCUSSION 1:
#2 above should probably be discussed first. If an agent implements
OOPS, should all objects be accessible under the OOPS PDUs? Or
should an agent be allowed to only return data when the underlying
mechanism supports the needed advanced notions? Personally, I'm
more in favor of #2 since it allows for incremental improvements to
an agent and doesn't require that an agent do a massive update to
its internal infrastructure.
Doing an OOPS->GETNEXT internal conversion shouldn't be a huge amount
of work. The problem is that it effectively embeds some management
code into the agent to do the row-wise data collection needed
to return the appropriate OOPS response. The cursor field returned in
the Get-Object-Response-PDU can merely be an encoded OID indicating
where to restart the GETNEXT traversal when the next request comes
in. But what does this mean for the agent (quick summary)?
a) the agent contains internal management-like code to do data
collection across older internal APIs and across subagents and
proxies.
b) The filtering and data selection still get applied to the
resulting collected rows.
c) The overall packet sizes returned on the network should still be
significantly smaller, due to b) and due to the more efficient
Get-Object-Response-PDU encoding (over the RESPONSE encoding from
GETNEXT/GETBULK counterparts).
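
A rough sketch of such an internal OOPS->GETNEXT conversion, under
heavy assumptions: LegacyStore, get_object, the table OID and the
dotted-string cursor are all invented for illustration and are not
taken from the OOPS draft or from any real agent. It walks the first
requested column with GETNEXT to discover rows, collects the other
columns per row, applies the filter, and returns the last visited OID
as the cursor.

```python
from bisect import bisect_right

TABLE = (1, 3, 6, 1, 4, 1, 99999, 1)   # hypothetical table entry OID

class LegacyStore:
    """Stand-in for an accessor that only understands GET and GETNEXT."""
    def __init__(self, data):
        self.data = dict(data)
        self.oids = sorted(self.data)

    def get(self, oid):
        return self.data.get(oid)

    def getnext(self, oid):
        i = bisect_right(self.oids, oid)
        return self.oids[i] if i < len(self.oids) else None

def get_object(store, columns, row_filter, max_rows, cursor=None):
    """Emulate a row-oriented request; returns (rows, cursor or None)."""
    first = TABLE + (columns[0],)
    oid = tuple(int(x) for x in cursor.split(".")) if cursor else first
    rows = []
    while len(rows) < max_rows:
        oid = store.getnext(oid)
        if oid is None or oid[:len(first)] != first:
            return rows, None                     # walked off the column
        index = oid[len(first):]
        row = {c: store.get(TABLE + (c,) + index) for c in columns}
        if row_filter(row):                       # filtering happens in the agent
            rows.append((index, row))
    return rows, ".".join(map(str, oid))          # cursor = restart OID

data = {}
for i, (name, status) in enumerate(
        [("eth0", "up"), ("eth1", "down"), ("eth2", "up"), ("eth3", "up")], 1):
    data[TABLE + (1, i)] = name
    data[TABLE + (2, i)] = status
store = LegacyStore(data)
is_up = lambda row: row[2] == "up"
page1, cur = get_object(store, [1, 2], is_up, max_rows=2)
page2, end = get_object(store, [1, 2], is_up, max_rows=2, cursor=cur)
```

The first call returns rows 1 and 3 plus a cursor; the second call
resumes from that cursor, returns row 4, and reports no further
cursor. Only the filtered rows would go out on the wire, which is the
point of b) and c) above.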
DISCUSSION 2:
I was originally thinking that an OOPS-knowledgeable master agent could
make clever use of the cursor field by encoding a particular subagent
"id" into the cursor such that the master agent could walk one
subagent at a time and not have to worry about interleaving row
results, as it has had to do in the past. There are two problems with
this:
1) Currently, cursors are supposed to be reusable forever, even
after a master agent reboot. This causes problems, since
subagents would then need to be uniquely identified across reboots.
2) Subagent 1 can allocate an index, then deallocate it, and then
subagent 3 can reallocate the same index later. If the row
ordering of Get-Object-Response-PDU replies must be consistent
for all time, then there is no way to create a cursor which is
not based at least in part on a GETNEXT OID across all subagents,
which defeats half the purpose. I.e., if ordering must be
preserved at all times and data is allowed to move
from one subagent to the next, there is no way for a master agent
to guarantee the ordering returned between subagents.
So...
PROPOSAL 1:
Drop the requirement that cursors must be valid for all time. I
think the infinite lifetime will cause only harm. It's unlikely
managers will need (note I didn't use "want") to keep cursor data
around forever and it's much more likely they'll only use them to
continue traversal in future follow-on Get-Object-PDU requests.
So, I'd like to drop the requirement that they must remain valid
forever and instead require only that they remain valid until the
next time the agent reboots. I think this is a more
reasonable expectation to be imposed on an agent.
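
One way an agent might scope cursors to a single boot, sketched under
assumptions: the CursorCodec class and the "boot:oid" string layout
are invented here for illustration, and nothing in the OOPS documents
prescribes a cursor format. The agent stamps each cursor with a
per-boot identifier and refuses cursors stamped by an earlier boot.

```python
class CursorCodec:
    """Hypothetical cursor format: '<boot id>:<dotted restart OID>'."""

    def __init__(self, boot_id):
        self.boot_id = str(boot_id)      # e.g. a boot counter kept by the agent

    def encode(self, oid):
        return self.boot_id + ":" + ".".join(map(str, oid))

    def decode(self, cursor):
        boot, _, dotted = cursor.partition(":")
        if boot != self.boot_id:
            # Cursor was issued before the last reboot; caller must restart.
            raise ValueError("stale cursor")
        return tuple(int(x) for x in dotted.split("."))

before_reboot = CursorCodec(boot_id=7)
cursor = before_reboot.encode((1, 3, 6, 1, 4, 1, 99999, 1, 1, 3))
resumed = before_reboot.decode(cursor)

after_reboot = CursorCodec(boot_id=8)
try:
    after_reboot.decode(cursor)
    rejected = False
except ValueError:
    rejected = True
```

A manager that receives the rejection would simply restart its walk
without a cursor.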
PROPOSAL 2:
Discard the requirement that rows must be returned in a
dependable order. The more I thought about it, the less sure I was
why I had been so determined to put ordering in the document at all.
If I recall, I wrote that requirement in before cursors were put
in place in the PDUs, and thus ordering was needed to ensure the
skip-objects field and the max-return-objects field were usable at
all. The important thing is that data not be skipped and that
duplicates are not returned (though the latter is less important
than the former, IMHO). Since cursors provide this functionality,
by requirement, I don't see the need to keep the requirement
that the data must be returned in the same order from one
Get-Object-PDU based walk to the next.
However, even with these two proposals it is still impossible to
design a cursor for return by a master agent for use with subagents
which doesn't basically encompass the exact GETNEXT-style OID
semantics. This is OK, however. Subagent technologies will
hopefully incorporate the newer ideas behind the OOPS proposal,
and thus we'll gain an advantage in the
future.
DISCUSSION 3:
Subagents don't break indexes into pieces, which makes it difficult
for a master agent without MIB table knowledge to properly construct
Get-Object-Response-PDU packets which require that index encodings
be separated out.
PROPOSAL 3:
Add a CHOICE element to the index encodings that allows master
agents to return an OID for an index which can't be broken down.
ASN.1-wise, this would mean modification of the ElementSpecifier
to change the index-number range from 0..4294967295 (which was
really unnecessarily large in the first place) to -1..2147483647
such that a value of -1 would indicate the data portion of the
DataList would be the raw OID instance identifier (with a 0.0
prefix).
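
To make the -1 convention concrete, here is a small sketch. It is not
the ASN.1 from the draft: encode_index, decode_raw, the (index-number,
data) tuples, numbering the indexes from 1, and fixed-length indexes
are all assumptions made for the example. A master agent that knows
the table layout splits the instance into indexes; one that doesn't
falls back to the raw instance OID with a 0.0 prefix.

```python
RAW_INDEX = -1   # proposed marker: data is a raw OID instance identifier

def encode_index(instance, index_lengths=None):
    """Return a list of (index_number, data) pairs for one row instance.

    index_lengths is the (assumed fixed) sub-identifier count of each
    index, or None when the master agent has no MIB table knowledge.
    """
    instance = tuple(instance)
    if index_lengths is None or sum(index_lengths) != len(instance):
        return [(RAW_INDEX, (0, 0) + instance)]   # can't be broken down
    out, pos = [], 0
    for number, length in enumerate(index_lengths, start=1):
        out.append((number, instance[pos:pos + length]))
        pos += length
    return out

def decode_raw(element):
    """Strip the 0.0 prefix from a raw-index element."""
    number, data = element
    if number != RAW_INDEX or data[:2] != (0, 0):
        raise ValueError("not a raw index element")
    return data[2:]

instance = (10, 192, 168, 0, 1)                 # e.g. ifIndex-like + IPv4 address
split = encode_index(instance, index_lengths=[1, 4])
raw = encode_index(instance)                    # no table knowledge
```

A manager that does know the MIB can still recover the individual
indexes from the raw form; it just has to do the splitting itself.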
In summary, there are definitely OOPS issues with respect to subagent
protocols. The proposals above help alleviate some of the problems.
In the meantime, thoughts on the above would be appreciated.
--
Wes Hardaker
Network Associates Laboratories