
review of partial-lock-01



Hi,

I have reviewed this draft. It generally looks good, but needs to be
tightened up quite a bit to make a good standard.

section 2.1. 
paragraph 2 - I think this should start "The system MUST ensure". You
will have interoperability problems if everybody doesn't follow the
same rules on this.

paragraph 4 - "one or more datastores" This is the first mention of
multiple datastores in this document, and I think it would be good to
make sure readers know you are referring to running, startup,
candidate, etc. I can think of other types of datastores, such as the
datastores for blade 1 and for blade 2 in a chassis, and it is
important that standards are clear and unambiguous. The global lock
seems to only work on one target datastore per command.

paragraph 4 - if the XPath will be applied to multiple datastores,
must the datastores be in the same naming space? Can XPath span
multiple datastores in different naming spaces?

section 2.2 - This section and others in the document need to be
tightened by using RFC2119 language to promote interoperability
between multiple clients and multiple servers. "Partial locking uses
only restricted XPath to describe the lock's scope ... [and]
optionally can utilize any XPath ...". Well, which is the base
standard that is mandatory to implement to claim compliance to this
spec?

The support for a full XPath expression is not necessarily
interoperable, and it should probably be moved to a different
paragraph, identified as an optional extension to the standard, and
contain discussion of the problems one should expect if a client tries
to rely on full XPath support in the servers.

section 2.4.1 - the security considerations section should have
discussion of how doing the evaluation only once can have security
implications. I recommend that such discussion include an example of
how a data model to configure a security protocol might be impacted
(or not).
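
To make the evaluate-once concern concrete, here is a minimal sketch.
It assumes the lock scope is the node set matched at lock time; the
data model, the user names, and the helper functions are hypothetical
and are not taken from the draft. It uses the XPath subset in
Python's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration for a security-related data model.
config = ET.fromstring(
    "<config><users>"
    "<user><name>alice</name></user>"
    "<user><name>bob</name></user>"
    "</users></config>"
)


def take_partial_lock(root, select):
    """Evaluate the select expression once and keep the matched nodes."""
    return list(root.findall(select))


def is_locked(locked_nodes, node):
    """A node is protected only if it was in the node set at lock time."""
    return any(node is locked for locked in locked_nodes)


locked = take_partial_lock(config, "./users/user")

# After the lock is granted, another session adds a user entry.
users = config.find("./users")
mallory = ET.SubElement(users, "user")
ET.SubElement(mallory, "name").text = "mallory"

# Re-running the same expression now matches three nodes, but only the
# two that existed at lock time are protected.
covered = [is_locked(locked, u) for u in config.findall("./users/user")]
print(covered)  # [True, True, False]
```

In this sketch the client that holds the lock on "all users" does not
actually control the entry that was added later, which is the kind of
case the security considerations could discuss.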

The bullet on access rights says the user must have "at least some
basic access rights, e.g., read rights, ...", but later you say "when
a partial lock is executed you get what you asked for: a set of nodes
that are locked for writing." So shouldn't the "at least basic access
rights" be authorization to write the data? And, again, this should
probably be converted to RFC2119 language.

"There are some other issues that are intentionally not addressed for
the sake of simplicity." Since you know some issues are not addressed,
then some issues must have already been identified. What are those
known issues that are not addressed?

(By this time, I wish you had more subsections, so it was easier for
me to identify where I am in the text for you.)

s/does not effect/does not affect/

"An operator is allowed to edit the configuration both inside and
outside the scope of a lock." Can you give an example to show why this
is part of the proposed design?

"   Note: The <partial-lock> operation does not modify the global
   <lock> operation defined in the base NETCONF Protocol [RFC4741].
   If part of a datastore is already locked by <partial-lock>, then a
   global lock for that datastore fails even if the global lock is
   attempted by the same NETCONF session which owns the
   partial-lock."
Will a global unlock command unlock one or more partial locks?

"The select expressions MUST return a node set." I suggest adding ",
which MAY be empty."

s/instances.]/instances./

"If any select expression returns anything but a node set, the
<error-tag> shall be 'invalid-value'." Since it is non-compliant to
return anything other than a node set, what makes you think the
non-compliant implementation will return the correct error-tag?

"invalid value" is returned for both (not a node set) and (:xpath not
supported). Shouldn't these be different error codes? The non-node-set
is always invalid, but the lack of :xpath is not invalid.

section 3 Security Considerations is inadequate. There are potential
security vulnerabilities caused by partial locking that are not
described in the netconf base protocol. 

To make things easier for IANA, the IANA Considerations should point
to the existing IANA registry
http://www.iana.org/assignments/netconf-capability-urns and how it
should be changed rather than making them look up this information in
the Netconf RFC. 

I recommend making the Open Issues a separate section from the Change
Log.

Would locking a non-existent node result in a node set?
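
For reference, in XPath a location path that matches nothing still
evaluates to a node set, just an empty one. The sketch below shows
this with the XPath subset in Python's xml.etree.ElementTree; the
element names are hypothetical. Whether the draft should treat an
empty result as success or as an error remains the open question.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration with one interface and no routing section.
config = ET.fromstring(
    "<config><interfaces>"
    "<interface><name>eth0</name></interface>"
    "</interfaces></config>"
)

# A select expression that matches an existing node.
existing = config.findall("./interfaces/interface")

# A select expression for a node that does not exist: the result is an
# empty collection of nodes, not an error and not another data type.
missing = config.findall("./routing/static-route")

print(len(existing), len(missing))  # 1 0
```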

I think it would be good to provide extra information for operators to
debug this operation.

If we do not allow locking multiple datastores in one operation, maybe we
should recommend locking them in a particular order, which could help
prevent deadlocks.
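
A minimal sketch of the ordering idea, assuming each datastore has its
own lock and every client agrees on one fixed order. The order chosen
here (running, candidate, startup) and the helper names are
hypothetical, and in-process locks stand in for NETCONF locks.

```python
import threading
from contextlib import ExitStack

# Hypothetical agreed order; any order works if all clients share it.
LOCK_ORDER = ["running", "candidate", "startup"]
datastore_locks = {name: threading.Lock() for name in LOCK_ORDER}


def ordered(names):
    """Sort the requested datastores into the agreed locking order."""
    return sorted(set(names), key=LOCK_ORDER.index)


def lock_datastores(stack, names):
    """Acquire the locks one at a time, always in the agreed order."""
    sequence = ordered(names)
    for name in sequence:
        stack.enter_context(datastore_locks[name])
    return sequence


def worker(names, log):
    # ExitStack releases every acquired lock when the block exits.
    with ExitStack() as stack:
        log.append(lock_datastores(stack, names))


log = []
# The two clients ask for the same datastores in opposite orders.
t1 = threading.Thread(target=worker, args=(["candidate", "running"], log))
t2 = threading.Thread(target=worker, args=(["running", "candidate"], log))
t1.start()
t2.start()
t1.join()
t2.join()
print(log)
```

Because both clients take "running" before "candidate", neither can
hold one lock while waiting for a lock the other already holds.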

I recommend showing the latest changes first. That's fairly typical in
my experience.

Appendix B - I have a real issue with using a DML to define new
operations. The IETF Management Framework has deliberately kept
protocol definitions and data model definitions separate for fifteen
years, and I don't think the Netconf WG alone can reverse that design
decision. That level of change should be discussed before the whole
IETF. In addition, this violates the architectural layering in the
Netconf architecture, which separates content and RPCs (operations).

s/modul/module/

The YANG spec doesn't identify which version of YANG is used. This
info is available in an SMIvx MIB module or an XSD schema.

Does the revision clause refer to urn:...:partial-lock:1.0? What if
these get out of sync?

Appendix C.
Should the XSD include organization and contact?
It would probably be good to annotate the fact that "nc:config-name"
provides a choice.

--
I recommend an Operability Considerations section that discusses
implications of migrating from netconf-base to partial locking, and
global locking to partial locking. 
How do multiple clients and multiple servers with different
capabilities interoperate? 

Do operators need to take the same desired configuration and have a
version of their XML (or YANG or ...) configuration document that only
supports the base protocol, another that supports partial locking
without :xpath, and a third that supports :partial-locking with
:xpath? Or should they write their configuration using conditional
language constructs? 

I can see complexity in supporting multiple versions for partial
locking. What happens to an operator's configuration designs when
there are a lot more capabilities, each with their own optional
extensions or combinations of capabilities, like here? Can we reduce
the number of options in the partial-locking proposal?

Operational considerations might also address how partial locking will
affect other services in the network. Netconf locking (both global and
partial) is creating new requirements for SNMP, CLI, and other
protocols. The impact on other NM protocols should be documented.

SNMP response times may get worse if partial locking prevents SNMP
from performing a SET to an object. An existing SNMP application that
is used to "tweak" a system's configuration to maintain or improve
performance or correct for a detected fault may not work as quickly if
there are Netconf locks in place. In fact, an operator may not be able
to fix a time-critical problem via SNMP or CLI. This impact should be
spelled out in the document.

You might recommend that implementations and operators lock even small
portions of the config only when necessary, in order to avoid having
other NM interfaces stop working as expected. This is true of both
global locks and partial locks, but users might think partial locks
can be held longer because they only deal with small pieces. SNMP
management applications may need to be upgraded to re-try requests
more than they do now, because Netconf locking might be causing more
errors of certain types (resource unavailable, or whatever).

Partial locking puts a new requirement on SNMP and CLI. It's one thing
to say that SNMP and CLI must disable SETs when a global config lock
exists; it is another requirement altogether to say that SNMP and CLI
must understand what gets locked by an arbitrary XPath or restricted
XPath expression. They don't currently know anything about XPath, and
you are expecting them to know how to disable SETs to those
XPath-specified subsets of data.

How will the partial locking extension be managed? How will SNMP be
able to determine if a Netconf partial lock is blocking it? How will
operators be able to tell how many partial locks exist, and what
sections they lock? I would assume operators would not find it very
helpful if a query about "what is locked?" just returned the XPath
string and left the operator to evaluate the XPath. This is
especially true since the evaluation done by the partial lock
capability was done at a different time.

When SNMP or CLI cannot access something locked with a partial lock,
can operators see what locks exist? Can they tell what section is
locked by which lock? Can they tell which locks are experiencing
conflicts - i.e., can the partial locking mechanism report how many
netconf or non-netconf operations were denied because of netconf
locks?

When lock conflicts arise, should these be logged, e.g., in syslog, so
operators can determine what happened when, in case anomalies
(especially with other protocols) are caused by Netconf partial locks?
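
As an illustration of the kind of information I have in mind, here is
a minimal sketch of a lock table that records the nodes resolved at
lock time, counts denied writes per lock, and logs each conflict. All
class, field, and path names are hypothetical; nothing like this is
defined in the draft.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("partial-lock")


@dataclass
class PartialLock:
    lock_id: int
    session_id: int
    select: str          # expression as sent by the client
    locked_nodes: list   # instance paths resolved at lock time
    denied: int = 0      # writes refused because of this lock


class LockTable:
    def __init__(self):
        self.locks = []

    def add(self, lock):
        self.locks.append(lock)

    def check_write(self, path, protocol, session_id=None):
        """Return False and log the conflict if another session's lock
        covers the path."""
        for lock in self.locks:
            if path in lock.locked_nodes and lock.session_id != session_id:
                lock.denied += 1
                log.warning("%s write to %s denied by lock %d (session %d)",
                            protocol, path, lock.lock_id, lock.session_id)
                return False
        return True

    def report(self):
        """Answer "what is locked?" with resolved nodes, not raw XPath."""
        return [(l.lock_id, l.session_id, l.locked_nodes, l.denied)
                for l in self.locks]


table = LockTable()
table.add(PartialLock(1, 42, "/interfaces/interface",
                      ["/interfaces/interface[name='eth0']"]))

snmp_ok = table.check_write("/interfaces/interface[name='eth0']", "snmp")
cli_ok = table.check_write("/system/hostname", "cli")
print(snmp_ok, cli_ok, table.report())
```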

WGs should consider how to configure multiple related/co-operating
devices and how to back off if one of those configurations fails or
causes trouble. It is important to be able to manage the network, not
just single devices. What is the guidance on applying and releasing
partial locks for configuring multiple devices simultaneously? Netconf
offers rollback if the commit fails on the same device. What happens
if an operator partially locks two devices, commits the configuration
on one device and unlocks, and then the commit fails on the other
device? What is the recommended order of lock/change/commit/release
across multiple devices?
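
One possible order, sketched below purely as an assumption and not as
a recommendation from the draft: lock every device, edit every device,
commit every device, and release the locks only after all commits have
succeeded or the successful ones have been undone. The Device class
and its rollback step are hypothetical stand-ins; the base protocol
does not define a cross-device rollback.

```python
class CommitError(Exception):
    pass


class Device:
    """Hypothetical stand-in for a NETCONF session to one device."""

    def __init__(self, name, commit_ok=True):
        self.name = name
        self.commit_ok = commit_ok
        self.events = []

    def partial_lock(self):
        self.events.append("lock")

    def edit(self):
        self.events.append("edit")

    def commit(self):
        if not self.commit_ok:
            self.events.append("commit-failed")
            raise CommitError(self.name)
        self.events.append("commit")

    def rollback(self):
        # Assumed capability: undo an already committed change.
        self.events.append("rollback")

    def unlock(self):
        self.events.append("unlock")


def configure_all(devices):
    locked, committed = [], []
    try:
        for d in devices:
            d.partial_lock()
            locked.append(d)
        for d in devices:
            d.edit()
        for d in devices:
            d.commit()
            committed.append(d)
        return True
    except CommitError:
        # Undo the devices that already committed, newest first.
        for d in reversed(committed):
            d.rollback()
        return False
    finally:
        # Locks are held until the outcome is known on every device.
        for d in reversed(locked):
            d.unlock()


a = Device("a")
b = Device("b", commit_ok=False)
ok = configure_all([a, b])
print(ok, a.events, b.events)
```

The point of the sketch is that unlocking the first device before the
second commit is known to have succeeded leaves no safe way to back
out.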

Debugging support is important. We could add lots of debugging
information to the response for all partial locking commands, but it
might be better to be able to allow operators to enable additional
information only as needed for debugging the capability. But a switch
to enable/disable extra debug info might be something that could be
better tweaked on/off via a capabilities table.

David Harrington
dbharrington@comcast.net
ietfdbh@comcast.net
dharrington@huawei.com


