Date: Wed, 1 Dec 1999 10:28:01 +0100
Message-Id: <199912010928.KAA21045@henkell.ibr.cs.tu-bs.de>
From: Juergen Schoenwaelder <schoenw@ibr.cs.tu-bs.de>
To: Network Management Research Group <nmrg@ibr.cs.tu-bs.de>
Subject: [nmrg] minutes from the Washington meeting
Sender: owner-nmrg@ibr.cs.tu-bs.de
Precedence: bulk

Below are the minutes from our 5th meeting in Washington DC. Thanks to
Dave Harrington and Shawn Routhier for taking notes and for cleaning
them up. (These minutes are also now on our Web server.)

/js

Minutes of the Meetings of the Network Management Research Group at IETF46
Reported by David Harrington, <dbh@cabletron.com>.

Sunday 11-7-99

Attendees:
David Harrington, Cabletron Systems 
Dave Levi, Nortel Networks
Ron Sprenkels, University of Twente
David Partain, Ericsson 
Andy Bierman, Cisco 
Jeff Case, SNMP Research 
Steve Moulton, SNMP Research
Dave Thaler, Microsoft
Dave Perkins, Tollbridge and SnmpInfo 
Jon Saperia, JDS Consulting
Shawn Routhier, ISI
Juergen Schoenwaelder, University of Braunschweig
Eric Schoenfelder, Gaertner Datensysteme
Bert Wijnen, IBM 
Keith McCloghrie, Cisco

Unsigned 64 for smiv2
Keith and Andy explained the hcdata proposal.

We discussed whether the problem needed to be solved for v1 as
well. If we move v2c to standard, it will be easier to migrate from v1
to smiv2. Opaque encoding allows us to implement without new
encoding. However, opaque encoding gives more opportunities to get the
encoding wrong. It would take way too long to get opaque into
deployment. We could add new encodings just as easily and they're
better.

No matter what the outcome, it will probably look like two 32-bit
values. Whether it's encoded as opaque, an overloaded c64, or a new
tag, for most developers it doesn't matter what the decision is. Some
disagree with overlaying a signed on top of an unsigned; overlaying an
unsigned over an unsigned is the best we can do.
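
To illustrate the point that a 64-bit value ends up looking like two
32-bit values to most developers, here is a minimal Python sketch. It
is not taken from the hcdata proposal; the names split_u64 and
join_u64 are hypothetical and say nothing about the wire encoding.

```python
MASK32 = 0xFFFFFFFF  # low 32 bits set


def split_u64(value):
    """Split an unsigned 64-bit integer into (high, low) 32-bit halves."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in 64 unsigned bits")
    return (value >> 32) & MASK32, value & MASK32


def join_u64(high, low):
    """Rebuild the unsigned 64-bit integer from its two 32-bit halves."""
    for half in (high, low):
        if not 0 <= half <= MASK32:
            raise ValueError("each half must fit in 32 unsigned bits")
    return (high << 32) | low
```

Whichever encoding is chosen, application code of this shape would
stay the same; only the decoder underneath it would differ.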

It looks like a general solution when it is a hack. We shouldn't sell
it as a general solution. We should explain this as a specific hack to
resolve the problem. We should warn everybody not to use this in other
cases. We shouldn't say hack outside this room.

It's important to understand the history behind this. We wrote a rule
to avoid abuse of c64, but couldn't write a rule to avoid abuse of
uint64 and signed64, so we chose not to have uint64 and signed64.

We need to advance RMON, and this is the best we can do. We need a
64-bit snapshot and a 64-bit delta. We need to worry about negative
deltas. We have tolerated RMON, so maybe we should localize the issue
to RMON, but people import from RMON.
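
One way to see why negative deltas need attention is the following
Python sketch. It assumes the snapshots are plain unsigned 64-bit
values that may go down as well as up; the names snapshot_delta and
fits_signed64 are hypothetical and not from any RMON document.

```python
U64_MAX = 2**64 - 1


def snapshot_delta(earlier, later):
    """Signed difference between two unsigned 64-bit snapshots."""
    for value in (earlier, later):
        if not 0 <= value <= U64_MAX:
            raise ValueError("snapshot does not fit in 64 unsigned bits")
    # Python integers are unbounded, so this may be negative or very large.
    return later - earlier


def fits_signed64(delta):
    """True if the delta could be carried in a signed 64-bit field."""
    return -(2**63) <= delta <= 2**63 - 1
```

The difference of two unsigned 64-bit snapshots can be negative, and
in the extreme case it does not fit in a signed 64-bit field either.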

Is the concern that this TC would be used beyond RMON, or that it
would set bad precedent? Both are concerns at different levels.  If
the TC was used outside RMON, would that be a big problem?

Some would not be comfortable with wording that said the IETF couldn't
make up its mind, and this is the current decision, and the decision
may be rolled back later. We must decide now whether it's legal or
not, and decide the issue once and for all.

High capacity is more important than RMON only. We need this in
high-speed networks. If we limit this to RMON only, another WG will
come to the same problem 6 months down the road and want to use the
TC.

RMON also defines a zero-based counter 64, with a gentleman's
agreement. That was also called illegal. We need to determine what is
legal and not. It deals with a TC expanding semantics. However, a
zero-based counter wouldn't affect the semantics that counters must
always increase, while uint64 does. People will define everything as
zero-based 64-bit counters because they wouldn't need to do two reads
and calculate a delta, if the counter is initially zero. 32-bit
zero-based counters are from gauge; zero-based64 is based on
counter64.
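
The "two reads and calculate a delta" step mentioned above can be
sketched as follows. This is an illustrative Python fragment, assuming
a 64-bit counter that wraps at most once between the two reads; the
name counter_delta is hypothetical.

```python
MOD64 = 2**64


def counter_delta(first_read, second_read):
    """Increase of a wrapping 64-bit counter between two reads."""
    for value in (first_read, second_read):
        if not 0 <= value < MOD64:
            raise ValueError("reading does not fit in 64 unsigned bits")
    # Modular subtraction gives the right answer across a single wrap.
    return (second_read - first_read) % MOD64
```

With a zero-based counter, a single read already gives the increase
since creation, which is why people would be tempted to define
everything that way.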

How many think this is illegal? All
Is this the best we can do? Yes
Is it allowed outside RMON? Yes
Is the general technique going to be allowed? No

Should it be allowed in more than RMON? I think this must be resolved
for all high-speed networks. Where else is it needed? DISMAN. Can we
agree it cannot be done for anything other than Unsigned64 and Gauge64?

We shouldn't make the decision for the future. We shouldn't constrain
future SNMP decisions on this. This should be only for the current
problem and only u64 and g64. Elevating it to a general-purpose
proposal, as hcdata did, is wrong.

There is middle ground; define a Counter64snapshot and a
Counter64delta, and let Andy and Keith decide the names.

action item: Andy will rewrite TC doc with explanation and
appropriately named TCs.

We should also include text that it may break some existing
implementations, and that it may change in the future. I want to be
able to point out to NMS vendors that counting on encoding for type
determination is a bad practice.

Does this text go in hcRMON or in a general document? By putting the
module name in front of TC, it may localize the problem better.

Consensus: We are all agreeing to do something illegal. We should put
it in a separate document, document that it is illegal, and remove the
contentious wording. We don't want it to say it's illegal, just that
it is "not strictly legal". We can come up with some flowery text that
others must not use this.  Andy will write an Informational document
with 2 TCs. We need new module identity.

ZeroBased64:

It has been argued that zerobased64 is illegal because making it
zero-based is not consistent with the underlying "not initialized to
0". Both approaches are taking away semantics. Not really - a
zero-based counter is merely a constraint on the base. You can never
put a range on a counter; by putting a range on it, the deltas no
longer work. ZeroBased64 has the same problem.

[here we started spinning wheels for a while]

A counter32 today can be constrained to start at zero. They are
contained in MIBs currently under review. The review is based on the
assumption that zero-based counters are acceptable.

[here we did some RFC lookup]
RFC2578 says "Counters have no defined initial value, and thus, a
single value of a Counter has (in general) no information
content. ... A DEFVAL clause is not allowed for objects with a SYNTAX
clause of Counter32." There is an identically worded constraint for
Counter64.
RFC2021: "ZeroBased32 ... will be set to zero(0) on creation ..."

This needs working group consensus; we need to be sure.

action item: Andy will add a third TC in the document with the other
two.  The short-term proposal will be presented in the Ops-Area
meeting. Andy and Keith will submit the document. There will be a
4-week last call, so we can have a short-term solution by year-end.

DPerkins requested Ops-Area chair to charter new committee to resolve
new data types long-term problem. Jeff Case suggests we should
discuss the big picture before dispatching little groups.

The meeting chair asks the group whether they wish to discuss vision
or small items first. The group chooses vision.

Vision

We went around the table and asked for each person's three most
important issues.

Top three things:
DPerkins thinks we need a long term solution for data types, we need a
better solution for bulk transfer, and we need to support operations
in PDUs.

Steve Moulton: no submissions
Eric: nothing to say

Keith says we need support for new data-types. He elects to keep his
options open for other issues.

Andy's concerns are not related to SNMP. We need a new mgmt
architecture. We need a more scalable, better delegation model. We
expect the high level to have an understanding of low level stuff. We
need to layer the architecture better, to hide the details. Plus we
need OID compression, as a way to get data faster.  Also, the COPS-PR
vs. SNMP debate needs to be settled.

Ron: nothing

Juergen Schoenwaelder thinks the SMI is a prime concern. Problems need
to be resolved by making the SMI more powerful. We need new data
types, operations, etc. It does not necessarily mean things like
aggregate types. We need Bulk transfer.

Bert expressed his concerns. SNMP is not object-oriented. Policy-based
mgmt needs some real attention; how does it fit into the mgmt picture?
DMTF is invading the IETF with their CIM model; Do we want this? What
do we see for 5 years from now? Is SNMP still viable? Should we embark
on something new? Are they (DMTF) leading us? Should the mgmt
architecture be COPS plus SNMP?

Dave Levi expresses the view that the COPS vs. SNMP debate will drive
some of the details. If SNMP wins, then bulk transfer is the most
important.

Shawn agrees with most of these, but doesn't know the priority order.

Jon Saperia expressed the need for a complete information data
model. We must work on common goals that matter operationally. Some
local work appears important, but operationally the benefit isn't as
obvious. We need to work on solutions to customer problems rather than
some detailed MIB. Maybe we need a minimum mgmt document.

Jeff Case thinks stability, perceived stability, completeness, and
deployment are the most important. We need new spins on the protocol,
the MIBs, and the SMI.  The MIB must contain more standard
configuration and control, not just monitoring. We need to standardize
more objects to encourage people to use standard approaches. The SMI
needs new data types, etc.

David Partain reiterates Dave Levi's comments. The COPS-PR vs. SNMP
debate is the most important concern. What do we learn from that
debate?

Dave Harrington observed the need for snmpv1 table read efficiency for
all those operators that will continue using snmpv1. We need better
business case justifications - why should vendors implement? We need
more orientation to customer demands.

Compatibility with existing stuff is also critical.

People won't do SETs. People used security as an excuse. With snmpv3,
they still don't do SETs. SETs really are a bitch to do. People need
to write scripts or something.

The application needs to be powerful enough that a non-SNMP person can
easily configure it to do what they want. There have been financial
reasons, rather than SNMP issues, why people don't do SETs. SNMP is
low-level, which is important for monitoring, and for particular types
of SETs. To do SETs at a higher level, you must be aware of the
different requirements of monitoring and of doing SETs.

COPS-PR and SNMP approach the tradeoffs differently. Some
businesses need multiple mgmt systems. We agree that expressing things
at a higher level of abstraction is very important. That can be done
with SMI as is, and we need to write a set of recommendations.
However, that doesn't allow you to get rid of RowStatus.

-----

If SNMPv3 is not being deployed, maybe there's something wrong with
it. What we need to do is publish coexistence docs, because lack of [a
coexistence strategy] is preventing deployment of v3. Another problem
with getting SNMPv3 deployed is the fact that Cisco has a tree in
their source, and customers must choose between stable routing and
snmpv3. Not all issues are feature sets in specs.

[Then a discussion broke out] they want traps, with data, and the
ability to turn off traps they don't want.

We need to move from tech-driven to customer-driven. We need to do
better than to send lots of little-bitty pieces of data to a human. We
need to convince people to use DISMAN, and we need to standardize a
script language to make it possible for operators to do what they
want.

We need to address problems on a timely basis. We need to pick up the
pace.  There are real problems and we need to solve them for people on
a timely basis.

We need more smarts everywhere. Plug and play has to be real; we need
to get configuration down to [???]. There is a big disconnect between
what customers think is important and what's discussed in these
meetings. SNMP purity isn't important to the customers. They know that
to have remote intelligence, security is needed. That will cause some
deployment of SNMPv3. We must not send mixed signals.

The chair raised the issue of the agenda for tomorrow's session.
Should we discuss the COPS vs. SNMP debate before we go into the BOF,
so we don't give an impression of a catfight in that meeting?


NMRG meeting Monday 11-8-99

Attendees:
Shawn Routhier, Integrated Systems
Jon Caron, Cabletron
Keith McCloghrie, Cisco
Ron Sprenkels, University of Twente
Jeff Case, SNMP Research
Steve Moulton, SNMP Research
Jon Saperia, JDS Consulting
Dave Levi, Nortel Networks
Glen Waters, Nortel Networks
Dave Perkins, SnmpInfo and Tollbridge
John Seligson, Nortel Networks
Juergen Schoenwaelder, University of Braunschweig
Dave Partain, Ericsson
Dave Harrington, Cabletron
Bert Wijnen, IBM
Andy Bierman, Cisco

---
What is the goal for the meeting?

We discuss setting out the goals for both COPS-PR and SNMP and try to
find commonality. It may be that we want to have two protocols as they
can then be tuned better for their specific pieces. Perhaps the 13
requirements in the mumble docs are too fine grained; maybe we need to
look at a higher level and try to have one framework. One framework
may still have multiple protocols. One seamless framework might be
better.

Dave Harrington was going to draw the architecture from rfc2571 and
ask Keith if that's what he means, but Keith has a different vision
and is now drawing the slide. Dave asks Jeff if he has thought about
what a COPS like thing would look like in the SNMP world, and whether
he could also put up a drawing, so we could compare the visions.

A Picture drawn by Jeff Case:

This would not be duplication and would be a good thing to do both.

+----------+        +----------+        +----------+
|  CLI GUI |        | SNMP GUI |        |Policy GUI|
+----------+        +----------+        +----------+
        |                 |                   |
        |                 |                   |
        |          +------------+      +------------+      +-----+
        |          | Mgt Station|      | Pol Server |      | Rep |
        |          +------------+      +------------+      +-----+
        |               |                  |
        |               |                  |
        |               |                  |
        |               |              +-------+
        |               |              |  PDP  |
        |               |              +-------+
        |               |                  |
        |               |        +---------+
        |               |        |
  +-----|---------------|--------|-----+
  |     |               |        |     |
  |  +-----+  +------------+  +-----+  |
  |  | CLI |  | SNMP Agent |  | PEP |  |
  |  +-----+  +------------+  +-----+  |
  |      |          |           |      | ... 
  |    /--------------------------/    |
  |   /         Data             /     |
  |  /--------------------------/      |
  |                                    |
  +------------------------------------+

A picture drawn by Keith McCloghrie. Note that this picture includes
the image that one policy may affect multiple end unit entities.

+----------+        +----------+        +----------+        +----------+
|  Other   |        |   GUI    |--------| Directory|        |   SNMP   |
+----------+        +----------+        +----------+        +----------+
       |                  |                  |                   |
       ----------+        |     +------------+            +------+
                 |        |     |                         |  
          +------------------------------------------------------+
          |      |        |     |                         |      |
          |  +--------------------+      +--------------------+  |
          |  |        PDP         |      |   SNMP Manager     |  |
          |  +--------------------+      +--------------------+  |
          |      |      |                           |            |
          +------------------------------------------------------+
                 |      |                           |
         +-------+      | (CLI)       (SNMP)        |
         | (COPS)       |         +-----------------+
         |              |         | 
  +------------------------------------+
  |                                    |
  |  +-----+  +------------+  +-----+  |
  |  | CLI |  | SNMP Agent |  | PEP |  |
  |  +-----+  +------------+  +-----+  |
  |      |          |           |      | ... 
  |    /--------------------------/    |
  |   /         Data             /     |
  |  /--------------------------/      |
  |                                    |
  +------------------------------------+

There was a debate as to the purpose of the pictures. A distinction
was made between what problem we are trying to solve vs. the drawings
which are more or less what we are currently doing. Dave H. was
attempting to establish some common ground, to understand which pieces
of the two pictures could be merged, and which pieces could not be
merged. Part of the desire for pictures was an attempt to get a better
understanding of what is happening. Jon attempts to draw a picture
without any specific technologies.

	management system
	----------------------------

	cloud 1		cloud2

	s1, s2, s3	s1, b1

management apps:
configuration, fault, (element specific)
netwide (policy based elements)


---
The authors of the mumble doc agreed on the requirement of network
wide configuration (policy stuff etc.). It also points out that a
network will need element configuration to get fault, and other,
information. This leads to a suggestion for COPS for policy level and
SNMP for element level.

So what happens for element level information that must be aggregated,
such as the number of errors, number of packets across backbone etc.?
Element information will be used to determine if and how the policies
were carried out.

So the proposal being put forth by the COPS-PR proponents is to use
SNMP for aggregation of element information (statistics etc) and use
COPS for policy distribution - is that correct? Yes. An example of
gathering aggregate statistics using SNMP: collecting statistics on
backbone interfaces - show all interfaces on the backbone that have
more than 50% utilization...
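
The backbone example can be sketched as a small aggregation step on
the management station. This is only an illustration in Python: the
InterfaceSample record, its field names, and the way the octet delta
is obtained are assumptions, not part of any MIB or of the discussion.

```python
from dataclasses import dataclass


@dataclass
class InterfaceSample:
    """One polling interval for one interface (hypothetical record)."""
    name: str
    speed_bps: int      # nominal interface speed in bits per second
    octets_delta: int   # octets counted during the interval
    interval_s: float   # length of the interval in seconds


def utilization(sample):
    """Fraction of the interface capacity used during the interval."""
    if sample.speed_bps <= 0 or sample.interval_s <= 0:
        raise ValueError("speed and interval must be positive")
    return (8 * sample.octets_delta) / (sample.interval_s * sample.speed_bps)


def busy_interfaces(samples, threshold=0.5):
    """Names of the interfaces whose utilization exceeds the threshold."""
    return [s.name for s in samples if utilization(s) > threshold]
```

The element-level data (octet counters per interface) would come from
SNMP; the network-wide question is answered by combining the samples.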

There is a concern that other people in the COPS area may have desires
about using COPS instead of SNMP for element information. Some people
might push for either pure SNMP or pure COPS to do all management,
both policy and element.

The IETF can use the deployment club to try to convince people that,
as SNMP already exists, it is what should be used for element
management. Having a single framework is attractive; the SNMP
framework is already deployed.  But there's COPS-PR stuff already
deployed. What metric is used to make decisions - deployment?
Political clout? If deployment is the metric, then SNMP should win for
statistics. There is no real deployment for policy management so we
have some leeway.

SETs are hard and perhaps we can make things easier by changing the
requirements - such as eliminating the possibility of multiple
managers. This could also be done using SNMP.

The use of TCP allows us to achieve some of the new requirements, but
having SNMP run over both TCP and UDP would lead to two protocols.
Doing the same things as in COPS would be harder and more complicated
in SNMP.

By narrowing focus, COPS-PR optimizes the solution for one part of the
problem.  However, doing the optimization for only one part of the
problem may make the larger problem harder, i.e. one could win the
battle (COPS and policy), but lose the war (network management). The
concern is understood, but we may need to solve a small section and
let other groups solve some of the other problems.

We can put more smarts into the SNMP agent including some of the
features, such as single-request row-creation. The MIB question is a
detail we should not be discussing; we should be discussing at a
higher level.

This is a major question: the fact that an agent can be smarter may
allow us to do quite different things, or do things more efficiently,
in SNMP.
Fixing one thing by adding another GUI (COPS) may make things worse.
Currently operators must use CLI, and SNMP etc., but if we add another
then we will have multiple GUIs.  Do we want to drive towards multiple
protocols? The borderline between COPS configuration and element
configuration is not a hard line; different people will think of the
borderline as being at different places, so by adding COPS we will
allow even more ways of configuring the same thing. This is
undesirable.

The cost of deployment may cause SNMP based policy configuration to
not be deployed, i.e. MIBs may not be deployed. Customers want cross
machine information, response time, availability etc.  We currently
don't have many (if any) MIBs to do this; only a small number of MIBs
supply some cross machine (network wide) information and
configuration. Not much has been done with configuration due to
security.

Why would people deploy COPS if they won't deploy SNMP sets? The
deployment time for COPS-PR is less. Possibly non-technical reasons
such as org charts and who can/will control the work might affect
deployment decisions.

Due to a lack of consistent configuration interfaces, it isn't
worthwhile to write the applications to do configuration. There is a
lack of standard configuration MIBs. There are some problems in the
SNMP protocol if we would like to use it for configuration management.
We should either do all configuration management via SNMP or expect
that essentially all configuration management would happen via COPS.

How far should the non-use of SNMP SETs be carried? SETs would still
be allowed but not for configuration management, so for example if we
choose COPS we should get rid of the remote configuration MIBs for
SNMP.

Because of the rfc2571 framework, we should be able to add a new
application to SNMP that would support the policy stuff, possibly with
new verbs and maybe new types etc.

The real win in the COPS stuff is the delegation model, not the
other optimizations, such that the highest level manager would not
need to understand everything. This leads to a suggestion that the
agent needs to be more intelligent.

Possibly, we are mixing two concepts: The first is the high level view
- an instance could be device specific vs. role specific. The second
is the delegation - the distribution of policy (or other information)
from the top level to all affected items, or to a central item that
then can distribute the information to other affected items.

This gets us on the slippery slope of too fine a granularity. 

This is a MIB design question, not a new technology. Some people are
pushing for a PIB that looks like the MIB. In Diffserv, they don't
want to have multiple versions. As an example of the problem, assume a
router with four blocks. We are allowed to build things out of these
four blocks that are fine for a router on the edge, but are not
necessarily useful in all routers, such as might be used within
Diffserv.

Some working groups won't be specifying high level MIBs due to a lack
of motivation. So you're saying that COPS & PIBs are being proposed to
overcome a procedural problem? Sort of. We could fix this by sending
people to the right places. However, we can't do the work for them.

Would it be acceptable if we convinced people to write multiple MIBs,
one for element type information, and one at a higher (network) level?
That would not be adequate; there are other items as well.

The lack of progress in SNMP has been a problem. The simplicity of
COPS is good, as is the limited bits on the wire. We could allow the
use of PIBs as MIBs when a PDP is not involved; the PIBs can be
manipulated via SNMP, though the byte count is higher.

We would need to get the work done. As was mentioned, stability and
perceived stability are important. We need to figure out how we would
do some of these things. A new application might avoid the stability
problems. However, that wouldn't address the byte count problems
etc. We might need a new, more efficient underlying transport
mechanism. We would get less of the optimizations of COPS and we might
be incompatible.

There is trouble getting SNMPv3 deployed. Why would COPS be easier to
deploy? There is no deployed base and the customers don't have any
happiness to worry about.

It is amusing that the SNMP folks think that COPS will get out there
faster and that the COPS folks think that they will get things out
faster if they are attached to SNMP. One comment was more about
customers, the other more about vendors.

There followed a discussion of how the policy stuff would be presented
(as an SNMP application or as something else).

The use of SNMP has the perception that it would take longer and still
be non-optimal due to the baggage associated with older versions. Some
would prefer to use SNMP but we would want it soon. Would having a set
of SNMP specs available in 6 months be acceptable?

Here is an attempt at a compromise. COPS already exists for QoS, so
one could argue that including Diffserv is not adding a new protocol;
meanwhile SNMP could go off and try and implement something else that
might help. Sometime later, we could compare to see where we are,
whether the SNMP solution is better or worse than the COPS solution,
and whether one or both should be continued.

The COPS-PR promoters support forming a working group for using SNMP
for policy management as long as IESG would not block development of
COPS for provisioning with a review in the future.

Notice that this may be a problem in the future if the decision is to
kill COPS, as people are unlikely to want to let the IESG decide; the
COPS folks may want to let the market decide.

Another comment was that things take too long.

We agree that COPS optimizes some pieces (PDP to PEP), some of which
(perhaps all) we might get by working on SNMP. Such an
optimization might cause problems elsewhere.

Jon talks about the RAP schedule.

The Diffserv PIB has no current home due to the question of COPS
vs. SNMP. This was done by the WG chairs, and it is not in the current
charter anywhere. The Diffserv folks aren't all that interested in
writing MIBs or PIBs to do the provisioning. Any required MIBs or PIBs
should be done in the technology working group.

Where is the sopi language defined? It is currently not defined
anywhere. A new document should be coming soon.

Are there concerns about parallel development? Yes, deployment wins
and so whatever gets deployed first probably is not going to be
replaced. A suggestion is to get a group together to work on an SNMP
version and see if they can get it done in 4 (or whatever) months, and
compare them then. Would these be on the standards track?  Perhaps
both should start on the experimental or informational track. There
seems to be general agreement on that.

We might get a better product if all effort were focused on one
product. Not necessarily, competition is a good thing. This would not
be duplication and would be a good thing to do both. Things going on
the proposed track would send the wrong impression; experimental would
be a good thing.

The BOF on Thursday is to help the IESG to make up their mind.
Do we recommend trimming the amount of time spent on requirements in
the BOF? Spending more time on the type of discussion we have had here
in the BOF would be a good thing. Bert discussed some of what should
happen in the BOF.

We have reached consensus on some things: 
1) MIB design impacts complexity of implementation, 
2) both high level and low level views are needed, 
3) agents can do more, and
4) we have a lot to learn for managing network-wide behavior.

An example of implementation complexity and MIB design is given.
* set ifoperstat.0 to down
* set ifoperstat of "foo" to down
* set ifoperstat of "foo*" to down
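
The three requests above differ in how much work the agent must do to
find the affected rows. The following Python sketch shows that
difference on a toy table; the table layout and all function names
are hypothetical, and the object name "ifoperstat" is simply taken
from the example as given in the meeting.

```python
from fnmatch import fnmatchcase


def select_by_index(table, index):
    """Case 1: the manager names the row by its index."""
    return [index] if index in table else []


def select_by_name(table, name):
    """Case 2: the agent must search for the row with this exact name."""
    return [i for i, row in table.items() if row["name"] == name]


def select_by_pattern(table, pattern):
    """Case 3: the agent must match every row against a wildcard."""
    return [i for i, row in table.items() if fnmatchcase(row["name"], pattern)]


def set_status(table, indexes, status):
    """Apply the new status to the selected rows; return how many changed."""
    for i in indexes:
        table[i]["status"] = status
    return len(indexes)
```

In the first case the manager carries the knowledge of which row to
touch; in the other two the agent has to do the lookup, which is one
way in which "agents can do more".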

What is our recommendation? Many agree that if we don't do anything to
improve SNMP for this use, COPS should win.

There is a proposal that both should go to experimental while SNMP
should attempt to fix the problems that have been identified. Going to
experimental implies that one of them may be dropped in the future,
probably when things would go to proposed.  A decision was promised so
we should have either a decision or perhaps a time period. Who (or
what group) should be the "author" of the recommendation?  Bert
suggests the NMRG; there don't seem to be any problems with that.

Do we have a recommendation? It is not clear we have one.

Recommendations:
1) Find people to start work on the documents that will need to be
   discussed. A list of potential volunteers is captured.
2) Create a design team (either official or independent) (not in snmpv3 WG)
3) Create a working group at some point. Don't open the wish list to
   everybody until we make some progress. We would accept COPS people
   if they want to bring the proposals together.