INTERIM_MEETING_REPORT_

Reported by Robin Iddon/AXON Networks and Jeanne Haney/Bay Networks

Minutes of the Remote LAN Monitoring Working Group (RMONMIB)

An interim meeting of the RMONMIB Working Group was held in Santa Clara,
CA on 15-17 May.  The meeting was sponsored by cisco Systems.


Agenda

  o Protocol directory
  o Protocol distribution
  o Address mapping
  o Network layer host/matrix
  o Seven-layer host/matrix
  o Relative offset filtering
  o Time filter
  o Probe capabilities
  o Generic control table issues
     -  dropped packet counter
     -  lastActivationTime
     -  lastDeleteTime elimination
     -  tableSizeRequested/Granted
  o Seven-layer topN/history
  o RMON1 additions
  o User history
  o Probe config MIB
  o Dynamic protocol discovery
  o Channel as dataSource


The following notes are intended to provide an overview of the issues
discussed at the meeting.  Refer to the upcoming draft for detailed
changes.


Protocol Directory

Issues involving the protocol identifier format were discussed.
Concerns over OID tree data explosion led to a new ID format using an
OID to represent the protocol layering and an octet string to represent
the attributes or parameters associated with each protocol layer.

OID tree data explosion issues:

    (1) Reference document explodes.
    (2) protocolDirectoryTable grows by same factor.
    (3) Agent code grows potentially.
    (4a) The number of stats/host/matrix rows may grow also.
    (4b) The number of filter entries grows also.

Agreed to add some kind of protocolDirType to indicate whether or not
this node could be (user) extended.

Adopted proposal:


    protocolDirEntry
        protocolDirID -- OID { ip.udp.tftp }
        protocolDirOptions -- OCTET STRING, 8-bit per sub ID

    INDEX { protocolDirID, protocolDirOptions }


There must be exactly one 8-bit option per sub ID in protocolDirID. The
intent is that all protocol defs have exactly one option -- those that
need none use zero; those that need one or more must define how to
combine their values into a single 8-bit value.
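
The pairing rule above can be sketched in Python.  This is only an
illustration:  the class name ProtocolDirId, the numeric sub ID values,
and the flattened index encoding are assumptions, not values or
encodings agreed by the group.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ProtocolDirId:
    """Hypothetical model of the (protocolDirID, protocolDirOptions) pair."""
    sub_ids: Tuple[int, ...]   # one sub ID per protocol layer
    options: bytes             # one 8-bit option per sub ID

    def __post_init__(self):
        # The minutes require exactly one option octet per sub ID.
        if len(self.options) != len(self.sub_ids):
            raise ValueError("need exactly one option octet per sub ID")

    def index(self) -> Tuple[int, ...]:
        # INDEX { protocolDirID, protocolDirOptions }, flattened here as the
        # sub IDs followed by the option octets (this encoding is assumed).
        return self.sub_ids + tuple(self.options)


# Illustrative sub ID values only; these are not assigned numbers.
tftp = ProtocolDirId(sub_ids=(1, 17, 69), options=bytes([0, 0, 0]))
```

A protocol that needs no option simply carries a zero octet in its
position, as in the example.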

For WANs in particular, there is real concern that there is no way to
handle the multitude of link layer encapsulations.  Previously we hoped
to allow vendors to insert their own subtrees; we can still do the same
thing provided we identify the places where it will occur in advance and
provide for a vendor-extension bit.

Agreed to remove protocolDirParentID pending further discussion.

Agreed to add an ``unknown network layer protocol enumeration'' which
handles all cases where absolutely nothing could be determined about a
packet (except its MAC addresses and length).



Protocol Distribution


A proposal came up to add size distribution to this table.  Discussion
over the granularity of the buckets led to a proposal to use three
buckets:  < media-min, >= media-min, and > media-max.  Agreement could
not be reached and size distribution was dropped from consideration.

Proposal to use protocolDirIndex (aka local integer) in the
protocolDistTable INDEX { protocolDistControlIndex, protocolDirIndex }.
Agreed that if there was no other use for the protocolDirIndex then this
table will revert to its original use of protocolDirID.

Discussion of fragmentation and whether we are interested in monitoring
higher layer fragmentation (i.e., whether we want to try and provide
counters which instrument fragmentation at all layers) -- generally the
group appears not to be interested in directly counting fragmentation at
any layer.



Address Mapping

There was much discussion of whether to include the addressMapIfIndex in
the INDEX (and hence to differentiate rows on different interfaces that
are otherwise the same).

There was discussion on how much effort it is for the NMS to utilize
this table.  Possible problems include:


  o How does the NMS map RMON1 host addresses through this table
    without uploading the entire table?

  o Whether the NMS uses the random access capability.

  o Should ifIndex be replaced by an OID to allow it to point to a
    repeater port?  Data source still tells you which network the data
    came from.


Include controlIndex instead of ifIndex because (a) ifIndex is being
replaced and (b) we do not want to keep port histories (which would
happen if a device moved from one port to another and the OID was part
of the INDEX).

Agreed to INDEX { protocol, address, controlIndex }

Agreed to incorporate portOID into addressMapEntry -- intent to point at
point of origin of this device (best guess of agent).
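
A minimal sketch of why portOID is kept as a column rather than in the
INDEX:  when a device moves, the same row is overwritten instead of a
second row being created.  The AddressMap class and the OID strings are
hypothetical.

```python
from typing import Dict, Tuple

# Key mirrors INDEX { protocol, address, controlIndex }.
AddressMapKey = Tuple[str, str, int]


class AddressMap:
    """Hypothetical in-memory model of the address mapping table."""

    def __init__(self) -> None:
        self.rows: Dict[AddressMapKey, str] = {}

    def observe(self, protocol: str, address: str,
                control_index: int, port_oid: str) -> None:
        # portOID is a column, so a move overwrites the existing row
        # and no per-port history accumulates.
        self.rows[(protocol, address, control_index)] = port_oid


amap = AddressMap()
amap.observe("ip", "192.0.2.7", 1, "1.3.6.1.4.1.99999.1.3")  # made-up OID
amap.observe("ip", "192.0.2.7", 1, "1.3.6.1.4.1.99999.1.5")  # device moved
```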

After much discussion about NMS control of agent resource utilization it
was agreed that the protocolDirectory should contain a set of flags to
control usage of this protocol.  At a minimum this should control
whether or not a protocol is used in maintaining the address mapping
(hence it appears in this section of the agenda).  Ideally we would also
have a few more flags to enable usage in the protocolDistribution and
the host/matrix tables.



Network Layer Host/Matrix

Discussion of using an enumerated value vs.  protocol dir index led to
further discussion of protocol directory `counting' issues and the need
to control which protocols are counted in which tables:


  o One idea is to turn on/off a protocol via the protocol dir table.
    This means you collect the same protocols for all interfaces and
    all application tables.  This seems very restrictive.

  o The second idea is to define a protocol channel:  a set of
    protocols that a control entry points to in order to determine
    which protocols it is collecting.  The control tables would still
    have a separate data source value (i.e., not tied to the protocol
    channel, so a protocol channel can be shared across several control
    tables).  This serves two purposes.  It allows the NMS to help the
    agent conserve its resources.  It also makes the tables smaller to
    retrieve, which helps the NMS.

  o The final choice was to turn the protocol on/off on a per
    application (Net Map, Matrix, Host, etc.).  You cannot control it
    on a per interface basis.  You cannot control it on a per control
    table basis.  This is the one that most people voted for.
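
The per-application choice can be pictured as a small set of enable bits
on each protocol directory entry.  The flag names and bit values below
are assumptions made for illustration; the minutes do not define them.

```python
from enum import IntFlag


class ProtocolUse(IntFlag):
    """Hypothetical per-application enable bits for one directory entry."""
    ADDRESS_MAP = 1
    DISTRIBUTION = 2
    HOST = 4
    MATRIX = 8


# One flag word per protocol; it applies to every interface and every
# control table, which is the restriction noted in the minutes.
protocol_dir = {
    "ip": ProtocolUse.ADDRESS_MAP | ProtocolUse.HOST | ProtocolUse.MATRIX,
    "ip.udp.tftp": ProtocolUse.DISTRIBUTION,
}


def is_counted(protocol: str, application: ProtocolUse) -> bool:
    """True if the protocol is enabled for the given application."""
    return bool(protocol_dir.get(protocol, ProtocolUse(0)) & application)
```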


The counters within the nlHostTable were discussed:


  o nlHostOutErrors discussion -- agreed object removed.

  o nlHostOutMACNUCastPkts agreed to replace nlHostOutBroadcastPkts,
    nlHostOutMulticastPkts.

  o nlHostOutFragmentPkts agreed not to implement this class of
    counter.


nlHostEntry creation was discussed.  Certainly do not insert on MAC
error packets; do insert on new source address.  There was some
discussion on whether or not to insert on destination address.  It was
finally agreed to insert on good source and destination addresses but
that the agent may need to use an improved aging technique to eliminate
the host destination addresses generated by programs which ping
sequential addresses in an attempt to discover which hosts exist.

Agreed to drop hlMatrix[SD|DS]Errors.

Agreed to keep both DS and SD tables (despite there being good reasons
not to).  It was deemed (a) too complex to dismiss the NMS's inability
to easily know of some classes of uni-directional conversations and (b)
the overheads on the agent are not severe enough to make the pain of
pushing this through worth doing.

Agreed to not do subnet aggregation because there was no standardizable
proposal and no one volunteered to do one.



Seven-Layer Host/Matrix


Three models were discussed based on nl/sl host tables:


 1. Merge them

 2. Keep them separate but closely related so that the agent can be
    efficient

 3. Keep them totally independent


Long discussion over the product class<->mib group mapping followed.

Eventually the group came to a vote on:


 1. Single control table causing a nlHostTable and slHostTable to be
    constructed (related solution 1' recognizes that within the single
    control table entry will be parameters specific to the nl and sl
    tables, e.g., rm2HostControlNlMaxDesired and
    rm2HostControlSlMaxDesired).

 2. Merge both tables (voted out 1 for merge, 16 against).

 3. Split control tables but slHostControlTable depends on an instance
    of nlHostControlTable.  Notice that this is functionally the same
    as 1'.

 4. No sharing of data, hence duplicate memory requirements!
    (Deleted.)


Proposal 1' was accepted over 3.

Steve will add a straw proposal for the combined sl/nlHostControlTable
in the next draft.

slHostEntry will contain only inPkts/outPkts and inOctets/outOctets.
slHostEntry will not contain slHostAddress, instead INDEX will reference
nlHostAddress, and words will be added to ensure that for each
slHostEntry there must be an nlHostEntry with the same address and hence
deleting an nlHostEntry will cause deletion of the associated
slHostEntries.

Misconfiguring the protocolDirectory such that slHost function is
enabled for a protocol but nlHost function is not enabled for its
network layer protocol causes no data to be collected in either table
for this protocol (because there are no nlHostEntries to relate
slHostEntries to).

Proposal adopted:
INDEX { controlIndex, protDirIndex(addrType), nlHostAddress,
protDirIndex(protocolType) } and that the slHostTable contain neither an
address nor a MACNUCastPkts counter.
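
The dependency between the two tables can be sketched as follows.  The
HostTables class and its method names are hypothetical; only the index
layout, the cascade on delete, and the misconfiguration behaviour come
from the discussion above.

```python
class HostTables:
    """Hypothetical model of one control entry's nl and sl host tables."""

    def __init__(self, nl_enabled, sl_enabled):
        self.nl_enabled = set(nl_enabled)  # protocols with nlHost enabled
        self.sl_enabled = set(sl_enabled)  # protocols with slHost enabled
        self.nl = {}  # (control, nl_proto, address) -> packets
        self.sl = {}  # (control, nl_proto, address, sl_proto) -> packets

    def count(self, control, nl_proto, address, sl_proto):
        if nl_proto not in self.nl_enabled:
            # No nlHostEntry can exist, so nothing is collected in
            # either table for this packet.
            return
        nl_key = (control, nl_proto, address)
        self.nl[nl_key] = self.nl.get(nl_key, 0) + 1
        if sl_proto in self.sl_enabled:
            sl_key = nl_key + (sl_proto,)
            self.sl[sl_key] = self.sl.get(sl_key, 0) + 1

    def delete_nl(self, control, nl_proto, address):
        nl_key = (control, nl_proto, address)
        self.nl.pop(nl_key, None)
        # Deleting an nlHostEntry deletes its associated slHostEntries.
        for sl_key in [k for k in self.sl if k[:3] == nl_key]:
            del self.sl[sl_key]
```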

A proposal was adopted to include a bit/enum in the protocolDirectory to
indicate whether or not a network layer address is available for this
protocolDirectoryEntry (it would not make sense to set this bit for
ip.udp, for instance, but it could be set for both the ip entry and the
ip.udp.appleTalk entry; an agent would set the bit if it supports the
protocol as a network layer protocol and not if it supports it only as
an application protocol).  Ideally we would incorporate this into the
nodeType object.  This is not something to be placed in the parameters
object because it can only relate to the final protocol of the OID, not
all of them.

Proposal for slMatrix is:
INDEX { controlIndex, protDirIndex(addrType), sa, da, protDir(protType) }

Agreed to let Steve apply results of the nl/sl host table discussions to
the matrix and so avoid long discussions over basically the same
subject.

Agreed to move forward to the topN/history on host/matrix tables out of
order because we want to discuss it in the context of the host/matrix
tables.

Discussion of data table columns:


  o Issue of error counters.  What do they include?  Why count L2
    errors by protocol?  Errors can propagate up to this table.  It is
    too hard to make it meaningful to count network layer errors.
    Therefore we will leave them out.

  o Bcast and mcast?  Could there be permutations of bcast/mcast at
    the L2 level and bcast/mcast at the L3 level?  Is a broadcast to
    MAC addresses with a multicast IP address counted as bcast or
    mcast?  Robin believes that the impact on the net is the fact that
    it is bcast, i.e., everyone received and processed it.  We decided
    to merge the bcast and mcast counts into one counter.  We are
    still counting L2 counters with an NLHostOutNUcastPkts (not
    unicast).  Get rid of Broadcast, Multicast, and Errors.

  o Robin proposes an OutFragment counter that only bumps up when
    fragments are detected from a particular SA. Most people abstained,
    so it is a closed issue.  Fragments are not counted.

  o We discussed not adding entries to the Host Table based on DA, so
    that the table does not get filled up with erroneous addresses from
    MIB sweeps, etc.  On the other hand there are L3 broadcast
    addresses and video multicast addresses that will never appear as a
    source.  Maybe we can use a different aging algorithm so that
    entries without out pkts get deleted sooner.  But then would you be
    deleting these interesting mcast and bcast addresses as frequently
    as the bogus sweep addresses?

  o Good packets for this table are defined as good MAC packets.

  o Drop the Matrix error counters and do not add the bcast counter;
    these can be obtained from the host table.  Remove
    nlMatrixSDAddressType.


Discussion of encapsulated network layers (e.g., IP in IP):


  o NL protocols being wrapped in other NL protocols cause problems
    in how to count the pkt and what the NL address is.  How do you
    record both NL addresses?  Do you consider the encapsulated
    protocol to be an application?  There is no place to save the
    encapsulated NL address.

  o Steve proposes an address structure that encodes what the protocol
    is so that we can model both NL protocols and NL protocols
    encapsulated in other NL protocols.  Should we try to solve this
    problem?  (Vote:  8-2-5.)  Now the NL tables could have entries
    that count a pkt twice, since the NL table accounts for all NL
    protocols, not just the NL usage at this particular probe in the
    network.  Not all probes need to implement this, but all NMSs need
    to be aware of this anomaly.  I.e., if you take all the entries for
    a particular NL Host, they could total up to more than 100% of the
    Net utilization for that Host.  How does this affect the protocol
    distribution table?  There would be a protocol directory entry for
    AppleTalk with IP and it would be counted in the prot distribution.

  o The upshot of the vote is that we will handle protocols that may be
    encapsulated within other protocols; the open question is how to
    represent the addressing.  Can we change the network address
    mapping table to record the information that we have learned from
    encapsulated NL protocols?  Resulting changes:

     -  Add pDirIndex as the last index to the slHostTable (and
        slMatrix).
     -  NL -- address object, nonUnicasts.
     -  SL -- pDirIndex on end.
     -  Add a bit/boolean to pDirTable that defines whether addresses
        are recognized for that protocol.



Relative Offset Filtering

There was a lot of discussion of various filtering related topics.  In
the end it was agreed to treat the channel-as-data-source issue
elsewhere.

Agreed to pursue filterLogicTable and mod to filterChannelIndex
0..65535.  Robin to write up proposal (15 for, 0 against, 2 abstain).



Time Filter

After an example and some discussion it was agreed to implement time
filter as proposed (15 for, 0 against, 1 abstain).

It was also agreed that the timeMark goes in between the control index
and the rest of the index.
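
The minutes do not restate the time filter semantics.  The sketch below
assumes the usual reading of the proposal:  a row is visible under a
given timeMark only if it last changed at or after that time.  The
function name and table layout are hypothetical; the position of
timeMark in the returned index follows the agreement above.

```python
def time_filtered_rows(table, control_index, time_mark):
    """Return rows of one control entry changed at or after time_mark.

    table maps (control_index, rest_of_index) -> (last_change, value),
    where rest_of_index is a tuple.
    """
    result = {}
    for (ctl, rest), (last_change, value) in table.items():
        if ctl == control_index and last_change >= time_mark:
            # timeMark sits between the control index and the rest.
            result[(ctl, time_mark) + rest] = value
    return result


example = {
    (1, ("ip", "192.0.2.1")): (100, 5),
    (1, ("ip", "192.0.2.2")): (300, 7),
}
changed = time_filtered_rows(example, control_index=1, time_mark=200)
```

An NMS that polls with the time of its previous poll as timeMark
retrieves only the rows that changed since then.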


Probe Capabilities

We discussed probe classes and the nl/sl split.  We finally closed with
the decision that nl/sl remain different tables (7 for, 3 against, 3
abstain).

Next we voted on whether or not any kind of capabilities object was
needed; in favour (11 for, 1 against, 2 abstain).

Next we discussed per-interface vs.  per-device capabilities.  First
vote on scalar only (per-device) (7 for, 2 against, 4 abstain).  Scalar
adopted.


Generic Control Table Issues

 A) Dropped Packet Counter

    There was a lot of discussion about how the counters work and what
    they are (and are not) intended to do.  In the end it was agreed
    that these counters are not intended to enable the agent to do
    statistical sampling/scaling.  Indeed the notion of scaled data in
    the RMON2 tables is explicitly precluded (the group cannot define a
    scaling algorithm that is universally appropriate).  Finally there
    was debate over whether statistical sampling and scaling were
    really the only solution to the 10x media speed increases, and
    while there was no agreement the discussion polarized between those
    that felt that the current agent technology would enable 100MBit
    and those that did not.

    It was agreed that there would be one droppedFrame counter per
    control entry by default but that for some groups/functions we may
    decide to use a scalar should that prove more appropriate.

    It was agreed that the [ether|tokenRingP]StatsDropEvents would
    continue to exist in RMON2 agents and that its semantics would be
    unchanged.  The following rules define how the fooDropFrame counter
    (from the fooControlEntry) relates to the
    [ether|tokenRingP]StatsDropEvents counter and
    [ether|tokenRingP]StatsPkts counter for the same interface:

     1. For each time the agent recognizes that one or more packets
        have been missed without it knowing exactly how many were
        missed it must increment the dropEvents counter for that
        interface.  This is the only time that the dropEvents counter
        is incremented.

     2. Whenever the agent chooses not to update a table/data
        collection function based on the contents of a packet which it
        knows was present on the network it must increment the
        droppedFrames counter for that table/function.

     3. For all packets which are not lost in (1) above or dropped in
        (2) above the agent must update tables/data collection
        functions accurately.

    Two results of applying these rules are:

     1. The sum of all packet counters in a table or data collection
        function (e.g., the hostOutPkt counter) plus the associated
        droppedFrame counter should be exactly equal to the sum of the
        [ether|tokenRingP]StatsPkts and [ether|tokenRingP]DroppedFrames
        counters for the same data source.  Of course this assumes that
        there are enough resources in the agent such that the table is
        not being LRU'd.

     2. For all agents where the dropEvent counter is zero the sum of
        the droppedFrame and Pkt counters in a given table or function
        on the same interface should be exactly equal to the number of
        packets that there were on the network.

    It was agreed that there should be strong recommendations for RMON2
    agents to utilize the droppedFrame counters as a means of
    accurately reporting the number of frames missed and that if at all
    possible the dropEvents counters should never be incremented -- in
    this way an NMS can use the data with much higher confidence.
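
    The three rules can be sketched with a toy probe.  The Probe class
    and its method names are hypothetical; the sketch only shows how
    the counters are meant to relate when dropEvents stays at zero.

```python
class Probe:
    """Hypothetical probe with one host table on one interface."""

    def __init__(self):
        self.drop_events = 0      # rule 1: unknown number of packets missed
        self.dropped_frames = 0   # rule 2: known packet, table not updated
        self.host_out_pkts = {}   # rule 3: accurate per-host counts

    def missed_unknown_number(self):
        # Rule 1: the only time dropEvents is incremented.
        self.drop_events += 1

    def packet_seen(self, source, can_update=True):
        if not can_update:
            # Rule 2: the agent knows the packet existed but skips it.
            self.dropped_frames += 1
            return
        # Rule 3: every other packet is counted accurately.
        self.host_out_pkts[source] = self.host_out_pkts.get(source, 0) + 1

    def accounted_packets(self):
        return sum(self.host_out_pkts.values()) + self.dropped_frames


probe = Probe()
for i in range(5):
    probe.packet_seen("host-a", can_update=(i < 3))
```

    With dropEvents at zero, accounted_packets() equals the number of
    packets that were on the network, which is the second result
    listed above.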

 B) lastActivationTime

    Proposal to have this object set to sysUpTime at the point in time
    this control row's status transitioned from not active to active.
    This lets the NMS notice that another NMS restarted data collection
    (without picking a new control index) and so deltas will be
    invalid.  It also gives an indication of the age of the table (but
    may not be used to rate the first ever poll -- the data counters
    still do not have to start from zero and so you do not know the
    delta over the interval).

    Agreed to adopt proposal (13 for, 3 abstain, 0 against).  Notice
    that we will decide later which tables and functions to apply this
    to.
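
    A sketch of how an NMS might use the object when computing deltas.
    The function name and sample layout are assumptions; the modulo
    handles the wrap of a 32-bit counter.

```python
def usable_delta(prev, curr):
    """Return the counter delta between two polls, or None if invalid.

    Each sample is a (last_activation_time, counter) pair read from
    the same control row.
    """
    prev_activation, prev_counter = prev
    curr_activation, curr_counter = curr
    if curr_activation != prev_activation:
        # Collection was restarted between the polls, so the two
        # counter values do not belong to the same run.
        return None
    return (curr_counter - prev_counter) % 2**32
```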

 C) lastDeleteTime Elimination

    Discussion -- it was agreed that lastDeleteTime was easy to
    implement, but it was also agreed that it was designed specifically
    for creationOrder which no longer exists.

    Proposal is to replace tableSize and lastDeleteTime with
    insertCount and deleteCount (where insertCount - deleteCount ==
    tableSize).

    Agreed unanimously to adopt.
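
    A small sketch of the relationship between the two counters and the
    table size.  The CountedTable class is hypothetical.

```python
class CountedTable:
    """Hypothetical data table that tracks inserts and deletes."""

    def __init__(self):
        self.rows = {}
        self.insert_count = 0
        self.delete_count = 0

    def insert(self, key, value):
        if key not in self.rows:
            self.insert_count += 1   # only new rows count as inserts
        self.rows[key] = value

    def delete(self, key):
        if key in self.rows:
            del self.rows[key]
            self.delete_count += 1

    @property
    def table_size(self):
        # insertCount - deleteCount == tableSize
        return self.insert_count - self.delete_count
```

    Because both counters only increase, an NMS can also tell that rows
    came and went between polls even when the size is unchanged.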

 D) tableSizeRequested/Granted

    Proposal to implement a maxDesired (i.e., a ceiling) per
    controlEntry.  0 implies consume as much memory as is
    required/available.  > 0 instructs the agent to create at most this
    many data table entries associated with this control entry -- once
    this ceiling is reached the agent should delete old resources
    (associated with this control entry) in order to create new rows.

    Agreed to adopt proposal (16 for, 0 against, 1 abstain).

    Notice that we later had a discussion which suggested a valid use
    of zero would be for the new hostTable where the control entry
    creates both nlHostTable and slHostTable; a user who did not want
    an slHostTable on an interface might use 0 to indicate that.
    Perhaps we should use -1 to imply unlimited rather than zero.
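
    A sketch of the ceiling as originally proposed, with 0 meaning
    unlimited.  It assumes that "old" means least recently updated; the
    minutes do not fix the replacement policy, and the BoundedTable
    class is hypothetical.

```python
from collections import OrderedDict


class BoundedTable:
    """Hypothetical data table limited by a maxDesired ceiling."""

    def __init__(self, max_desired=0):
        self.max_desired = max_desired   # 0 means no ceiling
        self.rows = OrderedDict()        # oldest entry first

    def touch(self, key):
        if key in self.rows:
            self.rows[key] += 1
            self.rows.move_to_end(key)   # mark as most recently updated
            return
        if self.max_desired > 0 and len(self.rows) >= self.max_desired:
            # Ceiling reached: delete the oldest row to make room.
            self.rows.popitem(last=False)
        self.rows[key] = 1


bounded = BoundedTable(max_desired=2)
for host in ["a", "b", "a", "c"]:
    bounded.touch(host)
```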


Seven-Layer topN/history

Agreed to do any kind of topN in addition to the RMON1 stuff (8 for, 0
against, 7 abstain).

Agreed to do slMatrixTopN (7 for, 0 against, 0 abstain).

Marginally agreed to do nlMatrixTopN (5 for, 1 against, 5 abstain).

Agreed to not do slHostTopN and nlHostTopN (1 for, 5 against, 7 abstain
and 0 for, 4 against, 7 abstain respectively).

Agreed not to support TopN by protocol (1 for, 10 against, 4 abstain).

A real proposal bringing together all the best ideas of how to do TopN
on the nl/sl matrix tables is needed -- Steve, Matt and Shay to get
together on producing this proposal.


RMON1 Additions

 1. netUtilization

    etherStats gives you the number of octets seen.  Robin proposes
    that we provide a count of the number of bits and include the
    interpkt gap and the preamble.  This gives you a better
    approximation of utilization.  Bytes seem like a better unit to
    use, since then the counter will not wrap as readily.  It still is
    the same way another analyzer would calculate utilization.  We
    still run the risk that RMON gets compared with these analyzers and
    is not identical.  So the question is, is the etherStatsOctets
    value a good enough approximation to get utilization or do we want
    to provide a new object that counts more of the overhead?  People
    seem to favor just sticking with the original counter and obtaining
    an approximation to utilization for thresholding via Alarms.
    The group voted to use the octets approximation and not add any new
    bandwidth utilization indicators.
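
    The difference between the two approaches can be sketched for
    10 Mbit/s Ethernet.  The function names are hypothetical; the
    per-packet overhead of 20 octets is the 8-octet preamble (with
    start-of-frame delimiter) plus the 12-octet interpacket gap.

```python
ETHERNET_BPS = 10_000_000
OVERHEAD_OCTETS = 8 + 12   # preamble + interpacket gap, per packet


def utilization_octets_only(delta_octets, interval_s, bps=ETHERNET_BPS):
    """Percent utilization from the octets counter alone."""
    return 100.0 * delta_octets * 8 / (interval_s * bps)


def utilization_with_overhead(delta_octets, delta_pkts, interval_s,
                              bps=ETHERNET_BPS):
    """Percent utilization including preamble and interpacket gap."""
    wire_octets = delta_octets + delta_pkts * OVERHEAD_OCTETS
    return 100.0 * wire_octets * 8 / (interval_s * bps)


# One second of minimum-size (64-octet) frames at close to line rate.
pkts = 14880
octets = pkts * 64
low = utilization_octets_only(octets, 1)
full = utilization_with_overhead(octets, pkts, 1)
```

    For small packets the octets-only figure is well below the figure
    that includes overhead; for large packets the two converge.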

 2. filterDescr

    Proposal withdrawn without opposition.

 3. [filter changes]

    Robin to make proposal on the list based on what was discussed at
    the meeting (i.e.  the filterLogicTable with m:1 relation
    reversed).

 4. Control table additions

    The group considered four additions and how they applied to each
    control table:

    (a) insert, delete counters
    (b) maxDesired
    (c) activationTime
    (d) droppedFrames

    EtherStatsTable+TokenRingPStats+TokenRingMLStats
        activationTime, droppedFrames

    HistoryControlTable
        Nothing

    EtherHistoryTable+TokenRingPHistoryTable+TokenRingMLHistoryTable
        droppedFrames

    AlarmTable
        Nothing

    HostControlTable
        maxDesired, activationTime and droppedFrames
        (maxDesired needs note in implementors guide, apparently)

    HostTable/HostTimeTable
        Nothing

    HostTopNControlTable
        Nothing

    HostTopNTable
        Nothing

    MatrixControlTable
        Same as hostControlTable

    MatrixSD/DSTable
        Nothing

    FilterTable/ChannelTable/BufferTable
        Nothing

    EventControlEntry + LogTable
        Nothing

    RingStationControlTable
        activationTime, droppedFrames

    SourceRoutingStatsControlTable
        activationTime, droppedFrames

 5. Storage type

    Steve to propose an object which is per-control row and indicates
    what NVRAM processing an agent has performed on that row (ROM,
    will-write, wont-write, written).  (7 for, 0 against, 4 abstain).

 6. Alarms enhancements

    Make it robust when monitored OID disappears.

    It was agreed that Steve would produce a draft based on an
    alarmValueStatus object which defines whether the agent managed to
    get the value last interval, an alarmValueUnavailable event/trap,
    an alarmUnavailableEventPollThreshold (i.e.  the number of
    unavailable intervals before generating the event).
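
    A sketch of the proposed behaviour.  It assumes that the threshold
    counts consecutive unavailable intervals and that the event fires
    once when the threshold is reached; neither detail is settled in
    the minutes.  The AlarmSampler class is hypothetical.

```python
class AlarmSampler:
    """Hypothetical sampler for one alarm row."""

    def __init__(self, unavailable_threshold):
        # Stands in for alarmUnavailableEventPollThreshold.
        self.unavailable_threshold = unavailable_threshold
        self.value_available = True   # stands in for alarmValueStatus
        self.consecutive_misses = 0
        self.events = []

    def poll(self, value):
        """Record one interval; value is None if the OID was unreadable."""
        if value is None:
            self.value_available = False
            self.consecutive_misses += 1
            if self.consecutive_misses == self.unavailable_threshold:
                self.events.append("alarmValueUnavailable")
            return
        self.value_available = True
        self.consecutive_misses = 0
```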

 7. WAN status bits

    It was agreed that bit6 will be supported in the pktStatus bitmask
    as the packet direction bit.  Further study of bit7 (other physical
    errors) will be done, but this bit needs to be more clearly defined
    before it can be adopted.


User History

Get rid of objectsGranted.  BucketsRequested cannot be changed after the
row goes valid.  Otherwise it stands as is.


Probe Config MIB

In this section OK means that we agreed to do it -- there were no
votes as such, just a call for objections.


  o probeID, probeFirmwareRev, probeHardwareRev OK. Discussion on
    converting probeDateAndTime from ASCII into v2 DateAndTime TC
    (except we will do our own TC which is length 0 or 8 or 11 to allow
    optionality) OK.

  o probeResetControl OK.

  o probeDownloadFile, etc.  (5 for, 0 against, 7 abstain) OK.

  o serialConfigTable:  Agreed to keep serialIP and serialSubnet.

  o The agent might need to implement the two tables that contain line
    speed and flow control objects, but we will try to get away without
    doing it; should it be rejected elsewhere we will have to adopt
    usage of the appropriate serial MIBs instead (charPortTable and
    portTable).

  o Modify serialConfigProtocol slip(1), ppp(2), other(3).

  o Take all modem string DEFVALs and make them comments instead.

  o Rename serialTrapTimeout to serialDialoutTimeout OK.

  o netConfigIpAddress/netConfigSubnetMask OK. Remove netConfigIfSpeed
    and netConfigIfRingNumber.

  o trapDestIndex, trapDestCommunity, trapDestIpAddress, trapDestOwner,
    trapDestStatus OK (8 for, 0 against, 2 abstain).

  o serialConnect:  index, dest ip, connection type (direct, modem,
    switch, switch-and-modem), dial/connect strings, owner, status.
    OK.



Dynamic Protocol Discovery

Populate the protocol directory with a sensible set of protocols at
startup.  The agent could then add protocols that it discovers on the
net.  The assumption is that the agent was capable of decoding those
protocols all along, but there is such a large set of them and they may
never appear on the net.  It is OK that these added protocols grow more
than a single level, as originally thought.  It is up to the probe
whether to turn it on or not for collection.  How do we document these
in the protocol document and provide options for these fields?

Further discussion on the mailing list is needed.


Channel as dataSource

There was a vote on how many people violently object to using channel as
data source (2).

Those that wanted the standard to be changed to mandate that an agent
must allow channel as data source (3).

Those that want to leave the standard as is (and accept that there will
continue to be proprietary extensions) and that the behaviour of any
other kind of data source value is undefined (11).

We voted to modify the text to state that ifIndex is the only
recognized dataSource that all agents should support, but that other
values are not illegal -- just considered out of scope.