Network Working Group                                      C. Partridge
Request For Comments: 1021                                     BBN/NNSC
                                                            G. Trewitt
                                                              Stanford
                                                          October 1987

            THE HIGH-LEVEL ENTITY MANAGEMENT SYSTEM (HEMS)

STATUS OF THIS MEMO

  An overview of the RFCs which comprise the High-Level Entity
  Management System is provided.  This system is experimental, and is
  currently being tested in portions of the Internet.  It is hoped that
  this work will help lead to a standard for IP internetwork
  management.  Distribution of this memo is unlimited.

INTRODUCTION

  Until recently, a majority of critical components in IP networks,
  such as gateways, have come from a very small set of vendors.  While
  each vendor had their own set of management protocols and mechanisms,
  the collection was small, and a knowledgeable system administrator
  could be expected to learn them all.

  Now, however, the number of vendors has grown quite large, and the
  lack of an accepted standard for management of network components is
  causing severe management problems.  Compounding this problem is the
  explosive growth of the connected IP networks known as the Internet.
  The combination of increased size and heterogeneity is making
  internetwork management extremely difficult.  This memo discusses an
  effort to devise a standard protocol for all devices, which should
  help alleviate the management problem.

  The RFCs that currently define the High-Level Entity Management
  System are this memo along with RFC-1022, 1023, and 1024.  This list
  is expected to change and grow over time, and readers are strongly
  encouraged to check the RFC Index to find the most current versions.

MONITORING AND CONTROL

  Historically, the IP community has divided network management into
  two distinct types of activities: monitoring and control.  Monitoring
  is the activity of extracting or collecting data from the network or
  a part of the network to observe its behavior.  Control is the
  activity of taking actions to effect changes in the behavior of the
  network or a part of the network in real time, typically in an
  attempt to improve the network's performance.




Partridge & Trewitt                                             [Page 1]

RFC 1021                     HEMS Overview                  October 1987


  Note that the ability to control presupposes the ability to monitor.
  Changing the behavior of the network without being able to observe
  the effects of the changes is not useful.  On the other hand,
  monitoring without control makes some sense.  Simply understanding
  what is causing a network to misbehave can be useful.

  Control is also a more difficult functionality to define.  Control
  operations, other than the most generic, are usually device-specific.
  The problem is not just a matter of providing a mechanism for
  control, but also defining a set of control operations which are
  generally applicable across a diverse set of devices.  Permitting
  remote applications to exercise control over an entity also implies
  the need for a suite of safeguards to ensure that unauthorized
  applications cannot harm the network.

  Because monitoring is the key first step, in this initial design of
  the system, the authors have concentrated more heavily on the
  problems of effective monitoring.  Although the basic control
  mechanisms are defined, many components needed for control, such as
  strong access control mechanisms, have not been fully defined.

OVERVIEW OF THE HEMS

  The HEMS is made up of three parts: a query processor which can
  reside on any addressable entity, an event generator which also
  resides on entities, and applications which know how to send requests
  to the query processor and interpret the replies.  The query
  processor and applications communicate using a message protocol which
  runs over a standard transport protocol.

The Query Processor

  The query processor is the key to the management system.  It
  interprets all monitoring and control requests.  For optimal network
  management, we would like to see query processors on most network
  entities.

  To encourage the implementation of query processors, one of the
  primary goals in designing the query processor was to make it as
  small and simple as possible, consistent with management
  requirements.

  Defining the management requirements was no small task, since the
  networking community has not yet reached a consensus about what kinds
  of monitoring information should be available from network entities,
  nor what control functions are required to properly manage those
  entities.  The standards for HEMS were developed through discussions
  with several interest groups, and represent the authors' best effort
  to distill the varying sets of needs.

  The authors settled on a system which was extensible, robust and
  host-architecture independent, and as simple as possible, consistent
  with the other goals.  Extensibility was essential because it is
  clear that management needs will continue to evolve, and a closed
  system which could not be changed would be obsolete almost as soon as
  it was defined.  Unfortunately, extensibility is also the requirement
  least consistent with simplicity since the need to make the system
  extensible led the authors to use self-describing data formats and an
  interpreted query language.

  A robust system is required if the system is to be useful for
  diagnosing network failures.  If the monitoring system cannot survive
  at least moderate network failures, it is not useful.

  The query processor is designed to be highly extensible.  An
  application sends the query processor instructions about objects to
  be examined or changed.  The query processor locates the objects in
  its host entity, and performs the requested operations.  The objects
  are self-describing, using the binary-encoding scheme defined in ISO
  Standard ASN.1.  Care has been taken to use a limited subset of the
  ASN.1 encodings, so that the query processor's handling of data can
  be optimized.
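
  As an illustration of what "self-describing" means here, the
  following Python sketch encodes and decodes values in the
  tag-length-value style used by the ASN.1 basic encoding.  It is a
  sketch under stated assumptions, not the HEMS encoding itself: it
  handles only short-form lengths (under 128 octets) and two universal
  types, INTEGER (tag 0x02) and OCTET STRING (tag 0x04).  The function
  names are illustrative; the subset HEMS actually uses is defined in
  the companion RFCs.

```python
# Tag-length-value sketch in the style of the ASN.1 basic encoding.
# Assumptions: short-form lengths only (< 128 octets), two types only.
from typing import Tuple, Union

TAG_INTEGER = 0x02
TAG_OCTET_STRING = 0x04


def encode_tlv(tag: int, content: bytes) -> bytes:
    """Prefix content with a one-octet tag and a short-form length."""
    if len(content) >= 128:
        raise ValueError("sketch handles short-form lengths only")
    return bytes([tag, len(content)]) + content


def encode_integer(value: int) -> bytes:
    """Encode a signed integer in the fewest two's-complement octets."""
    magnitude = value if value >= 0 else ~value
    length = magnitude.bit_length() // 8 + 1  # leaves room for the sign bit
    return encode_tlv(TAG_INTEGER, value.to_bytes(length, "big", signed=True))


def decode_tlv(data: bytes) -> Tuple[int, bytes, bytes]:
    """Split one value off the front; return (tag, content, remainder)."""
    if len(data) < 2:
        raise ValueError("need at least a tag and a length octet")
    tag, length = data[0], data[1]
    if length >= 128:
        raise ValueError("sketch handles short-form lengths only")
    content = data[2:2 + length]
    if len(content) != length:
        raise ValueError("truncated content")
    return tag, content, data[2 + length:]


def decode_value(tag: int, content: bytes) -> Union[int, bytes]:
    """Interpret content according to its tag; no outside schema needed."""
    if tag == TAG_INTEGER:
        return int.from_bytes(content, "big", signed=True)
    if tag == TAG_OCTET_STRING:
        return content
    raise ValueError("tag not covered by this sketch")
```

  Because each value carries its own tag and length, a receiver can
  skip over a value it does not understand and continue with the next
  one, which is what lets entity-specific data coexist with the
  standard set.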

  It is a key feature of HEMS that messages to the query processor
  contain multiple instructions.  The authors felt that this would give
  much higher performance than a remote procedure system which limited
  an application to one operation per message.
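
  The performance argument can be sketched in Python.  The instruction
  format below (an operation name, an object name, and an argument) is
  hypothetical and is not the HEMS query language, which is defined in
  the companion RFCs; the object names are also invented.  The sketch
  only shows one message carrying several instructions and the
  resulting difference in the number of exchanges.

```python
# Hypothetical batched-instruction sketch; not the HEMS query language.
from typing import Any, Dict, List, Tuple

Instruction = Tuple[str, str, Any]  # (operation, object name, argument)


def process_message(objects: Dict[str, Any],
                    instructions: List[Instruction]) -> List[Tuple[str, Any]]:
    """Interpret every instruction in one message; return one reply list."""
    replies: List[Tuple[str, Any]] = []
    for operation, name, argument in instructions:
        if name not in objects:
            replies.append(("error", name))       # unknown object
        elif operation == "GET":
            replies.append((name, objects[name]))
        elif operation == "SET":
            objects[name] = argument
            replies.append((name, argument))
        else:
            replies.append(("error", operation))  # unknown operation
    return replies


def round_trips(instruction_count: int, per_message: int) -> int:
    """Messages needed when each carries at most per_message instructions."""
    return -(-instruction_count // per_message)  # ceiling division
```

  Under these assumptions, ten operations cost ten exchanges when each
  message carries one instruction, and a single exchange when one
  message carries all ten.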

  The set of maintained objects is standardized across all entities.
  Every entity is required to manage a small set of objects.  In
  addition, entities of a particular type (e.g., a gateway) may be
  required to manage a larger set of objects, which are optional on
  other entities.  Entities are also permitted to make additional,
  entity-specific objects available to applications.  A method for
  discovering the existence of additional objects is defined.
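
  The three tiers of objects described above can be sketched as set
  operations in Python.  All object names and the "gateway" entry are
  invented for illustration, and listing the extra names is only a
  stand-in for the discovery method that the specifications define.

```python
# Sketch of required, type-specific, and entity-specific object sets.
# Every object name here is hypothetical.
from typing import Dict, List, Set

REQUIRED_ON_ALL: Set[str] = {"system.name", "system.uptime"}
REQUIRED_BY_TYPE: Dict[str, Set[str]] = {
    "gateway": {"routing.table", "interface.counters"},
}


def required_objects(entity_type: str) -> Set[str]:
    """Objects an entity of this type must manage."""
    return REQUIRED_ON_ALL | REQUIRED_BY_TYPE.get(entity_type, set())


def missing_objects(entity_type: str, provided: Set[str]) -> List[str]:
    """Required objects that the entity fails to provide."""
    return sorted(required_objects(entity_type) - provided)


def additional_objects(entity_type: str, provided: Set[str]) -> List[str]:
    """Entity-specific objects beyond the required set."""
    return sorted(provided - required_objects(entity_type))
```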

  The combination of self-describing data, the ability to add to the
  standard data set, and a query language which can be easily enhanced
  appeared to offer the necessary extensibility.

Event Generator

  On many network entities, particularly critical network components
  such as gateways, it is necessary to have a way for the devices to
  send unsolicited status messages to network management centers.  In
  the IP community, these messages have historically been referred to
  as "traps", but for compatibility with the ISO nomenclature, in the
  HEMS they are called "events".

  In the HEMS, events are handled as slightly specialized replies to
  queries, and are sent to one or more management centers.  Like all
  other HEMS messages, events are encoded in ASN.1.

  Each event is given a well-known code, which is standardized across
  all entities.  Provision is also made for entity-specific event
  codes.

Applications

  The HEMS expects that applications will be more intelligent than the
  query processor.  Among other functions, the applications will have
  to be able to identify and parse entity-specific values which may be
  returned.

  The details of applications are largely not discussed in the HEMS
  specifications because there is very little that needs to be
  standardized.  Applications must send requests using the protocols
  discussed in the next section, but the interfaces the applications
  provide for displaying monitoring or control information are entirely
  application dependent.

Protocols

  Query processors and applications communicate using an application-
  specific monitoring protocol, the High-Level Entity Management
  Protocol (HEMP).  This protocol provides the formatting rules for the
  queries and their replies.

  HEMP runs over a standard transport protocol.  There was a certain
  amount of debate in the community about what type of transport
  protocol was best suited for monitoring.  The key issue was how
  reliable monitoring interactions needed to be.

  The authors expect that three types of management activities will
  predominate: status monitoring, firefighting, and event reporting.

  Status monitoring is envisioned as occasional retrieval of monitoring
  information, possibly in response to the receipt of event messages.
  In these situations, the network is expected to be in good working
  condition, and monitoring exchanges could probably comfortably work
  with an unreliable transport protocol.  The chance of data loss is
  small, and probably not a serious problem since the data is unlikely
  to be so important that it must be reliably delivered.  (However, it
  should be noted that some applications may prefer reliable delivery
  because it is more convenient.)

  Firefighting is a completely different situation.  In this scenario,
  one or more sites are using management applications to try to locate
  and fix a network problem.  Here we must assume that while the
  network functions (i.e., data can get through), it is not very
  healthy.  We should assume that packets are being lost and that
  network routes will be non-optimal.  At the same time, it is
  essential that the monitoring data (which is presumably diagnostic)
  get back to the application and that control requests are reliably
  delivered to the entity.  In such circumstances, a reliable protocol
  is required.

  Events provide yet another bit of complexity.  Events contain useful
  status information, but experience suggests that this information
  does not have to be delivered reliably.  If the problem is serious
  enough, it will recur and the event will be sent again.
  Furthermore, events will often be sent to more than one management
  center, which would appear to preclude the use of connection-
  oriented, reliable protocols such as TCP for events.

  The current decision has been to establish two possible transport
  options for HEMS.  More experimental systems may use the Versatile
  Message Transaction Protocol (VMTP), an experimental IP transaction
  protocol.  Near term production systems can use a combination of the
  Transmission Control Protocol (TCP) and the User Datagram Protocol
  (UDP), as described in RFC-1022.
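
  The reasoning in this section can be summarized as a small Python
  sketch.  The mapping of activities to TCP or UDP is one reading of
  the discussion above and is an assumption; RFC-1022 is the authority
  on how the two protocols are actually combined.  The send_event
  function and its parameters are likewise illustrative; it accepts
  any object with a sendto method, such as a datagram socket.

```python
# Sketch: choosing a transport per activity and fanning out an event.
# The mapping below is an assumption drawn from the discussion, not a
# rule quoted from the HEMS specifications.
from typing import Iterable, Tuple

Address = Tuple[str, int]  # (host, port)

NEEDS_RELIABLE_DELIVERY = {
    "status monitoring": False,  # healthy network, loss is tolerable
    "firefighting": True,        # diagnostics and control must arrive
    "event reporting": False,    # a serious problem raises the event again
}


def choose_transport(activity: str) -> str:
    """Return "TCP" when the activity needs reliable delivery, else "UDP"."""
    return "TCP" if NEEDS_RELIABLE_DELIVERY[activity] else "UDP"


def send_event(sock, centers: Iterable[Address], payload: bytes) -> int:
    """Send one copy of an event datagram to each management center."""
    sent = 0
    for center in centers:
        sock.sendto(payload, center)  # connectionless, so no per-center setup
        sent += 1
    return sent
```

  With a real datagram socket, created for example with
  socket.socket(socket.AF_INET, socket.SOCK_DGRAM), the same socket
  can address every management center in turn, which is the property
  that a connection-oriented protocol lacks.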

Compatibility with Common Management Information Protocol (CMIP)

  Several groups have expressed interest in being able to develop
  applications which can use both HEMS and the emerging ISO-defined
  Common Management Information Protocol (CMIP).  It turns out that
  such a co-existence is feasible, and the authors have made an effort
  to accommodate it.

  At the highest level, both CMIP and HEMS perform operations on
  objects stored in remote entities, and both systems use ASN.1
  formatting to represent those objects.  This makes it possible to
  develop a standard set of interface routines which can be used to
  access either system, even though the underlying mechanics of both
  are quite different.
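
  The idea of a common set of interface routines can be sketched in
  Python as an abstract interface with interchangeable bindings.  The
  class and method names are hypothetical, and the in-memory binding
  merely stands in for a real HEMS or CMIP binding; neither protocol's
  actual operations are modeled here.

```python
# Sketch of protocol-independent interface routines.
# All names are hypothetical; no real HEMS or CMIP operations are modeled.
from abc import ABC, abstractmethod
from typing import Any, Dict


class ManagementBackend(ABC):
    """Operations an application needs, independent of the protocol."""

    @abstractmethod
    def get(self, name: str) -> Any:
        """Fetch the value of a named object from the remote entity."""

    @abstractmethod
    def set(self, name: str, value: Any) -> None:
        """Change the value of a named object on the remote entity."""


class InMemoryBackend(ManagementBackend):
    """Stand-in binding that keeps its objects in a local dictionary."""

    def __init__(self, objects: Dict[str, Any]) -> None:
        self._objects = dict(objects)

    def get(self, name: str) -> Any:
        return self._objects[name]

    def set(self, name: str, value: Any) -> None:
        self._objects[name] = value


def copy_object(source: ManagementBackend, target: ManagementBackend,
                name: str) -> None:
    """Application code written once against the common interface."""
    target.set(name, source.get(name))
```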









