Minutes of the
       IP Performance Metrics Session of the
       Benchmarking Methodology Working Group

Reported by
       Paul Love and Guy Almes

1. Overview and Agenda Bashing

The meeting was chaired by WG co-Chair Guy Almes, and was very well-attended.
This was the second IPPM session held in Europe, the first having been in
Stockholm in summer 1995.

Proposed agenda:
       Charter Issues
       Revised Framework Document
       Early Experience with One-way Delay & Packet Loss
       Revised Treno Document
       Interest Talk: Raj Bansal, Nokia Research Center
       Interest Talk: Martin Horneffer, Univ of Cologne

Despite the traditional invitation to agenda bashing, no changes were
suggested.

2. Charter Issues: Guy Almes and Jamshid Mahdavi
[slides included in proceedings]

The chair reviewed the history of the IPPM effort, beginning with an IPPM
BOF session at the Danvers (spring 1995) meeting, the inclusion of IPPM as
a separate effort within the Benchmarking Methodology working group and the
Operational Requirements Area, and progress to date.

For several reasons, it is now suggested that we reorganize as a separate
IPPM working group within the Transport Area.  One reason is the increasing
awareness of the close relation between IP path performance metrics and
transport-level dynamics.

In order to form a new working group, we need a charter.  The chair asked
Jamshid Mahdavi to draft such a charter, since he had been active in IPPM
since the Danvers BOF and, together with Matt Mathis, had given thought to
such a charter then.  Jamshid's draft, with edits along the way, has been
reviewed by the Transport Area ADs and circulated on the IPPM list.  This
edited draft, shown paragraph by paragraph on the overhead projector, was
then reviewed by the working group.

Guy noted that we were about 50% through our "to do" list, just as we're
becoming a standalone WG.  Services peripheral to layer-3 IP services, such
as NOC/NIS services, will fall outside IPPM's scope.

Scott Bradner: We are encouraged to issue the framework document as an
Informational RFC as soon as possible.

Discussion then ensued on the meaning of the word "standard" and its usage in
our charter.  This is a difficult issue, and was not resolved during the
meeting.  One view is that our metrics should be documented as Informational
RFCs, though with a process within the working group that follows the spirit
of the IETF standards process.  Another view is that the metrics should
follow the IETF standards process as Proposed, then Draft, then (full)
Standard RFCs.

Questions from the floor asked for clarification on the relationship
between IPPM and OpStat, RTFM, and BMWG.
Briefly, the OpStat WG concluded some time ago, and was focused on statistics
meaningful to providers (from within the cloud); IPPM focuses on metrics
meaningful to users (from outside the cloud).  RTFM focuses on the nature of
flows rather than on the performance 'seen' by users, but it is very much
acknowledged that RTFM and IPPM have much to share with each other.  The
remaining BMWG charter should be revised to clarify that IPPM issues fall
outside it.

Based on several comments, the draft charter will be further revised and
is expected to result in the formal establishment of an IPPM working group
shortly.

3. Revisions to the Framework Document: Vern Paxson
[slides included in proceedings]

After an initial version was reviewed at the Montreal (summer 1996) meeting
and major revisions were made at the San Jose (fall 1996) meeting, the
Framework document is now close to a final editing pass.  Vern Paxson
presented the changes in the current draft.

By far the most important change is a new section on the criteria and
process leading to 'official status' for a given metric.
When a protocol is made a standard, it is generally required that there be
two independent interoperable implementations.  Following this spirit, the
Framework draft now calls for either two methodologies, each with at least
one independent interoperable implementation, or (especially in cases where
there is one preferable methodology) one methodology with at least two
independent interoperable implementations.
By 'independent', we stress the value of implementations that proceed from
the documented metric rather than having shared code or even shared
implementation orientations not based on the documented metric.
By 'interoperable', we stress the value of multiple implementations whose
measurements are consistent with each other; verifying this is understood
to be hard, since it is difficult to present two different implementations
with identical network conditions.
In keeping with the draft, we considered the notion of rough consensus
within the working group.  Scott Bradner pointed to the difficulty in doing
this for metrics proposed after the working group concludes.  Some suggested
that we work to develop an environment in which tests can be run repeatedly
to see whether the results are statistically the same.  This would add
rigor, but at the cost of considerable effort.  And, even then, what
constitutes a meaningful difference, and why?
After further discussion made clear the difficult tradeoffs involved in
these issues, the chair proposed that this section of the Framework be
reworded as a discussion of the considerations and issues, deferring
detailed documentation of the right process to a future revision, when
experience with a few specific metrics will better inform our understanding
of the general process.

The other revisions to the Framework draft were more minor, but did result in
helpful suggestions to the authors.  Among these revisions:
- Minor revisions to the notion of well-formed/standard-formed packets
- Comments on the applicability of the notion of wire-time measurements
Among the remaining work to be done:
- An appendix on computing goodness of fit using the Anderson-Darling test
- A simple test for correlation
- Adding a table of contents
- Possible new material on multi-path phenomena
- Perhaps define clouds as directed graphs
- Perhaps define broadcast networks as sets of links vs. meshes

Vern will do these edits so that we will be able to get the Framework issued
as an Informational RFC prior to the Washington IETF meeting.

4. Early Experience with One-way Delay and Packet Loss:
   Guy Almes and Sue Hares
[slides by Almes included in proceedings]

Guy Almes reviewed the One-way Delay and Packet Loss metrics, with particular
attention to the Singleton/Sample/Statistic structure and the importance of
Poisson processes for timing the various Singleton tests within a Sample.

Given this review, he presented some early experience from work done at
Advanced Network & Services with the Common Solutions Group of universities.
One important point (though clumsily made on the slides) was that useful
measurements of Packet Loss might call for a different value of lambda than
corresponding measurements of One-way Delay.  For example, on a high-speed
wide-area network, one might want an accuracy of plus-or-minus 1% and over
time periods of no more than one minute.  This would call for values of
lambda of 2/sec or greater.  This high value of lambda might be much more
than needed for useful measurements of one-way delay on the same
network.
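
As a rough sketch of the arithmetic behind that figure (an illustration,
not part of the reported work): if the finest nonzero loss fraction
measurable from n probes is 1/n, then a resolution of plus-or-minus 1%
over a one-minute window needs at least 100 probes.

   # Sketch only: minimum Poisson rate needed so that a loss estimate
   # over a short window has the desired resolution (assumes the finest
   # nonzero measurable loss fraction with n probes is 1/n).
   def min_lambda(resolution=0.01, window_seconds=60.0):
       probes_needed = 1.0 / resolution         # 100 probes for +/- 1%
       return probes_needed / window_seconds    # probes per second

   print(min_lambda())   # ~1.67/sec, hence the "2/sec or greater" above
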
The reported work uses GPS antennae at both source and destination to allow
one-way delays to be measured to accuracies of a few tens of microseconds.
The current implementation runs in user space, but modern PC-based systems
are fast enough for this to yield useful data.  Pushing the implementation
into the kernel is planned eventually, and will allow the error bars to be
tightened.
The measurement infrastructure now being deployed by Advanced Network &
Services in cooperation with the Common Solutions Group uses 200-MHz PC-based
machines at user sites.  Ongoing sets of measurements, conforming to the
current draft One-way Delay and Packet Loss metrics, are performed with a
lambda of 4/sec.  The results are uploaded to an Oracle database.  Users
will then query this database with web-based tools.  Seven sites are now
operational.
More information can be found at http://www.advanced.org/csg-ippm/
Among the early results, the most interesting is that, even when the path
between two sites is symmetric, patterns of one-way delay and packet loss
are decidedly asymmetric.

Daniel Karrenberg reported that the RIPE NCC has started a project to
measure One-way Delay, and that they plan to deploy test boxes throughout
Europe.

Sue Hares reported on IPPM-related work at Merit.  The IPMA Project is
funded by NSF, UM/Merit, and CAIDA.  This work leverages Merit's placement
of measurement-capable machines at key exchange points.
Some of the most interesting results to date relate to pathological
behavior of BGP implementations within routers at the exchange points.
This work will in the future allow Merit to contribute to IPPM work on
routing stability.
They are also implementing one-way delay, round-trip delay, and packet
loss on paths both across backbones and between pairs of routers at
exchange points.
More information can be found within the IPMA section of
http://www.merit.edu/

Christian Huitema noted that it might be difficult for the receiver of a
stream of test packets to have accurate knowledge of how many packets
were sent if all the most recent of them have been dropped.  He sees this
as one weakness of the technique of using a Poisson process to define
such streams.  Guy noted that one way to solve this problem would be to
have the sender and receiver both know the complete schedule of times at
which the packets would be sent.  This can be done, for example, by both
knowing the seed of the pseudo-random number generator used to define
the Poisson process.
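
A minimal sketch of that idea (an assumed illustration, not the
implementation discussed): both ends derive the complete schedule of send
times from a shared pseudo-random seed, so the receiver can tell how many
packets should have arrived even when the most recent ones were all
dropped.

   import random

   def poisson_schedule(seed, lam, duration):
       """Send times (seconds) for a Poisson process of rate lam over
       [0, duration), derived deterministically from the shared seed."""
       rng = random.Random(seed)
       t, times = 0.0, []
       while True:
           t += rng.expovariate(lam)     # exponentially distributed gaps
           if t >= duration:
               return times
           times.append(t)

   # Sender and receiver compute identical schedules from the shared seed.
   assert poisson_schedule(42, 2.0, 60.0) == poisson_schedule(42, 2.0, 60.0)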

5. Recent work on TReno and Bulk Transfer Capacity: Matt Mathis
[slides included in proceedings]

A new draft of the Empirical Bulk Transfer Capacity Internet Draft was
issued recently.  The key improvements concern how to use it to measure
end-to-end performance.

The draft has benefited from some research work done by Matt with
Jeffrey Semke, Jamshid Mahdavi, and Teunis Ott, which resulted in a recent
Computer Communication Review paper on macroscopic TCP performance behavior.
More information on the CCR paper can be found at:
http://www.psc.edu/networking/papers
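
As a hedged illustration of the macroscopic model in that paper (the
constant and the example values below are assumptions for this sketch),
sustained TCP throughput scales roughly as the segment size divided by the
product of the round-trip time and the square root of the loss rate.

   from math import sqrt

   def tcp_throughput(mss_bytes, rtt_seconds, loss_rate, c=sqrt(1.5)):
       """Approximate bulk throughput (bytes/sec): (MSS/RTT) * C/sqrt(p)."""
       return (mss_bytes / rtt_seconds) * c / sqrt(loss_rate)

   # Example: 1460-byte segments, 70 ms RTT, 1% loss -> roughly 255 KB/sec.
   print(tcp_throughput(1460, 0.070, 0.01))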

In recent work, Matt has noticed that flow performance over wide-area
high-speed and usually low-packet-loss networks can often be degraded
by the apparent loss of an entire window every so often.  It has been
suggested that these losses may be related to route churn.

6. Multimedia over Internet: Performance in the absence of QoS: Raj Bansal
[slides included in proceedings]

Bansal and his colleagues at Nokia are concerned with the performance of
telephony over IP and similar real-time services on the Internet.

He noted that preconceived ideas of delay, formed by experience with
conventional telephony, may be neither right nor sufficient for the
Internet.  Also, delay and packet loss are not independent for real-time
applications; one may be traded off against the other.

He reviewed a graph of the tail of a distribution of delay vs packet loss,
and noted that in real-time applications, after a point it is no longer worth
waiting for a delayed packet.

Non-traditional trade-offs may make sense.  The entire delay distribution,
rather than a simple summary statistic such as the median, may be a more
useful thing to look at.  In discussion, Christian Huitema described
similar tradeoffs: one must often look at the time sequence of packet-loss
events rather than at simple percentages, since losses of successive
packets are not independent events.
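
One way to make the point concrete (a sketch with assumed numbers, not
material from the talk): treat any packet arriving after the application's
playout deadline as lost, so the effective loss rate depends on both the
drop rate and the tail of the delay distribution.

   def effective_loss(delays, deadline):
       """Fraction dropped (None) or arriving after the deadline."""
       late_or_lost = sum(1 for d in delays if d is None or d > deadline)
       return late_or_lost / len(delays)

   # One-way delays in seconds; None marks a dropped packet.
   sample = [0.040, 0.055, None, 0.200, 0.045, 0.610, None, 0.050]
   print(effective_loss(sample, deadline=0.150))   # 0.5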

7. Experience with Tests of Connectivity: Martin Horneffer
[slides included in proceedings]

Martin presented a talk on his implementation of loss/RTT metrics (using
UDP echo-packets) and of his experiences with large-scale loss/RTT
measurements.  The measurements were from Europe to about 1000 hosts in
the Internet via two different providers.

The work is interesting for several reasons.  First, it includes the
first implementation of the Connectivity metric.  Second, it does not
rely on cooperation from remote hosts other than support for ICMP
and UDP Echo.  The program, JPING, is implemented in Java, and makes
effective use of tcpdump and a perl script to analyze the results.
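
As a minimal sketch of such a probe (an assumption for illustration, not
JPING itself): send a datagram to a remote host's UDP echo service and
time the reply, treating a timeout as a lost probe.

   import socket, time

   def udp_echo_rtt(host, timeout=2.0, port=7):
       """RTT in seconds via the UDP echo service, or None if lost."""
       sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
       sock.settimeout(timeout)
       start = time.time()
       try:
           sock.sendto(b"ippm-probe", (host, port))
           sock.recvfrom(1024)
           return time.time() - start
       except socket.timeout:
           return None
       finally:
           sock.close()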