Minutes of the LDUP WG Meeting, 44th IETF (Minneapolis, MN)
Thursday, March 18, 1999

Minutes reported by John Strassner

Agenda:
------------

1. Introduction and agenda bashing        John
2. Requirements draft update              Russ Weiser
3. Architecture and Model update          John Merrells
4. Replication Transfer Protocol          Ellen Stokes
5. Conflict Detection and Resolution      Steven Legg


Comments from Keith Moore:
-----------------------------------------
(this one item is out of place as to when it happened, but contains
general information, and so is put here for convenience)

Keith: the IESG frowns on publishing an informational document that is
called a "Requirements" document. Perhaps the title should be "Design
Considerations". (Chairs and author agree to change the name to Design
Considerations.) Keith also recommends that we socialize LDUP with the
other WGs in APPS that are using LDAP (at least 12 now). Chairs agree,
and we will designate one or more people to do so.

Russ Weiser - Requirements draft update
---------------------------------------------------------

Changes made:  clarification of terminology, addition of an
applicability statement, added requirements references to scenarios,
and clarifications to the document that address previous comments. The
following are comments to the draft:

Are attribute and object resurrection requirements? (reference 5.1.6).
Russ posits that they still are requirements. In the archive, Mike from
Xerox brought up a synchronization scenario involving two Palm Pilots
and synchronizing their bank accounts. Russ thinks that this is out of
scope. Mike responds that the applicability statement should be
broadened: the existing examples are really more about
administration, and don't characterize the nature of the application.
General agreement on this, Russ to revise.

Need a definition of what a conflict is. This may belong in this
draft, and also possibly in the model document. (The current draft
talks about conflicts, but doesn't define them.) Russ to include.

Need more words about partial replication. This will be followed up in
email discussions on the list.

Need more clarification on naming context. Naming contexts don't
really overlap, unless the root of one resides in the other. What you
would like to do is to cache the mount point of the subordinate in the
superior's naming context. In terms of DNs, it is in the tree, but in
terms of where it resides on servers, it could be a reference to
another server.

What is the distinction between transient and looser consistency?
Transient will settle out over time more quickly than "looser" will.
In addition, transient will reach a converged state faster than looser
will. In transient, you have global knowledge whereas in looser, you
don't. Global knowledge helps convergence: you can identify when an
update has finished so you can purge it, whereas in looser you may
purge it before the update has reached everywhere it wants to go.
(This applies to purging death warrants.)
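
As an illustration of the purge rule in the transient case, the check
below allows a death warrant to be purged only once every known replica
has reported seeing it. This is a minimal sketch, not taken from any
draft: the function name, the integer stand-in for a CSN, and the
seen_by_replica map are all assumptions made for the example.

```python
def can_purge(death_warrant_csn: int, seen_by_replica: dict) -> bool:
    """Return True only when every known replica has seen the death warrant.

    seen_by_replica is a hypothetical map from replica ID to the highest
    CSN that replica is known to have applied (the "global knowledge").
    """
    if not seen_by_replica:
        # With no knowledge of the other replicas, purging is never safe.
        return False
    return all(seen >= death_warrant_csn for seen in seen_by_replica.values())


everyone_caught_up = can_purge(5, {"replica-a": 5, "replica-b": 7})
one_replica_behind = can_purge(5, {"replica-a": 5, "replica-b": 4})
```

In the looser case there is no complete seen_by_replica map, so a check
of this kind cannot be made and a purge may happen too early.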

Mike has additional comments, but will send them to the list for
further discussion.

Goal: Russ to have an update out in 1-2 weeks.

Note from the chairs: Please everyone, focus your attention on this
draft. We need to move it along to help progress the others. This
should go into Last Call before Oslo.

John Merrells - Architecture and Model
-----------------------------------------------------

Status is that it is being moved into the wg namespace from the
individual namespace. This draft is, of course, affected by different
drafts as people start to implement these ideas. The list of topics
that affect the model includes replay ordering, CSN-TSR/TRS/TSRS,
Partial Replicas, Lost+Found entry, replication management, and the
high-level URP specification.

The draft supports out-of-order replay. If you put more constraints on
the ordering of updates, then you don't need any update resolution.
But this is impractical, hence the need to support out-of-order replay.

The sequence number is really serving two purposes: making time more
granular (to help with ensuring that order is correct) as well as
ensuring the atomicity of the primitives in LDAP operations (e.g.,
another primitive should not be able to be inserted between them). The
problem is that TSR
is natural for replication, and TRS is natural for state systems.
Thus, we are considering TSRS, which has the time split up as
described above. We have in fact converged on TSRS. We need to hear
from the list if this is OK.
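
As an illustration, a TSRS-style CSN can be sketched as a four-field
value compared field by field. The minutes do not spell out the layout,
so the field names, the integer types, and the reading of TSRS as time,
sequence number, replica ID and a trailing sub-sequence number for the
primitives of one operation are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class CSN:
    """Hypothetical TSRS change sequence number; fields compare in order."""
    time: int        # coarse timestamp of the operation
    seq: int         # orders operations that share one timestamp
    replica_id: int  # breaks ties between replicas
    subseq: int      # numbers the primitives inside one operation


# Two primitives of one operation on replica 1, and one from replica 2,
# all carrying the same timestamp and sequence number.
a0 = CSN(time=100, seq=0, replica_id=1, subseq=0)
a1 = CSN(time=100, seq=0, replica_id=1, subseq=1)
b0 = CSN(time=100, seq=0, replica_id=2, subseq=0)

ordered = sorted([b0, a1, a0])
```

Because the replica ID sorts ahead of the sub-sequence number in this
sketch, a0 and a1 stay adjacent and b0 cannot fall between them.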

Partial replicas are proving to be troublesome. We need to address
them in detail (we have been trying to ignore them so far). By taking
this on, we now are in effect considering sparse replication. But the
problem is that entries move in and out of the replication scope just
as a matter of fact of LDAP operations. So we need to track entry
movement, which greatly complicates things. Moved to the list for more
discussion.
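
To make the entry-movement problem concrete, the sketch below classifies
a change by comparing whether the entry was in the replication scope
before and after the operation. It is only an illustration: the function
names, the dictionary representation of an entry, and the example DNs
are assumptions, not anything agreed by the group.

```python
def classify_change(before, after, in_scope):
    """Classify one change from the point of view of a partial replica.

    before/after are the entry states around the operation (None if the
    entry did not exist); in_scope is a hypothetical scope predicate.
    """
    was_in = before is not None and in_scope(before)
    now_in = after is not None and in_scope(after)
    if was_in and now_in:
        return "modify"   # stays inside the replica
    if now_in:
        return "add"      # moved into scope: replica must gain the entry
    if was_in:
        return "remove"   # moved out of scope: replica must drop the entry
    return "ignore"       # never visible to this replica


def in_sales(entry):
    # Example scope: everything under a made-up ou=sales subtree.
    return entry["dn"].endswith("ou=sales,o=example")


old = {"dn": "cn=ann,ou=sales,o=example"}
new = {"dn": "cn=ann,ou=eng,o=example"}
moved_out = classify_change(old, new, in_sales)
moved_in = classify_change(new, old, in_sales)
```

A rename that the full replica sees as a single move thus shows up at
the partial replica as a removal or an addition.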

We've decided that there is a lost+found entry (this is for entries
that lose their superior reference), but we don't know where it is.
Possibilities include keeping it in the subtree (perhaps, make it a
subordinate of the root of the naming context). Mark points out that
if we do that, we must ensure that it cannot be seen by processes
other than LDUP. Moved to the list for more discussion.

Replication management is being dealt with in a separate I-D from Mark
Wahl. First draft due at the end of April.

Suggestion is to bridge the gap between the model document (too
high-level) and the URP document (too low-level). This will be done by
John Merrells, Russ Weiser, and Mike Spreitzer. Probably will go as a
new draft. Due date to be supplied to the chairs within a couple of
weeks.

Schema replication. In log-based systems, you send the schema
modifications exactly at the point where the client sent them. But in
a state-based system, you really want to send the schema first, and
then the updates. This may mean that we must impose a constraint on
the ordering of updates. Needs discussion on mailing list.
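
One possible form of such an ordering constraint, for the state-based
case, is sketched below: schema changes are moved to the front of a
batch while the relative order of everything else is kept. The
is_schema flag and the dictionary layout are assumptions made for the
example.

```python
def schema_first(updates):
    """Return updates with schema changes first, otherwise order-preserving.

    sorted() is stable, so updates with the same key keep their original
    relative order; False sorts before True, hence the "not".
    """
    return sorted(updates, key=lambda u: not u["is_schema"])


pending = [
    {"name": "add entry using newAttr", "is_schema": False},
    {"name": "define newAttr", "is_schema": True},
    {"name": "modify entry", "is_schema": False},
]
ordered_names = [u["name"] for u in schema_first(pending)]
```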

Ellen Stokes - Replication Transfer Protocol Discussion
-------------------------------------------------------

A draft will be out shortly. Basic flow is:

S:  binds to consumer
C:  bind response (identity/credential is implementation defined)
S:  sends StartReplicationExtendedRequest
      repl root, sender's replica ID, protocol OID=full|incremental,
      SIR|CIR[|both]
C:  sends StartReplicationExtendedResponse (does access control
    decision for identity allowed to do this operation)
      result, updateVector
S:  ... (transmits all required updates)
S:  sends EndReplicationExtendedRequest
      updateVector only on full update: returnUpdateVector
C:  sends EndReplicationExtendedResponse
      update vector needed for read-only replicas
S:  disconnects
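
The supplier side of this flow can be sketched as follows. This is an
illustration only: it assumes that the updateVector returned by the
consumer maps each replica ID to the highest CSN already seen, and that
"all required updates" means the log entries beyond that vector. The
function names and the log layout are made up for the example.

```python
def required_updates(log, update_vector):
    """Select the log entries the consumer has not yet seen.

    log is a list of (replica_id, csn, change) tuples; update_vector is
    assumed to map replica_id -> highest csn the consumer already holds.
    """
    return [e for e in log if e[1] > update_vector.get(e[0], -1)]


def supplier_session(log, consumer_vector):
    """Return the messages the supplier sends in one session, in order."""
    messages = ["bind", "StartReplicationExtendedRequest"]
    # The consumer's StartReplicationExtendedResponse carries its vector.
    for replica_id, csn, change in required_updates(log, consumer_vector):
        messages.append(f"update {replica_id}:{csn} {change}")
    messages += ["EndReplicationExtendedRequest", "disconnect"]
    return messages


log = [("r1", 1, "add x"), ("r1", 2, "mod x"), ("r2", 1, "del y")]
sent = supplier_session(log, {"r1": 1})
```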

Cases:

Supplier-initiated replication (SIR)
Consumer-initiated replication (CIR)
Full replica
Incremental Update

(see Ellen's charts on protocol flow details for combinations)

Note that there could be problems if there is a firewall between the
consumer and the server.

Questions: how do we handle unsolicited notifications? (There is an
existing LDAP draft, but this would require a modification to it.)
What about data compression? We know that there are several things
that could benefit from this, and perhaps we should consider an
additional compression phase in the protocol. This might make us
revisit the startTLS draft. Keith recommends using a separable
compression layer, as opposed to tweaking the data structures; please
look at RFC 2393-2395.

How many extensions are you looking at? Four minimum, six maximum (we
think). Note that we have an advantage in keeping the extensions
separate, because as we evolve the protocol, this provides flexibility
in determining what product supports what extension.

CIR should use SIR instead of turning around the connection.

Full and incremental update differ only in the protocol OID. Note that
both use the same URP primitives.

General note to authors of this document: please be specific as to
error codes and what they mean.

Further note: the access control section is important, but there is an
LDAPEXT subcommittee to do this, and all of the members of that
subcommittee are also members of LDUP.

When you have two masters that are peers, it's more complicated than
just throwing away the updates of one; you need to bring them both
into consistency.

Steven Legg - Conflict Detection and Resolution
------------------------------------------------------------------
Consistency: the contents of two DSAs that are replicating data should
be the same.

There are a number of integrity constraints on the directory data,
called DIT integrity (e.g., the constraints imposed by the underlying
information model) and business rule integrity (e.g., every employee
ID number must be unique). Note that an explicit assumption is that
transient inconsistency can be tolerated.

In an enterprise database environment, there is a single
administrative authority for the database. There is a unified and
compatible set of business rule integrity constraints, imposed within
a particular administrative or application domain. Every database
application understands and obeys the same set of business rules.

In an Internet directory environment, things are harder. There are
many administrative authorities, each with its own business rules. The
only thing that we can get agreement on is DIT constraints. And since
there is still major disagreement on attributes (e.g., in LDAPEXT,
there was a debate on the use of cn: some people used it as a name,
others put in simple identification information such as an employee
ID), we can only rely on DIT constraints, meaning that business
integrity is a real problem.

Therefore, we should allow updates to occur at a master DSA without
communicating with any of the co-masters, with conflicts detected
after the fact. So what do we do? Disregard one of the updates in a
conflict, or provide rules to harmonize the updates in the conflict
by making corrections. This would be done by the administrator.

Can a conflict be ignored? And if so, can it be ignored silently?
The second is definitely not good; we should at least tell the client
what we are doing. And if a conflict resolution requires a human to
resolve it, then we should tell that to the client too. The worry is
that ignoring one update can have cascading effects on earlier and/or
later updates. This has additional problems, as updates are usually
interdependent, and puts an extra load on DSAs as updates are undone
and redone. It also has some scaling problems. Finally it can be
potentially confusing for users, as previously accepted updates are
undone sometime later.

Concern over (mis)use of LDAP, and that LDUP may exacerbate the
problem. Perhaps, as part of the LDUP activity, we could have the LDUP
server provide operational, etc. information to the client to aid in
its deployment.

The important thing at this stage in your work is to make certain that
you can identify all cases and talk sensibly about what happens. Given
that there is some set of failure cases, we had better identify in
advance what the failure mode is and how we will deal with it. In
particular, what happens with an old client or server using a (new)
LDUP server? The point is for us to understand the failure mode and
document it.

There are things that we didn't do. It's very difficult to implement
without rewinding the directory database to an earlier state. In
addition, a partial replica may not see the same conflicts as the full
replica. Finally, what do you do about lost updates? There's no
guarantee that a user will catch these, so the burden is probably on
the administrator.

What we are using is a locally-defined conflict resolution procedure.
This is because the same conflict resolution algorithm MUST be used at
all DSAs where an entry is mastered. The interaction of different
conflict resolution algorithms will be too complex and subtle for
directory administrators to get right, so it is MUCH easier if the
SAME algorithm is applied everywhere.

The Update Reconciliation Procedures draft includes procedures for
breaking down LDAP, DAP and DSP operations into more primitive
actions. It appears to be compatible with partial replica filtering,
though arguably more work needs to be done here. It centers around
defining procedures for applying replication primitives against the
local directory database. A key point is that this draft is
specifically written to allow extensions to these protocols, as well
as adding whole new access protocols, to the list.

The main features include:
 - updates may be seen out-of-order by mastering DSAs (URP
accommodates arbitrary reordering)
 - only the latest state of a directory item is of interest (changes
to achieve that state are unimportant)
 - URP ignores business rule integrity constraints (impossible to
standardize)
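
The first two features can be illustrated with a small sketch in which
each attribute keeps only the value carrying the highest CSN, so that
primitives converge to the same state in whatever order they arrive.
This is not the procedure from the URP draft; the function name, the
tuple CSNs, and the dictionary state are assumptions for the example.

```python
def apply_primitive(state, csn, attr, value):
    """Apply one replication primitive, keeping only the latest state.

    state maps attribute name -> (csn, value); an older primitive that
    arrives late is ignored because its CSN is not greater.
    """
    current = state.get(attr)
    if current is None or csn > current[0]:
        state[attr] = (csn, value)
    return state


primitives = [
    ((100, 0, 1, 0), "mail", "a@example.com"),
    ((101, 0, 2, 0), "mail", "b@example.com"),
    ((100, 5, 1, 0), "cn", "Ann"),
]

in_order = {}
for p in primitives:
    apply_primitive(in_order, *p)

out_of_order = {}
for p in reversed(primitives):
    apply_primitive(out_of_order, *p)
```

Both replicas end with the same state, and no business rule is
consulted at any point.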

(Unfortunately, we ran out of time at this juncture. The chairs
encourage you to go through Steven's full presentation and draft, and
participate in ensuing discussions on the list.)