DIF-SSED.DOC

  Reducing the Swelling of the Phone Bill with DIF and SSED
                      November 17, 1981
                       Chuck Forsberg
                  Computer Development Inc
                        Beaverton OR

Lately  (if not sooner) it has become obvious that there must be
a better and cheaper  way  to  distribute  software  updates  to
changing  programs  than  to  transmit  all  of the new files in
their totality, even though only a few lines in each  have  been
changed.

For  some years the Unix differential file print program diff(1)
(the (1) refers to the section of the  Unix  Programmers  Manual
in  which  it  is  described) has had a -e flag which provides a
set of ed commands suitable for transforming the first  file  to
the second.

With  these  tools,  only  an  update  file need be transmitted,
provided, of course, that both the sender and the  receiver  had
copies of the same antecdent file.

I  have  written  a  "new"  diff  called  dif.c which manages to
operate in the primitive CP/M environment. The editing  commands
output  in  response  to the -e option refrence sequential lines
in the source files, so they (the commands) can be  executed  by
a  stream  editor.  (The  Unix  diff(1) creates difference files
with non-forward-sequential commands.)

To generate a difference file, the command is

dif -e oldfile newfile >file.dif

The >file.dif redirects the standard output to  the  file.  A  +
may  be  susbtituted  for  >  if  simultaneous console output is
desired.

The receiver then invokes:

ssed oldfile <file.dif >newfile

Which will result in newfile  being  created  identical  to  the
oroginal  newfile.  Well, not precisely identical, but identical
up to and including the EOF (^Z) character.  The  dribble  after
that  may  change, so CRCK may say they are different. To check,
compare the two files with dif.

Unix folks with 14 character file names and  modification  times
stored  by  the filesystem have little trouble keeping the files
synchronized. (If the antecedent files  are  different,  there's
no  telling  what  the  output file will look like!) For us poor
CP/M folks (verrry) patiently awaiting something  like  Unix  to
appear  magically  on  out desktops, I propose that the revision
or revision date of the antecedent file be  placed  in  the  new
file  adjacent  to  the  new revision or date, preferably on the
same line. This way the user may easily verify that he  has  the
correct antecedent.

Dif  Versions  1.10 and later place hash indices of the RETAINED
lines of the antecedent file  in  the  difference  output.  This
allows   ssed  1.10  or  later  to  verify  correctness  of  the
antecedent file. the new .dif files are compatible with the  old
ssed, but, alas, not with Unix ed or sed.

The  array  sizes in dif.c may have to be shrunk somewhat to run
on a 48k system.

For testing, give

dif -e filea fileb |ssed filea >filec
dif fileb filec

(fileb and filec should be identical)

It ought to work if you said
dif -e filea fileb |ssed filea |dif fileb
and it does, with version 2.0.

Version 2.0 of dif.c adds a -u flag which will  unsqueeze  filea
before comparing it to fileb.

Thus you can say
sq filea
dif -eu filea.qqq fileb |ssed filea |dif fileb

Or you can say
dif -eu filea.qqq fileb |ssed -u filea.qqq |dif fileb
to  test  dif  and ssed. (Be sure dif and ssed are exactly where
you say they are, or else pipes will be broken.)

Restriction:  Since  the  BDS  Standard  I/O  library  and   the
Directed  I/O  package  are  somewhat confused about translation
between CP/M's cr/lf terminated lines and **nixs' \n  terminated
lines,  dif  was  written  to strip cr's from the input in order
that only one cr appear  on  the  output.  As  a  result,  lines
terminated  by  cr/lf, lf, and lf/cr all come out the same! This
would munge files where lf/cr  has  a  special  meaning  (MBASIC
continuation lines) or where embedded cr's are used (RTTY art).

Unix is a trademark of WECO, CP/M of Digital Research.