Aucf-cs.426
net.news
utcsrgv!utzoo!decvax!duke!ucf-cs!whm
Thu Feb 18 00:40:47 1982
Reducing Costs
Here are some thoughts that I've got about ways to reduce the
costs of a Usenet connection.

The following suggestions are not based on solid fact, but hand-dialing
ucf-cs's connection to Duke often finds me peeking at /usr/spool/uucp
to see if I've got any mail, so I think I have a general feel for the
file sizes and transfer times being dealt with.

It seems to me that a great deal of time is being wasted by uucico in
looking for the files to transmit.  Duke!trt's directory reorganization
has sped things up somewhat, but the fundamental split between the X
file and the D file seems to lie at the root of the problem.  The first
idea is to have a new file format, an E file for instance (don't get
transfixed).  The E file would combine in one package what currently
lives in the X file and the D file.  This might not be suitable for
some programs run by uuxqt, but for most sites I'd say that 99.9
percent of their uux traffic is rmail and rnews, and I don't see a real
problem in training rmail and rnews to deal with E files.  (Actually,
this might even be transparent to rmail and rnews.)  An E file would
look like an X file and a D file stuck together, e.g.:

       U whm ucf-cs
       F D.dukeB3087
       I D.dukeB3087
       C rmail ......
       And the letter follows here.
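
To make the unpacking side concrete, here is a minimal sketch in C of
how an E file in the format above might be split back into its command
and its data.  The program name, error handling, and details are
invented for illustration; this is not actual uucp code.

        /*
         * eunpack: split a hypothetical E file into its X-file-style
         * header and the data that would have been a separate D file.
         */
        #include <stdio.h>
        #include <string.h>

        int main(int argc, char **argv)
        {
            FILE *in;
            char line[512], cmd[512];
            int in_data = 0;

            if (argc < 2 || (in = fopen(argv[1], "r")) == NULL) {
                fprintf(stderr, "usage: eunpack efile\n");
                return 1;
            }
            cmd[0] = '\0';
            while (fgets(line, sizeof line, in) != NULL) {
                if (!in_data) {
                    /* U, F, and I lines would be handled as uuxqt
                       handles them now; the C line ends the header. */
                    if (line[0] == 'C' && line[1] == ' ') {
                        strcpy(cmd, line + 2);
                        in_data = 1;
                    }
                } else {
                    fputs(line, stdout);   /* the old D file body */
                }
            }
            fclose(in);
            fprintf(stderr, "would hand to uuxqt: %s", cmd);
            return 0;
        }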

My rough estimate is that for files transferred from Duke to UCF,
about four seconds of overhead are involved per set (an X file plus
its D file, at roughly two seconds of per-file handshaking apiece), so
this change would cut 2 seconds from each set.  Not much, but it adds
up.

The real idea, though, is about news.  I'll be very surprised if
nobody has thought of this before, but I've never heard it discussed.

I analyzed our uucp logfiles over the last couple of months, and by
comparing bytes transferred with the real time for a session, I
determined that about 20-35% of the total session time goes not to
data transfer but to overhead on each end.  I arrived at these figures
by looking at the SYSLOG file.  Since a 300-baud transfer takes four
times as long as one at 1200 while the overhead stays roughly fixed,
I assume the time wasted would be about .25*(20-35%) if the transfers
were at 300 baud rather than at 1200.
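
For the record, the arithmetic is simple.  A minimal sketch with
made-up session numbers, assuming ten bits on the line per byte:

        #include <stdio.h>

        int main(void)
        {
            long bytes = 50000;      /* bytes moved in one session (made up) */
            double baud = 1200.0;    /* line speed */
            double session = 600.0;  /* real time for the session, seconds */

            /* ideal transfer time if every second moved data */
            double xfer = bytes * 10.0 / baud;
            double overhead = 1.0 - xfer / session;

            printf("transfer %.0f sec of %.0f; overhead about %.0f%%\n",
                   xfer, session, overhead * 100.0);
            return 0;
        }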

Since Usenet is a store-and-forward network, I would imagine that for
most sites not in the "central ring", news queues up on the system(s)
they get it from, and when contact is made (whether they call or are
called), they might have 50-100 articles waiting for them.  Each of
these articles is represented by an X file and a D file.
Implementation of an E file scheme would save 2 seconds per article,
or roughly two to three minutes per session.

That would cut the overhead in half, but what about the other half?
If a system only calls for news once or twice a day, the news builds
up, so why not send it all in one big file, or perhaps several files
of sufficient size to minimize file-transfer protocol overhead?
Basically, instead of queueing up a uux request each time rnews reads
an article to redistribute, rnews would append the article to the end
of a file for each system that receives it.  (Each system has its own
file.)  The real problem is shipping out the "batched" file.  A very
naive solution would be to build a file for a system until it grew
larger than X bytes, and then queue an rbnews (read batch news) uux
request for the system and file in question.  The problem with sending
a file only after it reaches a certain size is that a system could
call up and not get all the news that has come in, because the file
hasn't reached the highwater mark.  This wouldn't present any problems
for a system that is polled, since a cron entry on the feeding system
could queue the partially filled file for uux just before the poll is
made to the remote system.
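
Here is a minimal sketch in C of the appending side.  The spool
directory, the 64K highwater mark, and the count-line record format
are all made up for illustration; the count line is just one way to
get a data-independent batch format.

        #include <stdio.h>
        #include <string.h>
        #include <sys/stat.h>

        #define HIGHWATER 65536L  /* made-up threshold */

        /* Append one article to the batch file for system `sys'. */
        void batch_article(const char *sys, const char *art)
        {
            char path[256];
            struct stat st;
            FILE *fp;

            sprintf(path, "/usr/spool/news/batch/%s", sys);
            if ((fp = fopen(path, "a")) == NULL)
                return;
            /* a count line before each article, so any data can follow */
            fprintf(fp, "#! rnews %ld\n", (long)strlen(art));
            fputs(art, fp);
            fclose(fp);

            /* past the highwater mark: time to queue the uux request */
            if (stat(path, &st) == 0 && st.st_size > HIGHWATER)
                printf("queue rbnews uux request for %s\n", sys);
        }

        int main(void)
        {
            batch_article("duke", "Article body would go here.\n");
            return 0;
        }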

This scheme seems fairly easy to implement, wouldn't require the use
of E files (although they would still help for mail), and would be
well suited to the Usenet philosophy.  I'm not all that familiar with
news internals, but I would think that the modifications involved
would be:
       Add code to write news articles into a batch file rather
        than queueing a uux request.
       Devise a data-independent format for the batch (the
        count-line format sketched above is one possibility).
       Modify rnews so that it can unroll the batch (a companion
        sketch follows this list).
       For polled systems, add cron entries and hooks in news to
        queue any partially filled files for uux.
       For polling systems, determine a reasonable size for the
        batch file.  This could go in the .sys file entry for a
        system.  (Batch file sizes should also be considered wrt
        the probability of a session being prematurely terminated.)
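
A companion sketch of the unrolling side, under the same made-up
count-line format:

        #include <stdio.h>

        int main(int argc, char **argv)
        {
            FILE *in;
            char line[512];
            long len;
            int c;

            if (argc < 2 || (in = fopen(argv[1], "r")) == NULL) {
                fprintf(stderr, "usage: rbnews batchfile\n");
                return 1;
            }
            while (fgets(line, sizeof line, in) != NULL) {
                if (sscanf(line, "#! rnews %ld", &len) != 1)
                    break;      /* not a count line: malformed batch */
                printf("--- article of %ld bytes ---\n", len);
                while (len-- > 0 && (c = getc(in)) != EOF)
                    putchar(c); /* would be handed to rnews proper */
            }
            fclose(in);
            return 0;
        }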

Unc!smb has suggested compaction of news articles; the compaction
program supposedly gets about 45% compaction on English text, so
working with batched, compacted files might be worth looking at also.
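
As a rough illustration (the figures are made up): at 1200 baud, with
ten bits on the line per byte, a 100,000-byte batch takes about 833
seconds to transfer.  If compaction cuts it to 55,000 bytes, the
transfer drops to about 458 seconds, saving over six minutes per
session on top of the per-article overhead already eliminated.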

I would have liked to send firmer figures and more thought-out ideas,
but I've been thinking about this since the Netnews meeting at Usenix
and haven't found the time yet, so I thought it would be better to
send something half-baked now than to wait a month or two.

Comments?

                                       Bill Mitchell
                                       Univ. of Central Florida

-----------------------------------------------------------------
gopher://quux.org/ conversion by John Goerzen <[email protected]>
of http://communication.ucsd.edu/A-News/


This Usenet Oldnews Archive
article may be copied and distributed freely, provided:

1. There is no money collected for the text(s) of the articles.

2. The following notice remains appended to each copy:

The Usenet Oldnews Archive: Compilation Copyright (C) 1981, 1996
Bruce Jones, Henry Spencer, David Wiseman.