Article 5883 of comp.lang.perl:
Xref: feenix.metronet.com news.software.nntp:2420 comp.sources.testers:102 alt.sources:1879 comp.lang.perl:5883
Path: feenix.metronet.com!news.utdallas.edu!hermes.chpc.utexas.edu!cs.utexas.edu!uunet!spool.mu.edu!sgiblab!idiom.berkeley.ca.us!idiom.berkeley.ca.us!not-for-mail
From:
[email protected] (David Muir Sharnoff)
Newsgroups: news.software.nntp,comp.sources.testers,alt.sources,comp.lang.perl
Subject: fetch-news a personal news transfer engine -- testers neeeded.
Followup-To: news.software.nntp
Date: 15 Sep 1993 00:03:33 -0700
Organization: Idiom Consulting / Berkeley, CA USA
Lines: 1426
Distribution: world
Message-ID: <
[email protected]>
NNTP-Posting-Host: idiom.berkeley.ca.us
Okay, I've been using the current version my fetch-news to get my
newsfeed for all of 30 minutes now! Anyway, I think it works (I've
been pounding on it hard for the last few days).
IF having a newsfeed of only the news that you actually read appeals to
you (think, you can run with a 20-day expire and not eat a gigabyte!)
AND if you are on the internet AND if you are willing to use beta
software THEN please give fetch-news a spin!
I won't send say anything about fetch-news again until testing is
complete -- if you are interested, act now and SAVE!
-Dave
FETCH-NEWS
fetch-news - pull the news that will be read via nntp
SYNOPSIS
fetch-news [-force days] [-debug] [-verbose] [-config
config_file] [PARAMETER=value]...
DESCRIPTION
fetch-news will use the NNTP to transfer news between two
news servers. It only transfers the news that it believes
people will want to read. It knows what people want to read
by looking at their fetched.
Unlike some other similar programs, fetch-news will keep
track of article numbers rather than using the newnews com-
mand. This is important because with most news implementa-
tions (INN in particular) the newnews command thrahes the
server.
OPTIONS
There are only a few command line options to fetch-news, but
there are many configuration options. Since fetch-news is a
perl program distributed in source, the bulk of the confi-
guration options are specified in the program text itself
just after the copyright notice. You do not need to change
there! One of the parmateters, $CONFIG, is the location of
a file to require before doing much else. This file will be
read before much else is evaluated. Set your overrides
there. The name of this file can be specified on the com-
mand line with -config filename.
The other options are -force days which will override the
touch files and cause days worth of news to be transfered.
You can get more verbose output by specifying -verbose, or
-debug.
Finally, any of the configuration parameters (see the
source) can be overriden on the command line with an assig-
ment: command=value.
NEWS SERVER SETUP
fetch-news is asymetrical in how it talks to news servers.
It talks to the news server that it is transfering news from
as if it were a news reader. It talks to the server that it
is transfering news to as if it were another news server.
This difference can make configuration annoying. However,
it's easy to test: if you can read news with a news reader,
you can get the news with fetch-news (if you don't have per-
mission, just use a packet forwarder).
One implication is that with fetch-news, you tend get your
news from one server and send your outgoing news to another.
USE WITH SLOW LINKS
I built fetch-news mostly to optimize use of my slow (SLIP)
internet connection. It saves my line by only transfering
the news that is likely to be read. In addition, I use a
small tcp socket forwarder process to introduce delay. In
my forwarded process, I have it use a small (128-byte)
socket send buffer so that a round-trip delay is required
for every packet. To get my socket forwarder, ftp to
idiom.berkeley.ca.us and grab pub/muir-programs/tcp_nest.c.
SIMILAR SYSTEMS
There are quite a few news transfer systems that fill a
similar role to fetch-news. None of them do exactly the
same thing. Here are some pointers to a few...
Newsfeed by Vernon C. Hoxie <
[email protected]>, does
about what fetch-news does, but it does with batches.
Gup (newsGroup Update Program) by Mark Delany
<
[email protected]> and Andrew Herbert
<
[email protected]> provides a way to update a news
subscription file by mail.
SLNR (Simple Local News Reader) by Philippe Goujard
<
[email protected]> is an off-line newsreader. It
grabs your news all at once so that you can read it later.
And there is one more system, recently posted, that is very
similar to fetch-news except that is uses the newnews com-
mand to transfer news. I apologize for forgetting the name.
AVAILABILITY
The latest version of fetch-news is available through
anonymous ftp to idiom.berkeley.ca.us. There is a small
mailing list for discussing fetch-news. Send mail to
[email protected] to join it.
AUTHOR
David Muir Sharnoff <
[email protected]>
------------------- cut here ---------------------------------
#!/bin/sh
# shar: Shell Archiver (v1.22)
#
# Run the following text with /bin/sh to create:
# fetch-news
#
sed 's/^X//' << 'SHAR_EOF' > fetch-news &&
X#!/usr/local/bin/perl
X'di';
X'ig00';
X# THIS PROGRAM IS ITS OWN MANUAL PAGE. INSTALL IN man & bin.
X
X# $Revision: 1.4 $
X# $Author: muir $
X
X# TODO:
X# distributions line
X# qerry server for newsgroup requests
X# option to insert news with inews
X# rethink group removal
X
X#############################################################################
X#
X# Copyright (c) 1993 David Muir Sharnoff
X# All rights reserved.
X#
X# Redistribution and use in source and binary forms, with or without
X# modification, are permitted provided that the following conditions
X# are met:
X# 1. Redistributions of source code must retain the above copyright
X# notice, this list of conditions and the following disclaimer.
X# 2. Redistributions in binary form must reproduce the above copyright
X# notice, this list of conditions and the following disclaimer in the
X# documentation and/or other materials provided with the distribution.
X# 3. All advertising materials mentioning features or use of this software
X# must display the following acknowledgement:
X# This product includes software developed by the David Muir Sharnoff.
X# 4. The name of David Sharnoff may not be used to endorse or promote products
X# derived from this software without specific prior written permission.
X#
X# THIS SOFTWARE IS PROVIDED BY THE DAVID MUIR SHARNOFF ``AS IS'' AND
X# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
X# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
X# ARE DISCLAIMED. IN NO EVENT SHALL DAVID MUIR SHARNOFF BE LIABLE
X# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
X# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
X# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
X# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
X# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
X# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
X# SUCH DAMAGE.
X#
X# This copyright notice derrived from material copyrighted by the Regents
X# of the University of California.
X#
X#############################################################################
X
X#
X# This is fetch-news.
X#
X# It transfers news by pretending to read it and then shoving what
X# it gets to another server.
X#
X# It does all the news transfering with NNTP.
X#
X# On systems that support XHDR, it is fairly efficient.
X#
X# If you have any questions, send mail to
X#
[email protected]
X#
X# It creates and deletes news groups using ctlinnd, so it works
X# best on systems that use INN.
X#
X#
X# The development of fetch-news has been paid for by David Muir
X# Sharnoff and Berkeley Research and Trading. Contributions are
X# welcome.
X#
X
X$debug = 0;
X$verbose = 1;
X
X# These can be dug out of /usr/include
X$AF_UNIX = 1;
X$AF_INET = 2;
X$SOCK_STREAM = 1;
X$SOCKADDR_IP = 'S n a4 x8';
X$SOCKADDR_UN = 'S a108';
X
X# Socket addresses to grab transfer articles
X# to and from.
X# /xyz is a unix domain socket
X# foo.bar.baz is an inet domain socket to nntp
X# foo.bar.baz/p is an inet domain socet to port p
X#
X$TO = "/usr/local/news/innd/nntpin";
X$FROM = "prep.mit.edu/3340";
X
X# Groups will only be created & destroyed if INN is being used.
X# If INN is being used, then there should be a ctlinnd program.
X$CTLINND = "ctlinnd";
X
X# article spool
X$SPOOL = "/usr/local/news/spool/news";
X
X# The "active" file from INN. This can be set to "nntp"
X# if the active file should be queried over nntp.
X$ACTIVE = "/usr/local/news/active";
X
X# list of groups not to fetch
X$IGNORE = "/usr/local/news/fetch.ignore";
X
X# list of groups to always fetch
X$SPECIAL = "/usr/local/news/fetch.special";
X
X# A command to run after finishing each newsgroup. It
X# takes a newsgroup name as an argument. Optional.
X$MTHREADS = "/usr/local/news/trn/mthreads";
X
X# List of all newsgroups. This can be set to "nntp" if
X# the list of all newsgroups should be queried over nntp.
X$MASTER_ACTIVE = "/usr/local/news/active.from.uunet";
X
X# days to pre-feed a new group. "everything" if -1
X$BACK_FEED = 6.0;
X
X# where to store the fetch counters
X$FETCHSPOOL = $SPOOL;
X# file extension name for fetch counters
X$FETCHFILE = ".nfn";
X
X# store the fetch counters in a directory tree (as
X# apposed to flat, all in one directroy)?
X$FETCHDIRS = 1;
X
X# uses timestamps or article numbers? If article numbers
X# are used (much more efficient 'cause then it doesn't use
X# newnews much) then the "xhdr" command must be supported.
X$TIMESTAMP = 0;
X
X# Maximum number of new newsgroups that a single user can get.
X$MAX_NEWGROUPS = 20;
X
X# Maximum number of newsgroups that a single user can get.
X$MAX_USER = 300;
X
X# Maximum number of articles to fetch that might not be read
X# on behalf of a single user
X$MAX_USER_ARTICLES = 100;
X
X# The login of the account that runs the news software
X$NEWS_USER = "news";
X
X# If nothing has happened for this many seconds, exit
X# If waiting for a lock that hasn't been touched in this long,
X# take it.
X$MAX_IDLE_TIME = 3000;
X
X# Make sure to touch your lock withing this time...
X$MIN_IDLE_TIME = 300;
X
X# Amount of time to sleep after killing off another fetch-news process
X$SLEEP_AFTER_KILL = 40;
X
X# Command to run before grabbing news
X$START_TUNNEL = undef;
X
X# Ignore .newsrc files that haven't been modifed in this
X# many days.
X$IGNORE_UNTOUCHED_NEWSRC = 20.0;
X
X# Get rid of groups that nobody even lists in their .newsrc files
X$REMOVE_UNTAKEN_GROUPS = 0;
X
X# "require" this file -- a place to store configuration information.
X$CONFIG = "/usr/local/news/fn.config";
X
X# The master lock file. Set this to only have one copy of
X# fetch news running at a time.
X$MASTER_LOCK = "/usr/local/news/locks/fn.master";
X
X# Which newsgroups should be ignored. Wildcard is *, negate is !, terms
X# are evaluated left to right with the last match winning.
X# Thus "*,!alt.*,alt.sex.*" means look at all groups except alt groups
X# other than alt.sex.
X#$RESTRICTION = '!*.answers,!*pictures*,!*binaries*';
X$RESTRICTION = '';
X
X# Maximum expire time. Any touch files older than this will be
X# discarded.
X$MAX_EXPIRE = 20;
X
X######################################################################
X
X$o0 = "$0 @ARGV";
X$av0 = $0;
X$usage = "Usage: $av0 [-force days] [-debug] [-verbose] [-config CONFIG_FILE] [PARAMETER=value]";
X
Xwhile(@ARGV) {
X $a = shift(@ARGV);
X if ($a eq "-force") {
X $force = 1;
X $force_days = shift(@ARGV);
X next;
X }
X if ($a eq "-config") {
X $CONFIG = shift(@ARGV);
X require $CONFIG || die "require $CONFIG: $@";
X undef $CONFIG; # don't get it later
X next;
X }
X if ($a eq "-debug") {
X $debug = !$debug;
X next;
X }
X if ($a eq "-verbose") {
X $verbose = !$verbose;
X next;
X }
X if ($a =~ m/^([A-Z_0-9]+)\=(.+)/) {
X $x = $2;
X print "eval \$$1 = \$x\n" if $debug;
X eval "\$$1 = \$x";
X die "Eval \$$1 = \$x: $@" if $@;
X next;
X }
X die $usage;
X}
X
Xif ($CONFIG) {
X local($0) = "$o0: require $CONFIG";
X require $CONFIG || die "require $CONFIG: $@";
X}
X
X$SIG{USR1} = 'check_alive';
X
X$0 = "$o0: get lock on $MASTER_LOCK";
Xif ($MASTER_LOCK) {
X die "could not get lock $MASTER_LOCK"
X unless &lock($MASTER_LOCK);
X}
X
X$0 = "$o0: lookup homedir of $NEWS_USER";
X($name,$passwd,$uid,$gid,$quota,$comment,$gcos,$dir,$shell) = getpwnam($NEWS_USER);
X
Xchdir($dir) || die "chdir $dir: $!";
X
X$ENV{'PATH'} = ".:$dir/bin:$dir/etc:$dir/local/bin:/bin:/usr/bin:/etc:/usr/local/bin:/usr/ucb";
X
X$SIG{USR2} = IGNORE;
X
X$0 = "$o0: `$START_TUNNEL`";
X&system($START_TUNNEL) if $START_TUNNEL;
X&still_alive();
X
X$0 = "$o0: building restriction routine";
X&restrict_init if $RESTRICTION;
X
X$0 = "$o0: process ignore";
X%ignore = ();
X$ignore{'junk'} = 1;
Xif ($IGNORE && -e $IGNORE) {
X open(IGNORE,$IGNORE) || die "open $IGNORE: $!";
X while(<IGNORE>) {
X chop;
X print STDERR "Ignore: $_\n" if $debug;
X $ignore{$_} = 1;
X }
X &still_alive();
X}
X
X$0 = "$o0: process special";
X%special = ();
X$special{'control'} = 1;
Xif ($SPECIAL && -e $SPECIAL) {
X open(SPECIAL,$SPECIAL) || die "open $SPECIAL: $!";
X while(<SPECIAL>) {
X chop;
X print STDERR "Special: $_\n" if $debug;
X $special{$_} = 1;
X $who{$_} .= "news ";
X $no_limit{$_} = 'special';
X }
X &still_alive();
X}
X%take = %special;
X
X$0 = "$o0: open sockets";
X&open_connections();
X
X$0 = "$o0: read master active file";
X%moderation = ();
Xif ($MASTER_ACTIVE eq 'nntp') {
X local($line_count);
X print STDERR "Fetching the master active file\n" if $debug;
X &fs("list active");
X $resp = &getline(FS);
X if ($resp =~ /^215/) {
X while($resp = &getline(FS)) {
X &chop($resp);
X last if $resp =~ /^\.$/;
X next unless m/(\S+)\s\d+\s\d+\s(.)/;
X $moderation{$1} = $2;
X &still_alive() if ($line_count++ % 200 == 0);
X }
X }
X &still_alive();
X} elsif ($MASTER_ACTIVE) {
X print STDERR "reading the master active file, $MASTER_ACTIVE\n" if $debug;
X &system("$CTLINND reload active want_to_see_it") if $CTLINND;
X open(MA,$MASTER_ACTIVE)
X || die "could not open $MASTER_ACTIVE: $!";
X while(<MA>) {
X next unless m/(\S+)\s\d+\s\d+\s(.)/;
X $moderation{$1} = $2;
X }
X close(MA);
X &still_alive();
X}
X
X$0 = "$o0: read active file";
Xprint STDERR "Reading the active file\n" if $debug;
X%weget = ();
X%mina = ();
X%maxa = ();
X@weget = ();
Xif ($ACTIVE eq 'nntp') {
X &ts("list active");
X local($line_count);
X $resp = &getline(TS);
X if ($resp =~ /^215/) {
X while($resp = &getline(TS)) {
X &chop($resp);
X last if $resp =~ /^\.$/;
X next unless m/(\S+)\s(\d+)\s(\d+)/;
X $weget{$1} = 1;
X $maxa{$1} = $2;
X $mina{$1} = $3;
X $current{$1} = ($2+1 != $3);
X push(@weget,$1);
X &still_alive() if ($line_count++ % 200 == 0);
X }
X }
X &still_alive();
X} else {
X open(ACTIVE,$ACTIVE) || die "could not open $ACTIVE: $!";
X while(<ACTIVE>) {
X next unless m/(\S+)\s(\d+)\s(\d+)/;
X $weget{$1} = 1;
X $maxa{$1} = $2;
X $mina{$1} = $3;
X $current{$1} = ($2 != $3);
X push(@weget,$1);
X }
X close(ACTIVE);
X &still_alive();
X}
X
X$0 = "$o0: go through the accounts";
Xprint STDERR "Running through the accounts\n" if $debug;
X%mentioned = ();
Xwhile (($name,$passwd,$uid,$gid,$quota,$comment,$gcos,$dir,$shell) = getpwent) {
X
X $0 = "$o0: look at ${name}'s directory";
X
X $newcount = 0;
X $taken = 0;
X $report = 0;
X $mentioned = 0;
X
X if (-e "$dir/.rnfetchspecial" && open(FETCHSPECIAL,"$dir/.rnfetchspecial")) {
X while(<FETCHSPECIAL>) {
X chop;
X next if /^\S+\s\d+\s\d+\sn/;
X next unless /^([^:!]+)/;
X $group = $1;
X $mentioned{$1} += 1;
X $mentioned++;
X next if /^[^:!]+!/;
X unless (&touch_file_okay($group) || $current{$group}) {
X $newcount++;
X }
X $take{$group} += 1;
X $who{$group} .= "$name ";
X $taken++;
X print STDERR "Fetchspecail($name) $group\n" if $debug;
X }
X close(FETCHSPECIAL);
X $report = 1;
X &still_alive();
X }
X
X if (-e "$dir/.newsrc" && (-M "$dir/.newsrc" < $IGNORE_UNTOUCHED_NEWSRC)) {
X unless (open(NEWSRC,"$dir/.newsrc")) {
X print STDERR "Could not open $dir/.newsrc: $!";
X if ($! =~ /ermission/) {
X # send mail to $name ...
X }
X } else {
X %un = ();
X while(<NEWSRC>) {
X next unless /^(.+)([:!])\s*(.*)/;
X ($group,$read,$on) = ($1,$3,$2);
X
X $RESTRICTION && &restrict($group) && next;
X
X $thisnew = 0;
X $thisold = 0;
X
X $mentioned{$group} += 1;
X $mentioned++;
X
X next unless ($on eq ":");
X
X if ($read) {
X if ($read =~ /(\d+)$/) {
X $lasta = $1;
Xprint STDERR "$group .. read = $read lasta = $1 current = $current{$group} weget = $weget{$group} mina = $mina{$group} maxa = $maxa{$group}\n" if $debug;
X if ($current{$group} && ($weget{$group} && ($lasta+1 < $mina{$group} || $lasta > $maxa{$group}))) {
X # this group isn't really read
X next;
X }
X $thisold = 1;
X } else {
X $thisnew = 1;
X }
X } else {
X # if the user hasn't read the group it might
X # be because it hasn't been any news in it.
X unless (&touch_file_okay($group) && ! $current{$group}) {
X $thisnew = 1;
X }
X }
X
X $taken++;
X $newcount++ if $thisnew;
X
X $no_limit{$group} = $name if $thisold;
X
X if ($newcount == $MAX_NEWGROUPS) {
X print STDERR "Enough new groups for $name\n";
X }
X
X if ($taken == $MAX_USER) {
X print STDERR "Enough groups for $name\n";
X }
X
X unless (($newcount > $MAX_NEWGROUPS) || ($taken > $MAX_USER)) {
X $take{$group} += 1;
X $who{$group} .= "$name ";
X print "Take($name) $group\n" if $debug;
X if ($thisnew) {
X $is_new{$group} = $name;
X $un{$group} = 1;
X }
X }
X }
X close(NEWSRC);
X $report = 1;
X
X $taken{$name} = $taken;
X $newcount{$name} = $newcount;
X
X if ($newcount && $MAX_USER_ARTICLES/$newcount < 30) {
X $mpg = 1+int($MAX_USER_ARTICLES
X /($newcount > $MAX_NEWGROUPS ? $MAX_NEWGROUPS : $newcount));
X
X for $un (keys %un) {
X $limit{$un} = $mpg if $mpg > $limit{$un};
X print STDERR "Limiting $un to $limit{$un} for $name.\n"
X if $verbose;
X }
X if (($maxa{$un} - $mina{$un} > $limit{$un}*2) && ! $no_limit{$un}) {
X $take{$group} -= 1;
X $taken--;
X }
X }
X }
X &still_alive();
X }
X print STDERR "$name takes $taken ($newcount new) out of $mentioned\n" if $report;
X}
X
X$0 = "$o0: add new groups";
Xif ($CTLINND && $MASTER_ACTIVE) {
X print STDERR "Adding new groups ...\n" if $debug;
X foreach $group (sort keys %mentioned) {
X unless ($weget{$group}) {
X if ($moderation{$group}) {
X $RESTRICTION && &restrict($group) && next;
X print STDERR "New group: $group\n";
X ($gd = $group) =~ s!\.!/!g;
X $pre = (-d "$SPOOL/$gd");
X &system("$CTLINND newgroup $group $moderation{$group} $name");
X &system("$CTLINND renumber $group")
X if $pre;
X $weget{$group} = 1;
X } else {
X print STDERR "Group $group not listed in master active file\n";
X }
X }
X &still_alive();
X }
X}
X
X$0 = "$o0: remove old groups";
Xif ($REMOVE_UNTAKEN_GROUPS && $CTLINND) {
X print STDERR "Removing old groups...\n" if $debug;
X foreach $group (@weget) {
X $RESTRICTION && &restrict($group) && next;
X next if $take{$group};
X next if $ignore{$group};
X next if $special{$group};
X next if $mentioned{$group};
X print STDERR "rmgroup $group\n";
X &system("$CTLINND rmgroup $group");
X &still_alive();
X }
X}
X
X#
X# Free up memory space -- these arrays are no longer needed.
X#
X%mentioned = ();
X%special = ();
X%moderation = ();
X%weget = ();
X@weget = ();
X%mina = ();
X%maxa = ();
X%current = ();
X
X$0 = "$o0: fetch articles";
X
Xlocal($f_first, $f_last,*TF);
Xlocal($time,$artnum);
Xlocal($ts,$gd,$n,$tfe);
Xlocal($sec,$min,$hour,$mday,$mon,$year,$wday,$isdst);
Xlocal($id,%ids,$begin,$end,$range);
Xlocal($attempted,$transfered);
Xlocal($line_counter);
X
Xforeach $group (sort keys %take) {
X $0 = "$o0: consider group $group";
X# next unless $group eq 'rec.humor.funny';
X $RESTRICTION && &restrict($group) && next;
X next unless ($take{$group} > 0);
X next if $ignore{$group};
X
X $0 = "$o0: process group $group";
X
X undef $limit;
X if ($limit{$group} && ! $no_limit{$group}) {
X $limit = $limit{$group};
X }
X
X print STDERR "$group ... $who{$group} ($limit)\n" if $verbose;
X
X $attempted = 0;
X $transfered = 0;
X
X $tf = &touch_file($group);
X
X print STDERR "Touch (count) file: $tf\n" if $debug;
X
X unless (&lock($tf)) {
X print STDERR "Locked: $group\n";
X next;
X }
X
X GROUP: {
X #
X # figure out the set of articles to fetch.
X # this can either be based on time (set $time)
X # or on the remote article number (set $artnum)
X #
X
X &still_alive();
X undef $time;
X undef $artnum;
X
X if ($force) {
X $time = time - ($force_days * 60*60*24);
X } elsif (&touch_file_okay($group)) {
X if ($TIMESTAMP) {
X $time = (stat($tf))[9];
X } else {
X $artnum = &readfile($tf);
X }
X } else {
X if ($TIMESTAMP || $BACK_FEED != -1) {
X $time = time - ($BACK_FEED * 60*60*24);
X } else {
X $artnum = 0;
X }
X }
X
X
X print STDERR "time: $time artnum: $artnum\n" if $debug;
X
X # interaction:
X # group news.answers
X # 211 808 6976 9924 news.answers
X #
X
X $0 = "$o0: ($time/$artnum/$limit) send 'group $group'";
X
X &fs("group $group");
X $resp = &getline(FS);
X if ($resp =~ /^211\s+\d+\s+(\d+)\s+(\d+)\s/) {
X ($f_first, $f_last) = ($1, $2);
X
X %ids = ();
X undef $range;
X
X if ($time) {
X # interaction:
X # newnews news.answers 930722 140733 GMT
X # 230 New news since 850515 020000 follows
X # <
[email protected]>
X # .
X print STDERR "Using newnews to backfeed...\n"
X if $verbose && ! $TIMESTAMP;
X ($sec,$min,$hour,$mday,$mon,$year,$wday,$isdst) = gmtime($time);
X
X $0 = "$o0: ($group/$time/$artnum/$limit) send newnews";
X &fs(sprintf("newnews %s %02d%02d%02d %02d%02d%02d GMT",$group,$year,$mon+1,$mday,$hour,$min,$sec));
X $resp = &getline(FS);
X &still_alive();
X if ($resp =~ /^230\s/) {
X $0 = "$o0: ($group/$time/$artnum/$limit) recv newnews";
X while($resp = &getline(FS)) {
X &chop($resp);
X last if $resp =~ /^\.$/;
X $ids{$resp} = $resp;
X &still_alive if ($line_counter++ % 200 == 0);
X }
X } else {
X die "Illegal responce from server: $resp";
X }
X &still_alive();
X } else {
X # interaction:
X # xhdr Message-ID 120716-120717
X # 221 Message-ID fields follow
X # 120716 <
[email protected]>
X # 120717 <
[email protected]>
X # .
X $artnum = 0 if $artnum > $f_last; # whoops!
X $begin = ( $f_first > ($artnum+1) ? $f_first : ($artnum+1) );
X $end = $f_last;
X
X if ($end - $begin > 0) {
X $range = "$begin-$end";
X } elsif ($end == $begin) {
X $range = $end;
X } elsif ($end - $begin == -1) {
X $range = 0; # nothing to do...
X } else {
X # wheee, things is backwards!
X die "Range of articles for $group is impossible";
X }
X
X print STDERR "artnum $artnum begin $begin end $end range $range\n" if $debug;
X
X if ($range) {
X $0 = "$o0: ($group/$time/$artnum/$limit) send 'xhdr $range'";
X &still_alive();
X &fs("xhdr Message-ID $range");
X $resp = &getline(FS);
X if ($resp =~ /^221\s/) {
X while($resp = &getline(FS)) {
X $0 = "$o0: ($group/$time/$artnum/$limit) recv xhdr";
X &chop($resp);
X last if $resp =~ /^\.$/;
X if ($resp =~ /^(\d+)\s+(\S+)/) {
X $ids{$2} = $1;
X } else {
X die "Could not parse output from server: $resp";
X }
X &still_alive if ($line_counter++ % 200 == 0);
X }
X } else {
X die "Cannot use xhdr: $resp";
X }
X &still_alive();
X }
X }
X
X #
X # Okay, now we've got the articles to present in %ids
X # so let's run 'em by the TO server.
X #
X
X # interaction:
X #
X # C: IHAVE <
[email protected]>
X # S: 435 Already seen that one, where you been?
X #
X # (client offers another article)
X # C: IHAVE <
[email protected]>
X # S: 335 News to me! <CRLF.CRLF> to end.
X # C: (sends article)
X # C: .
X # S: 235 Article transferred successfully. Thanks.
X #
X # (or)
X #
X # S: 436 Transfer failed.
X
X # 235 article transferred ok
X # 335 send article to be transferred. End with <CR-LF>.<CR-LF>
X # 435 article not wanted - do not send it
X # 436 transfer failed - try again later
X # 437 article rejected - do not try again
X
Xprint "I am at ".__LINE__."\n" if $debug;
X
X %done = ();
X $group_count = 0;
X for $id (keys %ids) {
X &still_alive();
X if ($limit && $group_count > $limit) {
X print STDERR "Transfer stopped by group count limit\n"
X if $verbose;
X last;
X }
X $attempted += 1;
X $0 = "$o0: ($group/$time/$artnum/$limit) send ihave $id";
X &ts("ihave $id");
X $resp = &getline(TS);
X if ($resp =~ /^435\s/) {
X # skip that one...
X $done{$id} = 'duplicate';
X } elsif ($resp =~ /^335\s/) {
X # is a winner!!! ding ding ding.
X $0 = "$o0: ($group/$time/$artnum/$limit) send article $id";
X &fs("article $ids{$id}");
X $resp = &getline(FS);
X if ($resp =~ /^220\s/) {
X $0 = "$o0: ($group/$time/$artnum/$limit) xfer $id";
X &still_alive();
X $line_counter = 0;
X while($resp = &getline(FS)) {
X &chop($resp);
X &ts($resp);
X last if $resp =~ /^\.$/;
X &still_alive if ($line_counter++ % 200 == 0);
X }
X $resp = &getline(TS);
X if ($resp =~ /^235\s/) {
X # way cool!
X print "$id\n" if $verbose;
X $group_count += 1;
X $transfered += 1;
X $done{$id} = 'transfered';
X } elsif ($resp =~ /^436\s/) {
X # try again later...
X print STDERR "Told to try again\n";
X &ts("quit");
X &fs("quit");
X exit(0);
X } elsif ($resp =~ /^437\s/) {
X $done{$id} = 'rejected';
X # rejected, sigh.
X } else {
X die "Unknown responce: $resp";
X }
X &still_alive();
X } elsif ($resp =~ /^(423|430)\s/) {
X # ooops, it's gone!
X print STDERR "Article $id in $group disappeared on me\n";
X close(TS);
X &openTS();
X &still_alive();
X } else {
X die "Weird response: $resp";
X }
X } else {
X die "Unexpected code: $resp";
X }
X }
X
X $good_art = 0;
X
X if ($time && ! $TIMESTAMP) {
X #
X # This should only be true if we just tried to backfeed
X # a certain number of days.
X #
X # Now that we've got a partial set of articles, we want
X # to note what they are so that we can properly set the
X # touch file. This is kinda tricky.
X #
X if (%ids) {
X $0 = "$o0: ($group/$time/$artnum/$limit) calculate seq number: send xhdr";
X &still_alive();
X $line_counter == 0;
X &fs("xhdr Message-ID $f_first-$f_last");
X $resp = &getline(FS);
X if ($resp =~ /^221\s/) {
X $0 = "$o0: ($group/$time/$artnum/$limit) calculate seq number: recv xhdr";
X while($resp = &getline(FS)) {
X &chop($resp);
X last if $resp =~ /^\.$/;
X if ($resp =~ /^(\d+)\s+(\S+)/) {
X if ($done{$2}) {
X $good_art = $1;
X }
X } else {
X die "Could not parse output from server: $resp";
X }
X &still_alive if ($line_counter++ % 200 == 0);
X }
X } else {
X die "Cannot use xhdr: $resp";
X }
X unless ($good_art) {
X print STDERR "Did not manage to transfer antyhing, restting counter for $group!\n";
X }
X print STDERR "Reviewing, we find we are up to $good_art\n" if $verbose;
X } else {
X $good_art = $f_last;
X }
X }
X
X &still_alive();
X
X $tfe = -e $tf;
X if (! $tfe || ! $time) {
X open(TF,">$tf") || die "open >$tf: $!";
X $n = ($time ? ($good_art ? $good_art : 0) : $end);
X $0 = "$o0: set $tf = $n";
X (print TF "$n\n") || die "write $tf: $!";
X } else {
X open(TF,">>$tf") || die "open >$tf: $!";
X }
X ($sec,$min,$hour,$mday,$mon,$year,$wday,$isdst) = localtime($time);
X printf(TF "\n%d/%d/%02d $hour:$min --",$mon+1,$mday,$year,$hour,$min);
X print TF " $attempted attempted, $transfered suceeded\n";
X close(TF) || die "close $tf: $!";
X } elsif ($resp =~ /^411/) {
X print STDERR "$FROM doesn't carry $group\n";
X open(TF,">$tf") || die "open >$tf: $!";
X (print TF "0\n") || die "write $tf: $!";
X (print TF "\n$FROM doesn't carry $group\n");
X close(TF) || die "close $tf: $!";
X }
X }
X
X print STDERR "... $attempted offered, $transfered transfered\n"
X if ($attempted || $transfered);
X $0 = "$o0: ($n) run $MTHREADS on $group";
X &system("$MTHREADS $group")
X if ($MTHREADS && $transfered);
X &still_alive();
X
X &unlock($tf);
X}
X
X&unlock($MASTER_LOCK)
X if $MASTER_LOCK;
X
Xexit(0);
X
Xsub open_connections
X{
X #
X # Internal notation: f - from, t - to. Indicating
X # the flow of news. So, FS is the socket connected
X # to the news server that articles are retrieved
X # from and TS is connected to the server that articles
X # are sent to.
X #
X
X #
X # Open a socket to the NNTP server that news will be
X # transfered TO.
X #
X sub openTS {
X &socket(TS,$TO);
X local($0) = "$o0: get first line from $TO";
X $x = &getline(TS);
X die "To Server: $x" unless $x =~ /^200\s/;
X }
X &openTS();
X
X #
X # Open a socket to the NNTP sever that news will be
X # transfered FROM.
X #
X &socket(FS,$FROM);
X local($0) = "$o0: get first line from $FROM";
X $x = &getline(FS);
X die "To Server: $x" unless $x =~ /^20[01]\s/;
X}
X
X# The lock-keeping mechenism only owrks if the processes are owned
X# by the same user.
Xsub lock
X{
X local($tf) = @_;
X local($rl,$rl2);
X
X local($0) = "$o0: get lock on $tf";
X
X symlink("$$","$tf.lock");
X $rl = readlink("$tf.lock");
X if ($rl && $rl != $$) {
X $0 = "$o0: get lock on $tf (no alive file, sleeping)";
X sleep(60) if (! -e "$tf.alive");
X if (-e "$tf.alive") {
X if (time - (stat("$tf.alive"))[9] > $MAX_IDLE_TIME) {
X $rl2 = readlink("$tf.lock");
X if ($rl2 == $rl) {
X print STDERR "Breaking lock on $tf (was held by $rl)\n";
X $0 = "$o0: get lock on $tf (old alive, killing $rl)";
X kill(HUP,$rl);
X sleep(60);
X unlink("$tf.lock");
X unlink("$tf.alive");
X symlink("$$","$tf.lock");
X $rl = readlink("$tf.lock");
X }
X }
X } else {
X $rl2 = readlink("$tf.lock");
X if ($rl2 == $rl) {
X # $rl has been bad!
X print STDERR "Process $rl has a lock, but no alive file";
X $0 = "$o0: get lock on $tf (no alive, killing $rl)";
X kill(HUP,$rl);
X sleep(60);
X unlink("$tf.lock");
X unlink("$tf.alive");
X symlink("$$","$tf.lock");
X $rl = readlink("$tf.lock");
X }
X }
X }
X $locks{$tf} = 1;
X if ($rl == $$) {
X &touch("$tf.alive");
X }
X &still_alive();
X return ($rl == $$);
X}
X
Xsub unlock
X{
X local($f) = @_;
X local($rl);
X $rl = readlink("$f.lock");
X if ($rl == $$) {
X unlink("$f.lock");
X unlink("$f.alive");
X }
X delete $locks{$tf};
X}
X
X# keep touching the .alive file to show that we are still doing stuff.
X# if we ever stop, or doing do it for long enough, then we might be
X# killed by another process or our lock may be stolen.
Xsub still_alive
X{
X local($t);
X $t = time;
X
X if ($last_touch && ($t - $last_touch > $MAX_IDLE_TIME)) {
X # oops, we've been idle for TOO LONG
X die "Idle too long";
X } elsif ($t - $last_touch > $MIN_IDLE_TIME) {
X local($l);
X for $l (keys %locks) {
X &touch("$l.alive");
X }
X $last_touch = $t;
X }
X}
X
Xsub touch
X{
X local($t) = @_;
X local(*T);
X open(T,">$t") || die "open $t: $!";
X (print T "FOOBAR!!\n") || die "write $t: $!";
X close(T) || die "close $t: $!";
X}
X
Xsub fs
X{
X local($s) = @_;
X local($o);
X
X if ($debug) {
X ($o = $s) =~ s/(\S)/\1\b\1/g;
X print STDERR "$o\n";
X }
X (print FS "$s\r\n") || die "socket FS closed!";
X}
X
Xsub ts
X{
X local($s) = @_;
X local($u);
X
X if ($debug) {
X ($u = $s) =~ s/(\S)/_\b\1/g;
X print STDERR "$u\n";
X }
X (print TS "$s\r\n") || die "socket TS closed!";
X}
X
Xsub touch_file_okay
X{
X local($group) = @_;
X
X local($tf) = &touch_file($group);
X
X if (-e $tf && (-M $tf) < $MAX_EXPIRE) {
X return 1;
X }
X return 0;
X}
X
Xsub touch_file
X{
X local($group) = @_;
X
X local($gd,$tf);
X $gd = $group;
X ($gd =~ s!\.!/!g) if $FETCHDIRS;
X $tf = "$FETCHSPOOL/$gd/$FETCHFILE";
X return $tf;
X}
X
Xsub getline
X{
X local($S) = @_;
X local($resp);
X
X $resp = <$S>;
X die "connection to $S closed" if !defined($resp);
X print STDERR "$S: $resp" if $debug;
X return $resp;
X}
X
Xsub socket
X{
X local($S,$a) = @_;
X
X local($0) = "$o0: open socket to $a";
X print STDERR "Open $S: $a\n" if $debug;
X if ($a =~ m,^/,) {
X local($socket_path) = $a;
X local($sa);
X
X $sa = pack($SOCKADDR_UN, $AF_UNIX, $socket_path);
X socket($S, $AF_UNIX, $SOCK_STREAM, 0)
X || die "socket: $!";
X connect($S, $sa) || die "connect: $!";
X } else {
X local($host,$port);
X if ($a =~ m,(.+)/(.+),) {
X ($host, $port) = ($1, $2);
X } else {
X ($host, $port) = ($a, 'nntp');
X }
X local($name,$aliases,$proto,$type,$len);
X local($this,$that,$thisaddr,$thataddr);
X
X chop($hostname = `hostname`)
X unless $hostname;
X die "Could not get hostname"
X unless $hostname;
X ($name,$aliases,$proto) = getprotobyname('tcp');
X ($name,$aliases,$port) = getservbyname($port,'tcp')
X unless $port =~ /^\d+/;
X local($0) = "$o0: open socket to $a (gethostbyname)";
X ($name,$aliases,$type,$len,$thisaddr) = gethostbyname($hostname);
X $this = pack($SOCKADDR_IP, $AF_INET, 0, $thisaddr);
X ($name,$aliases,$type,$len,$thataddr) = gethostbyname($host);
X die "Could not get host addr for $host"
X unless $thataddr;
X local($0) = "$o0: open socket to $a (connect)";
X $that = pack($SOCKADDR_IP, $AF_INET, $port, $thataddr);
X socket($S, $AF_INET, $SOCK_STREAM, $proto)
X || die "socket: $!";
X bind($S, $this)
X || die "bind $hostname,0: $!";
X connect($S, $that)
X || die "connect $host: $!";
X }
X select((select($S),$| = 1)[0]); # don't buffer output to S
X print STDERR "$S is open\n" if $debug;
X}
X
Xsub readfile
X{
X local($file) = @_;
X local(@r,*F);
X
X open(F,$file) || die "open $file: $!";
X @r = (wantarray ? <F> : scalar(<F>));
X close(F);
X return @r;
X}
X
Xsub system
X{
X print STDERR "+ @_\n" if $verbose;
X system @_;
X}
X
Xsub chop
X{
X local($x);
X
X $x = chop($_[0]);
X if ($x ne "\n") {
X $_[0] .= $x;
X return;
X }
X $x = chop($_[0]);
X
X return if $x eq "\r";
X
X $_[0] .= $x;
X}
X
X# Build the &restrict($group) subroutine.
X# &restrict should return 1 for groups that should be skipped.
Xsub restrict_init
X{
X local($t,$rev,$revrev);
X
X local($e);
X
X $e = "sub restrict\n";
X $e .= "{\n";
X $e .= "\tlocal(\$g) = \@_;\n";
X $e .= "\n";
X
X for $t (reverse split(',',$RESTRICTION)) {
X $rev = ($t =~ s/^\!//);
X
X $t =~ s/\./\\./g;
X $t =~ s/\*/.*/g;
X
X $e .= "\treturn $rev if \$g =~ /$t/;\n";
X }
X
X $revrev = !$rev;
X $e .= "\treturn $revrev;\n";
X
X $e .= "}\n";
X $e .= "1;\n";
X
X print "RESTRICT:\n$e" if $debug;
X eval $e;
X die "Eval:\n$e --- $@" if $@;
X}
X
X# for_perl_dash_w
X{
X <FS>;
X $USR1;
X $USR2;
X &check_alive;
X}
X
X################### BEGIN PERL/TROFF TRANSITION
X.00;
X
X'di \\ " finish diversion--previous line must be blank
X.nr nl 0-1 \\ " fake up transition to first page again
X.nr % 0 \\ " start at page 1
X'; __END__
X.\" ############### END PERL/TROFF TRANSITION
X.TH FETCH-NEWS 1 "March 11, 1993"
X.AT 3
X.SH FETCH-NEWS
Xfetch-news \- pull the news that will be read via nntp
X.SH SYNOPSIS
X.B fetch-news
X.RI [ -force
X.BR days ]
X.RI [ -debug ]
X.RI [ -verbose ]
X.RI [ -config
X.BR config_file ]
X\fR[\fIPARAMETER\fR=\fBvalue\fR]...
X.SH DESCRIPTION
X.B fetch-news
Xwill use the NNTP to transfer news between two news servers.
XIt only transfers the news that it believes people will want to
Xread. It knows what people want to read by looking at their
X.newsrc files. Only groups that are activly being read will be
Xfetched.
X.LP
XUnlike some other similar programs,
X.B fetch-news
Xwill keep track of article numbers rather than using the
X.B newnews
Xcommand. This is important because with most news implementations
X.RB ( INN
Xin particular) the
X.B newnews
Xcommand thrahes the server.
X.SH OPTIONS
XThere are only a few command line options to
X.BR fetch-news ,
Xbut there are many configuration options. Since
X.B fetch-news
Xis a perl program distributed in source, the bulk of the configuration
Xoptions are specified in the program text itself just after the
Xcopyright notice. You do not need to change there! One of the
Xparmateters,
X.IR $CONFIG ,
Xis the location of a file to
X.B require
Xbefore doing much else. This file will be read before much else
Xis evaluated. Set your overrides there. The name of this file can
Xbe specified on the command line with
X.I -config
X.BR filename .
X.LP
XThe other options are
X.I -force
X.B days
Xwhich will override the touch files and cause
X.B days
Xworth of news to be transfered.
XYou can get more verbose output by specifying
X.IR -verbose ,
Xor
X.IR -debug.
X.LP
XFinally, any of the configuration parameters (see the source) can be
Xoverriden on the command line with an assigment: \fIcommand\fR=\fBvalue\fR.
X.SH NEWS SERVER SETUP
X.B fetch-news
Xis asymetrical in how it talks to news servers. It talks to the news server
Xthat it is transfering news from as if it were a news reader. It talks to
Xthe server that it is transfering news to as if it were another news server.
XThis difference can make configuration annoying. However, it's easy to
Xtest: if you can read news with a news reader, you can get the news
Xwith
X.B fetch-news
X(if you don't have permission, just use a packet forwarder).
X.LP
XOne implication is that with
X.BR fetch-news ,
Xyou tend get your news from one server and send your outgoing news
Xto another.
X.SH USE WITH SLOW LINKS
XI built
X.B fetch-news
Xmostly to optimize use of my slow (SLIP) internet connection. It
Xsaves my line by only transfering the news that is likely to be read.
XIn addition, I use a small tcp socket forwarder process to introduce
Xdelay. In my forwarded process, I have it use a small (128-byte)
Xsocket send buffer so that a round-trip delay is required for every
Xpacket. To get my socket forwarder, ftp to
X.I idiom.berkeley.ca.us
Xand grab
X.IR pub/muir-programs/tcp_nest.c .
X.SH SIMILAR SYSTEMS
XThere are quite a few news transfer systems that fill a similar role to
X.BR fetch-news .
XNone of them do exactly the same thing. Here are some pointers to a few...
X.LP
X.B Newsfeed
Xby Vernon C. Hoxie
X.RI <
[email protected] >,
Xdoes about what
X.B fetch-news
Xdoes, but it does with batches.
X.LP
X.B Gup
X(newsGroup Update Program)
Xby Mark Delany
X.RI <
[email protected] >
Xand
XAndrew Herbert
X.RI <
[email protected] >
Xprovides a way to update a news subscription file by mail.
X.LP
X.B SLNR
X(Simple Local News Reader)
Xby Philippe Goujard
X.RI <
[email protected] >
Xis an off-line newsreader. It grabs your news all at once
Xso that you can read it later.
X.LP
XAnd there is one more system, recently posted, that is very similar to
X.B fetch-news
Xexcept that is uses the
X.B newnews
Xcommand to transfer news. I apologize for forgetting the name.
X.SH AVAILABILITY
XThe latest version of
X.B fetch-news
Xis available through anonymous ftp to
X.IR idiom.berkeley.ca.us .
XThere is a small mailing list for discussing
X.BR fetch-news .
XSend mail to
X.I
[email protected]
Xto join it.
X.SH AUTHOR
X.I David Muir Sharnoff\ \ \ \ <
[email protected]>
SHAR_EOF
chmod 0755 fetch-news || echo "restore of fetch-news fails"
exit 0