sfeed_update: use xargs -P -0 - sfeed - RSS and Atom parser
git clone git://git.codemadness.org/sfeed
---
commit cdb8f7feb135adf6f18e389b4bbf47886089474a
parent 62bfed65ca91c34ea24b81b191c23d4542a7075b
Author: Hiltjo Posthuma <[email protected]>
Date: Tue, 26 Dec 2023 15:59:39 +0100
sfeed_update: use xargs -P -0
Some of the options, like -P, are non-POSIX as of writing (2023):
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/xargs.html
However, many systems have supported this useful extension for many years now.
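As an illustration only (this command is not part of the commit; echo is just a
placeholder child command), these are exactly the two extensions in question:
-0 reads NUL-separated arguments and -P runs jobs in parallel:

    printf '%s\0' one two three | xargs -0 -P 2 -n 1 echo job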
Some historic context:
The xargs -0 option was added on 1996-06-11, about a year after the NetBSD
import (over 27 years ago at the time of writing):
http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/xargs/xargs.c?rev=1.2&cont…
On OpenBSD the xargs -P option was added on 2003-12-06 by syncing the FreeBSD
code:
http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/xargs/xargs.c?rev=1.14&con…
Looking at the imported git history log of GNU findutils (which contains
xargs), the very first commit already had the -0 and -P options
(Sun Feb 4 20:35:16 1996 +0000).
Tested on many systems, old and new; some notable ones:
- OpenBSD 7.4
- Void Linux
- FreeBSD 12
- NetBSD 9.3
- HaikuOS (uses GNU tools).
- Slackware 11
- OpenBSD 3.8
- NetBSD 5.1
Some shells:
- oksh
- bash
- dash
- zsh
During testing, some incompatibilities were found in how the fields are
parsed, so the arguments are passed as one argument which is split later on
by the child program.
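A minimal standalone sketch of that record format (not the committed code; the
feed name and URL are made up and only four of the six fields are shown): each
job is a single NUL-terminated argument whose fields are joined with the ASCII
unit separator (\037), so xargs passes exactly one argument per job and the
child splits it again, which keeps empty fields intact across xargs
implementations:

    # build one record: name, feedurl, basesiteurl (empty), encoding (empty).
    printf '%s\037%s\037%s\037%s\0' "somefeed" "https://example.org/atom.xml" "" "" |
    xargs -0 -P 2 -n 1 sh -c '
        # split the single argument on the unit separator again.
        printf "%s\n" "$1" | {
            IFS="$(printf "\037")" read -r name feedurl basesiteurl encoding
            printf "name=%s url=%s base=%s enc=%s\n" \
                "$name" "$feedurl" "$basesiteurl" "$encoding"
        }
    ' sh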
Diffstat:
M sfeed_update | 48 +++++++++++++++++++++--------…
1 file changed, 32 insertions(+), 16 deletions(-)
---
diff --git a/sfeed_update b/sfeed_update
@@ -163,14 +163,12 @@ _feed() {
# fetch and process a feed in parallel.
# feed(name, feedurl, [basesiteurl], [encoding])
feed() {
- # wait until ${maxjobs} are finished: will stall the queue if an item
- # is slow, but it is portable.
- [ ${signo} -ne 0 ] && return
- [ $((curjobs % maxjobs)) -eq 0 ] && wait
- [ ${signo} -ne 0 ] && return
- curjobs=$((curjobs + 1))
-
- _feed "$@" &
+ # Job parameters for xargs.
+ # Specify fields as a single parameter separated by the NUL separator.
+ # These fields are split later by the child process, this allows xargs
+ # with empty fields across many implementations.
+ printf '%s\037%s\037%s\037%s\037%s\037%s\0' \
+ "${config}" "${sfeedtmpdir}" "$1" "$2" "$3" "$4"
}
cleanup() {
@@ -201,8 +199,6 @@ feeds() {
}
main() {
- # job counter.
- curjobs=0
# signal number received for parent.
signo=0
# SIGINT: signal to interrupt parent.
@@ -217,16 +213,36 @@ main() {
touch "${sfeedtmpdir}/ok" || die
# make sure path exists.
mkdir -p "${sfeedpath}"
- # fetch feeds specified in config file.
- feeds
- # wait till all feeds are fetched (concurrently).
- [ ${signo} -eq 0 ] && wait
- # check error exit status indicator for parallel jobs.
- [ -f "${sfeedtmpdir}/ok" ]
+
+ # print feeds for parallel processing with xargs.
+ feeds > "${sfeedtmpdir}/jobs" || die
+ SFEED_UPDATE_CHILD="1" xargs -s 65535 -x -0 -P "${maxjobs}" -n 1 \
+ "$(readlink -f "${argv0}")" < "${sfeedtmpdir}/jobs"
statuscode=$?
+
+ # check error exit status indicator for parallel jobs.
+ [ -f "${sfeedtmpdir}/ok" ] || statuscode=1
# on signal SIGINT and SIGTERM exit with signal number + 128.
[ ${signo} -ne 0 ] && die $((signo+128))
die ${statuscode}
}
+# process a single feed.
+# parameters are: config, tmpdir, name, feedurl, basesiteurl, encoding
+if [ "${SFEED_UPDATE_CHILD}" = "1" ]; then
+ IFS="" # "\037"
+ [ "$1" = "" ] && exit 0 # must have an argument set
+ printf '%s\n' "$1" | \
+ while read -r config tmpdir name feedurl basesiteurl encoding; do
+ # load config file, sets $config.
+ loadconfig "${config}"
+ sfeedtmpdir="${tmpdir}"
+ _feed "${name}" "${feedurl}" "${basesiteurl}" "${encoding}"
+ exit "$?"
+ done
+ exit 0
+fi
+
+# ...else parent mode:
+argv0="$0" # remember $0, in shells like zsh $0 is the function name.
[ "${SFEED_UPDATE_INCLUDE}" = "1" ] || main "$@"