man page documentation for all tools: copied from sfeed and changed - tscrape -… | |
git clone git://git.codemadness.org/tscrape | |
Log | |
Files | |
Refs | |
README | |
LICENSE | |
--- | |
commit f05f3eb6c90f7b1baf7369498609dc5d5d212b63 | |
parent 797f715398aeac08febc8067ed6423da727e4f45 | |
Author: Hiltjo Posthuma <[email protected]> | |
Date: Sat, 17 Aug 2019 12:10:00 +0200 | |
man page documentation for all tools: copied from sfeed and changed | |
initial version, needs some more work. | |
Diffstat: | |
M Makefile | 15 ++++++++++++--- | |
A tscrape.5 | 47 +++++++++++++++++++++++++++++… | |
A tscrape_html.1 | 34 +++++++++++++++++++++++++++++… | |
A tscrape_plain.1 | 46 +++++++++++++++++++++++++++++… | |
A tscrape_update.1 | 89 +++++++++++++++++++++++++++++… | |
A tscraperc.5 | 98 +++++++++++++++++++++++++++++… | |
6 files changed, 326 insertions(+), 3 deletions(-) | |
--- | |
diff --git a/Makefile b/Makefile | |
@@ -36,7 +36,11 @@ COMPATOBJ =\ | |
LIB = ${LIBUTIL} ${LIBXML} ${COMPATOBJ} | |
-MAN1 = tscrape.1 | |
+MAN1 = ${BIN:=.1}\ | |
+ ${SCRIPTS:=.1} | |
+MAN5 = \ | |
+ tscrape.5\ | |
+ tscraperc.5 | |
DOC = \ | |
LICENSE\ | |
README | |
@@ -66,7 +70,7 @@ ${LIBXML}: ${LIBXMLOBJ} | |
dist: | |
rm -rf "${NAME}-${VERSION}" | |
mkdir -p "${NAME}-${VERSION}" | |
- cp -f ${MAN1} ${DOC} ${HDR} \ | |
+ cp -f ${MAN1} ${MAN5} ${DOC} ${HDR} \ | |
${SRC} ${LIBXMLSRC} ${LIBUTILSRC} ${COMPATSRC} ${SCRIPTS} \ | |
Makefile config.mk \ | |
tscraperc.example style.css \ | |
@@ -90,10 +94,14 @@ install: all | |
style.css\ | |
README\ | |
"${DESTDIR}${DOCPREFIX}" | |
- # installing manual pages for tools. | |
+ # installing manual pages for general commands: section 1. | |
mkdir -p "${DESTDIR}${MANPREFIX}/man1" | |
cp -f ${MAN1} "${DESTDIR}${MANPREFIX}/man1" | |
for m in $(MAN1); do chmod 644 "${DESTDIR}${MANPREFIX}/man1/$$m"; done | |
+ # installing manual pages for file formats: section 5. | |
+ mkdir -p "${DESTDIR}${MANPREFIX}/man5" | |
+ cp -f ${MAN5} "${DESTDIR}${MANPREFIX}/man5" | |
+ for m in ${MAN5}; do chmod 644 "${DESTDIR}${MANPREFIX}/man5/$$m"; done | |
uninstall: | |
# removing executable files and scripts. | |
@@ -106,5 +114,6 @@ uninstall: | |
-rmdir "${DESTDIR}${DOCPREFIX}" | |
# removing manual pages. | |
for m in $(MAN1); do rm -f "${DESTDIR}${MANPREFIX}/man1/$$m"; done | |
+ for m in ${MAN5}; do rm -f "${DESTDIR}${MANPREFIX}/man5/$$m"; done | |
.PHONY: all clean dist install uninstall | |
diff --git a/tscrape.5 b/tscrape.5 | |
@@ -0,0 +1,47 @@ | |
+.Dd July 20, 2019 | |
+.Dt TSCRAPE 5 | |
+.Os | |
+.Sh NAME | |
+.Nm tscrape | |
+.Nd output format | |
+.Sh SYNOPSIS | |
+.Nm | |
+.Sh DESCRIPTION | |
+.Xr tscrape 1 | |
+writes the feed data in a TAB-separated format to stdout. | |
+.Sh TAB-SEPARATED FORMAT FIELDS | |
+The items are output per line in a TSV-like format. | |
+.Pp | |
+The fields are not allowed to have newlines and TABs, all whitespace characters | |
+are replaced by a single space character. | |
+Control characters are removed. | |
+.Sh TAB-SEPARATED FORMAT FIELDS | |
+The items are saved in a TSV-like format. Control characters are replaced | |
+by a single space. | |
+.Pp | |
+The order and format of the fields are: | |
+.Bl -tag -width 17n | |
+.It UNIX timestamp | |
+UNIX timestamp in UTC+0. | |
+.It username | |
+Twitter username (can be a retweet). | |
+.It fullname | |
+Twitter fullname (can be a retweet). | |
+.It tweet text | |
+Tweet text. | |
+.It item id | |
+Item id. | |
+.It item username | |
+Item username. | |
+.It item fullname | |
+Item fullname. | |
+.It item retweetid | |
+Item Retweet ID. | |
+.It item is pinned | |
+Item is pinned or not? 0 or 1. | |
+.El | |
+.Sh SEE ALSO | |
+.Xr tscrape 1 , | |
+.Xr tscrape_plain 1 | |
+.Sh AUTHORS | |
+.An Hiltjo Posthuma Aq Mt [email protected] | |
diff --git a/tscrape_html.1 b/tscrape_html.1 | |
@@ -0,0 +1,34 @@ | |
+.Dd July 20, 2019 | |
+.Dt TSCRAPE_HTML 1 | |
+.Os | |
+.Sh NAME | |
+.Nm tscrape_html | |
+.Nd format TSV data to HTML | |
+.Sh SYNOPSIS | |
+.Nm | |
+.Op Ar file... | |
+.Sh DESCRIPTION | |
+.Nm | |
+formats tscrape data (TSV) from | |
+.Xr tscrape 1 | |
+from stdin or | |
+.Ar file | |
+to stdout in HTML. | |
+If one or more | |
+.Ar file | |
+are specified, the basename of the | |
+.Ar file | |
+is used as the feed name in the output. | |
+If no | |
+.Ar file | |
+parameters are specified and so the data is read from stdin the feed name | |
+is empty. | |
+.Pp | |
+Items with a timestamp from the last day compared to the system time at the | |
+time of formatting are counted and marked as new. | |
+.Sh SEE ALSO | |
+.Xr tscrape 1 , | |
+.Xr tscrape_plain 1 , | |
+.Xr tscrape 5 | |
+.Sh AUTHORS | |
+.An Hiltjo Posthuma Aq Mt [email protected] | |
diff --git a/tscrape_plain.1 b/tscrape_plain.1 | |
@@ -0,0 +1,46 @@ | |
+.Dd July 20, 2019 | |
+.Dt TSCRAPE_PLAIN 1 | |
+.Os | |
+.Sh NAME | |
+.Nm tscrape_plain | |
+.Nd format tscrape data to a plain-text list | |
+.Sh SYNOPSIS | |
+.Nm | |
+.Op Ar file... | |
+.Sh DESCRIPTION | |
+.Nm | |
+formats tscrape data (TSV) from | |
+.Xr tscrape 1 | |
+from stdin or | |
+.Ar file | |
+to stdout as a plain-text list. | |
+If one or more | |
+.Ar file | |
+are specified, the basename of the | |
+.Ar file | |
+is used as the feed name in the output. | |
+If no | |
+.Ar file | |
+parameters are specified and so the data is read from stdin the feed name | |
+is empty. | |
+.Pp | |
+Items with a timestamp from the last day compared to the system time at the | |
+time of formatting are marked as new. | |
+.Pp | |
+.Nm | |
+aligns the output. | |
+It shows a maximum of 70 column-wide characters for the title and outputs | |
+an ellipsis symbol if the title is longer and truncated. | |
+Make sure the environment variable | |
+.Ev LC_CTYPE | |
+is set to a UTF-8 locale, so it can determine the proper column-width | |
+per rune, using | |
+.Xr mbtowc 3 | |
+and | |
+.Xr wcwidth 3 . | |
+.Sh SEE ALSO | |
+.Xr tscrape 1 , | |
+.Xr tscrape_html 1 , | |
+.Xr tscrape 5 | |
+.Sh AUTHORS | |
+.An Hiltjo Posthuma Aq Mt [email protected] | |
diff --git a/tscrape_update.1 b/tscrape_update.1 | |
@@ -0,0 +1,89 @@ | |
+.Dd August 17, 2019 | |
+.Dt TSCRAPE_UPDATE 1 | |
+.Os | |
+.Sh NAME | |
+.Nm tscrape_update | |
+.Nd update feeds and merge with old feeds | |
+.Sh SYNOPSIS | |
+.Nm | |
+.Op Ar tscraperc | |
+.Sh DESCRIPTION | |
+.Nm | |
+updates feeds files and merges the new data with the previous files. | |
+These are the files in the directory | |
+.Pa $HOME/.tscrape/feeds | |
+by default. | |
+.Sh OPTIONS | |
+.Bl -tag -width 17n | |
+.It Ar tscraperc | |
+Config file, if not specified uses the path | |
+.Pa $HOME/.tscrape/tscraperc | |
+by default. | |
+See the | |
+.Sx FILES READ | |
+section for more information. | |
+.El | |
+.Sh FILES READ | |
+.Bl -tag -width 17n | |
+.It Ar tscraperc | |
+Config file, see the tscraperc.example file for an example. | |
+This file is evaluated as a shellscript in | |
+.Nm . | |
+.Pp | |
+Atleast the following functions can be overridden per feed: | |
+.Bl -tag -width 17n | |
+.It Fn fetch | |
+to use | |
+.Xr wget 1 , | |
+OpenBSD | |
+.Xr ftp 1 | |
+or an other download program. | |
+.It Fn merge | |
+to change the merge logic. | |
+.It Fn filter | |
+to filter on fields. | |
+.It Fn order | |
+to change the sort order. | |
+.El | |
+.Pp | |
+The | |
+.Fn feeds | |
+function is called to process the feeds. | |
+The default | |
+.Fn feed | |
+function is executed concurrently as a background job in your | |
+.Xr tscraperc 5 | |
+config file to make updating faster. | |
+The variable | |
+.Va maxjobs | |
+can be changed to limit or increase the amount of concurrent jobs (8 by | |
+default). | |
+.El | |
+.Sh FILES WRITTEN | |
+.Bl -tag -width 17n | |
+.It feedname | |
+TAB-separated format containing all items per feed. | |
+The | |
+.Nm | |
+script merges new items with this file. | |
+The filename cannot contain '/' characters, they will be replaced with '_'. | |
+.El | |
+.Sh EXAMPLES | |
+To update your feeds and format them in various formats: | |
+.Bd -literal | |
+# Update | |
+tscrape_update "configfile" | |
+# Plain-text list | |
+tscrape_plain $HOME/.tscrape/feeds/* > $HOME/.tscrape/feeds.txt | |
+# HTML | |
+tscrape_html $HOME/.tscrape/feeds/* > $HOME/.tscrape/feeds.html | |
+.Ed | |
+.Sh SEE ALSO | |
+.Xr tscrape 1 , | |
+.Xr tscrape_html 1 , | |
+.Xr tscrape_plain 1 , | |
+.Xr sh 1 , | |
+.Xr tscrape 5 , | |
+.Xr tscraperc 5 | |
+.Sh AUTHORS | |
+.An Hiltjo Posthuma Aq Mt [email protected] | |
diff --git a/tscraperc.5 b/tscraperc.5 | |
@@ -0,0 +1,98 @@ | |
+.Dd July 14, 2019 | |
+.Dt TSCRAPERC 5 | |
+.Os | |
+.Sh NAME | |
+.Nm tscraperc | |
+.Nd tscrape_update(1) configuration file | |
+.Sh DESCRIPTION | |
+.Nm | |
+is the configuration file for | |
+.Xr tscrape_update 1 . | |
+.Pp | |
+The variable | |
+.Va tscrapepath | |
+can be set for the directory to store the TAB-separated feed files, | |
+by default this is | |
+.Pa $HOME/.tscrape/feeds . | |
+. | |
+.Sh FUNCTIONS | |
+The following functions must be defined in a | |
+.Nm | |
+file: | |
+.Bl -tag -width Ds | |
+.It Fn feeds | |
+This function is like a "main" function called from | |
+.Xr tscrape_update 1 . | |
+.It Fn feed "name" "feedurl" | |
+Function to process the feed, its arguments are in the order: | |
+.Bl -tag -width Ds | |
+.It Fa name | |
+Name of the feed, this is also used as the filename for the TAB-separated | |
+feed file. | |
+The filename cannot contain '/' characters, they will be replaced with '_'. | |
+.It Fa feedurl | |
+Uri to fetch the RSS/Atom data from, usually a HTTP or HTTPS uri. | |
+.El | |
+.El | |
+.Sh OVERRIDE FUNCTIONS | |
+Because | |
+.Xr tscrape_update 1 | |
+is a shellscript each function can be overridden to change its behaviour, | |
+notable functions are: | |
+.Bl -tag -width Ds | |
+.It Fn fetch "name" "uri" "feedfile" | |
+Fetch feed from url and writes data to stdout, its arguments are: | |
+.Bl -tag -width Ds | |
+.It Fa name | |
+Specified name in configuration file (useful for logging). | |
+.It Fa uri | |
+Uri to fetch. | |
+.It Fa feedfile | |
+Used feedfile (useful for comparing modification times). | |
+.El | |
+.It Fn merge "name" "oldfile" "newfile" | |
+Merge data of oldfile with newfile and writes it to stdout, its arguments are: | |
+.Bl -tag -width Ds | |
+.It Fa name | |
+Feed name. | |
+.It Fa oldfile | |
+Old file. | |
+.It Fa newfile | |
+New file. | |
+.El | |
+.It Fn filter "name" | |
+Filter | |
+.Xr tscrape 5 | |
+data from stdin, write to stdout, its arguments are: | |
+.Bl -tag -width Ds | |
+.It Fa name | |
+Feed name. | |
+.El | |
+.It Fn order "name" | |
+Sort | |
+.Xr tscrape 5 | |
+data from stdin, write to stdout, its arguments are: | |
+.Bl -tag -width Ds | |
+.It Fa name | |
+Feed name. | |
+.El | |
+.El | |
+.Sh EXAMPLES | |
+An example configuration file is included named tscraperc.example and also | |
+shown below: | |
+.Bd -literal | |
+#tscrapepath="$HOME/.tscrape/feeds" | |
+ | |
+# list of feeds to fetch: | |
+feeds() { | |
+ # feed <name> <feedurl> | |
+ feed "Rich Felker" "https://twitter.com/richfelker" | |
+ feed "Internet of shit" "https://twitter.com/internetofshit" | |
+ feed "Donald Trump" "https://twitter.com/realdonaldtrump" | |
+} | |
+.Ed | |
+.Sh SEE ALSO | |
+.Xr tscrape_update 1 , | |
+.Xr sh 1 | |
+.Sh AUTHORS | |
+.An Hiltjo Posthuma Aq Mt [email protected] |