man page improvements (sync) - tscrape - twitter scraper
git clone git://git.codemadness.org/tscrape
---
commit b0413f42bd2bc31cbbb5e338093de51b94cfd028
parent 423d3f5ad6023be3eb50ebe2f9504309bfe3d940
Author: Hiltjo Posthuma <[email protected]>
Date: Fri, 20 Mar 2020 12:03:26 +0100
man page improvements (sync)
- tscraperc.5: use the same order as executed in the tscrape_update file.
- tscraperc.5: reference curl, which is optional but used by default.
- tscraperc.5: use a .Sh VARIABLES section for tscrapepath and maxjobs.
- tscrape_update.1: split config format-specific documentation and reference it.
- just use the term "url" instead of "uri".
- shorten some texts, increasing readability.
- document exit status of tools.
fix:
- do not reference RSS/Atom.
Diffstat:
M tscrape.1 | 4 +++-
M tscrape.5 | 6 +++---
M tscrape_html.1 | 4 +++-
M tscrape_plain.1 | 4 +++-
M tscrape_update.1 | 56 ++++++++++-------------------…
M tscraperc.5 | 61 ++++++++++++++++++-----------…
6 files changed, 65 insertions(+), 70 deletions(-)
---
diff --git a/tscrape.1 b/tscrape.1
@@ -1,4 +1,4 @@
-.Dd May 11, 2018
+.Dd March 20, 2020
.Dt TSCRAPE 1
.Os
.Sh NAME
@@ -35,6 +35,8 @@ Item Retweet ID.
.It item is pinned
Item is pinned or not? 0 or 1.
.El
+.Sh EXIT STATUS
+.Ex -std
.Sh EXAMPLES
.Bd -literal -offset left
curl --http1.0 -H 'User-Agent:' -s 'https://twitter.com/namehere' | tscrape
diff --git a/tscrape.5 b/tscrape.5
@@ -1,4 +1,4 @@
-.Dd July 20, 2019
+.Dd March 20, 2020
.Dt TSCRAPE 5
.Os
.Sh NAME
@@ -21,8 +21,8 @@ Control characters are replaced by a single space.
.Pp
The order and content of the fields are:
.Bl -tag -width 17n
-.It UNIX timestamp
-UNIX timestamp in UTC+0.
+.It timestamp
+UNIX timestamp in UTC+0, empty on parse failure.
.It username
Twitter username (can be a retweet).
.It fullname
diff --git a/tscrape_html.1 b/tscrape_html.1
@@ -1,4 +1,4 @@
-.Dd July 20, 2019
+.Dd March 20, 2020
.Dt TSCRAPE_HTML 1
.Os
.Sh NAME
@@ -26,6 +26,8 @@ is empty.
.Pp
Items with a timestamp from the last day compared to the system time at the
time of formatting are counted and marked as new.
+.Sh EXIT STATUS
+.Ex -std
.Sh SEE ALSO
.Xr tscrape 1 ,
.Xr tscrape_plain 1 ,
diff --git a/tscrape_plain.1 b/tscrape_plain.1
@@ -1,4 +1,4 @@
-.Dd July 20, 2019
+.Dd March 20, 2020
.Dt TSCRAPE_PLAIN 1
.Os
.Sh NAME
@@ -38,6 +38,8 @@ per rune, using
.Xr mbtowc 3
and
.Xr wcwidth 3 .
+.Sh EXIT STATUS
+.Ex -std
.Sh SEE ALSO
.Xr tscrape 1 ,
.Xr tscrape_html 1 ,
diff --git a/tscrape_update.1 b/tscrape_update.1
@@ -1,4 +1,4 @@
-.Dd August 17, 2019
+.Dd March 20, 2020
.Dt TSCRAPE_UPDATE 1
.Os
.Sh NAME
@@ -9,65 +9,43 @@
.Op Ar tscraperc
.Sh DESCRIPTION
.Nm
-updates feeds files and merges the new data with the previous files.
-These are the files in the directory
+writes TAB-separated feed files and merges new items with the items in any
+existing files.
+The items are stored in one file per feed in the directory
.Pa $HOME/.tscrape/feeds
by default.
+The directory can be changed in the
+.Xr tscraperc 5
+file.
.Sh OPTIONS
.Bl -tag -width 17n
.It Ar tscraperc
-Config file, if not specified uses the path
+Config file.
+The default is
.Pa $HOME/.tscrape/tscraperc
-by default.
-See the
-.Sx FILES READ
-section for more information.
.El
.Sh FILES READ
.Bl -tag -width 17n
.It Ar tscraperc
-Config file, see the tscraperc.example file for an example.
This file is evaluated as a shellscript in
.Nm .
-.Pp
-Atleast the following functions can be overridden per feed:
-.Bl -tag -width 17n
-.It Fn fetch
-to use
-.Xr wget 1 ,
-OpenBSD
-.Xr ftp 1
-or an other download program.
-.It Fn merge
-to change the merge logic.
-.It Fn filter
-to filter on fields.
-.It Fn order
-to change the sort order.
-.El
-.Pp
-The
-.Fn feeds
-function is called to process the feeds.
-The default
-.Fn feed
-function is executed concurrently as a background job in your
+See also the
.Xr tscraperc 5
-config file to make updating faster.
-The variable
-.Va maxjobs
-can be changed to limit or increase the amount of concurrent jobs (8 by
-default).
+man page for a detailed description of the format and an example file.
.El
.Sh FILES WRITTEN
.Bl -tag -width 17n
.It feedname
-TAB-separated format containing all items per feed.
+TAB-separated
+.Xr tscrape 5
+format containing all items per feed.
The
.Nm
script merges new items with this file.
-The filename cannot contain '/' characters, they will be replaced with '_'.
+The feedname cannot contain '/' characters, they will be replaced with '_'.
.El
+.Sh EXIT STATUS
+.Ex -std
.Sh EXAMPLES
To update your feeds and format them in various formats:
.Bd -literal
diff --git a/tscraperc.5 b/tscraperc.5
@@ -1,4 +1,4 @@
-.Dd July 14, 2019
+.Dd March 20, 2020
.Dt TSCRAPERC 5
.Os
.Sh NAME
@@ -8,30 +8,36 @@
.Nm
is the configuration file for
.Xr tscrape_update 1 .
-.Pp
-The variable
-.Va tscrapepath
-can be set for the directory to store the TAB-separated feed files,
-by default this is
+.Sh VARIABLES
+.Bl -tag -width Ds
+.It Va tscrapepath
+can be set for the directory to store the TAB-separated feed files.
+The default is
.Pa $HOME/.tscrape/feeds .
-.
+.It Va maxjobs
+can be used to change the amount of concurrent
+.Fn feed
+jobs.
+The default is 8.
+.El
.Sh FUNCTIONS
-The following functions must be defined in a
-.Nm
-file:
.Bl -tag -width Ds
.It Fn feeds
-This function is like a "main" function called from
+This function is the required "main" entry-point function called from
.Xr tscrape_update 1 .
.It Fn feed "name" "feedurl"
-Function to process the feed, its arguments are in the order:
+Inside the
+.Fn feeds
+function feeds can be defined by calling the
+.Fn feed
+function, its arguments are:
.Bl -tag -width Ds
.It Fa name
Name of the feed, this is also used as the filename for the TAB-separated
feed file.
-The filename cannot contain '/' characters, they will be replaced with '_'.
+The feedname cannot contain '/' characters, they will be replaced with '_'.
.It Fa feedurl
-Uri to fetch the RSS/Atom data from, usually a HTTP or HTTPS uri.
+Url to fetch the data from, usually a HTTP or HTTPS url.
.El
.El
.Sh OVERRIDE FUNCTIONS
@@ -40,16 +46,28 @@ Because
is a shellscript each function can be overridden to change its behaviour,
notable functions are:
.Bl -tag -width Ds
-.It Fn fetch "name" "uri" "feedfile"
+.It Fn fetch "name" "url" "feedfile"
Fetch feed from url and writes data to stdout, its arguments are:
.Bl -tag -width Ds
.It Fa name
Specified name in configuration file (useful for logging).
-.It Fa uri
-Uri to fetch.
+.It Fa url
+Url to fetch.
.It Fa feedfile
Used feedfile (useful for comparing modification times).
.El
+.Pp
+By default the tool
+.Xr curl 1
+is used.
+.It Fn filter "name"
+Filter
+.Xr tscrape 5
+data from stdin, write to stdout, its arguments are:
+.Bl -tag -width Ds
+.It Fa name
+Feed name.
+.El
.It Fn merge "name" "oldfile" "newfile"
Merge data of oldfile with newfile and writes it to stdout, its arguments are:
.Bl -tag -width Ds
@@ -60,14 +78,6 @@ Old file.
.It Fa newfile
New file.
.El
-.It Fn filter "name"
-Filter
-.Xr tscrape 5
-data from stdin, write to stdout, its arguments are:
-.Bl -tag -width Ds
-.It Fa name
-Feed name.
-.El
.It Fn order "name"
Sort
.Xr tscrape 5
@@ -92,6 +102,7 @@ feeds() {
}
.Ed
.Sh SEE ALSO
+.Xr curl 1 ,
.Xr sh 1 ,
.Xr tscrape_update 1
.Sh AUTHORS
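
Note: the tscraperc(5) text in this commit describes the variables, the required feeds() function and the overridable functions, but the example section is only partly visible in the diff. The following is a minimal sketch of how such a config might look, not the project's shipped example. The feed() stub stands in for the function that tscrape_update(1) provides and exists only so the sketch can be run on its own. The "some/name" feed, the maxjobs value of 4, the filter rule and the curl flags in fetch() (borrowed from the tscrape(1) EXAMPLES line) are assumptions for illustration.

```shell
#!/bin/sh
# Sketch of a tscraperc-style config, based on the tscraperc(5) text above.

# Variables from the VARIABLES section.
tscrapepath="$HOME/.tscrape/feeds"  # directory for the TAB-separated feed files (the documented default)
maxjobs=4                           # concurrent feed() jobs; 4 is an arbitrary choice, the default is 8

# Hypothetical stand-in for the feed() function provided by tscrape_update(1).
# It only prints the file a feed would be stored in, plus its url.
# '/' in the feed name is replaced with '_', as the man page describes.
feed() {
	name=$(printf '%s' "$1" | tr '/' '_')
	printf '%s\t%s\n' "${tscrapepath}/${name}" "$2"
}

# Override: fetch(name, url, feedfile) writes the fetched data to stdout.
# Assumption: these are the curl flags from the tscrape(1) EXAMPLES line,
# not necessarily what the default fetch() uses.
fetch() {
	curl --http1.0 -H 'User-Agent:' -s "$2"
}

# Override: filter(name) reads tscrape(5) data on stdin and writes to stdout.
# Illustrative rule: drop items whose first field (the timestamp) is empty,
# which tscrape(5) documents as the result of a parse failure.
filter() {
	awk -F '\t' '$1 != ""'
}

# Required entry point: defines the feeds by calling feed(name, feedurl).
feeds() {
	feed "namehere" "https://twitter.com/namehere"
	feed "some/name" "https://twitter.com/somename"
}
```

To try the sketch without tscrape_update, source the file in a shell and run feeds. In a real tscraperc the feed() stub would be left out, since tscrape_update supplies that function and calls feeds() itself.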