man page improvements (sync) - tscrape - twitter scraper
git clone git://git.codemadness.org/tscrape
---
commit b0413f42bd2bc31cbbb5e338093de51b94cfd028
parent 423d3f5ad6023be3eb50ebe2f9504309bfe3d940
Author: Hiltjo Posthuma <[email protected]>
Date: Fri, 20 Mar 2020 12:03:26 +0100

man page improvements (sync)

- tscraperc.5: use the same order as executed in the tscrape_update file.
- tscraperc.5: reference curl, which are optional, but used by default.
- tscraperc.5: use a .Sh VARIABLES section for tscrapepath and maxjobs.
- tscrape_update.1: split config format-specific documentation and reference it.
- just use the term "url" instead of "uri".
- shorten some texts, increasing readability.
- document exit status of tools.
fix:
- do not reference RSS/Atom.
Diffstat:
M tscrape.1 | 4 +++-
M tscrape.5 | 6 +++---
M tscrape_html.1 | 4 +++-
M tscrape_plain.1 | 4 +++-
M tscrape_update.1 | 56 ++++++++++-------------------…
M tscraperc.5 | 61 ++++++++++++++++++-----------…
6 files changed, 65 insertions(+), 70 deletions(-)
---
diff --git a/tscrape.1 b/tscrape.1
@@ -1,4 +1,4 @@
-.Dd May 11, 2018
+.Dd March 20, 2020
.Dt TSCRAPE 1
.Os
.Sh NAME
@@ -35,6 +35,8 @@ Item Retweet ID.
.It item is pinned
Item is pinned or not? 0 or 1.
.El
+.Sh EXIT STATUS
+.Ex -std
.Sh EXAMPLES
.Bd -literal -offset left
curl --http1.0 -H 'User-Agent:' -s 'https://twitter.com/namehere' | tscrape
diff --git a/tscrape.5 b/tscrape.5
@@ -1,4 +1,4 @@
-.Dd July 20, 2019
+.Dd March 20, 2020
.Dt TSCRAPE 5
.Os
.Sh NAME
@@ -21,8 +21,8 @@ Control characters are replaced by a single space.
.Pp
The order and content of the fields are:
.Bl -tag -width 17n
-.It UNIX timestamp
-UNIX timestamp in UTC+0.
+.It timestamp
+UNIX timestamp in UTC+0, empty on parse failure.
.It username
Twitter username (can be a retweet).
.It fullname
diff --git a/tscrape_html.1 b/tscrape_html.1
@@ -1,4 +1,4 @@
-.Dd July 20, 2019
+.Dd March 20, 2020
.Dt TSCRAPE_HTML 1
.Os
.Sh NAME
@@ -26,6 +26,8 @@ is empty.
.Pp
Items with a timestamp from the last day compared to the system time at the
time of formatting are counted and marked as new.
+.Sh EXIT STATUS
+.Ex -std
.Sh SEE ALSO
.Xr tscrape 1 ,
.Xr tscrape_plain 1 ,
diff --git a/tscrape_plain.1 b/tscrape_plain.1
@@ -1,4 +1,4 @@
-.Dd July 20, 2019
+.Dd March 20, 2020
.Dt TSCRAPE_PLAIN 1
.Os
.Sh NAME
@@ -38,6 +38,8 @@ per rune, using
.Xr mbtowc 3
and
.Xr wcwidth 3 .
+.Sh EXIT STATUS
+.Ex -std
.Sh SEE ALSO
.Xr tscrape 1 ,
.Xr tscrape_html 1 ,
diff --git a/tscrape_update.1 b/tscrape_update.1
@@ -1,4 +1,4 @@
-.Dd August 17, 2019
+.Dd March 20, 2020
.Dt TSCRAPE_UPDATE 1
.Os
.Sh NAME
@@ -9,65 +9,43 @@
.Op Ar tscraperc
.Sh DESCRIPTION
.Nm
-updates feeds files and merges the new data with the previous files.
-These are the files in the directory
+writes TAB-separated feed files and merges new items with the items in any
+existing files.
+The items are stored in one file per feed in the directory
.Pa $HOME/.tscrape/feeds
by default.
+The directory can be changed in the
+.Xr tscraperc 5
+file.
.Sh OPTIONS
.Bl -tag -width 17n
.It Ar tscraperc
-Config file, if not specified uses the path
+Config file.
+The default is
.Pa $HOME/.tscrape/tscraperc
-by default.
-See the
-.Sx FILES READ
-section for more information.
.El
.Sh FILES READ
.Bl -tag -width 17n
.It Ar tscraperc
-Config file, see the tscraperc.example file for an example.
This file is evaluated as a shellscript in
.Nm .
-.Pp
-Atleast the following functions can be overridden per feed:
-.Bl -tag -width 17n
-.It Fn fetch
-to use
-.Xr wget 1 ,
-OpenBSD
-.Xr ftp 1
-or an other download program.
-.It Fn merge
-to change the merge logic.
-.It Fn filter
-to filter on fields.
-.It Fn order
-to change the sort order.
-.El
-.Pp
-The
-.Fn feeds
-function is called to process the feeds.
-The default
-.Fn feed
-function is executed concurrently as a background job in your
+See also the
.Xr tscraperc 5
-config file to make updating faster.
-The variable
-.Va maxjobs
-can be changed to limit or increase the amount of concurrent jobs (8 by
-default).
+man page for a detailed description of the format and an example file.
.El
.Sh FILES WRITTEN
.Bl -tag -width 17n
.It feedname
-TAB-separated format containing all items per feed.
+TAB-separated
+.Xr tscrape 5
+format containing all items per feed.
The
.Nm
script merges new items with this file.
-The filename cannot contain '/' characters, they will be replaced with '_'.
+The feedname cannot contain '/' characters, they will be replaced with '_'.
.El
+.Sh EXIT STATUS
+.Ex -std
.Sh EXAMPLES
To update your feeds and format them in various formats:
.Bd -literal
diff --git a/tscraperc.5 b/tscraperc.5
@@ -1,4 +1,4 @@
-.Dd July 14, 2019
+.Dd March 20, 2020
.Dt TSCRAPERC 5
.Os
.Sh NAME
@@ -8,30 +8,36 @@
.Nm
is the configuration file for
.Xr tscrape_update 1 .
-.Pp
-The variable
-.Va tscrapepath
-can be set for the directory to store the TAB-separated feed files,
-by default this is
+.Sh VARIABLES
+.Bl -tag -width Ds
+.It Va tscrapepath
+can be set for the directory to store the TAB-separated feed files.
+The default is
.Pa $HOME/.tscrape/feeds .
-.
+.It Va maxjobs
+can be used to change the amount of concurrent
+.Fn feed
+jobs.
+The default is 8.
+.El
.Sh FUNCTIONS
-The following functions must be defined in a
-.Nm
-file:
.Bl -tag -width Ds
.It Fn feeds
-This function is like a "main" function called from
+This function is the required "main" entry-point function called from
.Xr tscrape_update 1 .
.It Fn feed "name" "feedurl"
-Function to process the feed, its arguments are in the order:
+Inside the
+.Fn feeds
+function feeds can be defined by calling the
+.Fn feed
+function, its arguments are:
.Bl -tag -width Ds
.It Fa name
Name of the feed, this is also used as the filename for the TAB-separated
feed file.
-The filename cannot contain '/' characters, they will be replaced with '_'.
+The feedname cannot contain '/' characters, they will be replaced with '_'.
.It Fa feedurl
-Uri to fetch the RSS/Atom data from, usually a HTTP or HTTPS uri.
+Url to fetch the data from, usually a HTTP or HTTPS url.
.El
.El
.Sh OVERRIDE FUNCTIONS
@@ -40,16 +46,28 @@ Because
is a shellscript each function can be overridden to change its behaviour,
notable functions are:
.Bl -tag -width Ds
-.It Fn fetch "name" "uri" "feedfile"
+.It Fn fetch "name" "url" "feedfile"
Fetch feed from url and writes data to stdout, its arguments are:
.Bl -tag -width Ds
.It Fa name
Specified name in configuration file (useful for logging).
-.It Fa uri
-Uri to fetch.
+.It Fa url
+Url to fetch.
.It Fa feedfile
Used feedfile (useful for comparing modification times).
.El
+.Pp
+By default the tool
+.Xr curl 1
+is used.
+.It Fn filter "name"
+Filter
+.Xr tscrape 5
+data from stdin, write to stdout, its arguments are:
+.Bl -tag -width Ds
+.It Fa name
+Feed name.
+.El
.It Fn merge "name" "oldfile" "newfile"
Merge data of oldfile with newfile and writes it to stdout, its arguments are:
.Bl -tag -width Ds
@@ -60,14 +78,6 @@ Old file.
.It Fa newfile
New file.
.El
-.It Fn filter "name"
-Filter
-.Xr tscrape 5
-data from stdin, write to stdout, its arguments are:
-.Bl -tag -width Ds
-.It Fa name
-Feed name.
-.El
.It Fn order "name"
Sort
.Xr tscrape 5
@@ -92,6 +102,7 @@ feeds() {
}
.Ed
.Sh SEE ALSO
+.Xr curl 1 ,
.Xr sh 1 ,
.Xr tscrape_update 1
.Sh AUTHORS
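
The tscraperc.5 text in this diff describes two variables (tscrapepath, maxjobs), a required feeds() entry point that calls feed "name" "feedurl", and overridable functions such as fetch "name" "url" "feedfile". A minimal sketch of such a config file follows. It is an illustration, not the project's example file: the second feed name and URL, the maxjobs value and the curl flags in fetch() are assumptions. The fallback feed() is a hypothetical stand-in, defined only when no feed function exists, so the sketch can be run on its own. It only mimics the documented '/' to '_' replacement and does not fetch anything.

```shell
# Hypothetical tscraperc sketch; feed names, URLs and values are placeholders.

# Directory for the TAB-separated feed files.
# This is the documented default, shown here only for clarity.
tscrapepath="$HOME/.tscrape/feeds"

# Number of concurrent feed() jobs (the documented default is 8).
maxjobs=4

# Stand-in for running this sketch outside tscrape_update(1).
# It is skipped when a feed function is already defined.
if ! command -v feed >/dev/null 2>&1; then
	feed() {
		# Mimic the documented rule: '/' in the feed name becomes '_'.
		printf '%s\t%s\n' "$(printf '%s' "$1" | tr '/' '_')" "$2"
	}
fi

# Optional override, arguments: name, url, feedfile.
# Assumed curl flags: -f fail on HTTP errors, -s silent, -L follow
# redirects, -m 15 give up after 15 seconds. Data goes to stdout.
fetch() {
	curl -f -s -L -m 15 "$2"
}

# Required entry point called from tscrape_update(1).
feeds() {
	feed "namehere" "https://twitter.com/namehere"
	feed "some/name" "https://twitter.com/somename"
}
```

Calling feeds directly in a shell that has sourced this sketch prints one name and URL pair per feed through the stand-in. Under tscrape_update the real feed function would be used instead.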