2024-07-30 - PHAROS, a gopher front-end to the Internet Archive
===============================================================
I wanted to write a gopher front-end to the Internet Archive for some
time. Like any retrocomputing project, it isn't really practical,
just fun. The other day, i finally got around to writing it using
AWK. It runs as a CGI script under geomyidae. It requires:
* curl - used to fetch files from the Internet Archive
* geomyidae - gopher server with apache-style CGI
* json2tsv - decode JSON data from the API
* webdump - decode HTML item descriptions
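As a rough illustration of how the tools above can fit together, here
is a minimal sketch that queries the Internet Archive search endpoint
with curl. It is not taken from the pharos source. The ia_search
helper, the CURL and JSON2TSV override variables, the chosen fields,
and the default of 15 rows are all assumptions made for this example.

```shell
#!/bin/sh
# Sketch only: query the Internet Archive search API with curl.
# CURL and JSON2TSV are hypothetical override variables, not pharos settings.
CURL="${CURL:-curl}"
JSON2TSV="${JSON2TSV:-json2tsv}"

# ia_search QUERY [PAGE] [ROWS]
# -G sends the data as a URL query string, --data-urlencode encodes each
# value, and -g stops curl from treating [] as a glob pattern.
ia_search() {
    "$CURL" -s -g -G \
        --data-urlencode "q=$1" \
        --data-urlencode "fl[]=identifier" \
        --data-urlencode "fl[]=title" \
        --data-urlencode "rows=${3:-15}" \
        --data-urlencode "page=${2:-1}" \
        --data-urlencode "output=json" \
        "https://archive.org/advancedsearch.php"
}

# Example (needs network access); json2tsv flattens the JSON reply into
# tab-separated lines that are easy to process with AWK:
#   ia_search 'mediatype:texts AND title:cookbook' | "$JSON2TSV"
```

Letting curl do the URL encoding keeps the query string readable in
the script and avoids hand-written escaping.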
I am writing this post to share and show my code. Hopefully someone
will have fun with it. Below is a link to a live demo.
Pharos demo:
gopher://tilde.pink/1/~bencollver/ia/
<
gopher://tilde.pink/1/~bencollver/ia/>
I named it Pharos after the famous lighthouse island near Alexandria.
I tested this on geomyidae versions 0.69 and 0.96. It is important
to edit the script and change geomyidae_version to match your
software, or else it won't work at all. It's also important to
configure the cmd_* variables to the correct paths for the commands
used by this script. They are set in the top level file config.m4.
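Before editing the cmd_* paths, it can help to check which of the
required tools are installed. The following is a minimal sketch, and
missing_cmds is a hypothetical helper, not part of pharos.

```shell
#!/bin/sh
# missing_cmds NAME...: print each command that is not found in PATH.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example: check the tools this post lists as requirements.
#   missing_cmds curl geomyidae json2tsv webdump
# For a tool that is installed, `command -v NAME` prints its full path,
# which is the kind of value a cmd_* variable needs.
```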
I tested the script with busybox awk, mawk, and nawk. On the client
side i used gopherus and lynx. This script won't work with lagrange
or with web-to-gopher proxies because they expect URL encoded input,
and i am using gopher encoded input. ;)
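To make the difference between the two encodings concrete: URL
encoded text writes a colon as %3A and a space as %20, while raw
gopher input carries those characters literally. The sketch below
decodes the URL encoded form. It is an illustration only, not code
from pharos, and urldecode is a hypothetical helper. It handles
single-byte ASCII escapes and does not attempt multi-byte characters.

```shell
#!/bin/sh
# urldecode TEXT: replace each %XX escape with the character it encodes.
urldecode() {
    printf '%s\n' "$1" | awk '
    BEGIN { hex = "0123456789abcdef" }
    {
        out = ""
        s = $0
        # Consume the string one %XX escape at a time.
        while (match(s, /%[0-9A-Fa-f][0-9A-Fa-f]/)) {
            h = tolower(substr(s, RSTART + 1, 2))
            code = (index(hex, substr(h, 1, 1)) - 1) * 16 \
                 + (index(hex, substr(h, 2, 1)) - 1)
            out = out substr(s, 1, RSTART - 1) sprintf("%c", code)
            s = substr(s, RSTART + RLENGTH)
        }
        print out s
    }'
}

# Example:
#   urldecode 'title%3Acookbook%20pesto'    # prints: title:cookbook pesto
```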
I configured lynx to show numbers next to each link. That way i can
type in the number rather than navigating the cursor to the link.
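One way to get numbered links in lynx is the -number_links command
line option. The wrapper below is a minimal sketch and not
necessarily the configuration used for this post. LYNX is a
hypothetical override variable added so the wrapper can be tried
without starting a browser.

```shell
#!/bin/sh
# lynx_numbered URL...: start lynx with a number next to each link.
# The same behavior can be set permanently in lynx.cfg with:
#   KEYPAD_MODE:LINKS_ARE_NUMBERED
lynx_numbered() {
    "${LYNX:-lynx}" -number_links "$@"
}

# Example:
#   lynx_numbered gopher://tilde.pink/1/~bencollver/ia/
```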
Here are screenshots of the top level.
<
gopher://tilde.pink/I/~bencollver/log/
2024-07-30-pharos-gopher-frontend-to-internet-archive/pharos1.png>
<
gopher://tilde.pink/I/~bencollver/log/
2024-07-30-pharos-gopher-frontend-to-internet-archive/pharos2.png>
The Search option is a simple search. I type in a query, and it
returns results from the Internet Archive. Special queries are
possible. See the About item for an example.
The Advanced Search option provides a "wizard" style interface to
choose which fields to search.
The Books, Video, Audio, Software, and Images options list featured
and popular collections in those categories.
The Source code link shows the AWK source code.
As an example, i will try to find a cookbook with an interesting
recipe. I began by selecting the Books item.
<
gopher://tilde.pink/I/~bencollver/log/
2024-07-30-pharos-gopher-frontend-to-internet-archive/pharos3.png>
I selected All Texts to search through everything that's text.
<
gopher://tilde.pink/I/~bencollver/log/
2024-07-30-pharos-gopher-frontend-to-internet-archive/pharos4.png>
<
gopher://tilde.pink/I/~bencollver/log/
2024-07-30-pharos-gopher-frontend-to-internet-archive/pharos5.png>
It found over 36 million items and starts on page 1 out of
2.5 million pages. The search results show an item per line. The
[txt] tag means that the item's media type is text. At the bottom of
the page are options to proceed to Page 2, to Sort the results, to
Filter the results, and lastly, to return to the PHAROS top level.
I selected [\/] Filter results to narrow down my search.
<
gopher://tilde.pink/I/~bencollver/log/
2024-07-30-pharos-gopher-frontend-to-internet-archive/pharos6.png>
I selected Title contains, and typed in the word "cookbook" to
narrow down my search results.
<
gopher://tilde.pink/I/~bencollver/log/
2024-07-30-pharos-gopher-frontend-to-internet-archive/pharos7.png>
I selected "Apply search criteria" to see the new results.
<
gopher://tilde.pink/I/~bencollver/log/
2024-07-30-pharos-gopher-frontend-to-internet-archive/pharos8.png>
The new results show 824 items on 55 pages. Much better! Note:
Filtering the search results reverts to the default sort order.
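The page count follows from the item count by rounding up. The
figures above, 824 items on 55 pages, are consistent with 15 items
per page. That page size is inferred from those two numbers, not read
from the pharos source, and page_count is a hypothetical helper.

```shell
#!/bin/sh
# page_count ITEMS [PER_PAGE]: number of pages, rounding up.
# The default of 15 items per page is an assumption (see the text above).
page_count() {
    items=$1
    per_page=${2:-15}
    echo $(( ($items + $per_page - 1) / $per_page ))
}

# Example:
#   page_count 824    # prints 55
```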
I selected [^v] Sort to sort the results.
<
gopher://tilde.pink/I/~bencollver/log/
2024-07-30-pharos-gopher-frontend-to-internet-archive/pharos9.png>
This shows various ways to sort the results. I selected
Date added [v] to sort the results in descending order.
<
gopher://tilde.pink/I/~bencollver/log/
2024-07-30-pharos-gopher-frontend-to-internet-archive/pharos10.png>
I selected the first result, which was Master Plants Cookbook
The 33 Most Healing Superfoods...
<
gopher://tilde.pink/I/~bencollver/log/
2024-07-30-pharos-gopher-frontend-to-internet-archive/pharos11.png>
This brought up the item details. I selected Thumbnail, and it
opened the item thumbnail in ImageMagick. At the end of this post, i
will show how i configured lynx to display images.
I selected Download. This shows a list of files for download, with
one line per file.
<
gopher://tilde.pink/I/~bencollver/log/
2024-07-30-pharos-gopher-frontend-to-internet-archive/pharos12.png>
The first two lines begin with (HTML). They are http download links,
not gopher, because these files are 20 MB and 40 MB in size, and
both exceed the default 10 MB max_bin_size that pharos will proxy to
gopher. Most of the lines begin with (DIR), which means that pharos
will proxy and transfer those files over gopher.
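The size check described above can be sketched as follows. This is an
illustration, not the pharos code. link_kind is a hypothetical
helper, and treating 10 MB as 10 * 1024 * 1024 bytes is an
assumption.

```shell
#!/bin/sh
# Size limit for proxying over gopher.
# Assumption: 10 MB means 10 * 1024 * 1024 bytes.
max_bin_size=$((10 * 1024 * 1024))

# link_kind SIZE_IN_BYTES: print "http" for a file over the limit,
# otherwise "gopher".
link_kind() {
    if [ "$1" -gt "$max_bin_size" ]; then
        echo http
    else
        echo gopher
    fi
}

# Example:
#   link_kind 20971520    # a 20 MB file gets an http link
#   link_kind 4096        # a small file is proxied over gopher
```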
The bottom item "Downloads via http" shows the same files, but all
with http download links. These can be useful to copy/paste into
documentation, build scripts, etc.
I selected the derived _djvu.txt file, which the Internet Archive
produced from OCR. This shows options for downloading and viewing
this specific file.
<
gopher://tilde.pink/I/~bencollver/log/
2024-07-30-pharos-gopher-frontend-to-internet-archive/pharos13.png>
I selected Text view, which uses gopher type 0 to view the text in
the gopher client.
<
gopher://tilde.pink/I/~bencollver/log/
2024-07-30-pharos-gopher-frontend-to-internet-archive/pharos14.png>
This shows the title text of the cookbook. At the top right it says
page 1 out of 705 pages of text. Flipping through the pages, i see
that the index begins on page 17. My eye landed on the recipe titled
Mesopotamian Kale Pesto. Searching for this title, i found it on
page 456. Correcting the OCR errors, i got:
[[[[[
Mesopotamian Kale Pesto
C: Pesto
B: Master Plants Cookbook by Margarita Restrepo & Michele Lastella
Y: 1-1/2 Cups
2 c Organic kale; deveined and chopped
2 cl Garlic; peeled
1/3 c Raw pistachios; shelled
1/4 ts Red or black pepper
1 tb Tamari
1/2 Lemon; juice of
In a food processor, process all ingredients slowly until smooth and
creamy. Store the pesto in the refrigerator for up to 2 weeks.
Note: Use this pesto as a dip, a spread for bruschetta, or toss with
spiralized raw zucchini "noodles."
]]]]]
That's the end of my demonstration.
I mentioned my lynx configuration. Below is an excerpt from my
~/.mailcap
image/gif; ~/bin/display-image.sh %s; nametemplate=%s.gif
image/jpeg; ~/bin/display-image.sh %s; nametemplate=%s.jpg
image/png; ~/bin/display-image.sh %s; nametemplate=%s.png
image/webp; ~/bin/display-image.sh %s; nametemplate=%s.webp
application/pdf; xpdf %s
And below are the contents of my ~/bin/display-image.sh
#!/bin/sh
test -n "$DISPLAY" && display "$1" || (chafa "$1"; read pause)
This script displays the image using ImageMagick by default, and it
falls back to chafa in a text-only shell.
Source code:
<
https://chiselapp.com/user/bencollver/repository/pharos/dir?ci=tip>
tags: bencollver,retrocomputing,technical
Tags
====
bencollver
<
gopher://tilde.pink/1/~bencollver/log/tag/bencollver/>
retrocomputing
<
gopher://tilde.pink/1/~bencollver/log/tag/retrocomputing/>
technical
<
gopher://tilde.pink/1/~bencollver/log/tag/technical/>