----------------------------------------
Notes on using Lynx to convert gopher sites to HTML
March 29th, 2018
----------------------------------------

The lynx browser can be used to very quickly convert a gopher resource
into almost-valid HTML, like so:

    lynx -anonymous -dump -source gopher://some.gopher.server/1/selector

The HTML it outputs will not have an SGML/HTML DOCTYPE declared, will
use relative URLs, and will likely have some obscure and poorly
supported HTML elements (such as ISINDEX) if you are using it to
access a CSO (gophertype 2) or gopher index (gophertype 7).

You can cure most of these HTML woes by using HTML Tidy.
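
To see which of these issues a particular dump actually has, a rough
sketch like the following may help. The name check_dump is made up for
this example, and it only looks for a DOCTYPE and for ISINDEX in
whatever HTML arrives on stdin:

```shell
#!/bin/sh
# check_dump: hypothetical helper, not part of lynx or tidy.
# Reads HTML on stdin and reports two of the issues described above.
check_dump() {
    html=$(cat)
    # Is any DOCTYPE declared? (case-insensitive match)
    if printf '%s\n' "$html" | grep -qi '<!DOCTYPE'; then
        echo "doctype: present"
    else
        echo "doctype: missing"
    fi
    # Is the old ISINDEX element used?
    if printf '%s\n' "$html" | grep -qi '<ISINDEX'; then
        echo "isindex: present"
    else
        echo "isindex: absent"
    fi
}

# Usage (needs lynx and network access):
#   lynx -anonymous -dump -source gopher://some.gopher.server/7/selector | check_dump
```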

Looking closer at the HTML that lynx spits out, it looks like it could
easily pass for HTML 2.0, 3.*, 4.*, or even ISO/IEC 15445:2000...
With that in mind, I often find myself doing something such as:

    printf '%s\15\12' '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"' \
      '"http://www.w3.org/TR/html4/strict.dtd">'
    lynx -anonymous -dump -source gopher://some.gopher.server/1/selector | \
      tidy -n --doctype omit --show-errors 0 --quiet 1

This generates some darn good HTML that will render quickly and won't
trigger quirks mode in modern browsers.
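
If you do this often, the two commands can be wrapped in shell
functions so that the DOCTYPE and the tidied markup end up in one
output stream. This is only a sketch: the names emit_doctype and
gopher2html are invented here, and it assumes lynx and tidy are both
on your PATH.

```shell
#!/bin/sh
# emit_doctype: print the HTML 4.01 Strict DOCTYPE, one CRLF-terminated
# line per argument (\15\12 is octal for CR LF, as in the command above).
emit_doctype() {
    printf '%s\15\12' \
        '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"' \
        '"http://www.w3.org/TR/html4/strict.dtd">'
}

# gopher2html: hypothetical wrapper; $1 is the gopher URL to convert.
# Writes the DOCTYPE followed by the tidied lynx output to stdout.
gopher2html() {
    emit_doctype
    lynx -anonymous -dump -source "$1" |
        tidy -n --doctype omit --show-errors 0 --quiet 1
}

# Usage (needs lynx, tidy and network access):
#   gopher2html gopher://some.gopher.server/1/selector > page.html
```

Note that tidy exits non-zero when it has warnings or errors to report,
so the function's exit status reflects tidy's verdict rather than
whether any output was written.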

Another thing about the HTML that lynx generates from gopher URLs: it
looks a lot like the old Netscape 2.0 gopher listings, except that it
has a text prefix instead of an icon:

  (HTML) for gophertype h - HTML or H - CHTML, not sure about this one
  (TEL)  for gophertype 8 - telnet
  (3270) for gophertype T - telnet 3270
  (FILE) for gophertype 0 - text
  (DIR)  for gophertype 1 - gopher menu
  (CSO)  for gophertype 2 - CCSO nameserver, email/phone lookup
  (BIN)  for gophertype 5 - PC binary or 9 - binary
  (UUE)  for gophertype 6 - uuencoded file (typically a saved email)
  (?)    for gophertype 7 - searchable gopher index
  (IMG)  for gophertype g - gif or I - image or : (image from gopher+)
  (SND)  for gophertype s - sound file or < - sound file (from gopher+)
  (HQX)  for gophertype 4 - BinHex4 encoded file (from oldworld mac?)
  (MIME) for gophertype m - base64 encoded file (saved email??)
  (MOV)  for gophertype ; - movie (from gopher+)
  (PDF)  for gophertype P - ISO 32000-2 document (PDF file)
  (UNKN) for gophertypes that lynx does not know about
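
The list above can be written down as a small lookup function, which is
handy if you want to post-process or imitate these listings. The name
lynx_label is made up for this sketch, and the mapping simply mirrors
the list rather than anything read out of the lynx source:

```shell
#!/bin/sh
# lynx_label: hypothetical helper; $1 is a single gophertype character.
# Prints the text prefix from the list above for that gophertype.
lynx_label() {
    case "$1" in
        h|H)   echo '(HTML)' ;;
        8)     echo '(TEL)'  ;;
        T)     echo '(3270)' ;;
        0)     echo '(FILE)' ;;
        1)     echo '(DIR)'  ;;
        2)     echo '(CSO)'  ;;
        5|9)   echo '(BIN)'  ;;
        6)     echo '(UUE)'  ;;
        7)     echo '(?)'    ;;
        g|I|:) echo '(IMG)'  ;;
        s|'<') echo '(SND)'  ;;
        4)     echo '(HQX)'  ;;
        m)     echo '(MIME)' ;;
        ';')   echo '(MOV)'  ;;
        P)     echo '(PDF)'  ;;
        *)     echo '(UNKN)' ;;   # anything not in the list
    esac
}

# Usage: the gophertype is the first character of a raw menu line.
#   lynx_label 1      prints (DIR)
#   lynx_label ';'    prints (MOV)
```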
----------------------------------------