| ---------------------------------------- | |
| Notes on using Lynx to convert gopher sites to HTML | |
| March 29th, 2018 | |
| ---------------------------------------- | |
| The lynx browser can be used to very quickly convert a gopher resource | |
| into almost-valid HTML, like so: | |
| lynx -anonymous -dump -source gopher://some.gopher.server/1/selector | |
| The HTML it outputs will not have an SGML/HTML DOCTYPE declared, will | |
| use relative URLs, and will likely have some obscure and poorly | |
| supported HTML elements (such as ISINDEX) if you are using it to | |
| access a CSO (gophertype 2) or gopher index (gophertype 7)... | |
| You can cure most of these HTML woes by using HTML tidy. | |
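Before reaching for tidy, it can be handy to see which of these woes a given dump actually has. Here is a rough sketch of such a check, reading the HTML on standard input. The function name `html_woes` is made up for illustration, and it only looks for the two easiest problems to spot (no DOCTYPE, use of ISINDEX); it does not try to detect relative URLs.

```shell
# html_woes: hypothetical helper, reads HTML on stdin and lists obvious woes.
html_woes() {
    _html=$(cat)
    # No DOCTYPE declaration anywhere in the document (case-insensitive).
    printf '%s\n' "$_html" | grep -i '<!DOCTYPE' >/dev/null ||
        echo 'missing DOCTYPE'
    # ISINDEX tends to appear for CSO (type 2) and index (type 7) resources.
    printf '%s\n' "$_html" | grep -i '<ISINDEX' >/dev/null &&
        echo 'uses ISINDEX'
    return 0
}
```

Something like `lynx -anonymous -dump -source gopher://some.gopher.server/1/selector | html_woes` would then print one line per woe found.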
| Looking closer at the HTML that lynx spits out, it looks like it could | |
| easily pass for HTML 2.0, 3.*, 4.*, or even ISO/IEC 15445:2000... | |
| With that in mind, I often find myself doing something such as: | |
| printf '%s\15\12' '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"' \ | |
| '"http://www.w3.org/TR/html4/strict.dtd">' | |
| lynx -anonymous -dump -source gopher://some.gopher.server/1/selector | \ | |
| tidy -n --doctype omit --show-errors 0 --quiet 1 | |
| This generates some darn good HTML that will render quickly and won't | |
| trigger quirks mode in modern browsers. | |
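If you do this often, the two commands above can be wrapped in a small shell function. This is only a sketch: the name `gopher2html` is my own invention, it assumes `lynx` and `tidy` are both on your PATH, and it reuses exactly the flags shown above.

```shell
# gopher2html URL: hypothetical wrapper around the DOCTYPE + lynx + tidy steps.
gopher2html() {
    # Only gopher URLs make sense here; refuse anything else.
    case ${1-} in
        gopher://*) ;;
        *) echo 'usage: gopher2html gopher://host/1/selector' >&2
           return 2 ;;
    esac
    # Same DOCTYPE as above; \15\12 are the octal codes for CR and LF.
    printf '%s\15\12' '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"' \
        '"http://www.w3.org/TR/html4/strict.dtd">'
    # Fetch the resource as HTML source and let tidy clean it up,
    # without emitting a second DOCTYPE.
    lynx -anonymous -dump -source "$1" |
        tidy -n --doctype omit --show-errors 0 --quiet 1
}
```

Usage would look like `gopher2html gopher://some.gopher.server/1/selector > listing.html`, where `listing.html` is just an example file name.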
| Another thing about the HTML that lynx generates from gopher URLs: it | |
| looks a lot like the old Netscape 2.0 gopher listings, except that it | |
| has a text prefix instead of an icon: | |
| (HTML) for gophertype h - HTML or H - CHTML, not sure about this one | |
| (TEL) for gophertype 8 - telnet | |
| (3270) for gophertype T - telnet 3270 | |
| (FILE) for gophertype 0 - text | |
| (DIR) for gophertype 1 - gopher menu | |
| (CSO) for gophertype 2 - CCSO nameserver, email/phone lookup | |
| (BIN) for gophertype 5 - PC binary or 9 - binary | |
| (UUE) for gophertype 6 - uuencoded file (typically a saved email) | |
| (?) for gophertype 7 - searchable gopher index | |
| (IMG) for gophertype g - gif or I - image or : (image from gopher+) | |
| (SND) for gophertype s - sound file or < - sound file (from gopher+) | |
| (HQX) for gophertype 4 - BinHex4 encoded file (from oldworld mac?) | |
| (MIME) for gophertype m - base64 encoded file (saved email??) | |
| (MOV) for gophertype ; - movie (from gopher+) | |
| (PDF) for gophertype P - ISO 32000-2 document (PDF file) | |
| (UNKN) for gophertypes that lynx does not know about | |
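The list above can be written down as a lookup, which is useful if you want to post-process the listings or predict what prefix lynx will show for a given gophertype. This is a sketch that simply transcribes the list as I observed it, not something taken from the lynx source; the function name `lynx_label` is made up.

```shell
# lynx_label TYPE: print the prefix from the list above for one gophertype
# character. Hypothetical helper; the mapping is a transcription of that list.
lynx_label() {
    case $1 in
        h|H)         label=HTML ;;
        8)           label=TEL ;;
        T)           label=3270 ;;
        0)           label=FILE ;;
        1)           label=DIR ;;
        2)           label=CSO ;;
        5|9)         label=BIN ;;
        6)           label=UUE ;;
        7)           label='?' ;;
        g|I|':')     label=IMG ;;
        s|'<')       label=SND ;;
        4)           label=HQX ;;
        m)           label=MIME ;;
        ';')         label=MOV ;;
        P)           label=PDF ;;
        *)           label=UNKN ;;   # anything not in the list
    esac
    printf '(%s)\n' "$label"
}
```

The gopher+ types `:`, `<`, and `;` need quoting when passed on the command line, for example `lynx_label ';'`.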
| ---------------------------------------- | |