| # 2024-07-02 - Dinosaur Hunting With AWK | |
| The 80,000 or so recipes in MOAR came from a dump of the web site | |
| formerly at soar.berkeley.edu. Most of the recipes were collected | |
| from the BBS scene going back into the days of yore when dinosaurs | |
| roamed cyberspace and real programmers wrote code using bits of | |
| shells and strings. Some hardware and software DID NOT SUPPORT | |
| LOWERCASE LETTERS AT ALL. Consequently, some of the recipes used | |
| ALL CAPITAL LETTERS. Some recipes were normal except either the | |
| ingredients were all uppercase, or the instructions were all | |
| uppercase. | |
| I PERSONALLY FIND DINOSAUR LANGUAGE DIFFICULT TO READ, SO I RESOLVED | |
| TO FIND THESE RECIPES AND FIX THEM ONCE AND FOR ALL. | |
| I wrote a quick awk script to report the percentage of capital | |
| letters in each recipe file. | |
| $ cat >caps.awk <<'__EOF__' | |
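| BEGIN { | |
|     # An empty FS makes each character its own field. This is a | |
|     # common extension (e.g. gawk, mawk), not guaranteed by POSIX. | |
|     FS = "" | |
| } | |
| { | |
|     # Count lowercase and uppercase letters per input file. | |
|     for (i = 1; i <= NF; i++) { | |
|         if (match($i, /[a-z]/)) { | |
|             lcase[FILENAME]++ | |
|         } else if (match($i, /[A-Z]/)) { | |
|             ucase[FILENAME]++ | |
|         } | |
|     } | |
| } | |
| END { | |
|     # Only files with at least one uppercase letter appear in ucase. | |
|     for (fn in ucase) { | |
|         lnum = lcase[fn] + 0 | |
|         unum = ucase[fn] | |
|         # Avoid dividing by zero when a file has no lowercase letters. | |
|         if (lnum == 0) { | |
|             lnum = 1 | |
|         } | |
|         # Uppercase count as a percentage of the lowercase count. | |
|         pct = int(100 * unum / lnum) | |
|         printf "%d\t%s\n", pct, fn | |
|     } | |
| } | |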
| __EOF__ | |
| Then I ran this script against all recipe files: | |
| $ find moar/ascii -type f | xargs awk -f caps.awk | sort -n >clis | |
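The script above splits each line into characters with FS="", which POSIX leaves unspecified. As a hedged alternative sketch (not the method used for MOAR), the same per-file letter counts can be gathered with gsub(), which returns the number of substitutions it made. The sample file name and contents below are made up for illustration.

```shell
# Hypothetical sample file, not taken from MOAR.
tmpdir=$(mktemp -d)
printf 'BEAT EGGS\nadd milk\n' >"$tmpdir/mixed.txt"

# LC_ALL=C keeps [A-Z] and [a-z] limited to ASCII letters.
report=$(LC_ALL=C awk '
    {
        # Work on a copy so $0 is left alone.
        line = $0
        # gsub() returns how many characters it replaced.
        ucase[FILENAME] += gsub(/[A-Z]/, "", line)
        lcase[FILENAME] += gsub(/[a-z]/, "", line)
    }
    END {
        # Print: uppercase count, lowercase count, file name.
        for (fn in ucase)
            printf "%d\t%d\t%s\n", ucase[fn], lcase[fn], fn
    }
' "$tmpdir/mixed.txt")

printf '%s\n' "$report"
rm -r "$tmpdir"
```

For the sample file this reports 8 uppercase and 7 lowercase letters. The trade-off is that gsub() edits a copy of every line twice, but it avoids depending on how a given awk treats an empty field separator.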
| Using trial and error I found that files with more than 30% uppercase | |
| letters were good candidates to be fixed. This identified | |
| 659 dinosaurs. It took some doing, but now these recipes are fixed | |
| to be more readable on MOAR. | |
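Pulling the candidates out of the report can be sketched as a one-line filter, assuming the report keeps the "percent, tab, file name" layout that caps.awk prints. The report contents and file names below are hypothetical stand-ins, not real entries from clis.

```shell
# Hypothetical report in the same layout as clis: percent<TAB>filename.
report=$(mktemp)
printf '4\tmoar/ascii/a.txt\n31\tmoar/ascii/b.txt\n250\tmoar/ascii/c.txt\n' >"$report"

# Keep only the file names whose score is above the cutoff of 30.
candidates=$(awk -F '\t' '$1 > 30 { print $2 }' "$report")
printf '%s\n' "$candidates"

# Count the candidates; tr strips padding that some wc versions add.
count=$(printf '%s\n' "$candidates" | wc -l | tr -d ' ')
rm "$report"
```

Against the real report, the same filter would be run on clis instead of the temporary file, and the cutoff can be adjusted by changing the number in the awk pattern.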
| gopher://tilde.pink/1/~bencollver/recipes/ | |
| tags: bencollver,retrocomputing,technical | |