2023-02-12 AWK replaces LibreCalc

For the longest time I used LibreCalc to manage a list of the books I
have read. Every time I finished a book (not counting subliterature
and non-fiction) I would append a new entry to the list and then
sort it by the authors' names. No formulas, no fancy coloring, just
plain old cells containing text. For this simple purpose the heavy
LibreOffice suite[1] is a bit of overkill, so I decided to do it in
AWK instead.

I'm really fascinated with AWK. It's an ancient and arcane, albeit
very powerful, language. For example, the following code removes
duplicate lines from its input, keeping the first occurrence of each
line in its original position:

awk '!a[$0]++'

Mind = blown. On Stack Exchange you can find a thorough explanation
of why this works[2]. I always imagine some thick-bearded UNIX
hacker/magician from the 70s coming up with this stuff in a
dusty-gray cellar of some US university.
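
For what it's worth, here is how I read the one-liner when it is
spelled out in long form. This is only a sketch of my understanding,
not a quote from the linked answer; the function name dedup and the
array name seen are made up for illustration.

```shell
# Long form of awk '!a[$0]++'. The names dedup and seen are
# illustrative only; they do not come from the one-liner.
dedup() {
    awk '{
        # seen[$0] is unset (and compares equal to 0) the first time
        # a given line shows up, so only first occurrences are printed
        if (seen[$0] == 0)
            print $0
        # count the occurrence so that later repeats are skipped
        seen[$0]++
    }' "$@"
}

# Usage: reads from stdin or from the files given as arguments
printf 'b\na\nb\nc\na\n' | dedup
```

The short version gets away with so little code because a pattern
without an action prints the line, and the post-increment returns the
old count before bumping it.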

Back to my literature list. First things first, I exported the
LibreCalc file to a CSV file. The header of this file looks like
this:

Autor Titel Erscheinungsjahr Sprache ...

The file uses TAB as the field separator. There are also some extra
fields with annotations and categories, but not every one of these
fields actually contains a value.
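
Since AWK addresses columns by number, it helps to know which number
belongs to which header. A small sketch that lists them; the function
name list_columns is made up, and the sample input is just the four
header columns shown above, not the full file.

```shell
# Print each header name together with its field number.
# list_columns is a hypothetical helper name.
list_columns() {
    awk 'BEGIN { FS = "\t" }
         # only the first line is the header; stop after printing it
         NR == 1 { for (i = 1; i <= NF; i++) print i, $i; exit }' "$@"
}

# Usage with the four known header columns as sample input
printf 'Autor\tTitel\tErscheinungsjahr\tSprache\n' | list_columns
```

With the real file you would call it as list_columns
literaturliste.csv instead of piping in a sample line.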

Next I created an org file from which I run my AWK code and collect
the results. To set up AWK to work with org-mode code blocks[3] you
have to evaluate the following elisp snippet (C-c C-c with the cursor
inside the block):

#+BEGIN_SRC elisp :results none
  (org-babel-do-load-languages
   'org-babel-load-languages
   '((awk . t) (shell . t)))
#+END_SRC

I put this in my config file. To view the author, title and year
fields of all entries, sorted by author, I run the following code:

#+BEGIN_SRC awk :in-file literaturliste.csv
BEGIN { FS="\t"; OFS="\t" }
NR>1 { print $1,$2,$3|"sort -t '\t'" }
#+END_SRC

#+RESULTS:
| Abe, Kōbō | The Woman in the Dunes | 1962 |
| Aitmatow, Tschingis | Der Junge und das Meer | 1977 |
| Aitmatow, Tschingis | Der Weg des Schnitters | 1963 |
| Aitmatow, Tschingis | Djamila | 1958 |
| Apitz, Bruno | Nackt unter Wölfen | 1958 |
| Balzac, Honoré de | Tante Lisbeth | 1846 |
| Balzac, Honoré de | Vater Goriot | 1835 |
| Bradbury, Ray | Fahrenheit 451 | 1953 |
| Brecht, Bertolt | Der gute Mensch von Sezuan | 1943 |
| Brecht, Bertolt | Leben des Galilei | 1939 |
| Brontë, Emily | Wuthering Heights | 1847 |
[...]

As you can see, the output is formatted as an org table by default.
Very convenient.

Sometimes I wonder which books I read recently or how many books I
read last year. Field 7 holds the date on which I finished a book, so
it's easy to check with something like this:

#+BEGIN_SRC awk :in-file literaturliste.csv
BEGIN { FS="\t"; OFS="\t" }
NR > 1 && length($7) { print $1,$2,$7|"sort -r -t '\t' -k3"}
#+END_SRC

#+RESULTS:
| Zola, Émile | Das Werk | 2023-01-23 |
| Kawabata, Yasunari | Tausend Kraniche | 2023-01-05 |
| Dostojewski, Fjodor | Schuld und Sühne | 2022-12-12 |
| Kawabata, Yasunari | Snow Country | 2022-11-27 |
| Steinbeck, John | Jenseits von Eden | 2022-10-31 |
| Zweig, Stefan | Schachnovelle | 2022-07-10 |
| Balzac, Honoré de | Tante Lisbeth | 2022-06-15 |
| Zola, Émile | Nana | 2022-05-11 |
| Balzac, Honoré de | Vater Goriot | 2022-04-22 |
| Michener, James A. | Sayonara | 2022-04-05 |
| Houellebecq, Michel | Vernichten | 2022-04-02 |
| Zola, Émile | Die Sünde des Abbé Mouret | 2022-02-12 |
| McCarthy, Cormac | Blood Meridian | 2022-01-18 |
[...]
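
To get an actual count per year rather than a list to count by hand,
the same date field can be grouped by its first four characters. A
minimal sketch, assuming (as the block above suggests) that field 7
holds the date finished in YYYY-MM-DD form. The function name
books_per_year, the header names f5 to f7 and the sample rows are
made up for illustration.

```shell
# Count finished books per year, newest year first.
# books_per_year is a hypothetical helper name.
books_per_year() {
    awk 'BEGIN { FS = "\t"; OFS = "\t" }
         # skip the header and entries without a date in field 7;
         # the first four characters of the date are the year
         NR > 1 && length($7) { count[substr($7, 1, 4)]++ }
         END { for (year in count) print year, count[year] | "sort -r" }' "$@"
}

# Usage with invented sample rows (the last one has no date)
printf '%b' \
    'Autor\tTitel\tJahr\tSprache\tf5\tf6\tf7\n' \
    'A\tT1\t1900\tDeutsch\t\t\t2022-01-18\n' \
    'B\tT2\t1901\tDeutsch\t\t\t2022-05-11\n' \
    'C\tT3\t1902\tDeutsch\t\t\t2023-01-05\n' \
    'D\tT4\t1903\tDeutsch\t\t\t\n' | books_per_year
```

The AWK program between the single quotes should also work unchanged
inside an org source block with :in-file literaturliste.csv.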

And finally I want to see which languages the books on my list are
most commonly written in.

#+BEGIN_SRC awk :in-file literaturliste.csv
BEGIN { FS="\t"; OFS="\t" }
NR > 1 { count[$4]++ }
END {
    for (lang in count) {
        print lang, count[lang] | "sort -nrt '\t' -k2"
    }
}
#+END_SRC

#+RESULTS:
| Deutsch | 81 |
| Englisch | 40 |
| Russisch | 24 |
| Französisch | 22 |
| Japanisch | 5 |
| Spanisch | 2 |
| Italienisch | 2 |

Of course I have only scratched the surface of what you can do with
this type of setup, but I'll leave it at that. Have I already
mentioned that I really like plain text?

Footnotes
~~~~~~~~~

[1] https://www.libreoffice.org/

[2] https://unix.stackexchange.com/questions/159695/how-does-awk-a0-work

[3] https://orgmode.org/manual/Working-with-Source-Code.html