2023-02-12      AWK replaces LibreCalc

 For the longest  time I used LibreCalc  to manage a list  of books I
 read.  Every time  I finished  a book  (not  counting  subliterature
 and non-fiction)  I would append  a new entry  to the list  and then
 sort it by the authors' names. No formulas, no  fancy coloring, just
 plain old cells  containing text. For this simple  purpose using the
 heavy LibreOffice suite[1]  is a bit of overkill, so I decided to do
 it in AWK instead.

 I'm really  fascinated with AWK.  It's an ancient and  arcane albeit
 very  powerful  language. For example,  the  following code  removes
 duplicate lines from a file while maintaining order:

   awk '!a[$0]++'

 Mind = blown. On Stack Exchange you can find  a thorough explanation
 of  why this  works[2].  I always  imagine  some thick-bearded  UNIX
 hacker/magician  from  the  70s  coming  up with  this  stuff  in  a
 dusty-gray cellar of some US university.
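
 Spelled out in longhand, the one-liner could look like the following
 sketch.  The names dedupe and seen are made up for illustration, and
 the shell wrapper is only there so that it can be run as is:

```shell
# Longhand sketch of: awk '!a[$0]++'
# dedupe and seen are illustrative names, not part of AWK.
dedupe() {
    awk '{
        # seen[$0] starts out empty, which counts as 0
        if (seen[$0]++ == 0)   # true only the first time a line shows up
            print              # same as the default action
    }' "$@"
}

printf 'b\na\nb\nc\na\n' | dedupe
```

 This prints b, a and c, one per line.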

 Back  to my  literature list.   First things  first, I  exported the
 LibreCalc file  to a CSV file.  The header  of this file  looks like
 this:

   Autor       Titel   Erscheinungsjahr        Sprache ...

 using TAB as the field separator.  I also got some extra fields with
 annotations  and  categories  but  not  every  one of  these  fields
 actually contains a value.
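
 The empty  fields are harmless  as long as FS  is a single  TAB: AWK
 then treats every TAB  as a separator, so an empty field still keeps
 its position.  A small sketch  with an invented  row; only  the four
 header names are from my file, the function names are made up:

```shell
# Made-up sample: the four header names shown above plus one
# invented row whose third field (the year) is left empty.
sample=$(printf 'Autor\tTitel\tErscheinungsjahr\tSprache\nDoe, Jane\tSome Title\t\tDeutsch')

# List the header names together with their field numbers.
number_header() {
    awk 'BEGIN { FS = "\t" } NR == 1 { for (i = 1; i <= NF; i++) print i, $i }'
}

# Show that an empty field still occupies its slot.
show_row() {
    awk 'BEGIN { FS = "\t" } NR > 1 { print NF, "[" $3 "]", $4 }'
}

printf '%s\n' "$sample" | number_header
printf '%s\n' "$sample" | show_row
```

 The last command prints "4 [] Deutsch": four fields, the third empty.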

 Next I created  an org file from which  I run my AWK code and output
 the results. To set up AWK  to work with org-mode code blocks[3] you
 have to  evaluate the  following  elisp  snippet  (C-c C-c with  the
 cursor inside the block):

   #+BEGIN_SRC elisp :results none
   (org-babel-do-load-languages
    'org-babel-load-languages
    '((awk . t)(shell . t)))
   #+END_SRC

 I put  this in my  config file. To view  the author, title  and year
 fields  of all  entries, sorted  by the  authors' names,  I run  the
 following code:

   #+BEGIN_SRC awk :in-file literaturliste.csv
   BEGIN { FS="\t"; OFS="\t" }
   NR>1 { print $1,$2,$3|"sort -t '\t'" }
   #+END_SRC

   #+RESULTS:
   | Abe, Kōbō             | The Woman in the Dunes       | 1962 |
   | Aitmatow, Tschingis   | Der Junge und das Meer       | 1977 |
   | Aitmatow, Tschingis   | Der Weg des Schnitters       | 1963 |
   | Aitmatow, Tschingis   | Djamila                      | 1958 |
   | Apitz, Bruno          | Nackt unter Wölfen           | 1958 |
   | Balzac, Honoré de     | Tante Lisbeth                | 1846 |
   | Balzac, Honoré de     | Vater Goriot                 | 1835 |
   | Bradbury, Ray         | Fahrenheit 451               | 1953 |
   | Brecht, Bertolt       | Der gute Mensch von Sezuan   | 1943 |
   | Brecht, Bertolt       | Leben des Galilei            | 1939 |
   | Brontë, Emily         | Wuthering Heights            | 1847 |
   [...]

 As you  can see the output  is formatted as an org table by default.
 Very convenient.
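
 Without a -k  option sort compares  whole lines, so the  -t does not
 matter above.  To order each  author's books by  year instead  of by
 title, explicit keys should do.  A sketch with rows copied  from the
 table above; the function names are made up:

```shell
# Sketch: sort by author (field 1), then numerically by year (field 3).
# sample_rows and by_author_then_year are illustrative names.
sample_rows() {
    printf 'Autor\tTitel\tErscheinungsjahr\n'
    printf 'Apitz, Bruno\tNackt unter Wölfen\t1958\n'
    printf 'Aitmatow, Tschingis\tDer Junge und das Meer\t1977\n'
    printf 'Aitmatow, Tschingis\tDer Weg des Schnitters\t1963\n'
    printf 'Aitmatow, Tschingis\tDjamila\t1958\n'
}

by_author_then_year() {
    # Inside the AWK string, \t becomes a real TAB for sort -t.
    awk 'BEGIN { FS = "\t"; OFS = "\t" }
         NR > 1 { print $1, $2, $3 | "sort -t \"\t\" -k1,1 -k3,3n" }'
}

sample_rows | by_author_then_year
```

 The three Aitmatow titles now come out as 1958, 1963, 1977, followed
 by Apitz.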

 Sometimes I  wonder which books I read  recently or how many books I
 read last year. It's easy to check with something like this:

   #+BEGIN_SRC awk :in-file literaturliste.csv
   BEGIN { FS="\t"; OFS="\t" }
   NR > 1 && length($7) { print $1,$2,$7|"sort -r -t '\t' -k3"}
   #+END_SRC

   #+RESULTS:
   | Zola, Émile         | Das Werk                  | 2023-01-23 |
   | Kawabata, Yasunari  | Tausend Kraniche          | 2023-01-05 |
   | Dostojewski, Fjodor | Schuld und Sühne          | 2022-12-12 |
   | Kawabata, Yasunari  | Snow Country              | 2022-11-27 |
   | Steinbeck, John     | Jenseits von Eden         | 2022-10-31 |
   | Zweig, Stefan       | Schachnovelle             | 2022-07-10 |
   | Balzac, Honoré de   | Tante Lisbeth             | 2022-06-15 |
   | Zola, Émile         | Nana                      | 2022-05-11 |
   | Balzac, Honoré de   | Vater Goriot              | 2022-04-22 |
   | Michener, James A.  | Sayonara                  | 2022-04-05 |
   | Houellebecq, Michel | Vernichten                | 2022-04-02 |
   | Zola, Émile         | Die Sünde des Abbé Mouret | 2022-02-12 |
   | McCarthy, Cormac    | Blood Meridian            | 2022-01-18 |
   [...]
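
 For the second  question, how many books  per year, counting  on the
 first four  characters of the date  should be enough.  A sketch that
 assumes field 7 holds an ISO  date or is empty; the sample  rows and
 function names are invented:

```shell
# Invented sample: a header line, three dated rows, one undated row.
# Only fields 1 and 7 are filled in.
sample_list() {
    printf 'header line, skipped by NR > 1\n'
    printf 'A\t\t\t\t\t\t2023-01-23\n'
    printf 'B\t\t\t\t\t\t2022-12-12\n'
    printf 'C\t\t\t\t\t\t2022-11-27\n'
    printf 'D\t\t\t\t\t\t\n'
}

# Count rows per year, taking the year from the first four
# characters of field 7 and skipping rows without a date.
books_per_year() {
    awk 'BEGIN { FS = "\t"; OFS = "\t" }
         NR > 1 && length($7) { per_year[substr($7, 1, 4)]++ }
         END { for (y in per_year) print y, per_year[y] | "sort -r" }'
}

sample_list | books_per_year
```

 For the sample this prints 2023 with a count of 1, then 2022 with 2.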

 And finally  I want to view  the most common languages  in which the
 books in my list are written:

   #+BEGIN_SRC awk :in-file literaturliste.csv
   BEGIN { FS="\t"; OFS="\t" }
   NR > 1 { count[$4]++ }
   END { for (lang in count) {
                   print lang,count[lang]|"sort -nrt '\t' -k2"
           }
   }
   #+END_SRC

   #+RESULTS:
   | Deutsch     | 81 |
   | Englisch    | 40 |
   | Russisch    | 24 |
   | Französisch | 22 |
   | Japanisch   |  5 |
   | Spanisch    |  2 |
   | Italienisch |  2 |

 Of course I have only scratched the  surface of what you can do with
 this  type of  setup  but I'll  leave  it at  that.  Have I  already
 mentioned that I really like plain text?




Footnotes
~~~~~~~~~

[1] https://www.libreoffice.org/

[2] https://unix.stackexchange.com/questions/159695/how-does-awk-a0-work

[3] https://orgmode.org/manual/Working-with-Source-Code.html