TITLE: Generating a static site using pandoc
DATE: 2018-08-02
AUTHOR: John L. Godlee
====================================================================


I use Jekyll to generate my blog and that probably won't change any
time soon. There's lots of other options for generating a static
site from a folder of files, like Hugo or Pelican, but Jekyll
benefits massively from having a lot of support online and a big
userbase, so it's always easy to fix any problems I'm having.

 [Jekyll]: https://jekyllrb.com/
 [Hugo]: https://gohugo.io/
 [Pelican]: https://blog.getpelican.com/

That being said, I had heard some people on forums saying things
like, "Just use pandoc and a makefile to generate your site, it's
easy!", and I wanted to give something like that a go, so here is
what I came up with. It's important to say at this point that I
have no idea how to use a makefile, I might get round to it another
time, but I am starting to get familiar with shell scripting, if
only in a simplistic way.

So I have a folder full of markdown files in _posts/, each one of
which is a blog post. In the YAML metadata at the top I have the
type of page, the title, and the date:

   ---
   layout: post
   title: "Rebuilding a bike"
   date: 2018-07-25
   ---

I can use pandoc to convert these markdown files to HTML with a css
file to style the HTML. The CSS file can be seen here. To convert a
single markdown file I can do this:

 [here](https://johngodlee.xyz/files/site_gen/site_gen.css)

   pandoc -f markdown -t html5 --standalone --css=site_gen.css -H
site_gen.css -o site/2018-07-25-bike.html _posts/2018-07-25-bike.md

The output markup looks like this:

 ![Sample page with
pandoc](https://johngodlee.xyz/img_full/site_gen/page.png)

To do this on a folder of markdown files I have to name some
variables and put it in a for loop:

   # Define location of assets
   css="site_gen.css"

   head="header.html"

   # Pandoc md posts to html with css and proper filenames
   for i in _posts/*.md; do
       name=$(echo ${i##*/})
       filename=$(echo "$name" | cut -f 1 -d '.')
       pandoc -f markdown -t html5 --standalone --css=$css
--include-before-body=$head -H $css -o
"site/posts/${filename}.html" $i
   done

Note that in this version I've chosen to add the contents of
header.html to the top of the main body of each generated HTML file
using the --include-before-body command. header.html looks like
this:

   <a class='header' href="../index.html">HOME</a>

This adds a link to index.html, the homepage of the site at the top
of each blog post.

I'm also adding the contents of the CSS file to the top of the HTML
file as a <style></style> block, so each page appears with its
styling even if it's taken out of the website and emailed for
example, using the -H flag.

Now I need to generate index.html, which is basically going to be a
list of the blog posts:

   # Generate index page
   ls -1 site/posts | sort -r | cat | while read string; do
       echo -e "<p>\n <h1 class='home'><a
href=posts/$string>$string</a></h1>\n</p>"
   done > site/index.html

   # Prepend css style to index page
   cat $css site/index.html > $$.tmp && mv $$.tmp site/index.html

   # Delete line containing link to index page
   sed -i '' '/index/d' site/index.html

So the whole script (gen.sh) now looks like this:

   #!/bin/bash

   # Generate a very simple website from a collection of markdown
files

   # Start from scratch
   rm -r "site/"

   mkdir "site/"

   mkdir "site/posts/"

   # Define location of assets
   css="site_gen.css"

   head="header.html"

   # Pandoc md posts to html with css and proper filenames
   for i in _posts/*.md; do
       name=$(echo ${i##*/})
       filename=$(echo "$name" | cut -f 1 -d '.')
       pandoc -f markdown -t html5 --standalone --css=$css
--include-before-body=$head -H $css -o
"site/posts/${filename}.html" $i
   done

   # Generate index page
   ls -1 site/posts | sort -r | cat | while read string; do
       echo -e "<p>\n <h1 class='home'><a
href=posts/$string>$string</a></h1>\n</p>"
   done > site/index.html

   # Prepend css style to index page
   cat $css site/index.html > $$.tmp && mv $$.tmp site/index.html

   # Delete line containing link to index page
   sed -i '' '/index/d' site/index.html

and the directory of files looks like this:

   .
   ├── _posts
   |   ├─ 2018-07-25-bike.md
   |   └─ 2018-08-05-site-gen.md
   ├── gen.sh
   ├── header.html
   ├── site
   |   ├─ index.html
   |   └─ posts
   |      ├─ 2018-07-25-bike.html
   |      └─ 2018-08-05-site-gen.html
   └── site_gen.css