-------------------------------------------------
Title: A better day today.
Date: 2022-02-08
Device: Laptop
Mood: Tired, content.
-------------------------------------------------

A much better, more productive day today. I
managed to get an early start, about 5:50am, and
initialised the day properly, with a decent
breakfast and some proper time to shower and
prepare.
I know these are basic things to many people, but
so often I skip them and I feel bad for the rest
of the day.

The day was a normal drift through some meetings
and putting together some strategy documents, but
I did manage to go deep into a Docker bug with a
colleague.

We had an issue where, when starting a new
database container, the healthcheck script which
listens for a successful deployment was reporting
an error. This was somewhat unexpected as we run
50+ database containers on that host, all deployed
in a consistent way, so it wasn't immediately
apparent why they would suddenly start failing
now. While debugging it, we did manage to
prove that the container was starting
successfully, and the healthcheck was at fault,
not the database. In the absence of time to debug
the issue properly, we decided to reboot the box,
and then the problem went away. I suspect that
someone had run a database creation process and
either cancelled it or lost their session, leaving
a container in a half-created state which was
replying to the healthcheck, even though it should
have tidied itself away. It's annoying to have to
restart a machine rather than deeply understand
the problem, but you can't win them all.
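
For next time, a rough sketch of how a stray
container like that might be spotted without a
reboot. It assumes the parsed JSON list that
`docker inspect` prints; the function name
`suspect_containers` and the choice of which states
count as "suspect" are illustrative assumptions,
not anything Docker defines.

```python
import json

# States treated as "suspect" here. This selection is an
# assumption for illustration, not a Docker rule.
SUSPECT_STATES = {"created", "dead", "removing"}
SUSPECT_HEALTH = {"starting", "unhealthy"}


def suspect_containers(inspect_output):
    """Return (name, state, health) tuples for containers worth a look.

    `inspect_output` is the parsed JSON list emitted by
    `docker inspect` for one or more containers.
    """
    found = []
    for entry in inspect_output:
        state = entry.get("State") or {}
        status = state.get("Status", "unknown")
        # "Health" is only present when a healthcheck is defined.
        health = (state.get("Health") or {}).get("Status", "none")
        if status in SUSPECT_STATES or health in SUSPECT_HEALTH:
            name = entry.get("Name", "?").lstrip("/")
            found.append((name, status, health))
    return found


if __name__ == "__main__":
    # Hypothetical sample data in the shape `docker inspect` emits.
    sample = json.loads("""
    [
      {"Name": "/db-ok",
       "State": {"Status": "running", "Health": {"Status": "healthy"}}},
      {"Name": "/db-stuck", "State": {"Status": "created"}}
    ]
    """)
    for name, status, health in suspect_containers(sample):
        print(f"{name}: state={status} health={health}")
```

On a real host the input could come from
`docker inspect $(docker ps -aq)` loaded with
`json.load`. Whether this would actually have
caught our culprit is a guess, since we never
confirmed the cause.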

I guess this does prove that despite being near
the end of a year-long programme of work to remove
huge amounts of proprietary, overcomplicated AWS
nonsense from the business, we can still get hit
with weird bugs even in our well-understood "just
simple old Linux" setup. Ah well.

One nice side-effect was that once we rebooted the
server in question, the rest of the workload on
that server came back flawlessly, within a couple
of minutes, from a cold boot. It was nice to see
all 64 cores of the EPYC processor giving 100% to
bring everything back up in parallel. Sometimes I
forget just how powerful the iron can be.

Optimistic.

--C