* * * * *

             “The job of a programmer is to produce test cases.”

So last night I ran a subset of the regression test in 4½ hours [1] and got a
few errors where something that shouldn't have happened did (and it's this
“checking for an event not happening” that takes the time). Well, it wasn't a
bug in the code being tested, but a bug in the regression test (Surprise!
Surprise! Surprise! Only not really). I think that says more about our
business logic than it does about CZ or me; both of us attempted to validate
this part of the business logic in the regression test, and we both got it
wrong.

* * * * *

And about parallelizing the regression test [2]—yes, it's possible. But doing
so on the spot isn't. The easy solution is to run the regression test on
multiple machines (nice if you have them). The other option is to parallelize
the run on a single machine, but the code just isn't set up to do that. I'm
not saying it's impossible, but it will take engineering effort, and more
importantly, testing! Funny how testing your test cases isn't talked about
that much.
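
For the multiple-machine route, a minimal sketch of how the cases might be
split up, assuming the test cases are independent of each other and can be
listed up front. The `shard` function and the `case-N` names are made up for
illustration; nothing here reflects how the real regression test is organized.

```python
def shard(cases, machine_index, machine_count):
    """Return the slice of cases that one machine should run (hypothetical helper)."""
    if not 0 <= machine_index < machine_count:
        raise ValueError("machine_index must be in range(machine_count)")
    # Striding through the list keeps the shards roughly the same size.
    return cases[machine_index::machine_count]

cases = [f"case-{n}" for n in range(10)]          # stand-in test names
shards = [shard(cases, i, 3) for i in range(3)]   # one shard per machine
```

Every case lands in exactly one shard, so the machines never duplicate work.
It only helps if the tests don't share state, which is exactly the kind of
thing that would need testing first.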

* * * * *

The slowdown of the regression test is due to “proving a negative”: that is,
checking that something that's not supposed to happen did not happen. And in a
distributed system like ours, that's not easy to test. A check could run
before the event has had a chance to arrive, for any number of reasons, and
how long do you wait to ensure that what shouldn't happen didn't happen?
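
To make the cost concrete, here is a rough sketch of what a negative check
looks like, assuming events arrive on an in-process queue. The
`unexpected_event` function, the `quiet_window` parameter and the
`"billing-record"` event are all hypothetical; the real test harness is not
shown here.

```python
import queue
import time

def unexpected_event(events, quiet_window):
    """Wait up to quiet_window seconds; return the event if one shows up, else None."""
    try:
        return events.get(timeout=quiet_window)
    except queue.Empty:
        # Nothing arrived, but only within the window we were willing to wait.
        return None

events = queue.Queue()

# Passing case: nothing happens, and we pay for the entire window to find out.
start = time.monotonic()
quiet_result = unexpected_event(events, 0.05)
quiet_elapsed = time.monotonic() - start

# Failing case: the event is already there, so the check returns right away.
events.put("billing-record")
noisy_result = unexpected_event(events, 0.05)
```

A passing negative check always costs the full window, while a failing one can
return early. Multiply that window by thousands of test cases and the hours
add up. A shorter window speeds things up, at the risk of missing an event
that arrives late.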

The other reason it takes so long to run is just the sheer number of tests
that are run. My “retiring any day now” manager has never been happy with
with the “shotgun” approach I took to generating the tests—I basically
generate thousands of combinations of conditions, most of which “should”
never appear in production. But one of those “should never happen” things did
happen about seven years ago and well, the less said about that the better.
So at least my “shotgun” approach does have the effect of testing for a lot
of “I don't know” conditions (most of which are misconfigurations of data
from provisioning). And each test we add could potentially double the number
of test cases. I'm sure there's a way to reduce the number of test cases,
but to the TDD (Test Driven Development) acolytes out there (and the new
management team does appear to follow TDD tenets), “one does not simply
reduce the number of test cases.”
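
The doubling is easy to see in a small sketch, assuming each condition is a
simple yes/no flag. The condition names below are invented for illustration
and are not the real provisioning fields.

```python
from itertools import product

def generate_cases(conditions):
    """Build one test case per combination of condition values."""
    names = list(conditions)
    value_lists = [conditions[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]

# Hypothetical yes/no conditions.
conditions = {
    "account_active": [True, False],
    "number_ported": [True, False],
    "feature_provisioned": [True, False],
}
before = generate_cases(conditions)   # 2 * 2 * 2 = 8 cases

conditions["roaming"] = [True, False]
after = generate_cases(conditions)    # one more flag: 16 cases
```

Three flags give 8 cases and a fourth gives 16, most of them combinations that
“should” never show up in production.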

Sigh.

And the regression test rolls on …

[1] gopher://gopher.conman.org/0Phlog:2021/06/08.1
[2] gopher://gopher.conman.org/0Phlog:2021/06/09.1

Email author at [email protected]