* * * * *

                                Unit test this

Or, “What is a ‘unit test,’ part II [1]”

I saw a decent answer to my question [2] which makes sense for C. Another
decent (if a bit vague) answer was:

> So to answer Sean's question, a unit test is that which requires the least
> amount of work to setup but is able to reduce the need for coverage of the
> larger system. Whatever you want to consider a "unit" is up to you and the
> language you're using.
>

“Re: What is a "unittest"? [3]”

I left off my previous entry pointing to a function that I would love to have
seen someone else “unit test,” but alas, no one did. But I always had plans
on going all “The Martian [4]” on the code [5] and “unit test the XXXX out of
it.”

So here's the code in question:

-----[ C ]-----
/***********************************************
*
* Copyright 2021 by Sean Conner.  All Rights Reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version 2
* of the License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
*
* Comments, questions and criticisms can be sent to: [email protected]
*
************************************************/

#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <assert.h>

#include <sys/types.h>
#include <sys/wait.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <syslog.h>
#include <sysexits.h>

/************************************************************************/

bool run_hook(char const *tag,char const *argv[])
{
 assert(tag     != NULL);
 assert(argv    != NULL);
 assert(argv[0] != NULL);

 pid_t child = fork();

 if (child == -1)
 {
   syslog(LOG_ERR,"%s='%s' fork()='%s'",tag,argv[0],strerror(errno));
   return false;
 }
 else if (child == 0)
 {
   extern char **environ;
   int devnull = open("/dev/null",O_RDWR);
   if (devnull == -1)
     _Exit(EX_UNAVAILABLE);
   if (dup2(devnull,STDIN_FILENO) == -1)
     _Exit(EX_OSERR);
   if (dup2(devnull,STDOUT_FILENO) == -1)
     _Exit(EX_OSERR);
   if (dup2(devnull,STDERR_FILENO) == -1)
     _Exit(EX_OSERR);
   for (int fh = STDERR_FILENO + 1 ; fh <= devnull ; fh++)
     if (close(fh) == -1)
       _Exit(EX_OSERR);
   execve((char *)argv[0],(char **)argv,environ);
   _Exit(EX_UNAVAILABLE);
 }
 else
 {
   int status;

   if (waitpid(child,&status,0) != child)
   {
     syslog(LOG_ERR,"%s='%s' waitpid()='%s'",tag,argv[0],strerror(errno));
     return false;
   }

   if (WIFEXITED(status))
   {
     if (WEXITSTATUS(status) != 0)
     {
       syslog(LOG_ERR,"%s='%s' status=%d",tag,argv[0],WEXITSTATUS(status));
       return false;
     }
   }
   else
   {
     syslog(LOG_ERR,"%s='%s' terminated='%s'",tag,argv[0],strsignal(WTERMSIG(status)));
     return false;
   }
 }

 return true;
}

/************************************************************************/
-----[ END OF LINE ]-----

As you can see, it's one function, in one file, with the only dependencies
being the operating system. So this should be the “perfect unit” to write
some “unit tests” for. The code does replicate a bit of the standard C
function system(), so why not use system() in the first place? The answer
comes from the manual page for Linux:

> Do not use system() from a privileged program (a set-user-ID or set-group-
> ID program, or a program with capabilities) because strange values for some
> environment variables might be used to subvert system integrity. For
> example, PATH could be manipulated so that an arbitrary program is executed
> with privilege. Use the exec(3) [6] family of functions instead, but not
> execlp(3) [7] or execvp(3) [8] (which also use the PATH environment
> variable to search for an executable).
>

“system(3) - Linux manual page [9]”

This function runs as part of a set-user-ID program (mod_blog [10] in
particular, for reasons beyond the scope of this entry) so no system() for
me. Also, I avoid having to construct a command string that might have failed
to properly escape the filename to avoid complications with the shell's use
of certain characters. And it's not like the function was hard for me to
write. I've done functions like this before, and it worked the first time
without issue when I wrote it (and the small changes to it since have been a
simplification of the parameters, and changes to the logging messages). It's
also not a very long function (I'm sorry Ron Jefferies, but 14 lines of code
[11] isn't “a lot of code”).

The reason I wanted some unit test proponents to look at this code is that it
involves quite a bit of interaction with the operating system in C, a not-
very-popular programming language these days, and I was curious as to the
level of “unit testing“ that would be done. No bites, but my gut feeling is
that a “unit test proponent” faced with this code would just throw two
scripts to it, one to return successfully:

-----[ C ]-----
int main(void)
{
 return 0;
}
-----[ END OF LINE ]-----

and one to return failure:

-----[ C ]-----
int main(void)
{
 return 1;
}
-----[ END OF LINE ]-----

and call it “battle tested.” The two test cases themselves are pretty easy to
write:

-----[ C ]-----
#include <stdbool.h>
#include <stdlib.h>
#include <stdio.h>

#include "tap14.h"

extern bool run_hook  (char const *,char const **);

int main(void)
{
 tap_plan(2,NULL);
 tap_assert( run_hook("script",(char const *[]){ "./success" , NULL }),"success script");
 tap_assert(!run_hook("script",(char const *[]){ "./failure" , NULL }),"failure script");
 return tap_done();
}
-----[ END OF LINE ]-----

(I'm using my own testing framework based on TAP (Test Anything Protocol)
[12]. I wrote my own to be as minimal as possible to get the job done—other
TAP (Test Anything Protocol) frameworks for C I looked at were too overblown
for my tastes.)

An improvement would be to add a test script that terminates via a signal.
It's again easy enough to write that script:

-----[ C ]-----
#include <signal.h>

int main(void)
{
 raise(SIGINT);
 return 1;
}
-----[ END OF LINE ]-----

and the appropriate test:

-----[ C ]-----
tap_assert(!run_hook("script",(char const *[]){ "./terminate" , NULL }),"terminate script");
-----[ END OF LINE ]-----

But that only tests half the code. How do I know I didn't mess up the
codepath in the child process before I execute the script in question? At The
Enterprise, it was expected our tests cover about 70% of the code at least—
I'm short of that target here. And as I say, I'm aiming to “unit test the
XXXX out of this” and get 100% code coverage, because shouldn't that be the
goal of “unit testing?”

But to achieve that target, I'm going to have to deal with “failing” a bunch
of existing functions, and according to my interprestation of “A Set of Unit
Testing Rules [13],” if I'm not mocking, I don't have a “unit test.” So I
have to mock some system calls.

And here is where I hit a problem—to do so will invoke the dreaded “undefined
behavior of C.” Seriously–if I provide my own function for, say, dup2(), I am
technically invoking undefined behavior of the C machine (this incredibly
long flame war on the Lua mailing list [14] of all places, goes into the gory
details behind that statement). Now granted, certain tool chains on certain
operating systems allow one to override functions, but you can't rely upon
this behavior in general. Given that I'm doing all of this on Linux, and
Linux in general allows this, I can proceed (carefully) with mocking system
functions.

That should be straightforward enough. The mocked open() function:

-----[ C ]-----
static int err_open;
static int ret_open;

int open(char const *pathname,int flags)
{
 (void)pathname;
 (void)flags;
 if (err_open != 0)
   errno = err_open;
 return ret_open; // XXX had bug here
}
-----[ END OF LINE ]-----

This should be fine for my purposes as I don't actually need to read from the
file. If I really needed to call into the original function, this might work:

-----[ C ]-----
static int err_open;

int myopen(char const *pathname,int flags)
{
 if (err_open == 0)
   return open(pathname,flags,0);
 errno = err_open;
 return -1;
}

#define open    myopen
-----[ END OF LINE ]-----

But as the “A Set of Unit Testing Rules” article states, “A test is not a
unit test if: it touches the file system.” So the above isn't a “true mock,”
and I shall continue with my “true mocked” function instead. I can continue
with similar implementations for the functions dup2(), close() and waitpid().
Unfortunately, there are three functions that may present some interesting
challenges: fork(), execve(), and _Exit(). The first returns twice (kind of—
if you squint and look sideways), the second only returns if there's an
error, and the third never returns.

Now looking over the implementation of the function I'm testing, and thinking
about things, I could do a similar implementation for fork()—the returning
twice thing is where it returns once in the parent process, and once in the
child process, but I can treat that as just a normal return, at least for
purposes of testing. For execve(), I can only test the error path here as the
script being “run” isn't being run. That just leaves _Exit() as the final
function to mock. And for that one, I wrap the entire call to run_hook() (the
function being “unit tested”) around setjmp() and longjmp() to simulate the
not-returning aspect of _Exit(). So a test of the close() codepath would look
like:

-----[ C ]-----
static bool X(char const *tag,char const *argv[])
{
 volatile int rc = setjmp(buf_exit);
 if (rc != 0)
   return false;
 return run_hook(tag,argv);
}

int main(void)
{
 /* ... */

 ret_open  = 4;
 err_dup2  = 0;
 ret_dup2  = 0;
 bad_dup2  = -1;
 err_close = EIO;
 ret_close = -1;
 tap_assert(!X("script",(char const *[]){ "./success" , NULL }),"close() fail");

 /* ... */
 return tap_done();
}
-----[ END OF LINE ]-----

I got all the test cases written up and all 11 tests pass:

-----[ data ]-----
TAP version 14
1..11
ok 1 - success script
ok 2 - failure script
ok 3 - terminate script
ok 4 - fork() fail
ok 5 - open() fail
ok 6 - dup2(stdin) fail
ok 7 - dup2(stdout) fail
ok 8 - dup2(stderr) fail
ok 9 - close() fail
ok 10 - execve() fail
ok 11 - waitpid() fail
-----[ END OF LINE ]-----

A successful “unit test” with 100% code coverage. But I'm not happy with
this. First off, I don't get the actual logging information for each test
case. All I get is:

-----[ data ]-----
Dec 21 19:34:10 user    err     /dev/log        test_run_hook   script='./failure' status=1
Dec 21 19:34:10 user    err     /dev/log        test_run_hook   script='./terminate' terminated='Interrupt'
Dec 21 19:34:10 user    err     /dev/log        test_run_hook   script='./success' fork()='Cannot allocate memory'
Dec 21 19:34:10 user    err     /dev/log        test_run_hook   script='./success' waitpid()='No child processes'
-----[ END OF LINE ]-----

and not

-----[ data ]-----
Dec 21 19:04:10 user    err     /dev/log        test_run_hook   script='./failure' status=1
Dec 21 19:04:10 user    err     /dev/log        test_run_hook   script='./terminate' terminated='Interrupt'
Dec 21 19:04:10 user    err     /dev/log        test_run_hook   script='./success' fork()='Cannot allocate memory'
Dec 21 19:04:10 user    err     /dev/log        test_run_hook   script='./success' status=69
Dec 21 19:04:10 user    err     /dev/log        test_run_hook   script='./success' status=71
Dec 21 19:04:10 user    err     /dev/log        test_run_hook   script='./success' status=71
Dec 21 19:04:10 user    err     /dev/log        test_run_hook   script='./success' status=71
Dec 21 19:04:10 user    err     /dev/log        test_run_hook   script='./success' status=71
Dec 21 19:04:10 user    err     /dev/log        test_run_hook   script='./success' status=69
Dec 21 19:04:10 user    err     /dev/log        test_run_hook   script='./success' waitpid()='No child processes'
-----[ END OF LINE ]-----

(And **no!** I am not checking that syslog() got the right message in the
test cases—been there, done that and all I got was a stupid tee-shirt and
emotional scars. It's easy enough to just manually check after the test runs,
at least for this entry.)

It just doesn't feel right to me that I'm testing in a faked environment. No,
to get a better “unit test” I'm afraid I'm going to have to keep invoking
undefined C behavior that is allowed by Linux, and interpose our functions by
using LD_PRELOAD to override the functions. And I can set things up so that I
can still call the original function when I want it to succeed. So all that
needs to be done is write a shared object file with my versions of the
functions, and include this function:

-----[ C ]-----
static pid_t (*___fork)   (void);
static int   (*___open)   (char const *,int,mode_t);
static int   (*___dup2)   (int,int);
static int   (*___close)  (int);
static int   (*___execve) (char const *,char *const [],char *const []);
static pid_t (*___waitpid)(pid_t,int *,int);

__attribute__((constructor))
void init(void)
{
 ___fork    = dlsym(RTLD_NEXT,"fork");
 ___open    = dlsym(RTLD_NEXT,"open");
 ___dup2    = dlsym(RTLD_NEXT,"dup2");
 ___close   = dlsym(RTLD_NEXT,"close");
 ___execve  = dlsym(RTLD_NEXT,"execve");
 ___waitpid = dlsym(RTLD_NEXT,"waitpid");
}
-----[ END OF LINE ]-----

(I include all three parameters to open() even though the last one is
optional—I don't want to have to deal with the variable argument machinery
with C—this should work “just fine on my machine”—I'm already into territory
that C formally forbids. I'm using triple leading underscores because single
and double leading underscores are reserved to the C compiler and
implementation, but nothing is mentioned about three leading underscores.)

Now, how to get information to my replacement functions about when to fail. I
thought about it, and while there is a way to do it with global variables, it
gets complicated and I'd rather do this as simply as possible. I figured I
could sneak variables through to my replacement functions via putenv(),
getenv() and unsetenv(). This will make the close() failed test case look
like:

-----[ C ]-----
 putenv((char *)"SPC_CLOSE_FAIL=5"); /* EIO */
 tap_assert(!run_hook("script",(char const *[]){ "./success" , NULL }),"close() fail");
 unsetenv("SPC_CLOSE_FAIL");
-----[ END OF LINE ]-----

And the corresponding close() function is:

-----[ C ]-----
int close(int fd)
{
 char *fail = getenv("SPC_CLOSE_FAIL");
 if (fail == NULL)
   return (*___close)(fd);
 errno = (int)strtoul(fail,NULL,10);
 return -1;
}
-----[ END OF LINE ]-----

The other functions work simularly, and when run:

-----[ data ]-----
TAP version 14
1..11
ok 1 - success script
ok 2 - failure script
ok 3 - terminate script
ok 4 - fork() fail
ok 5 - open() fail
ok 6 - dup2(stdin) fail
ok 7 - dup2(stdout) fail
ok 8 - dup2(stderr) fail
ok 9 - close() fail
ok 10 - execve() fail
ok 11 - waitpid() fail
-----[ END OF LINE ]-----

More importantly, since the functions can actually function as intended when
I don't want them to fail, I get the full output I expect in the system logs.
But per the “A Set of Unit Testing Rules” article, this is no longer a
“proper unit test.”

I don't know. The more I try to understand “unit testing,” the less sense it
makes to me. There is no real consensus as to what a “unit” is, and it seems
strange (or in my more honest opinion, stupid) that we as programmers are not
trusted to write code without tests, yet we're trusted to write a ton of code
untested as long as such code is testing code. As I kept trying to impart to
my former manager at The Enterprise before I left, the test case results
aren't to be trusted as gospel (and it always was by him) because I didn't
fully understand what the code was supposed to do (because the business logic
in “Project Lumbergh [15]” has become a literal mess of contradictory logic
and communication among the team seriously broke down).

So maybe we're not supposed to “unit test” functions that involve input,
output, or system interactions. Maybe we're supposed to “unit test” more
“pure functions” and leave messy real world details to, oh, I don't know,
some other form of testing. Okay, I have one final function [16] that should
be perfect for “unit testing.”

We shall see …

[1] gopher://gopher.conman.org/0Phlog:2022/10/08.1
[2] gopher://gopher.conman.org/0Phlog:2022/10/10.2
[3] gemini://gemini.sh0.xyz/log/2022-10-12.gmi
[4] https://www.imdb.com/title/tt3659388/
[5] https://github.com/spc476/mod_blog/blob/049e9dcf0985d4e089ac3593f39c987f077f0bcb/src/hooks.c#L39
[6] https://man7.org/linux/man-
[7] https://man7.org/linux/man-pages/man3/execlp.3.html
[8] https://man7.org/linux/man-pages/man3/execvp.3.html
[9] https://man7.org/linux/man-pages/man3/system.3.html#NOTES
[10] https://github.com/spc476/mod_blog
[11] https://ronjeffries.com/xprog/articles/oksudoku/
[12] https://testanything.org/
[13] https://www.artima.com/weblogs/viewpost.jsp?thread=126923
[14] http://lua-users.org/lists/lua-l/2018
[15] gopher://gopher.conman.org/0Phlog:2018/09/11.2
[16] https://github.com/spc476/mod_blog/blob/67e97a769ea793f53a7be27f4b7568c5e102e758/src/wbtum.c#L147
---

Discussions about this page

Unit test this | Lobsters
 https://lobste.rs/s/uf3xxu

gemini://thrig.me/blog/2022/12/22/unit-tests.gmi
 gemini://thrig.me/blog/2022/12/22/unit-tests.gmi

Email author at [email protected]