* * * * *

 Really, POSIX? Really? memset() isn't async-signal-safe? How is it not safe?
                              … oh … that's why

Just because, I found myself rewriting some code that dumps memory [1]. The
current version is too limiting because it only dumps to a file, and there
have been several times I've wanted to dump memory to something other than a
file, like to syslog() (for instance, my crashreport() function [2] which
dumps some memory as part of its report).

So I got the new code written, and the core of it is “async-signal-safe.”
This is important because functions that are not async-signal-safe cannot be
called from a signal handler (the cause of my hardest-to-find bug [3] yet). I
got to the point where I needed to add some padding to the output and the
easiest way to do that is to call memset().

Now a curious thing about memset(): if you check the list of
async-signal-safe functions [4] (not a complete list, but the only one I
found with a direct link that required no scrolling), you will not find
memset() among them. Which is odd, because the function itself does very
little, little more than:

> void *memset(void *s,int c,size_t n)
> {
>   unsigned char *m = s;
>
>   while(n--)
>     *m++ = c;
>
>   return s;
> }
>

This isn't like malloc(), which could be interrupted as it's working and
leave critical data structures in an indeterminate state such that a
subsequent call to malloc() from within the signal handler could blow up. No,
it's self-contained, and a call to memset() won't interfere with an already
interrupted call to memset(). It's curious that memset() isn't considered
async-signal-safe (along with memcpy(), memmove() and strcpy()). It just
doesn't make sense.

Until it does [5].

Here's the upshot: since memset() is a standard C function, a C compiler is
free to do anything it wants as long as the end result is the same—the memory
is set to a given value. So, assuming a standard Intel CPU (Central
Processing Unit), it can compile this:

> int array[2];
> memset(array,0,sizeof(array));
>

into the following assembly language:

>       xor     eax,eax         ; set EAX to 0
>       mov     [array],eax     ; zero out array[0]
>       mov     [array+4],eax   ; zero out array[1]
>

or (and this is getting close to the issue at hand):

> int array[1000];
> memset(array,0,sizeof(array));
>

can be compiled into

>       mov     edi,array       ; point to array
>       mov     ecx,1000        ; there are 1000 entries
>       xor     eax,eax         ; each being set to 0
>       rep     stosd           ; now do it
>

The problem is memmove(), which can handle copying memory between overlapping
regions. Typically, you just copy memory from low memory to high, but when
the regions overlap and the destination sits above the source, you can't do
that, because you would overwrite source data before it gets copied. Instead,
you have to copy from high memory to low. Again, the Intel CPU can deal. So,
code like:

> int array[1000];
> memmove(&array[100],&array[0],sizeof(int) * 900);
>

could turn into:

>       mov     esi,array + 899 * 4     ; point to last source element
>       mov     edi,array + 999 * 4     ; point to last destination element
>       mov     ecx,900                 ; this many integers
>       std                             ; !!! make sure we copy from high to low
>                                       ; HERE BE DRAGONS!
>       rep     movsd                   ; copy data
>       cld                             ; clear direction flag
>
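
The choice of direction can be written out in C as a rough sketch. The names
sketch_memmove() and sketch_demo() are made up for this example, and this is
not how any particular C library implements memmove(); real implementations
tend to copy in larger units or use string instructions like the ones shown
above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for memmove(), for illustration only. */
void *sketch_memmove(void *dest, const void *src, size_t n)
{
  unsigned char       *d = dest;
  const unsigned char *s = src;

  if ((uintptr_t)d <= (uintptr_t)s || (uintptr_t)d >= (uintptr_t)s + n)
  {
    /* dest is below src, or past its end: copy from low to high */
    while (n--)
      *d++ = *s++;
  }
  else
  {
    /* dest starts inside src: copy from high to low so that source */
    /* bytes are read before they are overwritten                   */
    d += n;
    s += n;
    while (n--)
      *--d = *--s;
  }

  return dest;
}

/* Same call as in the text; returns 1 if the copy came out right. */
int sketch_demo(void)
{
  int array[1000];

  for (int i = 0; i < 1000; i++)
    array[i] = i;

  sketch_memmove(&array[100], &array[0], sizeof(int) * 900);
  return array[99] == 99 && array[100] == 0 && array[999] == 899;
}
```

The second branch is the one that corresponds to the STD case: it is only
needed when the destination starts inside the source region.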

And the issue shows itself. There's a flag in the Intel CPU that tells it
which way to copy memory. If the flag is not set, then any memory copy (or
memory setting) goes from low to high (the index registers ESI and EDI are
incremented); otherwise if the flag is set, then any memory copy (or memory
setting) goes from high to low (the index registers are decremented).
The flag is normally clear except for the few cases where it's required to be
set. If a signal is delivered to the program after the STD instruction is
executed but before the CLD instruction, the kernel will interrupt the
program and call a signal handler. And if the signal handler calls memset(),
that code is probably expecting the direction flag to be clear, so it will
now run in the reverse direction, writing to memory below the buffer instead
of filling it.
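
To make the effect concrete, here is a toy model of a byte-wide REP STOS in
plain C. It is only a simulation: the df parameter stands in for the CPU
direction flag, mem and edi stand in for memory and the EDI register, and the
function names are invented for this sketch.

```c
#include <stddef.h>

/* Toy model of "rep stosb": store al at mem[edi], ecx times, stepping */
/* edi up when df is 0 and down when df is 1. Returns the final edi.   */
size_t model_rep_stosb(unsigned char *mem, size_t edi, unsigned char al,
                       size_t ecx, int df)
{
  while (ecx--)
  {
    mem[edi] = al;
    edi = df ? edi - 1 : edi + 1;
  }

  return edi;
}

/* The caller means to fill mem[8..11], but the "flag" was left set. */
/* Returns 1 if the fill landed on mem[5..8] instead.                */
int model_demo(void)
{
  unsigned char mem[16] = { 0 };

  model_rep_stosb(mem, 8, 0xFF, 4, 1);
  return mem[5] == 0xFF && mem[8] == 0xFF
      && mem[9] == 0    && mem[11] == 0;
}
```

With df set, only the first byte of the intended range gets written; the
other three stores land below it, on memory the caller never meant to touch.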

The real issue here is that the program state (including the direction flag)
is saved upon entering the kernel. When the kernel transitions back to
usercode, that state is then restored. The signal handler isn't technically a
thread (although it executes asynchronously, like one) so it doesn't have
its own state (although in POSIX, a signal handler can have its own stack,
but that's entirely optional and I'm digressing). You could argue that a
program (technically a process or thread) could have two states—a normal
state and a signal handler state, but the problem there is that signal
handlers can be interrupted by yet another signal handler and the issue rears
its ugly head yet again.

I couldn't find any current information about this problem (and where is the
bug [6]? Is it in the compiler? The operating system? The standards bodies?)
and so alas, even a simple function like memset() might not be async-signal-
safe.
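
If all a handler needs is some padding, one way to sidestep the question is
to skip the library call and store the bytes one at a time. A minimal sketch,
assuming hypothetical helpers named safe_fill() and safe_fill_demo() (neither
is part of CGILib or any standard). The volatile qualifier is there on the
assumption that it keeps the compiler from turning the loop into a call to
memset() or into a string instruction; that matches typical compiler
behavior, but nothing in POSIX promises it.

```c
#include <stddef.h>

/* Hypothetical helper: fill n bytes at s with c, one plain store at a */
/* time. Assumption: volatile stores are emitted individually and do   */
/* not depend on the direction flag.                                   */
void *safe_fill(void *s, int c, size_t n)
{
  volatile unsigned char *m = s;

  while (n--)
    *m++ = (unsigned char)c;

  return s;
}

/* Pad a small buffer with spaces; returns 1 if both ends were filled. */
int safe_fill_demo(void)
{
  char pad[8];

  safe_fill(pad, ' ', sizeof(pad));
  return pad[0] == ' ' && pad[7] == ' ';
}
```

This trades speed for predictability, which is usually an acceptable trade
inside a signal handler that is only padding a line of output.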

Sigh.

Update on January 3rd, 2017

* Hacker News discussion [7]
* Lobsters discussion [8]


[1] https://github.com/spc476/CGILib/blob/55c1c39b24ce57c840a33e4befce0a69c0d27695/src/util.c#L145
[2] https://github.com/spc476/CGILib/blob/55c1c39b24ce57c840a33e4befce0a69c0d27695/src/crashreport.c
[3] gopher://gopher.conman.org/0Phlog:2007/10/18.1
[4] https://docs.oracle.com/cd/E19455-01/806-5257/6je9h033a/index.html#gen-95948
[5] https://lkml.org/lkml/2008/3/5/518
[6] http://austin-group-l.opengroup.narkive.com/jBp07fPN/adding-simple-string-functions-to-async-signal-safe-list
[7] https://news.ycombinator.com/item?id=13313563
[8] https://lobste.rs/s/auup83/really_posix_really_memset_isnt_async
