* * * * *

                               Scaling daemons

I'm still deep in programming.

So now I'm writing a daemon.

The first problem was getting MySQL [1] to contact the daemon, and I forgot:
I'm working under Unix, and interprocess communications suck under Unix. You
have pipes, but they only work between processes that have a common ancestor.
You can get around that problem by using named pipes, which don't need the
common ancestor, but there's still a limit to the amount of data that can be
in the pipe, and if one side isn't listening (say, the daemon) then the other
side is blocked. No good.
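
To make the named pipe complaint concrete, here's a minimal sketch (not code from the daemon). It assumes a POSIX system with a writable /tmp; the FIFO path and the names open_fifo_writer() and fifo_demo() are made up for illustration. Opening the write side with O_NONBLOCK avoids the hang, but then the open simply fails with ENXIO until a reader shows up, so you trade blocking for retrying.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to attach to a FIFO as a writer without blocking.  Returns the
 * descriptor, or -1 with errno set to ENXIO when no process has the
 * FIFO open for reading. */
int open_fifo_writer(const char *path)
{
  return open(path, O_WRONLY | O_NONBLOCK);
}

/* Create a throwaway FIFO (hypothetical path), show that a writer cannot
 * attach while nobody is reading, then attach a reader and try again.
 * Returns 0 if both steps behaved as described, -1 otherwise. */
int fifo_demo(void)
{
  char path[64];
  int  reader;
  int  writer;
  int  ok = -1;

  snprintf(path, sizeof(path), "/tmp/fifo-demo-%ld", (long)getpid());
  unlink(path);
  if (mkfifo(path, 0600) < 0)
    return -1;

  writer = open_fifo_writer(path);
  if ((writer < 0) && (errno == ENXIO))
  {
    /* a non-blocking open for reading succeeds even with no writer */
    reader = open(path, O_RDONLY | O_NONBLOCK);
    if (reader >= 0)
    {
      writer = open_fifo_writer(path);  /* now there is a reader */
      if (writer >= 0)
      {
        ok = 0;
        close(writer);
      }
      close(reader);
    }
  }
  else if (writer >= 0)
    close(writer);

  unlink(path);
  return ok;
}
```

Calling fifo_demo() from a small main() should return 0 on a system that follows the POSIX rules for FIFOs.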

Oh, I could try using message queues. But they too have problems: no automatic
reclamation of system resources when one side (or both!) crashes. They're
identified by numeric keys, not names, and leftover queues have to be hunted
down and deleted by hand with ipcs and ipcrm! And they can't be used with the
multiplexing I/O (Input/Output) API (Application Programming Interface)
(select() or poll(), which I'll probably be using if I'm dealing with tons of
connections).

The same problems exist for shared memory by the way, plus a whole slew of
synchronization problems between unrelated processes, which probably mandates
the use of semaphores, which, again, have the same problems as message queues
and shared memory.

Told you interprocess communication under Unix sucks.

Leaving sockets. Since they use regular file descriptors, they work with the
multiplexed I/O API, but I hate using select(), since you end up scanning
through arrays. The code that uses select() typically looks like:

> while(1)
> {
>   FD_ZERO(&list);
>
>   for (i = 0 ; i < files_count ; i++)
>     FD_SET(files[i],&list);
>
>   rc = select(FD_SETSIZE,&list,NULL,NULL,NULL);
>
>   if (rc < 0) /* select() returned an error */
>   {
>     handle_error(errno);
>     continue;
>   }
>   else if (rc > 0)    /* we got some */
>   {
>     for (i = 0 ; i < files_count ; i++)
>     {
>       if (FD_ISSET(files[i],&list))
>       {
>         if (files[i] == listen_socket)
>         {
>           len = sizeof(remote_addr);
>           connection = accept(listen_socket,(struct sockaddr *)&remote_addr,&len);
>
>           /*----------------------------------
>           ; oh great, we need to add this to the end
>           ; of the files array, but that readjusts the
>           ; files_count variable ... buyer beware ...
>           ;-------------------------------------*/
>
>           add_to_list(files,connection);
>         }
>
>         /*-----------------------------------
>         ; oh bloody hell, we're listening to
>         ; MySQL as well ... sigh.
>         ;----------------------------------*/
>
>         else if (files[i] == mysql_connection)
>         {
>           handle_that_mess(mysql_connection);
>         }
>
>         /*---------------------------------------
>         ; otherwise it's a connection from outside
>         ;--------------------------------------*/
>
>         else
>         {
>           /*-----------------------------------
>           ; oh man, we need to find the data
>           ; associated with this connection, so
>           ; that means another scan of some other
>           ; list ... Aiiiiieeeeeeeeeeeeeeee!
>

I've been down this route before [2], and it resulted in some of the most
convoluted code I've ever written. And looking at poll(), it doesn't appear
much better.
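
For comparison, here's a rough sketch of the same idea with poll(), using two pipes as stand-ins for sockets. The names first_ready() and poll_demo() are made up for this example. You no longer rebuild an fd_set on every pass, but after poll() returns you still walk the whole array looking at revents, which is the scanning I was complaining about.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Wait for input on any entry, then scan the array for the first one
 * that is readable.  Returns its index, or -1 on timeout or error. */
int first_ready(struct pollfd *fds, nfds_t count, int timeout_ms)
{
  nfds_t i;
  int    rc = poll(fds, count, timeout_ms);

  if (rc <= 0)
    return -1;

  for (i = 0 ; i < count ; i++)   /* the linear scan poll() still needs */
    if (fds[i].revents & POLLIN)
      return (int)i;

  return -1;
}

/* Watch two pipes, write to the second, and report which index poll()
 * flagged.  Returns that index, or -1 on failure. */
int poll_demo(void)
{
  struct pollfd fds[2];
  int           a[2];
  int           b[2];
  int           idx = -1;

  if (pipe(a) < 0)
    return -1;
  if (pipe(b) < 0)
  {
    close(a[0]);
    close(a[1]);
    return -1;
  }

  fds[0].fd = a[0]; fds[0].events = POLLIN; fds[0].revents = 0;
  fds[1].fd = b[0]; fds[1].events = POLLIN; fds[1].revents = 0;

  if (write(b[1],"x",1) == 1)
    idx = first_ready(fds,2,1000);

  close(a[0]); close(a[1]);
  close(b[0]); close(b[1]);
  return idx;
}
```

Since only the second pipe has data, poll_demo() should report index 1. Even then, the index only says which descriptor is ready; finding the data that goes with it is a separate lookup.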

I could get around using select() or poll() by creating a multithreaded or
multiprocess application, but that's a whole new can of worms I'm opening up
(deadlocks or race conditions anyone?) in addition to the problems I
mentioned above about interprocess communication.

In looking around for a usable solution, I came across epoll [3], which is a
new multiplexing I/O API in the newer Linux kernels. Reading over the
documentation, it looks like you add file descriptors to an “epoll queue”
(which itself is a file descriptor), then you call epoll_wait(), which returns
an array of events for just the file descriptors that are ready for reading or
writing! It saves scanning through an entire list of file descriptors
continuously asking “do you have data?”
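
The add-then-wait sequence can be sketched roughly like this, assuming Linux and using a pipe in place of a real socket. The function name epoll_pipe_demo() is made up for this example, and I'm using epoll_create1(), which newer kernels provide alongside epoll_create().

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Register the read end of a pipe with an epoll queue, make it readable,
 * and check that epoll_wait() hands back exactly that descriptor.
 * Returns 0 on success, -1 on any failure. */
int epoll_pipe_demo(void)
{
  struct epoll_event ev;
  struct epoll_event ready[4];
  int                pfd[2];
  int                queue;
  int                n;
  int                result = -1;

  if (pipe(pfd) < 0)
    return -1;

  queue = epoll_create1(0);       /* the queue is itself a descriptor */
  if (queue >= 0)
  {
    ev.events  = EPOLLIN;         /* tell us when there is data to read */
    ev.data.fd = pfd[0];          /* user data: here just the fd itself */

    if (   (epoll_ctl(queue,EPOLL_CTL_ADD,pfd[0],&ev) == 0)
        && (write(pfd[1],"x",1) == 1))
    {
      n = epoll_wait(queue,ready,4,1000);   /* timeout in milliseconds */
      if (   (n == 1)
          && (ready[0].events & EPOLLIN)
          && (ready[0].data.fd == pfd[0]))
        result = 0;
    }
    close(queue);
  }

  close(pfd[0]);
  close(pfd[1]);
  return result;
}
```

Only descriptors with something to report come back in the array, so there's nothing to scan.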

What sold me was looking at the definition of the event structure:

> typedef union epoll_data {
>       void *ptr;
>       int   fd;
>       __uint32_t u32;
>       __uint64_t u64;
> } epoll_data_t;
>
> struct epoll_event {
>       __uint32_t events; /* Epoll events */
>       epoll_data_t data; /* User data variable */
> };
>

User data variable?

I get a pointer?

Associated with a file descriptor?

No way?!

Define a few structures with function pointers, and **boom!** The main loop
now looks like:

> void mainloop(int queue)
> {
>   struct epoll_event list[10];
>   int                events;
>   int                i;
>   struct foo        *data;
>
>   while(1)
>   {
>     events = epoll_wait(queue,list,10,TIMEOUT);
>     if (events < 0)
>       continue;       /* error, but we ignore for now */
>     for (i = 0 ; i < events ; i++)
>     {
>       data = list[i].data.ptr;
>       (*data->fn)(&list[i]);  /* call our function */
>     }
>   }
> }
>
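
The loop above leaves out the structure itself, so here's one way it might look. This is a guess at the shape, not the actual daemon code: the fields of struct foo beyond fn, and the names on_readable(), watch(), dispatch_once() and dispatch_demo(), are all made up for illustration, and a pipe stands in for a connection.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

struct foo
{
  void (*fn)(struct epoll_event *);  /* handler to call when fd is ready */
  int    fd;                         /* kept here; see the note below    */
  int    bytes_seen;                 /* hypothetical per-connection data */
};

/* Handler: recover our structure from the event and read what's there. */
static void on_readable(struct epoll_event *ev)
{
  struct foo *self = ev->data.ptr;
  char        buf[64];
  ssize_t     n    = read(self->fd,buf,sizeof(buf));

  if (n > 0)
    self->bytes_seen += (int)n;
}

/* Add obj->fd to the queue with obj itself as the user data pointer. */
static int watch(int queue,struct foo *obj)
{
  struct epoll_event ev;

  ev.events   = EPOLLIN;
  ev.data.ptr = obj;
  return epoll_ctl(queue,EPOLL_CTL_ADD,obj->fd,&ev);
}

/* One pass of the main loop; returns what epoll_wait() returned. */
static int dispatch_once(int queue,int timeout_ms)
{
  struct epoll_event list[10];
  struct foo        *data;
  int                events;
  int                i;

  events = epoll_wait(queue,list,10,timeout_ms);
  for (i = 0 ; i < events ; i++)
  {
    data = list[i].data.ptr;
    (*data->fn)(&list[i]);  /* call our function */
  }
  return events;
}

/* Returns the number of bytes the handler saw, or -1 on failure. */
int dispatch_demo(void)
{
  struct foo conn = { on_readable , -1 , 0 };
  int        pfd[2];
  int        queue;
  int        seen = -1;

  if (pipe(pfd) < 0)
    return -1;

  queue = epoll_create1(0);
  if (queue >= 0)
  {
    conn.fd = pfd[0];
    if (   (watch(queue,&conn) == 0)
        && (write(pfd[1],"hello",5) == 5)
        && (dispatch_once(queue,1000) == 1))
      seen = conn.bytes_seen;
    close(queue);
  }

  close(pfd[0]);
  close(pfd[1]);
  return seen;
}
```

One thing to watch: epoll_data_t is a union, so once you store a pointer in data.ptr you can't also read data.fd. That's why the descriptor lives in the structure here.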

Man, this now becomes easy. No more constantly checking file descriptors or
maintaining lists of file descriptors. I'm in heaven with this stuff. Even
better, this method scales beautifully [4].

[1] http://www.mysql.com/
[2] http://www.conman.org/people/spc/refs/search/
[3] http://lse.sourceforge.net/epoll/index.html
[4] http://www.kegel.com/c10k.html
