* * * * *

                More thoughts on optimizing a greylist daemon

I ran the updated stress test [1] on a faster machine (2.6GHz (gigaHertz))
and managed to get some impressive results.

There were three different ways I ran the test. One option had the stress
program send a request and wait for a reply. This was by far the slowest of
the tests, but the most reliable (in terms of actually processing every
request) with the greylist daemon [2] handling between 4,000 and 6,300 tuples
per second. Another option had a separate process waiting for the replies;
that went faster, between 11,000 and 17,000 tuples per second, but dropped a
ton of requests (on the order of 70%). The last option didn't even bother
with replies. This did both the best and the worst: 30,000 tuples per second,
but it dropped something like 90%.
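
The first mode (send a request, then wait for the reply) can be sketched
roughly as below. This is not the stress program's actual code:
send_and_wait() is a hypothetical helper, and the already-connected datagram
socket and the timeout are assumptions, since nothing above says how the
test talks to the daemon.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not from the stress program): send one request on
   an already-connected datagram socket, then block until a reply arrives
   or timeout_ms milliseconds pass.  Returns the reply length, or -1 with
   errno set (ETIMEDOUT when no reply showed up in time). */
ssize_t send_and_wait(int fd, const void *req, size_t reqlen,
                      void *reply, size_t replymax, int timeout_ms)
{
  struct pollfd pfd = { .fd = fd, .events = POLLIN };
  int rc;

  if (send(fd, req, reqlen, 0) < 0)
    return -1;

  rc = poll(&pfd, 1, timeout_ms);
  if (rc < 0)
    return -1;
  if (rc == 0)
  {
    errno = ETIMEDOUT;
    return -1;
  }

  return recv(fd, reply, replymax, 0);
}
```

Because each request blocks until its reply comes back, only one request is
ever in flight, which would fit this mode being both the slowest and the
one that loses nothing.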

So, the program can easily handle about 5,000 requests per second on a nice
server, which is probably way more than most SMTP (Simple Mail Transfer
Protocol) servers can handle (and it's much nicer than the 130/second I
thought it could handle).

I profiled the program again, and this time, got actual results I could use:

Table: Each sample counts as 0.01 seconds.
--------------------------------------------------------------------------
     % cumulative     self              self    total
  time    seconds  seconds    calls  Ts/call  Ts/call  name
--------------------------------------------------------------------------
 21.24       0.48     0.48  2260060     0.00     0.00  crc32
 14.38       0.81     0.33   443203     0.00     0.00  tuple_search
 11.51       1.07     0.26   565012     0.00     0.00  ip_match
  8.85       1.27     0.20   565012     0.00     0.00  type_graylist
  7.97       1.45     0.18        1     0.18     2.20  mainloop
  6.64       1.60     0.15   565015     0.00     0.00  send_packet
  4.87       1.71     0.11  7648182     0.00     0.00  tuple_cmp_ift
  4.87       1.82     0.11   565012     0.00     0.00  graylist_sanitize_req
  3.98       1.91     0.09  1761756     0.00     0.00  edomain_search
  3.54       1.99     0.08  2637054     0.00     0.00  edomain_cmp
  3.10       2.06     0.07   421359     0.00     0.00  tuple_add
  2.21       2.11     0.05   565012     0.00     0.00  send_reply
  2.21       2.16     0.05        1     0.05     0.05  whitelist_dump_stream
  0.89       2.18     0.02   565127     0.00     0.00  ipv4
--------------------------------------------------------------------------

Again, nothing terribly surprising here, except for the code gcc generated
for the crc32() function (two lines of C code, one of which is
while(size--)). But I used the default compiler settings; if it really
bothers me, I can turn up the optimization settings and see what I get.

[1] gopher://gopher.conman.org/0Phlog:2007/10/29.2
[2] gopher://gopher.conman.org/0Phlog:2007/08/16.1
