Title: Filtering spam using Rspamd and OpenSMTPD on OpenBSD

	Title: Filtering spam using Rspamd and OpenSMTPD on OpenBSD
	Author: Solène
	Date: 13 July 2021
	Tags: openbsd mail spam
	Description:

	# Introduction

	I recently used SpamAssassin to get rid of the spam I started to
	receive, but it proved quite useless against some kinds of spam, so
	I decided to give rspamd a try and write about it.

	rspamd can filter spam but can also sign outgoing messages with DKIM.
	I will only cover the anti-spam aspect here.

	rspamd project website

	# Setup

	The rspamd setup for spam filtering was incredibly easy on OpenBSD
	(6.9 when I wrote this). We need to install the rspamd service, the
	connector for OpenSMTPD, and redis, which is mandatory for rspamd to
	work.

	```shell instructions
	pkg_add opensmtpd-filter-rspamd rspamd redis
	rcctl enable redis rspamd
	rcctl start redis rspamd
	```

	Modify your /etc/mail/smtpd.conf file to add this new line:

	```smtpd.conf file
	filter rspamd proc-exec "filter-rspamd"
	```

	Then modify each of your "listen on ..." lines to append the
	filter "rspamd" option, like in this example:

	```smtpd.conf file
	listen on em0 pki perso.pw tls auth-optional filter "rspamd"
	listen on em0 pki perso.pw smtps auth-optional filter "rspamd"
	```

	Restart smtpd with "rcctl restart smtpd" and you should have rspamd
	working!
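
	Before relying on the new setup, it can help to validate the
	configuration and confirm that the daemons are running. Here is a
	minimal sketch, assuming the stock smtpd(8) and rcctl(8) tools; the
	function name check_mail_stack is made up for this example.

	```shell
	# Hypothetical helper: validate smtpd.conf, then check the daemons.
	check_mail_stack() {
		# smtpd -n only parses the configuration file and reports errors
		smtpd -n || return 1
		# rcctl check returns non-zero when a daemon is not running
		for svc in redis rspamd smtpd; do
			rcctl check "$svc" || return 1
		done
	}
	```

	Running check_mail_stack as root after the restart should report each
	service as ok; a failure points at the step to look into first.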

	# Using rspamd

	Rspamd automatically checks multiple criteria to assign a score to
	each incoming email. Above a high score the email is rejected, while
	an email scoring between the lower and the higher threshold may be
	tagged with a header "X-Spam" with the value "yes".
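
	To make the threshold idea concrete, here is a small sketch of how a
	score maps to an action. The function name and the two numbers (6 and
	15) are assumptions for illustration only, the real values live in
	the "actions" settings of your rspamd install, and rspamd knows more
	actions than the two shown here.

	```shell
	# Hypothetical illustration: map a score to an action.
	# The thresholds are assumed values, check your own configuration.
	classify_score() {
		awk -v score="$1" -v add_header=6 -v reject=15 'BEGIN {
			if (score + 0 >= reject)          print "reject"
			else if (score + 0 >= add_header) print "add header"
			else                              print "no action"
		}'
	}
	```

	With these assumed thresholds, "classify_score 7" falls in the middle
	band, which is the case where the email gets tagged instead of
	rejected.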

	If you want to automatically put the tagged email as spam in your Junk
	directory, either use a sieve filter on the server side or use a local
	filter in your email client. The sieve filter would look like this:

	```sieve rule
	# fileinto is an extension and must be declared before use
	require ["fileinto"];

	if header :contains "X-Spam" "yes" {
		fileinto "Junk";
		stop;
	}
	```

	# Feeding rspamd

	If you want better results, the filter needs to learn what is spam and
	what is not spam (named ham). You need to regularly scan new emails to
	increase the effectiveness of the filter. In my example I have a single
	user with a Junk directory and an Archives directory within the maildir
	storage, and I use crontab to run learning on mails newer than 24h.

	```crontab
	0 1 * * * find /home/solene/maildir/.Archives/cur/ -mtime -1 -type f -exec rsp…
	10 1 * * * find /home/solene/maildir/.Junk/cur/ -mtime -1 -type f -exec rsp…
	```

	# Getting statistics

	rspamd comes with very nice reporting tools. You can get a WebUI on
	port 11334, which listens on localhost by default, so you either
	need to tune rspamd to listen on other addresses or use an SSH
	tunnel.
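
	For the SSH tunnel option, a minimal sketch using the standard local
	port forwarding of ssh(1). The function name and the destination
	user@mail.example.org are placeholders, not part of my setup.

	```shell
	# Hypothetical helper: forward local port 11334 to the WebUI that
	# listens on the mail server's localhost. $1 = SSH destination.
	rspamd_tunnel() {
		# -N: do not run a remote command, -L: local port forwarding
		ssh -N -L 11334:127.0.0.1:11334 "$1"
	}
	```

	After running "rspamd_tunnel user@mail.example.org", the WebUI should
	be reachable at http://localhost:11334 from the local machine for as
	long as the command keeps running.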

	You can get the same statistics on the command line using the command
	"rspamc stat" which should have an output similar to this:

	```command line output
	Results for command: stat (0.031 seconds)
	Messages scanned: 615
	Messages with action reject: 15, 2.43%
	Messages with action soft reject: 0, 0.00%
	Messages with action rewrite subject: 0, 0.00%
	Messages with action add header: 9, 1.46%
	Messages with action greylist: 6, 0.97%
	Messages with action no action: 585, 95.12%
	Messages treated as spam: 24, 3.90%
	Messages treated as ham: 591, 96.09%
	Messages learned: 4167
	Connections count: 611
	Control connections count: 5190
	Pools allocated: 5824
	Pools freed: 5801
	Bytes allocated: 31.17MiB
	Memory chunks allocated: 158
	Shared chunks allocated: 16
	Chunks freed: 0
	Oversized chunks: 575
	Fuzzy hashes in storage "rspamd.com": 2936336370
	Fuzzy hashes stored: 2936336370
	Statfile: BAYES_SPAM type: redis; length: 0; free blocks: 0; total blocks: 0; f…
	Statfile: BAYES_HAM type: redis; length: 0; free blocks: 0; total blocks: 0; fr…
	Total learns: 4166
	```
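
	If you only care about a couple of counters, the output is easy to
	filter. This sketch assumes the line format shown above and a
	made-up function name, stat_summary; it reads the report on stdin.

	```shell
	# Hypothetical helper: extract the spam and ham counters from the
	# "rspamc stat" output given on stdin.
	stat_summary() {
		awk -F': ' '
			/Messages treated as spam/ { spam = $2 + 0 }
			/Messages treated as ham/  { ham = $2 + 0 }
			END { printf "spam=%d ham=%d\n", spam, ham }
		'
	}
	```

	Typical use would be "rspamc stat | stat_summary", for example from a
	monitoring script.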

	# Conclusion

	rspamd is for me a huge improvement in terms of efficiency. When I tag
	an email as spam, the next one looking similar will immediately go into
	Spam after the learning cron runs. It uses less memory than
	SpamAssassin and reports nice statistics. My SpamAssassin setup was
	directly rejecting emails, so I didn't have a good understanding of its
	effectiveness, but I got too many identical messages over weeks that
	were never filtered. For now rspamd has proved to be better here.

	I recommend looking at the configuration files. They are all disabled
	by default but offer many comments with explanations, which is a nice
	introduction to the features of rspamd. I preferred to keep the
	defaults and see how it goes before tweaking more.