README - bmf - bmf (Bayesian Mail Filter) 0.9.4 fork + patches
git clone git://git.codemadness.org/bmf
bmf -- Bayesian Mail Filter

About bmf
=========

This is a mail filter which uses the Bayes algorithm as explained in Paul
Graham's article "A Plan for Spam". It aims to be faster, smaller, and …
versatile than similar applications. Implementation is ANSI C and uses …
functions. Supported platforms are (in theory) all POSIX systems.

This project provides features which are not available in other filters:

(1) Independence from external programs and libraries. Tokens are store…
memory using simple vectors which require no heavyweight external data
structure libraries. The tokens are stored in plain-text "flat" files.

(2) Efficient processing. Input data is parsed by a handcrafted parser
which weighs in under 3% of the equivalent code generated by flex. No
portion of the input is ever copied and all i/o and memory allocation are
done in large chunks. Updated token lists are merged and written in one
step. Hashing is being considered for the next version to improve lookup
speed.

(3) Simple and elegant implementation. No heavyweight, copy-intensive m…
decoding routines are used. Decoding of quoted-printable text for selec…
mime types is being considered for the next version.

Note: the core filter function is from esr's bogofilter v0.6 (available …
http://sourceforge.net/projects/bogofilter/) with bugfix updates.

For the most recent version of this software, see:

http://sourceforge.net/projects/bmf/

How to integrate bmf
====================

The following procmail recipes will invoke bmf for each incoming email a…
place spam into $MAILDIR/spam. The first sample invokes bmf in its norm…
mode of operation and the second invokes bmf as a filter.

### begin sample one ###
# Invoke bmf and use return code to filter spam in one step
:0HB
* ? bmf
| formail -A"X-Spam-Status: Yes, tests=bmf" >>$MAILDIR/spam

### begin sample two ###
# Invoke bmf as a filter
:0 fw
| bmf -p

# Filter spam
:0:
* ^X-Spam-Status: Yes
$MAILDIR/spam

The following maildrop equivalents are suggested by Christian Kurz.

### begin sample one ###
# Invoke bmf and use return code to filter spam in one step
exception {
    `bmf`
    if ( $RETURNCODE == 0 )
        to $MAILDIR/spam
}

### begin sample two ###
# Invoke bmf as a filter
exception {
    xfilter "bmf -p"
    if (/^X-Spam-Status: Yes/)
        to $MAILDIR/spam
}

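Both one-step samples depend on bmf's exit status: the maildrop version
treats a $RETURNCODE of 0 as spam. The same test can be sketched in plain
shell as follows. The BMF variable and the function names are assumptions
made for this illustration and are not part of bmf. Keep in mind that bmf
registers every message it classifies in its normal mode, so each call
also updates the token lists.

```sh
#!/bin/sh
# Sketch only: BMF, is_spam and classify are names invented for this example.
BMF=${BMF:-bmf}

# is_spam FILE: succeeds when "$BMF" exits 0 for the message in FILE,
# which the samples above treat as "spam".
is_spam() {
    "$BMF" < "$1"
}

# classify FILE: print "spam" or "ham" according to the exit status.
classify() {
    if is_spam "$1"; then
        echo spam
    else
        echo ham
    fi
}

# Example (the path is a placeholder):
#   classify "$HOME/Maildir/new/some-message"
```
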

If you put bmf in your procmail or maildrop scripts as suggested above, …
will always register an email as either spam or non-spam. To reverse th…
registration and train bmf, the following mutt macros may be useful:

  macro index \ed "<enter-command>unset wait_key\n<pipe-entry>bmf -S\n<e…
  macro index \et "<enter-command>unset wait_key\n<pipe-entry>bmf -t\n<e…
  macro index \eu "<enter-command>unset wait_key\n<pipe-entry>bmf -N\n<e…

These macros bind the following keys, overriding any existing bindings:

  <Esc>d = de-register as non-spam, register as spam, and move to spam f…
  <Esc>t = test for spamicity.
  <Esc>u = de-register as spam, register as non-spam, and move to inbox …

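Outside of mutt, the same corrections can be scripted. The sketch below
wraps the -S and -N options used by the macros above. The reclassify
function, its "spam" and "ham" labels, and the BMF variable are
hypothetical names chosen for this example. Moving the message to another
folder is left to the caller.

```sh
#!/bin/sh
# Sketch only: BMF and reclassify are names invented for this example.
BMF=${BMF:-bmf}

# reclassify KIND FILE: reverse an earlier registration of the message in FILE.
reclassify() {
    case $1 in
        spam) flag=-S ;;   # de-register as non-spam, register as spam
        ham)  flag=-N ;;   # de-register as spam, register as non-spam
        *)    echo "usage: reclassify spam|ham FILE" >&2
              return 2 ;;
    esac
    "$BMF" "$flag" < "$2"
}

# Example (the path is a placeholder):
#   reclassify spam "$HOME/Maildir/cur/missed-spam"
```
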
Alternatively, if you use gnus you could add the following lines to your
.gnus to accomplish a similar result:

  (defun spam ()
    (interactive)
    (pipe-message "/usr/local/bin/bmf -S")
    (gnus-summary-move-article 1 "nnml:Spam"))

  (defun notspam ()
    (interactive)
    (pipe-message "/usr/local/bin/bmf -N")
    (gnus-summary-move-article 1 "nnml:inbox"))

  (add-hook
    'gnus-sum-load-hook
    (lambda nil
      (define-key gnus-summary-mode-map (read-kbd-macro "C-c C-o") 'spam)
      (define-key gnus-summary-mode-map (read-kbd-macro "C-c C-p") 'notspa…

How to train bmf
================

First, please keep in mind that bmf "learns" how to recognize spam from …
input that you give it. It works best if you give it exactly the email …
you receive, or have received in the recent past.

Here are some good techniques for training bmf:

  - If you keep a history of email that you have received, use your curr…
    and/or saved emails. It is fairly easy to create a small shell scri…
    that will pass all of your normal email to "bmf -n" and all of your …
    to "bmf -s". Note that if you do not use the mbox storage format, y…
    MUST invoke bmf exactly once per email. Using "cat * | bmf -n" will…
    work properly because bmf sees the entire input as one big email.

  - If you already use spamassassin, you can use it to train bmf for a
    couple of days or weeks. If spamassassin tags it as spam, run it
    through "bmf -s". If not, run it through "bmf -n". This can be
    automated with procmail or maildrop recipes.

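A minimal sketch of such a training script follows, assuming the mail is
stored one message per file (for example in a Maildir). It invokes bmf
exactly once per message. The BMF variable, the function names and the
example paths are assumptions made for this illustration. sa_flag relies
on the X-Spam-Flag: YES header that spamassassin adds to mail it tags as
spam.

```sh
#!/bin/sh
# Sketch only: BMF, train_dir and sa_flag are names invented for this example.
BMF=${BMF:-bmf}

# train_dir FLAG DIR: run "$BMF FLAG" once for each regular file in DIR
# and print the number of messages processed.
train_dir() {
    flag=$1
    dir=$2
    count=0
    for msg in "$dir"/*; do
        [ -f "$msg" ] || continue          # skip subdirectories and empty globs
        "$BMF" "$flag" < "$msg" > /dev/null
        count=$((count + 1))
    done
    echo "$count"
}

# sa_flag FILE: print -s if the header block of FILE carries
# "X-Spam-Flag: YES", otherwise print -n.
sa_flag() {
    if sed '/^$/q' "$1" | grep -qi '^X-Spam-Flag: YES'; then
        printf '%s\n' -s
    else
        printf '%s\n' -n
    fi
}

# Examples (paths are placeholders):
#   train_dir -n "$HOME/Maildir/cur"
#   train_dir -s "$HOME/Maildir/.spam/cur"
#   "$BMF" "$(sa_flag message)" < message
```
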
Here are some things that you should NOT do:

  - Get impatient with the training process and repeatedly pass one email
    through "bmf -s".

  - Manually move words around between lists and/or adjust the word coun…

Final words
===========

Thanks for trying bmf. If you have any problems, comments, or suggestio…
please direct them to the bmf mailing list, [email protected]…

Tom Marshall
20 Oct 2002