Version: (using KDE 3.5.7)
Installed from: Ubuntu Packages
OS: Linux

The filter setup created by KMail's "Anti-Spam Wizard" for bogofilter can severely damage bogofilter's wordlist when the user applies either "Classify as SPAM" or "Classify as NOT SPAM" to messages that have not been (automatically) registered before.

bogofilter's auto-register option "-u" only registers messages that it can automatically classify as spam or ham; messages it is unsure about are never registered. The manual filters, however, always unregister before they register: "bogofilter -N -s" (used to register a message as spam) first decrements the ham counts, and "bogofilter -S -n" (used to register a message as ham) first decrements the spam counts. The decrement applies to every token contained in the processed message as well as to the corresponding count in the special token ".MSG_COUNT".

If used on messages that have not been registered before, this may lead to a condition where the spam (or ham) count of a token exceeds the spam (or ham) message count, which in turn produces odd results in the individual spam or ham probabilities for the affected tokens. In extreme cases (the spam value for ".MSG_COUNT" is 0), bogofilter will produce a spam probability of "nan" because of a floating point division by zero.

Since it is generally not a good idea to unregister messages that have not been registered before, I suggest changing the generated filters into a setup that refrains from auto-registration and manual de-registration. As suggested by Matthias Andree on the bogofilter mailing list, I would like to propose replacing the current filter setup …

+----------------------+----------------------------------------+-------+
| Filter name          | Action                                 | Auto? |
+----------------------+----------------------------------------+-------+
| Bogofilter Check     | Pipe through "bogofilter -p -e -u"     | Yes   |
| Classify as SPAM     | Execute command "bogofilter -N -s"     | No    |
| Classify as NOT SPAM | Execute command "bogofilter -S -n"     | No    |
+----------------------+----------------------------------------+-------+

… with something like this:

+----------------------+----------------------------------------+-------+
| Filter name          | Action                                 | Auto? |
+----------------------+----------------------------------------+-------+
| Bogofilter Check     | Pipe through "bogofilter -p -e"        | Yes   |
| Classify as SPAM     | Execute command "bogofilter -s"        | No    |
| Classify as NOT SPAM | Execute command "bogofilter -n"        | No    |
+----------------------+----------------------------------------+-------+

Although spam and ham messages that bogofilter classifies correctly are no longer added to the wordlist automatically, this filter setup works pretty well on my system, relying only on the occasional manual classification. Not only does it avoid the problems mentioned above, it also makes checking messages much faster, since no write access to the wordlist is required.
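
To make the failure mode described above concrete, here is a minimal Python sketch of the bookkeeping. It is a toy model and not bogofilter's code: `ToyWordlist`, `ieee_div`, `demo` and the token names are hypothetical, clamping counts at zero is an assumption, and the per-token ratio is a simplified stand-in for bogofilter's real scoring. It only illustrates how unregistering a message that was never registered can leave a token's spam count above the spam message count and end in "nan".

```python
import math
from collections import defaultdict

MSG_COUNT = ".MSG_COUNT"  # special token holding the per-class message counts
SPAM, HAM = 0, 1          # column indices in each [spam, ham] pair


def ieee_div(a, b):
    """Divide like a C double: x/0 yields inf (nan for 0/0) instead of raising."""
    if b == 0:
        return math.nan if a == 0 else math.copysign(math.inf, a)
    return a / b


class ToyWordlist:
    """Hypothetical stand-in for a wordlist: token -> [spam_count, ham_count]."""

    def __init__(self):
        self.counts = defaultdict(lambda: [0, 0])

    def _adjust(self, tokens, column, delta):
        # Each distinct token and .MSG_COUNT change by one per message.
        # Assumption: counts are clamped at zero rather than going negative.
        for key in set(tokens) | {MSG_COUNT}:
            self.counts[key][column] = max(0, self.counts[key][column] + delta)

    def register(self, tokens, column):    # models "-s" / "-n"
        self._adjust(tokens, column, +1)

    def unregister(self, tokens, column):  # models "-S" / "-N"
        self._adjust(tokens, column, -1)

    def spam_ratio(self, token):
        """Simplified per-token spam ratio; not bogofilter's real scoring."""
        spam, ham = self.counts[token]
        n_spam, n_ham = self.counts[MSG_COUNT]
        bad = ieee_div(spam, n_spam)
        good = ieee_div(ham, n_ham)
        return ieee_div(bad, bad + good)


def demo(unregister_first):
    wl = ToyWordlist()
    wl.register(["cheap", "pills"], SPAM)  # one spam already in the wordlist
    ham_message = ["meeting", "agenda"]    # never registered before
    if unregister_first:
        wl.unregister(ham_message, SPAM)   # old "Classify as NOT SPAM": -S -n
    wl.register(ham_message, HAM)          # proposed setup does only this: -n
    return wl


old = demo(unregister_first=True)
new = demo(unregister_first=False)
# Spam count of "cheap" versus the spam message count after the old filter ran.
print(old.counts["cheap"][SPAM], old.counts[MSG_COUNT][SPAM])
# Ratio for "cheap" under the old and the proposed filter commands.
print(old.spam_ratio("cheap"), new.spam_ratio("cheap"))
```

In this model the old "-S -n" sequence drops the spam message count to 0 while "cheap" still has a spam count of 1, so the ratio becomes nan; registering with "-n" alone leaves the counts consistent.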
> As suggested by Matthias Andree on the bogofilter mailing list

Can you add a link to the archives please?

Other than that, this sounds sensible and can be easily achieved by modifying the kmail.antispamrc file. Maybe I'll have a look at this later.
>> As suggested by Matthias Andree on the bogofilter mailing list
> Can you add a link to the archives please?

Sure. The message I was referring to can be found here:
<http://www.bogofilter.org/pipermail/bogofilter/2007-July/009252.html>

> Other than that, this sounds sensible and can be easily achieved by modifying
> the kmail.antispamrc file. Maybe I'll have a look at this later.

Yep. Changing "PipeCmdDetect", "ExecCmdSpam" and "ExecCmdHam" should do.
SVN commit 695738 by tmcguire:

Change the filter commands for bogofilter.
The old behavior corrupted the bogofilter database because KMail unregistered messages which were not registered with bogofilter in the first place.
With the new behavior, messages which are classified automatically are no longer added to the bogofilter database.

For more details and a better explanation, see the bug report and especially the bogofilter mail archives (linked to from the bug report).

BUG: 148211
CCBUG: 74577

 M  +3 -3      kmail.antispamrc

--- trunk/KDE/kdepim/kmail/kmail.antispamrc #695737:695738
@@ -34,10 +34,10 @@
 Executable=bogofilter -V
 URL=http://bogofilter.sourceforge.net
 PipeFilterName=Bogofilter Check
-PipeCmdDetect=bogofilter -p -e -u
+PipeCmdDetect=bogofilter -p -e
 PipeCmdNoSpam=
-ExecCmdSpam=bogofilter -N -s
-ExecCmdHam=bogofilter -S -n
+ExecCmdSpam=bogofilter -s
+ExecCmdHam=bogofilter -n
 DetectionHeader=X-Bogosity
 DetectionPattern=(yes)|(spam\\b)
 DetectionPattern2=