Bug 87754 - current anti-spam-functionality too slow to be considered useful
Status: RESOLVED WAITINGFORINFO
Alias: None
Product: kmail
Classification: Applications
Component: filtering
Version: 1.7
Platform: unspecified Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: kdepim bugs
URL:
Keywords:
Duplicates: 87936
Depends on:
Blocks:
 
Reported: 2004-08-22 16:02 UTC by S. Burmeister
Modified: 2012-08-19 00:50 UTC
CC List: 4 users

See Also:
Latest Commit:
Version Fixed In:


Description S. Burmeister 2004-08-22 16:02:58 UTC
Version:            (using KDE 3.3.0)
OS:                Linux

Which other feature in KDE would be considered ready for release if it had the drawbacks of the spam-filtering in kmail 1.7?

When I use the spam-wizard it tells me:

- kmail is blocked while filtering spam (blocking an app is normally considered a bug)

As a justification for this it tells me:

- usually filtering spam is time-consuming

If I do not want kmail to be slow and blocked, it tells me to remove the filters, i.e. not to use spam detection.

The problem about this is that:

a) Everybody wants to filter spam, so disabling it is not an option
b) Everybody needs email, but not everybody can be expected to set up piping email through some tool before it reaches the IMAP server, so that kmail does not have to filter spam itself.
c) Filtering spam is usually NOT time-consuming, as Mozilla has demonstrated for a very long time now.
d) The time-consuming, blocking behaviour is apparently seen as less of a nuisance than implementing spam detection in kmail itself.

You might say that this bug is invalid because the matter was already discussed before, but that is not true. This bug would be invalid only if kmail could filter spam as quickly and as well with external tools as Mozilla does with its internal tool. At the moment, however, kmail 1.7 has only shown that using external tools is not an option.

If you do not think so please try to explain the following and tell me that 1.7's way is the best way kmail can handle spam.

I select 5 spam-emails from my localhost/IMAP account and mark them as spam:
kmail: blocked, it takes >25 sec to do the job
mozilla: too fast to notice whether it is blocked during spam-detection or not
(All this on a 300MHz CPU with 192MB RAM)

You could tell me to go and use Mozilla if I do not like kmail, but as a matter of fact I do like it, so I ask for a spam-detection that is as fast and useful as Mozilla's. If this cannot be achieved with external tools, the only choice is to implement an internal one. If it can be done with external tools, fair enough.
Comment 1 Henrique Pinto 2004-08-22 16:16:45 UTC
On Sun 22 Aug 2004 11:03, S.Burmeister wrote:
> I select 5 spam-emails from my localhost/IMAP account and mark them as
> spam: kmail: blocked, it takes >25 sec to do the job
> mozilla: too fast to notice whether it is blocked during spam-detection or
> not (All this on a 300MHz CPU with 192MB RAM)

Filtering spam using KMail and bogofilter on my system is fast enough not to 
be noticed, and my system is not much faster than yours. Please check if 
there are any configuration problems (I've heard there's a very slow mode for 
spamassassin, maybe you're using it).

Comment 2 S. Burmeister 2004-08-22 17:58:09 UTC
On Sunday, 22 August 2004 16:16 you wrote:
> Filtering spam using KMail and bogofilter on my system is fast enough not
> to be noticed, and my system is not much faster than yours. Please check if
> there are any configuration problems (I've heard there's a very slow mode
> for spamassassin, maybe you're using it).

This would not change anything! I did what can be expected from a user, i.e.
install the spamassassin rpm and use the wizard, that's it. Mozilla requires
even less. If there is a problem with spamd, it should either not be offered
as an option in the wizard, or the user should be given detailed
instructions. How should someone new/inexperienced know about this? You
cannot expect someone to dig into configuration files just because they want
to use spam filtering, a standard feature of today's mail clients.

Comment 3 Malte S. Stretz 2004-08-22 18:01:31 UTC
The blocked KMail interface is yet another incarnation of KDE's Most Hated Bug 41514.

Unfortunately you forgot to mention which backend (SpamAssassin, bogofilter, CRM-11something) you use.  I guess it's SpamAssassin, which should be pretty fast by itself if network tests aren't enabled.  If network tests are run, on the other hand, it does a whole bunch of things more than the Mozilla filter and, depending on the speed of your network, can indeed take a long time.

First downloading mail and then queueing it in a special folder for spam check (and other filters) before it hits the inbox might be a nice feature though.
Comment 4 Matt Douhan 2004-08-22 18:04:23 UTC
On Sunday 22 August 2004 17.58, S.Burmeister wrote:
>
> This would not change anything! I did what can be expected from a user,
> i.e. install the spamassassin rpm and use the wizard, that's it. Mozilla
> requires even less. If there is a problem with spamd, it should either not
> be offered as an option in the wizard, or the user should be given
> detailed instructions. How should someone new/inexperienced know about
> this? You cannot expect someone to dig into configuration files just
> because they want to use spam filtering, a standard feature of today's
> mail clients.

Well, since it works for many others, have you considered the possibility that 
your system is the problem?

rgds

Matt

Comment 5 S. Burmeister 2004-08-22 18:18:12 UTC
> Well since it works for many others have you considered the possibility
> that your system is the problem?

Well, Gene Heskett has the same problem. And as I said, I did not do anything 
but install the rpm package. On my 1.4 GHz system (SuSE 9.1) this feature 
also blocks for far too long.

And even if so, I have never heard of anyone that used mozilla's 
spam-filtering-feature and got a problem with it because of his/her system, 
which, again, tells me that internal might be better as a default because it 
is less vulnerable to system issues and external should be an option for 
advanced users who can handle problems that appear because of the tool being 
external.

Comment 6 S. Burmeister 2004-08-22 18:48:01 UTC
Hi Fred!

On Sunday, 22 August 2004 18:30 Fred Emmott wrote:
> The speed of anti-{virus,spam} filtering depends on:
> (1) which tool you use.
> (2) how that tool is configured.

So true! That is why one should offer as a default a tool that does not have 
to be configured and is reliable, i.e. something like mozilla does. If people 
want to tune their own tool, fair enough, you can still do that, but the 
normal user should not have to worry about system issues because s/he is 
installing kmail in order to read mail and not to configure and tune a 
spam-detection-tool.

The simpler to use, the better, as a default. Tuning and Config-files for 
advanced users.

The question is: What harm is done to advanced users if the default 
anti-spam-engine in kmail is internal? They simply do not have to switch it 
on and go on using their filter-rules.

What trouble is saved for users who are not advanced enough to tweak their 
spamassassin, check their system...? A lot I would say.

Sven

Comment 7 Malte S. Stretz 2004-08-22 19:26:30 UTC
There is just one thing you forgot:  Whichever internal tool that would be, it has to be (a) written and (b) maintained.  Spam-fighting is a very dynamic area which needs lots of know-how.  So if you find somebody with that know-how who wants to join the KDE project, an internal solution might be possible, I think.
Comment 8 Matt Douhan 2004-08-22 20:48:27 UTC
On Sunday 22 August 2004 20.37, Gene Heskett wrote:
> Then how come he isn't the only one squawking Matt?  I'm on that
> squawk list too.
>
You seem to be missing my point. I am not saying it is NOT a problem; I am 
simply saying it may be related to the local config.

Matt

Comment 9 S. Burmeister 2004-08-22 21:30:46 UTC
Hi!

On Sunday, 22 August 2004 19:26 Malte S.Stretz wrote:
> There is just one thing you forgot:  Whichever internal tool that
> would be it has to be (a) written and (b) maintained.  Spam-fighting is a
> very dynamic area which needs lots of know-how.  So if you find somebody
> with that know-how who wants to join the KDE project, an internal solution
> might be possible I think.

Well, you are right that it has to be implemented and maintained, but you cannot 
tell me that there is not a single developer with the necessary know-how 
within the KDE community. Moreover, there are the Mozilla sources to look at. 
From my point of view the problem is that this is not seen as necessary, thus 
no resources are committed. Where there is a will, there is a way. If that saying 
exists in English. ;)

Sven

Comment 10 S. Burmeister 2004-08-22 21:33:48 UTC
Gene's comment did not make it through, so I post it for him:

On Sunday, 22 August 2004 17:41 Gene Heskett wrote:
I'll second this one, the blocking during the spamassassin checking 
is a major problem here, even on an adsl circuit.  What about the 
poor schmucks on a metered time dialup?  Timewise vs phone bill, it's 
far cheaper for them to hit the delete button than it is to run 
spamassassin.  It's not acceptable performance to have kmail blocked 
for 30-45 seconds while a single message is being cleared.  I've lost 
messages because the delete button didn't work, so I clicked it again 
10 seconds later, and when kmail woke up, it killed the one I wanted 
killed, and the next one too.

This really, really does need to be addressed ASAP.  I see no reason why 
such "being checked" mail cannot remain invisible in a separate 
buffer/directory until such time as the spam tools either kill it at 
their leisure, or clear it for placement in the inbox and the 
following filtering to then place them in the appropriate folders.
It (sa) is working pretty well at finding spam, but the lags are 
making it very hard to swallow.
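
The separate buffer Gene describes can be sketched in a few lines of Python. This only illustrates the idea and is not KMail code: the class name `StagedMailbox`, its folder names and the `is_spam` callback are all hypothetical.

```python
from collections import deque


class StagedMailbox:
    """Hypothetical model: new mail waits in a hidden buffer until it is checked."""

    def __init__(self):
        self.pending = deque()  # downloaded but not yet checked, hidden from the user
        self.inbox = []         # cleared by the spam check
        self.spam = []          # flagged by the spam check

    def receive(self, message):
        # Downloading only queues the message; no filter runs here.
        self.pending.append(message)

    def process_pending(self, is_spam):
        # Drain the buffer in arrival order; is_spam stands in for any checker.
        while self.pending:
            message = self.pending.popleft()
            if is_spam(message):
                self.spam.append(message)
            else:
                self.inbox.append(message)
```

Because `receive` never calls the checker, a slow spam tool would delay only when a message becomes visible, not the download itself.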
Comment 11 S. Burmeister 2004-08-22 22:03:38 UTC
Hi!

On Sunday, 22 August 2004 21:43 Fred Emmott wrote:

> (1) Make an "unfiltered" folder
> (2) Select the "SpamAssassin Check" filter
> (3) Untick "to incoming messages"
> (4) Make a filter to move all incoming messages to "unfiltered"
> (5) Make a filter to move messages with "X-Spam-Flag" "contains" "no" into
> Inbox
>
> When new messages come in, go into "unfiltered" folder, select all
> messages, press ctrl+j.

I am sure this is a good workaround, so thanks, as I can also use it. However, 
it just proves the point that there is too much that can go wrong with 
external tools, which is not a problem for advanced users like you but is for 
normal users like me.

My point really only is that although external tools might have advantages for 
advanced users, the best default solution is a built-in tool that 
does not need any installation, configuration or optimisation. That way it is 
safe from system issues and most other issues that can 
happen when using external tools. You might lose a few percent in accuracy 
because the technology is not bleeding edge, but it works well for mozilla, so why 
should it not for kmail?

Sven
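
The routing rule in step (5) of the quoted workaround can be sketched in Python as follows. This is an illustration under assumptions, not how KMail implements its filters: the function name `route_after_check` is hypothetical, and the match is assumed to be case-insensitive.

```python
from email import message_from_string


def route_after_check(raw_message):
    """Return the target folder for a message that has been through the spam check."""
    msg = message_from_string(raw_message)
    flag = msg.get("X-Spam-Flag", "")
    # Step (5): "X-Spam-Flag" "contains" "no" moves the message to the inbox.
    if "no" in flag.lower():
        return "inbox"
    # Anything else stays in "unfiltered" for other rules or manual handling.
    return "unfiltered"
```

A message that has not been checked yet carries no `X-Spam-Flag` header, so it stays in the "unfiltered" folder until the check has run.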

Comment 12 Andreas Gungl 2004-08-23 09:01:45 UTC
In the meantime, there have been proposals on how to fix the setup for slow 
machines. See the previous comments. But you should be aware that the 
concrete fix always depends on the concrete setup of your system.

Anyway, using checks on the server is the better alternative. If you use 
providers like e.g. GMX, you can set up additional filters based on the 
headers added by the scan tools of the provider. So you can filter out spam 
detected by the provider without scanning it locally again.

As said before in a comment, configuring SpamAssassin can lead to _very_ 
different runtime behavior. You may use the daemon mode or the standalone 
perl script. You may use local checks only or the whole palette of remote 
checks.
The current state of KMail's wizard doesn't allow deeper checks of that 
configuration. So it falls back to the default mode in the SpamAssassin 
support, i.e. use the standalone script.
I hope to improve the situation in the next release.

BTW Bogofilter is much faster but offers less functionality (IMHO). So there 
might be a chance for you to accelerate the scans if that tool is installed on 
your machine.

I don't see the inclusion of spam detection source code in KMail's code base 
as an option. This was discussed long enough on the development mailing 
list and the reasons are obvious. The way to go is the current one. So we 
should discuss how to improve the wizard. E.g. it might be able to 
check the SpamAssassin configuration and try to suggest changes or 
whatever.

Please understand that there are so many aspects to care about when you talk 
about spam filtering in a mail user agent. Even manpower has to be 
considered.
Constructive feedback is welcome, comments like "Why are the developers so 
dumb to provide such a slow scanning" don't make KMail better.

Comment 13 S. Burmeister 2004-08-23 18:42:52 UTC
Hello!

On Monday, 23 August 2004 09:01 Andreas Gungl wrote:
> I don't see the inclusion of spam detection source code to KMail's code
> base as an option. This was discussed long enough on the development
> mailing list and the reasons are obvious. The way to go is the current one.
> So we should discuss about how to improve the wizard. E.g. it might be able
> to check the SpamAssassin configuration and try to suggest changes or
> whatever.

The discussion was about the current way being better, and as this bug shows, 
the current implementation is only better as long as you are an advanced 
user, i.e. change the standard configuration of the spamassassin rpm, or use 
less featured tools, like bogo. So up to now the new/inexperienced user has 
not really gained anything compared to mozilla's functionality.

If you decide in favor of the current way, you have to provide a tool that is 
as easy to use and as hassle-free as Mozilla's in the first place, because those 
are the requirements for an everyday app like kmail, whose usage has to fit 
the capabilities of a new/inexperienced user. That's a bit shit 
for people like developers who know a lot more about the subject, but in reality 
an app is built for its users first and not for the developers. Yeah, I 
know developers are also users, but not the ones that come first for an 
everyday app like kmail in terms of ease.

Once the normal user has something equal in ease and functionality to 
Mozilla's way, developers can get into their advanced features. If one 
can/would do that, the decision to go this way was right. The problem with 
discussing this issue on a devel list or even in bugzilla is that there are no 
users that represent new/inexperienced users, so their opinions/capabilities 
are not considered; even those hundreds of votes for an internal app have not 
been considered. Manpower would really be the only feasible reason, but I 
will come to that later.

As I said before, I absolutely understand that for advanced users the current 
way is the best, but unfortunately the number of advanced users using kmail 
is becoming relatively smaller every day, as more people switch from Windows 
to Linux.

> Please understand that there are so many aspects to care about when you
> talk about spam filtering in a mail user agent. Even manpower has to be
> considered.

This is certainly true. However, I think that all the manpower spent on defeating 
hundreds of votes and wishlist items (not hundreds I hope ;) ) for an 
internal tool, plus creating that wizard, plus optimising it, plus the time 
users will need to reconfigure spamassassin, would have been enough time and 
manpower to get an internal tool working.

> Constructive feedback is welcome, comments like "Why are the developers so
> dumb to provide such a slow scanning" don't make KMail better.

Did you say that? I did not. I just claim that developers are so good that 
they forget about the normal user who does not even know about mailing lists 
to ask a question. And the way software works is that the user comes first, 
and for everyday apps like email it is unfortunately the inexperienced user, 
with no capability to edit a configuration file. I repeat myself, but as the 
point seems to be missed: if s/he is guaranteed to be able to use kmail's 
anti-spam functionality as easily and hassle-free as Mozilla's, then it has 
fulfilled the standard. After that, developers can add advanced options for 
advanced users and advanced performance not needed by any new/inexperienced 
user who just needs guaranteed ease.

It really does not make sense to discuss this issue any further if developers 
do not agree on the basic assumption that this feature has to be as easy and 
hassle-free as mozilla's in every aspect.

Sven

Comment 14 Corey 2004-08-23 22:23:02 UTC
I'm sorry if this is an unhelpful suggestion - but for those who are using
spamassassin, be sure to use spamd/spamc as a method to at least speed up
the processing a little.

Comment 15 Henrique Pinto 2004-08-24 01:09:26 UTC
On Mon 23 Aug 2004 17:02, Gene Heskett wrote:
> So possibly, if someone has the time to play with this, and could post
> a FAQ format, or an SA configure tutorial that would help us, it
> would reduce the mewling we're doing about it.  Considerably.
>
> Could this be done by someone a lot more familiar with how it works
> than this user is?  And post it someplace on kde.org so we can find
> it?

For people who can't/don't want to spend some time configuring anti-spam, I 
would recommend using bogofilter. It requires no configuration and is 
extremely fast.

Comment 16 Don Sanders 2004-08-24 10:33:37 UTC
I'm aghast. It's hard to know where to start.

The problem of KMail blocking when detecting/filtering spam wouldn't 
somehow magically disappear if spam detection/filtering was done 
internally rather than using an external tool.

Regardless there's a basic problem of the filters being blocking that 
must be solved.

This problem doesn't exist because we the developers are stupid and 
have no idea what usability issues users are experiencing.

The problem exists because solving the problem of blocking filters 
isn't trivial. The filtering code is used in several different places 
in KMail, so fixing this problem requires an architecture change. Also, 
these types of blocking problems really can't be completely solved; 
they can only be reduced in magnitude, ideally to the point where 
they are no longer noticeable. These are software engineering issues 
that require at least a basic understanding of programming to 
comprehend. 

Furthermore, solving these issues requires intricate knowledge of 
KMail's internals. The Mozilla code just isn't useful/helpful here.

I really don't know what else to say. 

Don.

Comment 17 Fred Emmott 2004-08-24 14:57:31 UTC
*** Bug 87936 has been marked as a duplicate of this bug. ***
Comment 18 Bonnaud Frédéric 2004-08-24 15:44:17 UTC
Maybe it is possible to run the receiving/filtering process in a separate thread so that it does not block kmail?
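
A minimal sketch of that idea in Python, assuming the filter is an external command that reads a message on stdin and writes the result to stdout. The names `filter_in_background` and `STAND_IN_FILTER` are hypothetical, and the stand-in command is a small child Python process rather than a real spam tool. KMail's filter code is C++ and, as noted above, is called from several places, so this shows the general pattern only.

```python
import queue
import subprocess
import sys
import threading


def filter_in_background(message, command, results):
    """Pipe one message (bytes) through an external command on a worker thread."""

    def worker():
        # The blocking wait happens here, not in the caller's thread.
        done = subprocess.run(command, input=message, capture_output=True)
        results.put((done.returncode, done.stdout))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


# Stand-in for a real filter: a child Python process that prepends a header
# to whatever it reads on stdin.
STAND_IN_FILTER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write('X-Spam-Flag: NO\\n' + sys.stdin.read())",
]
```

The caller would hand over the message and return to its event loop at once, then poll the queue (for example with `results.get_nowait()`) to pick up finished messages.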

Comment 19 S. Burmeister 2004-08-24 19:16:44 UTC
On Monday, 23 August 2004 22:02 Gene Heskett wrote:
> On Monday 23 August 2004 12:42, S.Burmeister wrote:
> > And the way software
> > works is that the user comes first and for everyday apps like email
> > it is unfortunately the inexperienced user, with no capabilities to
> > edit a configuration-file. I repeat myself but as the point seems
> > to be missed: If s/he can guaranteed use kmail's
> > anti-spam-functionality as easy and hassle-free as Mozilla's then it
> > has fulfilled the standard, after that developers can add advanced
> > options for advanced users and advanced performance not needed by
> > any new/inexperienced user that just needs guaranteed ease.
> >
> > It really does not make sense to discuss this issue any further if
> > developers do not agree on the basic assumption that this feature
> > has to be as easy and hassle-free as mozilla's in every aspect.
>
> Sven:  I think that some of our problems vis-a-vis the speed of SA as
> it's set up by default may, from the sounds of it, be somewhat
> alleviatable (Is there such a word?) by some judicious config options.

I guess so. That's why I think that there should be another way of approaching 
this, but as long as people do not agree on this I see the following 
workaround as feasible.

1. If Bogofilter is the one that never caused problems for anyone, put it at 
the top of the list and mark it as the best option for new/inexperienced users, 
as it is hassle-free, so that people pick this one if they have problems or do 
not know which one to install. 

2. Below the text that says that kmail could/will be blocked, provide a link to 
a document that explains how to configure spamassassin in order to make it 
work. This would come close to your idea but is closer to the user than an 
FAQ on the internet, as that is only for people who search for it, which 
new/inexperienced users do not do.

Sven

Comment 20 S. Burmeister 2004-08-24 19:20:40 UTC
Hi!

On Monday, 23 August 2004 22:23 Corey wrote:
> I'm sorry if this is an unhelpful suggestion - but for those who are using
> spamassassin, be sure to use spamd/spamc as a method to at least speed up
> the processing a little.

It's not unhelpful, but it should be done by the wizard, because the user 
normally cannot know about this, unless s/he is someone who does not need the 
wizard anyway. Do you just have to replace spamassassin by spamd and start the service?

Thanks,

Sven

Comment 21 S. Burmeister 2004-08-24 19:31:25 UTC
Hi!

On Tuesday, 24 August 2004 10:33 Don Sanders wrote:
> I'm aghast. It's hard to know where to start.

At the beginning is always a good place! ;)

> The problem of KMail blocking when detecting/filtering spam wouldn't
> somehow magically disappear if spam detection/filtering was done
> internally rather than using an external tool.

Apparently bogofilter minimises the problem, so having this as the default, 
somehow internal, tool would make a big difference. At least it should be 
marked as the most hassle-free.

> Regardless there's a basic problem of the filters being blocking that
> must be solved.
>
> This problem doesn't exist because we the developers are stupid and
> have no idea about what the usability issues users are experiencing.

You people tend to use words on yourself that others never mentioned. ;)

Sven

Comment 22 Corey 2004-08-24 20:26:24 UTC
On Tuesday 24 August 2004 10:20 am, S.Burmeister wrote:
> Hi!
>
> On Monday, 23 August 2004 22:23 Corey wrote:
> > I'm sorry if this is an unhelpful suggestion - but for those who are
> > using spamassassin, be sure to use spamd/spamc as a method to at least
> > speed up the processing a little.
>
> It's not unhelpful, but should be done by the wizard, because the user
> normally cannot know about this unless s/he does not need the wizard
> anyway. Do you just have to replace spamassassin by spamd and start the
> service?
>

Start spamd first (I do this in an init script at bootup), then replace usage of the 'spamassassin' cmd with 'spamc' instead.

You can use the same options, just change the command - and of course spamd
must already be running.  You should notice a speedup, because new instances won't be started with _every_ spam.

Comment 23 S. Burmeister 2004-08-24 22:01:25 UTC
Hello again!

On Tuesday, 24 August 2004 10:33 Don Sanders wrote:
> The problem exists because solving the problem of blocking filters
> isn't trivial. The filtering code is used in several different places
> in KMail so fixing this problem requires an architecture change. Also
> really these types of blocking problems can't be completely solved
> they can only be reduced in magnitude, ideally to the point where
> they are no longer noticeable. These are software engineering issues
> that require at least a basic understanding of programming to
> comprehend.

I had an idea, maybe a stupid one, but in that case blame it on my too basic 
understanding of programming that makes me unable to comprehend. ;)

My assumption would be that, as spam-filtering/filtering works in Mozilla 
with no noticeable delay or blocking, it is possible 
to achieve this in kmail too.

Further, kmail's filtering is used across the program, so one cannot change 
anything to do with it without a lot of hassle.

My suggestion would be, although it is not a nice solution, to simply leave 
the filtering as it is for now and develop a new filter mechanism that is 
just used for spam filtering (for the moment). In this way the functions to 
be programmed are not that many and one could hardcode the needed filters, 
i.e. piping and moving email, for the different tools available in the wizard, 
i.e. spamassassin and bogofilter. A new/inexperienced user would not change 
the filters anyway, and others can go on using the normal filters, as they do 
not lose anything but just do not gain any performance.

This way a fast filtering mechanism with limited functionality is 
used, which does not have to care about the existing 
filtering functions in kmail.

The wizard would offer two options to the user:

1. Preconfigured spam-handling (i.e. the hardcoded one that just uses the 
predefined filter-rules)
2. User-defined spam-filtering (i.e. the user can alter the filter-rules used)

Second, for the new option, it simply asks which tool to use, i.e. spamassassin 
or bogo. If spamassassin is chosen, kmail should check if spamd is running and 
use it if possible. Bogo should also be recommended if it really is faster 
and its functionality is not a lot worse than spamassassin's.

What do you think?

Sven

Comment 24 Jan de Visser 2004-08-24 22:19:40 UTC
On August 24, 2004 04:01 pm, S.Burmeister wrote:
> My assumption would be that as it works in Mozilla to use
> spam-filtering/filtering with no noticeable delay or blocking it is
> possible to achieve this in kmail too.
>
> Further, kmail's filtering is used across the program, so one cannot change
> anything to do with it without a lot of hassle.
>
> My suggestion would be, although it is not a nice solution, to simply leave
> the filtering as it is for now and develop a new filter-mechanism that is
> just used for spam-filtering (for the moment).

The problem(s), however, is that the slowness of the filtering is not due to the 
fact that spamassassin is stupid; it is due to the fact that proper spam 
filtering is *hard*. Hard to code, but also hard to execute. The slowness is 
due to spamassassin going out to various blacklists etc to determine a 
spam rating that is as accurate as possible. Mozilla appears fast because it *only* does 
Bayesian filtering, which needs a lot of training. I think that the 
reluctance of the KMail developers to go down a similar path is justified, 
since it's a lot of work and not the holy grail anyway.

As an aside re: mozzie's spamfiltering: last week I used a mozilla setup which 
I hadn't used for a while, and it promptly started filtering oodles of 
legitimate unread e-mail to my spam folder, because it wasn't trained 
properly. Since my provider does server side filtering (which truly is the 
way to go), I had a false positive ratio of about 90%. So mozzie isn't 
flawless either.

So, in conclusion: Ask your provider for server side spamfiltering. If he 
refuses, change providers.

JdV!!
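
Why a purely Bayesian filter is fast but useless until trained can be seen in a toy sketch. This is a generic naive Bayes classifier with add-one smoothing, written for illustration only; it is not Mozilla's or bogofilter's actual algorithm, and the class name `TinyBayes` is hypothetical.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy word-count classifier; everything is local, no network lookups."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.words[label].update(text.lower().split())
        self.messages[label] += 1

    def spam_probability(self, text):
        vocab = set(self.words["spam"]) | set(self.words["ham"])
        spam_total = sum(self.words["spam"].values())
        ham_total = sum(self.words["ham"].values())
        # Prior odds from the number of trained messages, smoothed by one.
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for word in text.lower().split():
            # Add-one smoothing, with one extra slot for unseen words.
            p_spam = (self.words["spam"][word] + 1) / (spam_total + len(vocab) + 1)
            p_ham = (self.words["ham"][word] + 1) / (ham_total + len(vocab) + 1)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))
```

With no training data every word is equally likely in both classes, so the filter cannot favour either verdict. Scoring is only a few dictionary lookups per word, which is why this kind of check feels instant next to network tests.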

Comment 25 S. Burmeister 2004-08-25 22:05:27 UTC
On Tuesday, 24 August 2004 20:26 you wrote:
> Start spamd first ( I do this in an init script at bootup ), then replace
> usage of the 'spamassassin' cmd with 'spamc' instead.

Do I have to replace the sa-learn command by spamc too?

Comment 26 S. Burmeister 2004-08-25 22:23:17 UTC
Hi!

On Tuesday, 24 August 2004 22:19 Jan de Visser wrote:
> The problem(s) is however that the slowness of the filtering is not due to
> the fact that spamassassin is stupid, it is due to the fact that proper spam
> filtering is *hard*. Hard to code, but also hard to execute. The slowness
> is due to spamassassin going out to various blacklists etc to determine an
> as accurate as possible spamrating. Mozilla appears fast because it *only*
> does Bayesian filtering, which needs a lot of training. I think that the
> reluctance of the KMail developers to go down a similar path is justified,
> since it's a lot of work and not the holy grail anyway.

Well, spamassassin was primarily designed to work on a server and not within a 
mail client, so Mozilla fits the purpose whereas spamassassin is a bit too 
much for most workstation purposes. If people use spamassassin for their 
mail client, they mostly do that offline, at least a lot of them, so Bayesian 
filtering is fast and fits the purpose to a more than satisfying degree for most 
private workstation users.

> As an aside re: mozzie's spamfiltering: last week I used a mozilla setup
> which I hadn't used for a while, and it promptly started filtering oodles
> of legitimate unread e-mail to my spam folder, because it wasn't trained
> properly. Since my provider does server side filtering (which truly is the
> way to go), I had a false positive ratio of about 90%. So mozzie isn't
> flawless either.

If you do not train it, you produce the flaws. If you do not put any fuel in 
your car, it won't start. And when I switched back, just a moment ago, after 
more than 6 months of not using mozilla, just to test, it worked perfectly.

> So, in conclusion: Ask your provider for server side spamfiltering. If he
> refuses, change providers.

Tough words! ;)

However, I thought that this bug came to the conclusion that blocking kmail is 
a bad thing, caused not by spamassassin itself but by a "wrong" usage, i.e. the 
perl script where it should use the daemon, and by the current way the 
filtering mechanism in kmail works. The blocking should be minimised by 
changing its filtering engine, which seems a lot of work, so it will take a 
while. Further, that Bogofilter is fastest and should thus be used by people 
who can afford to lose some percentage of accuracy, if that really is the 
case, and gain speed.

Sven

Comment 27 Andreas Gungl 2004-08-26 10:06:09 UTC
On Wednesday 25 August 2004 22:23, S.Burmeister wrote:
> However I thought that this bug came to the conclusion that blocking
> kmail is a bad thing, not caused by spamassassin, but by a "wrong" usage,
> i.e. the perl-scripts where it should use the daemon and by the current
> way the filtering-mechanism in kmail works. The blocking should be
> minimised by changing its filtering engine, which seems a lot of work, so
> it will take a while. Further that Bogofilter is fastest and should thus
> be used by people who can afford to lose some percentage of accuracy, if
> that really is the case, and gain speed.

So we're almost back at the beginning: Blocking is a (well known) problem in 
KMail, as the corresponding Bugzilla report does show.

SpamAssassin isn't that fast compared to other tools. Rest assured, we 
considered detecting a running spamd and using spamc in that case. 
The problem was that this detection currently isn't reliably possible, so 
we chose the safe way and went with the spamassassin Perl script.

Documentation is open for improvements. I think, some issues have already 
been mentioned. We tell the user about removing the filters if something is 
considered to be slow.
You can expect improvements in the technical area as well in the next 
release. I'm going to pick up some of the mentioned ideas.

Comment 28 S. Burmeister 2004-08-26 18:44:44 UTC
Hi!

On Thursday, 26 August 2004 10:06 Andreas Gungl wrote:
> So we're almost back at the beginning: Blocking is a (well known) problem in 
> KMail, as the corresponding Bugzilla report does show.

So a mini filter engine, i.e. the start of a new one, could have been written 
a long time ago for spam filtering, as that only involves piping and moving 
email.

What about (some) bogofilter integration, i.e. that idea with the hardcoded 
rules? It would look like Mozilla's solution and be as easy to use, people 
could not delete filter rules by mistake, and one would not have to maintain 
code, just add a dependency on the bogofilter rpm. That would also mean that 
users would not have to install that package separately, as it is installed 
when kmail is installed, i.e. when setting up the system.

> SpamAssassin isn't that fast compared to other tools.

The quasi-default, i.e. No. 1 in the list, should be changed to 
bogofilter, as the fastest. Further, the text that tells you about blocking 
should also state that you should start spamd and replace spamassassin (and 
sa-learn?) with spamc if you want to use SpamAssassin. Or provide a direct 
link to the docs where this is explained.

> Documentation is open for improvements.

Just document bogofilter as the fastest in the wizard, i.e. right next to it 
and move it to No. 1.

I think this would make a lot of people try bogofilter in the first place and 
know what to do when SpamAssassin is too slow for their taste.

Sven

Comment 29 Gilles Schintgen 2005-03-05 15:06:00 UTC
@Sven: you don't need to replace sa-learn, only spamassassin.

It would be great if KMail could detect a running spamd. Perhaps a (very simple) wrapper script could be provided, so that spamc is used if a running spamd is detected and spamassassin otherwise. Moreover the user should be advised to turn on spamd. It makes a _huge_ difference (at least an order of magnitude on my system).
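
A minimal sketch of such a wrapper, written in Python here rather than as a shell script. It assumes spamd listens on its default TCP port 783 on localhost; a daemon bound to a Unix socket or another port would not be noticed, so treat the check as a heuristic only. The function names are made up for illustration.

```python
import socket
import subprocess

SPAMD_HOST = "127.0.0.1"  # assumption: spamd runs on the local machine
SPAMD_PORT = 783          # spamd's default TCP port


def spamd_is_running(host=SPAMD_HOST, port=SPAMD_PORT, timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_command(daemon_available):
    """Pick the fast client if the daemon is up, else the Perl script."""
    if daemon_available:
        return ["spamc"]
    # -L keeps the Perl script to local tests only.
    return ["spamassassin", "-L"]


def main():
    """Filter one message: stdin and stdout are inherited by the child."""
    return subprocess.call(choose_command(spamd_is_running()))
```

A filter rule would then pipe each message through a script that ends with `sys.exit(main())` instead of calling spamassassin directly.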

If this is implemented it would be wise to advise the user to turn off network tests. This is already done by default for the spamassassin command by appending the -L option. A method that should work with both spamassassin and spamc is to edit the user's ~/.spamassassin/user_prefs (so no root privileges required, the wizard should do it) and include some or all of the following options:

(quoted from Mail::SpamAssassin::Conf(3))
       NETWORK TEST OPTIONS
       use_dcc ( 0 | 1 )        (default: 1)
           Whether to use DCC, if it is available.  DCC (Distributed Checksum
           Clearinghouse) is a system similar to Razor.
       use_pyzor ( 0 | 1 )      (default: 1)
           Whether to use Pyzor, if it is available.
       use_razor2 ( 0 | 1 )          (default: 1)
           Whether to use Razor version 2, if it is available.
       skip_rbl_checks { 0 | 1 }   (default: 0)
           By default, SpamAssassin will run RBL checks.  If your ISP already
           does this for you, set this to 1.
       dns_available { yes | test[: name1 name2...] | no }   (default: test)
           By default, SpamAssassin will query some default hosts on the
           internet to attempt to check if DNS is working or not. The problem
           is that it can introduce some delay if your network connection is
           down, and in some cases it can wrongly guess that DNS is
           unavailable because the test connections failed.  SpamAssassin includes a
           default set of 13 servers, among which 3 are picked randomly.

Note however that I'm not a SA expert and that I didn't try these options. (I'm running spamd with the -L option.)
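
What the wizard would have to do can be sketched roughly as follows. This is untested against a real SpamAssassin setup: it assumes the installed version accepts all five options quoted above, and the function names are made up for illustration.

```python
from pathlib import Path

# Options from the Mail::SpamAssassin::Conf excerpt, set to the values
# that switch the network tests off.
NETWORK_TESTS_OFF = {
    "use_dcc": "0",
    "use_pyzor": "0",
    "use_razor2": "0",
    "skip_rbl_checks": "1",
    "dns_available": "no",
}


def disable_network_tests(prefs_text):
    """Return prefs_text with the network-test options forced off."""
    kept = []
    for line in prefs_text.splitlines():
        words = line.split()
        # Drop existing settings for the options we are about to write;
        # comments and all other settings are kept untouched.
        if words and words[0] in NETWORK_TESTS_OFF:
            continue
        kept.append(line)
    kept.extend(f"{name} {value}" for name, value in NETWORK_TESTS_OFF.items())
    return "\n".join(kept) + "\n"


def update_user_prefs(path=None):
    """Rewrite the user's prefs file in place, creating it if missing."""
    if path is None:
        path = Path.home() / ".spamassassin" / "user_prefs"
    old = path.read_text() if path.exists() else ""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(disable_network_tests(old))
```

Only the user's own file is touched, so no root privileges are needed.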


Slightly off topic, but anyway:
What I'd really like to see however is _parallel_ filtering, i.e. not "simply" having one thread for the UI and one for filtering (even though this by itself would be a huge improvement), but actually filtering multiple messages at the same time. Here's why:
(quoting http://wiki.apache.org/spamassassin/UsingNetworkTests)
In the network-test case, when a message is scanned by SpamAssassin, network test queries are sent to various servers on the internet; the SpamAssassin engine will then wait for replies to those queries, and this can take up to 15 seconds (with a typical average of about 2 seconds per message). 
At first glance, it appears that this will greatly slow down scanning. However, that's not the case; this does not happen in serial (one message after another). Instead, multiple messages can be queried in parallel (several messages scanned at the same time, and waiting for responses to their network queries).
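
To illustrate the idea (this is not how KMail's filter engine works, just a sketch in Python): a small pool of worker threads pipes several messages through the external tool at once, so the waiting times for network replies overlap. The function names and the default of four workers are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess


def classify_with_command(message, command=("spamc",)):
    """Pipe one raw message (bytes) through an external filter."""
    done = subprocess.run(
        list(command), input=message, stdout=subprocess.PIPE, check=True
    )
    return done.stdout


def filter_in_parallel(messages, classify=classify_with_command, workers=4):
    """Run classify on several messages at once; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(classify, messages))
```

With an average wait of about 2 seconds per message, four workers would bring the wall-clock time for a batch down to roughly a quarter, provided the time is spent waiting on the network and not on the CPU.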

Gilles
Comment 30 S. Burmeister 2005-03-05 15:20:52 UTC
For that daemon/script issue, all that is needed is a checkbox or something similar in the wizard that lets the user choose whether to create rules for the daemon or for the script. Maybe also a message telling them to choose the script if not sure.

As far as I know, basic asynchronous filtering is ready for testing, though still at a very early stage.

Still, the current spam filtering is not robust enough, even if it no longer blocks, as it spams the filter rules if used more than once.
Comment 31 Marc Mutz 2005-06-01 18:46:42 UTC
I don't know of a single ISP nowadays that doesn't run one spam scanner or another over incoming mail and tag it with X-Spam-Flag or similar.

So from my POV this client-side spam filtering was a temporary hype that will die out soon anyway.
Comment 32 S. Burmeister 2005-06-01 19:50:59 UTC
And you can train that spam scanner with your email-client?
Comment 33 John Ellis 2006-06-09 20:05:34 UTC
*** This bug has been confirmed by popular vote. ***
Comment 34 Axel Braun 2007-12-26 10:26:26 UTC
>------- Additional Comment #31 From Marc Mutz 2005-06-01 18:46 -------  
> So from my POV this client-side spam filtering was a temporary hype that 
> will die out soon anyway. 

As this comment is some 3 years old, it has proven itself wrong.

I currently use bogofilter for spam filtering, but I don't know whether bogofilter or some filter rules block kmail. Maybe you can adapt some code from www.polarbar.net, a mailer I used before (unfortunately its development has stopped). Filtering and spam scanning were much faster there than in Kmail.

Comment 35 Myriam Schweingruber 2012-08-18 08:10:29 UTC
Thank you for your feature request. Kmail1 is currently unmaintained so we are closing all wishes. Please feel free to reopen a feature request for Kmail2 if it has not already been implemented.
Thank you for your understanding.
Comment 36 Luigi Toscano 2012-08-19 00:50:13 UTC
Instead of creating a new feature request, please confirm here if the wishlist is still valid for kmail2.