109730 – KMail displays bogus SpamAssassin information (% spam probability)

Status:	RESOLVED UNMAINTAINED

Alias:	None

Product:	kmail
Classification:	Unmaintained
Component:	messageviewer
Version:	unspecified
Platform:	Fedora RPMs Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	kdepim bugs

Reported:	2005-07-27 20:22 UTC by David Anderson
Modified:	2015-04-12 10:11 UTC
CC List:	2 users

Description David Anderson 2005-07-27 20:22:18 UTC

Version: (using KDE 3.4.1)
Installed from: Fedora RPMs

I have a mail (that is actually spam), with header:

X-Spam-Status: No, score=3.5 required=4.3 tests=BAYES_80,CBL_DNSBL,HTML_80_90,
HTML_FONT_BIG,HTML_MESSAGE,MIME_QP_LONG_LINE autolearn=no
version=3.0.4

KMail displays this in a tooltip, and adds "81.3953% probability of being spam". There are two things wrong with this.

First, it is wrong to display 6 significant figures derived from data that only had two significant figures.

Second, and more importantly, the number itself is bogus. SpamAssassin points are not intended to represent a percentage probability of being spam. For one thing, the score can be negative, or higher than the required score, which leads to negative or >100 percentages (KMail caps these at 0% or 100% in order to not look too silly). The real percentage figures would be quite different. According to statistics I saw on the SpamAssassin site, if you set the required score at 5.0, then a mail with score 5.0 has a 95% chance of being spam - i.e. there will be 5% false negatives. But KMail calculates it as 100%, because it is using the figures wrongly. I could go on, but you see the point....
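The displayed figure matches what you get by dividing the score by the required score. The following is only a sketch of that arithmetic in Python, not KMail's actual code: the function names are made up for illustration, and the clamping to 0..100 is assumed from the behaviour described above.

```python
import re


def parse_spam_status(header: str) -> tuple[float, float]:
    """Extract (score, required) from an X-Spam-Status header value."""
    score = re.search(r"\bscore=(-?\d+(?:\.\d+)?)", header)
    required = re.search(r"\brequired=(-?\d+(?:\.\d+)?)", header)
    if score is None or required is None:
        raise ValueError("header has no score=/required= fields")
    return float(score.group(1)), float(required.group(1))


def threshold_ratio_percent(score: float, required: float) -> float:
    """Score as a percentage of the threshold, clamped to 0..100.

    This is a ratio of two point values, not a probability.
    """
    if required <= 0:
        raise ValueError("ratio is meaningless for a non-positive threshold")
    return max(0.0, min(100.0, 100.0 * score / required))


# Header from this report, with the wrapped lines joined.
header = (
    "No, score=3.5 required=4.3 tests=BAYES_80,CBL_DNSBL,HTML_80_90,"
    "HTML_FONT_BIG,HTML_MESSAGE,MIME_QP_LONG_LINE autolearn=no version=3.0.4"
)

score, required = parse_spam_status(header)
ratio = threshold_ratio_percent(score, required)

print(f"{ratio:.4f}")  # 81.3953, the figure shown in the tooltip
print(f"{ratio:.0f}% of the way to being marked as spam")
```

Rounding to a whole number keeps the output in line with the two significant figures of the input.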

It would be accurate to say "81% of the way to being marked as spam". Reporting probabilities from these figures is just wrong - the correct percentages are calculated by taking a large corpus of mail and measuring the false positives and false negatives that occur at different thresholds.
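To illustrate what measuring error rates at different thresholds means, here is a rough Python sketch. The sample list is invented toy data, not SpamAssassin statistics, and `error_rates` is a hypothetical helper; a real measurement would need a large hand-labelled corpus.

```python
def error_rates(samples, threshold):
    """Return (false_positive_rate, false_negative_rate) at a threshold.

    samples is a list of (score, is_spam) pairs from hand-labelled mail.
    A mail is flagged as spam when score >= threshold.
    """
    ham_scores = [score for score, is_spam in samples if not is_spam]
    spam_scores = [score for score, is_spam in samples if is_spam]
    if not ham_scores or not spam_scores:
        raise ValueError("need at least one spam and one non-spam sample")
    false_pos = sum(score >= threshold for score in ham_scores) / len(ham_scores)
    false_neg = sum(score < threshold for score in spam_scores) / len(spam_scores)
    return false_pos, false_neg


# Invented toy data: (SpamAssassin score, really spam?)
samples = [
    (-1.2, False), (0.5, False), (2.0, False), (4.0, False), (6.1, False),
    (3.5, True), (5.0, True), (7.5, True), (9.0, True), (12.3, True),
]

for threshold in (3.0, 4.3, 5.0):
    fp, fn = error_rates(samples, threshold)
    print(f"threshold {threshold}: {fp:.0%} false positives, {fn:.0%} false negatives")
```

The point is that the percentages come from counting labelled mails on each side of the threshold, not from the score/required ratio of a single mail.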

Comment 1 David Anderson 2005-07-27 20:23:17 UTC

Argh, that should say "5% false positives".

Comment 2 Björn Ruberg 2009-12-21 20:38:07 UTC

Is this still true in a recent KDE4?

Comment 3 quazgar 2012-01-21 19:09:09 UTC

(In reply to comment #2)
> Is this still true in a recent KDE4?

Still true with 4.7.3

Comment 4 Laurent Montel 2015-04-12 10:11:30 UTC

Thank you for taking the time to file a bug report.

KMail2 was released in 2011, and the entire code base went through significant changes. We are currently in the process of porting to Qt5 and KF5. It is unlikely that these bugs are still valid in KMail2.

We welcome you to try out KMail 2 with the KDE 4.14 release and give your feedback.