Bug 291171 - Spam filtering breaks HTML / multipart mail headers on POP3 accounts
Alias: None
Product: kmail2
Classification: Applications
Component: filtering
Version: 4.8
Platform: openSUSE Linux
Importance: NOR major
Target Milestone: ---
Assignee: kdepim bugs
Depends on:
Reported: 2012-01-10 14:38 UTC by markuss
Modified: 2015-03-20 07:08 UTC
CC List: 7 users

See Also:
Latest Commit:
Version Fixed In: 4.8.1

Screenshot (148.08 KB, image/png)
2012-01-10 14:38 UTC, markuss

Description markuss 2012-01-10 14:38:34 UTC
Created attachment 67658

Version:           4.8 (using Devel) 
OS:                Linux

This bug may affect all mails processed by external applications, but it's definitely most visible with spam filtering.
So far the bug only affected my POP3 accounts, not IMAP accounts processed by mail filtering. (This could just be a coincidence, though.)

KMail seems to blindly set the content type in mail headers to:
Content-Type: text/plain; charset="US-ASCII"

This completely ignores the individual content types of the two parts in multipart HTML+plaintext mails.

This is not a bug of the spam filter programs themselves. I tried both Bogofilter as well as SpamAssassin: Same results.

Technically this bug causes data loss, which is why I chose 'Severity: Major (data loss)'.

Reproducible: Sometimes

Steps to Reproduce:
- Set up a POP3 account.
- Set up the spam filtering via the wizard
- Get multipart HTML+plaintext mails

Actual Results:  
KMail displays multipart mails as source code

Expected Results:  
Do not force 'Content-Type: text/plain; charset="US-ASCII"' and actually properly render the HTML / plaintext parts
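
To check a saved message source for the symptom described above, a small script along these lines might help. This is only a rough sketch: the function name and the sample messages are made up for illustration, and the boundary check is a heuristic, not anything KMail does internally.

```python
import re
from email import message_from_string

# Hypothetical sample: top-level header says text/plain, body still holds MIME parts.
BROKEN_SAMPLE = (
    "From: a@example.org\n"
    'Content-Type: text/plain; charset="US-ASCII"\n'
    "\n"
    "--b42\n"
    'Content-Type: text/plain; charset="UTF-8"\n'
    "\n"
    "Hello\n"
    "--b42\n"
    'Content-Type: text/html; charset="UTF-8"\n'
    "\n"
    "<p>Hello</p>\n"
    "--b42--\n"
)

# Hypothetical sample: the same mail with its original multipart header intact.
INTACT_SAMPLE = BROKEN_SAMPLE.replace(
    'Content-Type: text/plain; charset="US-ASCII"',
    'Content-Type: multipart/alternative; boundary="b42"',
    1,
)


def looks_mislabeled(raw: str) -> bool:
    """Heuristic: True if the message claims text/plain but its body
    contains a boundary line directly followed by a Content-Type header."""
    msg = message_from_string(raw)
    if msg.get_content_type() != "text/plain":
        return False
    body = msg.get_payload()
    if not isinstance(body, str):
        return False
    pattern = r"^--\S+\r?\nContent-Type:"
    return re.search(pattern, body, re.MULTILINE | re.IGNORECASE) is not None
```

Running it over the affected mails should at least tell whether the top-level header was overwritten or whether the viewer is misrendering an otherwise intact message.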
Comment 1 Manuel Mommertz 2012-01-11 20:06:23 UTC
I can confirm this for 4.8rc2. It's especially annoying when receiving PDFs, as I have to decode the base64 data on the command line.
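
As a stopgap for attachments shown as raw source, the base64 block can be copied out of the message and decoded with a few lines of Python. This is a minimal sketch, assuming the block has been pasted into a string; the function name is hypothetical.

```python
import base64


def decode_base64_block(block: str) -> bytes:
    """Decode a base64 block copied from the message source.

    Line breaks and other whitespace from mail wrapping are removed first.
    """
    compact = "".join(block.split())
    return base64.b64decode(compact)


# Example use (file names are placeholders):
#   data = decode_base64_block(open("attachment.b64").read())
#   open("attachment.pdf", "wb").write(data)
```

This only recovers a single part; it does not restore the message itself.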
Comment 2 Szymon Stefanek 2012-01-16 13:05:34 UTC
I also can confirm this on 4.8rc2.
Comment 3 Tomás Bautista 2012-02-01 09:06:56 UTC
I have been experiencing this also with KDE 4.8 using crm114. I had the impression that an already existing "Content-Type: something" header is being replaced by KMail with the one mentioned above. I've opted for disabling the filtering option for the moment.
Comment 4 Tobias Koenig 2012-02-11 13:58:39 UTC
Can't reproduce it here with current version.
Can you try again please and attach the filter log for that run?
Comment 5 Manuel Mommertz 2012-02-14 17:13:38 UTC
This does not happen every time. But after reactivating my spam filter some days ago (and restricting it to an account with low traffic), it happened again today. So yes, it's reproducible with KMail 4.8.0.

But I don't know how to get a filter log. Can someone give me a hint?
Comment 6 Szymon Stefanek 2012-02-20 13:24:48 UTC
It still happens with current master.

I have enabled filter logs now, will provide one if it comes out.
Comment 7 Szymon Stefanek 2012-02-20 13:38:55 UTC
A snippet from the filter log relevant to a message received with broken Content-Type header. The [...] part that I have removed is a sequence of filter rules identical to the previous ones (only From e-mail addresses change).

I selected the message immediately after it appeared in the folder and the contents were correct. Then the mail viewer automatically updated the contents (as they were written to disk, I suppose) with the broken ones.

[14:32:20] Evaluating filter rules: (match all of the following) "From" <contains> "xxx1@yyy1"
[14:32:20] 0 = "From" <contains> "xxx1@yyy1" (Szymon Tomasz Stefanek <sss@ggg>)
[14:32:20] Evaluating filter rules: (match all of the following) "From" <contains> "xxx2@yyy2"
[14:32:20] 0 = "From" <contains> "xxx2@yyy2" (Szymon Tomasz Stefanek <sss@ggg>)
[14:32:20] Evaluating filter rules: (match all of the following) "From" <contains> "xxx3@yyy3"
[14:32:20] 0 = "From" <contains> "xxx3@yyy3" (Szymon Tomasz Stefanek <sss@ggg>)
[14:32:20] Evaluating filter rules: (match all of the following) "<size>" <less-or-equal> "256000"
[14:32:20] 1 = "<size>" <less-or-equal> "256000" ( 4324 )
[14:32:20] Filter rules have matched.
[14:32:20] Applying filter action: Pipe Through "spamc"
[14:32:20] Evaluating filter rules: (match any of the following) "X-Spam-Flag" <contains> "yes"
[14:32:20] 0 = "X-Spam-Flag" <contains> "yes" ()
Comment 8 Szymon Stefanek 2012-02-25 01:57:12 UTC
Another bit of information.
The Content-Type header is NOT broken by the spamc executable.
I have written a wrapper for it that saves the input and the output
on disk. In both I find the correct Content-Type.
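
For anyone who wants to repeat this check, a wrapper of this kind could look roughly like the following. This is a sketch, not the wrapper actually used here: the function names and the log directory are made up, and it assumes spamc is on the PATH and reads the message on stdin and writes the result to stdout.

```python
import os
import subprocess
import sys
from pathlib import Path


def run_and_record(command, data: bytes, log_dir: Path, tag: str) -> bytes:
    """Pipe data through command, saving the input and the output to disk."""
    log_dir.mkdir(parents=True, exist_ok=True)
    (log_dir / f"{tag}.in").write_bytes(data)
    result = subprocess.run(command, input=data, stdout=subprocess.PIPE, check=False)
    (log_dir / f"{tag}.out").write_bytes(result.stdout)
    return result.stdout


def main() -> int:
    # Read the message the mail client pipes in, filter it, pass the result on.
    data = sys.stdin.buffer.read()
    log_dir = Path.home() / "spamc-wrapper-log"  # hypothetical location
    out = run_and_record(["spamc"], data, log_dir, tag=str(os.getpid()))
    sys.stdout.buffer.write(out)
    return 0


# When installing this as a script, call main() at the end of the file
# and point the "Pipe Through" filter action at the script instead of spamc.
```

Comparing the saved .in and .out files for a broken message shows whether the Content-Type header changed inside the external filter or afterwards.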
Comment 9 Szymon Stefanek 2012-02-28 10:33:41 UTC
Should be fixed in current master. See commit


Bugzilla just didn't let me close it with the commit hook.
Comment 10 markuss 2012-02-28 20:17:44 UTC
Can you backport the fix to 4.8? That would be very helpful.
Comment 11 Allen Winter 2012-02-29 00:18:28 UTC
Git commit c9fbfb3f597486052080d6bf8e769645d9e73447 by Allen Winter, on behalf of Szymon Tomasz Stefanek.
Committed on 28/02/2012 at 11:08.
Pushed by winterz into branch 'KDE/4.8'.

Avoid parsing the message multiple times in filters. This is against
the published KMime policy.
(cherry picked from commit 4a2b04233995562919fa894f0cde59f29364fc39)

M  +3    -1    mailcommon/searchpattern.cpp

Comment 12 Mark 2012-09-11 12:26:32 UTC
This bug seems to have reappeared.
I am seeing it with KMail 4.8.4 on openSUSE 12.2.
It only seems to appear when bogofilter is used.
SpamAssassin does not seem to cause this problem.
After removing bogofilter, the problem disappears.

This is what I see . . .

Content-Type: text/plain; charset=ISO-8859-1

[my message]

[lot of html code]
Comment 13 Łukasz 2015-03-20 06:41:10 UTC
I've been observing this bug on my system for some time, now with KMail 4.14.4. I had to choose "Show Message Structure" option to be able to see both parts.

This is pretty annoying and indeed a very serious bug which might result in data loss.
Comment 14 Łukasz 2015-03-20 07:08:30 UTC
It seems I've corrected the problem in my case: though I had the option

Settings > Security > Prefer HTML to plain text

I also had to choose each folder and mark

Folder > Message Default Format > Use Global Setting

Now it seems to work fine. I'll post new info if I see some problems.