Bug 289277

Summary: kmail_clamav.sh causes email duplication
Product: [Applications] kmail2 Reporter: Doron <doron.fediuck>
Component: general Assignee: kdepim bugs <kdepim-bugs>
Status: RESOLVED UNMAINTAINED    
Severity: critical CC: dilfridge, joerg.schaible, johu, stupor_scurvy343
Priority: NOR    
Version: 4.7   
Target Milestone: ---   
Platform: Gentoo Packages   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: mail with a lot of X-Virus-Flag lines
Mail that will double its size
bash script to fix the mails

Description Doron 2011-12-18 13:19:27 UTC
Version:           4.7 (using KDE 4.7.3) 
OS:                Linux

Hi,
I'm using the built-in clamav integration.
During migration I had to re-create my dimap account
because the migration tool failed. I thought it would
take a few hours to fetch all my emails and I'd be up and running.
Then I noticed that *all* my email messages
(~1.5 GB, 67,000 messages) were being recognized as
having just arrived on the IMAP server (I'm using dimap).

After some digging I also found that many messages
were duplicated. Looking closer, I saw that almost
all messages had an X-Virus-Flag header added (even if
they already had one):

$ head /home/doronf/.local/share/akonadi/file_db_data/445057_r0
X-Virus-Flag: no
X-Virus-Flag: no
Return-Path: ...

Checking kmail_clamav.sh, I see:
if $CLAMCOMANDO $TEMPFILE | grep -q FOUND; then
    echo "X-Virus-Flag: yes"
else
    echo "X-Virus-Flag: no"
fi

This means that every message gets another header each time client and
server sync, and the timestamp is changed as well (messages from
2009 are identified by the server as just arrived).

Reproducible: Always

Steps to Reproduce:
Define an account in kmail2 with existing messages on the server, and
make sure the clamav script runs for each message you get.

Actual Results:  
The message timestamp is changed on the server side, and
one or more X-Virus-Flag header entries are added.

Expected Results:  
Message timestamp should not be changed,
X-Virus-Flag header entry should be limited to one.

Script should be fixed to have something like:

if $CLAMCOMANDO $TEMPFILE | grep -q FOUND; then
    echo "X-Virus-Flag: yes"
else
    grep -q '^X-Virus-Flag:' "$TEMPFILE" || echo "X-Virus-Flag: no"
fi
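
Note that the change above still prints a second header when a message that already carries a flag is rescanned and found infected, and the grep also looks at the message body. A minimal sketch of an idempotent variant follows. It assumes the script writes the header first and then the message itself, which matches the file contents shown earlier; `emit_flagged_message` is a hypothetical helper name, not part of the shipped script.

```shell
# Hypothetical helper (not part of kmail_clamav.sh): print the message stored
# in "$1" with exactly one X-Virus-Flag header whose value is "$2".
# Existing X-Virus-Flag lines are dropped from the header block only; the body
# (everything from the first empty line on) is passed through unchanged.
emit_flagged_message() {
    msg_file=$1
    flag=$2
    printf 'X-Virus-Flag: %s\n' "$flag"
    awk '
        in_body                        { print; next }
        $0 == "" || $0 == "\r"         { in_body = 1; print; next }
        tolower($0) ~ /^x-virus-flag:/ { next }
                                       { print }
    ' "$msg_file"
}
```

In the script, the two echo lines would then set a variable to `yes` or `no`, and `emit_flagged_message "$TEMPFILE" "$flag"` would replace whatever currently prints the message. Running the filter again on its own output leaves the message unchanged.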
Comment 1 Jörg Schaible 2012-02-27 18:43:43 UTC
Please raise the priority of this issue to BLOCKER!

I also have an IMAP mail resource configured with offline support, and it seems that the mails are scanned multiple times over time and always synchronized back to the server. Eventually the mails in the inbox become so big that they cause KMail to crash at startup (after the process has allocated more than 4.1 GB and the system has nearly come to a halt because of heavy IO/swap activity).

My IMAP server stores the mails in maildir format, so I can document the situation with my weekly backups:

============== %< ===============
$ grep 'From: Ihre Kabel BW Rechnung' /mnt/backup/bobbel/data/2012*/tree/home/joehni/.maildir/cur/*,
/mnt/backup/bobbel/data/20120218/tree/home/joehni/.maildir/cur/1329496927.M115320P9137V0000000000000805I0000000000BC65BE_13.bobbel,S=554121:2,:From: Ihre Kabel BW Rechnung <Rechnung@kabelbw.de>
/mnt/backup/bobbel/data/20120225/tree/home/joehni/.maildir/cur/1330120476.M846151P26094V0000000000000805I0000000000BC66F0_11.bobbel,S=70756792:2,:From: Ihre Kabel BW Rechnung <Rechnung@kabelbw.de>
$ grep 'From: Ihre Kabel BW Rechnung' *,
1330174529.M466689P16617V0000000000000805I0000000000BC66EE_9.bobbel,S=566044139:2,:From: Ihre Kabel BW Rechnung <Rechnung@kabelbw.de>
============== %< ===============

This means the same mail that was about 554 KB on 18 February had already grown to 70 MB by 25 February and now occupies 566 MB. And this is only the biggest one. Just look at this:

============= %< ================
$ grep "X-Virus-Flag" * | uniq -c | grep -v " 1 "
     13 1330174529.M466689P16617V0000000000000805I0000000000BC66EE_9.bobbel,S=566044139:2,:X-Virus-Flag: no
     25 1330176498.M279069P16857V0000000000000805I0000000000BC66F0_0.bobbel,S=1584:2,S:X-Virus-Flag: no
     37 1330176498.M560560P16857V0000000000000805I0000000000BC66F9_1.bobbel,S=6018:2,S:X-Virus-Flag: no
     24 1330361383.M783239P16209V0000000000000805I0000000000BC260A_7.bobbel,S=2009:2,:X-Virus-Flag: no
============= %< ================
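
The pipeline above matches the string anywhere in a mail, including the body. A minimal sketch of a header-only count, for anyone who wants to check their own maildir; `count_virus_flags` is a hypothetical helper name and the approach is an assumption, not something taken from KMail:

```shell
# Hypothetical helper: print "<count> <file>" for every given file whose
# header block contains more than one X-Virus-Flag line.
count_virus_flags() {
    for f in "$@"; do
        [ -f "$f" ] || continue
        # Stop at the first empty line so the body is never inspected.
        n=$(awk '
            $0 == "" || $0 == "\r"         { exit }
            tolower($0) ~ /^x-virus-flag:/ { n++ }
            END                            { print n + 0 }
        ' "$f")
        [ "$n" -gt 1 ] && printf '%d %s\n' "$n" "$f"
    done
    return 0
}
```

Called as `count_virus_flags cur/*` inside a maildir, it lists only the affected mails.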

This means the monster mail has already been scanned 13 times, and others in my inbox have been scanned a lot more. Remember that the mail doubles its size with each scan, i.e. its size grows exponentially!
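
As a rough sanity check of the exponential growth, here is a small sketch that computes the size after a number of doublings. It assumes every scan exactly doubles the mail, which is an idealization; `size_after_doublings` is a hypothetical helper name.

```shell
# Hypothetical helper: size in bytes after "$2" exact doublings of "$1".
size_after_doublings() {
    size=$1
    n=$2
    while [ "$n" -gt 0 ]; do
        size=$((size * 2))
        n=$((n - 1))
    done
    printf '%s\n' "$size"
}
```

Starting from the 554121 bytes in the 18 February backup, `size_after_doublings 554121 7` gives 70927488 and `size_after_doublings 554121 10` gives 567419904. Both are close to the observed 70756792 and 566044139 bytes, so under this assumption about ten doublings are enough to reach the current size.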

I wonder how I can clean up this mess on the server again ...
Comment 2 Jörg Schaible 2012-03-03 22:46:33 UTC
More to come: After a while I realized that this picture is not complete, because I have clamav installed only on my server. On the client there is no AV software, but I nevertheless had two active filters (which *I* never added): the first one calls the Sophos script on incoming mails and the second one moves mails to a quarantine folder. A look at kmail_sav.sh showed that it is built similarly to the one calling clamav.

There are two different effects on the mails: Many get a new X-Virus-Flag at the top with each scan, and some double their size every time. The latter typically happens for multipart mails.

However, after searching the source repository for those two scripts, I found that they have not been touched for years. And a test from the command line showed that I can pipe an email that normally doubles its size through the script without problems. This means that KMail itself now behaves differently from older versions when mail filters are combined with IMAP accounts.

As additional proof I deactivated those two rules for incoming mails. As a result my mails stay untouched on the IMAP server. When I now trigger those two filters manually, I can again observe the bogus behavior of additional X-Virus-Flags and sometimes doubled mail size for the scanned mail.
Comment 3 Jörg Schaible 2012-03-03 22:51:45 UTC
Created attachment 69265 [details]
mail with a lot of X-Virus-Flag lines

A mail found on my server's maildir with *a lot* of X-Virus-Flag entries just for demonstration.
Comment 4 Jörg Schaible 2012-03-03 22:54:28 UTC
Created attachment 69266 [details]
Mail that will double its size

An unmodified mail that will always double its size when being scanned.
Comment 5 Jörg Schaible 2012-03-03 22:58:12 UTC
Created attachment 69267 [details]
bash script to fix the mails

A bash script that I used to fix the mails in the maildir structure of my Courier IMAP server. Provide "cur" folders as arguments. The script will search for bogus mails in these folders, display a diff of the modifications to be made, and ask before replacing each file (moving the new file into the appropriate "new" folder). I hope this is useful for other affected users.
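
For readers without access to the attachment, the dry-run part of such a cleanup can be sketched as follows. This is not the attached script: both function names are hypothetical, it only handles repeated X-Virus-Flag headers (not doubled bodies), it keeps the first flag line as an arbitrary choice, and it writes nothing back.

```shell
# Hypothetical helper: print the message in "$1", keeping only the first
# X-Virus-Flag line of the header block. The body is left untouched.
dedupe_virus_flags() {
    awk '
        in_body                        { print; next }
        $0 == "" || $0 == "\r"         { in_body = 1; print; next }
        tolower($0) ~ /^x-virus-flag:/ { if (seen++) next }
                                       { print }
    ' "$1"
}

# Dry run: for each file in the given "cur" folders, show a unified diff of
# what the deduplication would change. Unaffected files produce no output.
preview_fix() {
    for dir in "$@"; do
        for f in "$dir"/*; do
            [ -f "$f" ] || continue
            # diff exits with 1 when the inputs differ; that is expected here.
            dedupe_virus_flags "$f" | diff -u "$f" - || true
        done
    done
}
```

Calling `preview_fix /path/to/maildir/cur` prints one diff per affected mail, which can be reviewed before any file is replaced.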
Comment 6 Doron 2012-03-04 09:02:38 UTC
Hi Jörg,
I think this is a big issue, but no one seems
to handle it.
Personally, I turned off the av plugins, so I do not
hit this issue, but I have my company's IT to protect
me.

We should try and make some noise so more people are
aware of it, and it'll be fixed. You can use my initial post
as a way to fix it.
Comment 7 Jörg Schaible 2012-04-11 19:58:18 UTC
Status update: I have meanwhile updated to KMail 4.8.1. It seems that the error that duplicated the complete mail is gone, but the virus filter still adds an X-Virus-Flag line with each scan.
Comment 8 Denis Kurz 2016-09-24 18:13:46 UTC
This bug has only been reported for versions before 4.14, which have been unsupported for at least two years now. Can anyone tell if this bug is still present?

If no one confirms this bug for a Framework-based version of kmail2 (version 5.0 or later, as part of KDE Applications 15.12 or later), it gets closed in about three months.
Comment 9 Denis Kurz 2017-01-07 22:22:58 UTC
Just as announced in my last comment, I close this bug. If you encounter it again in a recent version (at least 5.0 aka 15.08), please open a new one unless it already exists. Thank you for all your input.